Information Systems (Informationssysteme)

Jens Teubner, TU Dortmund
jens.teubner@cs.tu-dortmund.de
Summer 2018
© Jens Teubner, Information Systems, Summer 2018

Part IV: Database Design

Database Design Process
Database systems are very good at handling your data once your data is in a form that can be digested by the system. How do we get from a real-world problem to a database schema?
[Figure: problem (mini world) → 1 requirements analysis → 2 conceptual design → 3 logical design → 4 schema refinement → 5 physical design → database schema]
1. Requirements analysis: Meet with customers, understand their problem.
2. Conceptual design: Develop a high-level model for the data that should be stored in the database, typically using an ER diagram.

Database Design Process (cont.)
3. Logical design: Convert the conceptual design into the data model of the chosen DBMS. The result is a conceptual schema (see slide 17).
4. Schema refinement: Refine the obtained conceptual schema, e.g., using normalization (see later).
5. Physical design: Develop a physical schema that meets the application's performance needs.
Note: In practice, you'll have to iterate over some or all of these steps multiple times until you reach a satisfactory design.

Requirements Analysis
Main goal: Understand the user's needs.
- Meet and discuss with user groups; study existing documentation and/or applications.
- Listen and watch out for real-world entities that should be reflected in the database, and for how they relate to and interact with each other.
Make sure you understand the user's needs:
- Note down your understanding in a way that you can discuss with your users (informal notation; prose text).
- Iterate with users to make sure your understanding matches their needs.

Requirements Analysis (example customer description)
"As one of the largest cocktail bars in town, we are really proud of our large collection of cocktail recipes. For each cocktail (recipe) we would like to store its name, a short description, instructions on how to make the cocktail, and how long that cocktail has been in our database. Each cocktail consists of a number of ingredients, which have a name, a short textual characterization of their flavor, and information about the amount of alcohol they contain. It is also important to know which supplier offers which of the ingredients (and at which price). Our supplier list contains addresses and URLs for each supplier. Sometimes, we even know a direct contact person that belongs to a supplier, including his/her name, phone number, and email address."

Requirements Analysis
Rule of thumb:
- Mark subjects in a customer's description that describe concepts (or "entities") that should be stored in the database. E.g., cocktails (or recipes) and ingredients should be stored in the database.
- Mark verbs that indicate relationships between concepts. E.g., cocktails consist of ingredients (or: ingredients are contained in cocktails).
- In addition, watch out for attributes that further characterize a concept/entity. E.g., name, description, etc. characterize cocktails; name, flavor, and alcohol percentage characterize ingredients.

Conceptual Design: ER Model
Conceptual database design:
- High-level description of the data to be stored in the database.
- Typically uses a rather formalized notation, most commonly the Entity-Relationship Model (ER Model) and ER Diagrams.
- Clear notation, yet independent of the data model used by the specific database system.
The ER Model helps to
- communicate with users (and verify the model), and
- translate the design into a conceptual schema for the DBMS in use.
(We will learn rules for how, e.g., an ER Diagram can mechanically be translated into a relational database schema.)

ER Model: Entity Sets
An entity is an object in the real world that is distinguishable from other objects. An entity set is a collection of similar entities. We represent an entity set in an ER Diagram as a rectangle.
[Figure: entity set Cocktails (rectangle) with attributes Name, Description, Since, Instructions (ellipses)]
An entity is described using a set of attributes. We use ellipses to represent attributes.

Attribute Domains
The domain of an attribute describes its possible values. E.g.,
- Name: strings of length 30
- Description: strings of length 200
- Since: date value greater than Jan 1, 1970
- Instructions: strings of length 500
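The domains above can be approximated in a running database with CHECK constraints. The following is a minimal sketch, assuming SQLite driven from Python's standard sqlite3 module (the lecture does not prescribe a DBMS); it reads "strings of length 30" as "at most 30 characters" and stores dates as ISO strings, both of which are assumptions made for illustration. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Domains from the slide, approximated with CHECK constraints (an assumption:
# SQLite does not enforce the length in CHAR(n) on its own).
conn.execute("""
    CREATE TABLE Cocktails (
        Name         TEXT CHECK (length(Name) <= 30),
        Description  TEXT CHECK (length(Description) <= 200),
        Since        TEXT CHECK (Since > '1970-01-01'),  -- ISO date string
        Instructions TEXT CHECK (length(Instructions) <= 500)
    )
""")

def try_insert(row):
    """Return True if the row satisfies all domains, False otherwise."""
    try:
        conn.execute("INSERT INTO Cocktails VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Hypothetical sample rows.
ok = try_insert(("Dry Martini", "A classic", "2018-04-01", "Stir with ice."))
too_long = try_insert(("x" * 31, "A classic", "2018-04-01", "Stir."))
too_old = try_insert(("Paradise", "Fruity", "1969-12-31", "Shake."))
print(ok, too_long, too_old)
```

Only the first row is accepted; the other two fall outside the domain of Name and Since, respectively.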

ER Model: Relationship Sets
A relationship is an association among two (or more) entities. A relationship set is a collection of similar relationships. We represent relationship sets as diamonds.
[Figure: entity set Cocktails (Name, Description, Since, Instructions) connected through the relationship set "consists of" (attribute Quantity) to entity set Ingredients (Name, Flavor, Alcohol)]
Relationships can carry attributes, too.

Entities, Relationships, and Sets Thereof
[Figure: individual cocktails (Dry Martini, Paradise, Screwdriver) are connected to individual ingredients (Vermouth, Dry Gin, Apricot Brandy, Orange Juice, Vodka); each connection is labeled with a quantity (e.g., 1 dash, 5 cl). Each cocktail and each ingredient is an entity; each connection is a relationship. Together they form the entity set Cocktails, the relationship set Consists of, and the entity set Ingredients.]

More Relationships
Relationships can also associate two entities within the same entity set. E.g., some ingredients can be substituted by another one (when an ingredient has run out of stock):
[Figure: entity set Ingredients (Name, Flavor, Alcohol) connected to itself through the relationship set "substitutes"]
And there can be multiple relationship sets between the same entity sets:
[Figure: entity sets Employees and Departments connected by two relationship sets, Works In and Manages]

n-ary Relationships
Relationships can be n-ary:
[Figure: ternary relationship set "takes exam" connecting the entity sets Professors, Students, and Courses]

Attributes and Keys
Generally, an entity is uniquely identified by the values of its attributes. Sometimes, a subset of attributes is enough to uniquely identify an entity, e.g., Student ID; SSN; course number + semester; etc.
We call a minimal set of identifying attributes a key. We use underlining to mark the (set of) key attributes.
[Figure: entity set Students with attributes Student ID, Name, Studies Since, Street, Zip Code, City]

Attributes and Keys
If there is no simple identifying (set of) attribute(s), it is often useful to introduce an artificial key attribute (e.g., an integer number).
[Figure: entity set Cocktails with attributes Cocktail ID, Name, Description, Since, Instructions]
If there are multiple candidate keys, typically one is designated to be the primary key. (Having a simple, designated key also eases translation to relational algebra, which we will look at later.)

Keys for Relationships?
What about keys for relationships?

Participation Constraints
Very often, the participation of entities in a relationship set can be further constrained:
- Each cocktail consists of at least one ingredient.
- A contact person for a supplier is optional. But there must be at most one contact person per supplier.
In other words:
- In the consists-of relationship set, each cocktail participates 1..∞ times, each ingredient participates 0..∞ times.
- In the contact-person-for relationship set, each supplier participates 0..1 times, each contact person participates 1 time.

Participation Constraints: (min, max) Notation
Use (min, max) notation to specify such constraints: specify the minimum and maximum number of times that each entity may participate in the relationship set.
[Figure: Cocktails (1, ∞) consists of (0, ∞) Ingredients]
[Figure: Suppliers (0, 1) contact person for (1, 1) Contact Persons]
Typically: Use * instead of ∞.
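To make the reading of (min, max) concrete, here is a small sketch in Python that counts how often each entity occurs in a relationship set and reports violations. The function check_participation and the sample data are hypothetical and serve only as an illustration; they are not part of the lecture.

```python
from collections import Counter

def check_participation(entities, pairs, position, min_count, max_count=None):
    """Check a (min, max) constraint for one side of a binary relationship set.

    entities:  all entities of the entity set being checked
    pairs:     the relationship set, as a collection of (left, right) tuples
    position:  0 if the entity set is the left partner, 1 if it is the right
    max_count: None stands for "unbounded" (the * in the notation)
    Returns the list of entities that violate the constraint.
    """
    counts = Counter(pair[position] for pair in set(pairs))
    return [e for e in entities
            if counts[e] < min_count
            or (max_count is not None and counts[e] > max_count)]

# Hypothetical sample data.
suppliers = ["Shop Rite", "Liquor Hut"]
contacts = ["Mary", "Tom"]
contact_person_for = [("Shop Rite", "Mary"), ("Shop Rite", "Tom")]

# Suppliers (0, 1): "Shop Rite" has two contact persons, one too many.
print(check_participation(suppliers, contact_person_for, 0, 0, 1))
# Contact Persons (1, 1): every contact person belongs to exactly one supplier.
print(check_participation(contacts, contact_person_for, 1, 1, 1))
```

"Liquor Hut" does not appear in the relationship set at all, which is fine because the minimum for suppliers is 0.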

(min, max) Notation
0, 1, and ∞ are certainly seen most often in ER Diagrams. But other values can make sense, too:
[Figure: Students (0, 2) Participates (5, 15) Seminars]
Describe the meaning of these constraints in natural language.

Alternative Notation
An alternative, often-seen notation is to label relationship sets as either 1 : 1, 1 : N (or N : 1), or N : M.
(min, max) notation:
[Figure: Cocktails (1, 1) Served In (0, ∞) Glasses]
Alternative:
[Figure: Cocktails N Served In 1 Glasses]
The semantics ("one type of glass can be used for N different cocktails") is counter-intuitive compared to that of the (min, max) notation!

Advanced Concepts: Weak Entities
Weak entities are entities that can exist only in combination with a strong entity.
[Figure: Buildings (Building No, Name) (1, ∞) Located In (1, 1) Rooms (Room No, Size)]
Since weak entities depend on their strong counterpart, they do not themselves have a unique key. Use the key of the partner to form a complete key. Here: Building No and Room No together identify a room.

Advanced Concepts: Generalization/Specialization
Generalization: Factor out common characteristics to build a common supertype.
Specialization: Derive new, specialized entity sub-types, possibly by adding new characteristics (such as new attributes).
[Figure: Ingredients connected through "is-a" to the sub-types Alcoholic Ingredients and Non-Alcoholic Ingredients]
There is no real standard notation to express generalization/specialization.

ER Model vs. UML
The Unified Modeling Language (UML) has emerged as a modeling language for a wide range of design tasks. In UML, class diagrams are closest to ER Diagrams. Unlike entity sets in the ER Model, UML classes can contain methods.
[Figure: UML class box with three compartments (class name, attributes, methods). Class name: Cocktail. Attributes: + name, + indbsince, + description, + instructions. Methods: + printshoppinglist()]
UML is more intended for (in-memory) application design. Entities/objects are, therefore, identified via pointers, not through explicit key attributes.

UML Associations
UML associations take the role of relationship sets in ER Models.
[Figure: Class A and Class B connected by a line labeled with the association name; each end carries a cardinality and a role name (role name A, role name B)]
Often, associations are directed:
[Figure: the same association drawn as an arrow between Class A and Class B]
Objects can then be navigated only in one direction.

[Figure: complete ER diagram for the cocktail example.
Entity sets: Ingredients (Ingr ID, Name, Flavor, Alcohol); Suppliers (Supp ID, Name, Address, WWW); Cocktails (Cocktail ID, Name, Since, Description, Instructions); Contact Persons (Contact ID, Name, Phone, Email).
Relationship sets: Cocktails (1, ∞) Consists Of (0, ∞) Ingredients; Ingredients (1, ∞) Supplies (1, ∞) Suppliers; Suppliers (0, 1) Contact Person For (1, 1) Contact Persons.]
Remaining question: How can we turn that into a database schema?

From an ER Diagram to a Relational Schema
Mapping an entity set into a relation is straightforward: each attribute of the entity set becomes an attribute of the table.
[Figure: entity set Ingredients with attributes Ingr ID, Name, Alcohol, Flavor]
Resulting relation: Ingredients (IngrID, Name, Alcohol, Flavor)

Relations and Keys
Observe that we use the concept of keys also in the relational world. A minimal set of fields that uniquely identifies a tuple (row) in a table is called a (candidate) key:
- In any legal instance of the relation, two distinct tuples cannot have identical values in all the fields of a key.
- No subset of the key is a unique identifier for a tuple.
The key constraint is a property of the schema. A column that just happens to contain unique values in the current table instance is not a key.
Again, among multiple candidate keys, we typically select one primary key.

Key Constraints
In database systems, keys can be declared together with the schema of the table. E.g., in SQL (see footnote 5):

    CREATE TABLE Ingredients (
        IngrID  INTEGER NOT NULL,
        Name    CHAR(30),
        Alcohol DECIMAL(3,1),
        Flavor  CHAR(20),
        PRIMARY KEY (IngrID)
    )

The DBMS will enforce such constraints and reject any modifications that would violate the key constraint.
Footnote 5: Fields marked as NOT NULL cannot be left blank for any row; key columns must be declared NOT NULL. DECIMAL(m,n) is a decimal number type with m digits total and n digits after the decimal point.
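The rejection of key-violating modifications can be observed directly. Below is a minimal sketch that runs the slide's CREATE TABLE statement in an in-memory SQLite database via Python's sqlite3 module; the choice of SQLite and the sample rows (including the alcohol and flavor values) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Ingredients (
        IngrID  INTEGER NOT NULL,
        Name    CHAR(30),
        Alcohol DECIMAL(3,1),
        Flavor  CHAR(20),
        PRIMARY KEY (IngrID)
    )
""")
# Hypothetical sample row.
conn.execute("INSERT INTO Ingredients VALUES (1, 'Dry Gin', 40.0, 'juniper')")

try:
    # Same IngrID again: this would violate the key constraint.
    conn.execute("INSERT INTO Ingredients VALUES (1, 'Vodka', 40.0, 'neutral')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Ingredients").fetchone()[0]
print(rejected, row_count)
```

The second INSERT is refused and the table still holds a single row.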

Relationships in the Relational World
An n-ary relationship in the ER Model is an n-tuple of entities. That is, the corresponding relationship set can be thought of as the set
$\{ (e_1, \dots, e_n) \mid e_1 \in E_1, \dots, e_n \in E_n \}$
(ignoring relationship attributes for a moment).
We could use that representation to express relationships in the relational world:
ConsistsOf (CockID, CName, Since, Descr, Instr, IngrID, IName, Alcohol, Flavor)
Observe that we renamed the Name fields, because column names must be unique within one table.

Relationships in the Relational World
Such an encoding would incur significant redundancy and storage overhead. E.g., every cocktail redundantly appears at least once in the ConsistsOf relation.
Remember that the Cocktail ID already uniquely determines all remaining cocktail properties:
CockID → CName, Since, Descr, Instr
Columns CName, Since, Descr, and Instr can thus safely be omitted in ConsistsOf. When needed, the information can always be looked up in Cocktails (with the help of the CockID value).

Relationships in the Relational World
Likewise, we can also omit all non-key columns of Ingredients:
ConsistsOf (CockID, IngrID)
Let us now put the relationship attributes back:
ConsistsOf (CockID, IngrID, Quantity)

Example (and SQL Refresher)
Assuming Cocktails, Ingredients, and ConsistsOf are stored in an SQL database, how could we reconstruct the original, full ConsistsOf relation?
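One possible way to approach this question is a two-way join over the key columns. The sketch below assumes an in-memory SQLite database accessed through Python's sqlite3 module; the table contents are hypothetical sample data, and the column aliases CName and IName follow the renaming used on the earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cocktails (CockID INTEGER PRIMARY KEY, Name TEXT, Since TEXT,
                            Descr TEXT, Instr TEXT);
    CREATE TABLE Ingredients (IngrID INTEGER PRIMARY KEY, Name TEXT,
                              Alcohol REAL, Flavor TEXT);
    CREATE TABLE ConsistsOf (CockID INTEGER, IngrID INTEGER, Quantity TEXT);

    -- Hypothetical sample data.
    INSERT INTO Cocktails VALUES (1, 'Screwdriver', '2018-01-01', 'classic', 'stir');
    INSERT INTO Ingredients VALUES (10, 'Vodka', 40.0, 'neutral'),
                                   (11, 'Orange Juice', 0.0, 'fruity');
    INSERT INTO ConsistsOf VALUES (1, 10, '4 cl'), (1, 11, '10 cl');
""")

# Look up the omitted cocktail and ingredient columns via the two ID columns.
full = conn.execute("""
    SELECT c.CockID, c.Name AS CName, c.Since, c.Descr, c.Instr,
           i.IngrID, i.Name AS IName, i.Alcohol, i.Flavor, co.Quantity
    FROM ConsistsOf AS co
    JOIN Cocktails   AS c ON c.CockID = co.CockID
    JOIN Ingredients AS i ON i.IngrID = co.IngrID
    ORDER BY i.IngrID
""").fetchall()

for row in full:
    print(row)
```

Each row of ConsistsOf yields exactly one row of the wide relation, because CockID and IngrID each identify a single tuple in their home table.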

Foreign Keys
Such lookups occur very often in relational databases. The concept is, in fact, a cornerstone of relational databases.
Columns CockID and IngrID in ConsistsOf are called foreign keys. They refer to Cocktails and Ingredients (respectively).
Foreign keys can be declared in SQL, too:

    CREATE TABLE ConsistsOf (
        CockID INTEGER NOT NULL,
        IngrID INTEGER NOT NULL,
        FOREIGN KEY (CockID) REFERENCES Cocktails,
        FOREIGN KEY (IngrID) REFERENCES Ingredients
    )

Foreign keys refer to the primary key of the respective relation.
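A declared foreign key lets the DBMS refuse dangling references. The following sketch uses the slide's ConsistsOf declaration in SQLite via Python's sqlite3 module. Note the assumptions: SQLite is chosen only for illustration, it needs the PRAGMA shown below to switch enforcement on, and the Cocktails and Ingredients tables are reduced to two columns each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.executescript("""
    CREATE TABLE Cocktails   (CockID INTEGER NOT NULL, Name CHAR(30), PRIMARY KEY (CockID));
    CREATE TABLE Ingredients (IngrID INTEGER NOT NULL, Name CHAR(30), PRIMARY KEY (IngrID));
    CREATE TABLE ConsistsOf (
        CockID INTEGER NOT NULL,
        IngrID INTEGER NOT NULL,
        FOREIGN KEY (CockID) REFERENCES Cocktails,
        FOREIGN KEY (IngrID) REFERENCES Ingredients
    );
    -- Hypothetical sample data.
    INSERT INTO Cocktails   VALUES (1, 'Dry Martini');
    INSERT INTO Ingredients VALUES (10, 'Dry Gin');
    INSERT INTO ConsistsOf  VALUES (1, 10);
""")

try:
    conn.execute("INSERT INTO ConsistsOf VALUES (1, 99)")  # there is no ingredient 99
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
print(dangling_rejected)
```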

Foreign Keys
Relational database systems use regular, user-accessible attribute values to reference between tuples. There are no pointers or other internal data structures.
Remember physical data independence: tuples can freely be moved to new locations in memory/on disk without breaking tuple associations.
Foreign key references can always be followed in both directions (see footnote 6).
Footnote 6: SQL is declarative and does not really offer navigation primitives.
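Because references are plain attribute values, the same join condition serves both directions. A minimal sketch, again assuming SQLite through Python's sqlite3 module and hypothetical sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cocktails   (CockID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Ingredients (IngrID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ConsistsOf  (CockID INTEGER REFERENCES Cocktails,
                              IngrID INTEGER REFERENCES Ingredients);
    -- Hypothetical sample data.
    INSERT INTO Cocktails   VALUES (1, 'Dry Martini'), (2, 'Screwdriver');
    INSERT INTO Ingredients VALUES (10, 'Dry Gin'), (11, 'Vermouth'), (12, 'Vodka');
    INSERT INTO ConsistsOf  VALUES (1, 10), (1, 11), (2, 12);
""")

# Direction 1: from a cocktail to its ingredients.
ingredients_of_martini = [name for (name,) in conn.execute("""
    SELECT i.Name FROM ConsistsOf co JOIN Ingredients i ON i.IngrID = co.IngrID
    WHERE co.CockID = 1 ORDER BY i.Name""")]

# Direction 2: from an ingredient back to the cocktails that use it.
cocktails_with_vodka = [name for (name,) in conn.execute("""
    SELECT c.Name FROM ConsistsOf co JOIN Cocktails c ON c.CockID = co.CockID
    WHERE co.IngrID = 12 ORDER BY c.Name""")]

print(ingredients_of_martini, cocktails_with_vodka)
```

Neither query navigates a pointer; both are declarative value comparisons, which is why the direction does not matter.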

Relationships and Keys
Which columns form a key in ConsistsOf?
ConsistsOf (CockID, IngrID, Quantity)
Does ConsistsOf have a key at all?
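One plausible reading, offered here as an assumption rather than as the lecture's official answer: a relationship set contains each (cocktail, ingredient) pair at most once, so the two foreign key columns together can serve as a key, while neither column alone can. The sketch below tries this out in SQLite via Python's sqlite3 module with hypothetical quantities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ConsistsOf (
        CockID   INTEGER NOT NULL,
        IngrID   INTEGER NOT NULL,
        Quantity CHAR(20),
        PRIMARY KEY (CockID, IngrID)   -- composite key (assumed answer)
    )
""")
conn.execute("INSERT INTO ConsistsOf VALUES (1, 10, '5 cl')")
conn.execute("INSERT INTO ConsistsOf VALUES (1, 11, '1 dash')")  # same cocktail, other ingredient
conn.execute("INSERT INTO ConsistsOf VALUES (2, 10, '3 cl')")    # same ingredient, other cocktail

try:
    conn.execute("INSERT INTO ConsistsOf VALUES (1, 10, '6 cl')")  # same pair again
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True
print(pair_rejected)
```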

Participation Constraints
For some relationship sets we know that an entity may appear at most once in the set:
[Figure: Suppliers (0, 1) contact person for (1, 1) Contact Persons]
The respective columns in the resulting relation thus must be keys. Here:
- SuppID is a candidate key in the ContactPersonFor relation.
- ContactID is a candidate key in the ContactPersonFor relation.

Merging Relations
Tables that have the same key can be merged into one:
Cocktails (CockID, Name) and ServedIn (CockID, GlassID)
merge into
Cocktails (CockID, Name, GlassID)
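The merge can be reproduced on data with a join over the shared key. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; the table name CocktailsMerged and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cocktails (CockID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ServedIn  (CockID INTEGER PRIMARY KEY, GlassID INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Cocktails VALUES (1, 'Dry Martini'), (2, 'Screwdriver');
    INSERT INTO ServedIn  VALUES (1, 100), (2, 200);

    -- Merged relation: one row per CockID, GlassID folded into Cocktails.
    CREATE TABLE CocktailsMerged AS
        SELECT c.CockID, c.Name, s.GlassID
        FROM Cocktails c LEFT JOIN ServedIn s ON s.CockID = c.CockID;
""")
merged = conn.execute("SELECT * FROM CocktailsMerged ORDER BY CockID").fetchall()
print(merged)
```

The LEFT JOIN keeps a cocktail even if it has no ServedIn row, leaving GlassID empty in that case. In SQLite, CREATE TABLE ... AS copies only the data, so key declarations would have to be added to the merged table separately.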

Participation Constraints
Which of the participation constraints in
[Figure: Cocktails (1, 1) Served In (0, ∞) Glasses]
does the merged relation implement?

Merging Relations
For 1 : 1 relationships, there are various options to merge relations:
[Figure: Suppliers (0, 1) contact person for (1, 1) Contact Persons]
1. No relations could be merged.
2. ContactPersons could be merged with ContactPersonFor.
3. Suppliers could be merged with ContactPersonFor.
4. All three relations could be merged into one.

Merging Relations
Suppose we chose option 3:
Suppliers (SuppID, Name, Address, WWW, ContactID)
(where ContactID is a foreign key on the ContactPersons relation)
What if there is no contact person for a given supplier? In such a case, we'd want to leave the ContactID empty. This can be done by setting ContactID to null.

Null Values
Null values play an important role in relational databases. They are used to model a variety of real-world scenarios:
- A value exists (in the real world), but is not known. A supplier might have a WWW URL, but we don't know it.
- No value exists. A supplier might not have a WWW URL.
- The attribute is not applicable for this tuple. In a Persons relation, the Semester field only applies to students.
Null is a special value, distinct from any other value in the column's domain. Null is not the numeric value 0 and not the empty string.

Behavior of Null Values
In operations and predicates, think of null as "unknown":

    and     | true     unknown  false
    --------+-------------------------
    true    | true     unknown  false
    unknown | unknown  unknown  false
    false   | false    false    false

    or      | true     unknown  false
    --------+-------------------------
    true    | true     true     true
    unknown | true     unknown  unknown
    false   | true     unknown  false

Arithmetic operations with null evaluate to null (null + 42 → null).
Comparisons with null evaluate to null (Semester < null → null).
Exercise to try at home in SQL: Given the Presidents database used in the lecture, find all presidents that are still alive (i.e., their DEATH_AGE is set to NULL).
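These rules can be checked against a real system. The sketch below assumes SQLite through Python's sqlite3 module, where true and false appear as 1 and 0 and unknown as NULL (Python None); the helper value and the small Suppliers table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def value(expr):
    """Evaluate a scalar SQL expression; NULL comes back as Python None."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# SQLite represents true/false as 1/0 and unknown as NULL.
print(value("NULL AND 0"), value("NULL AND 1"))   # false wins over unknown
print(value("NULL OR 1"), value("NULL OR 0"))     # true wins over unknown
print(value("NULL + 42"), value("3 < NULL"))      # arithmetic and comparison

conn.executescript("""
    CREATE TABLE Suppliers (SuppID INTEGER PRIMARY KEY, WWW TEXT);
    INSERT INTO Suppliers VALUES (1, 'http://example.org'), (2, NULL);
""")
# "= NULL" never evaluates to true, so it selects nothing; IS NULL is the proper test.
with_equals = conn.execute("SELECT SuppID FROM Suppliers WHERE WWW = NULL").fetchall()
with_is_null = conn.execute("SELECT SuppID FROM Suppliers WHERE WWW IS NULL").fetchall()
print(with_equals, with_is_null)
```

A WHERE clause keeps only rows whose condition is true, so rows for which the condition is unknown are dropped just like rows for which it is false.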

Null Values in SQL
In SQL, null values are expressed using the keyword NULL:

    INSERT INTO Suppliers VALUES (4711, 'Shop Rite', '31 Main St', NULL, NULL)

Null values can be allowed or disallowed for particular columns (see the SQL example on slide 67). Key columns must not contain null values.

Null Values and Participation Constraints
Allowing or disallowing null values is another knob for expressing participation constraints in the relational world. E.g., ContactPersonFor:
[Figure: Suppliers contact person for (1, 1) Contact Persons]
Column ContactID in relation Suppliers:
- NULL (null values allowed) corresponds to (0, 1) on the Suppliers side.
- NOT NULL (null values disallowed) corresponds to (1, 1) on the Suppliers side.
By marking ContactID in Suppliers as a key, we can further constrain the maximum participation of ContactPersons in the relationship set.
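Both knobs can be combined in one declaration. The following sketch assumes SQLite via Python's sqlite3 module and uses UNIQUE to mark ContactID as a key of Suppliers; the sample suppliers and the contact person are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.executescript("""
    CREATE TABLE ContactPersons (ContactID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Suppliers (
        SuppID    INTEGER PRIMARY KEY,
        Name      TEXT,
        ContactID INTEGER UNIQUE            -- nullable: supplier side is (0, 1)
                  REFERENCES ContactPersons -- UNIQUE: each person serves at most one supplier
    );
    -- Hypothetical sample data.
    INSERT INTO ContactPersons VALUES (7, 'Mary');
    INSERT INTO Suppliers VALUES (1, 'Shop Rite', 7);
    INSERT INTO Suppliers VALUES (2, 'Liquor Hut', NULL);
    INSERT INTO Suppliers VALUES (3, 'Juice Co', NULL);
""")

try:
    conn.execute("INSERT INTO Suppliers VALUES (4, 'Bar Supply', 7)")  # Mary again
    second_use_rejected = False
except sqlite3.IntegrityError:
    second_use_rejected = True

suppliers_without_contact = conn.execute(
    "SELECT COUNT(*) FROM Suppliers WHERE ContactID IS NULL").fetchone()[0]
print(second_use_rejected, suppliers_without_contact)
```

Several suppliers may leave ContactID empty, but no contact person can be referenced twice. The minimum of 1 on the Contact Persons side is not captured by this schema: a row in ContactPersons that no supplier references would still be accepted.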

Translation: ER Diagram → Relational Schema
To translate an ER Diagram into a relational schema:
1. Map each entity set to a relation.
2. Identify a primary key in each resulting relation.
3. Map each relationship set to a relation.
4. Identify foreign key constraints in all those relations.
5. Refine the resulting schema:
   - Merge relations with the same key. Often, there is some degree of freedom in that step.
   - (Dis)allow null values to reflect participation constraints.
Generally, not all participation constraints can losslessly be modeled with only (foreign) key constraints and constraints on null values.