DATABASE SYSTEMS I
WEEK 2: THE ENTITY-RELATIONSHIP MODEL

Class Time and Location: Tue 14:30-16:20 AQ3005, Thu 14:30-15:20 AQ3003
Course Website: http://www.cs.sfu.ca/cc/354/rfrank/

SYLLABUS
Instructor: Richard Frank, PhD
Email: rfrank@sfu.ca
Office Hours Location: TASC 9205
Office Hours Time: Tuesday, 1:30pm-2:30pm

TA: Ankit Gupta
Email: aga53@sfu.ca
Office Hours Location: ASB9838_TA_1
Office Hours Time: Monday, 10am-11:30am

ADMIN
Assignment #2 changed: A2Q1 moved to A3Q1.

OVERVIEW OF DATABASE DEVELOPMENT
Stages (similar to software development):
  Requirements Analysis / Ideas
  High-Level Database Design
  Conceptual Database Design / Relational Database Schema
  Physical Database Design / Relational DBMS

Requirements analysis
  What data are to be stored in the enterprise?
  What are the required applications?
  What are the most important operations?

High-level database design
  What are the entities and relationships in the enterprise?
  What information about these entities and relationships should we store in the database?
  What are the integrity constraints or business rules that hold?

Conceptual database design
  What data model to implement for the DBS? E.g., the relational data model.
  Map the high-level design (e.g., ER diagram) to a (conceptual) database schema of the chosen data model.

Physical database design
  What DBMS to use?
  What are the typical workloads of the DBS?
  Build indexes to support efficient query processing.
  What redesign of the conceptual database schema is necessary from the point of view of efficient implementation?

ENTITY-RELATIONSHIP MODEL
Short: ER model.
It has a lot of similarities with other modeling languages such as UML.
Concepts: entities / entity sets, attributes, relationships / relationship sets, and constraints.
Offers more modeling concepts than the relational data model (which only offers relations).
Closer to the way in which people think.

ENTITY-RELATIONSHIP DIAGRAMS
An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity sets, attributes and relationship sets.
  Entity sets are denoted by rectangles.
  Attributes are denoted by ovals.
  Relationship sets are denoted by diamonds.
Edges (lines) connect entity sets to their attributes and relationship sets to their entity sets.
[Figure: example ER diagram with the Works_In relationship set]

ENTITIES AND ENTITY SETS
Entity: a real-world object distinguishable from other objects, e.g. employee Miller.
An entity can be a physical or an abstract object.
An entity is associated with the attributes describing its properties.
Attribute values are atomic, e.g. strings, integers or real numbers. Each contains a single piece of information.
  Full name? Age?
Entity set: a collection of similar entities, e.g. all employees.

ENTITIES AND ENTITY SETS
All entities in an entity set have the same set of attributes. (At least, for the moment!)
Each entity set has a key, i.e. a minimal set of attributes that uniquely identifies an entity of this set.
Key attributes are underlined.
Each attribute has a domain, i.e. the set of all possible attribute values.
[Figure: entity set with attributes first, last, birthdate, salary]
A key must be unique across all possible (not just the current) entities of its set.
A key can consist of more than one attribute.
There can be more than one key for a given entity set, but we choose one (the primary key) for the ER diagram.

RELATIONSHIPS AND RELATIONSHIP SETS
Relationship: an association among two or more entities, e.g. Miller works in the Pharmacy department.
Relationship set: a collection of similar relationships among two or more entity sets.
[Figure: Works_In relationship set]

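The key requirement stated under ENTITIES AND ENTITY SETS can be made concrete with a small sketch. It is an illustration only: the EntitySet class, the attribute names ssn, name and lot, and the sample values are assumptions made for this example and are not part of the ER notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    ssn: str   # assumed key attribute
    name: str  # assumed descriptive attribute
    lot: int   # assumed descriptive attribute


class EntitySet:
    """A collection of entities that enforces uniqueness of one key attribute."""

    def __init__(self, key):
        self.key = key
        self._by_key = {}

    def add(self, entity):
        k = getattr(entity, self.key)
        if k in self._by_key:
            # Two entities may never share a key value.
            raise ValueError(f"duplicate key {k!r}")
        self._by_key[k] = entity

    def __len__(self):
        return len(self._by_key)


employees = EntitySet(key="ssn")
employees.add(Employee("12345678", "John Miller", 30))
employees.add(Employee("14789632", "Paul Li", 25))

# A hypothetical third employee that reuses an existing key is rejected.
try:
    employees.add(Employee("12345678", "Jane Doe", 12))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Note that the check only guards the current instance; the statement that a key must be unique across all possible entities is a design decision that no instance check can prove.
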
RELATIONSHIPS AND RELATIONSHIP SETS
An n-ary relationship set R relates n entity sets E1, ..., En.
Each relationship in R involves entities e1 ∈ E1, ..., en ∈ En.
Binary relationship sets are the most common.
The same entity set can participate in different relationship sets, or in different roles in the same set.
[Figure: Reports_To relationship set with roles supervisor and subordinate]

Entity
  An object that is distinguishable from other objects.
  Ex: your home address, CMPT 354
Entity set
  All home addresses
  Collection of CMPT courses
  Each entity set has one to many entities.
  Each entity can belong to multiple entity sets.
Relationship
  Joe lives at 45 Main St.
  Mary lives at 89 Wood Ave.
Relationship set
  Person lives at home address

Relationship sets can also have attributes.
Useful for properties that cannot reasonably be associated with one of the participating entity sets.
[Figure: Works_In relationship set with a descriptive attribute]

INSTANCES OF AN ER DIAGRAM
An entity set contains a set of entities.
Each entity has one value for each of its attributes.
No duplicate instances (not a technical limit). What to do??
Example instance:
  12345678   John Miller   30
  14789632   Paul Li       25
  ...

A relationship set contains a set of relationships, each relating a set of entities, one from each of the participating entity sets.
Components are entities, not attribute values.
No duplicates (not a technical limit).
Works_In instance:
  Employee   Department
  12345678   1
  14789632   1
  56756322   2
  ...

RELATIONSHIPS AND RELATIONSHIP SETS
Multiway relationship sets (n > 2) are used whenever binary relationships cannot capture the application semantics.
Infrequent.
[Figure: multiway relationship set Works_For with entity sets including Tasks (tid, description) and Projects (pid)]

RELATIONSHIPS AND RELATIONSHIP SETS
[Figure: multiway relationship set Works_For with entity sets including Tasks (tid, description) and Projects (pid)]
Works_For instance:
  Employee   Tasks (tid)   Project (pid)
  12345678   1000          101
  12345678   1500          106
  56756322   1500          106
  ...

KEY CONSTRAINTS
A key constraint on a relationship set specifies that each entity of the marked entity set participates in at most one relationship of this relationship set.
The entity set is marked with an arrow.
[Figure: Manages relationship set with a key constraint]

MULTIPLICITY OF RELATIONSHIPS
Works_In: an employee can work in many departments; a dept can have many employees.
Manages: each dept has at most one manager, who may manage several (many) departments.
[Figure: Works_In and Manages diagrams, with edge labels one and many]

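The key constraint defined above can be checked on a relationship set instance with a few lines of code. This is a minimal sketch, assuming a binary relationship set stored as a set of tuples; the function name satisfies_key_constraint and the sample Manages pairs are made up for illustration.

```python
from collections import Counter


def satisfies_key_constraint(relationships, position):
    """Return True if every entity at `position` occurs in at most one relationship."""
    counts = Counter(r[position] for r in relationships)
    return all(c <= 1 for c in counts.values())


# Manages as (manager key, department key) pairs; the values are hypothetical.
manages = {("12345678", 1), ("12345678", 2), ("56756322", 3)}

# Each department has at most one manager: key constraint on the department side.
dept_has_one_manager = satisfies_key_constraint(manages, position=1)

# A manager may manage several departments, so the same check fails on that side.
manager_has_one_dept = satisfies_key_constraint(manages, position=0)
```

An instance that happens to pass the check does not establish the constraint; the constraint is a statement about all legal instances and has to come from the requirements.
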
MULTIPLICITY OF RELATIONSHIPS
The different types of (binary) relationships from a multiplicity point of view:
  One-to-one
  One-to-many
  Many-to-one
  Many-to-many
[Figure: one diagram for each of the four multiplicity types]

PARTICIPATION CONSTRAINTS
A participation constraint on a relationship set specifies that each entity of the marked entity set participates in at least one relationship of this relationship set.
The entity set is marked with a bold line.
[Figure: Manages and Works_In relationship sets with a participation constraint]

WEAK ENTITIES
A weak entity exists only in the context of another (owner) entity.
The weak entity can be identified uniquely only by considering the primary key of the owner together with its own partial key.
The owner entity set and the weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).
The weak entity set must have total participation in this supporting relationship set.
Ex: If there is no employee, there cannot be a dependent.
[Figure: weak entity set Dependents with supporting relationship set Policy (attribute cost)]

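The participation constraint and the weak entity rules above can be sketched as follows. The attribute names (an employee key and a dependent name used as partial key), the helper has_total_participation and all sample values are assumptions made for this illustration.

```python
def has_total_participation(entity_keys, relationships, position):
    """Return True if every entity appears in at least one relationship."""
    participating = {r[position] for r in relationships}
    return set(entity_keys) <= participating


# Owner entity set, represented by its (hypothetical) key values.
employees = {"12345678", "14789632", "56756322"}

# Weak entity set Dependents: each entity is identified by
# (owner key, partial key). The same partial key "Joe" can occur twice
# because the owners differ.
dependents = {("12345678", "Joe"), ("14789632", "Joe")}

# Supporting relationship set Policy as (owner key, dependent) pairs.
policy = {(dep[0], dep) for dep in dependents}

# Every dependent takes part in Policy: total participation of the weak entity set.
dependents_total = has_total_participation(dependents, policy, position=1)

# Not every employee has a dependent, so employees participate only partially.
employees_total = has_total_participation(employees, policy, position=0)

# No dependent exists without an owner in the employee entity set.
owners_exist = all(owner in employees for owner, _name in dependents)
```
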
SUBCLASSES
Sometimes an entity set contains entities that share many, but not all, properties with the rest of the entity set. This leads to hierarchies.
A ISA B: every A entity is also considered to be a B entity.
A specializes B, B generalizes A.
A is called the subclass, B is called the superclass.
A subclass inherits the attributes of its superclass and may define additional attributes.
[Figure: ISA hierarchy with subclasses Hourly_Emps and Contract_Emps]

[Figure: ISA hierarchy; Hourly_Emps has attributes hourly_wages and hours_worked, Contract_Emps has attribute contractid]
Hourly_Emps and Contract_Emps inherit the key (!) and the other attributes of their superclass.
They define the additional attributes hourly_wages and hours_worked (Hourly_Emps) and contractid (Contract_Emps).

Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity?
  YES: Hourly_Emps OVERLAPS Contract_Emps.
Covering constraints: Does every entity of the superclass have to be either an Hourly_Emps or a Contract_Emps entity?
  NO, unless Hourly_Emps AND Contract_Emps COVER the superclass.

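Attribute inheritance and the overlap and covering constraints above can be illustrated with a short sketch. The superclass attributes ssn, name and lot are assumed for the example, as are the helper functions and the sample key values; only hourly_wages, hours_worked and contractid come from the slide.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Employee:
    ssn: str   # assumed key attribute
    name: str  # assumed
    lot: int   # assumed


@dataclass(frozen=True)
class HourlyEmp(Employee):      # HourlyEmp ISA Employee
    hourly_wages: float
    hours_worked: float


@dataclass(frozen=True)
class ContractEmp(Employee):    # ContractEmp ISA Employee
    contractid: str


# The subclass carries the inherited attributes first, then its own.
hourly_attrs = [f.name for f in fields(HourlyEmp)]


def overlaps(a, b):
    """True if some entity belongs to both subclasses."""
    return bool(a & b)


def covers(superclass, *subclasses):
    """True if every superclass entity belongs to at least one subclass."""
    return set().union(*subclasses) == superclass


# Hypothetical instance, given as sets of key values.
employee_keys = {"111", "222", "333"}
hourly_keys = {"111", "222"}
contract_keys = {"222"}

overlap_found = overlaps(hourly_keys, contract_keys)
covering_holds = covers(employee_keys, hourly_keys, contract_keys)
```

In this instance employee "222" is in both subclasses, so it is only legal if the overlap is allowed, and employee "333" is in neither, so it is only legal if no covering constraint is declared.
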
SUBCLASSES
There are several good reasons for using ISA relationships and subclasses:
  Do not have to redefine all the attributes.
  Can add descriptive attributes specific to a subclass.
  Can identify entity sets that participate in a relationship set as precisely as possible.
ISA relationships form a tree structure (taxonomy) with one entity set serving as root.

DESIGN PRINCIPLES
Faithfulness
  Design must be faithful to the specification / reality.
  Relevant aspects of reality must be represented in the model.
Avoiding redundancy
  Redundant representation blows up the ER diagram and makes it harder to understand.
  Redundant representation wastes storage.
  Redundancy may lead to inconsistencies in the database.
Keep it simple
  The simpler, the easier to understand for some (external) reader of the ER diagrams.
  Avoid introducing more elements than necessary.
  If possible, prefer attributes over entity sets and relationship sets.
Formulate constraints as far as possible
  A lot of data semantics can (and should) be captured.
  But some constraints cannot be captured in ER diagrams.

HIGH-LEVEL DESIGN WITH ER MODEL
Major design choices
  Should a concept be modeled as an entity or an attribute? As an entity or a relationship?
  What relationships to use: binary or ternary?
Should address be an attribute of an employee or an entity of its own (connected to the employee by a relationship)?
Depends upon the use we want to make of address information, and the semantics of the data:
  If we have several addresses per employee, address must be an entity (attributes cannot be set-valued).

ENTITY VS. ATTRIBUTE
Works_In2 does not allow an employee to work in the same department for two or more periods (why?).
We want to record several values of the descriptive attributes for each instance of this relationship.

ENTITY VS. RELATIONSHIP
[Figure: Manages2 relationship set with attribute dbudget]
This ER diagram is o.k. if a manager gets a separate discretionary budget for each dept.
But what if a manager gets a discretionary budget that covers all managed depts?
  Redundancy of dbudget, which is stored for each dept managed by the manager.
  Misleading: suggests dbudget is tied to the managed dept.

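The "(why?)" under ENTITY VS. ATTRIBUTE comes down to how a relationship is identified: by its participating entities alone, so a relationship set holds at most one relationship per (employee, department) pair. A minimal sketch, assuming the descriptive attributes are a start and an end date; the function names and the dates are hypothetical.

```python
# Works_In2 with descriptive attributes: one relationship per
# (employee key, department key) pair, carrying one (start, end) value.
works_in2 = {}


def record_period(emp, dept, start, end):
    # The pair identifies the relationship, so a second period for the
    # same pair replaces the first instead of being stored next to it.
    works_in2[(emp, dept)] = (start, end)


record_period("12345678", 1, "2001-01", "2003-06")
record_period("12345678", 1, "2005-02", "2007-09")

# Alternative design: model the period as an entity of its own, so that it
# becomes a participant of the relationship and part of its identity.
works_in3 = set()


def record_period_as_entity(emp, dept, start, end):
    works_in3.add((emp, dept, (start, end)))


record_period_as_entity("12345678", 1, "2001-01", "2003-06")
record_period_as_entity("12345678", 1, "2005-02", "2007-09")

periods_kept_as_attribute = len(works_in2)
periods_kept_as_entity = len(works_in3)
```

With the period as an attribute only the later period survives; with the period as an entity both are kept.
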
ENTITY VS. RELATIONSHIP
What about this diagram?
  Employees who are not managers will have dbudget = null?
The following ER diagram is more appropriate and avoids the above problems!
  Each manager now has a discretionary budget.

BINARY VS. TERNARY RELATIONSHIPS
[Figure: ternary relationship set Covers with Dependents and Policies (policyid, cost)]
The ER diagram says:
  An employee can own several policies.
  Each policy can be owned by several employees.
  Each dependent can be covered by several policies.
If each policy is owned by just one employee:
  A key constraint on Policies would mean a policy can only cover 1 dependent! (Each policy could then appear in only one Covers relationship.)
  Bad design!

BINARY VS. TERNARY RELATIONSHIPS
[Figure: binary relationship sets Purchaser and Beneficiary with Dependents and Policies (policyid, cost)]
This diagram is a better design.
  A policy can only exist for an employee.
  Dependents only exist if they are covered by a policy.

BINARY VS. TERNARY RELATIONSHIPS
The previous example illustrated a case where two binary relationships were better than one ternary relationship.
An example in the other direction: a ternary relationship set Contracts relates the entity sets Parts, Departments and Suppliers, and has the descriptive attribute qty.
No combination of binary relationships is an adequate substitute:
  S can-supply P, D needs P, and D deals-with S together do not imply that D has agreed to buy P from S.
  How do we record qty?
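The Contracts argument can be checked on a small made-up instance: split the ternary relationship set into the three binary ones, recombine them, and a contract appears that was never agreed. The department, part and supplier names and the quantities below are hypothetical.

```python
# Ternary relationship set Contracts: (department, part, supplier) -> qty.
contracts = {
    ("d1", "p1", "s1"): 100,
    ("d1", "p2", "s2"): 50,
    ("d2", "p1", "s2"): 20,
}

# The three binary relationship sets obtained from the same facts.
needs = {(d, p) for d, p, s in contracts}        # D needs P
can_supply = {(s, p) for d, p, s in contracts}   # S can-supply P
deals_with = {(d, s) for d, p, s in contracts}   # D deals-with S

# Everything the binary sets are consistent with.
reconstructed = {
    (d, p, s)
    for d, p in needs
    for s, p2 in can_supply
    if p2 == p and (d, s) in deals_with
}

# Combinations implied by the binary sets that are not actual contracts.
spurious = reconstructed - set(contracts)
```

Here d1 needs p1, s2 can supply p1, and d1 deals with s2, yet d1 never agreed to buy p1 from s2. The binary sets also offer no place for qty, which belongs to the whole (department, part, supplier) combination.
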