IS 263 Database Concepts
Lecture 1: Database Design
Instructor: Henry Kalisti
Department of Computer Science and Engineering

The Entity-Relationship Model
Introduction to Data Modeling
Semantic data models attempt to capture the meaning of a database. In practice, they provide an approach to conceptual data modeling. Several semantic data models have been proposed over the years. By far the most common is the entity-relationship data model, usually referred to simply as the E-R data model.
The E-R model is often used as a means of communication between database designers and end users during the development stages of a database. It contains an extensive set of modeling tools, some of which we will not cover: the primary objective of this course is to give you some insight into conceptual database design, not to teach all the ins and outs of the E-R model.
Another conceptual modeling approach that is becoming more common is the Object Definition Language (ODL), an object-oriented approach to database design that is emerging as a standard for object-oriented database systems.
Database Design
The database design process can be divided into six basic steps. Semantic data models are most relevant to only the first three of these steps.
1. Requirements Analysis
2. Conceptual Database Design
3. Logical Database Design
4. Schema Refinement
5. Physical Database Design
6. Security Design
Requirements Analysis
The first step in designing a database application is to understand:
- What data is to be stored in the database,
- What applications must be built on top of it, and
- What operations are most frequent and subject to performance requirements.
This is often an informal process that involves discussions with user groups, studying the current environment, and examining existing applications that the database system is expected to replace or complement.
Requirements Analysis Quiz
Choose a domain for which you need to develop a database. Do a requirements analysis for the database and outline the following:
- What data is to be stored in the database,
- What applications must be built on top of it, and
- What operations are most frequent and subject to performance requirements.
Databases are everywhere! Banking, ticket reservations, customer records, sales records, product records, inventories, employee records, address books, demographic records, student records, course plans, schedules, surveys, test suites, research data, genome bank, medicinal records, time tables, news archives, sports results, e-commerce, user authentication systems, web forums, www.imdb.com, the world wide web.
Conceptual Database Design
The information gathered in the requirements analysis step is used to develop a high-level description of the data to be stored in the database, along with the constraints that are known to hold on this data.
- Creation of the conceptual schema
- Concise description of the users' data requirements
- Uses a high-level conceptual data model, e.g. the Entity-Relationship model
- Data model operations are used to specify the user operations found in functional analysis
- Compatibility check and possible modification
Logical Database Design
A DBMS must be selected to implement the database and to convert the conceptual database design into a database schema within the data model of the chosen DBMS.
- Logical design / data model mapping
- Uses an implementation data model, e.g. the relational data model
- The conceptual schema is transformed from the high-level data model to the implementation data model
Schema Refinement
In this step the schemas developed in the previous step are analyzed for potential problems. It is in this step that the database is normalized. Normalization of a database is based upon some elegant and powerful mathematical theory. We will discuss normalization in later classes.
Physical Database Design
At this stage in the design of a database, potential workloads and access patterns are simulated to identify potential weaknesses in the conceptual database. This will often cause the creation of additional indices and/or clustering relations. In critical situations, the entire conceptual model will need restructuring.
Physical design:
- Internal storage structures, access paths, and file organization are specified
- Application programs are designed and implemented
Security Design
Different user groups are identified and their roles are analyzed so that access patterns to the data can be defined. The illustration on the following page summarizes the main phases of database design.
[Figure: the main phases of database design]
The Entity-Relationship approach
- Design your database by drawing a picture of it (an Entity-Relationship diagram).
- The diagram lets us sketch the design of a database informally, which is good when communicating with customers.
- (More or less) mechanical methods convert the diagram to relations, which means the diagram can serve as a formal specification as well.
The E-R model employs three basic notions: entity sets, relationship sets, and attributes.
Entities and entity sets
- Entity = thing or object (course, room, etc.)
- Entity set = collection of similar entities (all courses, all rooms, etc.)
- Entities are drawn as rectangles, e.g. a rectangle labelled Course.
Attributes
Entities have attributes. All entities in an entity set have the same attributes (though not the same values). Attributes are drawn as ovals connected to the entity by a line.
Attributes
Example: the entity Course drawn with the attributes code, name and teacher. Keys are underlined (here: code).
A course has three attributes: the unique course code, a name, and the name of the teacher. All course entities have values for these three attributes, e.g. (IS263, Database Concepts, Henry Kalisti).
Translation to relations
An E-R diagram can be mechanically translated to a relational database schema. An entity becomes a relation, the attributes of the entity become the attributes of the relation, and keys become keys.
Example: the entity Course with the attributes code, name and teacher becomes
  Courses(code, name, teacher)
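The translation above can be sketched in code. The following is a minimal illustration only, assuming Python's built-in sqlite3 module and an in-memory database; the column types, the NOT NULL constraints and the helper insert_duplicate_code are assumptions added for the example, not part of the lecture.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# The entity Course becomes the relation Courses. Its attributes become
# columns, and the key attribute (code) becomes the primary key.
# Column types and NOT NULL are assumptions made for this sketch.
conn.execute(
    """
    CREATE TABLE Courses (
        code    TEXT PRIMARY KEY,
        name    TEXT NOT NULL,
        teacher TEXT NOT NULL
    )
    """
)

# One course entity becomes one row (tuple) of the relation.
conn.execute(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    ("IS263", "Database Concepts", "Henry Kalisti"),
)


def insert_duplicate_code():
    """Try to add a second course with an existing code; return True if rejected."""
    try:
        conn.execute(
            "INSERT INTO Courses VALUES (?, ?, ?)",
            ("IS263", "Another Course", "Another Teacher"),
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Because code is the key, calling insert_duplicate_code() should be rejected by the database, which is the relational counterpart of underlining code in the diagram.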
A note on naming policies
- My view: a rectangle in an E-R diagram represents an entity, hence it is put in the singular (e.g. Course). This fits the intuition behind attributes and relationships better.
- The book: a rectangle represents an entity set, hence it is put in the plural (e.g. Courses). This is easier to translate mechanically to relations.
Definitions
- Entity: an aggregation of a number of data elements; each data element is an attribute of the entity
- Entity type: a class of entities with the same attributes
- Relationship: an association between two or more entities that is of particular interest
E-R Model Notation
[Figure: E-R notation symbols]

E-R Model Notation (legend; symbols not reproduced)
- attribute: discriminating attribute of a weak entity set
- E1 R E2: 1:1 cardinality from E1 to E2
- E1 R E2: 1:M cardinality from E1 to E2
- E1 1 R M E2: alternate form for 1:M cardinality from E1 to E2
- E1 R E2: M:1 cardinality from E1 to E2
- E1 R E2: M:M cardinality from E1 to E2
- E1 N R M E2: alternate form for M:M cardinality from E1 to E2
Example 1: E-R Diagram (ERD)
[Figure: entity sets customer and loan connected by the relationship borrower; attribute labels shown: customer-id, customer-name, customer-street, customer-city, amount]
Example 2: E-R Diagram (ERD)
[Figure: entity set customer with the attributes customer-id, customer-name (first-name, middle-name, last-name), address (street (street-num, street-name, apartment-num), city, state, zipcode), phone-num, date-of-birth and age]
What is Conceptual Database Design?
- The process of describing the data, the relationships between the data, and the constraints on the data.
- After analysis: gather all the essential data required and understand how the data are related.
- The focus is on the data, rather than on the processes.
- The output of conceptual database design is a Conceptual Data Model (+ Data Dictionary).
Gathering Information for Conceptual Data Modeling
Two perspectives:
- Top-down: the data model is derived from an intimate understanding of the business.
- Bottom-up: the data model is derived by reviewing specifications and business documents.
Entity-Relationship (ER) Modeling
- ER Modeling is a top-down approach to database design.
- Entity Relationship (ER) Diagram: a detailed, logical representation of the entities, associations and data elements for an organization or business.
- The notation uses three main constructs: data entities, relationships, and attributes.
- Two types of notation that can be used are the Chen Model and the Crow's Foot Model.
Chen Notation
- Entity (labelled EntityName): a person, place, object, event or concept about which data is to be maintained. Represents a set or collection of objects in the real world that share the same properties.
- Relationship (labelled with a verb phrase): an association between the instances of one or more entity types.
- Attribute (labelled AttributeName): a named property or characteristic of an entity.
Crow's Foot Notation
- Entity: a box labelled EntityName
- Attribute: a box labelled EntityName that contains the list of attributes
- Relationship: a line labelled with a verb phrase
Entities
Examples of entities:
- Person: EMPLOYEE, STUDENT, PATIENT
- Place: STORE, WAREHOUSE
- Object: MACHINE, PRODUCT, CAR
- Event: SALE, REGISTRATION, RENEWAL
- Concept: ACCOUNT, COURSE
Guidelines for naming and defining entity types:
- An entity type name is a singular noun.
- An entity type name should be descriptive and specific.
- An entity type name should be concise.
- Event entity types should be named for the result of the event, not the activity or process of the event.
Attributes
Example of an entity type and its associated attributes:
  STUDENT: Student_ID, Student_Name, Home_Address, Phone_Number, Major
Guidelines for naming attributes:
- An attribute name is a noun.
- An attribute name should be unique.
- To make an attribute name unique and clear, each attribute name should follow a standard format.
- Similar attributes of different entity types should use similar but distinguishing names.
Attributes
- Attribute: a property of an entity set. Each entity in the set has the same properties.
- Domain: the set of permitted values for each attribute.
- Attribute types: simple vs. composite, single-valued vs. multi-valued, derived.
Attributes in the E-R Model
As used in the E-R model, an attribute can be characterized by the following attribute types.
Simple or Composite: a simple attribute contains no subparts, while a composite attribute contains subparts. For example, consider the attribute name. If name is a simple attribute, then the first name, middle name, and last name together must be treated as one atomic, indivisible value. If name is a composite attribute, then we have the option of dealing with the entire name as a whole or with just one of its subparts.
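The difference between treating name as a whole and by subpart can be sketched as follows. This is an illustration only, assuming Python dataclasses; the class names Name and Customer, the field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Composite attribute: a name made of three subparts (hypothetical fields)."""
    first: str
    middle: str
    last: str

    def full(self) -> str:
        """Return the whole name as one string, skipping an empty middle name."""
        return " ".join(p for p in (self.first, self.middle, self.last) if p)


@dataclass(frozen=True)
class Customer:
    customer_id: str  # simple attribute: no subparts
    name: Name        # composite attribute: usable as a whole or by subpart


# Sample values are made up for the example.
c = Customer("C-001", Name("Mary", "Ann", "Smith"))
whole_name = c.name.full()  # the entire name as a whole
last_name = c.name.last     # only one of the subparts
```

Had name been modeled as a simple attribute, it would be a single string and last_name could not be read without parsing it.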
Simple or Composite attribute
[Figure: simple and composite attributes]
Single-valued or Multi-valued
A single-valued attribute may have at most one value at any particular time instant. A multi-valued attribute may have several different values at any particular time instant. For example, consider the attribute phone-number of the entity set student. At any given time instant a student may have several different phone numbers, so a multi-valued attribute models the student most accurately.
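One common way to store a multi-valued attribute in relational tables is a separate relation with one row per value. The lecture does not prescribe this mapping, so the sketch below is an assumption; the table names Students and StudentPhones, the column names, and the sample data are hypothetical, and Python's built-in sqlite3 module is used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Students (
        student_id TEXT PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- One row per (student, phone number) pair holds the multi-valued attribute.
    CREATE TABLE StudentPhones (
        student_id   TEXT NOT NULL REFERENCES Students(student_id),
        phone_number TEXT NOT NULL,
        PRIMARY KEY (student_id, phone_number)
    );
    INSERT INTO Students VALUES ('S1', 'Sample Student');
    INSERT INTO StudentPhones VALUES ('S1', '555-0100'), ('S1', '555-0101');
    """
)


def phone_numbers(student_id):
    """Return all phone numbers currently recorded for one student, sorted."""
    rows = conn.execute(
        "SELECT phone_number FROM StudentPhones "
        "WHERE student_id = ? ORDER BY phone_number",
        (student_id,),
    )
    return [row[0] for row in rows]
```

A single-valued phone-number would instead be one column of Students, which could hold at most one value per student.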
Entity with a multivalued attribute and derived attribute
What's wrong with this?
- Derived: from date employed and current date
- Multivalued: an employee can have more than one skill
Derived Attribute
This is an attribute whose value is derived (computed) from the values of other related attributes or entities. For example, suppose that the bank customer entity set contains an attribute loans-held, which represents the number of loans a customer has from the bank. The value of this attribute can be computed for each customer by counting the number of loan entities associated with that customer.
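The loans-held computation can be sketched as a count over the associated loans instead of a stored column. This is a minimal illustration with Python's built-in sqlite3 module; the table and column names and the sample rows are hypothetical, and for brevity each loan is assumed to belong to exactly one customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (customer_id TEXT PRIMARY KEY);
    -- Simplifying assumption: every loan belongs to exactly one customer.
    CREATE TABLE Loans (
        loan_id     TEXT PRIMARY KEY,
        customer_id TEXT NOT NULL REFERENCES Customers(customer_id)
    );
    INSERT INTO Customers VALUES ('C1'), ('C2');
    INSERT INTO Loans VALUES ('L1', 'C1'), ('L2', 'C1');
    """
)


def loans_held(customer_id):
    """Derive loans-held by counting the loans associated with the customer."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Loans WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0]
```

Since loans-held is computed on demand, it cannot drift out of step with the loans actually recorded.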
Identifier Attributes
Candidate key:
- An attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- Some entities may have more than one candidate key. One candidate key for EMPLOYEE is Employee_ID; a second is the combination of Employee_Name and Address.
- If there is more than one candidate key, we need to choose one.
Identifier:
- A candidate key that has been selected as the unique identifying characteristic for an entity type.
Referential Attributes
A referential attribute makes reference to another instance in another table. In the example below, DeptID ties the Lecturer entity to another entity, Department. Each row is an instance of Lecturer.

  Name  IdNum  DeptID  Email
  Ali   105    LG      ali@a.com
  Mary  106    IT      mary@a.com
  John  107    ENG     john@a.com
  Lim   108    IT      lim@a.com
Example of Identifier Attribute
An identifier is also referred to as the Primary Key (PK).
Example: the entity Staff has the attributes StaffID, Name, Gender and IC, with StaffID as the PK.
Translation to relations
A relationship between two entities is translated into a relation whose attributes are the keys of the related entities.
Example: Course (code, name, teacher) and Room (name, #seats) are connected by the relationship LecturesIn.
  Courses(code, name, teacher)
  Rooms(name, #seats)
  LecturesIn(code, name)
References
  Courses(code, name, teacher)
  Rooms(name, #seats)
  LecturesIn(code, name)
We must ensure that the codes used in LecturesIn match those in Courses. To do so, introduce references between relations, e.g. the course codes used in LecturesIn reference those in Courses:
  Courses(code, name, teacher)
  Rooms(name, #seats)
  LecturesIn(code, name)
    code -> Courses.code
    name -> Rooms.name
Foreign keys
Usually, a reference points to the key of another relation. E.g. name in LecturesIn references the key name in Rooms. name is said to be a foreign key in LecturesIn.
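The references from LecturesIn to Courses and Rooms can be sketched as foreign key constraints. This is an illustration with Python's built-in sqlite3 module, under several assumptions: #seats is renamed seats to get a plain SQL identifier, the pair (code, name) is taken as the key of LecturesIn, and the room data is made up. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Courses (code TEXT PRIMARY KEY, name TEXT, teacher TEXT);
    CREATE TABLE Rooms (name TEXT PRIMARY KEY, seats INTEGER);
    -- The relationship LecturesIn holds the keys of the related entities.
    -- Taking (code, name) as its key is an assumption of this sketch.
    CREATE TABLE LecturesIn (
        code TEXT REFERENCES Courses(code),  -- code -> Courses.code
        name TEXT REFERENCES Rooms(name),    -- name -> Rooms.name
        PRIMARY KEY (code, name)
    );
    INSERT INTO Courses VALUES ('IS263', 'Database Concepts', 'Henry Kalisti');
    INSERT INTO Rooms VALUES ('Room A', 40);
    INSERT INTO LecturesIn VALUES ('IS263', 'Room A');
    """
)


def lecture_in_unknown_room():
    """Try to schedule a course in a room that is not in Rooms; True if rejected."""
    try:
        conn.execute(
            "INSERT INTO LecturesIn VALUES (?, ?)", ("IS263", "No Such Room")
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

With the constraint in place, the database itself guarantees that every name in LecturesIn matches a name in Rooms.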
Relationships
- A relationship is an association between instances of one or more entity types that is of interest.
- It is given a name that describes its function.
- The relationship name is an active or a passive verb.
Example: Author writes Book (relationship name: writes).
- An author writes one or more books.
- A book can be written by one or more authors.
Degree of Relationships
Degree: the number of entity types that participate in a relationship. Three cases:
- Unary: between two instances of one entity type
- Binary: between the instances of two entity types
- Ternary: among the instances of three entity types
Cardinality and Connectivity
- Connectivity: relationships can be classified as one to one, one to many, or many to many.
- Cardinality: the minimum and maximum number of instances of entity B that can (or must) be associated with each instance of entity A.
Cardinality and Connectivity
[Figure: Professor teaches Class]
A professor teaches a class, or a class is taught by a professor. How many?
Connectivity
- Chen Model: 1 represents one; M represents many.
- Crow's Foot symbols: one; mandatory one, meaning (1,1); many; one or many; optional (covered later).
Cardinality and Connectivity
[Figure: Professor teaches Class, drawn twice. First diagram: connectivity 1 and M, cardinalities (1,4) and (1,1). Second diagram: cardinalities (1,1) and (1,4).]
Binary Relationships
1:M relationship
- The relational modeling ideal
- Should be the norm in any relational database design
[Figure: the 1:M relationship between PAINTER and PAINTING]

Binary Relationships
[Figure: the implemented 1:M relationship between PAINTER and PAINTING]
Binary Relationships
1:1 relationship
- Should be rare in any relational database design
- A single entity instance in one entity class is related to a single entity instance in another entity class
- Could indicate that two entities actually belong in the same table

Binary Relationships
[Figure: the 1:1 relationship between PROFESSOR and DEPARTMENT]
Binary Relationships
M:N relationships
- Must be avoided because they lead to data redundancies.
- Can be implemented by breaking the relationship up into a set of 1:M relationships.
- The problems inherent in an M:N relationship can be avoided by creating a composite entity (bridge entity), which links the tables that were originally related in the M:N relationship.
- The composite entity structure includes, as foreign keys, at least the primary keys of the tables that are to be linked.
The M:N Relationship Between STUDENT and CLASS
[Figure: the students Bowser and Smithson linked to the classes Accounting 1 (ACCT-211), Intro to Microcomputing (CIS-220) and Intro to Statistics (QM-261)]
The tables have many redundancies!
[Figure: table contents in which combinations of CLASS_CODE and STU_NUM are repeated]
Changing the M:N relationship to two 1:M relationships
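The split of the M:N relationship into two 1:M relationships can be sketched with a bridge table. This is an illustration with Python's built-in sqlite3 module; the bridge table name ENROLL, the columns STU_LNAME and CLASS_NAME, the student numbers, and the enrolments themselves are hypothetical, since the original figure is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT NOT NULL);
    CREATE TABLE CLASS (CLASS_CODE TEXT PRIMARY KEY, CLASS_NAME TEXT NOT NULL);
    -- Bridge (composite) entity: its foreign keys are the primary keys of
    -- the two linked tables, and together they form its own primary key.
    CREATE TABLE ENROLL (
        STU_NUM    INTEGER NOT NULL REFERENCES STUDENT(STU_NUM),
        CLASS_CODE TEXT    NOT NULL REFERENCES CLASS(CLASS_CODE),
        PRIMARY KEY (STU_NUM, CLASS_CODE)
    );
    INSERT INTO STUDENT VALUES (1, 'Bowser'), (2, 'Smithson');
    INSERT INTO CLASS VALUES
        ('ACCT-211', 'Accounting 1'),
        ('CIS-220', 'Intro to Microcomputing'),
        ('QM-261', 'Intro to Statistics');
    -- Hypothetical enrolments, one row per (student, class) pair.
    INSERT INTO ENROLL VALUES
        (1, 'ACCT-211'), (1, 'CIS-220'),
        (2, 'ACCT-211'), (2, 'QM-261');
    """
)


def classes_of(stu_num):
    """STUDENT 1:M ENROLL: all class codes one student is enrolled in."""
    rows = conn.execute(
        "SELECT CLASS_CODE FROM ENROLL WHERE STU_NUM = ? ORDER BY CLASS_CODE",
        (stu_num,),
    )
    return [row[0] for row in rows]


def students_in(class_code):
    """CLASS 1:M ENROLL: all student numbers enrolled in one class."""
    rows = conn.execute(
        "SELECT STU_NUM FROM ENROLL WHERE CLASS_CODE = ? ORDER BY STU_NUM",
        (class_code,),
    )
    return [row[0] for row in rows]
```

Student and class details are each stored once; only the small key pairs in ENROLL are repeated.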
Mandatory vs. Optional Cardinalities
Specifies whether an instance must exist (mandatory) or can be absent (optional) in the relationship.
[Figure: Lecturer handles Class, connectivity 1:M, with cardinalities (1,1) and (0,N)]
- A Lecturer may handle zero or many classes.
- A class is handled by one and only one Lecturer.
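One possible relational reading of these two rules is sketched below with Python's built-in sqlite3 module. The mandatory side (a class has one and only one lecturer) is expressed as a NOT NULL foreign key, while the optional side (a lecturer may have zero classes) needs no constraint. Table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Lecturer (lecturer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- (1,1): every class must reference exactly one lecturer, so the
    -- foreign key column is NOT NULL.
    CREATE TABLE Class (
        class_id    INTEGER PRIMARY KEY,
        lecturer_id INTEGER NOT NULL REFERENCES Lecturer(lecturer_id)
    );
    INSERT INTO Lecturer VALUES (1, 'Lecturer A'), (2, 'Lecturer B');
    INSERT INTO Class VALUES (10, 1), (11, 1);
    """
)


def class_count(lecturer_id):
    """(0,N): a lecturer may handle zero or many classes."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Class WHERE lecturer_id = ?", (lecturer_id,)
    ).fetchone()
    return row[0]


def class_without_lecturer_rejected():
    """Try to store a class with no lecturer; return True if rejected."""
    try:
        conn.execute("INSERT INTO Class VALUES (?, ?)", (12, None))
    except sqlite3.IntegrityError:
        return True
    return False
```

Lecturer B has no classes and is still a valid row, whereas a class without a lecturer cannot be stored.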
Recursive relationships
A recursive relationship is a relationship in which the same entity participates more than once, in different roles. Relationships may be given role names to indicate the purpose that each participating entity plays in the relationship.
Recursive relationships
- Explicit role: needed when the participating entity sets in a relationship are not all distinct.
- Recursive relationship: the same entity set participates more than once in a relationship, in different roles.
- Example: the Supervision relationship type relates an employee to a supervisor, where both the employee and the supervisor are members of the same Employee entity set.
[Figure: Employee participates in Supervision twice, as Supervisor (1) and as Supervisee (N)]
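The Supervision example can be sketched as a table that references itself. This is an illustration with Python's built-in sqlite3 module; the column names emp_id and supervisor_id and the sample employees are hypothetical, and a NULL supervisor_id is assumed to mean "has no supervisor".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Both roles (Supervisor and Supervisee) are played by rows of the same
    -- table: supervisor_id references Employee itself.
    CREATE TABLE Employee (
        emp_id        INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        supervisor_id INTEGER REFERENCES Employee(emp_id)
    );
    INSERT INTO Employee VALUES (1, 'Employee A', NULL);
    INSERT INTO Employee VALUES (2, 'Employee B', 1), (3, 'Employee C', 1);
    """
)


def supervisees(emp_id):
    """Return the ids of the employees supervised by the given employee."""
    rows = conn.execute(
        "SELECT emp_id FROM Employee WHERE supervisor_id = ? ORDER BY emp_id",
        (emp_id,),
    )
    return [row[0] for row in rows]
```

The 1:N connectivity shows up in the data: one supervisor row can be referenced by many supervisee rows, but each supervisee row names at most one supervisor.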
How to Evaluate a Data Model?
A good data model has the following:
- Accuracy and completeness
- Non-redundancy
- Enforcement of business rules
- Data reusability
- Stability and flexibility
- Communication effectiveness
- Simplicity
A Common Mistake
Modeling the business processes or functions instead of the data. What data do we want to keep? We are interested in modeling the data, NOT the processes or functions that use or generate those data.
Example
Member (M) Searches (N) Books
Is this part of the data requirement? Are we interested in knowing which books the members searched for? If the answer is NO, then DO NOT include it as a relationship. Use other appropriate diagramming techniques, such as a Data Flow Diagram (DFD), to capture the business processes. Do not mix up ER Modeling with DFDs.
Example: Simple Hospital System
In a hospital system, each ward has many patients who are cared for by nurses assigned to the ward. Patients may require treatment by more than one specialist doctor. Draw an ERD for the simple hospital system.
Example: Simple Hospital System
- A ward has many patients (1:N)
- Patients are cared for by nurses (N:M)
- A ward has many nurses assigned (1:N)
- Patients require treatment by one or more doctors (N:M)

Example: Simple Hospital System
[Figure: ERD with the entities WARD, NURSE, PATIENT and DOCTOR and the relationships has assigned, accommodates, cares for and treats]
Example: Small College Database
A small college database has the following structure. A department has many lecturers. A lecturer belongs to only one department. The department offers many different courses, and many lecturers can teach on a single course. Lecturers can also teach on more than one course. Many students enroll for many courses. Draw an ERD for this system.
Example: Small College Database
[Figure: ERD with the entities DEPARTMENT, COURSE, LECTURER and STUDENT and the relationships offers, is_in, teaches_on and enrols]
Quiz
Draw the ERD for the following scenarios:
1. A player plays for a team.
2. Each patient has one or more patient histories; each instance of patient history belongs to one patient.
3. An employee may be recorded as having many jobs; a particular job may be recorded as having been held by many employees.
4. A person is a citizen of a country.
5. A customer may place many orders; an order is specific to a customer.
Exercise 1
In this exercise we will practice modeling domains through the use of E-R diagrams. Your task is to draw an E-R diagram of a database for scheduling classes. The following attributes should be represented in your tables:
- Course names
- Teacher names
- Teacher titles (optional, e.g. Professor)
- Class room names
- Number of students taking a course
- Day and time of classes
Classes in a particular course are given at the same day and time each week, possibly more than once each week. A teacher can hold several courses, but will only hold classes in the same class room. More than one teacher could have classes in the same class room (though of course not at the same time).