1 Introduction

Size: px
Start display at page:

Download "1 Introduction"

Transcription

1 1 Introduction Information Modeling Information = data + semantics Database systems The need for good design Information Modeling Approaches ORM ER UML Historical Background Computer language generations Database kinds The Relevant Skills Modeling Querying Summary 1

2 2 Chapter 1: Introduction 1.1 Information Modeling It s an unfortunate fact of life that names and numbers can sometimes be misinterpreted. This can prove costly, as experienced by senior citizens who had their social security benefits cut off when government agencies incorrectly pronounced them dead because of misreading DOD on hospital forms as date of death rather than the intended date of discharge. A more costly incident occurred in 1999 when NASA s $125 million Mars Climate Orbiter burnt up in the Martian atmosphere. Apparently, errors in its course settings arose from a failure to make a simple unit conversion. One team worked in U.S. customary units and sent its data to a second team working in metric, but no conversion was made. If a man weighs 180, does he need to go on a drastic diet? No if his mass is 180 lb, but yes if it s 180 kg. Data by itself is not enough. What we really need is information, the meaning or semantics of the data. Since computers lack common sense, we need to pay special attention to semantics when we use computers to model some aspect of reality. This book provides a modern introduction to database systems, with the emphasis on information modeling. At its heart is a very high level semantic approach that is fact-oriented in nature. If you model databases using either traditional or objectoriented approaches, you ll find that fact orientation lifts your thinking to a higher level, illuminating your current way of doing things. Even if you re a programmer rather than a database modeler, this semantic approach provides a natural and powerful way to design your data structures. A database is basically a collection of related data (e.g., a company s personnel records). When interpreted by humans, a database may be viewed as a set of related facts an information base. In the context of our semantic approach, we ll often use the popular term database instead of the more technical information base. Discovering the kinds of facts that underlie a business domain, and the rules that apply to the facts, is interesting and revealing. The quality of the database design used to capture these facts and rules is critical. Just as a house built from a good architectural plan is more likely to be safe and convenient for living, a well-designed database simplifies the task of ensuring that its facts are correct and easy to access. Let s review some basic ideas about database systems, and then see how things can go wrong if they are poorly designed. Each database models a business domain we use this term to describe any area of interest, typically a part of the real world. Consider a library database. As changes occur in the library (e.g., a book is borrowed) the database is updated to reflect these changes. This task could be performed manually using a card catalog, or be automated with an online catalog, or both. Our focus is on automated databases. Sometimes these are implemented by means of special-purpose computer programs, coded in a generalpurpose programming language (e.g., C#). More often, database applications are developed using a database management system (DBMS). This is a software system for maintaining databases and answering queries about them (e.g., DB2, Oracle, SQL Server). The same DBMS may handle many different databases.

3 1.1 Information Modeling 3 Typical applications use a database to house the persistent data, an in-memory object model to hold transient data, and a friendly user interface for users to enter and access data. All these structures deal with information and are best derived from an information model that clearly reveals the underlying semantics of the domain. Some tools can use information models to automatically generate not just databases, but also object models and user interfaces. If an application requires maintenance and retrieval of lots of data, a DBMS offers many advantages over manual record keeping. Data may be conveniently captured via electronic interfaces (e.g., screen forms), then quickly processed and stored compactly on disk. Many data errors can be detected automatically, and access rights to data can be enforced by the system. People can spend more time on creative design rather than on routine tasks more suited to computers. Finally, developing and documenting the application software can be facilitated by use of computer-assisted software engineering (CASE) tool support. In terms of the dominant employment group, the Agricultural Age was supplanted late in the 19th century by the Industrial Age, which is now replaced by the Information Age. With the ongoing information explosion and mechanization of industry, the proportion of information workers is steadily rising. Most businesses achieve significant productivity gains by exploiting information technology. Imagine how long a newspaper firm would last if it returned to the methods used before word processing and computerized typesetting. Apart from its enabling employment opportunities, the ability to interact efficiently with information systems empowers us to exploit their information content. Although most employees need to be familiar with information technology, there are vast differences in the amount and complexity of information management tasks required of these workers. Originally, most technical computer work was performed by computer specialists such as programmers and systems analysts. However, the advent of user-friendly software and powerful, inexpensive personal computers led to a redistribution of computing power. End users now commonly perform many information management tasks, such as spreadsheeting, with minimal reliance on professional computer experts. This trend toward more users driving their own computer systems rather than relying on expert chauffeurs does not eliminate the need for computer specialists. There is still a need for programming in languages such as C# and Java. However, there is an increasing demand for high level skills such as modeling complex information systems. The area of information systems engineering includes subdisciplines such as requirements analysis, database design, user interface design, and report writing. In one way or another, all these subareas deal with information. Since the database design phase selects the underlying structures to capture the relevant information, it is of central importance. To highlight the need for good database design, let s consider the task of designing a database to store movie details such as those shown in Table 1.1. The header of this table is shaded to help distinguish it from the rows of data. Even if the header is not shaded, we do not count it as a table row. The first row of data is fictitious.

4 4 Chapter 1: Introduction Table 1.1 An output report about some motion pictures. Movie# Movie Title Released Director Stars 1 Cosmology 2006 Lee Lafferty 2 Kung Fu Hustle 2004 Stephen Chow Stephen Chow 3 The Secret Garden 1987 Alan Grint Gennie James Barret Oliver 4 The Secret Garden 1993 Agnieszka Holland Kate Maberly Heydon Prowse 5 The DaVinci Code 2006 Ron Howard Tom Hanks Ian McKellen Audrey Tautou Different movies may have the same title (e.g., The Secret Garden). Hence movie numbers are used to provide a simple identifier. We interpret the data in terms of facts. For example, movie 5 has the title The DaVinci Code, was released in 2006, was directed by Ron Howard, and starred Tom Hanks, Ian McKellen, and Audrey Tautou. Movie 1, titled Cosmology, had no stars (it is a documentary). This table is an output report. It provides one way to view the data. This might not be the same as how the data is actually stored in a database. In Table 1.1 each cell (row column slot) may contain many values. For example, Movie 3 has two stars recorded in the row 3, column 5 cell. Some databases allow a cell to contain many values like this, but in a relational database each table cell may hold at most one value. Since relational database systems are dominant in the industry, our implementation discussion focuses on them. How can we design a relational database to store these facts? Suppose we use the structure shown in Table 1.2. This has one entry in each cell. Here,? denotes a null (no star is recorded for Cosmology). Some DBMSs display nulls differently (e.g., <NULL> or a blank space). To help distinguish the rows, we ve included lines between them. But from now on, we ll omit lines between rows. Table 1.2 A badly-designed relational database table. Movie: movienr movietitle releaseyr director star 1 Cosmology 2006 Lee Lafferty? 2 Kung Fu Hustle 2004 Stephen Chow Stephen Chow 3 The Secret Garden 1987 Alan Grint Gennie James 3 The Secret Garden 1987 Alan Grint Barret Oliver 4 The Secret Garden 1993 Agnieszka Holland Kate Maberly 4 The Secret Garden 1993 Agnieszka Holland Heydon Prowse 5 The DaVinci Code 2006 Ron Howard Tom Hanks 5 The DaVinci Code 2006 Ron Howard Ian McKellen 5 The DaVinci Code 2006 Ron Howard Audrey Tautou

5 1.1 Information Modeling 5 Each relational table must be named. Here we called the table Movie. See if you can spot the problem with this design before reading on. The table contains redundant information. For example, the facts that movie 5 is titled The DaVinci Code, was released in 2006, and was directed by Ron Howard are shown three times (once for each star). We might try to fix this by deleting the extra copies in the movietitle, releaseyr, and director columns, but this artificially makes some rows special and introduces problems with nulls. In addition to wasting space, the Table 1.2 design can lead to errors. For example, there is nothing to stop us adding a row for movie 2 with a different title (e.g., Kung Fun), a different release year, a different director, and another star. Our database would then be inconsistent with the business domain, where a movie has only one title and release year, and only one director is to be recorded 1. The corrected design uses two relational tables, Movie and Starred (Figure 1.1). The table design is shown in schematic form above the populated tables. In this example, a movie may be identified either by its movie number or by the combination of its title, release year, and director. In database terminology, each of these identifiers provides a candidate key for the Movie table, shown here by underlining each identifier. Movie ( movienr, movietitle, releaseyear, director ) Starred ( movienr, star ) Movie: movienr movietitle releaseyear director Cosmology Kung Fu Hustle The Secret Garden The Secret Garden The DaVinci Code Lee Lafferty Stephen Chow Alan Grint Agnieszka Holland Ron Howard Starred: movienr star Stephen Chow Gennie James Barret Oliver Kate Maberly Heydon Prowse Tom Hanks Ian McKellen Audrey Tautou Figure 1.1 A relational database representation of Table In rare cases, a movie may have multiple directors, but in this business domain we are interested in only one director per movie.

6 6 Chapter 1: Introduction In this case, we chose movienr as the primary way to identify movies throughout the database. This is shown here by doubly underlining the movienr column to indicate that it is the primary key of the Movie table. If a table has only one candidate key, a single underline denotes the primary key. The constraints that each movie has only one title, release year, and director are enforced by checking that each movie number occurs only once in the Movie table. The constraints that each movie must have a title, release year, and director are enforced by checking that all movies occur in the Movie table and excluding nulls from the title, release year, and director columns. In the schema, this is captured by the dotted arrow (indicating that if a movie is listed in the Starred table it must be listed in the Movie table) and by not marking any columns as optional. In relational database terms, this arrow depicts a foreign key constraint, where the movienr column in the Starred table is a foreign key referencing the primary key of the Movie table. The primary key of the Starred table is the combination of its columns, indicated here by underlining. These concepts and notations are fully explained later in the book. Even with this simple example, care is needed for database design. With complex cases, the design problem is much more challenging. The rest of this book is largely concerned with helping you to meet such challenges. Designing databases is both a science and an art. When supported by a good method, this design process is a stimulating and intellectually satisfying activity, with tangible benefits gained from the quality of the database applications produced. The next section explains why Object-Role Modeling (ORM) is chosen as our first modeling method. Later sections provide historical background and highlight the essential communication skills. The chapter concludes with a summary and a supplementary note section, including references for further reading. 1.2 Modeling Approaches When we design a database for a particular business domain, we create a model of it. Technically, the business domain being modeled is called the universe of discourse (UoD), since it is the universe (or world) that we are interested in discoursing (or talking) about. The UoD or business domain is typically part of the real world. To build a good model requires a good understanding of the world we are modeling, and hence is a task ideally suited to people rather than machines. The main challenge is to describe the UoD clearly and precisely. Great care is required here, since errors introduced here filter through to later stages in software development. The later the errors are detected, the more expensive they are to remove. A person who models the UoD is called a modeler. If we are familiar with the business domain, we may do the modeling ourselves. If not, we should consult with others who, at least collectively, understand the business domain. These people are called domain experts or subject matter experts. Modeling is a collaborative activity between the modeler and the domain expert.

7 1.2 Modeling Approaches 7 Since people naturally communicate (to themselves or others) with words, pictures, and examples, the best way to arrive at a clear description of the UoD is to use natural language, intuitive diagrams, and examples. To simplify the modeling task, we examine the information in the smallest units possible: one fact at a time. The model should first be expressed at the conceptual level, in concepts that people find easy to work with. Figure 1.1 depicted a model in terms of relational database structures. This is too far removed from natural language to be called conceptual. Instead, relational database structures are at the level of a logical data model. Other logical data models exist (e.g., network, XML schema, and object-oriented approaches), and each DBMS is aligned with at least one of these. However, in specifying a draft conceptual design, the modeler should be free of implementation concerns. It is a hard enough job already to develop an accurate model of the UoD without having to worry at the same time about how to translate the model into data structures specific to a chosen DBMS. Implementation concerns are of course important, but should be ignored in the early stages of modeling. Once an initial conceptual design is created, it can be mapped down to a logical design in any data model we like. This flexibility also makes it easier to implement and maintain the same application on more than one kind of DBMS. Although most applications involve processes as well as data, we ll focus on the data, because this perspective is more stable, and processes depend on the underlying data. Three information modeling approaches are discussed: Entity-Relationship modeling (ER), fact-oriented modeling, and object-oriented modeling. Any modeling method comprises a notation as well as a procedure for using the notation to construct models. To seed the data model in a scientific way, we need examples of the kinds of data that the system is expected to manage. We call these examples data use cases, since they are cases of data being used by the system. They can be output reports, input screens, or forms and can present information in many ways (tables, forms, graphs, etc.). Such examples may already exist as manual or computer records. Sometimes the application is brand new, or an improved solution or adaptation is required. If needed, the modeler constructs new examples by discussing the application area with the domain expert. As an example, suppose our information system has to output room schedules like that shown in Table 1.3. Let s look at some different approaches to modeling this. It is not important that you understand details of the different approaches at this stage. The concepts are fully explained in later chapters. Table 1.3 A simple data use case for room scheduling. Room Time Activity Code Activity Name Mon 9 a.m. Tue 2 p.m. Mon 9 a.m. Fri 5 p.m. ORC ORC XQC STP ORM class ORM class XQuery class Staff party

8 8 Chapter 1: Introduction ROOM HOUR SLOT # * hour for within booked for allocated ROOM # * room nr... ACTIVITY # * activity code * activity name Figure 1.2 An ER diagram for room scheduling. Entity-Relationship modeling was introduced by Peter Chen in 1976 and is still the most widely used approach for data modeling. It pictures the world in terms of entities that have attributes and participate in relationships. Over time, many versions of ER arose. There is no single, standard ER notation. Different versions of ER may support different concepts and may use different symbols for the same concept. Figure 1.2 uses a popular ER notation long supported by CASE tools from Oracle Corporation. Here, entity types are shown as named, soft rectangles (rounded corners). Attributes are listed below the entity type names. An octothorpe # indicates the attribute is a component of the primary identifier for the entity type, and an asterisk * means the attribute is mandatory. Here, an ellipsis indicates other attributes exist but their display is suppressed. Relationships are depicted as named lines connecting entity types. Only binary relationships are allowed, and each half of the relationship is shown either as a solid line (mandatory) or as a broken line (optional). For example, each RoomHourSlot must have a Room, but it is optional whether a Room is involved in a RoomHourSlot. A bar across one end of a relationship indicates that the relationship is a component of the primary identifier for the entity type at that end. For example, RoomHourSlot is identified by combining its hour and room. Room is identified by its room number, and Activity by its activity code. A fork or crow s foot at one end of a relationship indicates that many instances of the entity type at that end may be associated (via that relationship) with the same entity instance at the other end of the relationship. The lack of a crow s foot indicates that at most one entity instance at that end is associated with any given entity instance at the other end. For example, an Activity may be allocated many RoomHourSlots, but each RoomHourSlot is booked for at most one Activity. To its credit, this ER diagram portrays the domain in a way that is independent of the target software platform. For example, classifying a relationship end as mandatory is a conceptual issue. There is no attempt to specify here how this constraint is implemented (e.g., using mandatory columns, foreign key references, or object references). However, the ER diagram is incomplete (can you spot any missing constraints?). Moreover, the move from the data use case to the model is not obvious. While an experienced ER modeler might immediately see that an entity type is required to model RoomHourSlot, this step might be challenging to a novice modeler.

9 1.2 Modeling Approaches 9 Let s see if fact-oriented modeling can provide some help. Our treatment of factorientation focuses on Object-Role Modeling. ORM began in the early 1970s as a semantic modeling approach that views the world simply in terms of objects (things) playing roles (parts in relationships). For example, you are now playing the role of reading this book, and the book is playing the role of being read. ORM has appeared in a variety of forms such as Natural-language Information Analysis Method (NIAM). The version discussed in this book is based on extensions to NIAM and is supported by industrial software tools. Regardless of how data use cases appear, a domain expert familiar with their meaning should be able to verbalize their information content in natural language sentences. It is the modeler s responsibility to transform that informal verbalization into a formal yet natural verbalization that is clearly understood by the domain expert. These two verbalizations, one by the domain expert transformed into one by the modeler, comprise steps 1a and 1b of ORM s conceptual analysis procedure. Here we verbalize sample data as fact instances that are then abstracted to fact types. Constraints and perhaps derivation rules are then added, and themselves validated by verbalization and sample fact populations. To get a feeling of how this works in ORM, suppose that our system is required to output reports like Table 1.3. We ask the domain expert to read off the information contained in the table, and then we rephrase this in formal English. For example, the subject matter expert might express the facts on the top row of the table as follows: Room 20 at 9 a.m. Monday is booked for the activity ORC which has the name ORM class. As modelers, we rephrase this into two elementary sentences, identifying each object by a definite description: the Room numbered 20 at the HourSlot with day-hourcode Mon 9 a.m. is booked for the Activity coded ORC ; the Activity coded ORC has the ActivityName ORM class. Once the domain expert agrees with this verbalization, we abstract from the fact instances to the fact types (i.e., the types or kinds of fact). We might then depict this structure on an ORM diagram and populate it with sample data and counter data (explained shortly) as shown in Figure 1.3. HourSlot (dhcode) Room (.nr) at is booked for... Activity (.code) has /refers to Activity Name 20 Mon 9 a.m. ORC 20 Tue 2 p.m. ORC 33 Mon 9 a.m. XQC 33 Fri 5 p.m. STP ORC ORM class STM Staff Meeting STP Staff Party XQC XQuery class 20 Mon 9 a.m. XQC 33 Mon 9 a.m. ORC STP Staff Planning SPY Staff Party Figure 1.3 An ORM diagram for room scheduling, with sample and counter data.

10 10 Chapter 1: Introduction By default, entity types are shown in ORM as named, soft rectangles (rounded corners) and must have a reference scheme, i.e., a way for humans to refer to instances of that type. Simple reference schemes may be shown in parentheses (e.g., (.nr) ), as an abbreviation of the relevant association, e.g., Room has RoomNr. Value types such as types of character strings need no reference scheme and are shown as named, dashed, soft rectangles (e.g., ActivityName). This book uses the notation of ORM 2 (second generation ORM), as supported by the NORMA (Neumont ORM Architect) tool, an open source plug-in to Microsoft Visual Studio.NET. The previous version of ORM, as supported by Microsoft Visio for Enterprise Architects, depicts object types as ellipses, not soft rectangles. As a configuration option, NORMA allows object types to be displayed as ellipses or hard rectangles. Unless indicated otherwise, in this book the term ORM is understood to mean ORM 2. When specific reference is made to the previous version of ORM, the term ORM 1 is used. The ORM glossary at the end of this book includes a side-by-side comparison of ORM 1 and ORM 2 notations. In ORM, a role is a part played in a fact type (relationship or association). A relationship is shown as a named sequence of one or more role boxes, each connected to the object type whose instances play that role. Figure 1.3 includes a ternary (threerole) association, Room at HourSlot is booked for Activity, and a binary (two-role) association Activity has ActivityName. Unlike ER, ORM makes no use of attributes in its base models. All facts are represented in terms of objects (entities or values) playing roles. Although this often leads to larger diagrams, an attribute-free approach has advantages for conceptual analysis, including simplicity, stability, and ease of validation. If you are used to modeling in ER or the Unified Modeling Language (UML) (see later), this approach may seem strange at first, but please keep an open mind about it. ORM allows relationships of any arity (number of roles). Each fact type has at least one predicate reading, corresponding to one way of traversing its roles. Any number of readings may be provided for each role ordering. For a binary association, forward and inverse predicate readings may be shown separated by a slash /. As in logic, a predicate is a sentence with object holes in it. Mixfix notation enables the object terms to be mixed in with the predicate reading at various positions (as required in languages such as Japanese). An object placeholder is indicated by an ellipsis (e.g., the ternary predicate at is booked for ). For unary postfix predicates (e.g., smokes ) or binary infix predicates (e.g., has ) the ellipses may be omitted. For each fact type, a fact table may be added with a sample population to help validate the constraints. Each column in a fact table is associated with one role. The lines beside the role boxes depict internal uniqueness constraints, indicating which roles or role combinations must have unique entries. ORM schemas may be represented in diagrammatic or textual form, and some ORM tools can automatically transform between the two representations. Models are validated with domain experts in two main ways: verbalization and population.

11 1.2 Modeling Approaches 11 For example, the uniqueness constraints on the ternary association verbalize as: For each Room and HourSlot, that Room at that HourSlot is booked for at most one Activity; For each HourSlot and Activity, at most one Room at that HourSlot is booked for that Activity. The ternary fact table shows a satisfying population (each Room-HourSlot combination is unique, and each HourSlot-Activity combination is unique). The uniqueness constraints on the binary verbalize as: Each Activity has at most one ActivityName; Each ActivityName refers to at most one Activity. The 1:1 nature of this association is illustrated by the population, where each column entry occurs only once in its column. The solid dot on Activity is a mandatory role constraint, indicating that each instance in the population of Activity must play that role. This verbalizes as Each Activity has some ActivityName. A role that is not mandatory is optional. Since sample data are not always significant, additional data (such as STM in the binary fact type) may be needed to illustrate some rules. The optionality of the other role played by Activity is shown by the absence of STM in its population. Since ORM schemas can be specified in unambiguous sentences backed up by illustrative examples, it is not necessary for domain experts to understand the diagram notation at all. Modelers, however, find diagrams very useful for thinking about the universe of discourse. To double check a constraint, a counterexample to that constraint may be presented. The counterrows appended to the fact tables test the uniqueness constraints. For instance, the first row and counterrow of the ternary indicate that room 20 at 9 a.m. Monday is booked for both the ORC and XQC activities. This challenges the constraint For each Room and HourSlot, that Room at that HourSlot is booked for at most one Activity. This constraint may be recast in negative form as: It is impossible that the same Room at the same HourSlot is booked for more than one Activity. The counterexample provides a test case to see if this situation is actually possible. Concrete examples help domain experts to decide whether something really is a rule. This additional validation step is very useful in cases where the domain expert s command of language suffers from imprecise or even incorrect use of logical terms (e.g., each, at least, at most, exactly, the same, more than, if ). To challenge the constraint that at most one room at the same time is booked for the same activity, the first row and second counterrow of the ternary fact table in Figure 1.3 indicate that both room 20 and room 33 are used at 9 a.m. Monday for the ORC activity. Is this kind of thing possible? If it is (and for some application domains it would be) then this constraint is not a rule, in which case the constraint should be dropped and the counterrow added to the sample data. However, if our business does not allow two rooms to be used at the same time for the same activity, then the constraint is validated and the counterexample is rejected (although it can be retained as an illustrative counterexample). Compare Figure 1.2 with Figure 1.3. ER is often better than ORM for displaying compact overviews. However, ER models are further removed from natural language and may be harder for the domain expert to conceptualize. In this case, it was more natural to verbalize the first schedule fact as a ternary, but all popular ER notations with industrial support are restricted to binary (two-role) relationships.

12 12 Chapter 1: Introduction Being only binary does not make a language less expressive, since an n-ary association (n > 2) may always be transformed into binaries by co-referencing or nesting (see later). However, such a transformation may introduce an object type that appears artificial to the domain expert, which can hinder communication. Wherever possible, we should try to formulate the model in a way that appears natural to the domain expert. ER notation is less expressive than ORM for capturing constraints or business rules. For example, the ER notation used for Figure 1.2 was unable to express the constraint that activity names are unique or the constraint that it is impossible that more than one room at the same hour slot is booked for the same activity. ER encourages decisions about relative importance at the conceptual analysis stage. Sometimes this may be seen as an advantage. For example, it is fairly natural to think of activity names as attributes of activities, and hence treat names as less important than activities themselves. Sometimes, however, early distinctions on relative importance can be disadvantageous. For example, instead of using RoomHourSlot in Figure 1.2, we could model the room schedule information using ActivityHourSlot. Which of these choices is preferable may depend on what other kind of information we might want to record. However, because we have been forced to make a decision about this without knowing what other facts need to be recorded, we may need to change this part of the model later. In general, if you model a feature as an attribute and find out later that you need to record something about it, you are typically forced to remodel it as an entity type or relationship because attributes can t have attributes or participate in relationships. For instance, suppose we record phone as an attribute of Room and then later discover that we want to know which phones support voice mail. Since you rarely know what all the future information requirements will be, an attribute-based model is inherently unstable. Moreover, applications using the model often need to be recoded when a model feature is changed. Since ORM is essentially immune to changes like this, it offers far greater semantic stability. We have already seen that ORM models facilitate validation by both verbalization and population. Attributes make it awkward to use sample data populations. Moreover, populating optional attributes introduces null values, which may be a source of confusion to nontechnical people. In light of the aforementioned considerations, it appears that ORM s fact-oriented approach offers at least some advantages over ER modeling for conceptual analysis. This doesn t mean that you should discard ER, since it has advantages too (e.g., compact diagrams). You can have your cake and eat it too by using ORM for the initial conceptual analysis and automatically generating an ER view from it when desired. Even if you decide to use ER throughout, ignoring the ORM notation completely, you should find that applying or adapting the modeling steps in ORM s conceptual schema design procedure to the ER notation will help you design better ER models. Now let s consider Object-Oriented (OO) modeling, an approach that encapsulates both data and behavior within objects. Although used mainly for designing objectoriented program code, it can also be used for database design. Many object-oriented approaches exist, but by far the most influential is the Unified Modeling Language, which has been adopted by the Object Management Group (OMG).

13 1.2 Modeling Approaches 13 Among its many diagram types, UML includes class diagrams to specify static data structures. Class diagrams may be used to specify operations as well as low level design decisions specific to object-oriented code (e.g., attribute visibility and association navigability). When stripped of such implementation detail, UML class diagrams may be regarded as an extended version of ER. A UML class diagram for our example is shown in Figure 1.4. To overcome some of the problems mentioned for the ER solution, a ternary association is used for the schedule information. Because of its object-oriented focus, UML does not require conceptual identification schemes for its classes. Instead, entity instances are assumed to be identified by internal object identifiers (oids). UML has no standard notation to signify that attribute values must be unique for their class. However, UML does allow user-defined constraints to be added in braces or notes in any language. We ve added {P} to denote primary uniqueness and {U1} for an alternate uniqueness these symbols are not standard and hence not portable. The uniqueness constraints on the ternary are captured by the 0..1 (at most one) multiplicity constraints. Here * is shorthand for 0..*, meaning 0 or more. Attributes are mandatory by default. How well does this UML model support validation with the domain expert? Let s start with verbalization. Although often less than ideal, implicit use of has could be used to form binary sentences from the attributes, but what about the ternary? About the best we can do is something like Booking involves Room and HourSlot and Activity which is pretty useless. What if we replaced the association name with a mixfix predicate, as we did in ORM, e.g., at is booked for? This is no use, because UML association roles (or association ends as they are now called) are not ordered. So formally we can t know if we should read the sentence type as Room at HourSlot is booked for Activity, or Activity at HourSlot is booked for Room etc. This gets worse if the same class plays more than one role in the association (e.g., Person introduced Person to Person). UML requires association roles to have names (ORM allows role names, but does not require them), but role names don t form sentences, which are always ordered in natural language. UML s weakness with regard to verbalization of facts carries over into its verbalization of constraints and derivation rules. HourSlot Room nr {P} dhcode {P} * Booking Activity code {P} name {U1} Figure 1.4 A UML class diagram for room scheduling.

14 14 Chapter 1: Introduction The UML specification recommends the Object Constraint Language (OCL) for formal expression of such rules, but OCL is simply too mathematical in nature to be used for validation by nontechnical domain experts. In principle, a higher level language could be designed for UML that could be automatically transformed to OCL. Since verbalization in UML has inadequate support, let s try validation with sample populations. Not much luck here either. To begin with, attribute-based notations are almost useless for multiple instantiation and they introduce nulls into base populations, along with all their confusing properties. UML does provide object diagrams that enable you to talk about attributed single instances of classes, but that doesn t help with multiple instantiation. For example, the 1:1 nature of the association between activity codes and names is transparent in the ORM fact table in Figure 1.3, but is harder to see by scanning several activity objects. In principle, we could introduce fact tables to instantiate binary associations in UML, but this wouldn t work for non-binary associations. Why not? InUML you can t specify a reading direction for an association unless it s a binary. So there is no obvious connection between an association role and a fact column as there is in ORM. The best we can do is to name each role and then use role names as headers to the fact table. However, the visual connection of the fact columns to the class diagram would be weak because of the nonlinear layout of the association roles, and the higher the arity of the association, the worse it gets. In its favor, UML is far richer than ORM or ER in its ability to capture other aspects of application design (e.g., operations, activities, component packaging, and deployment). UML includes diagramming techniques, such as state machine and activity diagrams, to capture business processes. Any full specification of a business domain needs to address these dynamic aspects. If the application is to be implemented in object-oriented code, UML enables more precise descriptions of the programming code structures to be specified (e.g., attribute visibility and association navigability). If we restrict our attention to conceptual data modeling, however, the ORM notation is significantly richer than ER or UML in its capacity to express business constraints on the data, as well as being far more orthogonal and less impacted by change. As a simple example, consider the output report of Table 1.4. You might like to try modeling this yourself before reading on. Table 1.4 Another sample output report about Movies. Movie Director Reviewers Nr Title Name Born Name Born The DaVinci Code Crocodile Dundee Star Peace Ron Howard Peter Faiman Ann Green US AU US Fred Bloggs Ann Green Ann Green Ima Viewer Tom Sawme? US US US GB AU?

15 1.2 Modeling Approaches 15 Movie nr {P} title moviedirected director * 1 * * moviereviewed reviewer Person name {P} birthcountry Figure 1.5 A UML class diagram for Table 1.4. One way to model this report in UML is shown in Figure 1.5. Although the population of the sample report suggests that movie titles are unique and that a person can direct only one movie, let s assume that the domain expert confirms that this is not the case. We should adapt our sample population to illustrate this (e.g., add a new movie 4 with the same title Star Peace directed by Ron Howard). Assuming people are identified simply by their name, Movie and Person classes may be used as shown. The role names director and reviewer are used here to distinguish the two roles played by Person. Similarly, role names are provided to distinguish the roles played by Movie. In this example, all four role names are required. Association names may be used as well if desired. Unlike Chen s original ER notation, UML binary associations are typically depicted by lines without a diamond. While this is convenient, the use of diamonds in longer associations is somewhat inconsistent, and the avoidance of unary relationships is unnatural. In principle, UML does allow diamonds as an alternative notation for binary associations, but in practice this is rarely seen. In contrast, ORM s depiction of relationships as a sequence of one or more roles, where each role is associated with a fact table column, provides a uniform, general notation that facilitates validation by both verbalization and sample populations. The multiplicity constraints indicate that each movie has exactly one director but may have many reviewers and that a person may direct or review many movies. But there is still a missing business rule. Can you spot it? Figure 1.6 models the same domain in ORM. Here the before has reverses the normal left-to-right reading direction. The rule missing from the UML model is captured graphically by the circled X constraint between the role-pairs comprising the directed and reviewed associations. This is called an exclusion constraint. 1 Ron Howard was directed by /directed MovieTitle has Movie (.nr) Person (.name) was born in Country (.code) was reviewed by /reviewed 1 Ron Howard Figure 1.6 ORM model for Table 1.4, with counterexample for exclusion constraint.

16 16 Chapter 1: Introduction This exclusion constraint verbalizes as No Person directed and reviewed the same Movie or, reading it the other way, No Movie was directed by and was reviewed by the same Person. To validate this rule with the domain expert, you should verbalize the rule and also provide a counterexample. For example, in your model is it possible for Movie 1 to be directed by Ron Howard and also reviewed by Ron Howard? Figure 1.6 includes this counterexample. If the exclusion constraint really does apply, at least one of those two facts must be wrong. Some domain experts are happy to work with diagrams and some are not. Some are good at understanding rules in natural language and some are not. But all domain experts are good at working with concrete examples. Although it is not necessary for the domain expert to see the diagram, being able to instantiate any role directly on the diagram makes it easy for you as a modeler to think clearly about the rules. Although UML has no graphic notation for general exclusion constraints, it does allow you to document constraints in a note attached to the relevant model elements. If a concept is already part of your modeling language, it s easier to think of it. Since the exclusion constraint notation is not built in to the UML language, it is easy to miss the constraint in developing the model. The same thing goes for ER. In contrast, the ORM modeling procedure prompts you to consider such a constraint and allows you to visualize and capture the rule formally. An ORM tool can then map the constraint automatically into executable code to ensure that the rule is enforced in the implementation. ORM diagrams always display semantic domains as object types. The ORM diagram in Figure 1.7(a) includes role names for birthdate and deathdate, shown in square brackets next to the relevant roles. These roles are clearly compatible, as they are both played by the object type Date. In ORM, role names may be used like attribute names in automatically generated attribute-views, as well as in rules specified in attribute style (e.g., deathdate > birthdate). ER diagrams typically hide attribute domains. For example, the birthdate and deathdate attributes in the Barker ER model shown in Figure 1.7(b) should be based on the domain Date, but this is not represented visually. In ER, attribute domains can be listed in another document. In UML class diagrams, attribute domains may be listed after the attribute name and multiplicity (if shown), as in Figure 1.7(c). The [0..1] multiplicity indicates at most one, so the attribute is optional and single-valued. All too often in practice, only syntactic, or value, domains are specified (e.g., String). (a) (b) (c) [birthdate] PRESIDENT President (.name) was born on Date (CE) died on [deathdate] #* person name * birthdate o deathdate President name [1]: String {P} birthdate [1]: Date deathdate [0..1]: Date Figure 1.7 ORM object types are semantic domains.

17 1.2 Modeling Approaches 17 An ER diagram might show population and elevation as attributes of City, and an associated table might list the domains of these attributes simply as Integer, despite the fact that it is nonsense to equate a population with an elevation. Conceptual object types, or semantic domains, provide the conceptual glue that binds the various components in the application model into a coherent picture. Even at the lower level of the relational data model, E.F. Codd, the founder of the relational model, argues that domains are the glue that holds a relational database together (Codd 1990, p. 45). The object types in ORM diagrams are the semantic domains, so the connectedness of a model is transparent. This property of ORM also has significant advantages for conceptual queries, since a user can query the conceptual model directly by navigating through its object types to establish the relevant connections. This notion is elaborated further in later chapters. ER and UML diagrams often fail to express relevant constraints on, or between, attributes. Figure 1.8 provides a simple example. Notice the circled dot over an X in the ORM model in Figure 1.8(a). This specifies two constraints: the dot is a mandatory constraint over the disjunction of the two roles (each truck is either bought or leased) and the X indicates the roles are exclusive (no truck is both bought and leased). The two constraints collectively provide an xor (exclusive-or) constraint (each truck plays exactly one of the roles). Unlike most versions of ER, UML does provide an xor constraint, but only between associations. Since the UML model in Figure 1.8(b) models these two fact types as attributes instead of associations, it cannot capture the constraint graphically (other than adding a note). Notice again how the ORM diagram reveals the semantic domains. For instance, tare may be meaningfully compared with maximum load (both are masses) but not with length. In UML this can be made explicit by appending domain names to the attributes. At various stages in the modeling process, it is helpful for the modeler to see all the relevant information in the one place. (a) has was bought on Length (m:) (b) Truck Truck (.regnr) is leased till has tare of Date (dmy) regnr {P} length purchasedate [0..1] leasedate [0..1] tare maxload [0..1] Mass (kg:) may carry /is max load of Figure 1.8 (a) ORM model. (b) UML model, revealing less detail.

18 18 Chapter 1: Introduction Another ORM feature is its flexible support for subtyping, including multiple inheritance, based on formal subtype definitions. For example, the subtype LargeUSCity may be defined as a City that is in Country US and has a Population > As discussed in a later chapter, subtype definitions provide stronger constraints than declarations about whether subtypes are exclusive or exhaustive. In principle, because there are infinitely many kinds of constraints, a textual constraint language is often required for completeness to supplement the diagram. This is true for ER, ORM, and UML models. However, the failure of ER and UML diagrams to include standard notations for many important ORM constraints makes it harder to develop a comprehensive model or to perform transformations on the model. For example, suppose that in any movie an actor may have a starring role or a supporting role but not both. This can be modeled by two fact types: Actor has starring role in Movie; Actor has supporting role in Movie. The but not both condition is expressed in ORM as a pair-exclusion constraint between the fact types. Alternatively, these fact types may be replaced by a single longer fact type: Actor in Movie has role of RoleKind {star, support}. Transformations are rigorously controlled in ORM to ensure that constraints in one representation are captured in an alternative representation. For instance, the pairexclusion constraint is transformed into the constraint that each Actor-Movie pair has only one RoleKind. The formal theory behind such transformations is easier to apply when the relevant constraints can be visualized. Unlike UML and ER, ORM was built from a linguistic basis. To reap the benefits of verbalization and population for communication with and validation by domain experts, it s better to use a language that was designed with this in mind. The ORM notation is easy to learn and has been successfully taught even to high school students. We are not arguing here that ER and UML have no value. They do. We are simply suggesting that you consider using ORM s modeling techniques, and possibly its graphic notation, to facilitate your original conceptual analysis before using an attribute-based notation such as that of ER, UML, or relational tables. Once you have validated the conceptual model with the domain expert, you need to map it to a DBMS or program code for implementation. At this lower level, you will want to use an attribute-based model, so that you have a compact picture of how facts are grouped into implementation structures. For database applications, you will want to see the table structures, foreign key relationships, and so on. Here a relational or object-relational model offers a compact view, similar to an ER or UML model. ORM models often take up much more space than an attribute-based model, since they show each attribute as a relationship. This is ideal for conceptual analysis, where we should validate one fact type at a time. However, for logical design, we typically group facts into attribute-based structures such as tables or classes. At the logical design stage, attribute-based models are more useful than ORM models. For example, relational schema diagrams provide a simple, compact picture of the underlying tables and foreign key constraints between them. Also, UML is well suited for the logical and physical design of object-oriented code, since it allows implementation detail on the data model (e.g., attribute visibility and association navigation) and can be used to model behavior and deployment.

Entity Relationship modeling from an ORM perspective: Part 2

Entity Relationship modeling from an ORM perspective: Part 2 Entity Relationship modeling from an ORM perspective: Part 2 Terry Halpin Microsoft Corporation Introduction This article is the second in a series of articles dealing with Entity Relationship (ER) modeling

More information

UML Data Models From An ORM Perspective: Part 3

UML Data Models From An ORM Perspective: Part 3 UML Data Models From n ORM Perspective: Part 3 by Dr. Terry Halpin, Sc, DipEd,, MLitStud, PhD Director of Database Strategy, Visio Corporation This paper appeared in the June 998 issue of the Journal of

More information

Introduction to modeling. ER modelling

Introduction to modeling. ER modelling Introduction to modeling ER modelling Slides for this part are based on Chapters 8 from Halpin, T. & Morgan, T. 2008, Information Modeling and Relational Databases, Second Edition (ISBN: 978-0-12-373568-3),

More information

UML data models from an ORM perspective: Part 4

UML data models from an ORM perspective: Part 4 data models from an ORM perspective: Part 4 by Dr. Terry Halpin Director of Database Strategy, Visio Corporation This article first appeared in the August 1998 issue of the Journal of Conceptual Modeling,

More information

Introduction to Conceptual Data Modeling

Introduction to Conceptual Data Modeling Mustafa Jarrar: Lecture Notes on Introduction to Conceptual Data Modeling. University of Birzeit, Palestine, 2018 Version 4 Introduction to Conceptual Data Modeling (Chapter 1 & 2) Mustafa Jarrar Birzeit

More information

Chapter 2: Entity-Relationship Model

Chapter 2: Entity-Relationship Model Chapter 2: Entity-Relationship Model! Entity Sets! Relationship Sets! Design Issues! Mapping Constraints! Keys! E-R Diagram! Extended E-R Features! Design of an E-R Database Schema! Reduction of an E-R

More information

Verbalizing Business Rules: Part 10

Verbalizing Business Rules: Part 10 Verbalizing Business Rules: Part 10 Terry Halpin rthface University Business rules should be validated by business domain experts, and hence specified using concepts and languages easily understood by

More information

0. Database Systems 1.1 Introduction to DBMS Information is one of the most valuable resources in this information age! How do we effectively and efficiently manage this information? - How does Wal-Mart

More information

SOFTWARE ENGINEERING Prof.N.L.Sarda Computer Science & Engineering IIT Bombay. Lecture #10 Process Modelling DFD, Function Decomp (Part 2)

SOFTWARE ENGINEERING Prof.N.L.Sarda Computer Science & Engineering IIT Bombay. Lecture #10 Process Modelling DFD, Function Decomp (Part 2) SOFTWARE ENGINEERING Prof.N.L.Sarda Computer Science & Engineering IIT Bombay Lecture #10 Process Modelling DFD, Function Decomp (Part 2) Let us continue with the data modeling topic. So far we have seen

More information

Object-role modelling (ORM)

Object-role modelling (ORM) Introduction to modeling WS 2015/16 Object-role modelling (ORM) Slides for this part are based on Chapters 3-7 from Halpin, T. & Morgan, T. 2008, Information Modeling and Relational Databases, Second Edition

More information

Entity Relationship Modelling

Entity Relationship Modelling Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is a relationship? Entities, attributes, and relationships in a system The degree of

More information

Course on Database Design Carlo Batini University of Milano Bicocca

Course on Database Design Carlo Batini University of Milano Bicocca Course on Database Design Carlo Batini University of Milano Bicocca 1 Carlo Batini, 2015 This work is licensed under the Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License.

More information

DATA MODELS FOR SEMISTRUCTURED DATA

DATA MODELS FOR SEMISTRUCTURED DATA Chapter 2 DATA MODELS FOR SEMISTRUCTURED DATA Traditionally, real world semantics are captured in a data model, and mapped to the database schema. The real world semantics are modeled as constraints and

More information

Data Analysis 1. Chapter 2.1 V3.1. Napier University Dr Gordon Russell

Data Analysis 1. Chapter 2.1 V3.1. Napier University Dr Gordon Russell Data Analysis 1 Chapter 2.1 V3.1 Copyright @ Napier University Dr Gordon Russell Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is

More information

Verbalizing Business Rules: Part 9

Verbalizing Business Rules: Part 9 Verbalizing Business Rules: Part 9 Terry Halpin Northface University Business rules should be validated by business domain experts, and hence specified using concepts and languages easily understood by

More information

Software Service Engineering

Software Service Engineering Software Service Engineering Lecture 4: Unified Modeling Language Doctor Guangyu Gao Some contents and notes selected from Fowler, M. UML Distilled, 3rd edition. Addison-Wesley Unified Modeling Language

More information

Conceptual Design with ER Model

Conceptual Design with ER Model Conceptual Design with ER Model Lecture #2 1/24/2012 Jeff Ballard CS564, Spring 2014, Database Management Systems 1 See the Moodle page Due February 7 Groups of 2-3 people Pick a team name Homework 1 is

More information

Practical UML - A Hands-On Introduction for Developers

Practical UML - A Hands-On Introduction for Developers Practical UML - A Hands-On Introduction for Developers By: Randy Miller (http://gp.codegear.com/authors/edit/661.aspx) Abstract: This tutorial provides a quick introduction to the Unified Modeling Language

More information

Practical UML : A Hands-On Introduction for Developers

Practical UML : A Hands-On Introduction for Developers Borland.com Borland Developer Network Borland Support Center Borland University Worldwide Sites Login My Account Help Search Practical UML : A Hands-On Introduction for Developers - by Randy Miller Rating:

More information

2 Sets. 2.1 Notation. last edited January 26, 2016

2 Sets. 2.1 Notation. last edited January 26, 2016 2 Sets Sets show up in virtually every topic in mathematics, and so understanding their basics is a necessity for understanding advanced mathematics. As far as we re concerned, the word set means what

More information

Topic C. Communicating the Precision of Measured Numbers

Topic C. Communicating the Precision of Measured Numbers Topic C. Communicating the Precision of Measured Numbers C. page 1 of 14 Topic C. Communicating the Precision of Measured Numbers This topic includes Section 1. Reporting measurements Section 2. Rounding

More information

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. # 3 Relational Model Hello everyone, we have been looking into

More information

Chapter 6: Entity-Relationship Model

Chapter 6: Entity-Relationship Model Chapter 6: Entity-Relationship Model Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 6: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

Chapter 6: Entity-Relationship Model

Chapter 6: Entity-Relationship Model Chapter 6: Entity-Relationship Model Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 6: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

UML Data Models From An ORM (Object-Role Modeling) Perspective. Data Modeling at Conceptual Level

UML Data Models From An ORM (Object-Role Modeling) Perspective. Data Modeling at Conceptual Level UML Data Models From An ORM (Object-Role Modeling) Perspective. Data Modeling at Conceptual Level Lecturer Ph. D. Candidate DANIEL IOAN HUNYADI, Lecturer Ph. D. Candidate MIRCEA ADRIAN MUSAN Department

More information

Object-Oriented Software Engineering Practical Software Development using UML and Java

Object-Oriented Software Engineering Practical Software Development using UML and Java Object-Oriented Software Engineering Practical Software Development using UML and Java Chapter 5: Modelling with Classes Lecture 5 5.1 What is UML? The Unified Modelling Language is a standard graphical

More information

Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts

Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Chapter Outline Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Entities and Attributes Entity Types, Value Sets, and Key Attributes Relationships and Relationship

More information

Chapter 7: Entity-Relationship Model

Chapter 7: Entity-Relationship Model Chapter 7: Entity-Relationship Model, 7th Ed. See www.db-book.com for conditions on re-use Chapter 7: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram Design Issues Weak Entity

More information

Chapter 2: Entity-Relationship Model. Entity Sets. Entity Sets customer and loan. Attributes. Relationship Sets. A database can be modeled as:

Chapter 2: Entity-Relationship Model. Entity Sets. Entity Sets customer and loan. Attributes. Relationship Sets. A database can be modeled as: Chapter 2: Entity-Relationship Model Entity Sets Entity Sets Relationship Sets Design Issues Mapping Constraints Keys E-R Diagram Extended E-R Features Design of an E-R Database Schema Reduction of an

More information

Chapter 1: The Database Environment

Chapter 1: The Database Environment Chapter 1: The Database Environment Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Prentice Hall, 2002 1 Definitions Data: Meaningful facts, text, graphics,

More information

Information Systems (Informationssysteme)

Information Systems (Informationssysteme) Information Systems (Informationssysteme) Jens Teubner, TU Dortmund jensteubner@cstu-dortmundde Summer 2018 c Jens Teubner Information Systems Summer 2018 1 Part IV Database Design c Jens Teubner Information

More information

SOFTWARE DESIGN COSC 4353 / Dr. Raj Singh

SOFTWARE DESIGN COSC 4353 / Dr. Raj Singh SOFTWARE DESIGN COSC 4353 / 6353 Dr. Raj Singh UML - History 2 The Unified Modeling Language (UML) is a general purpose modeling language designed to provide a standard way to visualize the design of a

More information

A - 1. CS 494 Object-Oriented Analysis & Design. UML Class Models. Overview. Class Model Perspectives (cont d) Developing Class Models

A - 1. CS 494 Object-Oriented Analysis & Design. UML Class Models. Overview. Class Model Perspectives (cont d) Developing Class Models CS 494 Object-Oriented Analysis & Design UML Class Models Overview How class models are used? Perspectives Classes: attributes and operations Associations Multiplicity Generalization and Inheritance Aggregation

More information

Entity-Relationship Modelling. Entities Attributes Relationships Mapping Cardinality Keys Reduction of an E-R Diagram to Tables

Entity-Relationship Modelling. Entities Attributes Relationships Mapping Cardinality Keys Reduction of an E-R Diagram to Tables Entity-Relationship Modelling Entities Attributes Relationships Mapping Cardinality Keys Reduction of an E-R Diagram to Tables 1 Entity Sets A enterprise can be modeled as a collection of: entities, and

More information

Become a Champion Data Modeler with SQL Developer Data Modeler 3.0

Become a Champion Data Modeler with SQL Developer Data Modeler 3.0 Become a Champion Data Modeler with SQL Developer Data Modeler 3.0 Marc de Oliveira, Simplify Systems Introduction This presentation will show you how I think good data models are made, and how SQL Developer

More information

Chapter 6: Entity-Relationship Model

Chapter 6: Entity-Relationship Model Chapter 6: Entity-Relationship Model Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 6: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

Chapter 7: Entity-Relationship Model

Chapter 7: Entity-Relationship Model Chapter 7: Entity-Relationship Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 7: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

Chapter 7: Entity-Relationship Model

Chapter 7: Entity-Relationship Model Chapter 7: Entity-Relationship Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 7: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

Chapter 7: Entity-Relationship Model

Chapter 7: Entity-Relationship Model Chapter 7: Entity-Relationship Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 7: Entity-Relationship Model Design Process Modeling Constraints E-R Diagram

More information

Copyright 2016 Ramez Elmasr and Shamkant B. Navathei

Copyright 2016 Ramez Elmasr and Shamkant B. Navathei CHAPTER 3 Data Modeling Using the Entity-Relationship (ER) Model Slide 1-2 Chapter Outline Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Entities and Attributes

More information

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data. Managing Data Data storage tool must provide the following features: Data definition (data structuring) Data entry (to add new data) Data editing (to change existing data) Querying (a means of extracting

More information

The Next Step: Designing DB Schema. Chapter 6: Entity-Relationship Model. The E-R Model. Identifying Entities and their Attributes.

The Next Step: Designing DB Schema. Chapter 6: Entity-Relationship Model. The E-R Model. Identifying Entities and their Attributes. Chapter 6: Entity-Relationship Model Our Story So Far: Relational Tables Databases are structured collections of organized data The Relational model is the most common data organization model The Relational

More information

SOFTWARE ENGINEERING DESIGN I

SOFTWARE ENGINEERING DESIGN I 2 SOFTWARE ENGINEERING DESIGN I 3. Schemas and Theories The aim of this course is to learn how to write formal specifications of computer systems, using classical logic. The key descriptional technique

More information

Software Engineering Lab Manual

Software Engineering Lab Manual Kingdom of Saudi Arabia Ministry Education Prince Sattam Bin Abdulaziz University College of Computer Engineering and Sciences Department of Computer Science Software Engineering Lab Manual 1 Background:-

More information

Lecture 3: Linear Classification

Lecture 3: Linear Classification Lecture 3: Linear Classification Roger Grosse 1 Introduction Last week, we saw an example of a learning task called regression. There, the goal was to predict a scalar-valued target from a set of features.

More information

Chapter 6: Entity-Relationship Model. The Next Step: Designing DB Schema. Identifying Entities and their Attributes. The E-R Model.

Chapter 6: Entity-Relationship Model. The Next Step: Designing DB Schema. Identifying Entities and their Attributes. The E-R Model. Chapter 6: Entity-Relationship Model The Next Step: Designing DB Schema Our Story So Far: Relational Tables Databases are structured collections of organized data The Relational model is the most common

More information

Chapter Outline. Note 1. Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts

Chapter Outline. Note 1. Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Chapter Outline Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Entities and Attributes Entity Types, Value Sets, and Key Attributes Relationships and Relationship

More information

CMPT 354 Database Systems I

CMPT 354 Database Systems I CMPT 354 Database Systems I Chapter 2 Entity Relationship Data Modeling Data models A data model is the specifications for designing data organization in a system. Specify database schema using a data

More information

Conceptual Data Modeling Concepts & Principles

Conceptual Data Modeling Concepts & Principles Mustafa Jarrar: Lecture Notes on Conceptual Modeling Concepts. Birzeit University, Palestine. 2011 Conceptual Data Modeling Concepts & Principles (Chapter 1&2) Mustafa Jarrar Birzeit University mjarrar@birzeit.edu

More information

Class diagrams. Modeling with UML Chapter 2, part 2. Class Diagrams: details. Class diagram for a simple watch

Class diagrams. Modeling with UML Chapter 2, part 2. Class Diagrams: details. Class diagram for a simple watch Class diagrams Modeling with UML Chapter 2, part 2 CS 4354 Summer II 2015 Jill Seaman Used to describe the internal structure of the system. Also used to describe the application domain. They describe

More information

Informatics 1: Data & Analysis

Informatics 1: Data & Analysis Informatics 1: Data & Analysis Lecture 3: The Relational Model Ian Stark School of Informatics The University of Edinburgh Tuesday 24 January 2017 Semester 2 Week 2 https://blog.inf.ed.ac.uk/da17 Lecture

More information

Intro to DB CHAPTER 6

Intro to DB CHAPTER 6 Intro to DB CHAPTER 6 DATABASE DESIGN &THEER E-R MODEL Chapter 6. Entity Relationship Model Design Process Modeling Constraints E-R Diagram Design Issues Weak Entity Sets Extended E-R Features Design of

More information

2. An implementation-ready data model needn't necessarily contain enforceable rules to guarantee the integrity of the data.

2. An implementation-ready data model needn't necessarily contain enforceable rules to guarantee the integrity of the data. Test bank for Database Systems Design Implementation and Management 11th Edition by Carlos Coronel,Steven Morris Link full download test bank: http://testbankcollection.com/download/test-bank-for-database-systemsdesign-implementation-and-management-11th-edition-by-coronelmorris/

More information

Essay Question: Explain 4 different means by which constrains are represented in the Conceptual Data Model (CDM).

Essay Question: Explain 4 different means by which constrains are represented in the Conceptual Data Model (CDM). Question 1 Essay Question: Explain 4 different means by which constrains are represented in the Conceptual Data Model (CDM). By specifying participation conditions By specifying the degree of relationship

More information

CHAPTER 2: DATA MODELS

CHAPTER 2: DATA MODELS CHAPTER 2: DATA MODELS 1. A data model is usually graphical. PTS: 1 DIF: Difficulty: Easy REF: p.36 2. An implementation-ready data model needn't necessarily contain enforceable rules to guarantee the

More information

Object Relational Mappings

Object Relational Mappings Object Relational Mappings First step to a formal approach Bachelor thesis Tom van den Broek January 30, 2007 Contents 1 Introduction 2 2 The two models 4 2.1 The object model.........................

More information

Chapter. Relational Database Concepts COPYRIGHTED MATERIAL

Chapter. Relational Database Concepts COPYRIGHTED MATERIAL Chapter Relational Database Concepts 1 COPYRIGHTED MATERIAL Every organization has data that needs to be collected, managed, and analyzed. A relational database fulfills these needs. Along with the powerful

More information

HKALE 2012 ASL Computer Applications Paper 2 Disscussion forum system

HKALE 2012 ASL Computer Applications Paper 2 Disscussion forum system HKALE 2012 ASL Computer Applications Paper 2 Disscussion forum system Yan Chai Hospital Lim Por Yen Secondary School 7A(3) Chu Chun Kit CONTENT PAGE 1. Content Page P.1 2. Schedule P.2 3. Objective P.3-4

More information

The Entity Relationship Model

The Entity Relationship Model The Entity Relationship Model CPS352: Database Systems Simon Miner Gordon College Last Revised: 2/4/15 Agenda Check-in Introduction to Course Database Environment (db2) SQL Group Exercises The Entity Relationship

More information

II. Review/Expansion of Definitions - Ask class for definitions

II. Review/Expansion of Definitions - Ask class for definitions CS352 Lecture - The Entity-Relationship Model last revised July 25, 2008 Objectives: 1. To discuss using an ER model to think about a database at the conceptual design level. 2. To show how to convert

More information

IS 263 Database Concepts

IS 263 Database Concepts IS 263 Database Concepts Lecture 1: Database Design Instructor: Henry Kalisti 1 Department of Computer Science and Engineering The Entity-Relationship Model? 2 Introduction to Data Modeling Semantic data

More information

CS352 Lecture - The Entity-Relationship Model

CS352 Lecture - The Entity-Relationship Model CS352 Lecture - The Entity-Relationship Model Objectives: last revised August 3, 2004 1. To introduce the concepts of entity, relationship, key 2. To show how to convert an ER design to a set of tables.

More information

CHAPTER 2: DATA MODELS

CHAPTER 2: DATA MODELS Database Systems Design Implementation and Management 12th Edition Coronel TEST BANK Full download at: https://testbankreal.com/download/database-systems-design-implementation-andmanagement-12th-edition-coronel-test-bank/

More information

6. Relational Algebra (Part II)

6. Relational Algebra (Part II) 6. Relational Algebra (Part II) 6.1. Introduction In the previous chapter, we introduced relational algebra as a fundamental model of relational database manipulation. In particular, we defined and discussed

More information

A good example of entities and relationships can be seen below.

A good example of entities and relationships can be seen below. Unit 2: Unit 2: Conceptual Design: Data Modeling and the Entity Relationship Model - Discussion 1 Scroll down and click "Respond" to post your reply to the Discussion questions. Please review the Discussion

More information

Using High-Level Conceptual Data Models for Database Design A Sample Database Application Entity Types, Entity Sets, Attributes, and Keys

Using High-Level Conceptual Data Models for Database Design A Sample Database Application Entity Types, Entity Sets, Attributes, and Keys Chapter 7: Data Modeling Using the Entity- Relationship (ER) Model Using High-Level Conceptual Data Models for Database Design A Sample Database Application Entity Types, Entity Sets, Attributes, and Keys

More information

Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques. Fundamentals, Design, and Implementation, 9/e

Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques. Fundamentals, Design, and Implementation, 9/e Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques Fundamentals, Design, and Implementation, 9/e Three Schema Model ANSI/SPARC introduced the three schema model in 1975 It provides a framework

More information

Conceptual Data Models for Database Design

Conceptual Data Models for Database Design Conceptual Data Models for Database Design Entity Relationship (ER) Model The most popular high-level conceptual data model is the ER model. It is frequently used for the conceptual design of database

More information

MIS Database Systems Entity-Relationship Model.

MIS Database Systems Entity-Relationship Model. MIS 335 - Database Systems Entity-Relationship Model http://www.mis.boun.edu.tr/durahim/ Ahmet Onur Durahim Learning Objectives Database Design Main concepts in the ER model? ER Diagrams Database Design

More information

Unified Modeling Language (UML)

Unified Modeling Language (UML) Appendix H Unified Modeling Language (UML) Preview The Unified Modeling Language (UML) is an object-oriented modeling language sponsored by the Object Management Group (OMG) and published as a standard

More information

Overview of Database Design Process. Data Modeling Using the Entity- Relationship (ER) Model. Two main activities:

Overview of Database Design Process. Data Modeling Using the Entity- Relationship (ER) Model. Two main activities: 1 / 14 Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Entities and Attributes Entity Types, Value Sets, and Key Attributes Relationships and Relationship Types

More information

E-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model.

E-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model. E-R Model Hi! Here in this lecture we are going to discuss about the E-R Model. What is Entity-Relationship Model? The entity-relationship model is useful because, as we will soon see, it facilitates communication

More information

3 February 2011 CSE-3421M Test #1 p. 1 of 14. CSE-3421M Test #1. Design

3 February 2011 CSE-3421M Test #1 p. 1 of 14. CSE-3421M Test #1. Design 3 February 2011 CSE-3421M Test #1 p. 1 of 14 CSE-3421M Test #1 Design Sur / Last Name: Given / First Name: Student ID: Instructor: Parke Godfrey Exam Duration: 75 minutes Term: Winter 2011 Answer the following

More information

6.001 Notes: Section 6.1

6.001 Notes: Section 6.1 6.001 Notes: Section 6.1 Slide 6.1.1 When we first starting talking about Scheme expressions, you may recall we said that (almost) every Scheme expression had three components, a syntax (legal ways of

More information

EC121 Mathematical Techniques A Revision Notes

EC121 Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes Mathematical Techniques A begins with two weeks of intensive revision of basic arithmetic and algebra, to the level

More information

Lecture 10: Introduction to Correctness

Lecture 10: Introduction to Correctness Lecture 10: Introduction to Correctness Aims: To look at the different types of errors that programs can contain; To look at how we might detect each of these errors; To look at the difficulty of detecting

More information

"Relations for Relationships"

Relations for Relationships M359 An explanation from Hugh Darwen "Relations for Relationships" This note might help those who have struggled with M359's so-called "relation for relationship" method of representing, in a relational

More information

The En'ty Rela'onship Model

The En'ty Rela'onship Model The En'ty Rela'onship Model Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata Slides re-used, with minor modification, from Silberschatz, Korth and Sudarshan www.db-book.com Outline

More information

Modeling Relationships

Modeling Relationships Modeling Relationships Welcome to Lecture on Modeling Relationships in the course on Healthcare Databases. In this lecture we are going to cover two types of relationships, namely, the subtype and the

More information

This interview appeared in the September 1995 issue of DBMS and is reproduced here by permission.

This interview appeared in the September 1995 issue of DBMS and is reproduced here by permission. Black Belt Design: Asymetrix Corp. s Dr. Terry Halpin By Maurice Frank This interview appeared in the September 1995 issue of DBMS and is reproduced here by permission. Using object role modeling to design

More information

Data Modeling Using the Entity-Relationship Model

Data Modeling Using the Entity-Relationship Model 3 Data Modeling Using the Entity-Relationship Model Conceptual modeling is a very important phase in designing a successful database application. Generally, the term database application refers to a particular

More information

CHAPTER 6 DATABASE MANAGEMENT SYSTEMS

CHAPTER 6 DATABASE MANAGEMENT SYSTEMS CHAPTER 6 DATABASE MANAGEMENT SYSTEMS Management Information Systems, 10 th edition, By Raymond McLeod, Jr. and George P. Schell 2007, Prentice Hall, Inc. 1 Learning Objectives Understand the hierarchy

More information

Credit where Credit is Due. Lecture 4: Fundamentals of Object Technology. Goals for this Lecture. Real-World Objects

Credit where Credit is Due. Lecture 4: Fundamentals of Object Technology. Goals for this Lecture. Real-World Objects Lecture 4: Fundamentals of Object Technology Kenneth M. Anderson Object-Oriented Analysis and Design CSCI 6448 - Spring Semester, 2003 Credit where Credit is Due Some material presented in this lecture

More information

Proofwriting Checklist

Proofwriting Checklist CS103 Winter 2019 Proofwriting Checklist Cynthia Lee Keith Schwarz Over the years, we ve found many common proofwriting errors that can easily be spotted once you know how to look for them. In this handout,

More information

Relational Database Systems Part 01. Karine Reis Ferreira

Relational Database Systems Part 01. Karine Reis Ferreira Relational Database Systems Part 01 Karine Reis Ferreira karine@dpi.inpe.br Aula da disciplina Computação Aplicada I (CAP 241) 2016 Database System Database: is a collection of related data. represents

More information

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris 1 Outline What is a database? The database approach Advantages Disadvantages Database users Database concepts and System architecture

More information

Section of Achieving Data Standardization

Section of Achieving Data Standardization Section of Achieving Data Standardization Whitemarsh Information Systems Corporation 2008 Althea Lane Bowie, Maryland 20716 Tele: 301-249-1142 Email: mmgorman@wiscorp.com Web: www.wiscorp.com 1 Why Data

More information

Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques. Fundamentals, Design, and Implementation, 9/e

Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques. Fundamentals, Design, and Implementation, 9/e Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques Fundamentals, Design, and Implementation, 9/e Three Schema Model ANSI/SPARC introduced the three schema model in 1975 It provides a framework

More information

FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE

FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE FROM A RELATIONAL TO A MULTI-DIMENSIONAL DATA BASE David C. Hay Essential Strategies, Inc In the buzzword sweepstakes of 1997, the clear winner has to be Data Warehouse. A host of technologies and techniques

More information

CS352 Lecture - Data Models

CS352 Lecture - Data Models CS352 Lecture - Data Models Last revised July 24, 2008 Objectives: 1. To introduce the entity-relationship model 2. To note the distinctive features of the hierarchical and network models. 3. To introduce

More information

Database Management Systems MIT Introduction By S. Sabraz Nawaz

Database Management Systems MIT Introduction By S. Sabraz Nawaz Database Management Systems MIT 22033 Introduction By S. Sabraz Nawaz Recommended Reading Database Management Systems 3 rd Edition, Ramakrishnan, Gehrke Murach s SQL Server 2008 for Developers Any book

More information

EECS-3421a: Test #1 Design

EECS-3421a: Test #1 Design 2016 October 12 EECS-3421a: Test #1 1 of 14 EECS-3421a: Test #1 Design Electrical Engineering & Computer Science Lassonde School of Engineering York University Family Name: Given Name: Student#: EECS Account:

More information

A l Ain University Of Science and Technology

A l Ain University Of Science and Technology A l Ain University Of Science and Technology 4 Handout(4) Database Management Principles and Applications The Entity Relationship (ER) Model http://alainauh.webs.com/ http://www.comp.nus.edu.sg/~lingt

More information

Alan J. Perlis - Epigrams on Programming

Alan J. Perlis - Epigrams on Programming Programming Languages (CS302 2007S) Alan J. Perlis - Epigrams on Programming Comments on: Perlis, Alan J. (1982). Epigrams on Programming. ACM SIGPLAN Notices 17(9), September 1982, pp. 7-13. 1. One man

More information

DATABASE SCHEMA DESIGN ENTITY-RELATIONSHIP MODEL. CS121: Relational Databases Fall 2017 Lecture 14

DATABASE SCHEMA DESIGN ENTITY-RELATIONSHIP MODEL. CS121: Relational Databases Fall 2017 Lecture 14 DATABASE SCHEMA DESIGN ENTITY-RELATIONSHIP MODEL CS121: Relational Databases Fall 2017 Lecture 14 Designing Database Applications 2 Database applications are large and complex A few of the many design

More information

Non-overlappingoverlapping. Final outcome of the worked example On pages R&C pages R&C page 157 Fig 3.52

Non-overlappingoverlapping. Final outcome of the worked example On pages R&C pages R&C page 157 Fig 3.52 Objectives Computer Science 202 Database Systems: Entity Relation Modelling To learn what a conceptual model is and what its purpose is. To learn the difference between internal models and external models.

More information

CS352 Lecture - Data Models

CS352 Lecture - Data Models CS352 Lecture - Data Models Objectives: 1. To briefly introduce the entity-relationship model 2. To introduce the relational model. 3. To introduce relational algebra Last revised January 18, 2017 Materials:

More information

SQL DDL. CS3 Database Systems Weeks 4-5 SQL DDL Database design. Key Constraints. Inclusion Constraints

SQL DDL. CS3 Database Systems Weeks 4-5 SQL DDL Database design. Key Constraints. Inclusion Constraints SQL DDL CS3 Database Systems Weeks 4-5 SQL DDL Database design In its simplest use, SQL s Data Definition Language (DDL) provides a name and a type for each column of a table. CREATE TABLE Hikers ( HId

More information

Full file at Chapter 2: Foundation Concepts

Full file at   Chapter 2: Foundation Concepts Chapter 2: Foundation Concepts TRUE/FALSE 1. The input source for the conceptual modeling phase is the business rules culled out from the requirements specification supplied by the user community. T PTS:

More information

The functions performed by a typical DBMS are the following:

The functions performed by a typical DBMS are the following: MODULE NAME: Database Management TOPIC: Introduction to Basic Database Concepts LECTURE 2 Functions of a DBMS The functions performed by a typical DBMS are the following: Data Definition The DBMS provides

More information