ITCS 3160 DATA BASE DESIGN AND IMPLEMENTATION JING YANG 2010 FALL Class 3: The Relational Data Model and Relational Database Constraints Outline 2 The Relational Data Model and Relational Database Constraints Relational Model Constraints and Relational Database Schemas Update Operations, Transactions, and Dealing with Constraint Violations 1
3 Quick Facts about the Relational Data Model Edgar Frank Ted Codd of IBM research first introduced the relational data mode in his paper "A Relational Model of Data for Large Shared Data Banks (1970) First commercial implementations available in early 1980s (SQL/DS, Oracle DBMS et al.) Has been implemented in a large number of commercial system The standard for commercial relational DBMS is the SQL query language Relational Model 4 Represents data as a collection of relations Relation - Table of values Row Represents a collection of related data values Fact that typically corresponds to a real-world entity or relationship Tuple Table name and column names Interpret the meaning of the values in each row attribute 2
5 Domains 6 Domain D Set of atomic values Atomic Each value indivisible 3
Examples 7 Example1: Name: USA_p phone_ numbers Logical definition: the set of ten-digit phone numbers valid in USA Data type: a character string of the form (ddd)ddd-ddd, where each d is a numerical digit and the first three digits form a valid telephone area code Example 2: Name: Employee_age Logical definition: possible ages of employees, each must be an integer value between 15 and 80 Data type: integer number between 15 and 80 Relation Schema 8 Relation schema R Denoted by R(A 1, A 2,...,AA n ) Made up of a relation name R and a list of attributes, A 1, A 2,..., A n Attribute A i Column header; Name of a role played by some domain D in the relation schema R Degree (or arity) of a relation Number of attributes n of its relation schema 4
9 Practice: 1. Write the scheme of this relation 2. What is the degree of this relation? 3. Can different attributes have the same domain? Relation or Relation State 10 Set of n-tuples r = {t 1, t 2,..., t m } Each n-tuple t Ordered list of n values t =<v 1, v 2,..., v n > Each value v i, 1 i n, is an element of dom(a i ) or is a special NULL value Question: Can r have two tuples with exactly the same values in all attributes? 5
Formal Definition of Relation 11 Relation (or relation state) r(r) Mathematical relation of degree n on the domains dom(a 1 ), dom(a 2 ),..., dom(a n ) Subset of the Cartesian product of the domains that define R: r(r) (dom(a 1 ) dom(a 2 )... dom(a n )) The Cartesian product specifies all possible combination of values from the underlying domains Cardinality Total number of values in a domain dom(a i ) Question 12 R(A 1, A 2, A 3 ) Assume dom(a 1 ) = 2 dom(a 2 ) = 10 dom(a 3 ) = 3 What is the total number of possible tuples that can ever exist in any r(r)? 6
Current relation state 13 Relation state at a given time Reflects only the valid tuples that represent a particular state of the real world Characteristics of Relations 14 Ordering of tuples in a relation Rl Relation dfi defined d as a set of tuples Elements have no order among them Ordering of values within a tuple and an alternative definition of a relation Order of attributes and values is not that important As long as correspondence between attributes and values maintained 7
Characteristics of Relations (cont d.) 15 Alternative definition of a relation Tuple considered d as a set of (<attribute>, <value>) pairs Each pair gives the value of the mapping from an attribute A i to a value v i from dom(a i ) Use the first definition of relation Attributes and the values within tuples are ordered d Simpler notation Characteristics of Relations (cont d.) True or false: The above two relations are identical. 8
Values in Tuples 17 Each value in a tuple is atomic Flat relational model Composite and multivalued attributes not allowed First normal form assumption Multivalued attributes Must be represented by separate relations Composite attributes Represented only by simple component attributes in basic relational model NULLs in Tuples (cont d.) 18 NULL values Represent the values of attributes that may be unknown or may not apply to a tuple Meanings for NULL values Value unknown Value exists but is not available Attribute does not apply to this tuple (also known as value Attribute does not apply to this tuple (also known as value undefined) Trouble caused by NULL values 9
Interpretation (Meaning) of a Relation 19 Assertion Each tuple in the relation is a fact or a particular instance of the assertion Some relations may represent facts about entities, others may represent facts about relationships Predicate Sh Schema: predicate Values in each tuple interpreted as values that satisfy the predicate Relational Model Notation 20 Relation schema R of degree n Denoted by R(A 1, A 2,..., A n ) Uppercase letters Q, R, S Denote relation names Lowercase letters q, r, s Denote relation states Letters t, u, v Denote tuples 10
Relational Model Notation 21 Name of a relation schema: STUDENT Idi Indicates the current set of tuples in that relation Notation: STUDENT(Name, Ssn,...) Refers only to relation schema Attribute A can be qualified with the relation name R to which it belongs Using the dot notation R.A Relational Model Notation 22 n-tuple t in a relation r(r) Denoted by t = <v 1, v 2,..., v n > v i is the value corresponding to attribute A i Component values of tuples: t[a i ] and t.a i refer to the value v i in t for attribute A i t[a u, A w,..., A z ]and t.(a u, A w,..., A z ) refer to the subtuple of values <v u, v w,..., v z > from t corresponding to the attributes specified in the list 11
Example: 23 For the highlighted hli ht tuple, t[name] = < Barbara Benson > t[ssn,gpa,age] = < 533-69-1238,3.25,19> Relational Model Constraints 24 Constraints Restrictions i on the actual values in a database state Derived from the rules in the miniworld that the database represents Inherent model-based constraints or implicit constraints Inherent in the data model 12
25 Relational Model Constraints (cont d.) Schema-based constraints or explicit constraints Can be directly expressed in schemas of the data model Application-based or semantic constraints or business rules Cannot be directly expressed in schemas Expressed and enforced by application program Domain Constraints 26 In each tuple, value of attribute A must be from dom(a) Data type associated with domains: Numeric data types for integers and real numbers Characters Booleans Fixed-length strings Variable-length strings Date, time, timestamp Money Other special data types 13
27 Key Constraints and Constraints on NULL Values No two tuples can have the same combination of values for all their attributes. Superkey A subset of attributes of a relation schema R No two distinct tuples in any state r of R can have the same value for SK Key Superkey of R Removing any attribute A from K leaves a set of attributes K that is not a superkey of R any more 28 Key Constraints and Constraints on NULL Values (cont d.) Key satisfies two properties: Two distinct i tuples in any state of relation cannot have identical values for (all) attributes in key Minimal superkey Cannot remove any attributes and still have uniqueness constraint in above condition hold 14
Questions 29 1. Can you find a superkey for a relation in 1 millisecond? 2. True or false: Each relation has at least one key 3. Find a key and several superkeys for the following relation: 30 Key Constraints and Constraints on NULL Values (cont d.) Candidate key Rl Relation schema may have more than one key Primary key of the relation Designated among candidate keys Underline attribute Other candidate keys are designated as unique keys 15
Key Constraints and Constraints on NULL Values (cont d.) 32 Relational Databases and Relational Database Schemas Relational database schema S Set of relation schemas S = {R 1, R 2,..., R m } Set of integrity constraints IC Relational database state Set of relation states DB = {r 1, r 2,..., r m } Each r i is a state of R i and such that the r i relation states satisfy integrity constraints specified in IC 16
33 Relational Databases and Relational Database Schemas (cont d.) Invalid state Does not obey all the integrity constraints Valid state Satisfies all the constraints in the defined set of integrity constraints IC 34 Integrity, Referential Integrity, and Foreign Keys Entity integrity constraint No primary key value can be NULL Why? Referential integrity constraint Specified between two relations Maintains consistency among tuples in two relations State that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation 17
35 Integrity, Referential Integrity, and Foreign Keys (cont d.) Foreign key rules: The attributes in FK of R 1 have the same domain(s) as the primary key attributes PK of R 2 Value of FK in a tuple t 1 of the current state r 1 (R 1 ) either occurs as a value of PK for some tuple t 2 in the current state r 2 (R 2 ) or is NULL R 1 Referencing relation R 2: Rf Referred relation A foreign key can refer to its own relation 36 Questions: 1. If Pnumber: 1-100, what is the domain of Pno? 2. 2. Can Dno be NULL? Directed arc from each foreign key to the relation it references 18
37 Integrity, Referential Integrity, and Foreign Keys (cont d.) All integrity constraints should be specified on relational database schema using DDL DMBS can automatically enforce them Most relational DBMS support key, entity integrity, and referential integrity constraints Other Types of Constraints 38 Semantic integrity constraints May have to be specified and enforced on a relational l database Use triggers and assertions More common to check for these types of constraints within the application programs 19
Other Types of Constraints (cont d.) 39 Functional dependency constraint Establishes a functional relationship among two sets of attributes X and Y Value of X determines a unique value of Y State constraints Define the constraints that a valid state of the database must satisfy Transition constraints Define to deal with state changes in the database Operations of Relational Model 40 Operations of the relational model can be categorized into retrievals and updates Basic operations that change the states of relations in the database: Insert Delete Update (or Modify) 20
41 42 21
43 The Insert Operation 44 Provides a list of attribute values for a new tuple t that is to be inserted into a relation R Can violate any of the four types of constraints If an insertion violates one or more constraints Default option is to reject the insertion 22
Are they acceptable? 45 1. Insert < Celilia, f, Kolonsky, NULL, 1960-04-05, 6532 Windy Lane, Katy, TX, F, 28000, NULL, 4> 2. Insert < Celilia, f, Kolonsky, 123456789, 1960-04-05, 6532 Windy Lane, Katy, TX, F, 28000, NULL, 4> 3. Insert < Celilia, f, Kolonsky, 677678989, 1960-04-05 05, 6532 Windy Lane, Katy, TX, F, 28000, NULL, 11> 4. Insert < Celilia, f, Kolonsky, 677678989, 1960-04-05, 6532 Windy Lane, Katy, TX, F, 28000, NULL, 4> The Delete Operation 46 Can violate only referential integrity If tuple being deleted is referenced by foreign keys from other tuples Restrict Reject the deletion Cascade Propagate the deletion by deleting tuples that reference the tuple that is being deleted Set null or set default Modify the referencing attribute values that cause the violation What could happen if you delete a tuple from the Employee relation? What should be done? 23
The Update Operation 47 Necessary to specify a condition on attributes of relation Select the tuple (or tuples) to be modified If attribute not part of a primary key nor of a foreign key Usually causes no problems Updating a primary/foreign key Similar issues as with Insert/Delete The Transaction Concept 48 Transaction Executing program Includes some database operations Must leave the database in a valid or consistent state Online transaction processing (OLTP) systems Execute transactions at rates that reach several hundred per second 24
Group Discussion 49 Group of five Design a relational database with three or more relations 1. Write the relation schemas 2. Define domain for each attribute 3. Identify primary keys and foreign keys in each relation 4. Populate the database with some data 5. Give examples of insert, delete, and modify operations that will lead to constraint violations. What will you do after such a violation? Summary 50 Characteristics differentiate relations from ordinary tables or files Classify database constraints into: Inherent model-based constraints, explicit schemabased constraints, and application-based constraints Modification operations on the relational model: Insert, Delete, and Update 25
After Class Exercise 51 3.11 3.13 26