Relational Model
Relational Model Concepts
The relational model of data is based on the concept of a relation. A relation is a mathematical concept based on the idea of sets.
The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970. This paper caused a major revolution in the field of database management and earned Ted Codd the ACM Turing Award.
Informal Definitions
RELATION: A table of values.
- A relation may be thought of as a set of rows.
- A relation may alternately be thought of as a set of columns.
- Each row has a value of an item or set of items that uniquely identifies that row in the table.
- Each column is typically referred to by its column name, column header, or attribute name.
Formal Definitions: Schema
The schema of a relation is the definition of the structure of the relation, written R(A1, A2, ..., An). The relation schema R is defined over the attributes A1, A2, ..., An.
For example: CUSTOMER(Cust-id, Cust-name, Address, Phone#)
Here, CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, and Phone#, each of which has a domain, that is, a set of valid values. For example, the domain of Cust-id is 6-digit numbers.
Instance
An instance is the particular data in the relation at a given moment. Instances change constantly; schemas change rarely.
Tuple
A tuple is an ordered set of values. Each row in the CUSTOMER table may be referred to as a tuple in the table and consists of four values. For example,
<632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
is a tuple belonging to the CUSTOMER relation. A relation may be regarded as a set of tuples (rows).
Attribute
Columns in a table are also called attributes of the relation.
Domain
A domain has a logical definition: e.g., USA_phone_numbers is the set of 10-digit phone numbers valid in the U.S. A domain may also have a data type or a format defined for it. USA_phone_numbers may have the format (ddd)-ddd-dddd, where each d is a decimal digit.
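As a rough illustration of a domain that carries a format, the sketch below tests strings against the (ddd)-ddd-dddd pattern described above. The names USA_PHONE_FORMAT and in_usa_phone_domain are hypothetical, and the check covers only the format, not whether a number is actually valid in the U.S.

```python
import re

# Format from the text: (ddd)-ddd-dddd, where each d is a decimal digit.
# The constant and function names are illustrative assumptions.
USA_PHONE_FORMAT = re.compile(r"\([0-9]{3}\)-[0-9]{3}-[0-9]{4}")


def in_usa_phone_domain(value: str) -> bool:
    """Return True if value matches the (ddd)-ddd-dddd format exactly."""
    return USA_PHONE_FORMAT.fullmatch(value) is not None


print(in_usa_phone_domain("(404)-894-2000"))  # True
print(in_usa_phone_domain("404-894-2000"))    # False
```

Note that a value written as "(404) 894-2000" would be rejected by this particular format, which shows how a domain's format definition constrains what may be stored.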
Cardinality of a relation
The number of tuples in a relation determines its cardinality.
Arity of a relation
The number of attributes in a relation determines its arity or degree.
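To make cardinality and arity concrete, here is a minimal sketch that models the CUSTOMER relation as a Python set of named tuples. The field names are adapted from the schema, and the second tuple is made-up data added only so that the relation has more than one row.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """One tuple of the CUSTOMER relation (field names adapted from the schema)."""
    cust_id: int
    cust_name: str
    address: str
    phone: str


# A relation instance modeled as a set of tuples.
customer = {
    Customer(632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"),
    Customer(100001, "Jane Doe", "1 Example Rd.", "(000) 000-0000"),  # hypothetical row
}

cardinality = len(customer)       # number of tuples in the instance
arity = len(Customer._fields)     # number of attributes in the schema

print(cardinality)  # 2
print(arity)        # 4
```

Cardinality is a property of the instance and changes as rows are added or removed, while arity is fixed by the schema.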
Definition Summary
Informal terms        Formal terms
Table                 Relation
Column                Attribute
Row                   Tuple
Value in a column     Domain
Table definition      Schema
Relations are sets
A relation is a set of tuples, which means:
- there can be no duplicate tuples
- the order of the tuples doesn't matter
In another model, relations are bags, a generalization of sets that allows duplicates. Commercial DBMSs use this model. But for now, we will stick with relations as sets.
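The difference between the set model and the bag model can be sketched with Python's built-in set and collections.Counter. The rows below are made-up (dept, number) pairs used only for illustration.

```python
from collections import Counter

# Hypothetical rows; the first two are duplicates of each other.
rows = [("csc", 343), ("csc", 343), ("csc", 207)]

as_set = set(rows)      # set semantics: duplicate tuples collapse into one
as_bag = Counter(rows)  # bag semantics: each tuple keeps its multiplicity

print(len(as_set))             # 2
print(sum(as_bag.values()))    # 3
print(as_bag[("csc", 343)])    # 2

# Order does not matter when relations are sets.
print({("csc", 343), ("csc", 207)} == {("csc", 207), ("csc", 343)})  # True
```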
Database schemas and instances
Database schema: a set of relation schemas
Database instance: a set of relation instances
Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances. There are three main types of constraints:
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
Key Constraints
Super keys
Informally: A super key is a set of one or more attributes whose combined values are unique, i.e., no two tuples can have the same values on all of these attributes.
Formally: If attributes a1, a2, ..., an form a super key for relation R, then there do not exist distinct tuples t1 and t2 in R such that (t1.a1 = t2.a1) ∧ (t1.a2 = t2.a2) ∧ ... ∧ (t1.an = t2.an).
Example
Course(dept, number, name, breadth)
One tuple might be <csc, 343, Introduction to Databases, True>.
Suppose our knowledge of the domain tells us that no two tuples can have the same value for dept and number. This means that {dept, number} is a super key. This is a constraint on what can go in the relation.
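The super key property can be tested against a single instance, as in the sketch below. Two caveats apply: a key is a constraint on every valid instance, so a check like this can refute a proposed super key but cannot prove one; and the second and third Course rows are made-up data. The function name is_superkey is also an assumption.

```python
def is_superkey(rows, attrs):
    """Return True if no two rows in this instance agree on all of attrs.

    rows is assumed to be a list of distinct rows, each a dict keyed by
    attribute name.
    """
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False  # two rows share the same values on attrs
        seen.add(key)
    return True


course = [
    {"dept": "csc", "number": 343, "name": "Introduction to Databases", "breadth": True},
    {"dept": "csc", "number": 207, "name": "Hypothetical Course A", "breadth": False},  # made up
    {"dept": "mat", "number": 343, "name": "Hypothetical Course B", "breadth": True},   # made up
]

print(is_superkey(course, ["dept", "number"]))  # True
print(is_superkey(course, ["dept"]))            # False
print(is_superkey(course, ["number"]))          # False
```

Here neither dept nor number alone distinguishes the rows, while the pair does, which is consistent with {dept, number} being a super key from which no attribute can be dropped.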
Does every relation have a super key?
Super key
If {dept, number} is a super key, then so is {dept, number, name}. This follows from the definition. But we are more interested in a minimal set of attributes with the super key property, minimal in the sense that no attribute can be removed from the set without it ceasing to be a super key.
Key
Key: a minimal super key. In the schema, by convention we often underline a key.
Aside: The term super key is related to the term superset. A super key is a superset of some key. (Not necessarily a proper superset.)
Candidate Key
K is a candidate key of relation R if and only if it possesses both of the following properties:
- Uniqueness: No legal value of R ever contains two distinct tuples with the same value for K.
- Irreducibility: No proper subset of K has the uniqueness property.
Student(Roll_number, Name, Class)
If no two students can have the same roll number, then Roll_number is the candidate key. This relation has four super keys: (Roll_number), (Roll_number, Name), (Roll_number, Class), (Roll_number, Name, Class)
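Since every super key is a superset of some key, the super keys of Student can be enumerated from its candidate key. The sketch below does this with itertools.combinations; the function name superkeys is an assumption, and the only candidate key supplied is Roll_number, as in the example above.

```python
from itertools import combinations


def superkeys(attributes, candidate_keys):
    """Return every attribute combination that contains some candidate key."""
    result = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in candidate_keys):
                result.append(combo)
    return result


student_attrs = ("Roll_number", "Name", "Class")
student_keys = [("Roll_number",)]

for sk in superkeys(student_attrs, student_keys):
    print(sk)
# ('Roll_number',)
# ('Roll_number', 'Name')
# ('Roll_number', 'Class')
# ('Roll_number', 'Name', 'Class')
```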
Primary key
A primary key has three properties:
- Uniqueness
- Irreducibility
- Not null
One of the candidate keys is chosen as the primary key.
Customer(Cust#, Name, Address, Phone)
The primary key is shown with a single underline.
Alternate Key
Candidate keys that are not chosen as the primary key are known as alternate keys.
Foreign Keys
Foreign keys are attributes of a table that refer to the primary key of another table. A foreign key permits only values that appear in the primary key of the table it refers to, or null. Foreign keys are used to link together two or more tables that are related to each other.
Foreign Key
A foreign key value is a reference to the tuple from which it was taken; that tuple is called the referenced or target tuple. The table containing the referenced tuple is called the target table.
Declaring Foreign Keys
A bit of notation: R[A]
- R is a relation and A is a list of attributes in R.
- R[A] is the set of all tuples from R, but with only the attributes in list A.
We declare foreign key constraints this way: R1[X] ⊆ R2[Y]
- X and Y may be lists of attributes, of the same arity.
- Y must be a key in R2.
Example: Contact[custID] ⊆ Customer[custID]
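Reading a foreign key declaration as a subset test between two projections, the constraint can be sketched as follows. The helper name project and all of the Contact and Customer rows are assumptions made for illustration; null values are not handled here.

```python
def project(rows, attrs):
    """R[A]: the set of tuples of R restricted to the attributes in attrs."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Hypothetical instances of Customer and Contact.
customer = [
    {"custID": 1, "name": "A"},
    {"custID": 2, "name": "B"},
]
contact = [
    {"custID": 1, "phone": "(000)-000-0001"},
    {"custID": 1, "phone": "(000)-000-0002"},
]

# Contact[custID] is a subset of Customer[custID], so the constraint holds.
print(project(contact, ["custID"]) <= project(customer, ["custID"]))  # True

# A contact that refers to a nonexistent customer breaks the constraint.
bad_contact = contact + [{"custID": 9, "phone": "(000)-000-0003"}]
print(project(bad_contact, ["custID"]) <= project(customer, ["custID"]))  # False
```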
Entity Integrity Rule
This rule states that in a relation, the value of any attribute of a primary key cannot be null.
Roll No   Name   Class    Marks
1         ABC    B.Tech   100
2         -      B.Tech   150
-         AH     B.Tech   300
4         -      -        -
-         AH     B.Tech   300
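The table above can be checked mechanically. The sketch below rests on two assumptions that the table itself does not state: that Roll No is the primary key, and that "-" stands for a null (represented here as None). The function name entity_integrity_violations is hypothetical.

```python
def entity_integrity_violations(rows, primary_key):
    """Return the indexes of rows with a null in any primary key attribute."""
    return [
        i for i, row in enumerate(rows)
        if any(row[a] is None for a in primary_key)
    ]


# The rows of the table, with "-" assumed to mean null (None).
student = [
    {"roll_no": 1, "name": "ABC", "class": "B.Tech", "marks": 100},
    {"roll_no": 2, "name": None, "class": "B.Tech", "marks": 150},
    {"roll_no": None, "name": "AH", "class": "B.Tech", "marks": 300},
    {"roll_no": 4, "name": None, "class": None, "marks": None},
    {"roll_no": None, "name": "AH", "class": "B.Tech", "marks": 300},
]

# Assumption: roll_no is the primary key.
print(entity_integrity_violations(student, ["roll_no"]))  # [2, 4]
```

Under these assumptions, the third and fifth rows violate the rule, while the fourth row does not: nulls in non-key attributes are permitted.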
Referential Integrity
This rule states that if a foreign key exists in a relation, each foreign key value must either match a primary key value of some tuple in its home relation or be wholly null.
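A minimal sketch of this rule, including the "wholly null" case, is given below. The function name, the Customer and Contact rows, and the use of None for null are all assumptions made for illustration.

```python
def referential_integrity_violations(referencing, fk_attrs, referenced, pk_attrs):
    """Return indexes of referencing rows whose foreign key value is neither
    wholly null nor equal to the primary key value of some referenced row."""
    targets = {tuple(row[a] for a in pk_attrs) for row in referenced}
    violations = []
    for i, row in enumerate(referencing):
        value = tuple(row[a] for a in fk_attrs)
        if all(v is None for v in value):
            continue  # a wholly null foreign key is allowed
        if value not in targets:
            violations.append(i)
    return violations


# Hypothetical data: the third contact refers to a customer that does not exist.
customer = [{"custID": 1}, {"custID": 2}]
contact = [
    {"custID": 1, "phone": "(000)-000-0001"},
    {"custID": None, "phone": "(000)-000-0002"},
    {"custID": 3, "phone": "(000)-000-0003"},
]

print(referential_integrity_violations(contact, ["custID"], customer, ["custID"]))  # [2]
```

For a foreign key made of several attributes, a value that is only partly null is reported as a violation here, since it is neither wholly null nor a match for any primary key value.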
Thanks to Marina Barsky and Diane Horton for the material.