CMP-3440 Database Systems
Logical Design
Lecture 03
zain

Database Design Process
[Figure: database design process. Labels: conceptual requirements for Applications 1 to 4; Conceptual Model; Logical Model; Internal Model; an External Model for each of Applications 1 to 4.]

Logical Model: Mapping to a Relational Model
Each entity in the ER diagram becomes a relation.
A properly normalized ER diagram will indicate where intersection relations for many-to-many mappings are needed.
Relationships are indicated by common columns (or domains) in the tables that are related.

Data Dependencies
Functional Dependency: given a relation R(A, B, C, D), where A, B, C, D are attributes, example functional dependencies are A -> B,C,D and B,C -> D.
Full Functional Dependency: A -> B,C,D
Partial Dependency: in R(A, B, C, D, E, F) with A,B -> C,D,E,F, the dependencies A -> C,E and B -> D,F each use only part of the key.
Transitive Dependency: in R(A, B, C, D), A -> B, B -> C, C -> D.

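One way to make the definition concrete: X -> Y holds in a relation instance when no two rows agree on X but differ on Y. The following is a minimal sketch of that test; the function name `fd_holds` and the sample rows are invented for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault records the first rhs value seen for each lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical instance of R(A, B, C, D)
R = [
    {"A": 1, "B": "x", "C": 10, "D": "p"},
    {"A": 2, "B": "x", "C": 10, "D": "p"},
    {"A": 3, "B": "y", "C": 10, "D": "q"},
]

a_determines_bcd = fd_holds(R, ["A"], ["B", "C", "D"])
bc_determines_d = fd_holds(R, ["B", "C"], ["D"])
c_determines_d = fd_holds(R, ["C"], ["D"])
```

Note that sample data can only refute a dependency. Whether a dependency holds for the schema as a whole is a statement about the business rules, not about one set of rows.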
Normalization
Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating, and deleting data than other sets of relations containing the same data.
Normalization is a multi-step process beginning with an unnormalized relation.

Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

Normal Forms
First Normal Form: functional dependency of nonkey attributes on the primary key; atomic values only.
Second Normal Form: full functional dependency of nonkey attributes on the primary key.
Third Normal Form: no transitive dependency between nonkey attributes.
Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency.

Unnormalised Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table: transform data from the information source (e.g. a form) into table format with columns and rows.

First Normal Form
To be in First Normal Form, a relation must contain only atomic values at each row and column: the intersection of each row and column contains one and only one value.
1NF disallows composite attributes, multivalued attributes, and nested relations.
UNF to 1NF:
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s).

UNF to 1NF
Remove the repeating group by entering appropriate data into the empty columns of rows containing repeating data (flattening the table).

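Flattening can be sketched in a few lines. The order data and the function name `flatten` below are hypothetical; the point is only that the non-repeating values are copied into every row generated from the repeating group.

```python
# Hypothetical unnormalized rows: each order carries a repeating group of items.
unf = [
    {"order_no": 1, "customer": "Ada", "items": [("pen", 2), ("ink", 1)]},
    {"order_no": 2, "customer": "Bob", "items": [("pad", 5)]},
]

def flatten(rows):
    """Emit one row per repeating-group entry, repeating the order data."""
    flat = []
    for row in rows:
        for item, qty in row["items"]:
            flat.append((row["order_no"], row["customer"], item, qty))
    return flat

first_nf = flatten(unf)
# The key of the flattened table is now composite: (order_no, item).
keys = {(order_no, item) for order_no, _, item, _ in first_nf}
```

The flattened table holds only atomic values, but the customer name is now repeated for every item, which is the redundancy the later normal forms remove.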
Second Normal Form
A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key. That is, every nonkey attribute needs the full primary key for unique identification.
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new relation along with a copy of their determinant.

2NF Example
EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
SSN, PNUMBER -> HOURS
SSN -> ENAME
PNUMBER -> PNAME, PLOCATION

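A possible decomposition of EMP_PROJ into 2NF, sketched with Python's built-in `sqlite3` module. The relation names WORKS_ON, EMPLOYEE, and PROJECT and the sample rows are assumptions made for illustration; the slide does not name the resulting relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP_PROJ (SSN TEXT, PNUMBER INTEGER, HOURS REAL,
                       ENAME TEXT, PNAME TEXT, PLOCATION TEXT,
                       PRIMARY KEY (SSN, PNUMBER));
INSERT INTO EMP_PROJ VALUES
  ('111', 1, 10.0, 'Smith', 'ProductX', 'Bellaire'),
  ('111', 2, 5.0,  'Smith', 'ProductY', 'Sugarland'),
  ('222', 1, 20.0, 'Wong',  'ProductX', 'Bellaire');

-- Each partial dependency moves to its own relation with a copy of its determinant.
CREATE TABLE WORKS_ON AS SELECT SSN, PNUMBER, HOURS FROM EMP_PROJ;
CREATE TABLE EMPLOYEE AS SELECT DISTINCT SSN, ENAME FROM EMP_PROJ;
CREATE TABLE PROJECT  AS SELECT DISTINCT PNUMBER, PNAME, PLOCATION FROM EMP_PROJ;
""")

# Joining the three relations on their shared columns rebuilds the original rows.
rejoined = conn.execute("""
    SELECT w.SSN, w.PNUMBER, w.HOURS, e.ENAME, p.PNAME, p.PLOCATION
    FROM WORKS_ON w
    JOIN EMPLOYEE e ON e.SSN = w.SSN
    JOIN PROJECT  p ON p.PNUMBER = w.PNUMBER
    ORDER BY w.SSN, w.PNUMBER
""").fetchall()
original = conn.execute(
    "SELECT * FROM EMP_PROJ ORDER BY SSN, PNUMBER").fetchall()
employee_rows = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```

Each employee name is now stored once rather than once per project assignment, and no information is lost because the join reproduces EMP_PROJ.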
Third Normal Form
A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes.
When one nonkey attribute can be determined by one or more other nonkey attributes, there is said to be a transitive functional dependency.
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
3NF: all non-prime attributes are fully and directly dependent on the PK.

3NF
EMP_PROJ(SSN, ENAME, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
SSN -> ENAME, BDATE, ADDRESS, DNUMBER
DNUMBER -> DNAME, DMGRSSN
Decomposed into:
(SSN, ENAME, BDATE, ADDRESS, DNUMBER)
(DNUMBER, DNAME, DMGRSSN)

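The transitive dependency in this example can be found mechanically with an attribute-closure computation. This is a sketch under the assumption that SSN is the only candidate key; the helper name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

all_attrs = {"SSN", "ENAME", "BDATE", "ADDRESS", "DNUMBER", "DNAME", "DMGRSSN"}
fds = [
    ({"SSN"}, {"ENAME", "BDATE", "ADDRESS", "DNUMBER"}),
    ({"DNUMBER"}, {"DNAME", "DMGRSSN"}),
]
prime = {"SSN"}  # assumption: SSN is the only candidate key

ssn_is_key = closure({"SSN"}, fds) == all_attrs
# A dependency violates 3NF when its left side is not a superkey
# and it determines at least one non-prime attribute.
violations = [
    (lhs, rhs) for lhs, rhs in fds
    if closure(lhs, fds) != all_attrs and (rhs - lhs - prime)
]
```

Here `violations` contains only DNUMBER -> DNAME, DMGRSSN, which is the dependency that gets its own relation in the decomposition.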
Boyce-Codd Normal Form
Most 3NF relations are also BCNF relations.
A 3NF relation can fail to be in BCNF only when:
the candidate keys in the relation are composite keys (they are not single attributes),
there is more than one candidate key in the relation, and
the candidate keys overlap (they share at least one attribute).
A relation is in BCNF if and only if every determinant is a candidate key.

BCNF Example
Two composite candidate keys:
FD1: {Client, Subject} -> Staff
FD2: {Client, Staff} -> Subject
These candidate keys overlap on Client.
Staff is a determinant (it determines Subject) but not a candidate key.

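The example can be checked with a brute-force search for candidate keys. This is a sketch: the dependency Staff -> Subject is an assumption inferred from the statement that Staff is a determinant, and the helper names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

attrs = ["Client", "Staff", "Subject"]
fds = [
    ({"Client", "Subject"}, {"Staff"}),  # FD1
    ({"Client", "Staff"}, {"Subject"}),  # FD2
    ({"Staff"}, {"Subject"}),            # assumed: Staff is a determinant
]

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys

keys = candidate_keys(attrs, fds)
prime = set().union(*keys)
# BCNF: every determinant must be a superkey.
bcnf_violations = [lhs for lhs, _ in fds if closure(lhs, fds) != set(attrs)]
# 3NF additionally tolerates dependencies whose right side is prime.
third_nf_violations = [
    lhs for lhs, rhs in fds
    if closure(lhs, fds) != set(attrs) and (rhs - lhs - prime)
]
```

Under these assumptions the relation passes the 3NF test, because Subject is part of a candidate key, yet fails BCNF because of the determinant Staff.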
Data Integrity
Ensuring the integrity of the organization's databases is a key component of the DBA's job.
A database is of little use if the data it contains is inaccurate or if it cannot be accessed due to integrity problems.
The DBA has many tools at his disposal to ensure data integrity.

Entity Integrity
Entity integrity means that each occurrence of an entity must be uniquely identifiable.
Although most DBMSs do not force the creation of a primary key for each table, it is a tenet of the Relational Model.
Enforce entity integrity by creating a PK for each table in the database.
A primary key constraint can consist of one or more columns from the same table that are unique within the table.
A table can have only one primary key constraint, which cannot contain nulls.

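A minimal sketch of a primary key constraint rejecting duplicate and null keys, using Python's built-in `sqlite3` module. The `dept` table and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, unlike the SQL standard, accepts
# NULLs in a non-INTEGER primary key column unless told otherwise.
conn.execute("CREATE TABLE dept (deptno TEXT NOT NULL PRIMARY KEY, dname TEXT)")
conn.execute("INSERT INTO dept VALUES ('D01', 'Research')")

def try_insert(values):
    """Return None on success, or the constraint error message on failure."""
    try:
        conn.execute("INSERT INTO dept VALUES (?, ?)", values)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_error = try_insert(("D01", "Sales"))  # same key as an existing row
null_error = try_insert((None, "Sales"))        # no key at all
row_count = conn.execute("SELECT COUNT(*) FROM dept").fetchone()[0]
```

Both inserts are rejected, so the table still holds exactly one row per department key.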
Uniqueness
A unique constraint is similar to a primary key constraint. However, each table can have many unique constraints.
Depending on the DBMS, unique constraints may not be usable to support referential constraints (standard SQL does allow a foreign key to reference a unique constraint).
The values stored in the column, or combination of columns, must be unique within the table. That is, no other row can contain the same value.
A unique constraint most likely requires a unique index to enforce.

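A short sketch of several unique constraints on one table, again with `sqlite3`. The `emp` table and its columns are invented for illustration. In SQLite, each UNIQUE constraint is backed by an automatically created unique index, which `PRAGMA index_list` reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        empno INTEGER PRIMARY KEY,
        ssn   TEXT UNIQUE,
        email TEXT UNIQUE
    )""")
conn.execute("INSERT INTO emp VALUES (1, '111-11-1111', 'a@example.com')")

try:
    # Different empno and ssn, but the email collides with row 1.
    conn.execute("INSERT INTO emp VALUES (2, '222-22-2222', 'a@example.com')")
    email_error = None
except sqlite3.IntegrityError as exc:
    email_error = str(exc)

# One automatically created index per UNIQUE constraint.
unique_indexes = conn.execute("PRAGMA index_list(emp)").fetchall()
```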
Data Types
Data type and data length are the most fundamental integrity constraints applied to data in the database.
DBAs must choose data types wisely. The DBMS will automatically ensure that only the correct type of data is stored in that column.
Choose the data type that most closely matches the domain of values for the column. For example, a numeric column should be defined as one of the numeric data types: integer, decimal, or floating point.
If you specify a character data type for a column that will contain numeric data, the DBMS cannot automatically enforce the integrity of the data.

Default Values
Each column can be assigned a default value that will be used if subsequent INSERTs do not provide a value.
Each column can have only one default value.
The column's data type, length, and property must be able to support the default value specified.
The default may be null, but only if the column is created as a nullable column.

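A minimal sketch of column defaults with `sqlite3`; the table and column names are hypothetical. `status` has a non-null default, while `bonus` is nullable and so may default to null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        status TEXT NOT NULL DEFAULT 'ACTIVE',
        bonus  REAL DEFAULT NULL
    )""")

# Row 1 supplies no status, so the default is used; row 2 overrides it.
conn.execute("INSERT INTO emp (empno) VALUES (1)")
conn.execute("INSERT INTO emp (empno, status) VALUES (2, 'LEAVE')")

rows = conn.execute(
    "SELECT empno, status, bonus FROM emp ORDER BY empno").fetchall()
```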
CHECK Constraints
A check constraint is a DBMS-defined restriction placed on the data values that can be stored in a column or columns.
The expression is explicitly defined in the table DDL and is formulated in much the same way that SQL WHERE clauses are formulated.
Any attempt to modify the column data (INSERT or UPDATE) causes the expression to be evaluated.
If the modification conforms to the expression, the modification is permitted to proceed. If not, the statement will fail with a constraint violation.

CHECK Constraints Example

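As an illustrative sketch (not the example from the original slide), the table below defines one column-level and one table-level check constraint with `sqlite3`. The `emp` table, its columns, and the helper `attempt` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        salary REAL CHECK (salary >= 0),
        comm   REAL,
        CHECK (comm IS NULL OR comm <= salary)
    )""")
conn.execute("INSERT INTO emp VALUES (1, 50000, 1000)")

def attempt(sql):
    """Run a statement; return None on success or the error message."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# Both INSERT and UPDATE cause the check expressions to be evaluated.
insert_error = attempt("INSERT INTO emp VALUES (2, -5, NULL)")
update_error = attempt("UPDATE emp SET comm = 60000 WHERE empno = 1")
valid_update = attempt("UPDATE emp SET salary = 55000 WHERE empno = 1")
```

The negative salary and the commission larger than the salary are both rejected, while the conforming update proceeds.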
Referential Integrity
Referential Integrity (RI) is a method for ensuring the correctness of data.
The identification of the primary and foreign keys that constitute a relationship between tables is a component of defining referential integrity.
RI also requires the definition of rules that dictate how modification of data stored in columns involved in the relationship can be accomplished.

RI: Parent / Child
For any given referential constraint, the parent table is the table that contains the primary key, and the child table is the table that contains the foreign key.
The parent table in the employed-by relationship is the DEPT table. The child table is the EMP table.

RI Rules
Three types of rules can be attached to each referential constraint:
INSERT rule
UPDATE rule
DELETE rule

INSERT Rules
The INSERT rule indicates what will happen if you attempt to insert a value into a foreign key column without a corresponding primary key value in the parent table.
There are two aspects to the RI INSERT rule:
1. It is never permissible to insert a row into a dependent table with a foreign key value that does not correspond to a primary key value. This is known as the restrict-insert rule.
2. Whether actual values must be specified instead of nulls.

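Both aspects of the INSERT rule can be sketched with `sqlite3`, using DEPT as the parent and EMP as the child. The column names and the helper `insert_emp` are assumptions. Here the foreign key column is nullable, so a null is accepted; declaring it NOT NULL would require an actual value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE dept (deptno TEXT PRIMARY KEY, dname TEXT);
CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT,
                   deptno TEXT REFERENCES dept(deptno));
INSERT INTO dept VALUES ('D01', 'Research');
""")

def insert_emp(empno, ename, deptno):
    """Return None on success, or the constraint error message on failure."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (empno, ename, deptno))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

valid_parent = insert_emp(1, "Smith", "D01")    # matching parent row exists
missing_parent = insert_emp(2, "Wong", "D99")   # no such department: rejected
null_parent = insert_emp(3, "Zelaya", None)     # allowed: the column is nullable
emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```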
UPDATE Rule
The UPDATE rule controls updates such that a foreign key value cannot be updated to a value that does not correspond to a primary key value in the parent table.
There are, however, two ways to view the update rule:
Foreign key perspective.
Primary key perspective.

UPDATE Rule: FK Perspective
Once you have assigned a foreign key to a row, either at insertion or afterward, you must decide whether that value can be changed.
This is determined by looking at the business definition of the relationship and the tables it connects.
If you permit a foreign key value to be updated, the new value must either be equal to a primary key value currently in the parent table or be null.

UPDATE Rule: PK Perspective
If a primary key value is updated, three options exist for handling foreign key values:
Restricted UPDATE. The modification of the primary key column(s) is not allowed if foreign key values exist.
Neutralizing UPDATE. All foreign key values equal to the primary key value(s) being modified are set to null. Of course, neutralizing UPDATE requires that nulls be permitted on the foreign key column(s).
Cascading UPDATE. All foreign key columns with a value equal to the primary key value(s) being modified are modified as well.

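In SQL, the three options roughly correspond to the `ON UPDATE RESTRICT`, `ON UPDATE SET NULL`, and `ON UPDATE CASCADE` actions of a foreign key. The sketch below tries each one with `sqlite3`; the function name `update_parent_key` and the table layout are illustrative assumptions.

```python
import sqlite3

def update_parent_key(action):
    """Build DEPT/EMP with the given ON UPDATE action, then rename D01 to D02.

    Returns "rejected" if the update fails, else the child's deptno afterward.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE dept (deptno TEXT PRIMARY KEY)")
    conn.execute(f"""
        CREATE TABLE emp (
            empno  INTEGER PRIMARY KEY,
            deptno TEXT REFERENCES dept(deptno) ON UPDATE {action}
        )""")
    conn.execute("INSERT INTO dept VALUES ('D01')")
    conn.execute("INSERT INTO emp VALUES (1, 'D01')")
    try:
        conn.execute("UPDATE dept SET deptno = 'D02' WHERE deptno = 'D01'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT deptno FROM emp WHERE empno = 1").fetchone()[0]

restricted = update_parent_key("RESTRICT")   # "rejected"
neutralized = update_parent_key("SET NULL")  # child's deptno becomes None
cascaded = update_parent_key("CASCADE")      # child's deptno becomes "D02"
```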
DELETE Rule
Similar to the primary key perspective of the update rule, three options exist when deleting a row from a parent table:
Restricted DELETE. The deletion of the primary key row is not allowed if a foreign key value exists.
Neutralizing DELETE. All foreign key values equal to the primary key value of the row being deleted are set to null.
Cascading DELETE. All foreign key rows with a value equal to the primary key of the row about to be deleted are deleted as well.

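The delete options map in the same way onto the `ON DELETE RESTRICT`, `ON DELETE SET NULL`, and `ON DELETE CASCADE` foreign key actions. A minimal `sqlite3` sketch follows; the function name `delete_parent` and the table layout are assumptions.

```python
import sqlite3

def delete_parent(action):
    """Build DEPT/EMP with the given ON DELETE action, then delete dept D01.

    Returns "rejected" if the delete fails, else the remaining EMP rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE dept (deptno TEXT PRIMARY KEY)")
    conn.execute(f"""
        CREATE TABLE emp (
            empno  INTEGER PRIMARY KEY,
            deptno TEXT REFERENCES dept(deptno) ON DELETE {action}
        )""")
    conn.execute("INSERT INTO dept VALUES ('D01')")
    conn.execute("INSERT INTO emp VALUES (1, 'D01')")
    try:
        conn.execute("DELETE FROM dept WHERE deptno = 'D01'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT empno, deptno FROM emp").fetchall()

restricted = delete_parent("RESTRICT")   # parent row cannot be deleted
neutralized = delete_parent("SET NULL")  # child row kept, deptno set to null
cascaded = delete_parent("CASCADE")      # child row deleted with the parent
```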