CS5300 Database Systems

Similar documents
Database Systems Relational Model. A.R. Hurson 323 CS Building

Review. The Relational Model. Glossary. Review. Data Models. Why Study the Relational Model? Why use a DBMS? OS provides RAM and disk

CS5300 Database Systems

Administrivia. The Relational Model. Review. Review. Review. Some useful terms

Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

The Relational Model. Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC)

The Relational Model

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

The Relational Model of Data (ii)

Mobile and Heterogeneous databases

Database Applications (15-415)

The Relational Model

The Relational Model. Chapter 3

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

Relational data model

The Relational Data Model. Data Model

Database Management Systems. Chapter 1

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Database Systems ( 資料庫系統 )

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Database Management Systems. Chapter 3 Part 1

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

CS425 Midterm Exam Summer C 2012

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Supplier-Parts-DB SNUM SNAME STATUS CITY S1 Smith 20 London S2 Jones 10 Paris S3 Blake 30 Paris S4 Clark 20 London S5 Adams 30 Athens

John Edgar 2

The Relational Model. Database Management Systems

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

DBMS. Relational Model. Module Title?

Mobile and Heterogeneous databases Distributed Database System Transaction Management. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

Page 1. Goals for Today" What is a Database " Key Concept: Structured Data" CS162 Operating Systems and Systems Programming Lecture 13.

BBM371- Data Management. Lecture 1: Course policies, Introduction to DBMS

King Fahd University of Petroleum and Minerals

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Information Systems. Relational Databases. Nikolaj Popov

CS275 Intro to Databases

Relational Data Model

Introduction to Database Systems

Relational Database Systems Part 01. Karine Reis Ferreira

Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU /615

Page 1. Quiz 18.1: Flow-Control" Goals for Today" Quiz 18.1: Flow-Control" CS162 Operating Systems and Systems Programming Lecture 18 Transactions"

Database Systems. Course Administration

Chapter 4. The Relational Model

CompSci 516 Database Systems. Lecture 2 SQL. Instructor: Sudeepa Roy

Keys, SQL, and Views CMPSCI 645

The Relational Model. Week 2

The Relational Model 2. Week 3

Introduction to Data Management. Lecture #4 E-R Model, Still Going

Chapter 1: Introduction

Mobile and Heterogeneous databases

CS2300: File Structures and Introduction to Database Systems

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

Database Management Systems MIT Introduction By S. Sabraz Nawaz

Review: Where have we been?

CS430/630 Database Management Systems Spring, Betty O Neil University of Massachusetts at Boston

Database Systems. Lecture2:E-R model. Juan Huo( 霍娟 )

01/01/2017. Chapter 5: The Relational Data Model and Relational Database Constraints: Outline. Chapter 5: Relational Database Constraints

CS 377 Database Systems

Relational Data Structure and Concepts. Structured Query Language (Part 1) The Entity Integrity Rules. Relational Data Structure and Concepts

The Relational Model and Normalization

Score. 1 (10) 2 (10) 3 (8) 4 (13) 5 (9) Total (50)

Relational Algebra Homework 0 Due Tonight, 5pm! R & G, Chapter 4 Room Swap for Tuesday Discussion Section Homework 1 will be posted Tomorrow

CS 2451 Database Systems: Relational Data Model

Unit 3 The Relational Model

Relational Query Languages

Mobile and Heterogeneous databases Security. A.R. Hurson Computer Science Missouri Science & Technology

Data about data is database Select correct option: True False Partially True None of the Above

Introduction to Data Management. Lecture #3 (E-R Design, Cont d.)

A Sample Solution to the Midterm Test

CIS 330: Applied Database Systems

Database Management Systems MIT Lesson 01 - Introduction By S. Sabraz Nawaz

Announcements If you are enrolled to the class, but have not received the from Piazza, please send me an . Recap: Lecture 1.

1 (10) 2 (8) 3 (12) 4 (14) 5 (6) Total (50)

9/8/2018. Prerequisites. Grading. People & Contact Information. Textbooks. Course Info. CS430/630 Database Management Systems Fall 2018

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris

Databases. Jörg Endrullis. VU University Amsterdam

Computer Organization

Lecture 2 08/26/15. CMPSC431W: Database Management Systems. Instructor: Yu- San Lin

Introduction to Databases

Course Logistics & Chapter 1 Introduction

Database Technology Introduction. Heiko Paulheim

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Database Systems ( 資料庫系統 ) Practicum in Database Systems ( 資料庫系統實驗 ) 9/20 & 9/21, 2006 Lecture #1

Introduction to Database Management Systems

UFCEKG 20 2 : Data, Schemas and Applications

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence

U1. Data Base Management System (DBMS) Unit -1. MCA 203, Data Base Management System

Lecture 15: Database Normalization

The Relational Model Constraints and SQL DDL

The Relational Model

Introduction The SELECT statement: basics Nested queries Set operators Update commands Table management

A7-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

MIT Database Management Systems Lesson 01: Introduction

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Transcription:

CS5300 Database Systems Introduction A.R. Hurson 128 EECH Building hurson@mst.edu

Instructor: Ali R. Hurson 128 EECH Building hurson@mst.edu Text: Fundamentals of Database Systems 7 th Edition (Elmasri and Navathe) Course material is available at: http://hurson.weebly.com/cs-5300-database-systems.html 2

Grading Distribution Periodic Exams and Quizzes: 40% Final Exam (Comprehensive): 30% Project(s) (Mandatory): 20% Home works: 10% 3

Hardcopy of homeworks and projects are collected at the beginning of class, It is encouraged to work as a group (at most two people per group) on homeworks and project(s) (grouping is fixed through out the semester, Groups are not allowed to discuss about homework assignments and project(s) with each other, December 1 is the deadline for filing grade corrections; no requests for grade change/update will be entertained after this deadline. 4

Outline 1. Introduction 2. Relational Model 3. Relational Algebra a) Traditional Set Operators b) Extended Relational Operators 4. SQL Language a) Data Definition Language b) Data Manipulation Language c) SQL Queries 5. Performance Measure a) Basic Operations b) Query Optimization 6. Schema Refinement and Normal Forms 7. Transaction Management a) Concurrency Control b) Recovery 8. Access Control and Data Security 9. Beyond Centralized Databases a) Distributed Databases b) Multidatabases c) Mobile Databases 5

Note, this unit will be covered in two lectures. In case you finish it earlier, then you have the following options: 1) Take the early test and start CS5300.module2 2) Study the supplement module (supplement CS5300.module1) 3) Act as a helper to help other students in studying CS5300.module1 Note, options 2 and 3 have extra credits as noted in course outline. 6

Enforcement of background Current Module { Glossary of prerequisite topics No Familiar with the topics? Yes Take Test No Pass? Remedial action Yes { Glossary of topics No Familiar with the topics? Take the Module Yes Take Test No Pass? Yes Options Study next module? Review CS5300.module1.background Lead a group of students in this module (extra credits)? Study more advanced related topics (extra credits)? At the end: take exam, record the score, impose remedial action if not successful 7 Extra Curricular activities

You are expected to be familiar with: Basic principles of relational database model, Data Storage and Indexing File organizations, Some performance metrics If not, you need to study CS5300.module1.background 8

This section is intended to: motivate concept of databases and database management systems, distinguish differences between file management system and database management system, define a set of terminologies, mainly within the scope of relational model, that we will refer to extensively in this course. 9

Multimedia Sensors? Image Binary Text Data 1970 1980 1990 2000 2010 10

Sample Data sources In mid 1980s, it was estimated that the U.S. Patent Office and Trademark has a database of size 25 terabytes (1 tera = 10 12 ). In 1990s, it was estimated that the NASA's Earth Observing Project will generate more than 11,000 terabytes of data. An estimate puts the amount of new information generated in 2002 to 5 exabytes (1 exa = 10 18 ). It is estimated that in 2010 more than 13 Exabyte of new data has been stored (over 50,000 times the data in the library of Congress). 11

1 Bit = Binary Digit 8 Bits = 1 Byte 1000 Bytes = 1 Kilobyte 1000 Kilobytes = 1 Megabyte (10 6 ) 1000 Megabytes = 1 Gigabyte (10 9 ) 1000 Gigabytes = 1 Terabyte (10 12 ) 1000 Terabytes = 1 Petabyte (10 15 ) 1000 Petabytes = 1 Exabyte (10 18 ) 1000 Exabytes = 1 Zettabyte (10 21 ) 1000 Zettabytes = 1 Yottabyte (102 4 ) 1000 Yottabytes = 1 Brontobyte (10 27 ) 1000 Brontobytes = 1 Geopbyte (10 30 ) Sources: http://www.whatsabyte.com/ and http://wiki.answers.com 12

Feb. 2011 295 Exabytes (10 18 ) all human documents in 40,000 Yrs all spoken words in all lives Every two days we create as much data as we did from the beginning of mankind until 2003! amount human minds can store in 1yr Sources: Lesk, Berkeley SIMS, Landauer, EMC, TechCrunch, Smart Planet 14

How many trees does it take to print out an Exabyte? 1 Exabyte could hold approximately 500,000,000,000,000 pages of standard printed text It takes one tree to produce 94,200 pages of a book Thus it will take 530,785,562,327 trees to store an Exabyte of data In 2005, there were 400,246,300,201 trees on Earth We can store.75 Exabytes of data using all the trees on the entire planet. Sources: http://www.whatsabyte.com/ and http://wiki.answers.com 15

Five V s : Database Systems Volume the sheer size of data is a major challenge and is the most easily recognizable, Variety heterogeneity of the data type, representation, and semantic interpretation, Velocity the rate at which data is generated and the time in which it must be processed, Variability the rate at which data is changing and generated (dynamic data), and Veracity Accuracy and reliability of data. Big data has to be managed in context, which may be noisy. 13

Economical Impact Database Systems In 2010, it was estimated that 140,000-190,000 workers with deep analytical experience is needed in US alone, in addition 1.5 million managers will need to become data-literate. It is estimated that during the period of 2012-2022, 57% of STEM jobs are in Computer Science. Over the next 15 years, up to 19 million new STEM related jobs will be created worldwide and the Big Data talent pool will increase more than 500 percent by 2030. This makes Big Data related jobs the second among all STEM disciplines. 16

Areas of application: Scientific research Education Health care Intelligent Transportation System Environmental modelling 17

Challenges Heterogeneity and Incompleteness Scale Timeliness Privacy Human Collaboration 18

Amount we actually aware of (.7%) 19 Source Chris Johnson, University of Utah, IPDPS2012

Database systems are the outgrowth of the traditional file management systems. It is essentially a computerized record-keeping system it is a repository for a collection of computerized data files. 20

The user will be given facilities to perform a variety of operations such as: Adding new files to the database, Inserting new data into existing files, Retrieving data, Updating data, Deleting data, Removing existing files from the database. 21

In short, database systems are designed to provide timely and reliable access to information. Timely could imply; efficiency, speed, performance, robustness,. Reliable could imply; performance, security, integrity, accuracy,. 22

Database is an integrated, time dependent, shared collection of data describing the activities of one or more related organizations. Integrated means that the database can be thought of as a unification of several otherwise distinct data files. Time dependent means that the contents of database changes over time. Shared implies a multi-user environment. 23

A database system is defined by: A set of data to represent entities (objects) and interrelationships among them, A set of rules to preserve the data integrity, and A set of operations to manipulate the data. 24

Database management system (DBMS) is a collection of system routines designed to maintain and utilize databases an umbrella shielding database user from hardware/software details. 25

In a file management system data is a collection of information sources. It was designed to support efficient storage and access to the data. In short, it mainly addresses the organization of data on secondary storage. 26

A file management system: Does not provide data independence. Is not a robust environment. Could be very efficient for a specific application. In general, is not efficient for a group of applications. 27

In contrast to a file management system, a database management system takes a top down approach to process the information. It provides: Data independence, insulates application programs from data representation and storage details (physical data independence). Furthermore, changes in logical structure of data is hidden from applications (logical data independence). 28

Efficiency (storage and execution time), this feature is traced back to the file systems; Redundancy can be reduced, Data can be shared, Conflicting requirements can be balanced. Data integrity and security (integrity constraints) Inconsistencies can be avoided, Security restrictions can be applied, Integrity can be maintained. 29

Data administration Concurrent accesses and recovery Reduced application development time. 30

Data Independence Database Management System Efficiency Data Integrity & Security Concurrency Control Reduced Application Development time 31

However, database management systems: May not be efficient for specific application domains, Are robust within the set of applications they are intended for. 32

Questions Why transition from centralized databases to distributed databases? Why break a file into several smaller files? Redundancy, is it good or bad? 33

A Database management system: Stores proper information on secondary storage. Defines methods to identify relevant data for an application. Allows to write application programs to carry out tasks for each application (user). Develops schemes to guarantee data integrity (consistency). Provides means for data security, privacy, and protection. Makes sure that after proper modification, new state of database is preserved on permanent storage medium. 34

Data model a method of describing data: Relational data model Database is represented by a set of tables (relations), in which a row (tuple) represents an entity (object, record) and a column corresponds to an attribute of an entity. Schema A description of a particular collection of data using a given data model. 35

Levels of abstraction Conceptual (logical) level describes stored data in terms of the data model s conceptual data base design. This provides physical data independence. Physical level specifies additional storage details. In another words, it is the realization of conceptual schema level. External level (user s view of data), this provides logical data independence. 36

Transaction is a sequence of operations that transfers database from one consistent state to another consistent state. A transaction can be looked at as a collection of read and write operations terminated by a commit or abort operation. 37

To provide a timely access to information, a database management system should interleave several transactions and allow concurrent execution of transactions. This also means that proper concurrency rules (constraints) must be applied so that each transaction appears to execute in isolation. 38

In general, a database management system should guarantee ACID property of a transaction (Atomicity, Consistency, Isolation, Durability): 39

Atomicity: Either all operations of a transaction happen or none happen. State changes in a transaction are atomic. Consistency: A transaction produces results consistent with integrity requirements of the database. Isolation: In spite of the concurrent execution of transactions, each transaction believes it is executing in isolation. Intermediate results of a transaction should be hidden from other concurrently executing transactions. Durability: On successful completion of a transaction, the effects of the transaction should survive failures. 40

As noted before, database systems are designed to provide timely and reliable access to information. Within the scope of transaction processing: Timely means interleaving the actions of several transactions and concurrent execution of transactions (Consistency, Isolation). Reliable means Atomicity and Durability. 41

A relational database is a collection of relations. A relation corresponds to an entity which is referred to as a table (a collection of rows of the same structure. It is a two dimensional array (m * n) of m distinct rows and n columns. m is called the cardinality and n is called the degree (arity). 42

A tuple corresponds to each row of the table, it is a collection of n attributes. The primary key is an identifier for the table, a column or a collection of columns, that uniquely identifies each tuple. 43

Domain is a pool of values from which one or more attributes draw their actual values. In another words, a domain is a collection of smallest semantic unit of data (Scalar data). A domain is a set of scalar values of the same type. Note at this point we assume that a domain does not contain a null element (this will be relaxed later). A domain can be classifies as: Simple domain Complex domain 44

Simple domain: Domains of scalar values, Complex domain: domains. A collection of simple Note: if two attributes draw their values from the same domain, then it make sense to perform various operations (such as comparison, join, ) involving these two attributes. 45

Relation: A relation R on a set of domains D 1, D 2,, D n (not necessarily all distinct) is a subset of Cartesian products of D 1, D 2,, D n. R D 1 * D 2 *. * D n In another words, R is a collection of tuples each of n elements (a i 1 i n) where a i D i for 1 i n. 46

A relation consists of two parts: Heading (Schema, relation schema) Column heads: consists of a fixed set of attributes, more precisely, attribute domain pairs; {(A 1 :D 1 ), (A 2 :D 2 ),, (A n :D n )}, where A is are distinct. A schema represents the relation name, name of each attributes and the domain of each attribute. Students (Sid:string, name:string, login:string, age:integer, gpa:real) 47

Body (relation instance): consists of a timevarying set of tuples. Each tuple consists of a set of attribute value pairs. {(A 1 :vi 1 ), (A 2 :vi 2 ),, (A n :vi n )} Note cardinality of a relation may change in time, but its degree (arity) does not. 48

For example: S# Sname Status City S 1 Smith 20 London S 2 Jones 10 Paris S 3 Blake 30 Paris S 4 Clark 20 London S 5 Adams 30 Athens 49

The previous example shows a relation with cardinality of 5 and degree of 4. Each row represents a tuple and each column represents an attribute. Furthermore, each column draw its value from its domain. 50

Refer to our previous example: City is an attribute that draws its values from a domain which contains London London Paris, etc. Similarly, S# draws its value from a domain which contains S 1, S 2, S 3 S 4, S 5, etc. 51

Properties of relations: There is no duplicated tuples. No two rows are identical there is always a primary key. Tuples are unordered. Attributes are unordered. All attribute values are atomic (we might relax this later). 52

Formally, let R (f 1 :D 1 ), (f 2 :D 2 ),, (f n :D n ) be a relation schema and for each f i (1 i n) let Dom i be the set of values associated with the domain D i. An instance of R that satisfies the domain constraints in the schema is a set of tuples (a relation) with n fields: {<f 1 :d 1 >,<f 2 :d 2 >,,<f n :d n > d 1 Dom 1,d 2 Dom 2,,d n Dom n } <Sid:5,000>,<name:Dave>,<login:dave@cse.psu.edu>, <age:23>,<gpa:3.45> is a tuple. 53

Kinds of relations: Base relation is a relation that is sufficiently important (carry important information) that the database designer has decided to made it a part of the database. View is a derived relation, it does not have any separate, distinguishable stored data of its own (logical relation). Snapshot like a view is a derived relation. However, unlike view, a snapshot is real and not virtual it is defined by its definition and its own stored data. 54

Query result is an output relation resulting from some specified query. Intermediate result is a relation that is the result of a relational expression nested within larger expression. Temporary relation is a base, view, or snapshot created during the course of operations and destroyed automatically at some appropriate time by the system. 55

Creating and Modifying Relations Using SQL: CREATE TABLE is a statement that allows one to create a new table (relation); CREATE TABLE Student ( Sid name login age CHAR(20), CHAR(30), CHAR(20), INTEGER, gpa REAL ) 56

INSERT command is used to insert a single tuple into a created relation (table): Relation name Relation Schema INSERT INTO Student (Sid, name, login, age, gpa) VALUES (53688, Smith, smith@cs, 18, 3.2) 57

DELETE command eliminates a tuple from a relation: DELETE FROM WHERE Relation name Students name = Smith Quantifier 58

UPDATE command allows one to modify the attribute value of a tuple: UPDATE Student SET age = age + 1, gpa = gpa +.1 WHERE Sid = 53680 Relation name Quantifier 59

Integrity Rules The database definition needs to be extended to include certain constraints (integrity rules reliable operations). In the real world, for example, weight cannot be negative. Integrity rules are needed so that it can prevent such impossible configuration of values from occurring. 60

Integrity Rules Most databases will be subject to a very large number of such integrity rules. Note that most of the integrity rules are specific in the sense that they are applied to one specific database. 61

Integrity Rules In this section, we will discuss about three general integrity rules that applies to every databases. Note, the integrity rules both general and specific are defined for base relations. 62

Integrity Rules Consider the following three relations. Note that Primary key is allowed to be composite. It is possible to have all the attributes of the relation to represent the primary key All key relation. It is also possible for a relation to have more than one unique identifiers Multiple candidate keys. In this case, one of the candidate keys will be chosen as the primary key and the rest will be chosen as alternative keys. 63

Integrity Rules S# Sname Status City S 1 Smith 20 London S 2 Jones 10 Paris S 3 Blake 30 Paris S 4 Clark 20 London S 5 Adams 30 Athens 64

Integrity Rules P# Pname Color Weight City P 1 Nut Red 12 London P 2 Bolt Green 17 Paris P 3 Screw Blue 17 Rome P 4 Screw Red 14 London P 5 Cam Blue 12 Paris P 6 Cog Red 19 London 65

Integrity Rules S# P# QTY S 1 P 1 300 S 1 P 2 200 S 1 P 3 400 S 1 P 4 200 S 1 P 5 100 S 1 P 6 100 S 2 P 1 300 S 2 P 2 400 S 3 P 2 200 S 4 P 2 200 S 4 P 4 300 S 4 P 5 400 66

Integrity Rules Candidate Key: Attribute K (possibly composite) of relation R is a candidate key for R, if and only if it satisfies the following two time-independent properties: Uniqueness: At any point in time, no two tuples of R have the same value for K. Minimality: If K is composite, then no components of K can be eliminated without destroying the uniqueness property. 67

Integrity Rules No component of the primary key of a base relation is allowed to be null (undefined). This may not be applied to alternative keys. Primary key allows associative addressing. 68

Integrity Rules Foreign Key is an attribute of a relation R 2 (possibly composite) whose values are required to match those of the primary key of another relation R 1. A foreign key is a mechanism to establish references between two relations. This brings an issue know as the referential integrity. Attributes S# and P#, for SP relation, are examples of foreign keys. 69

Integrity Rules Attribute FK (possibly composite) of base relation R 2 is a foreign key, if and only if it satisfies the following time-independent properties: Each value of FK is either wholly null or wholly non-null (all null or none null). There exists a base relation R 1 (target relation) with primary key PK such that each non-null value of FK is identical to the value of PK in some tuple of R 1. 70

Integrity Rules A foreign key value represents a reference to the tuple containing the matching primary key value. The relation that contains the foreign key is the referencing relation and the relation that contains the corresponding primary key is the target (referenced) relation. In our example: Referential diagram S SP P 71

Integrity Rules In the following example: DEPT ( DEPT#,, BUDGET, ) EMP ( EMP#,, DEPT#,, SALARY, ) DEPT# is the foreign key in EMP relation and it is not a part of the primary key of EMP relation. DEPT EMP A relation can be both a referenced relation and referencing relation. R 3 R 2 R 1 72

Integrity Rules Referential Integrity rule: The database must not contain any unmatched foreign key value. Foreign key rules: At each moment in time, the database should be in a valid state and any transaction should transfer the database from a valid state to another valid state. Therefore, for an operation that violates these constraints, either the operation should be denied, or the database automatically, should generate a compensating operation. 73

Questions Can a foreign key accept nulls? What should happen on an attempt to delete the target of a foreign key reference? What should happen on an attempt to update the primary key of the target of a foreign key reference? 74

Specifying Key Constraints in SQL Primary Key The key words UNIQUE, CONSTRAINT, and PRIMARY are used to define candidate keys. At most, one of these candidate keys can be declared as Primary Key. 75

Primary Key CREATE TABLE Students ( Sid name login age CHAR(20), CHAR(30), CHAR(20), INTEGER, gpa REAL, UNIQUE (name, age), CONSTRAINT Studentskey PRIMARY KEY (Sid) ) Studentskey is called the constraint name It will be returned if the constraint is violated. 76

Foreign Key Assume that in addition to the Students relation, we have: Enroll (Sid: string, Cid: string, grade: string) Here Sid of Enroll relation should also appear in the Sid field of some tuples in the Students relation. The Sid field of Enroll relation is called a foreign key. 77

Foreign Key The foreign key of the referencing relation (Enroll) must match the primary key of the referenced relation (Students), must have the same number of columns and compatible type, though the column names can be different. 78

Cid grade Sid Carnatic101 C 53831 Reggae203 B 53832 Topology112 A 53650 History105 B 53666 Referencing relation Sid name login age gpa 50000 Dave dave@cs 19 3.3 53666 Jones jones@cs 18 3.4 53688 Smith smith@ee 18 3.2 53650 Smith smith@math 19 3.8 53831 Maday maday@music 11 1.8 53832 Guldu guldu@music 12 2.0 Referenced relation 79

Foreign Key Key words FOREIGN KEY and REFERENCE are used to specify this constraint: CREATE TABLE Enroll ( Sid Cid grade CHAR(20), CHAR(20), CHAR(10), PRIMARY KEY (Sid, Cid), FOREIGN KEY (Sid) REFERENCES Students) Referenced relation 80

General Constraints To preserve data integrity, we may need to define domain specific integrity rules. These rules can be enforced in the form of table constraints or assertions. Table constraints are defined for each relation and are enforced whenever contents of the relation is modified (insert, delete, update). Assertions involve several relations and are enforced whenever any of these relations is modified. 81