Chapter 4: Intermediate SQL
Database System Concepts, 6th Ed. See www.db-book.com for conditions on re-use
4.1 Join Expressions
Let's first review the joins from Ch. 3.
1. SELECT * FROM student, takes WHERE student.id = takes.id;
What's wrong with WHERE? Why do we need anything beyond it?
1. SELECT * FROM student, takes WHERE student.id = takes.id AND dept_name = 'Comp. Sci.';
Complicated queries are more readable if the join conditions are kept separate from the other WHERE conditions.
2. SELECT * FROM student NATURAL JOIN takes;
3. SELECT * FROM student JOIN takes USING (ID);
3. SELECT * FROM student JOIN takes ON student.id = takes.id;
New syntax in Ch. 4!
QUIZ: How can we make the ID appear only once in the result?
To learn more
More about the history and motivations behind the join options in SQL:
http://www.databasesoup.com/2013/08/fancy-sql-mondayon-vs-natural-join-vs.html (linked on our webpage)
QUIZ: Create a table of room numbers associated with all the section IDs that were ever taught in those rooms. Use JOIN ... ON.
Create a table of room numbers associated with all the section IDs that were ever taught in those rooms.
    SELECT C.room_number, S.sec_id
    FROM section AS S JOIN classroom AS C
         ON (C.room_number = S.room_number AND C.building = S.building);
Side note: The ON clause accepts any SQL predicate (<>, AND, OR, LIKE, BETWEEN, <=, etc.), e.g.
    SELECT C.room_number, S.sec_id
    FROM section AS S JOIN classroom AS C
         ON (C.room_number >= S.room_number AND C.building LIKE 'Haunted %');
but it is almost always used with = (or LIKE).
Problems when joining tables
[Figure: the course and prereq relations]
- Write a query that returns all courses and their prerequisites. What happens with CS-315?
- Write a query that returns all courses in prereq and their information. What happens with CS-437?
Conclusion: In either case, the tuples with unmatched attributes are missing!
Another reason to use JOIN instead of WHERE when joining tables: We are about to extend the concept of a JOIN!
4.1.2 Outer Joins
Outer joins are extensions of the join operation that avoid the loss of information.
After computing the join as usual (now called the inner join), they add the tuples from one or both tables that do not match any tuple in the other table. The attributes that have no value in such a tuple are filled with NULL.
course Left Outer Join prerequisite
    SELECT * FROM course NATURAL LEFT OUTER JOIN prereq;
course Right Outer Join prerequisite
    SELECT * FROM course NATURAL RIGHT OUTER JOIN prereq;
course Full Outer Join prerequisite
    SELECT * FROM course NATURAL FULL OUTER JOIN prereq;
QUIZ: Outer Joins
Write a query to display a list of all students (ID and name), along with the courses they have taken.
Hint:
    SELECT student.id, student.name FROM student NATURAL JOIN takes;
does not work for all cases (Why?)
Solution
Write a query to display a list of all students (ID and name), along with the courses they have taken.
    SELECT student.id, student.name, takes.course_id
    FROM student NATURAL LEFT OUTER JOIN takes;
Conclusion on Outer Joins
- Left Outer Join preserves tuples only in the left relation
- Right Outer Join preserves tuples only in the right relation
- Full Outer Join preserves tuples in both relations
The join type defines how tuples in each relation that do not match any tuple in the other relation (based on the join condition) are treated.
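The NULL padding can also be used to find the tuples that have no match at all. The following is only a sketch, assuming the textbook's relations course(course_id, title, dept_name, credits) and prereq(course_id, prereq_id), where prereq_id is never NULL in prereq itself:

```sql
-- Courses without any prerequisite: the left outer join keeps them,
-- and their prereq_id is NULL because no prereq tuple matched.
SELECT course_id, title
FROM course NATURAL LEFT OUTER JOIN prereq
WHERE prereq_id IS NULL;
```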
Practice exercise 4.2
Rewrite
    SELECT * FROM student NATURAL LEFT OUTER JOIN takes;
without using the OUTER JOIN clause.
Hint: An outer join is an inner join with some extra tuples added; use UNION.
Rewrite
    SELECT * FROM student NATURAL LEFT OUTER JOIN takes;
without using the OUTER JOIN clause. (Practice Exercise 4.2)
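One possible rewrite, following the UNION hint, is sketched below. It assumes the textbook schemas student(ID, name, dept_name, tot_cred) and takes(ID, course_id, sec_id, semester, year, grade); it is not necessarily the solution shown in class.

```sql
-- Inner-join part: students who have taken at least one course.
SELECT * FROM student NATURAL JOIN takes
UNION
-- Extra tuples: students with no matching takes tuple, padded with NULLs
-- for the five takes-only attributes (course_id, sec_id, semester, year, grade).
SELECT ID, name, dept_name, tot_cred,
       NULL, NULL, NULL, NULL, NULL
FROM student AS S
WHERE NOT EXISTS (SELECT 1 FROM takes AS T WHERE T.ID = S.ID);
```

The column order of the second SELECT has to match the natural join's output: the common attribute ID first, then the remaining student attributes, then the remaining takes attributes.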
4.2 Views
Remember the idea of abstraction from Ch. 1!
4.2 Views
In some cases, it is not desirable for all users to see the entire logical model (that is, all the actual relations stored in the database).
Consider a person who needs to know an instructor's name and department, but not the salary. This person should see a relation described by
    SELECT ID, name, dept_name FROM instructor;
4.2 Views
    SELECT ID, name, dept_name FROM instructor;
A view provides a mechanism to hide certain data from the view of certain users.
Any relation that is not part of the logical model, but is made visible to a user as a virtual relation, is called a view.
View Definition
    CREATE VIEW v AS <query expression>;
where <query expression> is any legal SQL query and v is the view name.
Example
- Parentheses are optional, but they increase readability!
- The view is listed in the catalog!
QUIZ: Create a view with all course sections offered by the Physics dept. in the Fall 2009 semester, with the building and room number of each section.
    CREATE VIEW phys_fall_2009 AS ( ? );
Extra-credit QUIZ
View nitty-gritty
- Once a view is defined, the view name can be used to refer to the virtual relation that the view generates.
- Defining a view is not the same as creating a new relation by evaluating the query expression.
- Rather, the view definition only causes the SQL query expression to be saved.
- The expression is later substituted into any query that uses the view.
- A view is permanently deleted from the catalog using DROP VIEW v;
    CREATE VIEW faculty AS (
        SELECT ID, name, dept_name FROM instructor );
We can now use the view as a table to write queries, e.g. find all instructors in the Biology department:
    SELECT name FROM faculty WHERE dept_name = 'Biology';
View nitty-gritty
Views can also be used to define other views:
    CREATE VIEW phys_fall_2009 AS ...
    CREATE VIEW phys_fall_2009_watson AS
        SELECT course_id, room_number
        FROM phys_fall_2009
        WHERE building = 'Watson';
How is this process implemented in SQL? Think about the C preprocessor.
    create view physics_fall_2009 as
        select course.course_id, sec_id, building, room_number
        from course, section
        where course.course_id = section.course_id
          and course.dept_name = 'Physics'
          and section.semester = 'Fall'
          and section.year = 2009;
    create view physics_fall_2009_watson as
        select course_id, room_number
        from physics_fall_2009
        where building = 'Watson';
This is called view expansion.
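To make the preprocessor analogy concrete, here is a sketch of what a query on physics_fall_2009_watson looks like once the inner view name is replaced by its stored definition. The subquery alias is only there because PostgreSQL has traditionally required one; the actual internal rewriting may differ.

```sql
-- What is effectively evaluated for
--   SELECT * FROM physics_fall_2009_watson;
-- after substituting the definition of physics_fall_2009.
SELECT course_id, room_number
FROM (SELECT course.course_id, sec_id, building, room_number
      FROM course, section
      WHERE course.course_id = section.course_id
        AND course.dept_name = 'Physics'
        AND section.semester = 'Fall'
        AND section.year = 2009) AS physics_fall_2009
WHERE building = 'Watson';
```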
View nitty-gritty
The attribute names in a view can be specified explicitly:
    CREATE VIEW dept_sal (dept_name, total_salary) AS (
        SELECT dept_name, SUM(salary)
        FROM instructor
        GROUP BY dept_name );
Read and take notes: 4.2.3 Materialized Views
SQL does not define a standard way of specifying that a view is materialized, but many DBMSs provide their own SQL extensions for this task.
In PostgreSQL: (Details in the lab)
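As a rough preview (the lab may do this differently), PostgreSQL 9.3 and later provide CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW. The view name dept_total_salary below is made up for illustration:

```sql
-- Store the query result physically instead of only saving the query.
CREATE MATERIALIZED VIEW dept_total_salary AS
    SELECT dept_name, SUM(salary) AS total_salary
    FROM instructor
    GROUP BY dept_name;

-- The stored result is not kept up to date automatically;
-- it has to be recomputed explicitly after instructor changes.
REFRESH MATERIALIZED VIEW dept_total_salary;
```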
4.2.4 Updating Views
    CREATE VIEW faculty AS (
        SELECT ID, name, dept_name FROM instructor );
Add a new tuple to this view:
    INSERT INTO faculty VALUES ('30765', 'Green', 'Music');
This insertion requires inserting the tuple ('30765', 'Green', 'Music', NULL) into the instructor relation.
[Figure: the instructor relation after the insertion] NULL here! (in the salary attribute of the new tuple)
Problem: Some updates cannot be translated uniquely!
    CREATE VIEW instructor_info AS
        SELECT ID, name, building
        FROM instructor NATURAL JOIN department;
    INSERT INTO instructor_info VALUES ('69987', 'White', 'Taylor');
- Which department, if there are multiple departments in Taylor?
- What happens if there is no department in Taylor?
Next problem: Some updates should not be allowed at all!
    create view history_instructors as
        select * from instructor where dept_name = 'History';
What happens if we insert into this view the tuple ('25566', 'Brown', 'Biology', 100000)?
- If we allow the insertion, the inserted tuple does not appear in the view!
- Should someone using this view even be allowed to insert instructors working in a different department?
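Standard SQL has a clause aimed at exactly this situation, WITH CHECK OPTION, which PostgreSQL supports from version 9.4 on. A minimal sketch, assuming instructor has the attributes (ID, name, dept_name, salary) in that order:

```sql
CREATE OR REPLACE VIEW history_instructors AS
    SELECT * FROM instructor
    WHERE dept_name = 'History'
    WITH CHECK OPTION;

-- Rejected with an error: the new row does not satisfy the view's
-- WHERE condition, so it would not be visible through the view.
INSERT INTO history_instructors VALUES ('25566', 'Brown', 'Biology', 100000);
```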
Conclusion on updating views
Most SQL implementations allow updates only on simple views, e.g.
- The from clause has only one database relation.
- The select clause contains only attribute names of the relation, and does not have any expressions, aggregates, or distinct specification.
- Any attribute not listed in the select clause can be set to null.
- The query does not have a group by or having clause.
Updating views in PostgreSQL
PostgreSQL has a proprietary extension to the CREATE VIEW syntax. (Details in the lab)
To do for next time:
- Read sections covered: 4.1, 4.2
- Solve end-of-chapter exercises 4.3, 4.5
EOL 1
QUIZ: Write a query to return all sections, along with the ID of the instructor teaching them, including the sections not yet assigned to an instructor.
QUIZ: Write a view to display only the sections taught in the Science building.
4.3 Transactions
Transaction = atomic unit of work = sequence of operations that are either fully executed or rolled back as if none of them occurred.
Why do we need atomicity? Example: transferring money from account A to account B:
1. Subtract amount from A
2. Add amount to B
What happens if the computer crashes between the two steps?
4.3 Transactions
Aside from Atomicity, transaction-processing DBMSs must also ensure:
- Consistency
- Isolation
- Durability
These are called the ACID properties (see Chs. 14, 15, 16 for details).
Transactions in PostgreSQL (not in text)
In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands, e.g.:
    BEGIN;
    UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice';
    UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
    COMMIT;
This example is based on the tutorial at https://www.postgresql.org/docs/current/static/tutorial-transactions.html
(Not in text)
PostgreSQL treats every SQL statement as being executed within a transaction. If we don't issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it.
A group of statements surrounded by BEGIN and COMMIT is called a transaction block.
(Not in text)
If, partway through the transaction, we decide we do not want to commit (perhaps we just noticed that Alice's balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our updates so far will be canceled, e.g.:
    BEGIN;
    UPDATE accounts SET balance = balance - 150.00 WHERE name = 'Alice';
    UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
    -- Oops, now realizing the amount 150 was wrong
    ROLLBACK;
This example is based on the tutorial at https://www.postgresql.org/docs/current/static/tutorial-transactions.html
QUIZ: Transactions in PostgreSQL
Write a PostgreSQL query that atomically increases the salaries of all instructors in the CS dept. by 6%, and those of all instructors in the Biology dept. by 5%.
Solution
Write a PostgreSQL query that atomically increases the salaries of all instructors in the CS dept. by 6%, and those of all instructors in the Biology dept. by 5%.
    BEGIN;
    UPDATE instructor SET salary = salary * 1.06 WHERE dept_name = 'Comp. Sci.';
    UPDATE instructor SET salary = salary * 1.05 WHERE dept_name = 'Biology';
    COMMIT;
4.4 Integrity Constraints
Integrity constraints guard against accidental damage to the DB by ensuring that changes do not result in a loss of data consistency, e.g.
- A checking account must have a balance greater than $10,000.00
- The salary of a bank employee must be at least $4.00 an hour
- A customer must have a (non-null) phone number
4.4 Integrity Constraints
Integrity Constraints on a Single Relation
- NOT NULL
- PRIMARY KEY
- UNIQUE
- CHECK (P), where P is a predicate
They can all be included in CREATE TABLE, or added later with ALTER TABLE.
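The four kinds of constraints can be sketched together as follows. The table instructor_demo, its email column and the constraint name email_has_at are made up for illustration; the salary bound is just an example predicate.

```sql
-- All four single-relation constraints declared at creation time.
CREATE TABLE instructor_demo (
    ID      VARCHAR(5)   PRIMARY KEY,
    name    VARCHAR(20)  NOT NULL,
    email   VARCHAR(50)  UNIQUE,
    salary  NUMERIC(8,2) CHECK (salary > 29000)
);

-- A constraint added later; it is rejected if existing rows violate it.
ALTER TABLE instructor_demo
    ADD CONSTRAINT email_has_at CHECK (email LIKE '%@%');
```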
UNIQUE clause
    UNIQUE (A1, A2, ..., Am)
States that the attributes A1, A2, ..., Am form a superkey. (Typo on p. 130 of the text, which says "candidate key"!)
Attributes declared UNIQUE are permitted to be null (in contrast to primary keys). If we want, we can add NOT NULL.
Note: We have encountered the keyword UNIQUE in Ch. 3 (subqueries), but there it was part of the DML, whereas here it is part of the DDL.
CHECK clause
    CHECK (P), where P is a predicate
Example: ensure that semester is one of Fall, Winter, Spring or Summer:
    CREATE TABLE section (
        course_id varchar(8),
        sec_id varchar(8),
        semester varchar(6),
        year numeric(4,0),
        building varchar(15),
        room_number varchar(7),
        time_slot_id varchar(4),
        PRIMARY KEY (course_id, sec_id, semester, year),
        CHECK (semester IN ('Fall', 'Winter', 'Spring', 'Summer')) );
Constraints on two relations: Referential Integrity
Ensures that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation.
Example: If Biology is a department name appearing in one (or more) of the tuples in the instructor relation, then there exists a (unique!) tuple in the department relation for Biology.
What if an update (INSERT, DELETE, UPDATE) leads to a violation of referential integrity?
The normal procedure is to reject the action that caused the violation (the transaction performing the update action is rolled back).
However, a foreign key clause can specify that instead of rejecting the action, the system must take steps to change the tuple in the referencing relation to restore the constraint.
    CREATE TABLE course (
        course_id CHAR(5) PRIMARY KEY,
        title VARCHAR(20),
        dept_name VARCHAR(20) REFERENCES department );
    CREATE TABLE course (
        ...
        dept_name VARCHAR(20) REFERENCES department
            ON DELETE CASCADE
            ON UPDATE CASCADE );
Alternative actions: SET NULL, SET DEFAULT
Exercise 4.9
What happens when a tuple is deleted? (any tuple!)
A: Due to cascading, the entire tree of employees under the deleted employee (employee_name) is deleted as well.
Integrity Constraint Violation During Transactions
    create table person (
        ID char(10),
        name char(40),
        mother char(10),
        father char(10),
        primary key (ID),
        foreign key (father) references person,
        foreign key (mother) references person );
How to insert a tuple without causing a constraint violation?
- Insert the father and mother of a person before inserting the person,
- OR set father and mother to null initially, and update them after inserting all persons (not possible if the father and mother attributes are declared not null),
- OR defer constraint checking (next slide).
Deferred Constraints in PostgreSQL
Upon creation, a constraint is given one of three characteristics:
- DEFERRABLE INITIALLY DEFERRED
- DEFERRABLE INITIALLY IMMEDIATE
- NOT DEFERRABLE
    create table person (
        ID char(10),
        name char(40),
        mother char(10),
        father char(10),
        primary key (ID),
        foreign key (father) references person DEFERRABLE INITIALLY DEFERRED,
        foreign key (mother) references person DEFERRABLE INITIALLY DEFERRED );
Source: https://www.postgresql.org/docs/9.1/static/sql-set-constraints.html
Deferred Constraints in PostgreSQL
Inside a transaction, the check will be deferred until right before the COMMIT, e.g. this transaction will succeed (mother and father hold the IDs of the parents):
    BEGIN;
    INSERT INTO person VALUES ('01', 'Kal-El', '03', '02');
    INSERT INTO person VALUES ('02', 'Jor-El', NULL, NULL);
    INSERT INTO person VALUES ('03', 'Lara', NULL, NULL);
    COMMIT;
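For a constraint declared DEFERRABLE INITIALLY IMMEDIATE, the check still runs after every statement unless the transaction asks otherwise with SET CONSTRAINTS. The sketch below assumes the two foreign keys of person were declared that way, and that mother and father store the parents' IDs:

```sql
BEGIN;
SET CONSTRAINTS ALL DEFERRED;   -- applies to the current transaction only
INSERT INTO person VALUES ('01', 'Kal-El', '03', '02');  -- parents not inserted yet
INSERT INTO person VALUES ('02', 'Jor-El', NULL, NULL);
INSERT INTO person VALUES ('03', 'Lara',   NULL, NULL);
COMMIT;                         -- the foreign keys are checked here
```

A NOT DEFERRABLE constraint is not affected by SET CONSTRAINTS, so with such a declaration the first INSERT would fail.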
Complex CHECK conditions
The SQL standard says that the predicate can be anything, including a subquery!
    CHECK (time_slot_id IN (SELECT time_slot_id FROM time_slot));
However, few DBMSs support subqueries in the CHECK clause! (PostgreSQL doesn't.)
CHECK is meant to handle constraints on a row's value in isolation.
Alternatives:
- Triggers and functions (Ch. 5)
- Assertions
Assertions
    CREATE ASSERTION <assertion-name> CHECK <predicate>;
[Figure: example assertion]
What integrity condition is enforced here?
A: Every tuple in student must have a total credit value equal to the sum of the credits of the courses the student has successfully completed.
QUIZ: Assertions
Write an assertion to ensure that an instructor doesn't teach two different sections in the same year, semester and time slot. Use the previous assertion as an example.
Solution
An instructor cannot teach two different sections in the same year, semester and time slot.
    CREATE ASSERTION non_ubiquity CHECK (
        UNIQUE ( SELECT T.ID, T.semester, T.year, S.time_slot_id
                 FROM teaches AS T NATURAL JOIN section AS S ) );
4.5 SQL Data Types and Schemas
Remember the basic data types (Sec. 3.2):
- char(n)
- varchar(n)
- int
- smallint
- numeric(p,d) (user-specified precision)
- real, double precision (machine-dependent precision)
- float(n) (user-specified precision)
More data types: date and time
- date: calendar date, yyyy-mm-dd
- time: hh:mm:ss
- timestamp: combination of date + time
  Example: 2013-02-21 9:10:59.45-6
  - Fractions of a second are optional
  - Time zone is optional
- interval: period of time
  - Subtracting a date/time/timestamp value from another gives an interval value
  - Interval values can be added to date/time/timestamp values
Source: http://www.postgresql.org/docs/9.1/static/functions-datetime.html
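A few expressions of this kind, taken from the operator table on the PostgreSQL page cited above, give the flavor. Note one PostgreSQL-specific deviation from the general rule: subtracting two date values yields an integer number of days rather than an interval.

```sql
SELECT timestamp '2001-09-29 03:00' - timestamp '2001-09-27 12:00';  -- interval '1 day 15:00:00'
SELECT time '05:00' - time '03:00';                                  -- interval '02:00:00'
SELECT date '2001-09-28' + interval '1 hour';                        -- timestamp '2001-09-28 01:00:00'
SELECT timestamp '2001-09-28 01:00' + interval '23 hours';           -- timestamp '2001-09-29 00:00:00'
SELECT date '2001-10-01' - date '2001-09-28';                        -- integer 3 (days), not an interval
```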
Default values
    create table student (
        ID varchar(5),
        name varchar(20) not null,
        dept_name varchar(20) default 'qwerty',
        tot_cred numeric(3,0) default 0,
        primary key (ID) );
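The defaults take effect when an INSERT leaves the corresponding attributes out. A small sketch, assuming the student table exactly as declared on this slide (in particular, without a foreign key on dept_name); the ID and name values are arbitrary:

```sql
-- dept_name and tot_cred are omitted, so their defaults are used.
INSERT INTO student (ID, name) VALUES ('12789', 'Newman');

-- Shows dept_name = 'qwerty' and tot_cred = 0 for the new row.
SELECT dept_name, tot_cred FROM student WHERE ID = '12789';
```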
Index Creation
    create table student (
        ID varchar(5),
        name varchar(20) not null,
        dept_name varchar(20),
        tot_cred numeric(3,0) default 0,
        primary key (ID) );
    create index studentid_index on student(ID);
Index = data structure used to speed up access to records with specified values for the index attributes, e.g.
    SELECT * FROM student WHERE ID = '12345';
More details on indices in Chapter 11.
Large-Object Types
Large objects (photos, videos, CAD files, etc.) are stored as a large object:
- BLOB: binary large object. The object is a large collection of uninterpreted binary data (i.e. its interpretation is left to an application outside of the DB system).
- CLOB: character large object. The object is a large collection of character data.
  - CLOB is implemented as text in PostgreSQL (up to 1 GB).
When a query returns a large object, a pointer is returned rather than the large object itself.
Read and take notes:
- 4.5.5 User-Defined Types
- 4.5.7 Schemas, Catalogs, and Environments
4.5.6 CREATE TABLE Extensions
- Applications often require creation of tables that have the same schema as an existing table.
- When writing a complex query, it is often useful to store the result of a query as a new table.
How is this different from a view?
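Both extensions can be sketched in PostgreSQL syntax as follows. The table names temp_instructor and music_instructors are only illustrative, and other DBMSs spell these statements somewhat differently.

```sql
-- Same schema as instructor, but no rows.
CREATE TABLE temp_instructor (LIKE instructor);

-- Store the result of a query as a new table, rows included.
CREATE TABLE music_instructors AS
    SELECT * FROM instructor WHERE dept_name = 'Music';
```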
4.6 Authorization
Forms of authorization on the data in the database:
- Read: allows reading, but not modification of data.
- Insert: allows insertion of new data, but not modification of existing data.
- Update: allows modification, but not deletion of data.
- Delete: allows deletion of data.
Forms of authorization on the schema of the database:
- Index: allows creation and deletion of indices.
- Resources: allows creation of new relations.
- Alteration: allows addition or deletion of attributes in a relation.
- Drop: allows deletion of relations.
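In SQL the data-level forms correspond to the privileges SELECT, INSERT, UPDATE and DELETE, which are handed out with GRANT and taken back with REVOKE. A minimal sketch, where amit and satoshi are hypothetical users (roles) that must already exist:

```sql
GRANT SELECT ON instructor TO amit;            -- read authorization
GRANT INSERT, UPDATE ON instructor TO amit;    -- insert and update authorization
REVOKE UPDATE ON instructor FROM amit;         -- take update authorization back

-- satoshi may read department and pass that privilege on to others.
GRANT SELECT ON department TO satoshi WITH GRANT OPTION;
```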
4.6 Authorization
Read over lightly the remainder of this section.
Homework for Ch. 4
Due Thursday, Feb. 23
- End-of-chapter exercises 1, 3, 12, 14, 16 (Hint: CHECK), 18
- Table with all SQL constructs from Ch. 4, each used in an example (OK if the example is from the text)
Schema for University database
Schema for Bank database (fig. 3.19)