Database technology
Lecture 2: Relational databases and SQL
Jose M. Peña
jose.m.pena@liu.se

Database design process

Relational model concepts

Relation schema:
    EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)

The columns are the attributes and the rows are the tuples.

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  ...      M  38000   888665555  5
Joyce   A  English  453453453  1972-07-31  ...      F  25000   888665555  5
Ahmad   V  Jabbar   987987987  1969-03-29  ...      M  25000   888665555  4
James   E  Borg     888665555  1937-11-10  ...      M  55000   null       1

Relation: Set of tuples, i.e. no duplicates are allowed.
Database: Collection of relations.

Relational model concepts

Domain: the set of values that an attribute may take, for example:
    String shorter than 30 chars
    yyyy-mm-dd
    Character M or F
    Integer 400 < x < 8000

EMPLOYEE
FNAME   M     LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K     Narayan  666884444  1962-09-15  ...      M  38000   888665555  5
Joyce   Null  English  453453453  1972-07-31  ...      F  38000   888665555  5
Ahmad   V     Jabbar   987987987  1969-03-29  ...      M  25000   888665555  4
James   Null  Borg     888665555  1937-11-10  ...      M  55000   Null       1

NULL value: marks a missing value, as in the M column for Joyce and James and in SUPERSSN for James.

Relational model constraints

EMPLOYEE
FNAME   M     LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K     Narayan  666884444  1962-09-15  ...      M  38000   888665555  5
Joyce   Null  English  453453453  1972-07-31  ...      F  38000   888665555  5
Ahmad   V     Jabbar   987987987  1969-03-29  ...      M  25000   888665555  4
James   Null  Borg     888665555  1937-11-10  ...      M  55000   Null       1

Entity integrity constraint: the primary key (here SSN) never takes the value NULL.

Relational model constraints

Foreign keys: for example, DNO in EMPLOYEE refers to DNUMBER in DEPARTMENT.

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  ...      M  38000   888665555  5
Joyce   A  English  453453453  1972-07-31  ...      F  25000   888665555  5
Ahmad   V  Jabbar   987987987  1969-03-29  ...      M  25000   888665555  4
James   E  Borg     888665555  1937-11-10  ...      M  55000   Null       1

DEPARTMENT
DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        666884444  1988-05-22
Administration  4        987987987  1995-01-01
Headquarters    1        888665555  1981-06-19

Referential integrity constraint: every non-NULL foreign key value must exist as a primary key value in the referenced relation.

Relational model constraints

(Atomic) domain (or NULL).
Key.
Entity integrity: A PK cannot take NULL values.
Referential integrity: A FK in a relation can only refer to the PK of another relation, the domains of the FK and the PK must coincide, and the FK takes either the NULL value or a value that exists for the PK.

SQL

Relational data model    SQL
relation                 table
attribute                column
tuple                    row

Used in many DBMSs.
Declarative (what data to get, not how).
DDL (Data Definition Language): CREATE, ALTER, DROP
Queries: SELECT
DML (Data Manipulation Language): INSERT, DELETE, UPDATE
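
The entity and referential integrity constraints listed above can be observed in any SQL DBMS. The following is a minimal sketch using Python's built-in sqlite3 module; the cut-down EMPLOYEE and DEPARTMENT tables and the helper try_insert are assumptions made for illustration, and SSN is stored as text with an explicit NOT NULL because SQLite does not reject NULL primary keys on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DNUMBER INTEGER PRIMARY KEY,
    DNAME   VARCHAR(30)
);
CREATE TABLE EMPLOYEE (
    SSN   CHAR(9) NOT NULL PRIMARY KEY,   -- entity integrity
    LNAME VARCHAR(30),
    DNO   INTEGER REFERENCES DEPARTMENT(DNUMBER)  -- referential integrity
);
INSERT INTO DEPARTMENT VALUES (5, 'Research');
INSERT INTO EMPLOYEE VALUES ('666884444', 'Narayan', 5);
""")

def try_insert(params):
    """Hypothetical helper: run one INSERT and report whether it was accepted."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

null_pk = try_insert((None, "Smith", 5))            # PK must not be NULL
bad_fk = try_insert(("123456789", "Smith", 9))      # department 9 does not exist
null_fk = try_insert(("123456789", "Smith", None))  # a FK may be NULL

print(null_pk, bad_fk, null_fk, sep="\n")
```

Only the third insert should go through: a foreign key may be NULL, but a primary key may not, and a non-NULL foreign key must match an existing primary key value.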

COMPANY schema

EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
WORKS_ON (ESSN, PNO, HOURS)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Create tables

    CREATE TABLE <tablename> (
        <colname> <datatype> [<constraint>],
        ...,
        [<constraint>],
        ...
    );

Data types: integer, decimal, number, varchar, char, etc.
Constraints: not null, primary key, foreign key, unique, etc.

Create tables

    CREATE TABLE WORKS_ON (
        ESSN integer,
        PNO integer,
        HOURS decimal(3,1),
        constraint pk_workson primary key (ESSN, PNO),
        constraint fk_works_emp FOREIGN KEY (ESSN) references EMPLOYEE(SSN),
        constraint fk_works_proj FOREIGN KEY (PNO) references PROJECT(PNUMBER)
    );

Modify tables

Change the definition of a table: Add, delete and modify columns and constraints.

    ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);
    ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
    ALTER TABLE WORKS_ON DROP FOREIGN KEY fk_works_emp;
    ALTER TABLE WORKS_ON ADD CONSTRAINT fk_works_emp
        FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN);

Delete a table and its definition:

    DROP TABLE EMPLOYEE;
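
The CREATE, ALTER and DROP statements above can be tried out as follows. This is a sketch in Python's sqlite3 module, not a definitive setup: the reduced EMPLOYEE and PROJECT tables and the project name 'ProductX' are assumptions, and only the ALTER TABLE ... ADD form is used because the other ALTER variants shown on the slide differ between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30));
CREATE TABLE PROJECT  (PNUMBER INTEGER PRIMARY KEY, PNAME VARCHAR(30));
CREATE TABLE WORKS_ON (
    ESSN  INTEGER,
    PNO   INTEGER,
    HOURS DECIMAL(3,1),
    CONSTRAINT pk_workson    PRIMARY KEY (ESSN, PNO),
    CONSTRAINT fk_works_emp  FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN),
    CONSTRAINT fk_works_proj FOREIGN KEY (PNO)  REFERENCES PROJECT(PNUMBER)
);
INSERT INTO EMPLOYEE VALUES (123456789, 'Smith');
INSERT INTO PROJECT  VALUES (1, 'ProductX');   -- assumed project name
INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);
""")

# The composite primary key forbids a second row for the same (ESSN, PNO) pair.
try:
    conn.execute("INSERT INTO WORKS_ON VALUES (123456789, 1, 7.5)")
    duplicate_result = "accepted"
except sqlite3.IntegrityError:
    duplicate_result = "rejected"

# ALTER TABLE changes the definition of an existing table.
conn.execute("ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12)")
employee_columns = [row[1] for row in conn.execute("PRAGMA table_info(EMPLOYEE)")]

# DROP TABLE removes both the rows and the definition.
conn.execute("DROP TABLE WORKS_ON")
remaining_tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)

print(duplicate_result, employee_columns, remaining_tables)
```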

Query tables

    SELECT <attribute-list>
    FROM <table-list>
    WHERE <condition>;

Attribute list: A1, ..., Ar
    Attributes whose values are required.
Table list: R1, ..., Rk
    Relations to be queried.
Condition: Boolean expression
    It identifies the tuples that should be retrieved.
    It may include comparison operators (=, <>, >, >=, etc.) and/or logical operators (and, or, not).

Simple query

List the SSN for all employees.

    SELECT SSN
    FROM EMPLOYEE;

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555

Use of *

List all information about the employees of department 5.

    SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
    FROM EMPLOYEE
    WHERE DNO = 5;

or

    SELECT *
    FROM EMPLOYEE
    WHERE DNO = 5;

Comparison operators {=, <>, >, >=, etc.}

Simple query

List the last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'.

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

Logical operators {and, or, not}

LNAME   BDATE       ADDRESS
Zelaya  1968-07-19  3321 Castle, Spring, TX

Pattern matching

List the birth date and address for all employees whose last name contains the substring 'aya'.

    SELECT BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE LNAME LIKE '%aya%';

LIKE comparison operator:
    % replaces 0 or more characters
    _ replaces a single character

Result (the matching employees are Zelaya and Narayan):

BDATE       ADDRESS
1968-07-19  3321 Castle, Spring, TX
1962-09-15  975 Fire Oak, Humble, TX

Tables as sets

List all salaries.

    SELECT SALARY
    FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000
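
The two LIKE wildcards from the pattern matching slide can be compared side by side. The sketch below uses Python's sqlite3 module with the eight last names that appear in the lecture; the helper matching is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME VARCHAR(30))")
names = ["Smith", "Wong", "Zelaya", "Wallace", "Narayan", "English", "Jabbar", "Borg"]
conn.executemany("INSERT INTO EMPLOYEE VALUES (?)", [(n,) for n in names])

def matching(pattern):
    """Hypothetical helper: last names matching a LIKE pattern, sorted."""
    rows = conn.execute(
        "SELECT LNAME FROM EMPLOYEE WHERE LNAME LIKE ? ORDER BY LNAME", (pattern,)
    )
    return [row[0] for row in rows]

contains_aya = matching("%aya%")  # 'aya' anywhere in the name
four_letters = matching("____")   # exactly four characters
second_is_o = matching("_o%")     # any one character, then 'o', then anything

print(contains_aya, four_letters, second_is_o)
```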

Tables as sets

SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table. This differs from the relational model, where a relation is a set of tuples.

Why?
    Removing duplicates is expensive.
    The user may want information about duplicates.
    Aggregation operators need the duplicates.

Example

List all salaries.

    SELECT SALARY
    FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

List all salaries without duplicates.

    SELECT DISTINCT SALARY
    FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
55000
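
Why duplicates matter for aggregation can be checked on the salary column above. A minimal sketch with Python's sqlite3 module, using the eight salaries from the slide; AVG(DISTINCT ...) is standard SQL that the slides do not show and is added here only for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SALARY INTEGER)")
salaries = [30000, 40000, 25000, 43000, 38000, 25000, 25000, 55000]
conn.executemany("INSERT INTO EMPLOYEE VALUES (?)", [(s,) for s in salaries])

# Bag semantics: all eight rows come back, 25000 three times.
all_salaries = [r[0] for r in conn.execute("SELECT SALARY FROM EMPLOYEE")]

# DISTINCT turns the result into a set.
distinct_salaries = [
    r[0] for r in conn.execute("SELECT DISTINCT SALARY FROM EMPLOYEE ORDER BY SALARY")
]

# The average depends on whether duplicates are kept.
avg_bag = conn.execute("SELECT AVG(SALARY) FROM EMPLOYEE").fetchone()[0]
avg_set = conn.execute("SELECT AVG(DISTINCT SALARY) FROM EMPLOYEE").fetchone()[0]

print(len(all_salaries), distinct_salaries, avg_bag, avg_set)
```

The average over the bag is the true average salary; removing duplicates first would answer a different question.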

Set operations

Queries can be combined by set operations: UNION, INTERSECT, EXCEPT
(MySQL versions before 8.0.31 only support UNION)

Retrieve the first names of all people in the database.

    SELECT FNAME FROM EMPLOYEE
    UNION
    SELECT DEPENDENT_NAME FROM DEPENDENT;

Which department managers have dependents? Show their SSN.

    SELECT MGRSSN FROM DEPARTMENT
    INTERSECT
    SELECT ESSN FROM DEPENDENT;

Duplicate tuples are removed.

Ambiguous names: Aliasing

What if the same attribute name is used in different relations?

No alias (ambiguous):

    SELECT NAME, NAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER;

Whole name:

    SELECT EMPLOYEE.NAME, DEPARTMENT.NAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE EMPLOYEE.DNO = DEPARTMENT.DNUMBER;

Alias:

    SELECT E.NAME, D.NAME
    FROM EMPLOYEE E, DEPARTMENT D
    WHERE E.DNO = D.DNUMBER;
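
The set operations above can be sketched with Python's sqlite3 module, which supports UNION, INTERSECT and EXCEPT. The MGRSSN values are the ones shown for DEPARTMENT later in the lecture; the DEPENDENT rows and the helper ssns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNAME VARCHAR(30), DNUMBER INTEGER, MGRSSN INTEGER);
CREATE TABLE DEPENDENT  (ESSN INTEGER, DEPENDENT_NAME VARCHAR(30));
INSERT INTO DEPARTMENT VALUES
    ('Research', 5, 333445555),
    ('Administration', 4, 987654321),
    ('Headquarters', 1, 888665555);
-- Hypothetical dependents; employee 333445555 has two of them.
INSERT INTO DEPENDENT VALUES
    (333445555, 'Alice'), (333445555, 'Theodore'),
    (987654321, 'Abner'), (123456789, 'Alice');
""")

def ssns(sql):
    """Hypothetical helper: first column of the query result, sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Managers who have dependents; 333445555 appears once despite two dependents.
with_dependents = ssns(
    "SELECT MGRSSN FROM DEPARTMENT INTERSECT SELECT ESSN FROM DEPENDENT")
# Managers without dependents.
without_dependents = ssns(
    "SELECT MGRSSN FROM DEPARTMENT EXCEPT SELECT ESSN FROM DEPENDENT")
# Everyone who is a manager or has a dependent, duplicates removed.
either = ssns(
    "SELECT MGRSSN FROM DEPARTMENT UNION SELECT ESSN FROM DEPENDENT")

print(with_dependents, without_dependents, either)
```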

Join: Cartesian product

List all employees and the names of their departments.

    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT;

EMPLOYEE
LNAME    DNO
Smith    5
Wong     5
Zelaya   4
Wallace  4
Narayan  5
English  5
Jabbar   4
Borg     1

DEPARTMENT
DNAME           DNUMBER
Research        5
Administration  4
Headquarters    1

Without a join condition the result is the Cartesian product, which pairs every employee with every department:

LNAME    DNAME
Smith    Research
Wong     Research
Zelaya   Research
Wallace  Research
Narayan  Research
English  Research
Jabbar   Research
Borg     Research
Smith    Administration
Wong     Administration
Zelaya   Administration
Wallace  Administration
Narayan  Administration
English  Administration
Jabbar   Administration
Borg     Administration
Smith    Headquarters
Wong     Headquarters
Zelaya   Headquarters
Wallace  Headquarters
Narayan  Headquarters
English  Headquarters
Jabbar   Headquarters
Borg     Headquarters

Join: Equijoin

List all employees and the names of their departments.

    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER;

DNO is a foreign key in EMPLOYEE and DNUMBER is the primary key in DEPARTMENT.
Thetajoin: the join condition uses any of {=, <>, >, >=, <=, !=}.
Equijoin: the join condition uses =.

The equijoin keeps the rows of the Cartesian product where DNO = DNUMBER (marked *):

LNAME    DNO  DNAME           DNUMBER
Smith    5    Research        5        *
Wong     5    Research        5        *
Zelaya   4    Research        5
Wallace  4    Research        5
Narayan  5    Research        5        *
English  5    Research        5        *
Jabbar   4    Research        5
Borg     1    Research        5
Smith    5    Administration  4
Wong     5    Administration  4
Zelaya   4    Administration  4        *
Wallace  4    Administration  4        *
Narayan  5    Administration  4
English  5    Administration  4
Jabbar   4    Administration  4        *
Borg     1    Administration  4
Smith    5    Headquarters    1
Wong     5    Headquarters    1
Zelaya   4    Headquarters    1
Wallace  4    Headquarters    1
Narayan  5    Headquarters    1
English  5    Headquarters    1
Jabbar   4    Headquarters    1
Borg     1    Headquarters    1        *
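
The difference between the Cartesian product and the equijoin above can be reproduced with the slide data. A minimal sketch using Python's sqlite3 module, with the tables reduced to the columns shown on the slides:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE   (LNAME VARCHAR(30), DNO INTEGER);
CREATE TABLE DEPARTMENT (DNAME VARCHAR(30), DNUMBER INTEGER PRIMARY KEY);
""")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", [
    ("Smith", 5), ("Wong", 5), ("Zelaya", 4), ("Wallace", 4),
    ("Narayan", 5), ("English", 5), ("Jabbar", 4), ("Borg", 1),
])
conn.executemany("INSERT INTO DEPARTMENT VALUES (?, ?)", [
    ("Research", 5), ("Administration", 4), ("Headquarters", 1),
])

# No join condition: every employee is paired with every department.
product = conn.execute("SELECT LNAME, DNAME FROM EMPLOYEE, DEPARTMENT").fetchall()

# Equijoin: keep only the pairs where the foreign key matches the primary key.
equijoin = conn.execute(
    "SELECT LNAME, DNAME FROM EMPLOYEE, DEPARTMENT "
    "WHERE DNO = DNUMBER ORDER BY LNAME"
).fetchall()

print(len(product), len(equijoin))
for row in equijoin:
    print(row)
```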

Join: Self-join

List the last name for all employees together with the last names of their bosses.

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace

Join: Inner join

List the last name for all employees together with the last names of their bosses.

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;

The same query written with INNER JOIN:

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;
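
That the two formulations of the self-join above return the same rows can be checked as follows. This is a sketch with Python's sqlite3 module; the pairing of last names with SSNs and the SUPERSSN values are assumptions chosen so that the query reproduces the Employee/Boss table on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30), SUPERSSN INTEGER)"
)
# Assumed SSN and SUPERSSN values, chosen to match the slide's result table.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", [
    (123456789, "Smith", 333445555),
    (333445555, "Wong", 888665555),
    (999887777, "Zelaya", 987654321),
    (987654321, "Wallace", 888665555),
    (666884444, "Narayan", 333445555),
    (453453453, "English", 333445555),
    (987987987, "Jabbar", 987654321),
    (888665555, "Borg", None),
])

# Join condition in the WHERE clause.
comma_join = conn.execute(
    "SELECT E.LNAME, S.LNAME FROM EMPLOYEE E, EMPLOYEE S "
    "WHERE E.SUPERSSN = S.SSN ORDER BY E.LNAME"
).fetchall()

# The same join written with INNER JOIN ... ON.
inner_join = conn.execute(
    "SELECT E.LNAME, S.LNAME FROM EMPLOYEE E INNER JOIN EMPLOYEE S "
    "ON E.SUPERSSN = S.SSN ORDER BY E.LNAME"
).fetchall()

print(comma_join == inner_join)
for employee, boss in inner_join:
    print(employee, boss)
```

Borg does not appear as an employee in either result, because his SUPERSSN is NULL and therefore matches no SSN.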

Join: Outer join

List the last name for all employees and, if available, show the last names of their bosses.

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E LEFT JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

LEFT JOIN, RIGHT JOIN, FULL JOIN

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace
Borg      NULL

Joins revisited

A                 B
A1    A2          B1    B2
100   A           100   W
null  B           200   X
300   C           null  Y
null  D           null  Z

Cartesian product

    SELECT * FROM A, B;

A2  A1    B1    B2
A   100   100   W
B   null  100   W
C   300   100   W
D   null  100   W
A   100   200   X
B   null  200   X
C   300   200   X
D   null  200   X
A   100   null  Y
B   null  null  Y
C   300   null  Y
D   null  null  Y
A   100   null  Z
B   null  null  Z
C   300   null  Z
D   null  null  Z

Equijoin, natural join, inner join

    SELECT * FROM A, B WHERE A1 = B1;

A2  A1   B1   B2
A   100  100  W

Thetajoin

    SELECT * FROM A, B WHERE A1 > B1;

A2  A1   B1   B2
C   300  100  W
C   300  200  X
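
The small tables A and B above are convenient for checking how NULL behaves in each kind of join. The sketch below uses Python's sqlite3 module; the right outer join is written as a left join with the tables swapped, which is an assumption made to stay compatible with SQLite versions that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (A1 INTEGER, A2 VARCHAR(1));
CREATE TABLE B (B1 INTEGER, B2 VARCHAR(1));
""")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(100, "A"), (None, "B"), (300, "C"), (None, "D")])
conn.executemany("INSERT INTO B VALUES (?, ?)",
                 [(100, "W"), (200, "X"), (None, "Y"), (None, "Z")])

def rows(sql):
    """Hypothetical helper: fetch all rows of a query."""
    return conn.execute(sql).fetchall()

product = rows("SELECT A2, A1, B1, B2 FROM A, B")
# NULL = NULL is not true, so the NULL rows never match each other.
equijoin = rows("SELECT A2, A1, B1, B2 FROM A, B WHERE A1 = B1")
thetajoin = rows("SELECT A2, A1, B1, B2 FROM A, B WHERE A1 > B1 ORDER BY B1")
# Left outer join: every row of A survives, padded with NULL when unmatched.
left_join = rows("SELECT A2, A1, B1, B2 FROM A LEFT JOIN B ON A1 = B1 ORDER BY A2")
# Right outer join of A and B, emulated by swapping the operands.
right_join = rows("SELECT A2, A1, B1, B2 FROM B LEFT JOIN A ON A1 = B1 ORDER BY B2")

print(len(product), equijoin, thetajoin)
print(left_join)
print(right_join)
```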

Outer joins revisited

A                 B
A1    A2          B1    B2
100   A           100   W
null  B           200   X
300   C           null  Y
null  D           null  Z

Right outer join

    SELECT * FROM A RIGHT JOIN B ON A1 = B1;

A2    A1    B1    B2
A     100   100   W
null  null  200   X
null  null  null  Y
null  null  null  Z

Left outer join

    SELECT * FROM A LEFT JOIN B ON A1 = B1;

A2  A1    B1    B2
A   100   100   W
C   300   null  null
B   null  null  null
D   null  null  null

Full outer join (union of right + left)

    SELECT * FROM A FULL JOIN B ON A1 = B1;

A2    A1    B1    B2
A     100   100   W
null  null  200   X
null  null  null  Y
null  null  null  Z
C     300   null  null
B     null  null  null
D     null  null  null

Subqueries

List all employees that do not have any project assignment with more than 10 hours.

A plain join does not answer this question, because it lists every employee with at least one assignment of at most 10 hours:

    SELECT LNAME
    FROM EMPLOYEE, WORKS_ON
    WHERE SSN = ESSN AND HOURS <= 10.0;

Use a subquery instead. A subquery can be compared with IN / NOT IN, or with {>, >=, <, <=, <>} combined with {ANY, SOME, ALL}:

    SELECT LNAME
    FROM EMPLOYEE
    WHERE SSN NOT IN (SELECT ESSN
                      FROM WORKS_ON
                      WHERE HOURS > 10.0);

Or with EXISTS:

    SELECT LNAME
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT *
                      FROM WORKS_ON
                      WHERE SSN = ESSN AND HOURS > 10.0);
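
The three formulations on the subquery slide can be compared on a small data set. The sketch below uses Python's sqlite3 module; the three employees and their WORKS_ON rows are made up so that one employee has both a long and a short assignment, one has only short assignments and one has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30));
CREATE TABLE WORKS_ON (ESSN INTEGER, PNO INTEGER, HOURS DECIMAL(3,1),
                       PRIMARY KEY (ESSN, PNO));
INSERT INTO EMPLOYEE VALUES
    (123456789, 'Smith'), (333445555, 'Wong'), (888665555, 'Borg');
-- Hypothetical assignments: Smith has one above and one below 10 hours,
-- Wong has only assignments of 10 hours, Borg has none.
INSERT INTO WORKS_ON VALUES
    (123456789, 1, 32.5), (123456789, 2, 7.5),
    (333445555, 2, 10.0), (333445555, 3, 10.0);
""")

def names(sql):
    """Hypothetical helper: sorted list of the last names returned by a query."""
    return sorted(row[0] for row in conn.execute(sql))

plain_join = names(
    "SELECT LNAME FROM EMPLOYEE, WORKS_ON WHERE SSN = ESSN AND HOURS <= 10.0")
not_in = names(
    "SELECT LNAME FROM EMPLOYEE WHERE SSN NOT IN "
    "(SELECT ESSN FROM WORKS_ON WHERE HOURS > 10.0)")
not_exists = names(
    "SELECT LNAME FROM EMPLOYEE WHERE NOT EXISTS "
    "(SELECT * FROM WORKS_ON WHERE SSN = ESSN AND HOURS > 10.0)")

print(plain_join)   # wrongly includes Smith, misses Borg, repeats Wong
print(not_in)
print(not_exists)
```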

Extended SELECT syntax

    SELECT <attribute-list and function-list>
    FROM <table-list>
    [ WHERE <condition> ]
    [ GROUP BY <grouping attribute-list> ]
    [ HAVING <group condition> ]
    [ ORDER BY <attribute-list> ];

Aggregate functions

Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT()
They appear only in SELECT and HAVING clauses.
NULL values are not considered in the computations.

List the total number of employees.

    SELECT COUNT(*)
    FROM EMPLOYEE;

Effect of NULL on AVG(): a NULL is skipped, whereas a 0 is counted.

        Column 1  Column 2
        50        50
        100       100
        Null      0
AVG()   75        50
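
The AVG() illustration above (a NULL is skipped, a 0 is counted) can be reproduced directly. A minimal sketch with Python's sqlite3 module; the table name T and the column names X and Y are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (X INTEGER, Y INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(50, 50), (100, 100), (None, 0)])

# AVG ignores the NULL in X but includes the 0 in Y.
avg_x, avg_y = conn.execute("SELECT AVG(X), AVG(Y) FROM T").fetchone()

# COUNT(*) counts rows; COUNT(X) counts only the non-NULL values of X.
count_rows, count_x = conn.execute("SELECT COUNT(*), COUNT(X) FROM T").fetchone()

print(avg_x, avg_y, count_rows, count_x)
```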

Grouping

Used to apply an aggregate function to subgroups of tuples in a relation.
GROUP BY: Grouping attributes.
HAVING: Condition that a group has to satisfy.

List, for each department with more than two employees, the department number, the number of employees and the average salary.

    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO
    HAVING COUNT(*) > 2;

Groups formed by GROUP BY (the group for DNO 1 has only one employee, so HAVING removes it from the result):

DNO  COUNT(*)  AVG(SALARY)
5    4         33250
4    3         31000
1    1         55000

Sort query results

Show the department names and their locations in alphabetical order.

    SELECT DNAME, DLOCATION
    FROM DEPARTMENT D, DEPT_LOCATIONS DL
    WHERE D.DNUMBER = DL.DNUMBER
    ORDER BY DNAME ASC, DLOCATION DESC;

DNAME           DLOCATION
Administration  Stafford
Headquarters    Houston
Research        Sugarland
Research        Houston
Research        Bellaire
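
The grouping query above can be run on the department numbers and salaries used throughout the lecture. A sketch with Python's sqlite3 module; an ORDER BY clause is added to the slide's query only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME VARCHAR(30), SALARY INTEGER, DNO INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", [
    ("Smith", 30000, 5), ("Wong", 40000, 5), ("Zelaya", 25000, 4),
    ("Wallace", 43000, 4), ("Narayan", 38000, 5), ("English", 25000, 5),
    ("Jabbar", 25000, 4), ("Borg", 55000, 1),
])

# All groups, before any HAVING condition is applied.
all_groups = conn.execute(
    "SELECT DNO, COUNT(*), AVG(SALARY) FROM EMPLOYEE "
    "GROUP BY DNO ORDER BY DNO DESC"
).fetchall()

# HAVING filters whole groups, here those with more than two employees.
large_groups = conn.execute(
    "SELECT DNO, COUNT(*), AVG(SALARY) FROM EMPLOYEE "
    "GROUP BY DNO HAVING COUNT(*) > 2 ORDER BY DNO DESC"
).fetchall()

print(all_groups)
print(large_groups)
```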

Null values

List all employees that do not have a boss.

    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE SUPERSSN IS NULL;

The conditions SUPERSSN = NULL and SUPERSSN <> NULL never return any tuples, because NULL is incomparable to any value, including another NULL.

Insert data

    INSERT INTO <table> (<attr>, ...) VALUES (<val>, ...);
    INSERT INTO <table> (<attr>, ...) <subquery>;

Store information about how many hours an employee works for the project '1' into WORKS_ON.

    INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);

Integrity constraint! Referential integrity constraint!
The inserted ESSN and PNO values must exist in EMPLOYEE and PROJECT.
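
The behaviour of NULL in comparisons described above is easy to verify. A minimal sketch with Python's sqlite3 module, using the SUPERSSN values from the EMPLOYEE table shown earlier in the lecture; the helper names is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME VARCHAR(30), SUPERSSN INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", [
    ("Narayan", 888665555), ("English", 888665555),
    ("Jabbar", 888665555), ("Borg", None),
])

def names(condition):
    """Hypothetical helper: last names satisfying a WHERE condition, sorted."""
    sql = f"SELECT LNAME FROM EMPLOYEE WHERE {condition} ORDER BY LNAME"
    return [row[0] for row in conn.execute(sql)]

equals_null = names("SUPERSSN = NULL")       # comparison with NULL is never true
not_equals_null = names("SUPERSSN <> NULL")  # ... and neither is its negation
is_null = names("SUPERSSN IS NULL")
is_not_null = names("SUPERSSN IS NOT NULL")

print(equals_null, not_equals_null, is_null, is_not_null)
```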

Update data

    UPDATE <table> SET <attr> = <val>, ... WHERE <condition>;
    UPDATE <table> SET (<attr>, ...) = (<subquery>) WHERE <condition>;

Give all employees in the Research department a 10% raise in salary.

    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER
                  FROM DEPARTMENT
                  WHERE DNAME = 'Research');

Integrity constraint! Referential integrity constraint!

Delete data

    DELETE FROM <table> WHERE <condition>;

Delete the employees having the last name 'Borg' from the EMPLOYEE table.

    DELETE FROM EMPLOYEE
    WHERE LNAME = 'Borg';

EMPLOYEE
FNAME   M  LNAME    SSN
Ramesh  K  Narayan  666884444
Joyce   A  English  453453453
Ahmad   V  Jabbar   987987987
James   E  Borg     888665555

DEPARTMENT
DNAME           DNUMBER  MGRSSN
Research        5        333445555
Administration  4        987654321
Headquarters    1        888665555

Foreign key: the MGRSSN of Headquarters refers to Borg's SSN. What should happen to it when Borg is deleted: ON DELETE SET NULL / SET DEFAULT / CASCADE?
Referential integrity constraint!
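
The update with a subquery and the effect of an ON DELETE action can be tried out as follows. This is a sketch with Python's sqlite3 module under stated assumptions: the tables are reduced to a few columns, the rows are the EMPLOYEE and DEPARTMENT rows from the constraint slides at the start of the lecture, and ON DELETE SET NULL is just one of the three options named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
CREATE TABLE EMPLOYEE (
    SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30), SALARY INTEGER, DNO INTEGER);
CREATE TABLE DEPARTMENT (
    DNAME VARCHAR(30), DNUMBER INTEGER PRIMARY KEY,
    MGRSSN INTEGER REFERENCES EMPLOYEE(SSN) ON DELETE SET NULL);
INSERT INTO EMPLOYEE VALUES
    (666884444, 'Narayan', 38000, 5), (453453453, 'English', 25000, 5),
    (987987987, 'Jabbar', 25000, 4),  (888665555, 'Borg', 55000, 1);
INSERT INTO DEPARTMENT VALUES
    ('Research', 5, 666884444), ('Administration', 4, 987987987),
    ('Headquarters', 1, 888665555);
""")

# 10% raise for everybody in the Research department.
conn.execute("""
    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT WHERE DNAME = 'Research')
""")
salaries = dict(conn.execute("SELECT LNAME, SALARY FROM EMPLOYEE"))

# Deleting Borg sets the MGRSSN that referred to him to NULL.
conn.execute("DELETE FROM EMPLOYEE WHERE LNAME = 'Borg'")
managers = dict(conn.execute("SELECT DNAME, MGRSSN FROM DEPARTMENT"))

print(salaries)
print(managers)
```

With CASCADE instead of SET NULL, the Headquarters row itself would be deleted together with Borg, which is why the choice of action deserves thought.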

Views

A virtual table derived from other (possibly virtual) tables, i.e. always up-to-date.

    CREATE VIEW dept_view AS
    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO;

Why?
    Simplify query commands.
    Provide data security.
    Enhance programming productivity.

Drawback: update problems.
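
That a view is always up-to-date can be seen by querying it before and after a change to the underlying table. A minimal sketch with Python's sqlite3 module; the column aliases N_EMP and AVG_SALARY and the three employee rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (LNAME VARCHAR(30), SALARY INTEGER, DNO INTEGER);
INSERT INTO EMPLOYEE VALUES ('Borg', 55000, 1), ('Jabbar', 25000, 4);
-- Same view as on the slide, with assumed aliases for the computed columns.
CREATE VIEW dept_view AS
    SELECT DNO, COUNT(*) AS N_EMP, AVG(SALARY) AS AVG_SALARY
    FROM EMPLOYEE
    GROUP BY DNO;
""")

before = conn.execute("SELECT * FROM dept_view ORDER BY DNO").fetchall()

# Change the base table; the view is not touched.
conn.execute("INSERT INTO EMPLOYEE VALUES ('Zelaya', 25000, 4)")

after = conn.execute("SELECT * FROM dept_view ORDER BY DNO").fetchall()

print(before)
print(after)
```

The view stores no rows of its own: each query against dept_view re-evaluates the underlying SELECT, so the new employee shows up in the count for department 4.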