Thanks to José and Vaida for most of the slides.

Relational databases and MySQL

Juha Takkinen
juhta@ida.liu.se

Outline

1. Introduction: Relational data model and SQL
2. Creating tables in MySQL
3. Simple queries and matching
4. Joins and aliasing
5. More about syntax and built-in functions
6. What is NULL?
7. Inserting, deleting and modifying data
8. Views

Overview

[Figure: the real world is described by a model and stored as a database. The DBMS receives a query, handles the processing of queries and updates and the access to stored data in the physical database, and returns an answer.]

Relational data model

Edgar F. Codd (1970), A Relational Model of Data for Large Shared Data Banks, Communications of the ACM, vol. 13, no. 6, pp. 377-387, June 1970. (IBM Research Laboratory, San Jose, California)

All data is organized into tables.
Other models: e.g. hierarchical, network, object or object-relational DBMS, XML (back to network model!)
Relational model concepts

Relation name: EMPLOYEE
Attributes: FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO
Domain of values, e.g.: string shorter than 30 chars; yyyy-mm-dd; character M or F; integer 400 < x < 8000
Tuples: the rows of the table

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15           M  38000   333445555  5
Joyce   A  English  453453453  1972-07-31           F  25000   333445555  5
Ahmad   V  Jabbar   987987987  1969-03-29           M  25000   987654321  4
James   E  Borg     888665555  1937-11-10           M  55000              1

Relation schema: EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)
Relation: a set of tuples, i.e. no duplicates
Database: a collection of relations
Database schema: a collection of relation schemas + integrity constraints

Relational database constraints

Unique
Candidate keys
Primary key + entity integrity constraint
Foreign keys + referential integrity constraint

DEPARTMENT
DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        333445555  1988-05-22
Administration  4        987654321  1995-01-01
Headquarters    1        888665555  1981-06-19

In EMPLOYEE, DNO is a foreign key that refers to DNUMBER in DEPARTMENT.

Integrity constraints

(Atomic) domain (or NULL).
Key.
NOT NULL.
Entity integrity: the PK is NOT NULL.
Referential integrity: FK in R refers to S if domain(FK(R)) = domain(PK(S)) and, for every tuple r in R, either r.FK = s.PK for some tuple s in S, or r.FK is NULL.
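The key and referential integrity constraints above can be tried out in a few lines. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL so that it runs without a server. The cut-down table definitions, the helper is_rejected, department number 7 and the last name 'Other' are illustrative assumptions, not part of the slides.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE employee (
           ssn   INTEGER PRIMARY KEY,
           lname TEXT NOT NULL,
           dno   INTEGER,
           FOREIGN KEY (dno) REFERENCES department(dnumber))"""
)
conn.execute("INSERT INTO department VALUES (5, 'Research')")
conn.execute("INSERT INTO employee VALUES (666884444, 'Narayan', 5)")  # FK matches a PK value


def is_rejected(sql):
    """Hypothetical helper: True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Referential integrity: department 7 does not exist.
fk_rejected = is_rejected("INSERT INTO employee VALUES (453453453, 'English', 7)")
# Key constraint: the SSN 666884444 is already in use.
pk_rejected = is_rejected("INSERT INTO employee VALUES (666884444, 'Other', 5)")
# A NULL foreign key refers to nothing and is accepted (illustrative row).
null_fk_rejected = is_rejected("INSERT INTO employee VALUES (987987987, 'Jabbar', NULL)")

print(fk_rejected, pk_rejected, null_fk_rejected)
```

Only the first and the last employee rows end up in the table; the two violating inserts leave it unchanged.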
SQL and MySQL

Structured Query Language (SQL)
  DDL and DML
  Declarative (what, not how)
  Originally the interface to System R (SEQUEL)
  Used in many database systems, e.g. Oracle
    http://www.forbes.com/lists/2010/10/billionaires-2010_lawrence-ellison_jkex.html
  Standard language for relational databases
MySQL: open source DBMS
Table, row, column = relation, tuple, attribute

COMPANY schema from book

EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
WORKS_ON (ESSN, PNO, HOURS)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Creating tables

    CREATE TABLE `<tablename>` (
        <colname> <datatype> [<constraint>],
        ...,
        [<constraint>, ...]
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Backticks around the table name: optional, but necessary if the table name is a reserved word.
Data types: integer, decimal(3,1), boolean, varchar(8), char(2), date, time, datetime, timestamp, enum('v1', 'v2', 'v3'), ...
Constraints: not null, primary key, foreign key, unique, check
Other: auto_increment

Creating tables

    CREATE TABLE works_on (
        essn  integer references emp(ssn) on delete cascade,
        pno   integer,
        hours decimal(3,1),
        constraint pk_workson primary key (essn, pno),
        constraint fk_workson foreign key (pno)
            references proj(pnumber) on delete cascade
    ) ENGINE=InnoDB;

Note: MySQL parses but ignores a REFERENCES clause written inline on a column (as for essn here). Declare it as a table-level FOREIGN KEY constraint (as for pno) if it has to be enforced.
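The effect of ON DELETE CASCADE in the works_on definition can be sketched as follows. This is a hedged example that again assumes SQLite through Python's sqlite3 module instead of MySQL, so ENGINE=InnoDB is left out. The foreign key on essn is written as a table-level constraint, the cut-down emp and proj tables, the project names and all works_on rows except (123456789, 1, 32.5) are illustrative assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; ENGINE=InnoDB is MySQL-specific and omitted.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys

conn.executescript(
    """
    CREATE TABLE emp  (ssn INTEGER PRIMARY KEY, lname VARCHAR(20));
    CREATE TABLE proj (pnumber INTEGER PRIMARY KEY, pname VARCHAR(20));
    CREATE TABLE works_on (
        essn  INTEGER,
        pno   INTEGER,
        hours DECIMAL(3,1),
        CONSTRAINT pk_workson PRIMARY KEY (essn, pno),
        CONSTRAINT fk_workson_emp FOREIGN KEY (essn)
            REFERENCES emp(ssn) ON DELETE CASCADE,
        CONSTRAINT fk_workson FOREIGN KEY (pno)
            REFERENCES proj(pnumber) ON DELETE CASCADE
    );
    -- Project names and most hours below are made up for the example.
    INSERT INTO emp VALUES (123456789, 'Smith'), (453453453, 'English');
    INSERT INTO proj VALUES (1, 'ProjectOne'), (2, 'ProjectTwo');
    INSERT INTO works_on VALUES (123456789, 1, 32.5),
                                (123456789, 2, 7.5),
                                (453453453, 1, 20.5);
    """
)

# Deleting project 1 also deletes every works_on row that refers to it.
conn.execute("DELETE FROM proj WHERE pnumber = 1")
remaining = conn.execute(
    "SELECT essn, pno, hours FROM works_on ORDER BY essn, pno"
).fetchall()
print(remaining)
```

Only the assignment to project 2 is left after the delete; without ON DELETE CASCADE the DELETE itself would be rejected.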
Creating tables

    CREATE TABLE dependent (
        essn           integer references emp(ssn) on delete cascade,
        dependent_name varchar(9) default 'NN',
        sex            varchar(1) check (sex in ('F', 'M')),
        bdate          date,
        relationship   varchar(8),
        constraint pk_dependent primary key (essn, dependent_name)
    ) ENGINE=InnoDB;

Creating tables

TEACHER (PNum, FName, LName, Office, Phone)

    CREATE TABLE TEACHER (
        PNum   CHAR(11),
        FName  VARCHAR(20) UNIQUE,
        LName  VARCHAR(20),
        Office CHAR(10) DEFAULT 'CommonRoom',
        Phone  CHAR(4) NOT NULL,
        CONSTRAINT pk_teacher PRIMARY KEY (PNum),
        CONSTRAINT fk_teacher FOREIGN KEY (Office) REFERENCES OFFICE(ID)
            ON DELETE CASCADE ON UPDATE SET NULL
    ) ENGINE=InnoDB;

A candidate key over several columns is declared as a table constraint:

    CONSTRAINT cand_key UNIQUE (FName, LName)

Modifying tables

Change the definition of a table: add, delete and modify columns and constraints

    ALTER TABLE EMPLOYEE ADD COLUMN JOB VARCHAR(12);
    ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
    ALTER TABLE `DEPTS-INFO` DROP PRIMARY KEY;
    ALTER TABLE `DEPTS-INFO` DROP FOREIGN KEY fk_department_employee;
    ALTER TABLE `DEPTS-INFO` ADD CONSTRAINT PK_Dept PRIMARY KEY (Dno);
    ALTER TABLE TEACHER ADD CONSTRAINT fk_teacher FOREIGN KEY (Office)
        REFERENCES OFFICE(ID) ON DELETE CASCADE ON UPDATE SET NULL;
    ALTER TABLE EMPLOYEE MODIFY COLUMN ADDRESS VARCHAR(10) DEFAULT 'None';
    ALTER TABLE ACCOUNT CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci;

Delete a table and its definition

    DROP TABLE EMPLOYEE;

Getting information about the tables

    SHOW TABLES;
    DESCRIBE TEACHER;
    SHOW CREATE TABLE TEACHER;

Querying tables

    SELECT <attribute-list>
    FROM <table-list>
    WHERE <condition>;

attribute-list: R1.A1, ..., Rk.Ar
  Attributes that are required
table-list: R1, ..., Rk
  Relations that are needed to process the query
condition: expression with logical operators (and, or, not) and equality, inequality and comparison operators (=, <>, >, >=, <, <=); identifies the tuples that should be retrieved
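The DEFAULT and CHECK clauses of the dependent table can be exercised with a short sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, drops the foreign key and ENGINE clause to stay self-contained, and uses made-up rows (the name 'Alice' and the invalid value 'X' are illustrative).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the FK to emp and ENGINE=InnoDB are omitted.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dependent (
           essn           INTEGER,
           dependent_name VARCHAR(9) DEFAULT 'NN',
           sex            VARCHAR(1) CHECK (sex IN ('F', 'M')),
           bdate          DATE,
           relationship   VARCHAR(8),
           CONSTRAINT pk_dependent PRIMARY KEY (essn, dependent_name))"""
)

# No dependent_name given: the DEFAULT value is stored.
conn.execute("INSERT INTO dependent (essn, sex) VALUES (123456789, 'F')")
default_name = conn.execute("SELECT dependent_name FROM dependent").fetchone()[0]

# 'X' is outside the allowed set, so the CHECK constraint rejects the row.
try:
    conn.execute(
        "INSERT INTO dependent (essn, dependent_name, sex) VALUES (123456789, 'Alice', 'X')"
    )
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

# A NULL value passes the CHECK: the condition evaluates to UNKNOWN, not to false.
conn.execute("INSERT INTO dependent (essn, dependent_name) VALUES (123456789, 'Alice')")
row_count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]

print(default_name, check_rejected, row_count)
```

The last insert is a reminder that CHECK alone does not forbid NULL; add NOT NULL if the column must always hold 'F' or 'M'.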
Simple query

List SSN for all employees

    SELECT SSN FROM EMPLOYEE;

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555

Use of *

List all information about the employees of department 5

    SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
    FROM EMPLOYEE
    WHERE DNO = 5;

or

    SELECT * FROM EMPLOYEE WHERE DNO = 5;

Simple query

List last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

LNAME   BDATE       ADDRESS
Zelaya  1968-07-19  3321 Castle, Spring, TX

Exact vs substring matching

List last name, birth date and address for all employees whose last name contains the substring 'aya'

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE LNAME LIKE '%aya%';

% replaces 0 or more characters
_ replaces a single character
Matching is case insensitive (with MySQL's default collation)!
Different from WHERE LNAME = '%aya%'; which only matches the literal string %aya%

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX
Narayan  1962-09-15  975 Fire Oak, Humble, TX
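The difference between exact matching and LIKE can be checked with a small script. This is a sketch under the assumption that SQLite (via Python's sqlite3 module) stands in for MySQL; SQLite's LIKE is case insensitive for ASCII letters by default, whereas in MySQL this depends on the collation. The cut-down employee table and the helper lnames are illustrative, and the address of English is not given on the slides, so it is stored as NULL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only a few columns and rows are used.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (fname TEXT, minit TEXT, lname TEXT, bdate TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        ("Alicia", "J", "Zelaya", "1968-07-19", "3321 Castle, Spring, TX"),
        ("Ramesh", "K", "Narayan", "1962-09-15", "975 Fire Oak, Humble, TX"),
        ("Joyce", "A", "English", "1972-07-31", None),
    ],
)


def lnames(condition):
    """Hypothetical helper: last names of the rows that satisfy the condition."""
    sql = "SELECT lname FROM employee WHERE " + condition + " ORDER BY lname"
    return [row[0] for row in conn.execute(sql)]


exact = lnames("fname = 'Alicia' AND minit = 'J' AND lname = 'Zelaya'")
substring = lnames("lname LIKE '%aya%'")   # % matches zero or more characters
literal = lnames("lname = '%aya%'")        # = compares with the literal string
one_char = lnames("lname LIKE '_elaya'")   # _ matches exactly one character
upper = lnames("lname LIKE '%AYA%'")       # same rows as '%aya%' in SQLite

print(exact, substring, literal, one_char, upper)
```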
Tables as multisets

SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table.
Difference wrt the relational model!

Why?
  Removing duplicates is expensive
  The user may want information about duplicates
  Aggregation operators

Exceptions (no duplicates in the result):
  The table has a key, i.e. PK or UNIQUE (which is not compulsory).
  Use SELECT DISTINCT instead of SELECT.
  UNION removes duplicates, but not if UNION ALL is used:

    SELECT attributes1 FROM tables1 WHERE condition1
    UNION
    SELECT attributes2 FROM tables2 WHERE condition2;

Example

List all salaries

    SELECT SALARY FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

List all salaries without duplicates.

    SELECT DISTINCT SALARY FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
55000

Join. Cartesian product

List all employees and their department

    SELECT LNAME, DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT;

Result: each tuple in EMPLOYEE is combined with each tuple in DEPARTMENT (excerpt)

LNAME    DNAME
Smith    Research
Zelaya   Research
Narayan  Research
English  Research
Jabbar   Research
Borg     Research
Smith    Administration
...
Borg     Administration
Smith    Headquarters
...
Borg     Headquarters
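Returning to the multiset rules above, the effect of DISTINCT, UNION and UNION ALL on duplicates can be counted directly. This is a minimal sketch that assumes SQLite (via Python's sqlite3 module) in place of MySQL; the SSN and SALARY values are the ones listed on the slides, paired in the order shown, and the two-column table is a simplification.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only SSN and SALARY are modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER)")

ssns = [123456789, 333445555, 999887777, 987654321,
        666884444, 453453453, 987987987, 888665555]
salaries = [30000, 40000, 25000, 43000, 38000, 25000, 25000, 55000]
conn.executemany("INSERT INTO employee VALUES (?, ?)", list(zip(ssns, salaries)))


def row_count(sql):
    """Hypothetical helper: number of rows a query returns."""
    return len(conn.execute(sql).fetchall())


plain = row_count("SELECT salary FROM employee")              # duplicates kept
distinct = row_count("SELECT DISTINCT salary FROM employee")  # duplicates removed
union = row_count(
    "SELECT salary FROM employee UNION SELECT salary FROM employee"
)                                                             # set union
union_all = row_count(
    "SELECT salary FROM employee UNION ALL SELECT salary FROM employee"
)                                                             # bag union

print(plain, distinct, union, union_all)
```

The salary 25000 occurs three times, which is why the plain query returns 8 rows and the DISTINCT query 6.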
Join. Equijoin

List all employees and their department

    SELECT LNAME, DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT ON DNO = DNUMBER;

DNO is a foreign key in EMPLOYEE and DNUMBER is the primary key in DEPARTMENT.
The equijoin keeps only those rows of the Cartesian product in which DNO = DNUMBER (excerpt):

LNAME    DNO  DNAME           DNUMBER
Smith    5    Research        5
Narayan  5    Research        5
English  5    Research        5
Zelaya   4    Administration  4
Jabbar   4    Administration  4
Borg     1    Headquarters    1

Rows of the Cartesian product that are dropped, e.g.:

LNAME    DNO  DNAME           DNUMBER
Zelaya   4    Research        5
Smith    5    Administration  4
Borg     1    Administration  4
Smith    5    Headquarters    1
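The relation between the Cartesian product and the equijoin can be sketched by counting rows. The example assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL and uses only the six employees whose last names appear on the slides, so the row counts differ from the full EMPLOYEE table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables are reduced to the join columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dname TEXT, dnumber INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE employee (lname TEXT, dno INTEGER)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("Research", 5), ("Administration", 4), ("Headquarters", 1)],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Smith", 5), ("Zelaya", 4), ("Narayan", 5),
     ("English", 5), ("Jabbar", 4), ("Borg", 1)],
)

# No join condition: every employee is paired with every department.
cartesian = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department"
).fetchall()

# Join condition DNO = DNUMBER: only the matching pairs remain.
equijoin = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department ON dno = dnumber"
).fetchall()

print(len(cartesian), len(equijoin))
print(sorted(equijoin))
```

With 6 employees and 3 departments the product has 18 rows, while the equijoin returns exactly one row per employee because every DNO matches exactly one DNUMBER.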
Ambiguous names. Aliasing

Why?
  The same attribute name is used in different relations
  To increase readability (long relation names)

No alias:

    SELECT LNAME, DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT ON DNO = DNUMBER;

Whole name:

    SELECT EMPLOYEE.LNAME, DEPARTMENT.DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT ON EMPLOYEE.DNO = DEPARTMENT.DNUMBER;

Alias:

    SELECT E.LNAME, D.DNAME
    FROM EMPLOYEE E INNER JOIN DEPARTMENT D ON E.DNO = D.DNUMBER;

Join. Self-join

List last name for all employees together with last names of their bosses

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN;

The result has the columns Employee and Boss. Smith, Zelaya, Narayan, English and Jabbar appear as employees; Borg appears only as a boss.
Adding WHERE E.LNAME = 'Borg' to this query returns no rows.

Join. Outer join

Equijoin does not consider tuples having join attributes with NULL values, i.e. the employee Borg is not included in the answer. Use an outer join instead, to catch the NULLs!

The self-join is computed from the Cartesian product of EMPLOYEE with itself (excerpt):

    SELECT E.LNAME, E.SUPERSSN, S.LNAME, S.SSN
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S;

E.LNAME  E.SUPERSSN  S.LNAME  S.SSN
Smith    333445555   Smith    123456789
Zelaya   987654321   Smith    123456789
Narayan  333445555   Smith    123456789
English  333445555   Smith    123456789
Jabbar   987654321   Smith    123456789
Borg     NULL        Smith    123456789
...

Borg's SUPERSSN is NULL, so the condition E.SUPERSSN = S.SSN is never true for Borg.

Join. Outer join, cont'd

List last name for all employees and, if available, show last names of their bosses

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E LEFT OUTER JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN;

The result contains the rows of the inner self-join plus one extra row:

Employee  Boss
Borg      NULL

RIGHT OUTER JOIN is the mirror image: it keeps all tuples of the right-hand table.
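The difference between the inner self-join and the left outer join can be reproduced with a short script. It is a sketch that assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL. The last names for SSN 333445555 and 987654321 do not appear on the slides; 'Wong' and 'Wallace' are taken from the textbook's COMPANY example and should be read as an assumption.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; EMPLOYEE is reduced to three columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, ssn INTEGER PRIMARY KEY, superssn INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Smith", 123456789, 333445555),
        ("Wong", 333445555, 888665555),     # name assumed, not on the slides
        ("Zelaya", 999887777, 987654321),
        ("Wallace", 987654321, 888665555),  # name assumed, not on the slides
        ("Narayan", 666884444, 333445555),
        ("English", 453453453, 333445555),
        ("Jabbar", 987987987, 987654321),
        ("Borg", 888665555, None),          # no boss: SUPERSSN is NULL
    ],
)

inner = conn.execute(
    """SELECT e.lname AS employee, s.lname AS boss
       FROM employee e INNER JOIN employee s ON e.superssn = s.ssn"""
).fetchall()

left = conn.execute(
    """SELECT e.lname AS employee, s.lname AS boss
       FROM employee e LEFT OUTER JOIN employee s ON e.superssn = s.ssn"""
).fetchall()

print(len(inner), len(left))
print([row for row in left if row not in inner])
```

The only row that the outer join adds is the one for Borg, with NULL (None in Python) in the Boss column.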
Joins revisited

A                B
A1    A2         B1    B2
100   A          100   W
NULL  B          200   X
300   C          NULL  Y
NULL  D          NULL  Z

Cartesian product

    SELECT * FROM a INNER JOIN b;

A1    A2  B1    B2
100   A   100   W
NULL  B   100   W
300   C   100   W
NULL  D   100   W
100   A   200   X
NULL  B   200   X
300   C   200   X
NULL  D   200   X
100   A   NULL  Y
NULL  B   NULL  Y
300   C   NULL  Y
NULL  D   NULL  Y
100   A   NULL  Z
NULL  B   NULL  Z
300   C   NULL  Z
NULL  D   NULL  Z

Equijoin, natural join, inner join

    SELECT * FROM a INNER JOIN b ON a1 = b1;

A1    A2  B1    B2
100   A   100   W

Thetajoin

    SELECT * FROM a INNER JOIN b ON a1 > b1;

A1    A2  B1    B2
300   C   100   W
300   C   200   X

Outer joins revisited

Right outer join

    SELECT * FROM a RIGHT OUTER JOIN b ON a1 = b1;

A1    A2    B1    B2
100   A     100   W
NULL  NULL  200   X
NULL  NULL  NULL  Y
NULL  NULL  NULL  Z

Left outer join

    SELECT * FROM a LEFT OUTER JOIN b ON a1 = b1;

A1    A2  B1    B2
100   A   100   W
300   C   NULL  NULL
NULL  B   NULL  NULL
NULL  D   NULL  NULL

Subqueries

Which employees have a 10 hour (exact) project assignment?

The following query returns duplicates (why?):

    SELECT LNAME
    FROM EMPLOYEE INNER JOIN WORKS_ON ON SSN = ESSN
    WHERE HOURS = 10.0;

Without duplicates:

    SELECT LNAME
    FROM EMPLOYEE
    WHERE SSN IN (SELECT ESSN FROM WORKS_ON WHERE HOURS = 10.0);

or

    SELECT LNAME
    FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN AND HOURS = 10.0);

Also available: NOT EXISTS, NOT IN, and a comparison operator {=, >, <, >=, <=, <>} combined with {ANY, SOME, ALL}.
SOME is the same as ANY; IN is the same as = ANY.

SQL syntax

    SELECT <attribute-list and function-list>
    FROM <table-list>
    [ WHERE <condition> ]
    [ GROUP BY <grouping attribute-list> ]
    [ HAVING <group condition> ]
    [ ORDER BY <attribute-list> ];

A subquery used in the table list is given a name: (SELECT ...) [AS] name
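Why the join version of the subquery example returns duplicates can be seen with a tiny data set. This is a sketch assuming SQLite (via Python's sqlite3 module) in place of MySQL; the two 10-hour assignments for English are invented for the illustration, only the row (123456789, 1, 32.5) comes from the slides.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the 10-hour rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn INTEGER PRIMARY KEY, lname TEXT)")
conn.execute(
    "CREATE TABLE works_on (essn INTEGER, pno INTEGER, hours DECIMAL(3,1), "
    "PRIMARY KEY (essn, pno))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(123456789, "Smith"), (453453453, "English")],
)
conn.executemany(
    "INSERT INTO works_on VALUES (?, ?, ?)",
    [(123456789, 1, 32.5), (453453453, 1, 10.0), (453453453, 2, 10.0)],
)


def names(sql):
    """Hypothetical helper: first column of the query result as a list."""
    return [row[0] for row in conn.execute(sql)]


# Join: one result row per matching (employee, assignment) pair.
with_join = names(
    "SELECT lname FROM employee INNER JOIN works_on ON ssn = essn WHERE hours = 10.0"
)
# IN and EXISTS: each employee is tested once, so at most one row per employee.
with_in = names(
    "SELECT lname FROM employee "
    "WHERE ssn IN (SELECT essn FROM works_on WHERE hours = 10.0)"
)
with_exists = names(
    "SELECT lname FROM employee "
    "WHERE EXISTS (SELECT * FROM works_on WHERE ssn = essn AND hours = 10.0)"
)

print(with_join, with_in, with_exists)
```

An employee with two 10-hour assignments matches twice in the join, once per WORKS_ON row, but only once in the IN and EXISTS forms.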
Aggregate functions

Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT(), ...
The argument is [DISTINCT] attribute, or *.
May appear just in SELECT and HAVING clauses!

List the number of employees

    SELECT COUNT(*) FROM EMPLOYEE;

COUNT(*): NULLs are not ignored
COUNT(expression): NULLs are ignored
(This is stated incorrectly in your notes.)

Grouping

Used to apply an aggregate function to subgroups of tuples in a relation
GROUP BY: grouping attributes
HAVING: condition that a group has to satisfy
HAVING is normally used together with GROUP BY.
The SELECT list may contain only grouping attributes and aggregate functions.

List for each department the department number, the number of employees and the average salary.

    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO;

DNO  COUNT(*)  AVG(SALARY)
5    4         33250
4    3         31000
1    1         55000

Adding HAVING COUNT(*) > 2 keeps only the groups with more than two employees, i.e. departments 5 and 4:

    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO
    HAVING COUNT(*) > 2;

Order of query results

Select department names and their locations in alphabetical order.

    SELECT DNAME, DLOCATION
    FROM DEPARTMENT D, DEPT_LOCATIONS DL
    WHERE D.DNUMBER = DL.DNUMBER
    ORDER BY DNAME ASC, DLOCATION DESC;

DNAME           DLOCATION
Administration  Stafford
Headquarters    Houston
Research        Sugarland
Research        Houston
Research        Bellaire

NULL

NULL = unknown, unavailable, or not applicable.
Hence, each NULL is different from every other NULL.
Hence, three-valued logic for the AND, OR and NOT operators (T, F, UNKNOWN); only tuples for which the condition evaluates to T are selected.
Moreover, NULL cannot be tested with =:

    SELECT FName, LName FROM TEACHER WHERE Office = NULL;

Wrong! Each NULL is different, so the comparison is never true. Use IS NULL (or IS NOT NULL):

    SELECT FName, LName FROM TEACHER WHERE Office IS NULL;
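The counting, grouping and NULL rules above can be verified against the figures on the slides. The sketch assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL and a cut-down EMPLOYEE table; the SSN, SALARY, SUPERSSN and DNO values are the ones that appear on the slides.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only four EMPLOYEE columns are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER, "
    "superssn INTEGER, dno INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (123456789, 30000, 333445555, 5),
        (333445555, 40000, 888665555, 5),
        (999887777, 25000, 987654321, 4),
        (987654321, 43000, 888665555, 4),
        (666884444, 38000, 333445555, 5),
        (453453453, 25000, 333445555, 5),
        (987987987, 25000, 987654321, 4),
        (888665555, 55000, None, 1),  # SUPERSSN is NULL
    ],
)


def scalar(sql):
    """Hypothetical helper: the single value returned by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


count_star = scalar("SELECT COUNT(*) FROM employee")          # counts every row
count_super = scalar("SELECT COUNT(superssn) FROM employee")  # skips the NULL

groups = conn.execute(
    "SELECT dno, COUNT(*), AVG(salary) FROM employee GROUP BY dno ORDER BY dno DESC"
).fetchall()
big_groups = conn.execute(
    "SELECT dno, COUNT(*), AVG(salary) FROM employee "
    "GROUP BY dno HAVING COUNT(*) > 2 ORDER BY dno DESC"
).fetchall()

equals_null = scalar("SELECT COUNT(*) FROM employee WHERE superssn = NULL")  # never true
is_null = scalar("SELECT COUNT(*) FROM employee WHERE superssn IS NULL")

print(count_star, count_super, equals_null, is_null)
print(groups)
print(big_groups)
```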
Inserting new data

    INSERT INTO <table> (<attr>, ...) VALUES (<val>, ...);
    INSERT INTO <table> (<attr>, ...) <subquery>;

Store in WORKS_ON information about how many hours an employee works for project 1:

    INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);

Insert the result of a subquery:

    INSERT INTO TEACHER (PNum, FName, LName, Office, Phone, Dep)
    SELECT * FROM OLD_TEACHER WHERE Phone < 999;

Check: NULL? DEFAULT? Referential integrity actions! Integrity constraints!

Deleting stored data

    DELETE FROM <table> WHERE <condition>;

The condition may be a subquery.

Delete employees having the last name 'Borg' from the EMPLOYEE table

    DELETE FROM EMPLOYEE WHERE LNAME = 'Borg';

EMPLOYEE
FNAME   M  LNAME    SSN
Ramesh  K  Narayan  666884444
Joyce   A  English  453453453
Ahmad   V  Jabbar   987987987
James   E  Borg     888665555

DEPARTMENT
DNAME           DNUMBER  MGRSSN
Research        5        333445555
Administration  4        987654321
Headquarters    1        888665555

MGRSSN is a foreign key: Borg (888665555) is referenced from Headquarters.
Check: NULL? CASCADE? DEFAULT? Referential integrity actions! Integrity constraints!

Modifying stored data

    UPDATE <table> SET <attr> = <val>, ... WHERE <condition>;

Give all employees in the Research department a 10% raise in salary.

    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT WHERE DNAME = 'Research');

Check: NULL? CASCADE? DEFAULT? Referential integrity actions! Integrity constraints!

Views (virtual tables)

A virtual table derived from other (possibly virtual) tables

    CREATE VIEW dept_view AS
    SELECT DNUMBER, DNAME FROM DEPARTMENT;

    DROP VIEW dept_view;

Why?
  Simplify query commands
  Increase efficiency
  Always up-to-date
Updating of views is problematic
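The UPDATE with a subquery and the "always up-to-date" property of views can be sketched together. The example assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL and a cut-down EMPLOYEE table with four of the employees from the slides; the department (6, 'Sales') is a made-up row used only to show that the view follows its base table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables are reduced to a few columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE employee (lname TEXT, salary REAL, dno INTEGER)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(5, "Research"), (4, "Administration"), (1, "Headquarters")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Narayan", 38000, 5), ("English", 25000, 5), ("Jabbar", 25000, 4), ("Borg", 55000, 1)],
)

# 10% raise for everybody in the Research department.
conn.execute(
    "UPDATE employee SET salary = salary * 1.1 "
    "WHERE dno IN (SELECT dnumber FROM department WHERE dname = 'Research')"
)
salaries = dict(conn.execute("SELECT lname, salary FROM employee").fetchall())

# A view stores the query, not the data, so it always reflects the base table.
conn.execute("CREATE VIEW dept_view AS SELECT dnumber, dname FROM department")
view_before = conn.execute("SELECT dname FROM dept_view ORDER BY dnumber").fetchall()
conn.execute("INSERT INTO department VALUES (6, 'Sales')")  # hypothetical department
view_after = conn.execute("SELECT dname FROM dept_view ORDER BY dnumber").fetchall()
conn.execute("DROP VIEW dept_view")

print(salaries)
print(view_before, view_after)
```

The raised salaries are floating-point values, so they are best compared with a small tolerance rather than with exact equality.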