The University of Edinburgh
CS2 Current Technologies
Lecture 3: SQL - Joins and Subqueries
Chris Walton (cdw@dcs.ed.ac.uk)
11 February 2002

Multiple Tables
Redundancy requires excess storage and causes insert/delete/update errors:

    STUDENTID  COURSE  COURSEID  LECTURER  GRADE
    10         DBS     100       CDW       A
    11         LSI     101       GDP       C
    12         CT      102       STG       B
    13         DBS     100       CDW       B
    14         CT      102       STG       F

The solution is to split the data into multiple tables:

    STUDENTID  COURSEID  GRADE
    10         100       A
    11         101       C
    12         102       B
    13         100       B
    14         102       F

    COURSEID  COURSE  LECTURER
    100       DBS     CDW
    101       LSI     GDP
    102       CT      STG
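
As a rough check that splitting the table loses no information, the sketch below loads the two smaller tables and joins them on COURSEID to rebuild the original rows. It is only an illustration under assumptions: SQLite (through Python's standard sqlite3 module) stands in for the lecture's database, and the table names ENROLMENTS and COURSES are hypothetical, since the slide leaves the tables unnamed.

```python
import sqlite3

# In-memory SQLite database; ENROLMENTS and COURSES are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ENROLMENTS (STUDENTID INTEGER, COURSEID INTEGER, GRADE TEXT);
    CREATE TABLE COURSES (COURSEID INTEGER, COURSE TEXT, LECTURER TEXT);
    INSERT INTO ENROLMENTS VALUES
        (10, 100, 'A'), (11, 101, 'C'), (12, 102, 'B'),
        (13, 100, 'B'), (14, 102, 'F');
    INSERT INTO COURSES VALUES
        (100, 'DBS', 'CDW'), (101, 'LSI', 'GDP'), (102, 'CT', 'STG');
""")

# Joining on the shared COURSEID column rebuilds the redundant table.
rebuilt = conn.execute("""
    SELECT STUDENTID, COURSE, E.COURSEID, LECTURER, GRADE
    FROM ENROLMENTS E, COURSES C
    WHERE E.COURSEID = C.COURSEID
    ORDER BY STUDENTID
""").fetchall()

for row in rebuilt:
    print(row)
```

Each course name and lecturer is now stored once, so changing the lecturer of a course means updating a single row.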

Querying Multiple Tables
Up to this point we have only considered SQL operations on a single table.
However, we clearly would like to perform queries involving multiple tables.
We could simply perform two separate queries, e.g. (where NNN stands for the department number returned by the first query):

    SELECT DEPTNO
    FROM DEPT
    WHERE LOC = 'DALLAS';

    SELECT ENAME
    FROM EMP
    WHERE DEPTNO = NNN;

This is not a very satisfactory solution - the correct approach is to use SQL joins.

Joins - Selecting Data from Multiple Tables
Example - Employees located in DALLAS:

    SELECT ENAME
    FROM EMP, DEPT
    WHERE EMP.DEPTNO = DEPT.DEPTNO
    AND LOC = 'DALLAS';

    ENAME
    SMITH
    ADAMS
    FORD
    SCOTT
    JONES

Example - Location of the employee named ALLEN:

    SELECT ENAME, LOC
    FROM EMP, DEPT
    WHERE EMP.DEPTNO = DEPT.DEPTNO
    AND ENAME = 'ALLEN';

    ENAME  LOC
    ALLEN  CHICAGO

Non-Equi-Joins and Outer Joins
Example - Find all the employees who earn more than JONES:

    SELECT X.ENAME, X.SAL
    FROM EMP X, EMP Y
    WHERE X.SAL > Y.SAL
    AND Y.ENAME = 'JONES';

    ENAME  SAL
    SCOTT  3000
    KING   5000
    FORD   3000

Example - Departments with no employees:

    SELECT DISTINCT DEPT.DEPTNO, DNAME, LOC
    FROM DEPT, EMP
    WHERE DEPT.DEPTNO = EMP.DEPTNO (+)
    AND EMPNO IS NULL;

    DEPTNO  DNAME       LOC
    40      OPERATIONS  BOSTON
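
The (+) marker is Oracle-specific syntax. The sketch below expresses the same "departments with no employees" query with the standard LEFT OUTER JOIN form instead. It rests on assumptions: SQLite (via Python's sqlite3 module) stands in for Oracle, and only a few EMP rows are loaded to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DEPTNO INTEGER, DNAME TEXT, LOC TEXT);
    CREATE TABLE EMP (EMPNO INTEGER, ENAME TEXT, DEPTNO INTEGER);
    INSERT INTO DEPT VALUES
        (10, 'ACCOUNTING', 'NEW YORK'), (20, 'RESEARCH', 'DALLAS'),
        (30, 'SALES', 'CHICAGO'), (40, 'OPERATIONS', 'BOSTON');
    -- A subset of the example EMP table (an assumption made for brevity).
    INSERT INTO EMP VALUES
        (7369, 'SMITH', 20), (7499, 'ALLEN', 30),
        (7782, 'CLARK', 10), (7839, 'KING', 10);
""")

# Every DEPT row is kept; where no EMP row matches, the EMP columns are NULL.
# Filtering on EMPNO IS NULL therefore keeps only the unmatched departments.
empty_departments = conn.execute("""
    SELECT DISTINCT DEPT.DEPTNO, DNAME, LOC
    FROM DEPT LEFT OUTER JOIN EMP ON DEPT.DEPTNO = EMP.DEPTNO
    WHERE EMPNO IS NULL
""").fetchall()

print(empty_departments)
```

In the Oracle form, the (+) is written next to the table whose rows may be missing (here EMP), which corresponds to the right-hand side of the LEFT OUTER JOIN.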

Joining a Table with Itself
Example - Employees whose salary exceeds their manager's salary:

    SELECT W.ENAME, W.SAL, M.ENAME AS "MNAME", M.SAL AS "MSAL"
    FROM EMP W, EMP M
    WHERE W.MGR = M.EMPNO
    AND W.SAL > M.SAL;

    ENAME  SAL   MNAME  MSAL
    SCOTT  3000  JONES  2975
    FORD   3000  JONES  2975

The EMP table is treated as two separate tables named W and M by using an alias (e.g. EMP W).
The result columns MNAME and MSAL are named using AS.

Sub-Queries
SQL allows us to nest queries (subqueries) within the WHERE clause.
Example - Which jobs are paid a higher-than-average salary?

    SELECT DISTINCT JOB
    FROM EMP
    WHERE SAL > (SELECT AVG(SAL)
                 FROM EMP);

    JOB
    ANALYST
    MANAGER
    PRESIDENT

Subqueries are processed before the main query.
The main query uses the results of the subquery for its own processing.
Subqueries can use all clauses that we have discussed so far.
Subqueries can themselves have other subqueries.

Subqueries with Multiple Columns
Where a subquery returns several columns, we put parentheses around the list of columns to specify matching conditions.
Example - What are the names, jobs and salaries of employees with the same job and salary as FORD?

    SELECT ENAME, JOB, SAL
    FROM EMP
    WHERE (JOB, SAL) IN (SELECT JOB, SAL
                         FROM EMP
                         WHERE ENAME = 'FORD');

    ENAME  JOB      SAL
    SCOTT  ANALYST  3000
    FORD   ANALYST  3000

Subqueries Returning Sets of Values
Use ANY or ALL to specify how the returned values are used in the outer WHERE clause.
Example - Display the highest paid employee (an overly complex way of doing so):

    SELECT JOB, ENAME
    FROM EMP
    WHERE SAL >= ALL (SELECT SAL
                      FROM EMP);

    JOB        ENAME
    PRESIDENT  KING

Example - Display employees whose salary is greater than that of every employee in department 30:

    SELECT ENAME, SAL
    FROM EMP
    WHERE SAL > ALL (SELECT SAL
                     FROM EMP
                     WHERE DEPTNO = 30);

    ENAME  SAL
    JONES  2975
    SCOTT  3000
    KING   5000
    FORD   3000

Subqueries with ANY and ALL
Example - Which employees earn more than ANY (i.e. at least one) employee in department 30?

    SELECT SAL, ENAME
    FROM EMP
    WHERE SAL > ANY (SELECT SAL
                     FROM EMP
                     WHERE DEPTNO = 30)
    ORDER BY SAL DESC;

    SAL   ENAME
    5000  KING
    3000  SCOTT
    3000  FORD
    2975  JONES
    2850  BLAKE
    2450  CLARK
    1600  ALLEN
    1500  TURNER
    1300  MILLER
    1250  WARD
    1250  MARTIN
    1100  ADAMS

IN is equivalent to = ANY, and NOT IN is equivalent to != ALL.
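
One way to read these operators: for a non-empty subquery result with no NULLs, SAL > ALL (...) behaves like "greater than the maximum" and SAL > ANY (...) like "greater than the minimum". The sketch below checks that reading against the example data. It is an illustration under assumptions: SQLite (via Python's sqlite3 module) stands in for Oracle, and because SQLite does not provide ANY/ALL, the MAX/MIN rewrites are used; the helper names() is hypothetical.

```python
import sqlite3

EMP_ROWS = [  # (ENAME, SAL, DEPTNO) taken from the example table EMP
    ("SMITH", 800, 20), ("ALLEN", 1600, 30), ("WARD", 1250, 30),
    ("JONES", 2975, 20), ("MARTIN", 1250, 30), ("BLAKE", 2850, 30),
    ("CLARK", 2450, 10), ("SCOTT", 3000, 20), ("KING", 5000, 10),
    ("TURNER", 1500, 30), ("ADAMS", 1100, 20), ("JAMES", 950, 30),
    ("FORD", 3000, 20), ("MILLER", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENAME TEXT, SAL INTEGER, DEPTNO INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", EMP_ROWS)


def names(sql):
    """Hypothetical helper: run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# Rewrite of: SAL > ALL (SELECT SAL FROM EMP WHERE DEPTNO = 30)
more_than_all = names("""
    SELECT ENAME FROM EMP
    WHERE SAL > (SELECT MAX(SAL) FROM EMP WHERE DEPTNO = 30)
    ORDER BY SAL DESC, ENAME
""")

# Rewrite of: SAL > ANY (SELECT SAL FROM EMP WHERE DEPTNO = 30)
more_than_any = names("""
    SELECT ENAME FROM EMP
    WHERE SAL > (SELECT MIN(SAL) FROM EMP WHERE DEPTNO = 30)
    ORDER BY SAL DESC, ENAME
""")

print(more_than_all)
print(more_than_any)
```

The rewrite is not a general replacement: with an empty subquery result, > ALL is true for every row, whereas the comparison against MAX yields NULL and selects nothing.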

Subqueries with EXISTS
If a subquery returns at least one row, then EXISTS evaluates to true.
Example - Display information about employees who have at least one other employee reporting to them:

    SELECT JOB, ENAME, EMPNO, DEPTNO
    FROM EMP X
    WHERE EXISTS (SELECT *
                  FROM EMP
                  WHERE X.EMPNO = MGR);

    JOB        ENAME  EMPNO  DEPTNO
    MANAGER    JONES  7566   20
    MANAGER    BLAKE  7698   30
    MANAGER    CLARK  7782   10
    PRESIDENT  KING   7839   10
    ANALYST    SCOTT  7788   20
    ANALYST    FORD   7902   20
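
The EXISTS query can be tried out directly. The sketch below runs it and, for comparison, an IN formulation that should select the same employees. Assumptions: SQLite (via Python's sqlite3 module) stands in for Oracle, and only the EMPNO, ENAME and MGR columns of the example EMP table are loaded.

```python
import sqlite3

EMP_ROWS = [  # (EMPNO, ENAME, MGR) from the example table EMP; KING has no manager
    (7369, "SMITH", 7902), (7499, "ALLEN", 7698), (7521, "WARD", 7698),
    (7566, "JONES", 7839), (7654, "MARTIN", 7698), (7698, "BLAKE", 7839),
    (7782, "CLARK", 7839), (7788, "SCOTT", 7566), (7839, "KING", None),
    (7844, "TURNER", 7698), (7876, "ADAMS", 7788), (7900, "JAMES", 7698),
    (7902, "FORD", 7566), (7934, "MILLER", 7782),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER, ENAME TEXT, MGR INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", EMP_ROWS)

# X.EMPNO refers to the outer candidate row; the unqualified MGR refers to
# the rows of the inner EMP table.
with_exists = [row[0] for row in conn.execute("""
    SELECT ENAME FROM EMP X
    WHERE EXISTS (SELECT * FROM EMP WHERE X.EMPNO = MGR)
    ORDER BY ENAME
""")]

# The same question asked with IN over the set of manager numbers.
with_in = [row[0] for row in conn.execute("""
    SELECT ENAME FROM EMP
    WHERE EMPNO IN (SELECT MGR FROM EMP)
    ORDER BY ENAME
""")]

print(with_exists)
print(with_in)
```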

Multiple Sub-Queries
Subqueries can be combined using UNION, INTERSECT, and MINUS.
Example - Display the names of all employees who have a JOB category such that there are employees in that category in both department 20 and department 30:

    SELECT JOB, ENAME, DEPTNO
    FROM EMP
    WHERE JOB IN (SELECT JOB
                  FROM EMP
                  WHERE DEPTNO = 20
                  INTERSECT
                  SELECT JOB
                  FROM EMP
                  WHERE DEPTNO = 30);

    JOB      ENAME   DEPTNO
    CLERK    SMITH   20
    CLERK    ADAMS   20
    CLERK    JAMES   30
    CLERK    MILLER  10
    MANAGER  JONES   20
    MANAGER  BLAKE   30
    MANAGER  CLARK   10
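
The set operation in this example can be checked step by step: first the jobs common to departments 20 and 30, then the employees holding one of those jobs. The sketch below assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle; SQLite accepts INTERSECT but spells Oracle's MINUS as EXCEPT.

```python
import sqlite3

EMP_ROWS = [  # (ENAME, JOB, DEPTNO) from the example table EMP
    ("SMITH", "CLERK", 20), ("ALLEN", "SALESMAN", 30), ("WARD", "SALESMAN", 30),
    ("JONES", "MANAGER", 20), ("MARTIN", "SALESMAN", 30), ("BLAKE", "MANAGER", 30),
    ("CLARK", "MANAGER", 10), ("SCOTT", "ANALYST", 20), ("KING", "PRESIDENT", 10),
    ("TURNER", "SALESMAN", 30), ("ADAMS", "CLERK", 20), ("JAMES", "CLERK", 30),
    ("FORD", "ANALYST", 20), ("MILLER", "CLERK", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENAME TEXT, JOB TEXT, DEPTNO INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", EMP_ROWS)

# Step 1: jobs that occur in both department 20 and department 30.
common_jobs = sorted(row[0] for row in conn.execute("""
    SELECT JOB FROM EMP WHERE DEPTNO = 20
    INTERSECT
    SELECT JOB FROM EMP WHERE DEPTNO = 30
"""))

# Step 2: the full query, using the intersection as the IN list.
employees = conn.execute("""
    SELECT JOB, ENAME, DEPTNO FROM EMP
    WHERE JOB IN (SELECT JOB FROM EMP WHERE DEPTNO = 20
                  INTERSECT
                  SELECT JOB FROM EMP WHERE DEPTNO = 30)
    ORDER BY JOB, ENAME
""").fetchall()

print(common_jobs)
for row in employees:
    print(row)
```

Note that employees from department 10 also appear: the intersection restricts the job categories, not the departments of the employees listed.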

Correlated Subqueries
In the subqueries so far, each subquery is executed once, and the resulting value (or set of values) is used by the WHERE clause in the containing query.
It is also possible to compose a subquery that is executed repeatedly, once for each candidate row considered for selection by the outer query.
A correlated subquery refers to a column selected by the main query.
If the subquery selects from the same table as the main query, the main query must define an alias for the table name.
The subquery uses the alias to refer to the column's values in the main query's candidate rows.

Correlated Subquery Example
Example - Find all employees who have a salary that is more than the average salary of the employees in their own department, and list them in department order.

    SELECT DEPTNO, ENAME, SAL
    FROM EMP X
    WHERE SAL > (SELECT AVG(SAL)
                 FROM EMP
                 WHERE X.DEPTNO = DEPTNO)
    ORDER BY DEPTNO;

    DEPTNO  ENAME  SAL
    10      KING   5000
    20      JONES  2975
    20      SCOTT  3000
    20      FORD   3000
    30      ALLEN  1600
    30      BLAKE  2850

Explanation of Example
The alias X in the main query is used in the WHERE clause of the subquery to refer to a candidate row's value of DEPTNO.
The unqualified reference to DEPTNO on the right-hand side of the = sign refers to the DEPTNO field in each row of the subquery's table.
When the subquery is executed for each of the main query's candidate rows, SQL compares the value of DEPTNO in the candidate row to the value of DEPTNO in each row of the table.
The subquery selects each row for which the candidate row's department number equals the selected row's department number, returning the value of AVG(SAL) computed over the selected rows.
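
The evaluation described above can be imitated with an explicit loop. The sketch below runs the correlated query and then recomputes the answer by hand, visiting each candidate row and averaging the salaries of its department. Assumptions: SQLite (via Python's sqlite3 module) stands in for Oracle, and the loop is only a model of the semantics, not a description of how a database actually executes the query. ENAME is added to the ORDER BY so that the two results can be compared row by row.

```python
import sqlite3

EMP_ROWS = [  # (ENAME, SAL, DEPTNO) from the example table EMP
    ("SMITH", 800, 20), ("ALLEN", 1600, 30), ("WARD", 1250, 30),
    ("JONES", 2975, 20), ("MARTIN", 1250, 30), ("BLAKE", 2850, 30),
    ("CLARK", 2450, 10), ("SCOTT", 3000, 20), ("KING", 5000, 10),
    ("TURNER", 1500, 30), ("ADAMS", 1100, 20), ("JAMES", 950, 30),
    ("FORD", 3000, 20), ("MILLER", 1300, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENAME TEXT, SAL INTEGER, DEPTNO INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", EMP_ROWS)

# The correlated subquery: X.DEPTNO is the candidate row's department,
# the unqualified DEPTNO belongs to the inner EMP rows.
correlated = conn.execute("""
    SELECT DEPTNO, ENAME, SAL
    FROM EMP X
    WHERE SAL > (SELECT AVG(SAL) FROM EMP WHERE X.DEPTNO = DEPTNO)
    ORDER BY DEPTNO, ENAME
""").fetchall()

# The same logic by hand: one "subquery execution" per candidate row.
by_hand = []
for ename, sal, deptno in EMP_ROWS:
    dept_salaries = [s for (_, s, d) in EMP_ROWS if d == deptno]
    if sal > sum(dept_salaries) / len(dept_salaries):
        by_hand.append((deptno, ename, sal))
by_hand.sort(key=lambda row: (row[0], row[1]))

print(correlated)
print(by_hand == correlated)
```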

Example Table EMP

    EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   COMM  DEPTNO
    7369   SMITH   CLERK      7902  17-DEC-80  800         20
    7499   ALLEN   SALESMAN   7698  20-FEB-81  1600  300   30
    7521   WARD    SALESMAN   7698  22-FEB-81  1250  500   30
    7566   JONES   MANAGER    7839  02-APR-81  2975        20
    7654   MARTIN  SALESMAN   7698  28-SEP-81  1250  1400  30
    7698   BLAKE   MANAGER    7839  01-MAY-81  2850        30
    7782   CLARK   MANAGER    7839  09-JUN-81  2450        10
    7788   SCOTT   ANALYST    7566  27-JUN-90  3000        20
    7839   KING    PRESIDENT        17-NOV-81  5000        10
    7844   TURNER  SALESMAN   7698  08-SEP-81  1500  0     30
    7876   ADAMS   CLERK      7788  31-JUL-90  1100        20
    7900   JAMES   CLERK      7698  03-DEC-81  950         30
    7902   FORD    ANALYST    7566  03-DEC-81  3000        20
    7934   MILLER  CLERK      7782  23-JAN-82  1300        10

Example Table DEPT

    DEPTNO  DNAME       LOC
    10      ACCOUNTING  NEW YORK
    20      RESEARCH    DALLAS
    30      SALES       CHICAGO
    40      OPERATIONS  BOSTON