Introductory SQL
SQL Joins: Viewing Relationships
Ray Lockwood

Points:
- The relational model uses foreign keys to establish relationships between tables.
- SQL joins use tables and keys to make the relationships work.
- There are several types of SQL joins; inner and outer joins are the most important.
- It is good practice to qualify column names with dot notation.

Foreign Keys Link Tables Together

A relational database consists of tables, and each table has at least one key column. Instead of saying "key column" we just say "key". Key values are unique in a column: no two rows in the same table can have the same key value. Keys are the handle we use to identify and pick specific rows. Keys are also the mechanism that links tables together into relationships. The simplicity of using keys to link tables is a great strength of the relational model.

Tables are linked together by foreign keys. Below, the foreign key is in the family-member table. It starts as the primary key of Employee (EmployeeNum) and is placed into the family-member table to point back to the Employee. The foreign key is always on the many side (child side) of a one-to-many relationship. The foreign key is in the child table; it's the primary key of the parent table.

Here we have two tables (the family-member table has a composite key):

Employee
Primary key: EmployeeNum

EmployeeNum  LastName  FirstName  DeptNum
014          Smith     Bob        100
086          Jones     Bill       200
127          Doe       John       100
192          Knot      Jill       300

Family-member table
Composite primary key (two attributes): EmployeeNum, FamMbrNum
Foreign key (part of the primary key): EmployeeNum

EmployeeNum  FamMbrNum  LastName  FirstName  Relationship
014          1          Smith     Susan      Wife
014          2          Smith     Billy      Son
014          3          Smith     John       Son
127          1          Doe       Jane       Wife
127          2          Doe       Mary       Daughter
127          3          Doe       Lucy       Daughter
192          1          Knot      Sam        Husband
192          2          Knot      Ed         Son

Key review: The foreign key goes on the many side of a one-to-many relationship. A composite key is composed of more than one attribute. In this case, the foreign key is part of the primary key. This is not generally the case.
Side note: The family-member table is ID-dependent on Employee.
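As a rough sketch of how these keys behave in practice, the script below declares both tables in an in-memory SQLite database through Python's standard sqlite3 module. The table name FamilyMember, the column types, and the two rejected rows are assumptions made for illustration; they do not come from the handout, and other database systems may declare keys somewhat differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# "FamilyMember" is a hypothetical name for the family-member table.
conn.executescript("""
CREATE TABLE Employee (
    EmployeeNum TEXT PRIMARY KEY,          -- key of the parent table
    LastName    TEXT,
    FirstName   TEXT,
    DeptNum     INTEGER
);
CREATE TABLE FamilyMember (
    EmployeeNum  TEXT REFERENCES Employee (EmployeeNum),  -- foreign key
    FamMbrNum    INTEGER,
    LastName     TEXT,
    FirstName    TEXT,
    Relationship TEXT,
    PRIMARY KEY (EmployeeNum, FamMbrNum)   -- composite primary key
);
INSERT INTO Employee VALUES ('014', 'Smith', 'Bob', 100);
INSERT INTO FamilyMember VALUES ('014', 1, 'Smith', 'Susan', 'Wife');
""")

# A made-up child row whose EmployeeNum matches no parent row.
try:
    conn.execute(
        "INSERT INTO FamilyMember VALUES ('999', 1, 'Nobody', 'Nick', 'Son')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# A made-up child row that repeats an existing (EmployeeNum, FamMbrNum) pair.
try:
    conn.execute(
        "INSERT INTO FamilyMember VALUES ('014', 1, 'Smith', 'Billy', 'Son')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(orphan_rejected, duplicate_rejected)
```

Both inserts should be refused: the first because the foreign key points at no parent row, the second because the composite key value already exists.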
These tables are related in a one-to-many relationship, illustrated by this ER diagram:

[ER diagram: Employee is the strong entity on the one side (the parent, mandatory); the family-member table is the weak entity on the many side (the child, optional).]

Note: The family-member table is a weak entity in an identifying relationship.

To review, the key of Employee is EmployeeNum, and the family-member table contains EmployeeNum as a foreign key. The EmployeeNum key links Employee to the family-member table in a parent/child relationship. The foreign key is in the child table.

By the way, the family-member table is an ID-dependent entity and therefore weak: it doesn't have a key that's solely its own. It borrows EmployeeNum from Employee and tacks on FamMbrNum to form a unique key. The status of weak entity is a business rule of the company that hires employees: the company sees people as family members only if they're attached to one of its employees.

Review: An identifying relationship is one in which the key of the parent is part of the key of the child.

SQL Joins

A join is SQL-speak for a way to combine two tables by matching the foreign key in one table to the corresponding primary key in the other. This is done using a clause in a SQL statement; it reunites what was previously put apart into separate tables.

Inner and Outer Joins

Let's take a look at terminology. A join takes two or more tables as input and produces output in the form of a single table. It's useful to produce more than one type of output, so there's more than one type of join:

1. Inner Join: Output only the combinations of Employee rows and family-member rows that have PK/FK values that match each other.
2. Left Outer Join: Output all the Employee rows, together with those family-member rows, if any, that have matching PK/FK values.
3. Right Outer Join: Output all the family-member rows, together with those Employee rows, if any, that have matching PK/FK values.
4. Full Outer Join: Combines the results of the left and right outer joins. Outputs all the rows from both tables, melded by matching PK/FK values wherever a match exists.
A join is where the linking between tables is made visible. The inner join and the left outer join are the workhorse joins. We usually say "left join" instead of "left outer join". There's little difference between a left join and a right join: the names refer to the order in which you list the tables in the FROM clause.
Two Syntaxes for Joins

To match up the employees with their family members, we can use one of two SQL syntaxes for joins: the newer explicit join operator or the older implicit join operator. The explicit join operator was added in SQL-92.

Notice that SQL works with data a set at a time, unlike procedural languages that operate on data a row at a time. In keeping with the relational model (the math of sets), SQL deals with sets.

Explicit join operator (uses the JOIN keyword, and the ON keyword to specify the attributes to be matched):

    FROM Employee INNER JOIN [family-member table]
    ON Employee.EmployeeNum = [family-member table].EmployeeNum;

Implicit join operator (the tables are separated by a comma, and the attributes to be matched are tested in the WHERE clause):

    FROM Employee, [family-member table]
    WHERE Employee.EmployeeNum = [family-member table].EmployeeNum;

The explicit form of the syntax is preferred for readability:

1. It explicitly names the type of join being made.
2. It separates the keys used in joining from selection criteria in the WHERE clause.

However, when joining a lot of tables together, the implicit form is sometimes easier to use.

Inner Join

The inner join returns all the rows for which there is a pairing from both tables. Here, it returns all the employees who have family members (equivalently, all the family members who have employees).

Inner Join Syntax

Let's look at the syntax. The FROM clause names the tables that are to be joined together and the type of join. The inner join returns only the rows for which there is a match in the other table:

    FROM Employee INNER JOIN [family-member table]

The ON clause names the keys to use in the join. One is the primary key of Employee; the other is the foreign key that links the family-member table to Employee in a one-to-many relationship:

    ON Employee.EmployeeNum = [family-member table].EmployeeNum

SQL doesn't limit you to using only the attributes that you designate as keys.
You can use any attributes in the ON clause to link tables together. The columns being matched are usually the Primary Key and the Foreign Key.
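The explicit and implicit inner join forms described above can be compared with a short script. This is a minimal sketch that assumes an in-memory SQLite database driven from Python's standard sqlite3 module, with the sample rows from the two tables; FamilyMember is a hypothetical name for the family-member table, and the column types and the ORDER BY clause are additions made only to keep the output stable.

```python
import sqlite3

# In-memory database loaded with the handout's sample rows.
# "FamilyMember" is a hypothetical name for the family-member table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EmployeeNum TEXT PRIMARY KEY,
    LastName TEXT, FirstName TEXT, DeptNum INTEGER
);
CREATE TABLE FamilyMember (
    EmployeeNum TEXT REFERENCES Employee (EmployeeNum),
    FamMbrNum INTEGER,
    LastName TEXT, FirstName TEXT, Relationship TEXT,
    PRIMARY KEY (EmployeeNum, FamMbrNum)
);
INSERT INTO Employee VALUES
    ('014', 'Smith', 'Bob', 100), ('086', 'Jones', 'Bill', 200),
    ('127', 'Doe', 'John', 100), ('192', 'Knot', 'Jill', 300);
INSERT INTO FamilyMember VALUES
    ('014', 1, 'Smith', 'Susan', 'Wife'), ('014', 2, 'Smith', 'Billy', 'Son'),
    ('014', 3, 'Smith', 'John', 'Son'), ('127', 1, 'Doe', 'Jane', 'Wife'),
    ('127', 2, 'Doe', 'Mary', 'Daughter'), ('127', 3, 'Doe', 'Lucy', 'Daughter'),
    ('192', 1, 'Knot', 'Sam', 'Husband'), ('192', 2, 'Knot', 'Ed', 'Son');
""")

# Explicit form: the join type is named and the keys go in the ON clause.
explicit_rows = conn.execute("""
    SELECT *
    FROM Employee INNER JOIN FamilyMember
    ON Employee.EmployeeNum = FamilyMember.EmployeeNum
    ORDER BY Employee.EmployeeNum, FamilyMember.FamMbrNum
""").fetchall()

# Implicit form: comma-separated tables, keys tested in the WHERE clause.
implicit_rows = conn.execute("""
    SELECT *
    FROM Employee, FamilyMember
    WHERE Employee.EmployeeNum = FamilyMember.EmployeeNum
    ORDER BY Employee.EmployeeNum, FamilyMember.FamMbrNum
""").fetchall()

print(len(explicit_rows), explicit_rows == implicit_rows)
```

Both forms should return the same eight rows, one per family member, with no row for employee 086.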
The word INNER is optional. These two syntaxes are interchangeable and mean the same thing:

    INNER JOIN
    JOIN

Since the word INNER is optional, most users omit it from the query:

    FROM Employee JOIN [family-member table]

Inner Join Results

The query returns the following set of rows showing employees and their family members:

EmployeeNum  LastName  FirstName  DeptNum  EmployeeNum  FamMbrNum  LastName  FirstName  Relationship
014          Smith     Bob        100      014          1          Smith     Susan      Wife
014          Smith     Bob        100      014          2          Smith     Billy      Son
014          Smith     Bob        100      014          3          Smith     John       Son
127          Doe       John       100      127          1          Doe       Jane       Wife
127          Doe       John       100      127          2          Doe       Mary       Daughter
127          Doe       John       100      127          3          Doe       Lucy       Daughter
192          Knot      Jill       300      192          1          Knot      Sam        Husband
192          Knot      Jill       300      192          2          Knot      Ed         Son

Any row without a match in the other table is excluded. This is a list of employees who have family members. The inner join returns only the rows where a match is made: an intersection between the two tables.

No row for employee 086 (Jones, Bill, department 200) is returned by the inner join. That's because there's no matching row in the family-member table for this employee. The rows returned are limited to the set of employees that have family members.

Left Outer Join (Left Join)

The left outer join is the most useful of the joins.

What do we do if we want a list of all employees, regardless of whether they have family members? And if an employee has family members, how do we get those listed, too? To return the set of all employees regardless of whether they have family members, we use an outer join. The outer join returns rows even if they aren't matched in the other table.

The word OUTER is optional. These two syntaxes are interchangeable and mean the same thing:

    LEFT OUTER JOIN
    LEFT JOIN
Since the word OUTER is optional, most users omit it from the query:

    FROM Employee LEFT JOIN [family-member table]

This returns all of the rows of Employee, together with the matching rows of the family-member table.

Equivalently but less commonly, we can reverse the left and right join operands and change the join type to RIGHT JOIN, which produces the same rows:

    FROM [family-member table] RIGHT JOIN Employee

The left and right joins are the same thing, depending on the order in which you list the tables.

Left (Outer) Join Syntax

Let's look at the left join syntax and see what it means:

    FROM Employee LEFT JOIN [family-member table]

This says we want all the rows from the table on the left side of the join. The left table is the parent in a parent/child relationship. The left table is dominant, and we want to see all rows from this table. If there happen to be any matching rows in the table on the right, then we want those too, matched with the corresponding row from the left table. If there are any rows in the right table that don't match, they're ignored.

The left table is running the show in a left join. It's generally the one side (parent) in a one-to-many relationship. The rows in the left table are looking for their children in the right table.

Left Outer Join Results

This query returns all the rows in the table on the left side of the left join, and any matching rows from the table on the right side. Attributes that have no matching row are filled in with NULLs:

EmployeeNum  LastName  FirstName  DeptNum  EmployeeNum  FamMbrNum  LastName  FirstName  Relationship
014          Smith     Bob        100      014          1          Smith     Susan      Wife
014          Smith     Bob        100      014          2          Smith     Billy      Son
014          Smith     Bob        100      014          3          Smith     John       Son
086          Jones     Bill       200      NULL         NULL       NULL      NULL       NULL
127          Doe       John       100      127          1          Doe       Jane       Wife
127          Doe       John       100      127          2          Doe       Mary       Daughter
127          Doe       John       100      127          3          Doe       Lucy       Daughter
192          Knot      Jill       300      192          1          Knot      Sam        Husband
192          Knot      Jill       300      192          2          Knot      Ed         Son

Now we have all the employees, regardless of whether they have a family member.
LEFT JOIN is used more often than RIGHT JOIN because it places the parent table on the left and the child on the right. This left-to-right arrangement flows better for most people. Prefer a left join over a right join:

1. Put the parent table on the left.
2. Put the child table on the right.
3. Use a left join.

You'll get all the parents, paired with their children, if any. Attributes in missing child rows are filled with NULLs. Because the left and right joins vary only in the order in which you list the tables, you might never use a right join.
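The left join behaviour described above can be sketched with the same sample data. As before, this assumes an in-memory SQLite database driven from Python's standard sqlite3 module; FamilyMember is a hypothetical name for the family-member table, and the column types and ORDER BY clause are illustrative additions. Python shows SQL NULL as None.

```python
import sqlite3

# In-memory database loaded with the handout's sample rows.
# "FamilyMember" is a hypothetical name for the family-member table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EmployeeNum TEXT PRIMARY KEY,
    LastName TEXT, FirstName TEXT, DeptNum INTEGER
);
CREATE TABLE FamilyMember (
    EmployeeNum TEXT REFERENCES Employee (EmployeeNum),
    FamMbrNum INTEGER,
    LastName TEXT, FirstName TEXT, Relationship TEXT,
    PRIMARY KEY (EmployeeNum, FamMbrNum)
);
INSERT INTO Employee VALUES
    ('014', 'Smith', 'Bob', 100), ('086', 'Jones', 'Bill', 200),
    ('127', 'Doe', 'John', 100), ('192', 'Knot', 'Jill', 300);
INSERT INTO FamilyMember VALUES
    ('014', 1, 'Smith', 'Susan', 'Wife'), ('014', 2, 'Smith', 'Billy', 'Son'),
    ('014', 3, 'Smith', 'John', 'Son'), ('127', 1, 'Doe', 'Jane', 'Wife'),
    ('127', 2, 'Doe', 'Mary', 'Daughter'), ('127', 3, 'Doe', 'Lucy', 'Daughter'),
    ('192', 1, 'Knot', 'Sam', 'Husband'), ('192', 2, 'Knot', 'Ed', 'Son');
""")

# Parent table on the left, child table on the right, joined with LEFT JOIN.
left_rows = conn.execute("""
    SELECT *
    FROM Employee LEFT JOIN FamilyMember
    ON Employee.EmployeeNum = FamilyMember.EmployeeNum
    ORDER BY Employee.EmployeeNum, FamilyMember.FamMbrNum
""").fetchall()

# Rows whose child columns are all NULL belong to parents without children.
# Column 4 is the child's EmployeeNum, which is NULL only when nothing matched.
unmatched = [row for row in left_rows if row[4] is None]

print(len(left_rows))
print(unmatched)
```

The result should hold nine rows: the eight matched pairs plus one row for employee 086 whose five family-member columns are all None.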
Full Outer Join

This is a combination of the LEFT JOIN and the RIGHT JOIN. All of the rows of both tables are returned, and they're matched with their corresponding rows, if any, in the other table. You'll find that you don't use the FULL OUTER JOIN (or FULL JOIN) very much.

OUTER is an optional keyword, so most people omit it from the query. This is the same as FULL OUTER JOIN:

    FROM Employee FULL JOIN [family-member table]

Dot Notation and Correlation Names (Aliases)

Dot Notation Qualifier

Notice the dot notation in the examples above. This notation qualifies ambiguous attribute names by specifying the table that the attribute belongs to. Dot notation is ordinary hierarchical notation: the table name, a dot, then the attribute name:

    Employee.EmployeeNum = [family-member table].EmployeeNum

The attribute names in both tables are the same. The dot notation tells them apart.

Correlation Names (Aliases) with the AS Keyword

We can also use the dot notation with shorthand table names (correlation names, also called table aliases) substituted for the full table names by using the optional AS keyword. This saves typing: using table aliases makes the SQL statement less wordy than using the table names. The correlation names are defined in the FROM clause and used in the ON clause:

    FROM Employee AS e LEFT JOIN [family-member table] AS f
    ON e.employeenum = f.employeenum;

We can omit the keyword AS with the same result; it's there just for readability:

    FROM Employee e LEFT JOIN [family-member table] f

Here e and f are called correlation names and are substitutes for the table names. (Some texts call them tuple variables or table aliases.) Correlation names can be used throughout the query, even ahead of where they're defined. Aliases can be any length. Here's an example that displays specific columns (instead of using * to display all columns):

    SELECT emp.employeenum, emp.firstname, emp.lastname, fam.relationship, fam.firstname
    FROM Employee AS emp LEFT JOIN [family-member table] AS fam
    ON emp.employeenum = fam.employeenum
    WHERE fam.relationship = 'Wife' OR fam.relationship = 'Husband';

This query joins and filters in the same statement. There are two filters (horizontal and vertical) in this query:

1. In the SELECT clause we're naming the specific columns we want returned.
2. In the WHERE clause we're selecting a subset of rows from the join.

The query returns the set of employees with spouses:

    014  Bob   Smith  Wife     Susan
    127  John  Doe    Wife     Jane
    192  Jill  Knot   Husband  Sam

Aliases are optional; you get the same result using table names. It's good practice to include explicit names (either table names or correlation names) in all queries because it prevents subtle mistakes. In fact, it's essential in many multi-table queries. Using dot notation to explicitly name the tables reduces mistakes.
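The aliased query above can be tried against the sample data with the sketch below. It assumes an in-memory SQLite database driven from Python's standard sqlite3 module; FamilyMember is a hypothetical name for the family-member table, and the column types and ORDER BY clauses are illustrative additions. The second query is not from the handout: it is included only to show one subtle point, namely that a WHERE test on a child column discards the NULL-filled rows of a left join, whereas the same test placed in the ON clause keeps every employee.

```python
import sqlite3

# In-memory database loaded with the handout's sample rows.
# "FamilyMember" is a hypothetical name for the family-member table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EmployeeNum TEXT PRIMARY KEY,
    LastName TEXT, FirstName TEXT, DeptNum INTEGER
);
CREATE TABLE FamilyMember (
    EmployeeNum TEXT REFERENCES Employee (EmployeeNum),
    FamMbrNum INTEGER,
    LastName TEXT, FirstName TEXT, Relationship TEXT,
    PRIMARY KEY (EmployeeNum, FamMbrNum)
);
INSERT INTO Employee VALUES
    ('014', 'Smith', 'Bob', 100), ('086', 'Jones', 'Bill', 200),
    ('127', 'Doe', 'John', 100), ('192', 'Knot', 'Jill', 300);
INSERT INTO FamilyMember VALUES
    ('014', 1, 'Smith', 'Susan', 'Wife'), ('014', 2, 'Smith', 'Billy', 'Son'),
    ('014', 3, 'Smith', 'John', 'Son'), ('127', 1, 'Doe', 'Jane', 'Wife'),
    ('127', 2, 'Doe', 'Mary', 'Daughter'), ('127', 3, 'Doe', 'Lucy', 'Daughter'),
    ('192', 1, 'Knot', 'Sam', 'Husband'), ('192', 2, 'Knot', 'Ed', 'Son');
""")

# The handout's query: aliases emp and fam, specific columns, WHERE filter.
spouse_rows = conn.execute("""
    SELECT emp.employeenum, emp.firstname, emp.lastname,
           fam.relationship, fam.firstname
    FROM Employee AS emp LEFT JOIN FamilyMember AS fam
    ON emp.employeenum = fam.employeenum
    WHERE fam.relationship = 'Wife' OR fam.relationship = 'Husband'
    ORDER BY emp.employeenum
""").fetchall()

# Illustrative variant (not from the handout): the same test moved into ON,
# so employees without a spouse are kept with NULLs in the child columns.
all_employee_rows = conn.execute("""
    SELECT emp.employeenum, emp.firstname, emp.lastname,
           fam.relationship, fam.firstname
    FROM Employee AS emp LEFT JOIN FamilyMember AS fam
    ON emp.employeenum = fam.employeenum
       AND (fam.relationship = 'Wife' OR fam.relationship = 'Husband')
    ORDER BY emp.employeenum
""").fetchall()

for row in spouse_rows:
    print(row)
print(len(all_employee_rows))
```

The first query should list the three employees with spouses; the variant should return four rows, with employee 086 paired with None values.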