Advanced SQL
GROUP BY Clause and Aggregate Functions
Ray Lockwood

Points: Aggregate functions (such as COUNT( )) work on groups of rows.

Instead of returning every row read from a table, we can aggregate rows together using the GROUP BY clause. After the groups are made, we can filter the groups we want to see by using the HAVING clause.

Aggregate Functions

Here's our Employee table:

    Employee
    EmployeeNum  LastName  FirstName  DeptNum  Salary
    014          Smith     Bob        100      20000
    086          Jones     NULL       200      35000
    127          Doe       John       100      60000
    859          Thompson  Joe        300      45000
    273          Watson    Ed         200      55000
    662          Wilson    Tom        200      30000
    589          Morrison  Fred       100      40000
    840          Estes     Jerry      300      25000
    509          Harris    William    200      50000

This table has nine rows. There's a NULL in the FirstName column.

COUNT( )

If we run this query:

    SELECT * FROM Employee WHERE DeptNum = 100;

We'll get a set of rows:

    014          Smith     Bob        100      20000
    127          Doe       John       100      60000
    589          Morrison  Fred       100      40000

The output of a SQL statement is a table.
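To try the query above, here is a minimal sketch that loads the Employee table into an in-memory SQLite database using Python's built-in sqlite3 module. The choice of SQLite, the column types (EmployeeNum stored as text to keep its leading zero), and the added ORDER BY (used only to make the row order predictable) are assumptions for illustration, not part of the original handout.

```python
import sqlite3

# Rows copied from the Employee table above; None stands in for SQL NULL.
EMPLOYEES = [
    ("014", "Smith", "Bob", 100, 20000),
    ("086", "Jones", None, 200, 35000),
    ("127", "Doe", "John", 100, 60000),
    ("859", "Thompson", "Joe", 300, 45000),
    ("273", "Watson", "Ed", 200, 55000),
    ("662", "Wilson", "Tom", 200, 30000),
    ("589", "Morrison", "Fred", 100, 40000),
    ("840", "Estes", "Jerry", 300, 25000),
    ("509", "Harris", "William", 200, 50000),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the handout does not give a schema.
conn.execute(
    "CREATE TABLE Employee (EmployeeNum TEXT, LastName TEXT, "
    "FirstName TEXT, DeptNum INTEGER, Salary INTEGER)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# The result of a SELECT is itself a table: a list of rows.
dept_100 = conn.execute(
    "SELECT * FROM Employee WHERE DeptNum = 100 ORDER BY EmployeeNum"
).fetchall()
for row in dept_100:
    print(row)
```

The query returns three rows (Smith, Doe, Morrison), each with all five columns.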
When we use an Aggregate Function, it returns a single number (a scalar) built from a collection (aggregation) of many rows combined. For example, we can use the aggregate function COUNT(*) to tell us how many employees are in department 100:

    SELECT COUNT(*) FROM Employee WHERE DeptNum = 100;

Instead of a set of rows, this query returns a single scalar value:

    3

The result is an aggregate of the rows returned by the WHERE clause.

Note: Aggregate functions work on groups of rows. A single value is called a Scalar Value. It is really a set with one row and one column.

Two versions of COUNT( )

There are two versions of the COUNT function:

    COUNT(*) gives a count of all the rows returned by the query.
    COUNT(column name) returns a count of the rows for which the named attribute is not NULL.

Note: The COUNT(*) function counts rows, not attributes. COUNT(*) doesn't care if an attribute is NULL. COUNT(Column) omits NULLs from the count.

The first query below returns the number of rows in the table. The second query returns the number of rows in which FirstName is non-null:

    SELECT COUNT(*) FROM Employee;
    SELECT COUNT(FirstName) FROM Employee;

The values returned by each (there's one NULL in the FirstName column):

    COUNT(*)          9    All the rows in the table
    COUNT(FirstName)  8    All the rows in which FirstName is not NULL

All the Aggregate Functions

There are five basic aggregate functions:

1. AVG( )
2. COUNT( )
3. MAX( )
4. MIN( )
5. SUM( )

AVG( ) and SUM( ) work only on numeric data. Let's try all the aggregate functions:

    SELECT COUNT(*), AVG(Salary), MAX(Salary), MIN(Salary), SUM(Salary)
    FROM Employee
    WHERE DeptNum = 100

All of these functions work on an aggregation of rows.
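The difference between the two COUNT versions, and the five aggregates over department 100, can be checked with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in database and a trimmed-down Employee table holding only the columns these queries touch; both are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the columns these queries need.
conn.execute("CREATE TABLE Employee (FirstName TEXT, DeptNum INTEGER, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        ("Bob", 100, 20000), (None, 200, 35000), ("John", 100, 60000),
        ("Joe", 300, 45000), ("Ed", 200, 55000), ("Tom", 200, 30000),
        ("Fred", 100, 40000), ("Jerry", 300, 25000), ("William", 200, 50000),
    ],
)

# COUNT(*) counts rows; COUNT(FirstName) skips the row whose FirstName is NULL.
count_all = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
count_first = conn.execute("SELECT COUNT(FirstName) FROM Employee").fetchone()[0]
print(count_all, count_first)

# All five aggregates over the rows that survive the WHERE clause.
dept_100_stats = conn.execute(
    "SELECT COUNT(*), AVG(Salary), MAX(Salary), MIN(Salary), SUM(Salary) "
    "FROM Employee WHERE DeptNum = 100"
).fetchone()
print(dept_100_stats)
```

Each aggregate query comes back as a single row, which is why fetchone() is enough here. SQLite reports AVG( ) as a floating-point number, so the average appears as 40000.0.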
Here's the result:

    3        Row count
    40000    Average salary
    60000    Greatest salary
    20000    Least salary
    120000   Sum of the salaries

All the aggregate functions produce a scalar (single value) output.

You Can't Mix Aggregate and Non-Aggregate Results

It would make no sense to make a query like this, because the COUNT(*) wants to return a single row (a scalar value), and the LastName wants to return three rows:

    SELECT COUNT(*), LastName FROM Employee WHERE DeptNum = 100;

In standard SQL, this query causes an error! You can't output attributes that have different numbers of rows. If you use an aggregate, then all the items in the SELECT clause must either be aggregates or named in a GROUP BY clause, which we look at next.

GROUP BY Clause

We've produced a head count of one department, and by omitting the WHERE clause we'd get the head count of the whole company. How can we get a count of each one of the departments at once? We use the GROUP BY clause:

    SELECT DeptNum, COUNT(*) FROM Employee GROUP BY DeptNum;

This query gives a count for each group. GROUP BY divides the results into groups, and the aggregate function reports on each group by aggregating the rows that the GROUP BY clause places in it.

This statement returns the DeptNum of each department, and the corresponding row count for each department:

    DeptNum  COUNT(*)
    100      3
    200      4
    300      2

The aggregate function is applied to each group.
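The per-department head count can be reproduced with a small sketch, again assuming Python's built-in sqlite3 module and a cut-down Employee table (LastName and DeptNum only). The ORDER BY is an addition so the groups print in a predictable order; it is not part of the handout's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: GROUP BY DeptNum only needs the department column.
conn.execute("CREATE TABLE Employee (LastName TEXT, DeptNum INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        ("Smith", 100), ("Jones", 200), ("Doe", 100),
        ("Thompson", 300), ("Watson", 200), ("Wilson", 200),
        ("Morrison", 100), ("Estes", 300), ("Harris", 200),
    ],
)

# One output row per distinct DeptNum; COUNT(*) is evaluated inside each group.
head_counts = conn.execute(
    "SELECT DeptNum, COUNT(*) FROM Employee GROUP BY DeptNum ORDER BY DeptNum"
).fetchall()
for dept, count in head_counts:
    print(dept, count)
```

The three counts add up to nine, the number of rows in the table, because every row lands in exactly one group.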
Conceptually, the steps the GROUP BY clause takes are:

1. Sort the table by the grouping attribute.
2. Combine all rows having the same value in the grouping attribute into groups.
3. Apply the aggregate functions to each group.

SELECT Clause Is Restricted To Grouped Attributes & Aggregate Functions

The GROUP BY statement builds groups of rows, so the SELECT clause can contain only the things that make sense in the context of groups. The only things allowed in the SELECT clause are:

    The grouping attributes.
    Aggregate functions pertaining to the group.

In the above SQL statement, the SELECT clause contains the grouping attribute DeptNum, and the aggregate function COUNT(*), which counts the rows in each group.

Here is an invalid SQL statement:

    SELECT DeptNum, COUNT(*), Salary FROM Employee GROUP BY DeptNum

Multiple values per group! BAD! The SELECT clause contains something other than a grouping attribute or an aggregate function. In standard SQL, the above query will produce an error because there are many salaries for each department number. We can fix this by applying an aggregate function to Salary:

    SELECT DeptNum, COUNT(*), AVG(Salary) FROM Employee GROUP BY DeptNum

This query gives the count and average salary for each group. Good! Only the grouping attribute and aggregate functions are in the SELECT clause.

This statement returns the DeptNum of each department, and the corresponding row count and average salary:

    DeptNum  COUNT(*)  AVG(Salary)
    100      3         40000
    200      4         42500
    300      2         35000

The aggregate function is applied to each group.
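The three conceptual steps (sort, combine, apply) can be mirrored in plain Python with itertools.groupby, which likewise needs its input sorted by the grouping key first. This is only an illustration of the idea, assuming the table is reduced to (DeptNum, Salary) pairs; a real database engine is free to form its groups some other way.

```python
from itertools import groupby
from operator import itemgetter

# (DeptNum, Salary) pairs taken from the Employee table.
employees = [
    (100, 20000), (200, 35000), (100, 60000),
    (300, 45000), (200, 55000), (200, 30000),
    (100, 40000), (300, 25000), (200, 50000),
]

# Step 1: sort by the grouping attribute (DeptNum).
ordered = sorted(employees, key=itemgetter(0))

summary = []
# Step 2: groupby() combines adjacent rows that share a DeptNum.
for dept, rows in groupby(ordered, key=itemgetter(0)):
    salaries = [salary for _, salary in rows]
    # Step 3: apply the aggregates (COUNT and AVG) to each group.
    summary.append((dept, len(salaries), sum(salaries) / len(salaries)))

for line in summary:
    print(line)
```

Each tuple in summary corresponds to one row of the DeptNum, COUNT(*), AVG(Salary) result. Trying to add a plain salary to the tuple runs into the same problem as the invalid query: a group has several salaries but only one output row.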
Multiple Grouping Attributes

You can put multiple attributes in the GROUP BY clause to make fine-grained groups:

    SELECT ZipCode, PlusFour, COUNT(*) FROM Address GROUP BY ZipCode, PlusFour

Each group consists of addresses from a PlusFour zone within a ZipCode. Multiple GROUP BY attributes are OK.

HAVING clause

The HAVING clause filters the groups produced by the GROUP BY clause. It's applied after the groups are built. Let's add it to an earlier query to restrict the output to groups with more than two rows:

    SELECT DeptNum, COUNT(*) FROM Employee GROUP BY DeptNum HAVING COUNT(*) > 2

The HAVING clause takes the output of the GROUP BY and includes only the groups that test TRUE for the Boolean expression. The output contains only those groups (departments) that have more than two employees:

    DeptNum  COUNT(*)
    100      3
    200      4

The values in the HAVING clause (a grouping attribute or an aggregate function) must pertain to groups, not individual rows. HAVING filters groups after WHERE filters rows, and it filters them according to a Boolean expression.

The Order of Clauses In a SQL Statement

When building a SELECT statement, the clauses have to appear in a specific order:

1. SELECT
2. FROM
3. WHERE       (applies to rows)
4. GROUP BY
5. HAVING      (applies to groups)
6. ORDER BY

When you build a SELECT statement, it must follow this structure.
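A short sketch of the HAVING query, assuming Python's built-in sqlite3 module and an Employee table reduced to the DeptNum column. It runs the grouped query with and without the HAVING clause so the dropped group is visible; the ORDER BY is added only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per employee, department number only.
conn.execute("CREATE TABLE Employee (DeptNum INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?)",
    [(100,), (200,), (100,), (300,), (200,), (200,), (100,), (300,), (200,)],
)

# All groups, before any group filter is applied.
all_groups = conn.execute(
    "SELECT DeptNum, COUNT(*) FROM Employee GROUP BY DeptNum ORDER BY DeptNum"
).fetchall()

# HAVING keeps only the groups whose Boolean expression tests TRUE.
big_groups = conn.execute(
    "SELECT DeptNum, COUNT(*) FROM Employee "
    "GROUP BY DeptNum HAVING COUNT(*) > 2 ORDER BY DeptNum"
).fetchall()

print(all_groups)
print(big_groups)
```

Department 300 appears in the first result but not in the second, because its group has only two rows.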
Here's an example using all of the clauses:

    SELECT DeptNum, COUNT(*), AVG(Salary), MAX(Salary)
    FROM Employee
    WHERE Salary >= 35000
    GROUP BY DeptNum
    HAVING COUNT(*) > 2
    ORDER BY AVG(Salary)

The WHERE clause is the row filter and the HAVING clause is the group filter. Values in the ORDER BY clause (a grouping attribute or an aggregate function) pertain to groups, not individual rows.

From WHERE onward, the order of execution is the same as the order in which the statement is written. Remember that HAVING and ORDER BY work on groups because they're placed after the GROUP BY clause.

Order of Execution

SELECT statements process the rows first, then the groups. The order of execution is:

1. WHERE     Filters the rows from the input table.
2. GROUP BY  Groups the remaining rows into groups.
3. HAVING    Filters the groups.
4. ORDER BY  Sorts the remaining groups.
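Running the full example against the Employee data shows why the order of execution matters. The sketch below assumes Python's built-in sqlite3 module and an Employee table reduced to DeptNum and Salary; neither is specified by the handout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the columns the query uses.
conn.execute("CREATE TABLE Employee (DeptNum INTEGER, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        (100, 20000), (200, 35000), (100, 60000),
        (300, 45000), (200, 55000), (200, 30000),
        (100, 40000), (300, 25000), (200, 50000),
    ],
)

# WHERE drops low-salary rows first, GROUP BY groups what is left,
# HAVING drops small groups, and ORDER BY sorts the surviving groups.
result = conn.execute(
    "SELECT DeptNum, COUNT(*), AVG(Salary), MAX(Salary) "
    "FROM Employee "
    "WHERE Salary >= 35000 "
    "GROUP BY DeptNum "
    "HAVING COUNT(*) > 2 "
    "ORDER BY AVG(Salary)"
).fetchall()
print(result)
```

Only department 200 survives. Department 100 has three employees in the table, but one of them earns less than 35000 and is removed by WHERE before the groups are built, so its group holds two rows and fails the HAVING test.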