This lecture. Databases - SQL II. Counting students. Summary Functions

Similar documents
Databases - SQL II. (GF Royle, N Spadaccini ) Structured Query Language II 1 / 22

COMP 244 DATABASE CONCEPTS & APPLICATIONS

CSC Web Programming. Introduction to SQL

This lecture. Basic Syntax

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL

Database Usage (and Construction)

T-SQL Training: T-SQL for SQL Server for Developers

Databases - Relational Algebra. (GF Royle, N Spadaccini ) Databases - Relational Algebra 1 / 24

This lecture. Projection. Relational Algebra. Suppose we have a relation

12. MS Access Tables, Relationships, and Queries

Relational terminology. Databases - Sets & Relations. Sets. Membership

Part I: Structured Data

Operating systems fundamentals - B07

SQL Functions (Single-Row, Aggregate)

SQL Data Query Language

SQL - Data Query language

SQL. Char (30) can store ram, ramji007 or 80- b

Database Management Systems,

Unit Assessment Guide

INTRODUCTION (SQL & Classification of SQL statements)

SQL - Lecture 3 (Aggregation, etc.)

CSCI-UA: Database Design & Web Implementation. Professor Evan Sandhaus

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

SQL Data Manipulation Language. Lecture 5. Introduction to SQL language. Last updated: December 10, 2014

Databases SQL2. Gordon Royle. School of Mathematics & Statistics University of Western Australia. Gordon Royle (UWA) SQL2 1 / 26

CS 327E Lecture 2. Shirley Cohen. January 27, 2016

CS 582 Database Management Systems II

CSEN 501 CSEN501 - Databases I

ASSIGNMENT NO Computer System with Open Source Operating System. 2. Mysql

Chapter 9: Working With Data

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

Databases - Transactions II. (GF Royle, N Spadaccini ) Databases - Transactions II 1 / 22

Structured Query Language Continued. Rose-Hulman Institute of Technology Curt Clifton

Database Management Systems. Chapter 5

Database Management Systems by Hanh Pham GOALS

Lecture 6 - More SQL

Structure Query Language (SQL)

Logical Operators and aggregation

Database Usage (and Construction)

SQL Aggregation and Grouping. Outline

SQL Aggregation and Grouping. Davood Rafiei

WEEK 3 TERADATA EXERCISES GUIDE

Database Management Systems. Chapter 5

Chapter # 7 Introduction to Structured Query Language (SQL) Part II

Querying Data with Transact SQL

Relational Database Management Systems for Epidemiologists: SQL Part II

Create a simple database with MySQL

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML)

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML)

SQL QUERIES. CS121: Relational Databases Fall 2017 Lecture 5

SQL BASICS WITH THE SMALLBANKDB STEFANO GRAZIOLI & MIKE MORRIS

This lecture. Step 1

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Set theory is a branch of mathematics that studies sets. Sets are a collection of objects.

SQL: Aggregate Functions with Grouping

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

Data Manipulation Language (DML)

What Are Group Functions? Reporting Aggregated Data Using the Group Functions. Objectives. Types of Group Functions

Database Management Systems. Chapter 5

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. General Overview - rel. model. Overview - detailed - SQL

GIFT Department of Computing Science. CS-217: Database Systems. Lab-4 Manual. Reporting Aggregated Data using Group Functions

Database Applications (15-415)

SQL: Data Querying. B0B36DBS, BD6B36DBS: Database Systems. h p:// Lecture 4

Institute of Aga. Network Database LECTURER NIYAZ M. SALIH

Introduction to Data Management. Lecture 14 (SQL: the Saga Continues...)

STIDistrict Query (Basic)

SQL Data Querying and Views

Introduction to database design

Based on the following Table(s), Write down the queries as indicated: 1. Write an SQL query to insert a new row in table Dept with values: 4, Prog, MO

DB2 SQL Class Outline

Introduction to MySQL

Database Languages. A DBMS provides two types of languages: Language for accessing & manipulating the data. Language for defining a database schema

Automatically Synthesizing SQL Queries from Input-Output Examples

MIS2502: Data Analytics SQL Getting Information Out of a Database Part 1: Basic Queries

CSCI-UA: Database Design & Web Implementation. Professor Evan Sandhaus Lecture #17: MySQL Gets Done

EECS 647: Introduction to Database Systems

Institute of Aga. Microsoft SQL Server LECTURER NIYAZ M. SALIH

Lesson 2. Data Manipulation Language

Introduction SQL DRL. Parts of SQL. SQL: Structured Query Language Previous name was SEQUEL Standardized query language for relational DBMS:

SQL functions fit into two broad categories: Data definition language Data manipulation language

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

SQL Aggregate Functions

More on SQL Nested Queries Aggregate operators and Nulls

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

Chapter 4 SQL. Database Systems p. 121/567

INDEX. 1 Basic SQL Statements. 2 Restricting and Sorting Data. 3 Single Row Functions. 4 Displaying data from multiple tables

SQL Retrieving Data from Multiple Tables

NULLs & Outer Joins. Objectives of the Lecture :

Outline. Introduction to SQL. What happens when you run an SQL query? There are 6 possible clauses in a select statement. Tara Murphy and James Curran

SQL. Chapter 5 FROM WHERE

Introduction to SQL. Tara Murphy and James Curran. 15th April, 2009

QQ Group

Oracle Database 10g: Introduction to SQL

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream

SQL. SQL Queries. CISC437/637, Lecture #8 Ben Cartere9e. Basic form of a SQL query: 3/17/10

HKTA TANG HIN MEMORIAL SECONDARY SCHOOL SECONDARY 3 COMPUTER LITERACY. Name: ( ) Class: Date: Databases and Microsoft Access

Relational Database Management Systems for Epidemiologists: SQL Part I

Introduction to SQL. Multirelation Queries Subqueries. Slides are reused by the approval of Jeffrey Ullman s

Sql Server Syllabus. Overview

This lecture. Databases - JDBC I. Application Programs. Database Access End Users

Transcription:

This lecture Databases - SQL II This lecture focuses on the summary or aggregate features provided in MySQL. The summary functions are those functions that return a single value from a collection of values for example, functions that produce counts, totals and averages. (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 1 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 2 / 22 Summary Functions Counting students One of the main uses of a database is to summarize the data it contains, in particular to provide statistical data. The main summary functions are COUNT to count rows SUM to add the values in a column MIN to find the minimum value in a column MAX to find the maximum value in a column AVG to find the average value in a column STD to find the standard deviation of the values in a column How many students are in the class for the grade-keeping project? SELECT COUNT(*) FROM student; 31 The COUNT function says to count the number of rows that are returned by the SELECT statement notice that this syntax is not very intuitive. (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 3 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 4 / 22

How many men and women? If we add a WHERE clause to the statement, then the COUNT will apply only to the selected rows. SELECT COUNT(*) FROM student WHERE sex = M ; 16 SELECT COUNT(*) FROM student WHERE sex = F ; 15 With one statement We can count both men and women in a single statement by using the GROUP BY clause. SELECT COUNT(*) FROM student GROUP BY sex; 15 16 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 5 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 6 / 22 But which is which Statistical Data As it stands, we don t know which value is associated with which sex! SELECT sex, COUNT(*) FROM student GROUP BY sex; +----- sex +----- F 15 M 16 +----- The GROUP BY clause says to first group the rows according to the distinct values of the specified attribute(s) and then do the counting. Now let s try and find statistical data about the quizzes and tests. SELECT event_id, MIN(score), MAX(score), AVG(score) FROM score GROUP BY event_id; ------------+------------+------------+ event_id MIN(score) MAX(score) AVG(score) ------------+------------+------------+ 1 9 20 15.1379 2 8 19 14.1667 3 60 97 78.2258 4 7 20 14.0370 5 8 20 14.1852 6 62 100 80.1724 ------------+------------+------------+ (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 7 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 8 / 22

Counting tests and quizzes Separating tests and quizzes How many of the events were tests and how many were quizzes? SELECT G.category, COUNT(*) FROM grade_event G GROUP BY G.category; ----------+ category ----------+ T 2 Q 4 ----------+ Can we get separate summary data for the quizzes and the tests? To do this we will need to do a multi-table query because score does not know what type each event is. SELECT G.category, AVG(S.score) FROM grade_event G, score S WHERE G.event_id = S.event_id GROUP BY G.category; --------------+ category AVG(S.score) --------------+ T 79.1667 Q 14.3894 --------------+ (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 9 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 10 / 22 Separating males and females Super-aggregate Now suppose we want to find the averages for each sex separately and for tests and quizzes separately. SELECT G.category, S.sex, AVG(M.score) WHERE G.event_id = M.event_id AND M.student_id = S.student_id GROUP BY G.category, S.sex; -----+--------------+ category sex AVG(M.score) -----+--------------+ T F 77.5862 T M 80.6452 Q F 14.6981 Q M 14.1167 -----+--------------+ SELECT G.category, S.sex, AVG(M.score) WHERE G.event_id = M.event_id AND M.student_id = S.student_id GROUP BY G.category, S.sex WITH ROLLUP; ------+--------------+ category sex AVG(M.score) ------+--------------+ Q F 14.6981 Q M 14.1167 Q NULL 14.3894 T F 77.5862 T M 80.6452 T NULL 79.1667 NULL NULL 36.8555 ------+--------------+ (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 11 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 12 / 22

What rollup does Adding the names The ROLLUP clause generates summaries of summaries that are inserted at appropriate places in the table. The GROUP BY clauses caused the data to summarised according to the four groups (Q, F), (Q, M), (T, F), (T, M). Rollup causes these groups to be further grouped together into (Q, both) and (T, both) and then finally combined into a single group. The fields where multiple values have been counted together are displayed in the result set by using NULL for that field. At the end of semester, the lecturer needs to know how many marks each person got in their quizzes and tests. SELECT S.name, G.category, COUNT(*), SUM(M.score) WHERE G.event_id = M.event_id AND S.student_id = M.student_id GROUP BY S.name, G.category WITH ROLLUP; (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 13 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 14 / 22 The output +---------------------+--------------+ name category SUM(M.score) +---------------------+--------------+ Abby Q 4 63 Abby T 2 194 Abby NULL 6 257 Aubrey Q 4 58 Aubrey T 2 137 Aubrey NULL 6 195 Avery Q 3 40 Avery T 2 138 Avery NULL 5 178 Becca Q 4 60 Becca T 2 176 Filtering on aggregate values Suppose we want to find the student who got the highest average quiz mark. SELECT S.name, COUNT(*), AVG(M.score) WHERE G.category = Q AND G.event_id = M.event_id AND S.student_id = M.student_id GROUP BY S.name ORDER BY AVG(M.score) DESC; +-------------------------+ name AVG(M.score) +-------------------------+ Megan 3 17.3333 Gabrielle 3 17.0000 Michael 4 16.7500 Teddy 4 16.2500 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 15 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 16 / 22

Using HAVING But the quiz-prize can only go to a student who sat all of the quizzes. SELECT S.name, COUNT(*), AVG(M.score) WHERE G.category = Q AND G.event_id = M.event_id AND S.student_id = M.student_id GROUP BY S.name HAVING COUNT(*) = 4 ORDER BY AVG(M.score) DESC; +-----------------------+ name AVG(M.score) +-----------------------+ Michael 4 16.7500 Teddy 4 16.2500 Summary The HAVING clause behaves exactly like a WHERE clause except that it operates on the summarized data, so the whole process is as follows: The named columns are extracted from the Cartesian product of all the tables listed in the FROM clause. All of these rows are then filtered according to the WHERE clause The filtered rows are then grouped together according to the GROUP BY clause The aggregate functions are applied to the rows in each group. The resulting rows are then filtered by the HAVING clause. The filtered, aggregated rows are then ordered by the ORDER BY clause. (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 17 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 18 / 22 Using DISTINCT In order to count the number of different states from which the presidents come, we can use SELECT COUNT (DISTINCT state) FROM president; +-----------------------+ COUNT(DISTINCT state) +-----------------------+ 20 +-----------------------+ The DISTINCT keyword eliminates the duplicate values before counting. Tables with NULL values Consider a table with the following data mysql> select * from test; +------+ mark +------+ 10 15 20 NULL 8 +------+ What is the number of rows, the sum of the rows and the average value for the single field? (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 19 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 20 / 22

Sometimes NULL counts, sometimes not! Learning how to summarize mysql> SELECT COUNT(*), SUM(mark), AVG(mark) FROM test; -----------+-----------+ SUM(mark) AVG(mark) -----------+-----------+ 5 53 13.2500 -----------+-----------+ Notice that AVG is not equal to SUM / COUNT. Learning how to use the summary functions requires a lot of practice because you just have to learn the somewhat strange syntax, and the error messages produced by MySQL are not very informative. For example, to someone used to a normal programming language it seems very strange to type SELECT COUNT(*) FROM president rather than WHERE death IS NULL; COUNT (SELECT * FROM president WHERE death is NULL); (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 21 / 22 (GF Royle 2006-8, N Spadaccini 2008) Structured Query Language II 22 / 22