Databases - SQL II: Structured Query Language II
(GF Royle, N Spadaccini 2006-2010)

This lecture

This lecture focuses on the summary or aggregate features provided in MySQL. The summary functions are those functions that return a single value from a collection of values: for example, functions that produce counts, totals and averages.

Summary Functions

One of the main uses of a database is to summarize the data it contains, in particular to provide statistical data. The main summary functions are

  COUNT  to count rows
  SUM    to add the values in a column
  MIN    to find the minimum value in a column
  MAX    to find the maximum value in a column
  AVG    to find the average value in a column
  STD    to find the standard deviation of the values in a column

Counting students

How many students are in the class for the grade-keeping project?

    SELECT COUNT(*) FROM student;

    +----------+
    | COUNT(*) |
    +----------+
    |       31 |
    +----------+

The COUNT function says to count the number of rows that are returned by the SELECT statement. Notice that this syntax is not very intuitive.

How many men and women?

If we add a WHERE clause to the statement, then the COUNT will apply only to the selected rows.

    SELECT COUNT(*) FROM student WHERE sex = 'M';

    +----------+
    | COUNT(*) |
    +----------+
    |       16 |
    +----------+

    SELECT COUNT(*) FROM student WHERE sex = 'F';

    +----------+
    | COUNT(*) |
    +----------+
    |       15 |
    +----------+

With one statement

We can count both men and women in a single statement by using the GROUP BY clause.

    SELECT COUNT(*) FROM student GROUP BY sex;

    +----------+
    | COUNT(*) |
    +----------+
    |       15 |
    |       16 |
    +----------+

But which is which

As it stands, we don't know which value is associated with which sex!

    SELECT sex, COUNT(*) FROM student GROUP BY sex;

    +------+----------+
    | sex  | COUNT(*) |
    +------+----------+
    | F    |       15 |
    | M    |       16 |
    +------+----------+

The GROUP BY clause says to first group the rows according to the distinct values of the specified attribute(s) and then do the counting.
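The group-then-count behaviour can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the lecture itself uses MySQL), with a handful of made-up rows rather than the real student table.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, sex TEXT)")

# Hypothetical toy rows, not the grade-keeping project data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ann", "F"), ("Bob", "M"), ("Cat", "F"), ("Dan", "M"), ("Eve", "F")],
)

# Rows are first grouped by the distinct values of sex, then counted per group.
counts_by_sex = conn.execute(
    "SELECT sex, COUNT(*) FROM student GROUP BY sex ORDER BY sex"
).fetchall()

print(counts_by_sex)  # one (sex, count) pair per distinct value of sex
```

Selecting the grouping column alongside COUNT(*) is what makes each count identifiable; the ORDER BY is added here only to make the row order predictable.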

Statistical Data

Now let's try and find statistical data about the quizzes and tests.

    SELECT event_id, MIN(score), MAX(score), AVG(score)
    FROM score
    GROUP BY event_id;

    +----------+------------+------------+------------+
    | event_id | MIN(score) | MAX(score) | AVG(score) |
    +----------+------------+------------+------------+
    |        1 |          9 |         20 |    15.1379 |
    |        2 |          8 |         19 |    14.1667 |
    |        3 |         60 |         97 |    78.2258 |
    |        4 |          7 |         20 |    14.0370 |
    |        5 |          8 |         20 |    14.1852 |
    |        6 |         62 |        100 |    80.1724 |
    +----------+------------+------------+------------+

Counting tests and quizzes

How many of the events were tests and how many were quizzes?

    SELECT G.category, COUNT(*)
    FROM grade_event G
    GROUP BY G.category;

    +----------+----------+
    | category | COUNT(*) |
    +----------+----------+
    | T        |        2 |
    | Q        |        4 |
    +----------+----------+

Separating tests and quizzes

Can we get separate summary data for the quizzes and the tests? To do this we will need to do a multi-table query, because score does not know what type each event is.

    SELECT G.category, AVG(S.score)
    FROM grade_event G, score S
    WHERE G.event_id = S.event_id
    GROUP BY G.category;

    +----------+--------------+
    | category | AVG(S.score) |
    +----------+--------------+
    | T        |      79.1667 |
    | Q        |      14.3894 |
    +----------+--------------+

Separating males and females

Now suppose we want to find the averages for each sex separately and for tests and quizzes separately.

    SELECT G.category, S.sex, AVG(M.score)
    FROM grade_event G, student S, score M
    WHERE G.event_id = M.event_id
      AND M.student_id = S.student_id
    GROUP BY G.category, S.sex;

    +----------+------+--------------+
    | category | sex  | AVG(M.score) |
    +----------+------+--------------+
    | T        | F    |      77.5862 |
    | T        | M    |      80.6452 |
    | Q        | F    |      14.6981 |
    | Q        | M    |      14.1167 |
    +----------+------+--------------+

Super-aggregate

    SELECT G.category, S.sex, AVG(M.score)
    FROM grade_event G, student S, score M
    WHERE G.event_id = M.event_id
      AND M.student_id = S.student_id
    GROUP BY G.category, S.sex WITH ROLLUP;

    +----------+------+--------------+
    | category | sex  | AVG(M.score) |
    +----------+------+--------------+
    | Q        | F    |      14.6981 |
    | Q        | M    |      14.1167 |
    | Q        | NULL |      14.3894 |
    | T        | F    |      77.5862 |
    | T        | M    |      80.6452 |
    | T        | NULL |      79.1667 |
    | NULL     | NULL |      36.8555 |
    +----------+------+--------------+

What rollup does

The ROLLUP clause generates summaries of summaries that are inserted at appropriate places in the table. The GROUP BY clause caused the data to be summarised according to the four groups (Q, F), (Q, M), (T, F), (T, M). Rollup causes these groups to be further grouped together into (Q, both) and (T, both) and then finally combined into a single group. The fields where multiple values have been combined are displayed in the result set by using NULL for that field.
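To see what the extra rollup rows contain, the sketch below rebuilds them by hand. It assumes Python's sqlite3 module as a stand-in for MySQL; SQLite has no WITH ROLLUP, so the super-aggregate rows are emulated with UNION ALL over progressively coarser groupings. The result table and its six rows are hypothetical and the row order is not the one MySQL would print.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (category TEXT, sex TEXT, score INTEGER)")

# Hypothetical toy data: one pre-joined row per score.
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [
        ("Q", "F", 10), ("Q", "F", 20), ("Q", "M", 12),
        ("T", "F", 80), ("T", "M", 60), ("T", "M", 70),
    ],
)

# Emulation of GROUP BY category, sex WITH ROLLUP:
# the finest groups, then one row per category, then one grand row.
rollup_rows = conn.execute("""
    SELECT category, sex, AVG(score) FROM result GROUP BY category, sex
    UNION ALL
    SELECT category, NULL, AVG(score) FROM result GROUP BY category
    UNION ALL
    SELECT NULL, NULL, AVG(score) FROM result
""").fetchall()

# Index the averages by (category, sex); None plays the role of NULL.
averages = {(category, sex): avg for category, sex, avg in rollup_rows}
for key, avg in averages.items():
    print(key, avg)
```

Each super-aggregate row is computed from the underlying scores, not from the averages of the finer groups, which is why (Q, NULL) need not be the midpoint of (Q, F) and (Q, M).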

Adding the names

At the end of semester, the lecturer needs to know how many marks each person got in their quizzes and tests.

    SELECT S.name, G.category, COUNT(*), SUM(M.score)
    FROM grade_event G, student S, score M
    WHERE G.event_id = M.event_id
      AND S.student_id = M.student_id
    GROUP BY S.name, G.category WITH ROLLUP;

The output

    +--------+----------+----------+--------------+
    | name   | category | COUNT(*) | SUM(M.score) |
    +--------+----------+----------+--------------+
    | Abby   | Q        |        4 |           63 |
    | Abby   | T        |        2 |          194 |
    | Abby   | NULL     |        6 |          257 |
    | Aubrey | Q        |        4 |           58 |
    | Aubrey | T        |        2 |          137 |
    | Aubrey | NULL     |        6 |          195 |
    | Avery  | Q        |        3 |           40 |
    | Avery  | T        |        2 |          138 |
    | Avery  | NULL     |        5 |          178 |
    | Becca  | Q        |        4 |           60 |
    | Becca  | T        |        2 |          176 |

Filtering on aggregate values

Suppose we want to find the student who got the highest average quiz mark.

    SELECT S.name, COUNT(*), AVG(M.score)
    FROM grade_event G, student S, score M
    WHERE G.category = 'Q'
      AND G.event_id = M.event_id
      AND S.student_id = M.student_id
    GROUP BY S.name
    ORDER BY AVG(M.score) DESC;

    +-----------+----------+--------------+
    | name      | COUNT(*) | AVG(M.score) |
    +-----------+----------+--------------+
    | Megan     |        3 |      17.3333 |
    | Gabrielle |        3 |      17.0000 |
    | Michael   |        4 |      16.7500 |
    | Teddy     |        4 |      16.2500 |

Using HAVING

But the quiz prize can only go to a student who sat all of the quizzes.

    SELECT S.name, COUNT(*), AVG(M.score)
    FROM grade_event G, student S, score M
    WHERE G.category = 'Q'
      AND G.event_id = M.event_id
      AND S.student_id = M.student_id
    GROUP BY S.name
    HAVING COUNT(*) = 4
    ORDER BY AVG(M.score) DESC;

    +---------+----------+--------------+
    | name    | COUNT(*) | AVG(M.score) |
    +---------+----------+--------------+
    | Michael |        4 |      16.7500 |
    | Teddy   |        4 |      16.2500 |

Summary

The HAVING clause behaves exactly like a WHERE clause except that it operates on the summarized data, so the whole process is as follows:

  1. The named columns are extracted from the Cartesian product of all the tables listed in the FROM clause.
  2. All of these rows are then filtered according to the WHERE clause.
  3. The filtered rows are then grouped together according to the GROUP BY clause.
  4. The aggregate functions are applied to the rows in each group.
  5. The resulting rows are then filtered by the HAVING clause.
  6. The filtered, aggregated rows are then ordered by the ORDER BY clause.
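The difference between filtering rows with WHERE and filtering groups with HAVING can be seen on a small example. This is a sketch only, assuming Python's sqlite3 module in place of MySQL and a hypothetical single-table layout (name, category, score) instead of the three joined tables used in the lecture; here a student must have sat 2 quizzes to qualify.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (name TEXT, category TEXT, score INTEGER)")

# Hypothetical toy data: Bob has the best quiz average but sat only one quiz.
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?)",
    [
        ("Ann", "Q", 18), ("Ann", "Q", 16), ("Ann", "T", 90),
        ("Bob", "Q", 20),
        ("Cat", "Q", 12), ("Cat", "Q", 14), ("Cat", "T", 70),
    ],
)

prize_candidates = conn.execute("""
    SELECT name, COUNT(*), AVG(score)
    FROM score
    WHERE category = 'Q'        -- step 2: drop individual non-quiz rows
    GROUP BY name               -- step 3: one group per student
    HAVING COUNT(*) = 2         -- step 5: drop whole groups after aggregation
    ORDER BY AVG(score) DESC    -- step 6: order the surviving groups
""").fetchall()

print(prize_candidates)
```

The condition on COUNT(*) cannot be moved into the WHERE clause, because the count does not exist until the rows have been grouped.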

Using DISTINCT

In order to count the number of different states from which the presidents come, we can use

    SELECT COUNT(DISTINCT state) FROM president;

    +-----------------------+
    | COUNT(DISTINCT state) |
    +-----------------------+
    |                    20 |
    +-----------------------+

The DISTINCT keyword eliminates the duplicate values before counting.
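The effect of DISTINCT is easiest to see next to a plain count of the same column. The sketch below assumes Python's sqlite3 module as a stand-in for MySQL, and a hypothetical five-row president table rather than the real data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE president (name TEXT, state TEXT)")

# Hypothetical rows with made-up names; two states occur twice.
conn.executemany(
    "INSERT INTO president VALUES (?, ?)",
    [("P1", "VA"), ("P2", "MA"), ("P3", "VA"), ("P4", "NY"), ("P5", "MA")],
)

# COUNT(state) counts every non-NULL value; COUNT(DISTINCT state)
# removes duplicate values first and then counts what is left.
all_states, distinct_states = conn.execute(
    "SELECT COUNT(state), COUNT(DISTINCT state) FROM president"
).fetchone()

print(all_states, distinct_states)
```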

Tables with NULL values

Consider a table with the following data

    mysql> select * from test;
    +------+
    | mark |
    +------+
    |   10 |
    |   15 |
    |   20 |
    | NULL |
    |    8 |
    +------+

What is the number of rows, the sum of the rows and the average value for the single field?

Sometimes NULL counts, sometimes not!

    mysql> SELECT COUNT(*), SUM(mark), AVG(mark) FROM test;
    +----------+-----------+-----------+
    | COUNT(*) | SUM(mark) | AVG(mark) |
    +----------+-----------+-----------+
    |        5 |        53 |   13.2500 |
    +----------+-----------+-----------+

Notice that AVG is not equal to SUM / COUNT. COUNT(*) counts every row, including the one whose mark is NULL, whereas SUM and AVG ignore NULL values, so the average is taken over the four non-NULL marks: 53 / 4 = 13.25.
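The same five rows can be checked directly. This is a minimal sketch assuming Python's sqlite3 module as a stand-in for MySQL; the COUNT(mark) column is an addition not shown on the slide, included to expose the number of non-NULL values that AVG actually divides by.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (mark INTEGER)")

# The five rows from the slide; Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO test VALUES (?)",
    [(10,), (15,), (20,), (None,), (8,)],
)

# COUNT(*) counts rows; COUNT(mark), SUM(mark) and AVG(mark) skip NULLs.
null_summary = conn.execute(
    "SELECT COUNT(*), COUNT(mark), SUM(mark), AVG(mark) FROM test"
).fetchone()

row_count, mark_count, mark_sum, mark_avg = null_summary
print(row_count, mark_count, mark_sum, mark_avg)
```

Here AVG(mark) equals SUM(mark) / COUNT(mark), not SUM(mark) / COUNT(*).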

Learning how to summarize

Learning how to use the summary functions requires a lot of practice because you just have to learn the somewhat strange syntax, and the error messages produced by MySQL are not very informative. For example, to someone used to a normal programming language it seems very strange to type

    SELECT COUNT(*) FROM president WHERE death IS NULL;

rather than

    COUNT (SELECT * FROM president WHERE death IS NULL);