Introduction to Database Systems CSE 414

Similar documents
Introduction to Data Management CSE 344

SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName

Database Systems CSE 414

Introduction to Data Management CSE 344

Announcements. Outline UNIQUE. (Inner) joins. (Inner) Joins. Database Systems CSE 414. WQ1 is posted to gradebook double check scores

Lecture 03: SQL. Friday, April 2 nd, Dan Suciu Spring

Principles of Database Systems CSE 544. Lecture #2 SQL The Complete Story

CSC 261/461 Database Systems Lecture 5. Fall 2017

Lecture 4: Advanced SQL Part II

Introduction to SQL Part 2 by Michael Hahsler Based on slides for CS145 Introduction to Databases (Stanford)

SQL. Assit.Prof Dr. Anantakul Intarapadung

SQL. Structured Query Language

Introduction to Database Systems CSE 414

Lecture 04: SQL. Monday, April 2, 2007

Lecture 04: SQL. Wednesday, October 4, 2006

L07: SQL: Advanced & Practice. CS3200 Database design (sp18 s2) 1/11/2018

Principles of Database Systems CSE 544. Lecture #2 SQL, Relational Algebra, Relational Calculus

Introduction to Data Management CSE 344

CSE 544 Principles of Database Management Systems

Subqueries. 1. Subqueries in SELECT. 1. Subqueries in SELECT. Including Empty Groups. Simple Aggregations. Introduction to Database Systems CSE 414

CS 2451 Database Systems: More SQL

Introduction to Data Management CSE 344

Lectures 2&3: Introduction to SQL

CS 564 Midterm Review

CSE 344 MAY 14 TH ENTITIES

Where Are We? Next Few Lectures. Integrity Constraints Motivation. Constraints in E/R Diagrams. Keys in E/R Diagrams

Introduction to Data Management CSE 344

CSE 544 Principles of Database Management Systems

Database design and implementation CMPSCI 645. Lecture 04: SQL and Datalog

Please pick up your name card

CSE 344 AUGUST 1 ST ENTITIES

Introduction to Relational Databases

Introduction to Data Management CSE 344

CSE 344 JULY 30 TH DB DESIGN (CH 4)

Announcements. Subqueries. Lecture Goals. 1. Subqueries in SELECT. Database Systems CSE 414. HW1 is due today 11pm. WQ1 is due tomorrow 11pm

Introduction to Database Systems

Database Systems CSE 414

Database Systems CSE 414

CSE 344 JANUARY 19 TH SUBQUERIES 2 AND RELATIONAL ALGEBRA

Database Systems CSE 414

HW1 is due tonight HW2 groups are assigned. Outline today: - nested queries and witnesses - We start with a detailed example! - outer joins, nulls?

Announcements. Multi-column Keys. Multi-column Keys (3) Multi-column Keys. Multi-column Keys (2) Introduction to Data Management CSE 414

Announcements. Database Design. Database Design. Database Design Process. Entity / Relationship Diagrams. Introduction to Data Management CSE 344

Introduction to Database Systems CSE 414

Database design and implementation CMPSCI 645. Lecture 06: Constraints and E/R model

Database Design Process Entity / Relationship Diagrams

Introduction to Database Systems CSE 414. Lecture 8: SQL Wrap-up

CSC 261/461 Database Systems Lecture 14. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Announcements. Multi-column Keys. Multi-column Keys. Multi-column Keys (3) Multi-column Keys (2) Introduction to Data Management CSE 414

Introduction to Data Management CSE 414

Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Introduction to Database Systems CSE 444

Announcements. Agenda. Database/Relation/Tuple. Discussion. Schema. CSE 444: Database Internals. Room change: Lab 1 part 1 is due on Monday

Keys, SQL, and Views CMPSCI 645

Entity/Relationship Modelling

Polls on Piazza. Open for 2 days Outline today: Next time: "witnesses" (traditionally students find this topic the most difficult)

Announcements. Database Design. Database Design. Database Design Process. Entity / Relationship Diagrams. Database Systems CSE 414

Announcements. Agenda. Database/Relation/Tuple. Schema. Discussion. CSE 444: Database Internals

Database Systems CSE 414

Agenda. Discussion. Database/Relation/Tuple. Schema. Instance. CSE 444: Database Internals. Review Relational Model

Announcements. Joins in SQL. Joins in SQL. (Inner) joins. Joins in SQL. Introduction to Database Systems CSE 414

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 2451 Database Systems: SQL

Lecture 3 Additional Slides. CSE 344, Winter 2014 Sudeepa Roy

Querying Data with Transact SQL

Introduction to Database Systems CSE 344

SQL Continued! Outerjoins, Aggregations, Grouping, Data Modification

Structured Query Language Continued. Rose-Hulman Institute of Technology Curt Clifton

CSE 414 Midterm. April 28, Name: Question Points Score Total 101. Do not open the test until instructed to do so.

Introduction to Data Management CSE 414

CSE 344 Introduction to Data Management. Section 2: More SQL

CS 327E Lecture 11. Shirley Cohen. March 2, 2016

In-class activities: Sep 25, 2017

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

CSE 344 JANUARY 8 TH SQLITE AND JOINS

CSE 344 JANUARY 5 TH INTRO TO THE RELATIONAL DATABASE

CS 327E Lecture 2. Shirley Cohen. January 27, 2016

L12: ER modeling 5. CS3200 Database design (sp18 s2) 2/22/2018

Announcements. Two typical kinds of queries. Choosing Index is Not Enough. Cost Parameters. Cost of Reading Data From Disk

MIS2502: Data Analytics SQL Getting Information Out of a Database Part 1: Basic Queries

Lesson 2. Data Manipulation Language

Lecture 2: Introduction to SQL

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2006 Lecture 3 - Relational Model

Data Storage. Query Performance. Index. Data File Types. Introduction to Data Management CSE 414. Introduction to Database Systems CSE 414

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

Introduction to Database Systems CSE 414. Lecture 3: SQL Basics

Handout 9 CS-605 Spring 18 Page 1 of 8. Handout 9. SQL Select -- Multi Table Queries. Joins and Nested Subqueries.

Database Systems CSE 414

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

SQL: Part II. Announcements (September 18) Incomplete information. CPS 116 Introduction to Database Systems. Homework #1 due today (11:59pm)

5/2/16. Announcements. NoSQL Motivation. The New Hipster: NoSQL. Serverless. What is the Problem? Database Systems CSE 414

Introduction to SQL. IT 5101 Introduction to Database Systems. J.G. Zheng Fall 2011

Introduction to Database Systems CSE 414

CMPT 354: Database System I. Lecture 4. SQL Advanced

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

Lecture 6 - More SQL

MIS2502: Data Analytics SQL Getting Information Out of a Database. Jing Gong

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

Transcription:

Introduction to Database Systems CSE 414 Lectures 4 and 5: Aggregates in SQL CSE 414 - Spring 2013 1

Announcements Homework 1 is due on Wednesday Quiz 2 will be out today and due on Friday CSE 414 - Spring 2013 2

Outline Outer joins (6.3.8) Aggregations (6.4.3 6.4.6) Examples, examples, examples CSE 414 - Spring 2013 3

Outerjoins Product(name, category) Purchase(prodName, store) -- prodname is foreign key An inner join : Same as: SELECT Product.name, Purchase.store FROM Product, Purchase WHERE Product.name = Purchase.prodName SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName But some Products are not listed! Why? 4

Outerjoins Product(name, category) Purchase(prodName, store) -- prodname is foreign key If we want to include products that never sold, then we need an outerjoin : SELECT Product.name, Purchase.store FROM Product LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName CSE 414 - Spring 2013 5

Product Purchase Name Category ProdName Store Gizmo gadget Gizmo Wiz Camera Photo Camera Ritz OneClick Photo Camera Wiz Name Gizmo Camera Camera OneClick Store Wiz Ritz Wiz NULL 6

Outer Joins Left outer join: Include the left tuple even if there s no match Right outer join: Include the right tuple even if there s no match Full outer join: Include both left and right tuples even if there s no match CSE 414 - Spring 2013 7

Aggregation in SQL >sqlite3 lecture04!! sqlite> create table Purchase!! (pid int primary key,!! product text,!! price float,!! quantity int,!! month varchar(15));!! sqlite> -- download data.txt! sqlite>.import data.txt Purchase! Specify a filename where the database will be stored Other DBMSs have other ways of impor5ng data CSE 414 - Spring 2013 8

Comment about SQLite One cannot load NULL values such that they are actually loaded as null values So we need to use two steps: Load null values using some type of special value Update the special values to actual null values update Purchase set price = null where price = null! CSE 414 - Spring 2013 9

Simple Aggregations Five basic aggregate operations in SQL select count(*) from Purchase! select sum(quantity) from Purchase! select avg(price) from Purchase! select max(quantity) from Purchase! select min(quantity) from Purchase! Except count, all aggregations apply to a single attribute CSE 414 - Spring 2013 10

Aggregates and NULL Values Null values are not used in aggregates insert into Purchase values(12, 'gadget', NULL, NULL, 'april')! Let s try the following select count(*) from Purchase! select count(quantity) from Purchase!! select sum(quantity) from Purchase!! select sum(quantity) from Purchase where quantity is not null;! 11

Counting Duplicates COUNT applies to duplicates, unless otherwise stated: SELECT Count(product) FROM Purchase WHERE price > 4.99 same as Count(*) We probably want: SELECT Count(DISTINCT product) FROM Purchase WHERE price> 4.99 CSE 414 - Spring 2013 12

More Examples SELECT Sum(price * quantity) FROM Purchase SELECT Sum(price * quantity) FROM Purchase WHERE product = bagel What do they mean? CSE 414 - Spring 2013 13

Simple Aggregations Purchase Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 SELECT Sum(price * quantity) FROM Purchase WHERE product = Bagel CSE 414 - Spring 2013 90 (= 60+30) 14

Grouping and Aggregation Purchase(product, price, quantity) Find total quantities for all sales over $1, by product. SELECT product, Sum(quantity) AS TotalSales FROM Purchase WHERE price > 1 GROUP BY product Let s see what this means CSE 414 - Spring 2013 15

Grouping and Aggregation 1. Compute the FROM and WHERE clauses. 2. Group by the attributes in the GROUPBY 3. Compute the SELECT clause: grouped attributes and aggregates. CSE 414 - Spring 2013 16

1&2. FROM-WHERE-GROUPBY Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 WHERE price > 1 CSE 414 - Spring 2013 17

3. SELECT Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 Product TotalSales Bagel 40 Banana 20 SELECT product, Sum(quantity) AS TotalSales FROM Purchase WHERE price > 1 GROUP BY product 18

Other Examples Compare these two queries: SELECT product, count(*) FROM Purchase GROUP BY product SELECT month, count(*) FROM Purchase GROUP BY month SELECT product, sum(quantity) AS SumQuantity, max(price) AS MaxPrice FROM Purchase GROUP BY product CSE 414 - Spring 2013 What does it mean? 19

Need to be Careful SELECT product, max(quantity) FROM Purchase GROUP BY product SELECT product, quantity FROM Purchase GROUP BY product sqlite is WRONG on this query. Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 Advanced DBMS (e.g. SQL Server) gives an error CSE 414 - Spring 2013 20

Ordering Results SELECT product, sum(price*quantity) as rev FROM purchase GROUP BY product ORDER BY rev desc CSE 414 - Spring 2013 21

HAVING Clause Same query as earlier, except that we consider only products that had at least 30 sales. SELECT product, sum(price*quantity) FROM Purchase WHERE price > 1 GROUP BY product HAVING Sum(quantity) > 30 HAVING clause contains conditions on aggregates. CSE 414 - Spring 2013 22

WHERE vs HAVING WHERE condition is applied to individual rows The rows may or may not contribute to the aggregate No aggregates allowed here HAVING condition is applied to the entire group Entire group is returned, or not al all May use aggregate functions in the group CSE 414 - Spring 2013 23

Aggregates and Joins create table Product (pid int primary key, pname varchar(15), manufacturer varchar(15));!! insert into product values(1,'bagel,'sunshine Co.');! insert into product values(2,'banana,'busyhands');! insert into product values(3,'gizmo,'gizmoworks');! insert into product values(4,'gadget,'busyhands');! insert into product values(5,'powergizmo,'powerworks');! CSE 414 - Spring 2013 24

Aggregate + Join Example SELECT x.manufacturer, count(*) FROM Product x, Purchase y WHERE x.pname = y.product GROUP BY x.manufacturer What do these query mean? SELECT x.manufacturer, y.month, count(*) FROM Product x, Purchase y WHERE x.pname = y.product GROUP BY x.manufacturer, y.month CSE 414 - Spring 2013 25

General form of Grouping and Aggregation SELECT S FROM R 1,,R n WHERE C1 GROUP BY a 1,,a k HAVING C2 Why? S = may contain attributes a 1,,a k and/or any aggregates but NO OTHER ATTRIBUTES C1 = is any condition on the attributes in R 1,,R n C2 = is any condition on aggregate expressions and on attributes a 1,,a k CSE 414 - Spring 2013 26

Semantics of SQL With Group-By SELECT S FROM R 1,,R n WHERE C1 GROUP BY a 1,,a k HAVING C2 Evaluation steps: 1. Evaluate FROM-WHERE using Nested Loop Semantics 2. Group by the attributes a 1,,a k 3. Apply condition C2 to each group (may have aggregates) 4. Compute aggregates in S and return the result CSE 414 - Spring 2013 27

Empty Groups In the result of a group by query, there is one row per group in the result No group can be empty! In particular, count(*) is never 0 SELECT x.manufacturer, count(*) FROM Product x, Purchase y WHERE x.pname = y.product GROUP BY x.manufacturer What if there are no purchases for a manufacturer CSE 414 - Spring 2013 28

Empty Groups: Example SELECT product, count(*) FROM purchase GROUP BY product SELECT product, count(*) FROM purchase WHERE price > 2.0 GROUP BY product 5 groups in our example dataset 3 groups in our example dataset CSE 414 - Spring 2013 29

Empty Group Problem SELECT x.manufacturer, count(*) FROM Product x, Purchase y WHERE x.pname = y.product GROUP BY x.manufacturer What if there are no purchases for a manufacturer CSE 414 - Spring 2013 30

Empty Group Solution: Outer Join SELECT x.manufacturer, count(y.pid) FROM Product x LEFT OUTER JOIN Purchase y ON x.pname = y.product GROUP BY x.manufacturer CSE 414 - Spring 2013 31