MIS2502: Data Analytics SQL Getting Information Out of a Database Part 1: Basic Queries

JaeHwuen Jung
jaejung@temple.edu
http://community.mis.temple.edu/jaejung

Where we are

Now we're here.

[Diagram] Data entry -> Transactional Database -> Data extraction -> Analytical Data Store -> Data analysis

Transactional Database: stores real-time transactional data
Analytical Data Store: stores historical transactional and summary data

What do we want to do?

[Diagram] The Database Management System sits between us and the database:
- Get information out of the database (retrieve)
- Put information into the database (modify/change)

To do this we use SQL

Structured Query Language (SQL): a high-level set of statements (commands) that let you communicate with the database.

With SQL, you can:
- Retrieve records
- Join (combine) tables
- Insert records
- Delete records
- Update records
- Add and delete tables

This part covers retrieving records.

A statement is any SQL command that interacts with a database. A SQL statement that retrieves information is referred to as a query.

Some points about SQL

- It's not a typical programming language. It can be used by programming languages to interact with databases.
- There is no single syntax that every product follows exactly. A SQL standard exists, but MySQL, Oracle, SQL Server, and Access all have slight differences.
- There are a lot of statements, and many variations among them.
- This is a great online reference for SQL syntax: http://www.w3schools.com/sql
- Here's the one specifically for MySQL, but it's not as well written: http://dev.mysql.com/doc/refman/5.6/en/sql-syntax.html
- We will cover only the basics, but they are the most important ones.

Connecting to a MySQL server

Click on the plus sign next to MySQL Connections to create a new connection.

At the Setup New Connection dialog, fill in the information as follows:

    Connection Name:    mis2502
    Hostname:           dataanalytics.temple.edu
    Username/Password:  the ones given to you by the instructor

The username and password will be available on Canvas. Under Grades, click MySQL ID/PW, and your username/password will appear as a comment. The first value, which starts with m, is the username; the second value is the password.

The MySQL Workbench interface

[Screenshot: the SQL Query panel and the Overview tablesheet (the database schemas)]

How many tables are in the m0orderdb schema?

SELECT statement

The SELECT statement is used to select data from a database.

Syntax:

    SELECT column_name(s) FROM schema_name.table_name;

A column is a table field that you would like to select from the table.
A schema is a collection of tables. It is, essentially, the database.

It's good practice to end every statement with a semicolon, especially when entering multiple statements.

SELECT statement

Suppose we have a schema named orderdb. We want to select the first names from the Customer table.

Customer:

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

This is done using the SELECT statement:

    SELECT FirstName FROM orderdb.customer;

Returns:

    FirstName
    Greg
    Lisa
    James
    Eric

This returns the FirstName column for every row in the Customer table.
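To try this query outside the course server, the Customer table can be rebuilt in a scratch schema. The following is a minimal sketch, assuming a MySQL server on which you are allowed to create schemas. The schema name orderdb_demo and all column types are assumptions; the slides show only column names and values.

```sql
-- Assumption: you may create schemas on your MySQL server.
-- orderdb_demo is a hypothetical scratch schema, so the course's orderdb is untouched.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.customer;

-- Column types are guesses. Zip is stored as text so the leading zero in 09120 survives.
CREATE TABLE orderdb_demo.customer (
    CustomerID INT PRIMARY KEY,
    FirstName  VARCHAR(30),
    LastName   VARCHAR(30),
    City       VARCHAR(30),
    State      CHAR(2),
    Zip        CHAR(5)
);

INSERT INTO orderdb_demo.customer
    (CustomerID, FirstName, LastName, City, State, Zip)
VALUES
    (1001, 'Greg',  'House',   'Princeton',  'NJ', '09120'),
    (1002, 'Lisa',  'Cuddy',   'Plainsboro', 'NJ', '09123'),
    (1003, 'James', 'Wilson',  'Pittsgrove', 'NJ', '09121'),
    (1004, 'Eric',  'Foreman', 'Warminster', 'PA', '19111');

-- One column, every row: Greg, Lisa, James, Eric.
SELECT FirstName FROM orderdb_demo.customer;
```

Without an ORDER BY clause the database does not promise any particular row order, so the four names may come back in a different sequence.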

Retrieving multiple columns

    SELECT FirstName, State FROM orderdb.customer;

Returns:

    FirstName  State
    Greg       NJ
    Lisa       NJ
    James      NJ
    Eric       PA

The * means return every column.

    SELECT * FROM orderdb.customer;

Returns:

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

Capitalization and spacing

SQL keywords are not case-sensitive, and extra spaces between words are ignored. All three statements below are correct:

    SELECT FirstName FROM orderdb.customer;          (correct, best practice)
    select firstname from orderdb.customer;          (correct)
    SELECT   FirstName   FROM   orderdb.customer;    (correct)

Best practice:
- Write all SQL keywords (e.g., SELECT and FROM) in upper case.
- Use spaces appropriately for readability.

Retrieving unique values

SELECT DISTINCT returns only distinct (different) values.

    SELECT DISTINCT State FROM orderdb.customer;

Returns:

    State
    NJ
    PA

    SELECT DISTINCT City, State FROM orderdb.customer;

Returns:

    City        State
    Princeton   NJ
    Plainsboro  NJ
    Pittsgrove  NJ
    Warminster  PA

In this case, each combination of City AND State is unique, so it returns all of them.
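The slide's data has no repeated City and State pair, so the effect of DISTINCT on two columns is easier to see with one duplicate added. This is a sketch under stated assumptions: the scratch schema orderdb_demo, the column types, and customer 1005 are all hypothetical and do not appear in the slides.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.customer;

-- Only the columns this example needs; types are guesses.
CREATE TABLE orderdb_demo.customer (
    CustomerID INT PRIMARY KEY,
    City       VARCHAR(30),
    State      CHAR(2)
);

INSERT INTO orderdb_demo.customer (CustomerID, City, State) VALUES
    (1001, 'Princeton',  'NJ'),
    (1002, 'Plainsboro', 'NJ'),
    (1003, 'Pittsgrove', 'NJ'),
    (1004, 'Warminster', 'PA'),
    (1005, 'Princeton',  'NJ');  -- hypothetical extra row, not in the slides

-- Five rows in the table, but only two different states: NJ and PA.
SELECT DISTINCT State FROM orderdb_demo.customer;

-- Four rows come back: the two Princeton/NJ rows collapse into one.
SELECT DISTINCT City, State FROM orderdb_demo.customer;
```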

Returning only certain records

Sometimes we want to filter records. We use the WHERE clause to specify criteria.

Syntax:

    SELECT * FROM schema_name.table_name WHERE condition;

Example (Customer table):

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

Let's retrieve only those customers who live in New Jersey.

    SELECT * FROM orderdb.customer WHERE State = 'NJ';

Returns:

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121

More conditional statements

    SELECT * FROM orderdb.customer WHERE State <> 'NJ';

Returns:

    CustomerID  FirstName  LastName  City        State  Zip
    1004        Eric       Foreman   Warminster  PA     19111

The <> means not equal to.

    SELECT * FROM orderdb.product WHERE Price > 2;

Returns:

    ProductID  ProductName   Price
    2251       Cheerios      3.99
    2505       Eggo Waffles  2.99

Text Fields vs. Numeric Fields
- Put single quotes around string (non-numeric) values. For example, 'NJ'.
- The quotes are optional for numeric values.
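The quoting rule can be checked against a scratch copy of the Product table. A minimal sketch follows; the schema name orderdb_demo and the column types are assumptions, and the three products are the ones shown later in the deck.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.product;

-- Column types are guesses; the slides give only names and values.
CREATE TABLE orderdb_demo.product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(30),
    Price       DECIMAL(6,2)
);

INSERT INTO orderdb_demo.product (ProductID, ProductName, Price) VALUES
    (2251, 'Cheerios',     3.99),
    (2282, 'Bananas',      1.29),
    (2505, 'Eggo Waffles', 2.99);

-- Text value: single quotes required. Returns Cheerios and Eggo Waffles.
SELECT * FROM orderdb_demo.product WHERE ProductName <> 'Bananas';

-- Numeric value: no quotes needed. Returns Cheerios (3.99) and Eggo Waffles (2.99).
SELECT * FROM orderdb_demo.product WHERE Price > 2;
```

Leaving the quotes off a text value (WHERE ProductName <> Bananas) makes MySQL look for a column named Bananas, which fails here because no such column exists.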

Operators in the WHERE Clause

The following operators can be used in the WHERE clause:

    Operator  Description
    =         Equal to
    >         Greater than
    >=        Greater than or equal to
    <         Less than
    <=        Less than or equal to
    <>        Not equal to

More conditional statements: AND & OR Operators

    SELECT * FROM orderdb.product WHERE Price > 2 AND Price <= 3.5;

Returns:

    ProductID  ProductName   Price
    2505       Eggo Waffles  2.99

The AND operator displays a record if both the first condition AND the second condition are true.

    SELECT * FROM orderdb.customer WHERE City = 'Princeton' OR City = 'Pittsgrove';

Returns:

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1003        James      Wilson    Pittsgrove  NJ     09121

The OR operator displays a record if either the first condition OR the second condition is true.
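When AND and OR appear in the same WHERE clause, AND is evaluated before OR unless parentheses say otherwise. The slides do not cover this, so treat the sketch below as a supplementary illustration. The schema name orderdb_demo and the column types are assumptions.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.product;

CREATE TABLE orderdb_demo.product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(30),
    Price       DECIMAL(6,2)   -- type is a guess
);

INSERT INTO orderdb_demo.product (ProductID, ProductName, Price) VALUES
    (2251, 'Cheerios',     3.99),
    (2282, 'Bananas',      1.29),
    (2505, 'Eggo Waffles', 2.99);

-- Both conditions must hold: only Eggo Waffles (2.99).
SELECT * FROM orderdb_demo.product WHERE Price > 2 AND Price <= 3.5;

-- Either condition may hold: Bananas (1.29) and Cheerios (3.99).
SELECT * FROM orderdb_demo.product WHERE Price < 2 OR Price > 3.5;

-- AND binds first, so this reads: Bananas, OR (Cheerios AND Price > 3).
-- Returns Bananas and Cheerios.
SELECT * FROM orderdb_demo.product
WHERE ProductName = 'Bananas' OR ProductName = 'Cheerios' AND Price > 3;

-- Parentheses force the OR first. Returns Cheerios only.
SELECT * FROM orderdb_demo.product
WHERE (ProductName = 'Bananas' OR ProductName = 'Cheerios') AND Price > 3;
```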

Sorting using ORDER BY

    SELECT * FROM orderdb.product WHERE Price > 2 ORDER BY Price;

Returns:

    ProductID  ProductName   Price
    2505       Eggo Waffles  2.99
    2251       Cheerios      3.99

ORDER BY sorts results from lowest to highest based on a field (in this case, Price).

ORDER BY ASC and DESC

    SELECT * FROM orderdb.product WHERE Price > 2 ORDER BY Price DESC;

Returns:

    ProductID  ProductName   Price
    2251       Cheerios      3.99
    2505       Eggo Waffles  2.99

DESC forces the results to be sorted in DESCending order.

    SELECT * FROM orderdb.product WHERE Price > 2 ORDER BY Price ASC;

Returns:

    ProductID  ProductName   Price
    2505       Eggo Waffles  2.99
    2251       Cheerios      3.99

ASC forces the results to be sorted in ASCending order.
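ORDER BY works on text columns as well as numeric ones. The sketch below sorts the full Product table both ways; the schema name orderdb_demo and the column types are assumptions.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.product;

CREATE TABLE orderdb_demo.product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(30),
    Price       DECIMAL(6,2)   -- type is a guess
);

INSERT INTO orderdb_demo.product (ProductID, ProductName, Price) VALUES
    (2251, 'Cheerios',     3.99),
    (2282, 'Bananas',      1.29),
    (2505, 'Eggo Waffles', 2.99);

-- Highest price first: Cheerios, Eggo Waffles, Bananas.
SELECT * FROM orderdb_demo.product ORDER BY Price DESC;

-- Alphabetical by name (ASC is the default): Bananas, Cheerios, Eggo Waffles.
SELECT * FROM orderdb_demo.product ORDER BY ProductName;
```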

SQL Functions

SQL has many built-in functions for performing calculations:

    COUNT()  Returns the number of rows
    MAX()    Returns the largest value
    MIN()    Returns the smallest value
    AVG()    Returns the average value
    SUM()    Returns the sum

Functions: Counting records

    SELECT COUNT(FirstName) FROM orderdb.customer;

Returns: 4

This is the total number of records in the table where the field is not empty (that is, missing values will not be counted). Don't forget the parentheses!

    SELECT COUNT(CustomerID) FROM orderdb.customer;

Returns: 4

Why is this the same number as the previous query?

    SELECT COUNT(*) FROM orderdb.customer;

What number would be returned?

What if there is missing data?

Customer (the first name of customer 1001 is missing):

    CustomerID  FirstName  LastName  City        State  Zip
    1001                   House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

    SELECT COUNT(FirstName) FROM orderdb.customer;     returns 3
    SELECT COUNT(CustomerID) FROM orderdb.customer;    returns 4
    SELECT COUNT(*) FROM orderdb.customer;             returns 4

If missing data are possible, it is best to count using the primary key (e.g., COUNT(CustomerID)), or use COUNT(*).
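The three counts can be compared side by side in a single query. This sketch assumes the missing first name is stored as NULL, which is how SQL represents a missing value. The schema name orderdb_demo and the column types are also assumptions.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.customer;

-- Only the columns this example needs; types are guesses.
CREATE TABLE orderdb_demo.customer (
    CustomerID INT PRIMARY KEY,
    FirstName  VARCHAR(30)
);

INSERT INTO orderdb_demo.customer (CustomerID, FirstName) VALUES
    (1001, NULL),      -- the missing first name
    (1002, 'Lisa'),
    (1003, 'James'),
    (1004, 'Eric');

-- COUNT(column) skips NULLs; COUNT(*) counts rows. Returns 3, 4, 4.
SELECT COUNT(FirstName), COUNT(CustomerID), COUNT(*)
FROM orderdb_demo.customer;
```

An empty string ('') is not NULL in MySQL, so a first name stored as '' would still be counted by COUNT(FirstName).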

Functions: Retrieving highest, lowest, average, and sum

Product:

    ProductID  ProductName   Price
    2251       Cheerios      3.99
    2282       Bananas       1.29
    2505       Eggo Waffles  2.99

    SELECT MAX(Price) FROM orderdb.product;    returns 3.99
    SELECT MIN(Price) FROM orderdb.product;    returns 1.29
    SELECT AVG(Price) FROM orderdb.product;    returns approximately 2.757 (8.27 / 3)
    SELECT SUM(Price) FROM orderdb.product;    returns 8.27
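All four functions can also be requested in one statement. The sketch below adds ROUND(), a standard function the slides do not cover, to trim the average to two decimals. The schema name orderdb_demo and the column types are assumptions.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.product;

CREATE TABLE orderdb_demo.product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(30),
    Price       DECIMAL(6,2)   -- type is a guess
);

INSERT INTO orderdb_demo.product (ProductID, ProductName, Price) VALUES
    (2251, 'Cheerios',     3.99),
    (2282, 'Bananas',      1.29),
    (2505, 'Eggo Waffles', 2.99);

-- One row back: highest 3.99, lowest 1.29, sum 8.27,
-- and the average 8.27 / 3 rounded to 2.76.
SELECT MAX(Price), MIN(Price), SUM(Price), ROUND(AVG(Price), 2)
FROM orderdb_demo.product;
```

How many decimals an unrounded AVG(Price) displays depends on the column type and the database, which is why the rounded form is used here.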

What if we want to arrange records in groups?

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

How do we find the number of customers in each state?

GROUP BY

    SELECT State, COUNT(FirstName) FROM orderdb.customer GROUP BY State;

Returns:

    State  COUNT(FirstName)
    NJ     3
    PA     1

The query looks for unique State values and then counts the number of records for each of those values.

GROUP BY is usually used in conjunction with the aggregate functions (COUNT, MAX, MIN, AVG, SUM) to group the results by one or more columns.

Another GROUP BY

Order-Product:

    OrderProductID  OrderNumber  ProductID  Quantity
    1               101          2251       2
    2               101          2282       3
    3               101          2505       1
    4               102          2251       5
    5               102          2282       2
    6               103          2505       3
    7               104          2505       8

Ask: What is the total quantity sold per product?

    SELECT ProductID, SUM(Quantity) FROM orderdb.orderproduct GROUP BY ProductID;

Returns:

    ProductID  SUM(Quantity)
    2251       7
    2282       5
    2505       12
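The same table can be grouped by a different column to answer a different question. A minimal sketch follows, assuming a scratch schema named orderdb_demo and integer column types, neither of which is given in the slides.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.orderproduct;

-- Column types are guesses.
CREATE TABLE orderdb_demo.orderproduct (
    OrderProductID INT PRIMARY KEY,
    OrderNumber    INT,
    ProductID      INT,
    Quantity       INT
);

INSERT INTO orderdb_demo.orderproduct
    (OrderProductID, OrderNumber, ProductID, Quantity)
VALUES
    (1, 101, 2251, 2),
    (2, 101, 2282, 3),
    (3, 101, 2505, 1),
    (4, 102, 2251, 5),
    (5, 102, 2282, 2),
    (6, 103, 2505, 3),
    (7, 104, 2505, 8);

-- Per product: total quantity and number of order lines.
-- 2251: 7 and 2;  2282: 5 and 2;  2505: 12 and 3.
SELECT ProductID, SUM(Quantity), COUNT(*)
FROM orderdb_demo.orderproduct
GROUP BY ProductID;

-- Per order: total quantity. 101: 6;  102: 7;  103: 3;  104: 8.
SELECT OrderNumber, SUM(Quantity)
FROM orderdb_demo.orderproduct
GROUP BY OrderNumber;
```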

Back quotes?

We surround schema, table, or column names with back quotes (in the form of `name`) when the name:
1) is a SQL reserved word
2) contains blank spaces or special characters

Where is the back quote key on the keyboard?

Back quotes for reserved words

When the table/column name is a reserved word:

    SELECT * FROM orderdb.`order`;

Order is a reserved word in SQL. It is a command, as in ORDER BY. The back quotes tell MySQL to treat `Order` as a database object and not a command.

For a list of reserved words in MySQL, go to: http://dev.mysql.com/doc/refman/5.1/en/reserved-words.html

Back quotes for space or special characters

When the schema/table/column name contains spaces or special characters:

    SELECT * FROM orderdb.`order-product`;
    SELECT `Last Name` FROM hospitaldb.patient;
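Both back-quote cases can be tried on a small scratch table. In this sketch the schema name orderdb_demo, the column `Order Date`, its type, and the sample date are all hypothetical; only the table name `order` comes from the slides.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.`order`;

-- `order` is a reserved word; `Order Date` contains a space.
-- Both need back quotes. The column and its type are hypothetical.
CREATE TABLE orderdb_demo.`order` (
    OrderNumber  INT PRIMARY KEY,
    `Order Date` DATE
);

INSERT INTO orderdb_demo.`order` (OrderNumber, `Order Date`) VALUES
    (101, '2024-01-15');   -- made-up sample date

-- Reserved word as a table name.
SELECT * FROM orderdb_demo.`order`;

-- Column name containing a space.
SELECT `Order Date` FROM orderdb_demo.`order`;

-- Without back quotes and without a schema prefix, MySQL reads ORDER as
-- the start of ORDER BY and reports a syntax error:
--   SELECT * FROM order;
```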

Counting and sorting

    SELECT ProductID, SUM(Quantity) FROM orderdb.orderproduct GROUP BY ProductID ORDER BY SUM(Quantity);

Returns:

    ProductID  SUM(Quantity)
    2282       5
    2251       7
    2505       12

GROUP BY organizes the results by column values.
ORDER BY sorts results from lowest to highest based on SUM(Quantity).

Combining WHERE and COUNT

    SELECT COUNT(FirstName) FROM orderdb.customer WHERE State = 'NJ';

Returns: 3
Asks: How many customers live in New Jersey?

    SELECT COUNT(ProductName) FROM orderdb.product WHERE Price < 3;

Returns: 2
Asks: How many products cost less than $3?

Review: Does it matter which field in the table you use in the SELECT COUNT query?

WHERE, GROUP BY, and ORDER BY

Recall the Customer table:

    CustomerID  FirstName  LastName  City        State  Zip
    1001        Greg       House     Princeton   NJ     09120
    1002        Lisa       Cuddy     Plainsboro  NJ     09123
    1003        James      Wilson    Pittsgrove  NJ     09121
    1004        Eric       Foreman   Warminster  PA     19111

Ask: How many customers are there in each city in New Jersey? Sort the results alphabetically by city.

One more note: Combining WHERE, GROUP BY, and ORDER BY

This is the correct SQL statement:

    SELECT City, COUNT(*) FROM orderdb.customer WHERE State = 'NJ' GROUP BY City ORDER BY City ASC;

Returns:

    City        COUNT(*)
    Pittsgrove  1
    Plainsboro  1
    Princeton   1

This won't work, because WHERE comes after GROUP BY and ORDER BY:

    SELECT City, COUNT(*) FROM orderdb.customer GROUP BY City ORDER BY City ASC WHERE State = 'NJ';

When combining WHERE, GROUP BY, and ORDER BY, write the WHERE condition first, then GROUP BY, then ORDER BY.
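With the slide's data every city has a count of 1, so a duplicate city makes the grouping easier to see. This is a sketch under stated assumptions: the scratch schema orderdb_demo, the column types, and customer 1005 are hypothetical.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.customer;

-- Only the columns this example needs; types are guesses.
CREATE TABLE orderdb_demo.customer (
    CustomerID INT PRIMARY KEY,
    City       VARCHAR(30),
    State      CHAR(2)
);

INSERT INTO orderdb_demo.customer (CustomerID, City, State) VALUES
    (1001, 'Princeton',  'NJ'),
    (1002, 'Plainsboro', 'NJ'),
    (1003, 'Pittsgrove', 'NJ'),
    (1004, 'Warminster', 'PA'),
    (1005, 'Princeton',  'NJ');  -- hypothetical extra row, not in the slides

-- Clause order: WHERE filters rows, GROUP BY forms groups, ORDER BY sorts them.
-- Returns: Pittsgrove 1, Plainsboro 1, Princeton 2 (Warminster is filtered out).
SELECT City, COUNT(*)
FROM orderdb_demo.customer
WHERE State = 'NJ'
GROUP BY City
ORDER BY City ASC;
```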

Summary: The full syntax for SELECT

    SELECT [DISTINCT] expression(s)
    FROM schema_name.table_name(s)
    [WHERE condition(s)]
    [GROUP BY expression(s)]
    [ORDER BY expression(s) [ASC | DESC]]
    [LIMIT number_rows];

The [] means the element is optional.

    Element                    Description
    expression(s)              The column(s) or function(s) that you wish to retrieve.
    schema_name.table_name(s)  The table(s) that you wish to retrieve records from.
    DISTINCT                   Optional. Return unique values.
    WHERE condition(s)         Optional. The conditions that must be met for the records to be selected.
    GROUP BY expression(s)     Optional. Organize the results by column values.
    ORDER BY expression(s)     Optional. Sort the records in your result set.
    LIMIT number_rows          Optional. Restrict the maximum number of records to retrieve.
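LIMIT is the one clause in the summary that has no example elsewhere in the deck. The sketch below combines it with WHERE and ORDER BY to pick the two most expensive products; the schema name orderdb_demo and the column types are assumptions.

```sql
-- Assumption: a MySQL server where you may create the scratch schema orderdb_demo.
CREATE SCHEMA IF NOT EXISTS orderdb_demo;
DROP TABLE IF EXISTS orderdb_demo.product;

CREATE TABLE orderdb_demo.product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(30),
    Price       DECIMAL(6,2)   -- type is a guess
);

INSERT INTO orderdb_demo.product (ProductID, ProductName, Price) VALUES
    (2251, 'Cheerios',     3.99),
    (2282, 'Bananas',      1.29),
    (2505, 'Eggo Waffles', 2.99);

-- All three products cost more than 1; sorted highest first, LIMIT keeps two.
-- Returns Cheerios (3.99), then Eggo Waffles (2.99).
SELECT ProductName, Price
FROM orderdb_demo.product
WHERE Price > 1
ORDER BY Price DESC
LIMIT 2;
```

LIMIT is applied last, after sorting, so pairing it with ORDER BY is what makes "top N" queries give a predictable answer.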

In Class Activity #4