CMPT 354: Database System I. Lecture 3. SQL Basics

Similar documents
CMPT 354: Database System I. Lecture 2. Relational Model

Lecture 2: Introduction to SQL

CMPT 354: Database System I. Lecture 4. SQL Advanced

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

CMPT 354: Database System I. Lecture 1. Course Introduction

CMPT 354: Database System I. Lecture 7. Basics of Query Optimization

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Lecture 3: SQL Part II

CMPT 354: Database System I. Lecture 5. Relational Algebra

Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

Review. The Relational Model. Glossary. Review. Data Models. Why Study the Relational Model? Why use a DBMS? OS provides RAM and disk

The Relational Model. Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC)

Announcements (September 14) SQL: Part I SQL. Creating and dropping tables. Basic queries: SFW statement. Example: reading a table

Polls on Piazza. Open for 2 days Outline today: Next time: "witnesses" (traditionally students find this topic the most difficult)

Review: Where have we been?

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

Database Languages. A DBMS provides two types of languages: Language for accessing & manipulating the data. Language for defining a database schema

Simple queries Set operations Aggregate operators Null values Joins Query Optimization. John Edgar 2

DS Introduction to SQL Part 2 Multi-table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

SQL: The Query Language Part 1. Relational Query Languages

The SQL data-definition language (DDL) allows defining :

The Relational Data Model. Data Model

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

CMPT 354: Database System I. Lecture 11. Transaction Management

Relational data model

Administrivia. The Relational Model. Review. Review. Review. Some useful terms

CSC 261/461 Database Systems Lecture 13. Fall 2017

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

Keys, SQL, and Views CMPSCI 645

Chapter 3: Introduction to SQL

Database Applications (15-415)

CSEN 501 CSEN501 - Databases I

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

Database Management Systems. Chapter 3 Part 1

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Chapter 3: Introduction to SQL

Database Applications (15-415)

The Relational Model of Data (ii)

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

CompSci 516 Database Systems. Lecture 2 SQL. Instructor: Sudeepa Roy

The Relational Model. Week 2

The Relational Model

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

Database Systems ( 資料庫系統 )

SQL: Part II. Announcements (September 18) Incomplete information. CPS 116 Introduction to Database Systems. Homework #1 due today (11:59pm)

Lecture 16. The Relational Model

Announcements (September 18) SQL: Part II. Solution 1. Incomplete information. Solution 3? Solution 2. Homework #1 due today (11:59pm)

WHAT IS SQL. Database query language, which can also: Define structure of data Modify data Specify security constraints

The Relational Model. Chapter 3

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Simple SQL Queries (2)

EECS 647: Introduction to Database Systems

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

Lecture 3 SQL. Shuigeng Zhou. September 23, 2008 School of Computer Science Fudan University

Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Today s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls

Part I: Structured Data

Announcements If you are enrolled to the class, but have not received the from Piazza, please send me an . Recap: Lecture 1.

CS 582 Database Management Systems II

Relational Algebra and SQL

The Relational Model 2. Week 3

The Relational Model. Roadmap. Relational Database: Definitions. Why Study the Relational Model? Relational database: a set of relations

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL

COMP 244 DATABASE CONCEPTS AND APPLICATIONS

Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU /615

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

QQ Group

SQL. Chapter 5 FROM WHERE

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Lecture 6 - More SQL

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

COMP 244 DATABASE CONCEPTS & APPLICATIONS

Chapter 4 SQL. Database Systems p. 121/567

Chapter 6 The database Language SQL as a tutorial

Ø Last Thursday: String manipulation & Regular Expressions Ø guest lecture from the amazing Sam Lau Ø reviewed in section and in future labs & HWs

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Database Systems CSE 303. Lecture 02

Database Systems CSE 303. Lecture 02

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream

SQL - Lecture 3 (Aggregation, etc.)

SQL Part 2. Kathleen Durant PhD Northeastern University CS3200 Lesson 6

SQL. Lecture 4 SQL. Basic Structure. The select Clause. The select Clause (Cont.) The select Clause (Cont.) Basic Structure.

SQL: Data Manipulation Language. csc343, Introduction to Databases Diane Horton Winter 2017

SQL: csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Sina Meraji. Winter 2018

Database Systems SQL SL03

Modern Database Systems Lecture 1

Using a DBMS. Shan-Hung Wu & DataLab CS, NTHU

Announcements. Relational Model & Algebra. Example. Relational data model. Example. Schema versus instance. Lecture notes

Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations. SQL: Structured Query Language

SQL - Data Query language

The Relational Model

Database Systems SQL SL03

L3 The Relational Model. Eugene Wu Fall 2018

Page 1. Quiz 18.1: Flow-Control" Goals for Today" Quiz 18.1: Flow-Control" CS162 Operating Systems and Systems Programming Lecture 18 Transactions"

Transcription:

CMPT 354: Database System I Lecture 3. SQL Basics 1

Announcements! About Piazza 97 enrolled (as of today) Posts are anonymous to classmates You should have started doing A1 Please come to office hours if you need any help 2

SQL Motivation Dark times in 2000s Are relational databases dead? Now, as before: everyone sells SQL Pig, Hive, Impala SparkSQL NoSQL Non SQL Not-Only-SQL Not-Yet-SQL 3

SQL: Introduction S.Q.L. or sequel Supported by all major commercial database systems Standardized many new features over time Declarative language 4

SQL is a Data Definition Language (DDL) Define relational schema Create/alter/delete tables and their attributes Data Manipulation Language (DML) Insert/delete/modify tuples in tables Query one or more tables discussed next! 5

Outline Single-table Queries The SFW query Useful operators: DISTINCT, ORDER BY, LIKE Handle missing values: NULLs Multiple-table Queries Foreign key constraints Joins: basics Joins: SQL semantics 6

The SFW Query SELECT <columns> FROM <table name> WHERE <conditions> To write the query, ask yourself three questions: Which table are you interested in? Which rows are you interested in? Which columns are you interested in? 7

Conditions SELECT <columns> FROM <table name> WHERE <conditions> Which rows are you interested in? WHERE gpa > 3.5 WHERE school = SFU AND gpa > 3.5 WHERE (school = SFU OR school = UBC ) AND gpa > 3.5 WHERE age * 365 > 7500 8

Columns SELECT <columns> FROM <table name> WHERE <conditions> Which columns are you interested in? SELECT * SELECT name, age SELECT name as studentname, age SELECT name, age * 365 as ageday 9

A Few Details SQL commands are case insensitive: Same: SELECT, Select, select Same: Student, student Same: gpa, GPA Values are not: Different: 'SFU', 'sfu' SQL strings are enclosed in single quotes e.g. name = 'Mike Single quotes in a string can be specified using an initial single quote character as an escape author = 'Shaq O''Neal' Strings can be compared alphabetically with the comparison operators e.g. 'fodder' < 'foo' is TRUE 10

DISTINCT: Eliminating Duplicates SELECT School FROM Students Versus School SFU SFU UBC UT UT SELECT DISTINCT School FROM Students School SFU UBC UT 11

ORDER BY: Sorting the Results SELECT name, gpa, age FROM Students WHERE school = 'SFU' ORDER BY gpa DESC, age ASC The output of an SQL query can be ordered By any number of attributes, and In either ascending or descending order The default is to use ascending order, the keywords ASC and DESC, following the column name, sets the order 12

LIKE: Simple String Pattern Matching SELECT * FROM Students WHERE name LIKE ' Sm_t%' SQL provides pattern matching support with the LIKE operator and two symbols The % symbol stands for zero or more arbitrary characters The _ symbol stands for exactly one arbitrary character The % and _ characters can be escaped with \ E.g., name LIKE Michael\_Jordan' 13

Exercise - 1 Which names will be returned? SELECT * FROM Students WHERE name LIKE 'Sm_t%' 1. Smit 2. SMIT 3. Smart 4. Smith 5. Smythe 6. Smut 7. Smeath 8. Smt 14

Exercise - 1 Which names will be returned? SELECT * FROM Students WHERE name LIKE 'Sm_t%' 1. Smit 2. SMIT 3. Smart 4. Smith 5. Smythe 6. Smut 7. Smeath 8. Smt 1, 4, 5, 6 15

NULLS in SQL Whenever we don t have a value, we can put a NULL Can mean many things: Value does not exists Value exists but is unknown Value not applicable Etc. NULL constraints CREATE TABLE Students ( name CHAR(20) NOT NULL, age CHAR(20) NOT NULL, gpa FLOAT ) 16

What will happen? name age gpa Mike 20 4.0 Joe 18 NULL Alice 21 3.8 1. SELECT gpa*100 FROM students 2. SELECT name FROM students WHERE gpa > 3.5 3. SELECT name FROM students WHERE age > 15 OR gpa > 3.5 17

Two Important Rules Arithmetic operations (+, -, *, /) on nulls return NULL NULL * 100 NULL NULL * 0 NULL Comparisons with nulls evaluate to UNKNOWN NULL > 3.5 UNKNOWN NULL = NULL UNKNOWN 1. SELECT gpa*100 FROM students 2. SELECT gpa*0 FROM students 3. SELECT name FROM students WHERE gpa > 3.5 4. SELECT name FROM students WHERE gpa = NULL 18

Combinations of true, false, unknown Truth values for unknown results true OR unknown = true, false OR unknown = unknown, unknown OR unknown = unknown, true AND unknown = unknown, false AND unknown = false, unknown AND unknown = unknown SELECT * FROM students WHERE age > 15 OR gpa > 3.5 SELECT * FROM students WHERE age > 15 AND gpa > 3.5 The result of a WHERE clause is treated as false if it evaluates to unknown WHERE unknown à false 19

What will happen? name age gpa Mike 20 4.0 Joe 18 NULL Alice 21 3.8 1. SELECT gpa*100 FROM students 2. SELECT name FROM students WHERE gpa > 3.5 3. SELECT name FROM students WHERE age > 15 OR gpa > 3.5 gpa 400 NULL 380 name Mike Alice name Mike Joe Alice 20

Exercise - 2 Will it return all students? SELECT * FROM Students WHERE age < 25 OR age >= 25 21

Exercise - 2 Will it return all students? SELECT * FROM Students WHERE age < 25 OR age >= 25 OR age is NULL There are special operators to test for null values IS NULL tests for the presence of nulls and IS NOT NULL tests for the absence of nulls 22

Outline Single-table Queries The SFW query Other useful operators: DISTINCT, LIKE, ORDER BY NULLs Multiple-table Queries Foreign key constraints Joins: basics Joins: SQL semantics 23

Foreign Key constraints Foreign-key constraint: student_id references sid Students sid name gpa 101 Bob 3.2 123 Mary 3.8 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A 24

Foreign Key constraints Foreign-key constraint: student_id references sid Students sid name gpa 101 Bob 3.2 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A 25

Declaring Foreign Keys student_id cid grade 123 354 A 123 454 A+ 156 354 A CREATE TABLE Enrolled( student_id CHAR(20), cid CHAR(20), grade CHAR(10), PRIMARY KEY (student_id, cid), FOREIGN KEY (student_id) REFERENCES Students(sid) ) 26

Insert operations What if we insert a tuple into Enrolled, but no corresponding student? INSERT is rejected Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A 190 354 A 27

Delete operations What if we delete a student, who has enrolled courses? Disallow the delete (ON DELETE RESTRICT) Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A 28

ON DELETE RESTRICT student_id cid grade 123 354 A 123 454 A+ 156 354 A CREATE TABLE Enrolled( student_id CHAR(20), cid CHAR(20), grade CHAR(10), PRIMARY KEY (student_id, cid), FOREIGN KEY (student_id) REFERENCES Students(sid) ON DELETE RESTRICT ) 29

Delete operations What if we delete a student, who has enrolled courses? Remove all of the courses for that student (ON DELETE CASCADE) Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A 30

ON DELETE CASCADE student_id cid grade 123 354 A 123 454 A+ 156 354 A CREATE TABLE Enrolled( student_id CHAR(20), cid CHAR(20), grade CHAR(10), PRIMARY KEY (student_id, cid), FOREIGN KEY (student_id) REFERENCES Students(sid) ON DELETE CASCADE ) 31

Delete operations What if we delete a student, who has enrolled courses? Set Foreign Key to NULL (ON DELETE SET NULL) Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ NULL 156 354 A Interestingly, although it satisfies the foreign-key constraint, it violates the primary-key constraint, thus the deletion operation is disallowed. 32

ON DELETE SET NULL student_id cid grade 123 354 A 123 454 A+ 156 354 A CREATE TABLE Enrolled( student_id CHAR(20), cid CHAR(20), grade CHAR(10), PRIMARY KEY (student_id, cid), FOREIGN KEY (student_id) REFERENCES Students(sid) ON DELETE SET NULL ) 33

Outline Single-table Queries The SFW query Other useful operators: DISTINCT, LIKE, ORDER BY NULLs Multiple-table Queries Foreign key constraints Joins: basics Joins: SQL semantics Set Operators 34

Why do we have multiple tables? Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A 123 454 A+ 156 354 A VS. EnrolledStudents student_id name gpa cid grade 123 Mary 3.8 354 A 123 Mary 3.8 454 A+ 156 Mike 3.7 354 A 35

Store data into multiple tables vs. single table Multiple tables Data updating is easier (e.g., update Mary s gpa to 3.9) Querying each individual table is faster (e.g., retrieve Mary s gpa) A single table Data exchange is easier (e.g., share your data with others) Avoid the cost of joining multiple tables (e.g., retrieval all the courses that Mary has taken) 36

Joins The SFW query over a single table SELECT <columns> FROM <table name> WHERE <conditions> Which rows are you interested in? The SFW query over multiple tables SELECT <columns> FROM <table names> WHERE <conditions> Which rows are you interested in? How to join the multiple tables? 37

Joins: Example Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM WHERE Which rows are you interested in? Find all student who have got an A+ in 354; return their names and gpas Students, Enrolled sid = student_id AND cid = 354 AND grad = A+ How to join the two tables? 38

Other ways to write joins SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 354 AND grad = A+ SELECT name FROM Students JOIN Enrolled ON sid = student_id WHERE cid = 354 AND grad = A+ SELECT name FROM Students JOIN Enrolled ON sid = student_id AND cid = 354 AND grad = A+ 39

The Need fo Tuple Variable Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid name grade 123 354 DB I A+ 123 454 DB II A+ 156 354 DB I A SELECT name FROM WHERE Students, Enrolled sid = student_id Which name? 40

Tuple Variable Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid name grade 123 354 DB I A+ 123 454 DB II A+ 156 354 DB I A SELECT Students.name FROM Students, Enrolled WHERE sid = student_id SELECT S.name FROM Students S, Enrolled WHERE sid = student_id 41

Outline Single-table Queries The SFW query Other useful operators: DISTINCT, LIKE, ORDER BY NULLs Multiple-table Queries Foreign key constraints Joins: basics Joins: SQL semantics Set Operators 42

Meaning (Semantics) of Join Queries SELECT x 1.a 1, x 1.a 2,, x n.a k FROM R 1 AS x 1, R 2 AS x 2,, R n AS x n WHERE Conditions(x 1,, x n ) Answer = {} for x 1 in R 1 do for x 2 in R 2 do.. for x n in R n do if Conditions(x 1,, x n ) return Answer then Answer = Answer È {(x 1.a 1, x 1.a 2,, x n.a k )} Note: this is a multiset union This is called nested loop semantics since we are interpreting what a join means using a nested loop 43

Three steps SELECT x 1.a 1, x 1.a 2,, x n.a k FROM R 1 AS x 1, R 2 AS x 2,, R n AS x n WHERE Conditions(x 1,, x n ) 1. Take cross product R 1 R 2 R n 2. Apply conditions Conditions(x 1,, x n ) Note: This is NOT how the DBMS executes the query. 3. Apply projections x 1.a 1, x 1.a 2,, x n.a k 44

Exercise Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students, Enrolled WHERE sid = student_id AND grade >= A Which one(s) are correct? name Mary name Mary Mike name Mary Mike Mary name Mary Mary Mike (A) (B) (C) (D) 45

Outline Single-table Queries The SFW query Other useful operators: DISTINCT, LIKE, ORDER BY NULLs Multiple-table Queries Foreign key constraints Joins: basics Joins: SQL semantics Set Operators 46

Set Operations SQL supports union, intersection and set difference operations Called UNION, INTERSECT, and EXCEPT These operations must be performed on union compatible tables Although these operations are supported in the SQL standard, implementations may vary EXCEPT may not be implemented When it is, it is sometimes called MINUS 47

One of Two Courses Find all students who have taken either 354 or 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students, Enrolled WHERE sid = student_id AND (cid = 354 OR cid = 454) 48

One of Two Courses - UNION Find all students who have taken either 354 or 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 354 UNION SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 454 49

Both Courses Find all students who have taken both 354 and 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students S, Enrolled E WHERE sid = student_id AND (E.cid = 354 AND E.cid = 454) 50

Both Courses Again Find all students who have taken both 354 and 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students S, Enrolled E1, Enrolled E2 WHERE S.sid = E1.student_id AND S.sid = E2.student_id AND (E1.cid = 354 AND E2.cid = 454) 51

Both Courses - INTERSECT Find all students who have taken both 354 and 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 354 INTERSECT SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 454 52

One Course But Not The Other Find all students who have taken 354 but not 454 Students sid name gpa 123 Mary 3.8 156 Mike 3.7 Enrolled student_id cid grade 123 354 A+ 123 454 A+ 156 354 A SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 354 EXCEPT SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 454 53

Set Operations and Duplicates Unlike other SQL operations, UNION, INTERSECT, and EXCEPT queries eliminate duplicates by default SQL allows duplicates to be retained in these three operations using the ALL keyword (i.e., multi-set operations) SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 354 INTERSECT ALL SELECT name FROM Students, Enrolled WHERE sid = student_id AND cid = 454 54

Acknowledge Some lecture slides were copied from or inspired by the following course materials W4111:Introduction to databases by Eugene Wu at Columbia University CSE344: Introduction to Data Management by Dan Suciu at University of Washington CMPT354: Database System I by John Edgar at Simon Fraser University CS186: Introduction to Database Systems by Joe Hellerstein at UC Berkeley CS145: Introduction to Databases by Peter Bailis at Stanford CS 348: Introduction to Database Management by Grant Weddell at University of Waterloo 55