Keys, SQL, and Views CMPSCI 645

Similar documents
Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

SQL: Queries, Constraints, Triggers

Basic form of SQL Queries

SQL. Chapter 5 FROM WHERE

SQL: Queries, Programming, Triggers

CompSci 516 Database Systems. Lecture 2 SQL. Instructor: Sudeepa Roy

SQL: The Query Language Part 1. Relational Query Languages

CIS 330: Applied Database Systems

Database Applications (15-415)

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

Relational data model

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

The Relational Model. Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC)

SQL Part 2. Kathleen Durant PhD Northeastern University CS3200 Lesson 6

Review: Where have we been?

The Relational Model of Data (ii)

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

SQL: Queries, Programming, Triggers. Basic SQL Query. Conceptual Evaluation Strategy. Example of Conceptual Evaluation. A Note on Range Variables

Announcements If you are enrolled to the class, but have not received the from Piazza, please send me an . Recap: Lecture 1.

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Introduction to Data Management. Lecture 14 (SQL: the Saga Continues...)

The SQL Query Language. Creating Relations in SQL. Adding and Deleting Tuples. Destroying and Alterating Relations. Conceptual Evaluation Strategy

Introduction to Data Management. Lecture #14 (Relational Languages IV)

Database Management Systems. Chapter 5

SQL - Lecture 3 (Aggregation, etc.)

Database Management Systems. Chapter 3 Part 1

Introduction to Data Management. Lecture #13 (Relational Calculus, Continued) It s time for another installment of...

Database Systems. October 12, 2011 Lecture #5. Possessed Hands (U. of Tokyo)

The Relational Model. Chapter 3

Enterprise Database Systems

Database Applications (15-415)

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

Database Systems ( 資料庫系統 )

The Relational Data Model. Data Model

The Database Language SQL (i)

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

Database Applications (15-415)

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

The Relational Model. Week 2

The Relational Model

The Relational Model

Review. The Relational Model. Glossary. Review. Data Models. Why Study the Relational Model? Why use a DBMS? OS provides RAM and disk

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

Administrivia. The Relational Model. Review. Review. Review. Some useful terms

The Relational Model 2. Week 3

QUERIES, PROGRAMMING, TRIGGERS

Relational Algebra Homework 0 Due Tonight, 5pm! R & G, Chapter 4 Room Swap for Tuesday Discussion Section Homework 1 will be posted Tomorrow

Database Systems ( 資料庫系統 )

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

Data Science 100. Databases Part 2 (The SQL) Slides by: Joseph E. Gonzalez & Joseph Hellerstein,

CSEN 501 CSEN501 - Databases I

SQL: Queries, Programming, Triggers

Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU /615

Databases - Relational Algebra. (GF Royle, N Spadaccini ) Databases - Relational Algebra 1 / 24

This lecture. Projection. Relational Algebra. Suppose we have a relation

SQL: Queries, Constraints, Triggers (iii)

This lecture. Step 1

Data Science 100 Databases Part 2 (The SQL) Previously. How do you interact with a database? 2/22/18. Database Management Systems

More on SQL Nested Queries Aggregate operators and Nulls

SQL: The Query Language (Part 1)

Introduction to Data Management. Lecture #10 (Relational Calculus, Continued)

Today s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls

Experimenting with bags (tables and query answers with duplicate rows):

Lecture 2: Introduction to SQL

Advanced Database Management Systems

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

Database Applications (15-415)

Advanced Database Management Systems

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

Introduction to Data Management. Lecture 16 (SQL: There s STILL More...) It s time for another installment of...

CAS CS 460/660 Introduction to Database Systems. Relational Algebra 1.1

Relational Calculus. Chapter Comp 521 Files and Databases Fall

Relational Algebra. Note: Slides are posted on the class website, protected by a password written on the board

HW1 is due tonight HW2 groups are assigned. Outline today: - nested queries and witnesses - We start with a detailed example! - outer joins, nulls?

Relational Algebra 1

CSCC43H: Introduction to Databases. Lecture 3

Introduction to Data Management. Lecture #17 (SQL, the Never Ending Story )

CITS3240 Databases Mid-semester 2008

Relational Algebra. Relational Query Languages

Relational Calculus. Chapter Comp 521 Files and Databases Fall

Database Systems. Course Administration

Relational Query Languages. Relational Algebra. Preliminaries. Formal Relational Query Languages. Relational Algebra: 5 Basic Operations

The Relational Model. Roadmap. Relational Database: Definitions. Why Study the Relational Model? Relational database: a set of relations

Database Management Systems. Chapter 5

SQL. Assit.Prof Dr. Anantakul Intarapadung

Midterm Exam #2 (Version A) CS 122A Winter 2017

CompSci 516 Data Intensive Computing Systems

MIS Database Systems Relational Algebra

CompSci 516: Database Systems

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

CIS 330: Applied Database Systems

SQL. Structured Query Language

HW2 out! Administrative Notes

Transcription:

Keys, SQL, and Views CMPSCI 645

SQL Overview SQL Preliminaries Integrity constraints Query capabilities SELECT-FROM- WHERE blocks, Basic features, ordering, duplicates Set ops (union, intersect, except) Aggregation & Grouping Nested queries (correlation) Null values Modifying the database Views Review in the textbook, Ch 5

The SQL Query Language Structured Query Language Developed by IBM (system R) in the 1970s Need for a standard since it is used by many vendors Evolving standard SQL-86 SQL-89 (minor revision) SQL-92 (major revision) SQL-99 (major extensions) SQL-2003 (minor revisions)...

Two parts of SQL Data Definition Language (DDL) Create/alter/delete tables and their attributes establish and modify schema Data Manipulation Language (DML) Query and modify database instance

Creating Relations in SQL Creates the Student relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. As another example, the Takes table holds information about courses that students take. CREATE TABLE Student (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL) CREATE TABLE Takes (sid CHAR(20), cid CHAR(20), grade CHAR(2))

Data Types in SQL Characters: CHAR(20)!! -- fixed length VARCHAR(40)! -- variable length Numbers: BIGINT, INT, SMALLINT, TINYINT REAL, FLOAT! -- differ in precision MONEY Times and dates: DATE DATETIME!! Others...

Destroying and Altering Relations DROP TABLE Student Destroys the relation Student. The schema information and the tuples are deleted. ALTER TABLE Student ADD COLUMN firstyear integer!the schema of Student is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.

Integrity Constraints (ICs) IC: condition that must be true for any instance of the database. ICs are specified when schema is defined. ICs are checked when relations are modified. A legal instance of a relation is one that satisfies all specified ICs. DBMS should only allow legal instances. If the DBMS checks ICs, stored data is more faithful to real-world meaning. Avoids data entry errors, too!

Definition: Key Constraints A key constraint is a statement that a certain minimal subset of the fields of a relation is a unique identifier for a tuple. A set of fields is a key for a relation if : 1. No two distinct tuples can have same values in all key fields, and... 2. This is not true for any subset of the key. If part 2 false: then fields are a superkey. If thereʼs more than one key for a relation, one of the keys is chosen (by DBA) to be the primary key. E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.

Student table STUDENT sid name login age gpa 50000 Dave dave@cs 19 3.2 53666 Jones jones@cs 18 3.3 53688 Smith smith@ee 18 3.2 53650 Smith smith@math 19 3.7 53831 Madayan madayan@music 11 1.8 53832 Guldu guldu@music 12 2.0 10

Specifying Key Constraints in SQL CREATE TABLE Student (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL, UNIQUE (name, age), PRIMARY KEY (sid) ) 11

Primary and Candidate Keys in SQL Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key. " For a given student and course, there is a single grade. vs. Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade. "Used carelessly, an IC can prevent the storage of database instances that arise in practice! CREATE TABLE Takes (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid) ) CREATE TABLE Takes (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade) )

Foreign Keys, Referential Integrity Foreign key : Set of fields in one relation that is used to `referʼ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointerʼ. E.g. sid is a foreign key referring to Students: Takes(sid: string, cid: string, grade: string) If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. Can you think of a data model without referential integrity?

Foreign Keys in SQL Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Takes (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ) Takes sid cid grade 50000 445 A 53688 483 C 53666 435 B STUDENT sid name login 50000 Dave dave@cs 53666 Jones jones@cs 53688 Smith smith@ee 53650 Smith smith@math

Enforcing Referential Integrity Consider Student and Takes; sid in Takes is a foreign key that references Student. What should be done if a Takes tuple with a non-existent student id is inserted? (Reject it!) What should be done if a Student tuple is deleted? Also delete all Takes tuples that refer to it. Disallow deletion of a Students tuple that is referred to. Set sid in Takes tuples that refer to it to a default sid. (In SQL, also: Set sid in Takes tuples that refer to it to a special value null, denoting `unknown or `inapplicable.) Similar if primary key of Students tuple is updated.

Referential Integrity in SQL SQL/92 and SQL:1999 support all 4 options on deletes and updates. Default is NO ACTION (delete/ update is rejected) CASCADE (also delete all tuples that refer to deleted tuple) SET NULL / SET DEFAULT (sets foreign key value of referencing tuple) CREATE TABLE Takes (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE ON UPDATE SET DEFAULT )

Where do ICs Come From? ICs are based upon the semantics of the real-world enterprise that is being described in the database relations. We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance. An IC is a statement about all possible instances! From example, we know name is not a key, but the assertion that sid is a key is given to us. Key and foreign key ICs are the most common; more general ICs supported too.

Query capabilities SQL Overview SELECT-FROM-WHERE blocks, Basic features, ordering, duplicates Set operations (union, intersect, except) Aggregation & Grouping Nested queries (correlation) Null values 18

Example database Sailors (sid, sname, rating, age) Boats (bid, bname, color) Reserves (sid, bid, day) Key for each table indicated by underlined attributes. 19

Sailors sid sname rating age 29 brutus 1 33 85 art 3 25.5 95 bob 3 63.5 96 frodo 3 25.5 22 dustin 7 45 64 horatio 7 35 31 lubber 8 55.5 32 andy 8 25.5 74 horatio 9 35 58 rusty 10 35 71 zorba 10 16 Reserves sid bid day 22 101 10/10 22 102 10/10 22 103 10/8 22 104 10/7 31 102 11/10 31 103 11/6 31 104 11/12 64 101 9/5 64 102 9/8 74 103 9/8 Boats bid bname color 101 Interlake blue 102 Interlake red 103 Clipper green 20 104 Marine red

SQL Query Basic form: (plus many many extensions) SELECT FROM WHERE [DISTINCT] target-list relation-list qualification conditions For example: SELECT sid, sname, rating, age FROM Sailors WHERE age > 21

Basic SQL Query SELECT FROM WHERE [DISTINCT] target-list relation-list qualification target-list A list of attributes of relations in relationlist relation-list A list of relation names (possibly with a range-variable after each name). qualification Comparisons (Attr op const or Attr1 op Attr2, where op is one of ) combined using AND, OR and NOT. DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default is that duplicates are not eliminated!

Simple SQL Query Sailors sid sname rating age 22 dustin 7 45 31 lubber 8 55.5 58 rusty 10 35 71 zorba 10 16 SELECT * FROM Sailors WHERE age > 21 sid sname rating age 22 dustin 7 45 31 lubber 8 55.5 58 rusty 10 35 Conditions in the WHERE clause are like selection: σage<21

Selection conditions What goes in the WHERE clause: x = y, x < y, x <= y, x!= y, etc For number, they have the usual meanings For CHAR and VARCHAR: lexicographic ordering For dates and times, what you expect... Also, pattern matching on strings: s LIKE p

The LIKE operator s LIKE p: pattern matching on strings p may contain two special symbols: % = any sequence of characters _ = any single character Find all students whose name begins and ends with ʻbʼ: SELECT * FROM Sailors WHERE sname LIKE b%b sid sname rating age 29 brutus 1 33 85 art 3 25.5 95 bob 3 63.5 96 frodo 3 25.5 22 dustin 7 45 64 horatio 7 35 31 lubber 8 55.5

Simple SQL Query Sailors sid sname rating age 22 dustin 7 45 31 lubber 8 55.5 58 rusty 10 35 71 zorba 10 16 SELECT sname, age FROM Sailors WHERE age > 21 sname age dustin 45 lubber 55.5 rusty 35 Conditions in the SELECT clause are like projection: Πsname,age

Note confusing terminology SELECT sname, age FROM Sailors WHERE age > 21 Conditions in the WHERE clause are like selection: σage<21 Conditions in the SELECT clause are like projection: Πsname,age

Eliminating Duplicates SELECT DISTINCT sname FROM Sailors sname brutus art bob frodo dustin horatio lubber andy Compare to: SELECT sname FROM Sailors sname brutus art bob frodo dustin horatio lubber andy horatio Default behavior does not eliminate duplicates.

Ordering the Results SELECT sname, rating, age FROM Sailors WHERE age > 18 ORDER BY rating, sname Ordering is ascending, unless you specify the DESC keyword. Ties are broken by the second attribute on the ORDER BY list, etc.

Conceptual Evaluation Strategy SELECT FROM WHERE [DISTINCT] target-list relation-list qualifications Semantics of an SQL query defined in terms of a conceptual evaluation strategy: Compute the cross-product of relation-list. Discard resulting tuples if they fail qualifications. Delete attributes that are not in target-list. If DISTINCT is specified, eliminate duplicate rows. Probably the least efficient way to compute a query -- optimizer will find more efficient plan. RA equiv: σ Π

Example of Conceptual Evaluation SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND R.bid=103

Example SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = red What does this query compute? Find the names of sailors who have reserved a red boat 32

Please write in SQL Find the colors of boats reserved by ʻLubberʼ SELECT B.color FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND S.sname = ʻLubberʼ 33

Renaming Columns Sailors sid sname rating age 22 dustin 7 45 31 lubber 8 55.5 58 rusty 10 35 71 zorba 10 16 SELECT sname AS name, age AS x FROM Sailors WHERE age > 21 name x dustin 45 lubber 55.5 rusty 35

Disambiguating Attributes Sometimes two relations have the same attr: Person(pname, address, worksfor) Company(cname, address) SELECT DISTINCT pname, address FROM Person, Company WHERE worksfor = cname Which address? SELECT DISTINCT Person.pname, Company.address FROM Person, Company WHERE Person.worksfor = Company.cname

Range Variables in SQL Purchase (buyer, seller, store, product) Find all stores that sold at least one product that was sold at ʻBestBuyʼ: SELECT DISTINCT x.store FROM Purchase AS x, Purchase AS y WHERE x.product = y.product AND y.store = ʻBestBuyʼ

Please write in SQL Self-join on Flights: The departure and arrival cities of trips consisting of two direct flights. SELECT F1.depart, F2.arrive FROM Flights as F1, Flights as F2 WHERE F1.arrive = F2.depart FLIGHTS depart arrive NYC Reno NYC Oakland Boston Tampa Oakland Boston Tampa NYC 37

Query capabilities SQL Overview SELECT-FROM-WHERE blocks, Basic features, ordering, duplicates Set operations (union, intersect, except) Aggregation & Grouping Nested queries (correlation) Null values 38

Set operations UNION INTERSECTION EXCEPT (sometimes called MINUS) Recall: schemas must match for these operations. 39

UNION example Find the names of sailors who have reserved a red or a green boat. SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ʻredʼ UNION SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ʻgreenʼ 40

UNION Duplicates ARE NOT eliminated by default in basic SELECT-FROM-WHERE queries Duplicate ARE eliminated by default for UNION queries. To preserve duplicates in UNION, you must use UNION ALL 41

UNION example, alternative: Find the names of sailors who have reserved a red or a green boat. SELECT DISTINCT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND (B.color = ʻredʼ OR B.color = ʻgreenʼ) 42

Find the names of sailors who have reserved a red or a green boat. Find the names of sailors who have reserved a red and a green boat. SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND (B.color = ʻredʼ AND B.color = ʻgreenʼ) This doesnʼt work! What does this query return?

Find the names of sailors who have reserved a red and a green boat. SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ʻredʼ INTERSECT SELECT sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ʻgreenʼ

Query capabilities SQL Overview SELECT-FROM-WHERE blocks, Basic features, ordering, duplicates Set ops (union, intersect, except) Aggregation & Grouping 45

Aggregation SELECT Avg(S.age) FROM Sailors WHERE S.rating = 10 SQL supports several aggregation operations: COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A)

Aggregation: Count SELECT Count(*) FROM Sailors WHERE rating > 5 Except for COUNT, all aggregations apply to a single attribute

Aggregation: Count COUNT applies to duplicates, unless otherwise stated: SELECT Count(category) FROM Product WHERE year > 1995 Better: SELECT Count(DISTINCT category) FROM Product WHERE year > 1995

Simple Aggregation Purchase(product, date, price, quantity) Example 1: find total sales for the entire database SELECT Sum(price * quantity) FROM Purchase Example 1ʼ: find total sales of bagels SELECT Sum(price * quantity) FROM Purchase WHERE product = bagel

GROUP BY and HAVING clauses We often want to apply aggregates to each of a number of groups of rows in a relation. Find the age of the youngest sailor for each rating level. SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i For i = 1, 2,... 10 50

Sailors Grouping sid sname rating age 29 brutus 1 33 85 art 3 25.5 95 bob 3 63.5 96 frodo 3 25.5 22 dustin 7 45 64 horatio 7 35 31 lubber 8 55.5 32 andy 8 25.5 74 horatio 9 35 58 rusty 10 35 71 zorba 10 16 SELECT S.rating, MIN(S.age) FROM Sailors S GROUP BY S.rating New Table rating 1 3 7 8 9 10 age?