COMP 430 Intro. to Database Systems

Similar documents
DS Introduction to SQL Part 2 Multi-table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

COMP 430 Intro. to Database Systems

Lecture 2: Introduction to SQL

Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

Lecture 3: SQL Part II

Lectures 2&3: Introduction to SQL

Introduction to Database Systems CSE 444

Polls on Piazza. Open for 2 days Outline today: Next time: "witnesses" (traditionally students find this topic the most difficult)

Database Systems CSE 303. Lecture 02

Database Systems CSE 303. Lecture 02

L12: ER modeling 5. CS3200 Database design (sp18 s2) 2/22/2018

CSC 261/461 Database Systems Lecture 13. Fall 2017

The Entity-Relationship Model (ER Model) - Part 2

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Databases. Jörg Endrullis. VU University Amsterdam

CS425 Midterm Exam Summer C 2012

CSC 261/461 Database Systems Lecture 5. Fall 2017

Introduction to Databases CSE 414. Lecture 2: Data Models

Lecture 3 Additional Slides. CSE 344, Winter 2014 Sudeepa Roy

Agenda. Discussion. Database/Relation/Tuple. Schema. Instance. CSE 444: Database Internals. Review Relational Model

4/10/2018. Relational Algebra (RA) 1. Selection (σ) 2. Projection (Π) Note that RA Operators are Compositional! 3.

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Keys, SQL, and Views CMPSCI 645

CSE 344 JANUARY 5 TH INTRO TO THE RELATIONAL DATABASE

Chapter 3: Introduction to SQL

EECS 647: Introduction to Database Systems

Lecture 16. The Relational Model

HW1 is due tonight HW2 groups are assigned. Outline today: - nested queries and witnesses - We start with a detailed example! - outer joins, nulls?

Introduction to Data Management CSE 414

CS 564 Midterm Review

SQL. Assit.Prof Dr. Anantakul Intarapadung

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

2. E/R Design Considerations

Database Languages. A DBMS provides two types of languages: Language for accessing & manipulating the data. Language for defining a database schema

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Introduction to Database Systems. The Relational Data Model

Introduction to Database Systems. The Relational Data Model. Werner Nutt

INF 212 ANALYSIS OF PROG. LANGS SQL AND SPREADSHEETS. Instructors: Crista Lopes Copyright Instructors.

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2006 Lecture 3 - Relational Model

Relational Model, Relational Algebra, and SQL

CMPT 354: Database System I. Lecture 3. SQL Basics

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

Chapter 1: Introduction

ENTITY-RELATIONSHIP MODEL. CS 564- Spring 2018

Announcements. Agenda. Database/Relation/Tuple. Discussion. Schema. CSE 444: Database Internals. Room change: Lab 1 part 1 is due on Monday

CSE 344 JANUARY 8 TH SQLITE AND JOINS

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

SQL. Structured Query Language

Chapter 3: Introduction to SQL

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

CSE 544 Principles of Database Management Systems

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Principles of Database Systems CSE 544. Lecture #2 SQL The Complete Story

SQL: Part II. Announcements (September 18) Incomplete information. CPS 116 Introduction to Database Systems. Homework #1 due today (11:59pm)

Announcements. Agenda. Database/Relation/Tuple. Schema. Discussion. CSE 444: Database Internals

The SQL data-definition language (DDL) allows defining :

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Introduction to Relational Databases

MTAT Introduction to Databases

CONCEPTUAL DESIGN: ER TO RELATIONAL TO SQL

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

CSE 544 Principles of Database Management Systems

Announcements. Multi-column Keys. Multi-column Keys (3) Multi-column Keys. Multi-column Keys (2) Introduction to Data Management CSE 414

Relational Model History. COSC 304 Introduction to Database Systems. Relational Model and Algebra. Relational Model Definitions.

The Relational Model. Chapter 3

Midterm Review. Winter Lecture 13

Database Applications (15-415)

Relational Model. CS 377: Database Systems

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

Database Management Systems. Chapter 3 Part 1

Relational Algebra for sets Introduction to relational algebra for bags

Introduction to SQL Part 2 by Michael Hahsler Based on slides for CS145 Introduction to Databases (Stanford)

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

Intermediate SQL ( )

Session 6: Relational Databases

CS121 MIDTERM REVIEW. CS121: Relational Databases Fall 2017 Lecture 13

CS275 Intro to Databases

Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

CMPT 354: Database System I. Lecture 4. SQL Advanced

Lecture 4. Lecture 4: The E/R Model

Announcements (September 18) SQL: Part II. Solution 1. Incomplete information. Solution 3? Solution 2. Homework #1 due today (11:59pm)

Announcements. Multi-column Keys. Multi-column Keys. Multi-column Keys (3) Multi-column Keys (2) Introduction to Data Management CSE 414

CS530 Database Architecture Models. Database Model. Prof. Ian HORROCKS. Dr. Robert STEVENS. and Design The Relational

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

Entity/Relationship Modelling

Chapter 5. Relational Model Concepts 5/2/2008. Chapter Outline. Relational Database Constraints

CS2300: File Structures and Introduction to Database Systems

Chapter 5. Relational Model Concepts 9/4/2012. Chapter Outline. The Relational Data Model and Relational Database Constraints

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

The Relational Model 2. Week 3

Database Systems ( 資料庫系統 )

SQL: Structured Query Language Nested Queries

Relational terminology. Databases - Sets & Relations. Sets. Membership

Relational Algebra and SQL

Chapter 5. The Relational Data Model and Relational Database Constraints. Slide 5-١. Copyright 2007 Ramez Elmasri and Shamkant B.

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 5-1

SQL Overview. CSCE 315, Fall 2017 Project 1, Part 3. Slides adapted from those used by Jeffrey Ullman, via Jennifer Welch

Database Systems. Lecture2:E-R model. Juan Huo( 霍娟 )

Transcription:

COMP 430 Intro. to Database Systems Multi-table SQL Get clickers today! Slides use ideas from Chris Ré and Chris Jermaine.

The need for multiple tables Using a single table leads to repeating data Provides the opportunity for inconsistency Requires more storage space & I/O time Product p_name price manufacturer address city state Gizmo 19.99 GizmoWorks 123 Gizmo St. Houston TX Powergizmo 39.99 GizmoWorks 123 Gizmo St. Houston TX Widget 19.99 WidgetsRUs 20 Main St. New York NY HyperWidget 203.99 Hyper 1 Mission Dr. San Francisco CA Product (p_name, price, manufacturer, address, city, state)

Using multiple tables Product p_name price manufacturer Gizmo 19.99 GizmoWorks Powergizmo 39.99 GizmoWorks Widget 19.99 WidgetsRUs Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX WidgetsRUs 20 Main St. New York NY Hyper 1 Mission Dr. San Francisco CA HyperWidget 203.99 Hyper Product (p_name, price, manufacturer) Company (c_name, address, city, state) Later in course: Deciding what fields belong in what tables.

Foreign keys & referential integrity Product p_name price manufacturer Gizmo 19.99 GizmoWorks Powergizmo 39.99 GizmoWorks Widget 19.99 WidgetsRUs Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX WidgetsRUs 20 Main St. New York NY Hyper 1 Mission Dr. San Francisco CA HyperWidget 203.99 Hyper Product s Manufacturer is a foreign key. Foreign keys always refer to primary keys. We want to enforce referential integrity each Manufacturer value is an existing c_name.

Creating a table with a foreign key CREATE TABLE Product ( p_name VARCHAR(50), price DECIMAL(6,2), manufacturer VARCHAR(50), PRIMARY KEY (p_name), FOREIGN KEY (manufacturer) REFERENCES Company (c_name ) ); CREATE TABLE Company ( c_name VARCHAR(50), address VARCHAR(50), city VARCHAR(50), state CHAR(2), PRIMARY KEY (c_name) );

Foreign key represents a dependence Product p_name price manufacturer Gizmo 19.99 GizmoWorks Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX Product conceptually dependent on Company to elaborate details. Product data dependent on Company data being entered. CREATE TABLE Product (. FOREIGN KEY (manufacturer) REFERENCES Company (c_name) ); CREATE TABLE Company ( ); Product def n dependent on Company def n.

Foreign keys vs. pointers Foreign keys relate tables, or equivalently, sets of attributes. Repeats data to connect individual records. One relationship between tables vs. many pointers between table records. Avoid data pointers because Difficult to maintain pointers when data moves, esp. with concurrency. Want to relate with query results, in addition to tables.

In what ways do we relate tables? Explore design issues & common patterns later. Two basic building blocks: One-to-many relationships Many-to-many relationships

In what ways do we relate tables? Explore design issues & common patterns later. Two basic building blocks: One-to-many relationships Many-to-many relationships

One-to-many & many-to-one relationships Product p_name price manufacturer Gizmo 19.99 GizmoWorks Powergizmo 39.99 GizmoWorks Widget 19.99 WidgetsRUs Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX WidgetsRUs 20 Main St. New York NY Hyper 1 Mission Dr. San Francisco CA HyperWidget 203.99 Hyper Product (p_name, price, manufacturer) Company (c_name, address, city, state) Each Company can have many Products. Each Product is made by exactly one Company.

Many-to-many relationships Student s_id first_name last_name S01 John Smith S02 Mary Wallace S03 Sue Roper S04 Mark Jones Enrollment s_id crn grade S01 01234 A S01 46117 C S02 01234 B Course crn dept number 01234 COMP 140 15134 COMP 160 46117 ELEC 220 Each Student can be enrolled in many Courses. Each Course has many Students.

What is Student s primary key? Response Counter A. s_id B. s_id, first_name, last_name C. first_name, Last_name D. last_name Student 25% 25% 25% 25% s_id first_name last_name S01 John Smith S02 Mary Wallace S03 Sue Roper S04 Mark Jones s_id last_name ast_name last_name

What is Course s primary key? Response Counter A. crn B. crn, dept, number C. dept, number D. number 25% 25% 25% 25% Course crn dept number 01234 COMP 140 15134 COMP 160 46117 ELEC 220 r r

What is Enrollment s primary key? Response Counter A. s_id B. s_id, crn C. s_id, crn, grade D. crn E. grade Enrollment s_id crn grade 20% 20% 20% 20% 20% S01 01234 A S01 46117 C S02 01234 B

What is a foreign key in Student? Response Counter A. s_id B. first_name, last_name 33% 33% 33% C. N/A Student (s_id, FirstName, last_name) Enrollment (s_id, crn, grade) Course (crn, dept, number) s_id name N/A

What is a foreign key in Course? Response Counter A. crn B. dept, number 33% 33% 33% C. N/A Student (s_id, first_name, last_name) Enrollment (s_id, crn, grade) Course (crn, dept, number)

What is a foreign key in Enrollment? Response Counter A. s_id B. crn C. the pair sid, crn is a single foreign key D. s_id and crn are each foreign keys 25% 25% 25% 25% Student (s_id, first_name, last_name) Enrollment (s_id, crn, grade) Course (crn, dept, number) s_id crn s a single forei.. ach foreign keys

Creating a table with a foreign key CREATE TABLE Student ( s_id CHAR(10), first_name VARCHAR(50), last_name VARCHAR(50), PRIMARY KEY (s_id) ); CREATE TABLE Enrollment ( s_id CHAR(10), crn CHAR(10), grade CHAR(2), PRIMARY KEY (s_id, crn), FOREIGN KEY (s_id) REFERENCES Student (s_id), FOREIGN KEY (crn) REFERENCES Course (crn) ); CREATE TABLE Course ( crn CHAR(10), dept CHAR(4), number CHAR(3), PRIMARY KEY (crn) );

Another representation ER diagram Style: Crow s foot Conceptual level: Physical

A more abstract ER diagram Style: Chen Conceptual level: Logical s_id grade crn Student N Takes N Course first_name last_name dept number

Joining tables Product p_name price manufacturer Gizmo 19.99 GizmoWorks Powergizmo 39.99 GizmoWorks Widget 19.99 WidgetsRUs HyperWidget 203.99 Hyper SELECT * FROM Product, Company WHERE manufacturer = c_name; Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX WidgetsRUs 20 Main St. New York NY Hyper 1 Mission Dr. San Francisco CA Join condition p_name price manufacturer address city state Gizmo 19.99 GizmoWorks 123 Gizmo St. Houston TX Powergizmo 39.99 GizmoWorks 123 Gizmo St. Houston TX Widget 19.99 WidgetsRUs 20 Main St. New York NY HyperWidget 203.99 Hyper 1 Mission Dr. San Francisco CA

Joining tables Product p_name price manufacturer Gizmo 19.99 GizmoWorks Powergizmo 39.99 GizmoWorks Widget 19.99 WidgetsRUs HyperWidget 203.99 Hyper Company c_name address city state GizmoWorks 123 Gizmo St. Houston TX WidgetsRUs 20 Main St. New York NY Hyper 1 Mission Dr. San Francisco CA This is simplest & most common kind of join an inner join. See other kinds later. p_name price manufacturer address city state Gizmo 19.99 GizmoWorks 123 Gizmo St. Houston TX Powergizmo 39.99 GizmoWorks 123 Gizmo St. Houston TX Widget 19.99 WidgetsRUs 20 Main St. New York NY HyperWidget 203.99 Hyper 1 Mission Dr. San Francisco CA

Joins forgetting the join condition SELECT * FROM Product, Company; p_name price manufacturer c_name address city state Gizmo 19.99 GizmoWorks GizmoWorks 123 Gizmo St. Houston TX Gizmo 19.99 GizmoWorks WidgetsRUs 20 Main St. New York NY Gizmo 19.99 GizmoWorks Hyper 1 Mission Dr. San Francisco CA Powergizmo 39.99 GizmoWorks GizmoWorks 123 Gizmo St. Houston TX Powergizmo 39.99 GizmoWorks WidgetsRUs 20 Main St. New York NY Powergizmo 39.99 GizmoWorks Hyper 1 Mission Dr. San Francisco CA Widget 19.99 WidgetsRUs GizmoWorks 123 Gizmo St. Houston TX All combinations of records! Cross-product of tables.

Selection & Projection SELECT p_name FROM Product, Company WHERE manufacturer = c_name AND state = TX ; Selection p_name price manufacturer address city state Gizmo 19.99 GizmoWorks 123 Gizmo St. Houston TX Powergizmo 39.99 GizmoWorks 123 Gizmo St. Houston TX Projection p_name Gizmo Powergizmo

Attribute name conflicts Person (name, address, works_for) Company (name, address) SELECT name, address FROM Person, Company WHERE works_for = name; SELECT Person.name, Person.address FROM Person, Company WHERE Person.works_for = Company.name; SELECT p.name, p.address FROM Person p, Company c WHERE p.works_for = c.name;

Semantics, multisets, sets

Semantics set notation SELECT [DISTINCT] T 1.a 1, T 1.a 2,, T n.a m FROM T 1, T 2,, T n WHERE Conditions(T 1.a 1, T 1.a 2,, T n.a p ); {(T 1.a 1, T 1.a 2,, T n.a m ) Conditions(T 1.a 1, T 1.a 2,, T n.a p )} Multisets by default. Sets with DISTINCT.

Semantics order of steps SELECT [DISTINCT] T 1.a 1, T 1.a 2,, T n.a m FROM T 1, T 2,, T n WHERE Conditions(T 1.a 1, T 1.a 2,, T n.a p ); 1. Cross-product 2. Selection 3. Projection Answer = {} for row 1 in T 1 do Multiset union by default. for row 2 in T 2 do Set union with DISTINCT. for row n in T n do if Conditions(row 1.a 1, row 1.a 2,, row n.a p ) then Answer = Answer {(row 1.a 1, row 1.a 2,, row n.a m )} return Answer

Activity: Writing multi-table queries 03a-queries.ipynb

What is result of query? Response Counter A. R (S T) B. R S T C. R (S T) D. None of the above Schemas: R(a) S(a) T(a) SELECT DISTINCT R.a FROM R, S, T WHERE R.a = S.a OR R.a = T.a; S R T

What is result of query if S is empty? Response Counter A. R B. T C. R T D. Schemas: R(a) S(a) T(a) SELECT DISTINCT R.a FROM R, S, T WHERE R.a = S.a OR R.a = T.a; S R T

Query summary Schemas: R(a) S(a) T(a) SELECT DISTINCT R.a FROM R, S, T WHERE R.a = S.a OR R.a = T.a; If S =, then If T =, then Else R (S T)

Two ways to understand multisets Tuple (1, a) (1, a) (1, b) (2, c) (2, c) (2, c) Tuple λ(x) (1, a) 2 (1, b) 1 (2, c) 3 (1, d) 2 (1, d) (1, d)

Multiset union Tuple λ(x) (1, a) 2 (1, b) 0 (2, c) 3 (1, d) 0 Tuple λ(y) (1, a) 5 (1, b) 1 (2, c) 2 = (1, d) 2 Tuple λ(z) (1, a) 7 (1, b) 1 (2, c) 5 (1, d) 2 λ Z = λ X + λ Y

Multiset intersection Tuple λ(x) (1, a) 2 (1, b) 0 (2, c) 3 (1, d) 0 Tuple λ(y) (1, a) 5 (1, b) 1 (2, c) 2 = (1, d) 2 Tuple λ(z) (1, a) 2 (1, b) 0 (2, c) 2 (1, d) 0 λ Z = min λ X, λ Y

Multiset difference Tuple λ(x) (1, a) 2 (1, b) 0 (2, c) 3 (1, d) 0 Tuple λ(y) (1, a) 5 (1, b) 1 (2, c) 2 = (1, d) 2 Tuple λ(z) (1, a) 0 (1, b) 0 (2, c) 1 (1, d) 0 λ Z = λ X λ Y

Set & multiset union SELECT R.a FROM R, S WHERE R.a = S.a UNION SELECT R.a FROM R, T WHERE R.a = T.a; SELECT R.a FROM R, S WHERE R.a = S.a UNION ALL SELECT R.a FROM R, T WHERE R.a = T.a;

Set & multiset intersection SELECT R.a FROM R, S WHERE R.a = S.a INTERSECTION SELECT R.a FROM R, T WHERE R.a = T.a; SELECT R.a FROM R, S WHERE R.a = S.a INTERSECTION ALL SELECT R.a FROM R, T WHERE R.a = T.a;

Set & multiset difference SELECT R.a FROM R, S WHERE R.a = S.a EXCEPT SELECT R.a FROM R, T WHERE R.a = T.a; SELECT R.a FROM R, S WHERE R.a = S.a EXCEPT ALL SELECT R.a FROM R, T WHERE R.a = T.a;

Activity Sets & multisets 03b-sets-multisets.ipynb

A final tricky example Company (c_name, hq_loc) Product (p_name, manufacturer, factory_loc) Goal: Find HQs of companies that manufacture in both U.S. and China. Suggested solution: SELECT hq_loc FROM Company, Product WHERE manufacturer = c_name AND factory_loc = U.S. INTERSECT SELECT hq_loc FROM Company, Product WHERE manufacturer = c_name AND factory_loc = China ;

Exercise: Compute results SELECT hq_loc FROM Company, Product WHERE manufacturer = c_name AND factory_loc = U.S. INTERSECT SELECT hq_loc FROM Company, Product WHERE manufacturer = c_name AND factory_loc = China ; Product p_name manufacturer factory_loc Gizmo GizmoWorks U.S. WhatNot What China Company c_name hq_loc GizmoWorks U.S. What U.S.

One solution: set membership + subquery More subqueries later. SELECT DISTINCT hq_loc FROM Company, Product WHERE manufacturer = c_name AND c_name IN ( SELECT manufacturer FROM Product WHERE factory_loc = U.S. ) AND c_name IN ( SELECT manufacturer FROM Product WHERE factory_loc = China ); Product p_name manufacturer factory_loc Gizmo GizmoWorks U.S. WhatNot What China Company c_name hq_loc GizmoWorks U.S. What U.S.