EECS 647: Introduction to Database Systems

Similar documents
Announcements (September 14) SQL: Part I SQL. Creating and dropping tables. Basic queries: SFW statement. Example: reading a table

EECS 647: Introduction to Database Systems

Database Languages. A DBMS provides two types of languages: Language for accessing & manipulating the data. Language for defining a database schema

EECS 647: Introduction to Database Systems

SQL - Data Query language

CMPT 354: Database System I. Lecture 4. SQL Advanced

EECS 647: Introduction to Database Systems

Announcements. Relational Model & Algebra. Example. Relational data model. Example. Schema versus instance. Lecture notes

The Relational Model. Suan Lee

SQL. Chapter 5 FROM WHERE

CS3DB3/SE4DB3/SE6DB3 TUTORIAL

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence

CSEN 501 CSEN501 - Databases I

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

COMP 244 DATABASE CONCEPTS & APPLICATIONS

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. General Overview - rel. model. Overview - detailed - SQL

SQL: Queries, Programming, Triggers

CMPT 354: Database System I. Lecture 3. SQL Basics

Basic form of SQL Queries

EECS 647: Introduction to Database Systems

Relational model continued. Understanding how to use the relational model. Summary of board example: with Copies as weak entity

SQL - Lecture 3 (Aggregation, etc.)

Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

Basic operators: selection, projection, cross product, union, difference,

SQL: Part II. Announcements (September 18) Incomplete information. CPS 116 Introduction to Database Systems. Homework #1 due today (11:59pm)

SQL Aggregation and Grouping. Outline

SQL Aggregation and Grouping. Davood Rafiei

Today s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls

Keys, SQL, and Views CMPSCI 645

SQL. CS 564- Fall ACKs: Dan Suciu, Jignesh Patel, AnHai Doan

SQL: Queries, Constraints, Triggers

Database Management Systems. Chapter 5

The Relational Data Model. Data Model

CS 582 Database Management Systems II

Chapter 2: Intro to Relational Model

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

CIS 330: Applied Database Systems

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

Part I: Structured Data

Announcements (September 18) SQL: Part II. Solution 1. Incomplete information. Solution 3? Solution 2. Homework #1 due today (11:59pm)

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

Relational Model & Algebra. Announcements (Thu. Aug. 27) Relational data model. CPS 116 Introduction to Database Systems

Relational Model, Relational Algebra, and SQL

Relational Query Languages

4/10/2018. Relational Algebra (RA) 1. Selection (σ) 2. Projection (Π) Note that RA Operators are Compositional! 3.

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

SQL: Part III. Announcements. Constraints. CPS 216 Advanced Database Systems

CS425 Midterm Exam Summer C 2012

Plan for today. Query Processing/Optimization. Parsing. A query s trip through the DBMS. Validation. Logical plan

CMP-3440 Database Systems

Relational Algebra. Procedural language Six basic operators

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream

Database Management Systems. Chapter 5

Database Applications (15-415)

CSC 261/461 Database Systems Lecture 13. Fall 2017

CSE 544 Principles of Database Management Systems

Lecture 16. The Relational Model

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

SQL Part 2. Kathleen Durant PhD Northeastern University CS3200 Lesson 6

MIDTERM EXAMINATION Spring 2010 CS403- Database Management Systems (Session - 4) Ref No: Time: 60 min Marks: 38

The SQL data-definition language (DDL) allows defining :

NULL. The special value NULL could mean: Unknown Unavailable Not Applicable

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

CSE 544 Principles of Database Management Systems

Chapter 2: Intro to Relational Model

CSIT5300: Advanced Database Systems

Chapter 3: Introduction to SQL

SQL. SQL Queries. CISC437/637, Lecture #8 Ben Cartere9e. Basic form of a SQL query: 3/17/10

Structured Query Language Continued. Rose-Hulman Institute of Technology Curt Clifton

A subquery is a nested query inserted inside a large query Generally occurs with select, from, where Also known as inner query or inner select,

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

Relational Model & Algebra. Announcements (Tue. Sep. 3) Relational data model. CompSci 316 Introduction to Database Systems

Introduction to Data Management. Lecture 14 (SQL: the Saga Continues...)

An Introduction to Structured Query Language

CPS 216 Spring 2003 Homework #1 Assigned: Wednesday, January 22 Due: Monday, February 10

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

An Introduction to Structured Query Language

Chapter 2 Introduction to Relational Models

SQL: Queries, Programming, Triggers. Basic SQL Query. Conceptual Evaluation Strategy. Example of Conceptual Evaluation. A Note on Range Variables

Relational Algebra and SQL. Basic Operations Algebra of Bags

Set theory is a branch of mathematics that studies sets. Sets are a collection of objects.

Data about data is database Select correct option: True False Partially True None of the Above

Goals for Today. CS 133: Databases. Relational Model. Multi-Relation Queries. Reason about the conceptual evaluation of an SQL query

CS143: Relational Model

Database Management Systems,

CSCC43H: Introduction to Databases

Database Management Systems. Chapter 3 Part 1

Database Management Systems Paper Solution

Database Systems. October 12, 2011 Lecture #5. Possessed Hands (U. of Tokyo)

Querying Data with Transact SQL

Chapter 3: Introduction to SQL

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

SQL: Data Manipulation Language. csc343, Introduction to Databases Diane Horton Winter 2017

SQL: csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Sina Meraji. Winter 2018

Relational Algebra for sets Introduction to relational algebra for bags

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #2: The Relational Model and Relational Algebra

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Database Systems ( 資料庫系統 )

Transcription:

EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009

Stating Points A database A database management system A miniworld A data model Conceptual model Relational model 2/24/2009 Luke Huan Univ. of Kansas 2

So What? Database Schemas A database schema is a description of a database, using a given data model (relational model by default). External schemas (user views) describe how users see the data. Conceptual schema defines logical structure Internal schema describes the physical storage structure of the database. Users View 1 View 2 View 3 Conceptual Schema Internal Schema DB 2/24/2009 Luke Huan Univ. of Kansas 3

Data Independence Applications insulated from how data is structured and stored. Logical data independence: the capacity to change the conceptual schema without having to change the external schema Physical data independence: the capacity to change the internal schema without having to change the conceptual schema Q: Why is this particularly important for DBMS?

O.K. So, how to model a miniworld? Using conceptual model such as ER model 2/24/2009 Luke Huan Univ. of Kansas 5

Where does a conceptual model lead us to? A tabular view 2/24/2009 Luke Huan Univ. of Kansas 6

Now you have a draft, how do you improve your design? Using integrity constraints Attributes value must come from its domain Every relation mush have a primary key The primary key value in a tuple can not be NULL The foreign key value in a referenced tuple must exist or the foreign key value in the referencing tuple is NULL 2/24/2009 Luke Huan Univ. of Kansas 7

How do you improve relational data base design (cont.)? Using functional dependency Non-key FD always leads to redundancy BCNF normal form Pick up a non-key FD, do binary decomposition First normal form: no attribute may be composite or multi-valued (2 nd normal form and 3 rd normal form are coming ) 2/24/2009 Luke Huan Univ. of Kansas 8

What could we do about relations? Relational algebra expression Using temporary variable Identifying information source Pay attention to non monotonic operation 2/24/2009 Luke Huan Univ. of Kansas 9

SELECT a list of attributes FROM a list of relations WHERE condition; Review Condition may have logical operators AND, OR, NOT Condition may have comparison operators: <. <=, <>, >=, > String comparison may use = (exactly match) or LIKE (matching with regular expressions) %, _, \ (Arithmetic) expressions of attributes are allowed 2/24/2009 Luke Huan Univ. of Kansas 10

Examples of bag operations Bag1 fruit Apple Apple Orange Bag2 fruit Apple Orange Orange Bag1 UNION ALL Bag2 Bag1 INTERSECT ALL Bag2 fruit Apple Apple Apple Orange Orange Orange Bag1 EXCEPT ALL Bag2 fruit Apple fruit Orange Apple 2/24/2009 Luke Huan Univ. of Kansas 11

Exercise SELECT sid, 2009 age, FROM STUDENT WHERE name LIKE %John% OR GPA > 3.6; sid name age gpa sid name age gpa 1234 John Smith 21 3.5 1234 John Smith 1988 21 3.5 1123 Mary Carter 19 3.8 1123 Mary Carter 2000 19 3.8 1011 Bob Lee 2.6 1011 Bob Lee 2.6 1204 Susan Wong 3.4 1204 Susan Wong 3.4 1306 Kevin Kim 18 2.9 1306 Kevin Kim 18 2.9 asdfsadf 2/24/2009 Luke Huan Univ. of Kansas 12

Aggregates Standard SQL aggregate functions: COUNT, SUM, AVG, MIN, MAX Example: number of students under 18, and their average GPA SELECT COUNT(*), AVG(GPA) FROM Student WHERE age < 18; COUNT(*) counts the number of rows 2/24/2009 Luke Huan Univ. of Kansas 13

Aggregates with DISTINCT Example: How many students are taking classes? SELECT COUNT (SID) FROM Enroll; SELECT COUNT(DISTINCT SID) FROM Enroll; 2/24/2009 Luke Huan Univ. of Kansas 14

GROUP BY SELECT FROM WHERE GROUP BY list_of_columns; Example: find the average GPA for each age group SELECT age, AVG(GPA) FROM Student GROUP BY age; 2/24/2009 Luke Huan Univ. of Kansas 15

Operational semantics of GROUP BY SELECT FROM WHERE GROUP BY ; Compute FROM ( ) Compute WHERE (σ) Compute GROUP BY: group rows according to the values of GROUP BY columns Compute SELECT for each group (π) For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group Number of groups = number of rows in the final output 2/24/2009 Luke Huan Univ. of Kansas 16

Example of computing GROUP BY SELECT age, AVG(GPA) FROM Student GROUP BY age; sid 1234 1123 1011 1204 1306 name John Smith Mary Carter Bob Lee Susan Wong Kevin Kim age 21 19 19 gpa 3.5 3.8 2.6 3.4 2.9 Compute SELECT for each group age 21 19 gpa 3.5 3.35 3.0 Compute GROUP BY: group rows according to the values of GROUP BY columns sid 1234 1123 1306 1011 1204 name John Smith Mary Carter Kevin Kim Bob Lee Susan Wong age 21 19 19 gpa 3.5 3.8 2.9 2.6 3.4 2/24/2009 Luke Huan Univ. of Kansas 17

Aggregates with no GROUP BY An aggregate query with no GROUP BY clause represent a special case where all rows go into one group SELECT AVG(GPA) FROM Student; Compute aggregate over the group sid 1234 1123 1011 1204 1306 name John Smith Mary Carter Bob Lee Susan Wong Kevin Kim age 21 19 19 gpa 3.5 3.8 2.6 3.4 2.9 sid 1234 1123 1011 1204 1306 name John Smith Mary Carter Bob Lee Susan Wong Kevin Kim age 21 19 19 gpa 3.5 3.8 2.6 3.4 2.9 gpa 3.24 Group all rows into one gro p 2/24/2009 Luke Huan Univ. of Kansas 18

Restriction on SELECT If a query uses aggregation/group by, then every column referenced in SELECT must be either Aggregated, or A GROUP BY column This restriction ensures that any SELECT expression produces only one value for each group 2/24/2009 Luke Huan Univ. of Kansas 19

Examples of invalid queries SELECT SID, age FROM Student GROUP BY age; Recall there is one output row per group There can be multiple SID values per group SELECT SID, MAX(GPA) FROM Student; Recall there is only one group for an aggregate query with no GROUP BY clause There can be multiple SID values Wishful thinking (that the output SID value is the one associated with the highest GPA) does NOT work 2/24/2009 Luke Huan Univ. of Kansas 20

HAVING Used to filter groups based on the group properties (e.g., aggregate values, GROUP BY column values) SELECT FROM WHERE GROUP BY HAVING condition; Compute FROM ( ) Compute WHERE (σ) Compute GROUP BY: group rows according to the values of GROUP BY columns Compute HAVING (another σ over the groups) Compute SELECT (π) for each group that passes HAVING 2/24/2009 Luke Huan Univ. of Kansas 21

HAVING examples Find the average GPA for each age group over 10 SELECT age, AVG(GPA) FROM Student GROUP BY age HAVING age > 10; Can be written using WHERE without table expressions List the average GPA for each age group with more than a hundred students SELECT age, AVG(GPA) FROM Student GROUP BY age HAVING COUNT(*) > 100; Can be written using WHERE and table expressions 2/24/2009 Luke Huan Univ. of Kansas

A First Touch of Subqueries Use query result as a table In set and bag operations, FROM clauses, etc. A way to nest queries Example: names of students who are in more clubs than classes SELECT DISTINCT name FROM Student, (SELECT SID FROM ClubMember) EXCEPT ALL (SELECT SID FROM Enroll) ) AS S WHERE Student.SID = S.SID; 2/24/2009 Luke Huan Univ. of Kansas 23