Introduction to Data Management. Lecture #11 (Relational Languages I)

Similar documents
Introduction to Data Management. Lecture #11 (Relational Algebra)

Lecture #8 (Still More Relational Theory...!)

Relational Algebra. Chapter 4, Part A. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Relational Algebra. [R&G] Chapter 4, Part A CS4320 1

Database Management Systems. Chapter 4. Relational Algebra. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Relational algebra. Iztok Savnik, FAMNIT. IDB, Algebra

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

Relational Algebra. Note: Slides are posted on the class website, protected by a password written on the board

MIS Database Systems Relational Algebra

Database Management System. Relational Algebra and operations

Relational Query Languages

Relational Algebra. Relational Query Languages

Relational Algebra. Study Chapter Comp 521 Files and Databases Fall

Relational Algebra 1. Week 4

v Conceptual Design: ER model v Logical Design: ER to relational model v Querying and manipulating data

Relational Query Languages. Preliminaries. Formal Relational Query Languages. Example Schema, with table contents. Relational Algebra

CSCC43H: Introduction to Databases. Lecture 3

CAS CS 460/660 Introduction to Database Systems. Relational Algebra 1.1

Database Applications (15-415)

Relational Query Languages. Relational Algebra. Preliminaries. Formal Relational Query Languages. Relational Algebra: 5 Basic Operations

Relational Algebra Homework 0 Due Tonight, 5pm! R & G, Chapter 4 Room Swap for Tuesday Discussion Section Homework 1 will be posted Tomorrow

Relational Algebra 1

Relational Algebra and Calculus. Relational Query Languages. Formal Relational Query Languages. Yanlei Diao UMass Amherst Feb 1, 2007

Database Applications (15-415)

CompSci 516: Database Systems

Relational Algebra 1

Today s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Database Systems. Course Administration. 10/13/2010 Lecture #4

Introduction to Data Management. Lecture #10 (Relational Calculus, Continued)

Introduction to Data Management. Lecture #14 (Relational Languages IV)

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Evaluation of Relational Operations. Relational Operations

Schema Refinement and Normal Forms

Introduction to Data Management. Lecture #13 (Relational Calculus, Continued) It s time for another installment of...

Evaluation of Relational Operations

Implementation of Relational Operations

Introduction to Data Management. Lecture 14 (SQL: the Saga Continues...)

Relational Query Languages: Relational Algebra. Juliana Freire

Databases - Relational Algebra. (GF Royle, N Spadaccini ) Databases - Relational Algebra 1 / 24

This lecture. Projection. Relational Algebra. Suppose we have a relation

Introduction to Data Management. Lecture #17 (Physical DB Design!)

CompSci 516 Data Intensive Computing Systems

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

Evaluation of relational operations

Lecture #16 (Physical DB Design)

Principles of Data Management. Lecture #9 (Query Processing Overview)

Relational Model and Relational Algebra

Introduction to Data Management. Lecture #7 (Relational Design Theory)

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Friday Nights with Databases!

Relational Algebra. Lecture 4A Kathleen Durant Northeastern University

Evaluation of Relational Operations

Introduction to Data Management. Lecture #4 E-R Model, Still Going

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

Introduction to Data Management. Lecture #2 (Big Picture, Cont.)

Database Applications (15-415)

Database Applications (15-415)

A7-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Overview of Query Evaluation. Overview of Query Evaluation

Introduction to Data Management. Lecture #18 (SQL, the Final Chapter )

Overview. Understanding the Workload. Physical Database Design And Database Tuning. Chapter 20

Introduction to Data Management. Lecture #3 (E-R Design, Cont d.)

CSC 261/461 Database Systems Lecture 13. Fall 2017

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

RELATIONAL ALGEBRA. CS 564- Fall ACKs: Dan Suciu, Jignesh Patel, AnHai Doan

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.

Announcements. Relational Model & Algebra. Example. Relational data model. Example. Schema versus instance. Lecture notes

Midterm Exam (Version B) CS 122A Spring 2017

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Lecture 5 Design Theory and Normalization

Overview of Query Evaluation

15-415/615 Faloutsos 1

SQL: Queries, Programming, Triggers

CIS 330: Applied Database Systems

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

ECS 165B: Database System Implementa6on Lecture 7

Implementing Joins 1

Database Applications (15-415)

Relational model continued. Understanding how to use the relational model. Summary of board example: with Copies as weak entity

This lecture. Step 1

Conceptual Design. The Entity-Relationship (ER) Model

SQL: Queries, Programming, Triggers. Basic SQL Query. Conceptual Evaluation Strategy. Example of Conceptual Evaluation. A Note on Range Variables

Principles of Data Management. Lecture #12 (Query Optimization I)

Final Review. Zaki Malik November 20, 2008

SQL: Queries, Constraints, Triggers

SQL - Lecture 3 (Aggregation, etc.)

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1

Relational Model & Algebra. Announcements (Thu. Aug. 27) Relational data model. CPS 116 Introduction to Database Systems

Administrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design

Ian Kenny. November 28, 2017

EECS 647: Introduction to Database Systems

Introduction to Data Management. Lecture 16 (SQL: There s STILL More...) It s time for another installment of...

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

Review: Where have we been?

Introduction to Data Management. Lecture #3 (Conceptual DB Design)

Basic operators: selection, projection, cross product, union, difference,

Evaluation of Relational Operations: Other Techniques

Transcription:

Introduction to Data Management Lecture #11 (Relational Languages I) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements Homework stuff HW #3 is due today (tomorrow if late) HW #4 will come out on Friday (after the exam) Exam stuff (time flies!) Midterm #1 is this Friday (in class) i.e., in two days! We ll use assigned seating come early! You may bring an 8.5 x11 (2-sided) cheat sheet (as long as it has your name and seating location on it!) Today s plan: Relational languages the next frontier BTW: Today s material isn t on Friday s exam! J Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2

But 1st: A Word on Learning. When *I* was your age (J) 8-bit!processors, LEDs, hex keypad, 16-bit for its address space, Unix appeared, Unix/INGRES DB 64K process design point, then 1MB of memory was amazing! FORTRAN, Pascal, C, Lisp, Snobol, APL, Quel No web! (Just the early ArpaNet) Fast forward 35 years Your cell phones dwarf our ~1980 computing platforms Python, Jaa, Go, C++, Ruby, SQL, SQL++, Twitter (L) WHAT THIS MEANS: Critical to learn how to read and learn, how to search for info/resources, etc!!! Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3 Reminder: Normal Forms All relations 1NF 2NF 3NF BCNF... Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4

Relational Design Theory Summary If a relation is in BCNF, it is free of redundancies that can be detected using FDs. (Trying to ensure that all relations are in BCNF is thus a good goal.) If a relation is not in BCNF, we can decompose it into a lossless-join collection of BCNF relations. Are all FDs presered? If a lossless-join, dependencypresering decomposition into BCNF is not possible (or is unsuitable for typical queries), consider 3NF instead. Note: Decompositions should be carried out while also keeping performance requirements in mind. (More later!) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 5 Note: Refining an ER Based Design Before: 1st diagram translated: Workers(S,N,L,D,S) Departments(D,M,B) ssn Lots associated with workers. Suppose all workers in a dept are assigned the same lot: D à L... Redundancy; fixed by: Workers2(S,N,D,S) WorkersLots(D,L) Departments(D,M,B) Can further fine-tune this: Workers2(S,N,D,S) Departments(D,M,B,L) name Workers lot since did dname budget N 1 Works_In Departments Notice: Lot wasn t really a Worker attribute! After: ssn name Workers since did dname budget N 1 Works_In Departments lot Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 6

PS: On Refining ER Based Designs Before: 1st diagram translated: since Workers(S,N,L,D,S) name dname Note: Departments(D,M,B) ssn lot did Lots associated In many with cases workers. the relational translation of an ER design N 1 Suppose all will workers take in you a dept right are to 3NF Workers (and BCNF)! Works_In Departments assigned the same Entity lot: key D àl... attributes for entity sets. Redundancy; fixed by: Relationship key Notice: à attributes Lot wasn t for really relationship a Worker attribute! sets. Workers2(S,N,D,S) WorkersLots(D,L) (But problems can exist After: with FDs between attributes.) budget Departments(D,M,B) since Can further fine-tune this: name dname Workers2(S,N,D,S) ssn did lot Departments(D,M,B,L) Workers N 1 Works_In Departments budget Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7 Lingering Questions? Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8

On to Relational Query Languages! Query languages: Allow manipulation and retrieal of data from a database. Relational model supports simple, powerful QLs: Strong formal foundation based on logic. Allows for much optimization. Query Languages programming languages! QLs not expected to be Turing complete. QLs not intended to be used for complex calculations. QLs support easy, efficient access to large data sets. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9 Formal Relational Query Languages Two mathematical Query Languages form the basis for real languages (e.g., SQL), and for their implementation: Relational Algebra: More operational, ery useful for representing execution plans. Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declaratie.) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10

Preliminaries A query is applied to relation instances, and the result of a query is also a relation instance. Schemas of input relations for a query are fixed (but query will run regardless of instance!) The schema for the result of a gien query is also fixed! Determined by definition of query language constructs. Positional s. named-field notation: Positional notation easier for formal definitions, named-field notation more readable. Both used in SQL (but try to aoid positional stuff!) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11 Example Instances Sailors and Reseres relations for our examples. We ll use positional or named field notation, and assume that names of fields in query results are inherited from names of fields in query input relations (when possible). S1 S2 R1 sid bid day 22 101 10/10/96 58 103 11/12/96 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12

Relational Algebra Basic operations: s Selection ( ) Selects a subset of rows from relation. π Projection ( ) Omits unwanted columns from relation. Cross-product ( ) Allows us to combine two relations. - Set-difference ( ) Tuples in reln. 1, but not in reln. 2. Union (! ) Tuples in reln. 1 and in reln. 2. Additional operations: Intersection, join, diision, renaming: Not essential, but (ery!) useful. (I.e., don t add expressie power, but ) Since each operation returns a relation, operations can be composed! (Algebra is closed.) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13 Projection Remoes attributes that are not in projection list. Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation. Relational projection operator has to eliminate duplicates! (Why??) Note: real systems typically don t do duplicate elimination unless the user explicitly asks for it. (Q: Why not?) sname yuppy 9 lubber 8 guppy 5 rusty 10 rating π sname,rating (S2) age 35.0 55.5 p age ( S2) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14

Selection Selects rows that satisfy a selection condition. No duplicates in result! (Why?) Schema of result identical to schema of its (only) input relation. Result relation can be the input for another relational algebra operation! (This is operator composition.) 28 yuppy 9 35.0 58 rusty 10 35.0 s S rating >8 2) p sname rating yuppy 9 rusty 10 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 15 s ( ( S )) sname, rating rating>8 2 Union, Intersection, Set-Difference All of these operations take two input relations, which must be union-compatible: Same number of fields. Corresponding fields are of the same type. What is the schema of result? 22 dustin 7 45.0 S1 S2 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 44 guppy 5 35.0 28 yuppy 9 35.0 S1È S2 31 lubber 8 55.5 58 rusty 10 35.0 S1Ç S2 - Q: Any issues w/duplicates? Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 16

Cross-Product Each row of S1 is paired with each row of R1. Result schema has one field per field of S1 and R1, with field names inherited if possible. Conflict: Both S1 and R1 hae a field called sid. (sid) sname rating age (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 Renaming operator: Result relation name Attribute renaming list Source expression (anything!) r ( C( 1 sid1, 5 sid2), S1 R1) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 17 Renaming Conflict: S1 and R1 both had sid fields, giing: (sid) sname rating age Seeral renaming options aailable: ρ (S1R1(1 sid1), S1 R1) (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 Positional renaming Name-based renaming ρ (TempS1(sid sid1), S1) TempS1 R1 (π (S1)) R1 sid sid1,sname,rating,age Generalized projection (I like this notation best! J) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 18

Joins Condition Join: R!" c S = s c ( R S) (sid) sname rating age (sid) bid day 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 58 103 11/12/96 S1 sid<sid R1 Result schema same as that of cross-product. Fewer tuples than cross-product, so might be able to compute more efficiently Sometimes (often!) called a theta-join. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 19