Introduction to Data Management CSE 344. Lectures 8: Relational Algebra

Similar documents
Introduction to Data Management CSE 344. Lectures 8: Relational Algebra

Database Management Systems CSEP 544. Lecture 3: SQL Relational Algebra, and Datalog

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2006 Lecture 3 - Relational Model

Announcements. Agenda. Database/Relation/Tuple. Discussion. Schema. CSE 444: Database Internals. Room change: Lab 1 part 1 is due on Monday

Agenda. Discussion. Database/Relation/Tuple. Schema. Instance. CSE 444: Database Internals. Review Relational Model

Lecture Query evaluation. Combining operators. Logical query optimization. By Marina Barsky Winter 2016, University of Toronto

Announcements. Agenda. Database/Relation/Tuple. Schema. Discussion. CSE 444: Database Internals

CSE 544 Principles of Database Management Systems

Lecture 16. The Relational Model

4/10/2018. Relational Algebra (RA) 1. Selection (σ) 2. Projection (Π) Note that RA Operators are Compositional! 3.

L22: The Relational Model (continued) CS3200 Database design (sp18 s2) 4/5/2018

CSC 261/461 Database Systems Lecture 13. Fall 2017

Relational Model and Relational Algebra

CMPT 354: Database System I. Lecture 5. Relational Algebra

Chapter 2: The Relational Algebra

Chapter 2: The Relational Algebra

CSE 544 Principles of Database Management Systems

Relational Model and Algebra. Introduction to Databases CompSci 316 Fall 2018

Relational Model & Algebra. Announcements (Tue. Sep. 3) Relational data model. CompSci 316 Introduction to Database Systems

Relational Model, Relational Algebra, and SQL

CS411 Database Systems. 04: Relational Algebra Ch 2.4, 5.1

Relational Model & Algebra. Announcements (Thu. Aug. 27) Relational data model. CPS 116 Introduction to Database Systems

CSE 344 JANUARY 26 TH DATALOG

Relational Model: History

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and

Announcements. Using Electronics in Class. Review. Staff Instructor: Alvin Cheung Office hour on Wednesdays, 1-2pm. Class Overview

Ian Kenny. November 28, 2017

Database Technology Introduction. Heiko Paulheim

Databases Relational algebra Lectures for mathematics students

Faloutsos - Pavlo CMU SCS /615

Overview. Carnegie Mellon Univ. School of Computer Science /615 - DB Applications. Concepts - reminder. History

CS 377 Database Systems

2.2.2.Relational Database concept

RELATIONAL ALGEBRA. CS 564- Fall ACKs: Dan Suciu, Jignesh Patel, AnHai Doan

CMP-3440 Database Systems

Chapter 3. The Relational Model. Database Systems p. 61/569

Chapter 6 The Relational Algebra and Relational Calculus

Relational Query Languages: Relational Algebra. Juliana Freire

Announcements. Relational Model & Algebra. Example. Relational data model. Example. Schema versus instance. Lecture notes

Introduction to Databases CSE 414. Lecture 2: Data Models

Relational Algebra Part I. CS 377: Database Systems

Relational Algebra and SQL

Relational Algebra and SQL. Basic Operations Algebra of Bags

The Extended Algebra. Duplicate Elimination. Sorting. Example: Duplicate Elimination

Chapter 2: Intro to Relational Model

Relational Algebra. Relational Query Languages

Chapter 6 Part I The Relational Algebra and Calculus

Relational Algebra. Algebra of Bags

Relational Algebra BASIC OPERATIONS DATABASE SYSTEMS AND CONCEPTS, CSCI 3030U, UOIT, COURSE INSTRUCTOR: JAREK SZLICHTA

Introduction to Data Management. Lecture #11 (Relational Algebra)

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview

Relational Algebra. [R&G] Chapter 4, Part A CS4320 1

Chapter 2 The relational Model of data. Relational algebra

Relational Query Languages. Preliminaries. Formal Relational Query Languages. Example Schema, with table contents. Relational Algebra

Relational Algebra. Study Chapter Comp 521 Files and Databases Fall

Introduction to Data Management CSE 344. Lecture 2: Data Models

Midterm Review. Winter Lecture 13

1 Relational Data Model

Relational Model and Relational Algebra. Rose-Hulman Institute of Technology Curt Clifton

The Relational Model and Relational Algebra

Web Science & Technologies University of Koblenz Landau, Germany. Relational Data Model

Relational Algebra. Note: Slides are posted on the class website, protected by a password written on the board

CAS CS 460/660 Introduction to Database Systems. Relational Algebra 1.1

Relational Algebra. Relational Query Languages

Informatics 1: Data & Analysis

Relational Algebra. Lecture 4A Kathleen Durant Northeastern University

3. Relational Data Model 3.5 The Tuple Relational Calculus

Relational Model History. COSC 304 Introduction to Database Systems. Relational Model and Algebra. Relational Model Definitions.

ECE 650 Systems Programming & Engineering. Spring 2018

Database Management Systems. Chapter 4. Relational Algebra. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Selection and Projection

Database Applications (15-415)

Relational Algebra for sets Introduction to relational algebra for bags

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

Data Models. CS552- Chapter Three

Chapter 6 The Relational Algebra and Calculus

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao

Relational Query Languages

From Relational Algebra to the Structured Query Language. Rose-Hulman Institute of Technology Curt Clifton

CSC 261/461 Database Systems Lecture 19

RELATIONAL DATA MODEL: Relational Algebra

Relational Algebra. Procedural language Six basic operators

Relational Model 2: Relational Algebra

Relational model continued. Understanding how to use the relational model. Summary of board example: with Copies as weak entity

Informationslogistik Unit 4: The Relational Algebra

Introduction to Database Systems CSE 444

DATABASE DESIGN I - 1DL300

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #3: SQL and Rela2onal Algebra- - - Part 1

Database Systems SQL SL03

CPS 216 Spring 2003 Homework #1 Assigned: Wednesday, January 22 Due: Monday, February 10

Relational Query Languages

MIS Database Systems Relational Algebra

CS3DB3/SE4DB3/SE6DB3 TUTORIAL

Relational Algebra. Mr. Prasad Sawant. MACS College. Mr.Prasad Sawant MACS College Pune

CS3DB3/SE4DB3/SE6M03 TUTORIAL

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #2: The Relational Model and Relational Algebra

Database Systems SQL SL03

Chapter 8: Relational Algebra

From SQL-query to result Have a look under the hood

CS 317/387. A Relation is a Table. Schemas. Towards SQL - Relational Algebra. name manf Winterbrew Pete s Bud Lite Anheuser-Busch Beers

Transcription:

Introduction to Data Management CSE 344 Lectures 8: Relational Algebra CSE 344 - Winter 2016 1

Announcements Homework 3 is posted Microsoft Azure Cloud services! Use the promotion code you received Due in two weeks 2

Where We Are Data models SQL, SQL, SQL Declaring the schema for our data (CREATE TABLE) Inserting data one row at a time or in bulk (INSERT/.import) Querying the data (SELECT) Modifying the schema and updating the data (ALTER/UPDATE) Next step: More knowledge of how DBMSs work Relational algebra, query execution, and physical tuning Client-server architecture CSE 344 - Fall 2016 3

Query Evaluation Steps SQL query Translate query string into internal representation Logical plan à physical plan Parse & Check Query Decide how best to answer query: query optimization Check syntax, access control, table names, etc. Relational Algebra Query Execution Return Results Query Evaluation CSE 344 - Winter 2016 4

The WHAT and the HOW SQL = WHAT we want to get from the data Relational Algebra = HOW to get the data we want The passage from WHAT to HOW is called query optimization SQL à Relational Algebra à Physical Plan Relational Algebra = Logical Plan CSE 344 - Winter 2016 5

Relational Algebra CSE 344 - Winter 2016 6

Turing Awards in Data Management Charles Bachman, 1973 IDS and CODASYL Ted Codd, 1981 Relational model Michael Stonebraker, 2014 INGRES and Postgres CSE 344 - Fall 2016 7

Sets v.s. Bags Sets: {a,b,c}, {a,d,e,f}, { },... Bags: {a, a, b, c}, {b, b, b, b, b},... Relational Algebra has two semantics: Set semantics = standard Relational Algebra Bag semantics = extended Relational Algebra DB systems implement bag semantics (Why?) CSE 344 - Winter 2016 8

Relational Algebra Operators Union, intersection, difference - Selection σ Projection π Cartesian product X, join Rename ρ Duplicate elimination δ Grouping and aggregation ɣ Sorting τ RA Extended RA All operators take in 1 or more relations as inputs and return another relation CSE 344 - Winter 2016 9

Union and Difference R1 R2 R1 R2 What do they mean over bags? CSE 344 - Winter 2016 10

What about Intersection? Derived operator using minus R1 R2 = R1 (R1 R2) Derived using join R1 R2 = R1 R2 Only makes sense if R1 and R2 have the same schema CSE 344 - Winter 2016 11

Selection Returns all tuples which satisfy a condition Examples σ c (R) σ Salary > 40000 (Employee) σ name = Smith (Employee) The condition c can be =, <, <=, >, >=, <> combined with AND, OR, NOT CSE 344 - Winter 2016 12

Employee SSN Name Salary 1234545 John 20000 5423341 Smith 60000 4352342 Fred 50000 σ Salary > 40000 (Employee) SSN Name Salary 5423341 Smith 60000 4352342 Fred 50000 CSE 344 - Winter 2016 13

Projection Eliminates columns π A1,,An (R) Example: project social-security number and names: π SSN, Name (Employee) Answer(SSN, Name) Different semantics CSE 344 over - Winter 2016 sets or bags! Why? 14

Employee SSN Name Salary 1234545 John 20000 5423341 John 60000 4352342 John 20000 π Name,Salary (Employee) Name Salary John 20000 John 60000 John 20000 Bag semantics Name Salary John 20000 John 60000 Set semantics CSE 344 - Winter 2016 Which is more efficient? 15

Composing RA Operators Patient no name zip disease 1 p1 98125 flu 2 p2 98125 heart 3 p3 98120 lung 4 p4 98120 heart π zip,disease (Patient) zip disease 98125 flu 98125 heart 98120 lung 98120 heart σ disease= heart (Patient) no name zip disease 2 p2 98125 heart 4 p4 98120 heart π zip,disease (σ disease= heart (Patient)) zip disease 98125 heart 98120 heart CSE 344 - Winter 2016 16

Cartesian Product Each tuple in R1 with each tuple in R2 R1 X R2 Rare in practice; mainly used to express joins CSE 344 - Winter 2016 17

Cross-Product Example Employee Name SSN John 999999999 Tony 777777777 Dependent EmpSSN DepName 999999999 Emily 777777777 Joe Employee X Dependent Name SSN EmpSSN DepName John 999999999 999999999 Emily John 999999999 777777777 Joe Tony 777777777 999999999 Emily Tony 777777777 777777777 Joe CSE 344 - Winter 2016 18

Renaming Changes the schema, not the instance Example: ρ B1,,Bn (R) ρ N, S (Employee) à Answer(N, S) Not really used by systems, but needed on paper CSE 344 - Winter 2016 19

Natural Join R1 R2 Meaning: R1 R2 = Π A (σ θ (R1 R2)) Where: Selection checks equality of all common attributes (i.e., attributes with same names) Projection eliminates duplicate common attributes CSE 344 - Winter 2016 20

Natural Join Example R A B S B C X Y Z U X Z V W Y Z Z V Z V R S = Π ABC (σ R.B=S.B (R S)) A B C X Z U X Z V Y Z U Y Z V Z V W CSE 344 - Winter 2016 21

Natural Join Example 2 AnonPatient P age zip disease 54 98125 heart 20 98120 flu Voters V name age zip p1 54 98125 p2 20 98120 P V age zip disease name 54 98125 heart p1 20 98120 flu p2 CSE 344 - Winter 2016 22

Natural Join Given schemas R(A, B, C, D), S(A, C, E), what is the schema of R S? Given R(A, B, C), S(D, E), what is R S? Given R(A, B), S(A, B), what is R S? CSE 344 - Winter 2016 23

AnonPatient (age, zip, disease) Voters (name, age, zip) Theta Join A join that involves a predicate R1 θ R2 = σ θ (R1 X R2) Here θ can be any condition No projection in this case! For our voters/patients example: P P.zip = V.zip and P.age >= V.age -1 and P.age <= V.age +1 V 24

Equijoin A theta join where θ is an equality predicate Projection drops all redundant attributes R1 θ R2 = π A (σ θ (R1 X R2)) By far the most used variant of join in practice CSE 344 - Winter 2016 25

Equijoin Example AnonPatient P age zip disease 54 98125 heart 20 98120 flu Voters V name age zip p1 54 98125 p2 20 98120 P P.age=V.age V age P.zip disease name V.zip 54 98125 heart p1 98125 20 98120 flu p2 98120 CSE 344 - Winter 2016 26

Join Summary Theta-join: R θ S = σ θ (R x S) Join of R and S with a join condition θ Cross-product followed by selection θ Equijoin: R θ S = π A (σ θ (R x S)) Join condition θ consists only of equalities Projection π A drops all redundant attributes Natural join: R S = π A (σ θ (R x S)) Equijoin Equality on all fields with same name in R and in S Projection π A drops all redundant attributes CSE 344 - Winter 2016 27