Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Administrivia. Administrivia. Faloutsos/Pavlo CMU /615

Similar documents
Cost-based Query Sub-System. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class.

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Implementation of Relational Operations: Other Operations

Today's Class. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Example Database. Query Plan Example

Evaluation of Relational Operations: Other Techniques

Query Processing. Lecture #10. Andy Pavlo Computer Science Carnegie Mellon Univ. Database Systems / Fall 2018

Database Applications (15-415)

15-415/615 Faloutsos 1

Evaluation of Relational Operations: Other Techniques

Evaluation of Relational Operations: Other Techniques

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Overview of Query Evaluation. Chapter 12

Database Applications (15-415)

Administrivia Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Database Applications (15-415)

Administrivia Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Evaluation of Relational Operations

Evaluation of Relational Operations

Evaluation of relational operations

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CSIT5300: Advanced Database Systems

CompSci 516 Data Intensive Computing Systems

Join Algorithms. Lecture #12. Andy Pavlo Computer Science Carnegie Mellon Univ. Database Systems / Fall 2018

Database Applications (15-415)

Sorting & Aggregations

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

Spring 2017 QUERY PROCESSING [JOINS, SET OPERATIONS, AND AGGREGATES] 2/19/17 CS 564: Database Management Systems; (c) Jignesh M.

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. General Overview - rel. model. Overview - detailed - SQL

Last Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Principles of Data Management. Lecture #9 (Query Processing Overview)

Administriva. CS 133: Databases. General Themes. Goals for Today. Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky

Implementation of Relational Operations

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Administrivia Final Exam. Administrivia Final Exam

Introduction to Data Management. Lecture 14 (SQL: the Saga Continues...)

Last Class. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Why do we need to sort? Today's Class

Advanced Database Systems

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

Query Processing. Introduction to Databases CompSci 316 Fall 2017

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Operator Implementation Wrap-Up Query Optimization

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

RELATIONAL OPERATORS #1

Evaluation of Relational Operations. Relational Operations

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1

CMP-3440 Database Systems

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Chapter 12: Query Processing. Chapter 12: Query Processing

SQL Part 2. Kathleen Durant PhD Northeastern University CS3200 Lesson 6

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

Database System Concepts

Evaluation of Relational Operations

Database Management System

Chapter 13: Query Processing

Administrivia. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Relational Model: Keys. Correction: Implied FDs

ASSIGNMENT NO Computer System with Open Source Operating System. 2. Mysql

OLTP vs. OLAP Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Overview of Implementing Relational Operators and Query Evaluation

CSE 344 Midterm Exam

Midterm Review CS634. Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke

Query Processing & Optimization

Chapter 12: Query Processing

Lecture 3 More SQL. Instructor: Sudeepa Roy. CompSci 516: Database Systems

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15

CS317 File and Database Systems

SQL - Data Query language

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

Chapter 13: Query Processing Basic Steps in Query Processing

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Database Systems CSE 344

Instructor: Craig Duckett. Lecture 03: Tuesday, April 3, 2018 SQL Sorting, Aggregates and Joining Tables

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Database Systems CSE 414

Database Applications (15-415)

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Last Class. Faloutsos/Pavlo CMU /615

Introduction to Wireless Sensor Network. Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University

20761 Querying Data with Transact SQL

Query Evaluation (i)

Database Applications (15-415)

CIS 330: Applied Database Systems

Today s Class. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Formal Properties of Schedules. Conflicting Operations

EECS 647: Introduction to Database Systems

Introduction to Data Management. Lecture #11 (Relational Algebra)

Introduction to Database Systems CSE 444

Database Systems CSE 414

CSE 544 Principles of Database Management Systems

Reminders - IMPORTANT:

CSE 344 Final Review. August 16 th

DBMS Query evaluation

Database Management Systems,

Database Management Systems. Chapter 5

Oracle Database 11g: SQL and PL/SQL Fundamentals

Administrivia. CS186 Class Wrap-Up. News. News (cont) Top Decision Support DBs. Lessons? (from the survey and this course)

Advanced SQL Tribal Data Workshop Joe Nowinski

Introduction to Data Management CSE 344. Lecture 12: Cost Estimation Relational Calculus

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13!

COURSE OUTLINE. Page : 1 of 5. Semester: 2 Academic Session: 2017/2018

Transcription:

Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#14(b): Implementation of Relational Operations Administrivia HW4 is due today. HW5 is out. Faloutsos/Pavlo 15-415/615 2 Administrivia Mid-term on Tues March 4th Will cover everything up to and including this week s lectures. Closed book, one sheet of notes (double-sided). Please email Christos+Andy if you need special accommodations. See exam guideline: http://bit.ly/1hupegv Faloutsos/Pavlo 15-415/615 3 1

Extended Office Hours Christos: Friday Feb 28 th 3:00pm-5:00pm Andy: Friday Feb 28 th 10:00am-12:00pm Monday Mar 3 rd 9:00am-10:00am Tuesday Mar 4 th 10:00am-12:00pm Faloutsos/Pavlo 15-415/615 4 Last Class: Selections Approach #1: Find the cheapest access path, retrieve tuples using it, and apply any remaining terms that don t match the index Approach #2: Use multiple indexes to find the intersection of matching tuples. Faloutsos/Pavlo 15-415/615 5 Last Class: Joins Nested Loop Joins Index Nested Loop Joins Sort-Merge Joins Hash Joins Faloutsos/Pavlo 15-415/615 6 2

Today s Class Set Operations Aggregate Operations Explain Mid-term Review + Q&A Faloutsos/Pavlo 15-415/615 7 Set Operations Intersection (R S) Cross-Product (R S) Union (R S) Difference (R-S) Special case of join. Use same techniques from last class. We can use sorting or hashing strategies. Faloutsos/Pavlo 15-415/615 8 Union/Difference Sorting Sort both relations on combination of all attributes. Scan sorted relations and merge them. For union, just eliminate duplicates as we go. For difference, we emit tuples from R if they don t appear in S. Faloutsos/Pavlo 15-415/615 9 3

Union/Difference Sorting External Merge Sort from Lecture #12 Sort both relations on combination of all attributes. Scan sorted relations and merge them. For union, just eliminate duplicates as we go. For difference, we emit tuples from R if they don t appear in S. Faloutsos/Pavlo 15-415/615 10 Union/Difference Hashing Partition R and S using hash function h 1. For each S-partition, build in-memory hash table (using h 2 ), scan corresponding R- partition and add tuples to table. For union, discard duplicates. For difference, probe the hash table for S and emit R tuples that are missing. Faloutsos/Pavlo 15-415/615 11 Union/Difference Hashing Two-Phase Hashing Partition R and S using hash function from Lecture h 1. #13 For each S-partition, build in-memory hash table (using h 2 ), scan corresponding R- partition and add tuples to table. For union, discard duplicates. For difference, probe the hash table for S and emit R tuples that are missing. Faloutsos/Pavlo 15-415/615 12 4

Today s Class Set Operations Aggregate Operations Explain Mid-term Review + Q&A Faloutsos/Pavlo 15-415/615 13 Aggregate Operators Basic SQL-92 aggregate functions: MIN Return the minimum value. MAX Return the maximum value. SUM Return the sum. COUNT Return a count of the # of rows. AVG Return the average value. Note that each can be executed with or without GROUP BY. Faloutsos/Pavlo 15-415/615 14 Aggregate Operators Scan the relation and maintain running information about matched tuples. MIN Smallest value seen thus far. MAX Largest value seen thus far. SUM Total of the values seen thus far. COUNT The # of rows seen thus far. AVG <Total,Count> of the values seen. Faloutsos/Pavlo 15-415/615 15 5

Running Totals SELECT bid, COUNT(*) FROM Reserves GROUP BY bid; sid bid day rname 6 103 2014-02-01 matlock 1 102 2014-02-02 macgyver 2 101 2014-02-02 a-team 1 101 2014-02-01 dallas bid count 6 12 1 1 2 1 Hash Table Faloutsos/Pavlo 15-415/615 16 Aggregate Operators SELECT COUNT(*) FROM Reserves Without grouping: In general, requires scanning the relation. Given index whose search key includes all attributes in the SELECT or WHERE clauses, can do index-only scan. Faloutsos/Pavlo 15-415/615 17 Aggregate Operators SELECT bid, COUNT(*) FROM Reserves GROUP BY bid; With grouping, we have three approaches: Sorting Hashing A suitable tree-based index Faloutsos/Pavlo 15-415/615 18 6

Aggregates with Grouping Approach #1: Sorting Sort on group-by attributes, then scan relation and compute aggregate for each group. Approach #2: Hashing Build in-memory hash table on group-by attributes. Update running totals for each tuple that we examine. Faloutsos/Pavlo 15-415/615 19 Aggregates with Grouping Approach #3: Indexes Given tree index whose search key includes all attributes in SELECT, WHERE, and GROUP BY clauses, we can do index-only scan. If GROUP BY attributes form prefix of search key, we can retrieve data entries/tuples in group-by order. Faloutsos/Pavlo 15-415/615 20 Today s Class Set Operations Aggregate Operations Explain Mid-term Review + Q&A Faloutsos/Pavlo 15-415/615 21 7

EXPLAIN When you precede a SELECT statement with the keyword EXPLAIN, the DBMS displays information from the optimizer about the statement execution plan. The system explains how it would process the query, including how tables are joined and in which order. Faloutsos/Pavlo 15-415/615 22 EXPLAIN SELECT bid, COUNT(*) AS cnt FROM Reserves GROUP BY bid ORDER BY cnt Pseudo Query Plan: SORT COUNT GROUP BY p bid RESERVES Faloutsos/Pavlo 15-415/615 23 EXPLAIN EXPLAIN SELECT bid, COUNT(*) AS cnt FROM Reserves GROUP BY bid ORDER BY cnt Postgres v9.1 Faloutsos/Pavlo 15-415/615 24 8

EXPLAIN EXPLAIN SELECT bid, COUNT(*) AS cnt FROM Reserves GROUP BY bid ORDER BY cnt MySQL v5.5 Faloutsos/Pavlo 15-415/615 25 EXPLAIN ANALYZE ANALYZE option causes the statement to be actually executed. The actual runtime statistics are displayed This is useful for seeing whether the planner's estimates are close to reality. Note that ANALYZE is a Postgres idiom. Faloutsos/Pavlo 15-415/615 26 EXPLAIN ANALYZE EXPLAIN ANALYZE SELECT bid, COUNT(*) AS cnt FROM Reserves GROUP BY bid ORDER BY cnt Postgres v9.1 Faloutsos/Pavlo 15-415/615 27 9

EXPLAIN ANALYZE Works on any type of query. Since ANALYZE actually executes a query, if you use it with a query that modifies the table, that modification will be made. Faloutsos/Pavlo 15-415/615 28 Today s Class Set Operations Aggregate Operations Explain Mid-term Review + Q&A Faloutsos/Pavlo 15-415/615 29 Mid-term Review Everything from the beginning of the course to today is fair game: 01intro.pdf to 14RelOp.pdf Bring a calculator. Your phone is unfortunately not a calculator. Faloutsos/Pavlo 15-415/615 30 10

Relational Model Chapters 2-4 E-R Diagrams Relational Algebra Relational Calculus Faloutsos/Pavlo 15-415/615 31 SQL Chapter 5 Basic Syntax Different Types of Joins Nested Queries Faloutsos/Pavlo 15-415/615 32 Storage & Indexes Chapters 8-10 How a DBMS stores data on disk. B-Tree Indexes Hash Table Indexes Make sure you know costs + trade-offs. Faloutsos/Pavlo 15-415/615 33 11

Query Evaluation Chapters 12-14 Sorting Hashing Selection + Access Paths Join Algorithms Faloutsos/Pavlo 15-415/615 34 Questions? Faloutsos/Pavlo 15-415/615 35 12