Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS.

Size: px
Start display at page:

Download "Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS."

Transcription

1 Chapter 18 Strategies for Query Processing We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. 1

2 1. Translating SQL Queries into Relational Algebra and Other Operators - SQL is most commonly used query language - Translated into equivalent extended relational algebra expression - It is represented as a query tree data structure and optimized - SQL queries are decomposed into query blocks; basic units; translated into expressions and optimized - A query block contains single select-from-where expression including GROUP BY and HAVING - Nested queries within a query are identified as separate blocks Q1: Retrieve the names of employees who earn a salary that is greater than the highest salary in department 5. SELECT Lname, Fname FROM EMPLOYEE WHERE Salary > (SELECT MAX (Salary) FROM EMPLOYEE WHERE (Dno = 5); The Q1 is a nested query, it is decomposed into two sub queries as follows. 2

3 Q1a: (inner) SELECT MAX (Salary) FROM EMPLOYEE WHERE (Dno = 5) Q1b: (outer) SELECT Lname, Fname FROM EMPLOYEE WHERE Salary > c (c represents the results of Q1a) Q1a can be translated to: (inner) ᵮ MAX Salary (σ Dno=5 (EMPLOYEE)) (Selection operator) Q1b can be translated to: (outer) π Lname, Fname (σ Salary>c (EMPLOYEE)) (Projection operator) - The query optimizer will choose the execution plan for each query block - The inner block needs to be evaluated only once - The outer block has to iterate for each employee - This type of nested queries are called non-correlated queries 3

4 4

5 Additional Operators Semi-join and Anti-join Semi-join is generally used for un-nesting EXISTS, IN and ANY subqueries Consider: EMPLOYEE (Ssn, Bdate, Address, Sex, Salary, Dno) DEPARTMENT (Dnumber, Dname, Dmgrssn, Zipcode) Where a department is located in a specific zip code. Q(SJ): Counting the number of departments, that have employees, who make more than salary. SELECT COUNT(*) FROM DEPARTMENT D WHERE D.Dnumber IN (SELECT E.Dno FROM EMPLOYEE E WHERE E.Salary > ); The nested query is joined by the connector IN operator. 5

6 To remove the nested query: (SELECT E.Dno FROM EMPLOYEE WHERE E.Salary > ) is called as un-nesting. This query gives a set of Dno with employees who make more than It leads to the following query with an operation called semi-join, shown as S= SELECT COUNT(*) FROM EMPLOYEE E, DEPARTMENT D WHERE D.Dnumber S = E.Dno and E.Salary > ; Consider another query: Count the number of employees who do not work in department located in Zipcode Q(AJ): SELECT COUNT(*) FROM EMPLOYEE WHERE EMPLOYEE.Dno NOT IN (SELECT DEPARTMENT.Dnumber FROM DEPATMENT WHERE Zipcode = 30332); 6

7 SELECT COUNT(*) FROM EMPLOYEE DEPARTMENT WHERE EMPLOYEE.Dno A = DEPARTMENT AND Zipcode=30332 (Antijoin) A row of T1 is rejected as soon as T1.X finds a match with any value of T2.y. A row of T1 is returned, only if T1.x does not match with any value of T2.y. 2.Algorithms for External Sorting External sorting: sorting for large files or records stored on disk that do not fit entirely in memory Sorting most commonly used in databases. Ex. Whenever order by is used, it sorts results. Join, Union, Intersection, also use sorting. Sorting may be avoided if primary or clustering index is available. External sorting algorithms are used to sort large files on disk. A typical sort used for this is sort-merge algorithm. Requires buffer space in main memory, where sorting is done, which is called DBMS cache that is controlled by DBMS. Sort small sub files called runs, merge the sorted files. Sorting Phase: - Read a piece of file into buffer - Sort using internal sort algorithm - Write back to disk (as temporary disk space) 7

8 Merging Phase: - The sorted runs are merged into one or more merge phases - Each merge phase can have one or more merge steps - One block is used to keep merged results 8

9 Available buffer space = n B blocks Number of file blocks b If n B = 5 and b = 1024 Number of initial runs (each of size 5 blocks except the last one) = n R = гb/n B = 1024 / 5 = 205 After sort phase, 205 sorted runs are stored as temporary subfields on disk Degree of merging d M is the number of runs that can be merged in each pass d M = (n B 1) or n R (smaller of) d M = 5-1 = 4 (four way merging) n P = гlog dm (n R ) = number of passes = log 4 (205) = 4 Number of block accesses = (2 * b) + (sort phase, once to read and once to write) (2 * (b * log dm (n R ))) (merge phase) = 2 * (2 * (1024 * 4))) = = (205 -> >13 -> > 1) 9

10 10

11 3.Algorithms for SELECT Operation There are many algorithms for SELECT operation, basically a search operation to locate records to satisfy certain conditions. They depend on the access paths. The following operations are used to illustrate the algorithms: OP1: σ ssn= (EMPLOYEE) OP2: σ Dnumber > 5 (DEPARTMENT) OP3: σ Dno=5 (EMPLOYEE) OP4: σ Dno=5 AND Salary > AND Sex= F (EMPLOYEE) OP5: σ Essn= AND Pno=10 (WORKS_ON) OP6: An SQL Query SELECT * FROM EMPLOYEE WHERE Dno IN (3, 27, 49) OP7: An SQL Query SELECT First_name, Lname FROM EMPLOYEE WHERE ((Salary * Commission_pct) + Salary) > 15000; 11

12 Search Methods A number of search algorithms are possible for selecting records from a file, called file scans. (1) S1 Linear Search; read all related disk blocks into memory buffers and search for required conditions (2) S2 Binary Search: if equality condition on a key attribute; binary search is efficient (file is ordered on this attribute), OP1 (3) S3a Using a primary index; equality comparison and a key attribute; OP1; use primary index to retrieve the record; retrieves a single record (4) S3b Using a hash key: equality condition, use the hash key to retrieve a record; OP1 (5) S4 Using a primary index to retrieve multiple records; if the comparison is > >= < <=, key field with a primary index, Dnumber > 5; OP2; find a record satisfying the equality condition, then retrieve all preceding records (6) S5 Using a clustering field to retrieve multiple records: equality condition on a nonkey; OP3; retrieve all records (7) S6 Using a secondary (B+ index) on an equality comparison; single record if a key, multiple records for non-key attribute; can also be used for range queries; leaf nodes provide the ordering of values to select multiple values in a range SKIP other methods Search methods for conjunctive selection Made up of several conditions connected with AND 12

13 1. S8 Using an individual index: if an attribute has an access path, one of the methods S2 S6 can be used to retrieve the record; check if the retrieved record satisfies other conditions 2. S9 Using a composite index: if a composite index created for one or more conditions, then use it retrieve the records 3. S10 Using intersection of record pointers: if secondary indexes or access paths are available for more than one field, and the indexes include a record pointer, then each record can be retried and checked for a given condition; the intersection of these records can only be retrieved to check for the conjunctive condition Search methods for disjunctive selection OP4 example Uses OR operator in the Selection. Require to retrieve records that satisfy each condition and make a union of records and eliminate duplicates. If one of the attribute does not have an access path, then we need to use linear search to retrieve records. All the above steps for access paths can be used if they exist. Estimating the selectivity of a condition - System catalog can maintain some statistics on database attributes such as number of distinct values, number of records, frequency of data changes etc - The above information can be used in query optimization 13

14 4.Implementing the JOIN Operation - one of the most time consuming operation - equijoin or natural join are most commonly used operations - two-way JOIN involves joining two files - multi-way JOIN involves multiple files (number of possible ways to execute grows rapidly) - we discuss two-way JOINS here R A=B S A, B are the join attributes, domain compatible OP6: EMPLOYEE OP7: DEPARTMENT Dnumber=Dno DEPARTMENT Mgrssn=Ssn EMPLOYEE Implementing Joins (1) J1 Nested Loop Join (or nested block join): Default, no need to have access paths; test if this condition is true for each record; t[a] = s[b] (2) J2 Index based nested loop Join: use an access structure to retrieve the matching records; if an index or hash value exists for one of the records, B of the S, retrieve each record t in R s[b] = t[a] 14

15 (3) J3 Sort merge join: The records of R and S are physically sorted by the value of attributes A and B respectively; both files are scanned concurrently and matched for the same values of A and B (4) J4 partitioned has join (or just hash-join): R and S are partitioned into smaller files; the partitioning of each file is done using the same hash function h on the attributes A of S and B of R (partitioning phase); partitioning phase and probe phase R S. hash some records h(a) h(b).... (smaller file will fit into memory) Match records from S Continue the process. Join Performance OP6: EMPLOYEE Dnumber=Dno DEPARTMENT Assume buffer space is available Nested loop approach 15

16 Available buffer space = n B blocks Each memory buffer is same size as one disk block DEPARTMENT file consists of r D = 50 records stored in b D = 10 disk blocks EMPLOYEE file consists of e D = 6000 records stored in b E = 2000 disk blocks Outer loop file records should be read to memory (as many as possible) One block for the inner loop and one block to write the result and remaining n b -2 blocks for outer loop Read one block at a time for the inner loop file and use its records to probe (that is search) that outer loop blocks that are currently in memory for matching records; The contents of the result block are appended to the result file on the disk In the nested loop join, it makes a difference which file is chosen for the outer loop and which file is chosen for the inner loop; If EMPLOYEE is chosen for outer loop, each block of employee is read once; entire DEPARTMENT file (each of its blocks) read once for each time we read in n B -2 blocks of the EMPLOYEE file. Total number of blocks accessed for outer loop file = b E Number of times (n B -2) blocks of outer file are loaded into memory = гb E /(n B 2) 16

17 Total number of blocks accessed for inner loop file = b D * гb E /(n B 2) Total number of read accesses = b E + b D * гb E /(n B 2) = (2000/5) = 6000 block accesses Total number of read accesses (if DEPARTMENT is in outer loop) = b D + b E * гb D /(n B 2) = (10/5) = 4010 block accesses b RES block accesses are added to the above result for a total block accesses. JOIN Selection Factor and Performance A fraction of records in one file is joined with other file. It is called a join selection factor w.r.t equijoin OP7: DEPARTMENT Mgrssn=Ssn EMPLOYEE r D = 50 (50 records in the department); 50 records will be joined with 6000 employee records, but 5950 will not be matched. Suppose secondary indexes exist on Mgr_ssn and Ssn. The number of index levels are Xssn = 4 and Xmgr_ssn = 2; 17

18 Retrieve each EMPLOYEE record and then use the index Xmgr_ssn to find a matching record for the DEPARTMENT entries: The number of accesses = b E + (r E * (X mgr_ssn + 1)) = (6000 * 3)) = block accesses The second option is: retrieve each DEPARTMENT record and then use the index on Ssn of EMPLOYEE to find a matching manager: The number of accesses = b D + (r D * (X ssn + 1)) = 10 + (50 * 5)) = 260 SKIP and Algorithms for PROJECT and Set Operations <attribute-list> ( R) selects columns in the list of attributes SELECT Salary FROM EMPLOYEE; Produces a list of salary for all employees. It is linear search of all employees. If attribute list does not include a key attribute, then you have duplicate values in the result. If duplicate tuples exist, simply sort and remove the duplicates. Set operations UNION, INTERSECTION, DIFFERENCE, CARTESIAN PRODUCT are expensive Sort merge can be used for set operations. Cartesian product should be avoided and use JOIN instead. SKIP

19 SKIP Combining Operations Using Pipelining A query gets translated into a relational algebraic expression with many operations. A single operation at a time requires temporary files to hold results; these temporary files can be passed to next operation; this is called materialized evaluation. To avoid large temporary files, a combination of operations used to improve query execution. A JOIN can be combined with select operations on two input files and projection on one output file. It is common to generate query execution code directly to implement multiple operations. The output of one operation is fed as input of another operation; a series of pipeline to perform combined operations. Iterator for implementing physical operations - If the operator is implemented in such a way that it outputs one tuple at a time, then it is regarded as a iterator (nested join) - The iterator interfaces consists of the following methods: o Open() o GetNext() o Close() - Iterator concept can also be applied to access methods (B trees) 8.Parallel Algorithms for Query Processing 19

20 Parallel database architectures (distributed databases, big data and NOSQL) (1) Shared Memory Architectures a. Interconnection network, common memory b. Each processor has access to entire memory c. Contention for memory and not scalable, interference (2) Shared Disk Architectures a. Every processor has its own memory b. Every processor has access all disks through interconnection network c. Every processor may not have disk of its own (SANs, and NAS) (3) Shared-nothing Architectures a. Commonly used b. Its own memory and disk c. Inexpensive to build (racks of storage with processors) d. Allocating more processors and disks can achieve linear speedup (a) Operator-level Parallelism - Partition data across disks - Horizontal partitioning; distribution of tuples across disks based on some method; ith tuple to (i mod n); range partitioning (10 disks, 10 ranges for storing employee records); hash partitioning ith record assigned to h(i) disk - Sorting: sort on age and partition data - Selection based on some condition; range portioning - Projection and duplicate elimination: sorting tuples and discarding; sorting can use partitioning strategies 20

21 - JOIN: split the relations to be joined; join is divided into multiple smaller joins to perform in parallel on n processors and take a union of the result; Partitions must be non-overlapping on the join key (b) Intraquery Parallelism - A query execution graph can be modeled as a graph of operations - Provide appropriate data in the partition to execute in parallel - If operations do not depend upon one another, they can be run in parallel - If one of the operations can be generated tuple-by-tuple and fed to the other operation, then result is pipelined parallelism (c) Interquery Parallelism - Scale up the parallelism - Avoid cache coherency, concurrency control - Increase the overall rate at which transactions can be processed in parallel by increasing the processors 21

22 22

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list

More information

Chapter 19 Query Optimization

Chapter 19 Query Optimization Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply

More information

CSC 742 Database Management Systems

CSC 742 Database Management Systems CSC 742 Database Management Systems Topic #16: Query Optimization Spring 2002 CSC 742: DBMS by Dr. Peng Ning 1 Agenda Typical steps of query processing Two main techniques for query optimization Heuristics

More information

COSC344 Database Theory and Applications. σ a= c (P) S. Lecture 4 Relational algebra. π A, P X Q. COSC344 Lecture 4 1

COSC344 Database Theory and Applications. σ a= c (P) S. Lecture 4 Relational algebra. π A, P X Q. COSC344 Lecture 4 1 COSC344 Database Theory and Applications σ a= c (P) S π A, C (H) P P X Q Lecture 4 Relational algebra COSC344 Lecture 4 1 Overview Last Lecture Relational Model This Lecture ER to Relational mapping Relational

More information

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count

More information

Query Processing & Optimization. CS 377: Database Systems

Query Processing & Optimization. CS 377: Database Systems Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using

More information

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview The Relational Algebra Relational Algebra Relational algebra is the basic set of operations for the relational model These operations enable a user to specify basic retrieval requests (or queries) Relational

More information

Chapter 6 The Relational Algebra and Calculus

Chapter 6 The Relational Algebra and Calculus Chapter 6 The Relational Algebra and Calculus 1 Chapter Outline Example Database Application (COMPANY) Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary

More information

Chapter 8: The Relational Algebra and The Relational Calculus

Chapter 8: The Relational Algebra and The Relational Calculus Ramez Elmasri, Shamkant B. Navathe(2017) Fundamentals of Database Systems (7th Edition),pearson, isbn 10: 0-13-397077-9;isbn-13:978-0-13-397077-7. Chapter 8: The Relational Algebra and The Relational Calculus

More information

Chapter 6 Part I The Relational Algebra and Calculus

Chapter 6 Part I The Relational Algebra and Calculus Chapter 6 Part I The Relational Algebra and Calculus Copyright 2004 Ramez Elmasri and Shamkant Navathe Database State for COMPANY All examples discussed below refer to the COMPANY database shown here.

More information

More SQL: Complex Queries, Triggers, Views, and Schema Modification

More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

More SQL: Complex Queries, Triggers, Views, and Schema Modification

More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Algorithms for Query Processing and Optimization

Algorithms for Query Processing and Optimization Algorithms for Query Processing and Optimization In this chapter we discuss the techniques used by a DBMS to process, optimize, and execute high-level queries. A query expressed in a high-level query language

More information

RELATIONAL DATA MODEL: Relational Algebra

RELATIONAL DATA MODEL: Relational Algebra RELATIONAL DATA MODEL: Relational Algebra Outline 1. Relational Algebra 2. Relational Algebra Example Queries 1. Relational Algebra A basic set of relational model operations constitute the relational

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25-1 Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25-1 Chapter 25 Distributed Databases and Client-Server Architectures Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter 25 Outline

More information

Slides by: Ms. Shree Jaswal

Slides by: Ms. Shree Jaswal Slides by: Ms. Shree Jaswal Overview of SQL, Data Definition Commands, Set operations, aggregate function, null values, Data Manipulation commands, Data Control commands, Views in SQL, Complex Retrieval

More information

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,

More information

Chapter 8: Relational Algebra

Chapter 8: Relational Algebra Chapter 8: elational Algebra Outline: Introduction Unary elational Operations. Select Operator (σ) Project Operator (π) ename Operator (ρ) Assignment Operator ( ) Binary elational Operations. Set Operators

More information

More SQL: Complex Queries, Triggers, Views, and Schema Modification

More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

CS 377 Database Systems

CS 377 Database Systems CS 377 Database Systems Relational Algebra and Calculus Li Xiong Department of Mathematics and Computer Science Emory University 1 ER Diagram of Company Database 2 3 4 5 Relational Algebra and Relational

More information

A taxonomy of SQL queries Learning Plan

A taxonomy of SQL queries Learning Plan A taxonomy of SQL queries Learning Plan a. Simple queries: selection, projection, sorting on a simple table i. Small-large number of attributes ii. Distinct output values iii. Renaming attributes iv. Computed

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Wentworth Institute of Technology COMP570 Database Applications Fall 2014 Derbinsky. Physical Tuning. Lecture 10. Physical Tuning

Wentworth Institute of Technology COMP570 Database Applications Fall 2014 Derbinsky. Physical Tuning. Lecture 10. Physical Tuning Lecture 10 1 Context Influential Factors Knobs Denormalization Database Design Query Design Outline 2 Database Design and Implementation Process 3 Factors that Influence Attributes: Queries and Transactions

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 7 More SQL: Complex Queries, Triggers, Views, and Schema Modification Slide 7-2 Chapter 7 Outline More Complex SQL Retrieval Queries Specifying Semantic Constraints as Assertions and Actions as

More information

Database design process

Database design process Database technology Lecture 2: Relational databases and SQL Jose M. Peña jose.m.pena@liu.se Database design process 1 Relational model concepts... Attributes... EMPLOYEE FNAME M LNAME SSN BDATE ADDRESS

More information

Chapter 6 5/2/2008. Chapter Outline. Database State for COMPANY. The Relational Algebra and Calculus

Chapter 6 5/2/2008. Chapter Outline. Database State for COMPANY. The Relational Algebra and Calculus Chapter 6 The Relational Algebra and Calculus Chapter Outline Example Database Application (COMPANY) Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Chapter 17: Parallel Databases

Chapter 17: Parallel Databases Chapter 17: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of Parallel Systems Database Systems

More information

Chapter 6: RELATIONAL DATA MODEL AND RELATIONAL ALGEBRA

Chapter 6: RELATIONAL DATA MODEL AND RELATIONAL ALGEBRA Chapter 6: Relational Data Model and Relational Algebra 1 Chapter 6: RELATIONAL DATA MODEL AND RELATIONAL ALGEBRA RELATIONAL MODEL CONCEPTS The relational model represents the database as a collection

More information

A subquery is a nested query inserted inside a large query Generally occurs with select, from, where Also known as inner query or inner select,

A subquery is a nested query inserted inside a large query Generally occurs with select, from, where Also known as inner query or inner select, Sub queries A subquery is a nested query inserted inside a large query Generally occurs with select, from, where Also known as inner query or inner select, Result of the inner query is passed to the main

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018 Introduction to SQL ECE 650 Systems Programming & Engineering Duke University, Spring 2018 SQL Structured Query Language Major reason for commercial success of relational DBs Became a standard for relational

More information

Relational Algebra 1

Relational Algebra 1 Relational Algebra 1 Motivation The relational data model provides a means of defining the database structure and constraints NAME SALARY ADDRESS DEPT Smith 50k St. Lucia Printing Dilbert 40k Taringa Printing

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS

More information

ECE 650 Systems Programming & Engineering. Spring 2018

ECE 650 Systems Programming & Engineering. Spring 2018 ECE 650 Systems Programming & Engineering Spring 2018 Relational Databases: Tuples, Tables, Schemas, Relational Algebra Tyler Bletsch Duke University Slides are adapted from Brian Rogers (Duke) Overview

More information

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact: Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base

More information

SQL: A COMMERCIAL DATABASE LANGUAGE. Complex Constraints

SQL: A COMMERCIAL DATABASE LANGUAGE. Complex Constraints SQL: A COMMERCIAL DATABASE LANGUAGE Complex Constraints Outline 1. Introduction 2. Data Definition, Basic Constraints, and Schema Changes 3. Basic Queries 4. More complex Queries 5. Aggregate Functions

More information

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 Outline More Complex SQL Retrieval

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:

More information

University of Waterloo Midterm Examination Sample Solution

University of Waterloo Midterm Examination Sample Solution 1. (4 total marks) University of Waterloo Midterm Examination Sample Solution Winter, 2012 Suppose that a relational database contains the following large relation: Track(ReleaseID, TrackNum, Title, Length,

More information

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and Chapter 6 The Relational Algebra and Relational Calculus Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 Outline Unary Relational Operations: SELECT and PROJECT Relational

More information

The Relational Algebra

The Relational Algebra The Relational Algebra Relational Algebra Relational algebra is the basic set of operations for the relational model These operations enable a user to specify basic retrieval requests (or queries) 27-Jan-14

More information

Chapter 8 SQL-99: Schema Definition, Basic Constraints, and Queries

Chapter 8 SQL-99: Schema Definition, Basic Constraints, and Queries Copyright 2004 Pearson Education, Inc. Chapter 8 SQL-99: Schema Definition, Basic Constraints, and Queries Copyright 2004 Pearson Education, Inc. 1 Data Definition, Constraints, and Schema Changes Used

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

SQL: Advanced Queries, Assertions, Triggers, and Views. Copyright 2012 Ramez Elmasri and Shamkant B. Navathe

SQL: Advanced Queries, Assertions, Triggers, and Views. Copyright 2012 Ramez Elmasri and Shamkant B. Navathe SQL: Advanced Queries, Assertions, Triggers, and Views Copyright 2012 Ramez Elmasri and Shamkant B. Navathe NULLS IN SQL QUERIES SQL allows queries that check if a value is NULL (missing or undefined or

More information

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large Chapter 20: Parallel Databases Introduction! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems!

More information

Chapter 20: Parallel Databases

Chapter 20: Parallel Databases Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

Chapter 20: Parallel Databases. Introduction

Chapter 20: Parallel Databases. Introduction Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

Database Technology. Topic 3: SQL. Olaf Hartig.

Database Technology. Topic 3: SQL. Olaf Hartig. Olaf Hartig olaf.hartig@liu.se Structured Query Language Declarative language (what data to get, not how) Considered one of the major reasons for the commercial success of relational databases Statements

More information

Overview Relational data model

Overview Relational data model Thanks to José and Vaida for most of the slides. Relational databases and MySQL Juha Takkinen juhta@ida.liu.se Outline 1. Introduction: Relational data model and SQL 2. Creating tables in Mysql 3. Simple

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Relational Algebra Part I. CS 377: Database Systems

Relational Algebra Part I. CS 377: Database Systems Relational Algebra Part I CS 377: Database Systems Recap of Last Week ER Model: Design good conceptual models to store information Relational Model: Table representation with structures and constraints

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases Chapter 18: Parallel Databases Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery

More information

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

SQL STRUCTURED QUERY LANGUAGE

SQL STRUCTURED QUERY LANGUAGE STRUCTURED QUERY LANGUAGE SQL Structured Query Language 4.1 Introduction Originally, SQL was called SEQUEL (for Structured English QUery Language) and implemented at IBM Research as the interface for an

More information

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao Chapter 5 Nguyen Thi Ai Thao thaonguyen@cse.hcmut.edu.vn Spring- 2016 Contents 1 Unary Relational Operations 2 Operations from Set Theory 3 Binary Relational Operations 4 Additional Relational Operations

More information

COSC344 Database Theory and Applications. Lecture 5 SQL - Data Definition Language. COSC344 Lecture 5 1

COSC344 Database Theory and Applications. Lecture 5 SQL - Data Definition Language. COSC344 Lecture 5 1 COSC344 Database Theory and Applications Lecture 5 SQL - Data Definition Language COSC344 Lecture 5 1 Overview Last Lecture Relational algebra This Lecture Relational algebra (continued) SQL - DDL CREATE

More information

Wentworth Institute of Technology COMP2670 Databases Spring 2016 Derbinsky. Physical Tuning. Lecture 12. Physical Tuning

Wentworth Institute of Technology COMP2670 Databases Spring 2016 Derbinsky. Physical Tuning. Lecture 12. Physical Tuning Lecture 12 1 Context Influential Factors Knobs Database Design Denormalization Query Design Outline 2 Database Design and Implementation Process 3 Factors that Influence Attributes w.r.t. Queries/Transactions

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

PES Institute of Technology Bangalore South Campus (1 K.M before Electronic City,Bangalore ) Department of MCA. Solution Set - Test-II

PES Institute of Technology Bangalore South Campus (1 K.M before Electronic City,Bangalore ) Department of MCA. Solution Set - Test-II PES Institute of Technology Bangalore South Campus (1 K.M before Electronic City,Bangalore 560100 ) Solution Set - Test-II Sub: Database Management Systems 16MCA23 Date: 04/04/2017 Sem & Section:II Duration:

More information

Simple SQL Queries (contd.)

Simple SQL Queries (contd.) Simple SQL Queries (contd.) Example of a simple query on one relation Query 0: Retrieve the birthdate and address of the employee whose name is 'John B. Smith'. Q0: SELECT BDATE, ADDRESS FROM EMPLOYEE

More information

RELATIONAL OPERATORS #1

RELATIONAL OPERATORS #1 RELATIONAL OPERATORS #1 CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Algorithms for relational operators: select project 2 ARCHITECTURE OF A DBMS query

More information

CSC 261/461 Database Systems Lecture 19

CSC 261/461 Database Systems Lecture 19 CSC 261/461 Database Systems Lecture 19 Fall 2017 Announcements CIRC: CIRC is down!!! MongoDB and Spark (mini) projects are at stake. L Project 1 Milestone 4 is out Due date: Last date of class We will

More information

SQL-99: Schema Definition, Basic Constraints, and Queries. Create, drop, alter Features Added in SQL2 and SQL-99

SQL-99: Schema Definition, Basic Constraints, and Queries. Create, drop, alter Features Added in SQL2 and SQL-99 SQL-99: Schema Definition, Basic Constraints, and Queries Content Data Definition Language Create, drop, alter Features Added in SQL2 and SQL-99 Basic Structure and retrieval queries in SQL Set Operations

More information

University of Waterloo Midterm Examination Solution

University of Waterloo Midterm Examination Solution University of Waterloo Midterm Examination Solution Winter, 2011 1. (6 total marks) The diagram below shows an extensible hash table with four hash buckets. Each number x in the buckets represents an entry

More information

SQL. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe

SQL. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe SQL Copyright 2013 Ramez Elmasri and Shamkant B. Navathe Data Definition, Constraints, and Schema Changes Used to CREATE, DROP, and ALTER the descriptions of the tables (relations) of a database Copyright

More information

Session Active Databases (2+3 of 3)

Session Active Databases (2+3 of 3) INFO-H-415 - Advanced Databes Session 2+3 - Active Databes (2+3 of 3) Consider the following databe schema: DeptLocation DNumber DLocation Employee FName MInit LName SSN BDate Address Sex Salary SuperSSN

More information

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E) 1 NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E) 2 LECTURE OUTLINE More Complex SQL Retrieval Queries Self-Joins Renaming Attributes and Results Grouping, Aggregation, and Group Filtering

More information

Advanced Databases: Parallel Databases A.Poulovassilis

Advanced Databases: Parallel Databases A.Poulovassilis 1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger

More information

2.2.2.Relational Database concept

2.2.2.Relational Database concept Foreign key:- is a field (or collection of fields) in one table that uniquely identifies a row of another table. In simpler words, the foreign key is defined in a second table, but it refers to the primary

More information

CMPUT 391 Database Management Systems. Query Processing: The Basics. Textbook: Chapter 10. (first edition: Chapter 13) University of Alberta 1

CMPUT 391 Database Management Systems. Query Processing: The Basics. Textbook: Chapter 10. (first edition: Chapter 13) University of Alberta 1 CMPUT 391 Database Management Systems Query Processing: The Basics Textbook: Chapter 10 (first edition: Chapter 13) Based on slides by Lewis, Bernstein and Kifer University of Alberta 1 External Sorting

More information

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you

More information

ECE 650 Systems Programming & Engineering. Spring 2018

ECE 650 Systems Programming & Engineering. Spring 2018 ECE 650 Systems Programming & Engineering Spring 2018 Introduction to SQL Tyler Bletsch Duke University Slides are adapted from Brian Rogers (Duke) Structured Query Language SQL Major reason for commercial

More information

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Query Processing. Solutions to Practice Exercises Query:

Query Processing. Solutions to Practice Exercises Query: C H A P T E R 1 3 Query Processing Solutions to Practice Exercises 13.1 Query: Π T.branch name ((Π branch name, assets (ρ T (branch))) T.assets>S.assets (Π assets (σ (branch city = Brooklyn )(ρ S (branch)))))

More information

Query Processing. Introduction to Databases CompSci 316 Fall 2017

Query Processing. Introduction to Databases CompSci 316 Fall 2017 Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2

More information

TotalCost = 3 (1, , 000) = 6, 000

TotalCost = 3 (1, , 000) = 6, 000 156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Relational DB Languages Relational Algebra, Calculus, SQL Lecture 05 zain 1 Introduction Relational algebra & relational calculus are formal languages associated with the relational

More information

Midterm Review. March 27, 2017

Midterm Review. March 27, 2017 Midterm Review March 27, 2017 1 Overview Relational Algebra & Query Evaluation Relational Algebra Rewrites Index Design / Selection Physical Layouts 2 Relational Algebra & Query Evaluation 3 Relational

More information

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse

More information

Query Processing: The Basics. External Sorting

Query Processing: The Basics. External Sorting Query Processing: The Basics Chapter 10 1 External Sorting Sorting is used in implementing many relational operations Problem: Relations are typically large, do not fit in main memory So cannot use traditional

More information

The Relational Algebra and Calculus. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe

The Relational Algebra and Calculus. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe The Relational Algebra and Calculus Copyright 2013 Ramez Elmasri and Shamkant B. Navathe Chapter Outline Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from

More information

Overview of Implementing Relational Operators and Query Evaluation

Overview of Implementing Relational Operators and Query Evaluation Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders

More information

Chapter 17 Indexing Structures for Files and Physical Database Design

Chapter 17 Indexing Structures for Files and Physical Database Design Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to

More information

Relational Calculus: 1

Relational Calculus: 1 CSC 742 Database Management Systems Topic #8: Relational Calculus Spring 2002 CSC 742: DBMS by Dr. Peng Ning 1 Relational Calculus: 1 Can define the information to be retrieved not any specific series

More information

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement Chapter 4 Basic SQL Basic SQL SQL language Considered one of the major reasons for the commercial success of relational databases SQL Structured Query Language Statements for data definitions, queries,

More information

Relational Database Systems 2 5. Query Processing

Relational Database Systems 2 5. Query Processing Relational Database Systems 2 5. Query Processing Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 5 Query Processing 5.1 Introduction:

More information