Database Management Systems (COP 5725) Homework 3

Instructor: Dr. Daisy Zhe Wang
TAs: Yang Chen, Kun Li, Yang Peng ({yang, kli, ypeng}@cise.ufl.edu)
November 26, 2013

Name:
UFID:
Address:

Pledge (must be signed according to the UF Honor Code): On my honor, I have neither given nor received unauthorized aid in doing this assignment.
Signature:

For grading use only:
Question: I II III IV Total
Points:
Score:

COP5725, Fall 2013, Homework 3

I. [26 points] Indexing.

Make the following assumptions for questions (1) and (2):
A bucket can hold two keys and a pointer.
The initial database D contains one object with key [value missing].
Six objects with the following keys are inserted into D in the following order: 00110, 11010, 10011, 01010, 10110, and [value missing].

(1) [3 points] Assume that an extensible hash table is used to index the database. Show the index structure after the insertions.

(2) [3 points] Assume that a linear hash table is used to index the database, with the restriction that at most 80% of the hash table can be full at any time. Show the index structure after the insertions.

[Figure: index structures for (1) (left) and (2) (right). Surviving labels: i = 3, n = 5; the other value of i and the value of r are missing.]

(3) [3 points] Describe a scenario in which extensible hash tables are preferred over linear hash tables. Explain your answer.

Two possibilities:
When insertions are very frequent. A linear hash table is reorganized every time a new bucket is added, and with frequent insertions a new bucket is added frequently.
When the keys that the index is built on are uniformly distributed (i.e., the hash table is filled uniformly).

For (4)-(6), consider a B+ tree whose nodes contain up to 4 keys (5 pointers).

(4) [3 points] Bulkload the B+ tree with values 46, 10, 70, 49, 23, 40, 59, 29, 34, 54, 75, 30.
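The bulk-loading step in (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the graded solution (the solution figure did not survive in this transcript): the function name `bulk_load` is hypothetical, leaves are assumed to be packed completely full (4 keys each), and each separator key in a parent is assumed to be the smallest key of the corresponding child subtree. A real bulk loader might leave leaves partially filled or rebalance an underfull last node.

```python
def bulk_load(keys, max_keys=4):
    """Build a B+ tree bottom-up and return its levels, root level first.

    Assumptions (not from the homework): leaves are filled to 100%,
    and a separator is the smallest key in the child to its right.
    """
    ordered = sorted(keys)
    # Leaf level: pack max_keys sorted keys into each leaf.
    level = [ordered[i:i + max_keys] for i in range(0, len(ordered), max_keys)]
    mins = [leaf[0] for leaf in level]  # smallest key under each node
    levels = [level]
    fanout = max_keys + 1  # up to 5 pointers per internal node
    while len(level) > 1:
        parents, parent_mins = [], []
        for i in range(0, len(level), fanout):
            child_mins = mins[i:i + fanout]
            parents.append(child_mins[1:])     # one separator per extra child
            parent_mins.append(child_mins[0])  # smallest key of whole subtree
        level, mins = parents, parent_mins
        levels.append(level)
    return levels[::-1]


tree = bulk_load([46, 10, 70, 49, 23, 40, 59, 29, 34, 54, 75, 30])
for depth, nodes in enumerate(tree):
    print(depth, nodes)
```

Under these assumptions the twelve keys fill three leaves and the root holds the two separators 34 and 54, so the later inserts in (5) would land in full leaves and force splits.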

(5) [3 points] Show the resulting B+ tree after inserting values 80, 24, [value missing].

(6) [3 points] Based on the B+ tree in (5), show the resulting B+ tree after deleting values 10, [value missing].

(7) [8 points] Fill in the cost table below for Alternative 1 ISAM and B+ tree indices. Assume each index takes P pages on disk, has height H, and has fanout F at each internal node. Assume there are R tuples in the relation, and B tuples fit on a leaf (or overflow) page. In each case, assume infinite buffer pool size, but the buffer pool starts out empty. For each page that gets dirty, add 1 to your I/O cost, since it will eventually have to be flushed to disk. For ISAM, assume that a leaf node maintains only a pointer to the beginning of an overflow list. Given the constraints of a B+ tree/ISAM, assume whatever data you want in the tree for each case below.

Worst-case # I/Os for range query:
  ISAM: P (the index consists of a root with a linear string of overflow pages; all overflow pages must be examined since they are not sorted), or H + R/B, or H + (F^H − 1) + R/B (look at the whole leaf level and all data in the last leaf's overflow list).
  B+ tree: H + F^H (the range query covers the whole table), or H + R/B. P was not accepted here, as this would imply only 2 I/Os, given the structure of the index.

Worst-case # I/Os for insert:
  ISAM: P + 2 (the index consists of a root with a string of overflow pages; we need to scan to the end, add a new overflow page in the worst case, and update the previous last overflow page with a pointer), or H + R/B + 2.
  B+ tree: 3H + 1 (every node needs to split, +1 for the new root. The pages we are going to split are read on the way down, so we don't need to read them again.)

II. [25 points] Query Evaluation.

Suppose we want to compute (R(a, b) ⋈ S(a, c)) ⋈ T(a, d) in the order indicated. We have M = 101 main memory buffers, and the number of disk blocks (pages) for R and S is B(R) = B(S) = 2,000. Now we decide to use one-pass or two-pass sort-merge-join algorithms to implement the query.

(1) [2 points] Would you use a one- or two-pass sort-merge-join for R ⋈ S? Explain.

Two-pass sort-merge-join, since both operands are larger than main memory.

(2) We shall use the appropriate number of passes for the second join, first dividing T into some number of sublists sorted by a, and merging them with the sorted and pipelined stream of tuples from the join R ⋈ S. For what values of B(T) should we choose, for the join of T with R ⋈ S:

i. [3 points] A one-pass join; i.e., we read T into memory, and compare its tuples with the tuples of R ⋈ S as they are generated.

B(T) ≤ 60.

ii. [3 points] A two-pass join; i.e., we create sorted sublists for T and keep one buffer in memory for each sorted sublist, while we generate tuples of R ⋈ S.

B(T) > 60.

iii. [4 points] For the cases in i. and ii., what is the total number of disk I/Os (in terms of B(T))?

For i., we need 3(B(R) + B(S)) = 3(2,000 + 2,000) = 12,000 I/Os to perform the two-pass sort-merge-join of R and S, and B(T) I/Os to read T in the one-pass join of (R ⋈ S) ⋈ T. The total number of I/Os is 12,000 + B(T).
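The accounting for case i. can be checked with a small sketch, shown here next to the two-pass case that question ii. describes (2B(T) to sort T into sublists plus B(T) to read them back). The function names are hypothetical, and two values are inferred rather than printed in this transcript: B(R) = B(S) = 2,000, which is what 3(B(R) + B(S)) = 12,000 implies, and the one-pass condition B(T) ≤ 60, taken as the complement of B(T) > 60.

```python
def join_rs_io(b_r, b_s):
    """I/Os for the two-pass sort-merge-join of R and S: 3(B(R) + B(S))."""
    return 3 * (b_r + b_s)


def total_io(b_r, b_s, b_t, one_pass_limit=60):
    """Total I/Os for (R join S) join T on attribute a.

    one_pass_limit is the assumed largest B(T) that still fits in the
    buffers left over while R join S is being merged.
    """
    base = join_rs_io(b_r, b_s)
    if b_t <= one_pass_limit:
        return base + b_t       # one pass: read T once
    return base + 3 * b_t       # two pass: sort T (2B(T)) + read sublists (B(T))


print(total_io(2000, 2000, 50))   # one-pass case
print(total_io(2000, 2000, 100))  # two-pass case
```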

For ii., we need 2B(T) disk I/Os to sort T into sublists; 12,000 disk I/Os to join R ⋈ S; and B(T) to read the sorted sublists of T. The total number of disk I/Os is 12,000 + 3B(T).

(3) [4 points] Consider the query (R(a, b) ⋈ S(a, c)) ⋈ T(c, d), i.e., the second join is based on attribute c instead of a. How would you choose the join algorithms? Provide a new cost estimation if your choices differ from (2).

We need to re-sort the intermediate result R ⋈ S on attribute c. The new cost is 12,000 + 3B(T) + 2B(R ⋈ S), where the 2B(R ⋈ S) term comes from writing out the sorted sublists of R ⋈ S and reading them in again while joining (R ⋈ S) ⋈ T.

For (4)-(6), you are given M memory blocks and a relation R.

(4) [3 points] Describe a two-pass hash-based algorithm for duplicate elimination, δ(R). (Hint: review the aggregation algorithm with grouping.)

Hash R into M − 1 buckets based on all attributes. Perform δ on each bucket in isolation, using M memory blocks.

(5) [3 points] What is the largest relation your algorithm can handle given M blocks of main memory?

M(M − 1).

(6) [3 points] What is the number of disk I/Os of your algorithm?

B(R) for reading R and hashing; B(R) for writing out the buckets; B(R) for reading the buckets and performing the actual duplicate elimination. 3B(R) in total.

III. [24 points] Query Optimization.

Consider the following database schema:

Employees(eid: integer, ename: string, sal: integer, title: string, age: integer)

Suppose that the following indexes, all using Alternative (2) for data entries, exist: a hash index on eid, a B+ tree index on sal, a hash index on age, and a clustered B+ tree index on (age, sal). Each Employees record is 100 bytes long, and you can assume that each index data entry is 20 bytes long. The Employees relation contains 10,000 pages.
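The record and data-entry sizes just stated fix how large an Alternative (2) index is relative to the file, which is useful when costing access paths. The sketch below is a back-of-the-envelope estimate only: the function name is hypothetical, and it assumes data entries are packed as densely as records and ignores non-leaf index pages.

```python
def estimated_index_pages(data_pages, record_bytes, entry_bytes):
    """Estimate leaf pages of an Alternative (2) index.

    Assumption: one data entry per record, with the same page fill as
    the data file, so the index shrinks by entry_bytes / record_bytes.
    """
    return data_pages * entry_bytes // record_bytes


# Employees: 10,000 pages, 100-byte records, 20-byte data entries.
print(estimated_index_pages(10_000, 100, 20))
```

With a reduction factor of 0.1, such an estimate means a matching scan would touch roughly a tenth of those index pages in addition to the qualifying data pages.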
(1) Consider each of the following selection conditions and, assuming that the reduction factor (RF) for each term that matches an index is 0.1, compute the cost of the most selective access path for retrieving all Employees tuples that satisfy the condition (in terms of the number of I/Os):

i. [4 points] age=25.

The clustered B+ tree index would be the best option here, with a cost of 2 (lookup) + 10,000 (data pages) × 0.1 + 2,000 (index pages) × 0.1 = 1,202. Although the hash index has a shorter lookup time, the potential number of record lookups (10,000 pages × 0.1 × 20 tuples per page = 20,000) renders the clustered index more efficient.

ii. [4 points] sal>200 AND age>30 AND title='CFO'.

Here an age condition is present, so the clustered B+ tree index on (age, sal) can be used. The cost is 2 (lookup) + 2,000 × 0.1 (all index pages satisfying age>30 need to be fetched) + 10,000 × 0.1 × 0.1 (data pages) = 302.

Consider the following relational schema and SQL query:

Emp(eid, did, sal, hobby)
Dept(did, dname, floor, phone)
Finance(did, budget, sales, expences)

SELECT D.dname, F.budget
FROM Emp E, Dept D, Finance F
WHERE E.did = D.did AND D.did = F.did AND D.floor = 1 AND E.sal >= 59000 AND E.hobby = 'yodelling';

(2) [5 points] Identify a query plan that a decent query optimizer would choose.

π D.dname, F.budget (
    ( π E.did ( σ E.sal ≥ 59000, E.hobby="yodelling" (E) )
      ⋈ π D.did, D.dname ( σ D.floor=1 (D) ) )
    ⋈ π F.did, F.budget (F) )

(3) Suppose that the following additional information is available:

Unclustered B+ tree indexes exist on Emp.did, Emp.sal, Dept.did, and Finance.did (each leaf page contains up to 200 entries).
The system's statistics indicate that employee salaries range from 10,000 to 60,000, and that employees enjoy 200 different hobbies. The company owns two floors in the building.
There are a total of 50,000 employees and 5,000 departments (each with corresponding financial information) in the database.
The DBMS used by the company has just one join method available, namely index nested loops.

i. [3 points] For each of the query's base relations, estimate the number of tuples that would be initially selected from that relation if all of the non-join predicates on that relation were applied to it before any join processing begins.

Emp: 50,000 × (1/50) × (1/200) = 5. Dept: 5,000 × (1/2) = 2,500. Finance: 5,000.

ii. [8 points] Under the System R approach, determine a join order that has the least estimated cost. Compute the cost of your plan (in terms of the number of disk I/Os).

((D ⋈ E) ⋈ F).

First, we use the B+ tree index on salary to retrieve the tuples from E such that E.sal >= 59000. We estimate that 50,000/50 = 1,000 such tuples are selected, at a cost of 1 tree traversal (say 3 I/Os to get to the leaf) + the cost of scanning the leaf pages (1,000/200 = 5) + the cost of retrieving the 1,000 tuples (since the index is unclustered, each tuple is potentially 1 disk I/O) = 3 + 5 + 1,000 = 1,008.

From these 1,000 retrieved tuples, select on the fly only those that have hobby = "yodelling"; we estimate there will be 1,000/200 = 5 such tuples.

Pipeline these 5 tuples from E one at a time to D. By using the B+ tree index on D.did and the fact that D.did is a key, we can find the matching tuples for the join by searching the D.did B+ tree and retrieving at most 1 matching tuple per tuple from E. The cost of E ⋈ D is hence the total cost of the index nested loop: 5 × (tree traversal of the D.did B+ tree + record retrieval) = 5 × (3 + 1) = 20.

Now select on the fly the ⌈5/2⌉ = 3 tuples that have D.floor = 1 and pipeline them to the next level, F. (This is done after E ⋈ D is done.) Use the B+ tree index on F.did and the fact that F.did is a key to retrieve at most 1 tuple for each of the 3 pipelined tuples. This cost is at most 3 × (3 + 1) = 12.

Ignoring the cost of writing out the final result, we get a total cost of 1,008 + 20 + 12 = 1,040.

IV. [25 points] Transactions and Concurrency Control.

(1) For each of the following schedules:

a) r1(A); r2(B); w1(B); w2(C); r3(C); w3(A);
b) r1(A); r2(A); r1(B); r2(B); r3(A); r4(B); w1(A); w2(B);

Answer the following questions:

i. [4 points] What is the precedence graph for the schedule?
ii. [4 points] Is the schedule conflict-serializable? If so, what is an equivalent serial schedule?

a) i. T2 → T1, T2 → T3, T1 → T3.
   ii. Yes; equivalent serial schedule: T2, T1, T3.
b) i. T2 → T1, T3 → T1, T1 → T2, T4 → T2.
   ii. No; there is a cycle in the precedence graph (T2 → T1, T1 → T2).

(2) [17 points] Consider the following two transactions:

T1: w1(C); r1(A); w1(A); r1(B); w1(B);
T2: r2(B); w2(B); r2(A); w2(A);

Say our scheduler performs exclusive locking only (i.e., no shared locks). For each of the following three instances of transactions T1 and T2 annotated with lock and unlock actions, say whether the annotated transactions:

1. obey two-phase locking,
2. will necessarily result in a conflict-serializable schedule (if no deadlock occurs),
3. will necessarily result in a strict schedule (if no deadlock occurs),
4. will necessarily result in a serial schedule (if no deadlock occurs), and
5. may result in a deadlock.

a) T1: l1(B); l1(C); w1(C); l1(A); r1(A); w1(A); r1(B); w1(B); Commit; u1(A); u1(C); u1(B);
   T2: l2(B); r2(B); w2(B); l2(A); r2(A); w2(A); Commit; u2(A); u2(B);

b) T1: l1(C); l1(A); r1(A); w1(C); w1(A); l1(B); r1(B); w1(B); u1(A); u1(C); u1(B); Commit;
   T2: l2(B); r2(B); w2(B); l2(A); r2(A); w2(A); Commit; u2(A); u2(B);

c) T1: l1(C); w1(C); l1(A); r1(A); w1(A); l1(B); r1(B); w1(B); Commit; u1(A); u1(C); u1(B);
   T2: l2(B); r2(B); w2(B); l2(A); r2(A); w2(A); Commit; u2(A); u2(B);

Format your answer in a table with Yes(Y)/No(N) entries.

      2PL   Necessarily conflict-   Necessarily strict   Necessarily serial   May result
            serializable            schedule             schedule             in deadlock
a)    Y     Y                       Y                    Y                    N
b)    Y     Y                       N                    Y                    Y
c)    Y     Y                       Y                    N                    Y
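The precedence graphs in part (1) of this section can be reproduced mechanically. The sketch below is one possible way to do it, with hypothetical helper names: it adds an edge Ti → Tj whenever an action of Ti precedes a conflicting action of Tj (same item, different transactions, at least one write), then looks for a serial order by repeatedly removing a transaction with no incoming edge. Transactions that take part in no conflict do not appear in the graph.

```python
from itertools import combinations


def parse(schedule):
    """Turn 'r1(A); w2(B);' into [('r', 1, 'A'), ('w', 2, 'B')]."""
    ops = []
    for token in schedule.replace(" ", "").split(";"):
        if token:
            paren = token.index("(")
            ops.append((token[0], int(token[1:paren]), token[paren + 1:-1]))
    return ops


def precedence_edges(schedule):
    """Edges (Ti, Tj): an action of Ti conflicts with a later action of Tj."""
    edges = set()
    # combinations() keeps schedule order, so the first op is the earlier one.
    for (k1, t1, x1), (k2, t2, x2) in combinations(parse(schedule), 2):
        if t1 != t2 and x1 == x2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return edges


def serial_order(edges):
    """Return a topological order of the graph, or None if it has a cycle."""
    nodes = {t for edge in edges for t in edge}
    remaining = set(edges)
    order = []
    while nodes:
        free = sorted(n for n in nodes if all(dst != n for _, dst in remaining))
        if not free:
            return None  # every remaining node has a predecessor: cycle
        order.append(free[0])
        nodes.remove(free[0])
        remaining = {(s, d) for s, d in remaining if s != free[0]}
    return order


a = "r1(A); r2(B); w1(B); w2(C); r3(C); w3(A);"
b = "r1(A); r2(A); r1(B); r2(B); r3(A); r4(B); w1(A); w2(B);"
print(sorted(precedence_edges(a)), serial_order(precedence_edges(a)))
print(sorted(precedence_edges(b)), serial_order(precedence_edges(b)))
```

For schedule a) this yields the order T2, T1, T3; for schedule b) it reports no serial order because of the cycle between T1 and T2.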

Question 1 (a) 10 marks

Question 1 (a) 10 marks Question 1 (a) Consider the tree in the next slide. Find any/ all violations of a B+tree structure. Identify each bad node and give a brief explanation of each error. Assume the order of the tree is 4

More information

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly

More information

192 Chapter 14. TotalCost=3 (1, , 000) = 6, 000

192 Chapter 14. TotalCost=3 (1, , 000) = 6, 000 192 Chapter 14 5. SORT-MERGE: With 52 buffer pages we have B> M so we can use the mergeon-the-fly refinement which costs 3 (M + N). TotalCost=3 (1, 000 + 1, 000) = 6, 000 HASH JOIN: Now both relations

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems

Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems HAND IN Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems Final Examination December 14, 2002 Instructor: Pat Martin Instructions: 1. This examination

More information

CS 564 Final Exam Fall 2015 Answers

CS 564 Final Exam Fall 2015 Answers CS 564 Final Exam Fall 015 Answers A: STORAGE AND INDEXING [0pts] I. [10pts] For the following questions, clearly circle True or False. 1. The cost of a file scan is essentially the same for a heap file

More information

QUERY OPTIMIZATION [CH 15]

QUERY OPTIMIZATION [CH 15] Spring 2017 QUERY OPTIMIZATION [CH 15] 4/12/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Example SELECT distinct ename FROM Emp E, Dept D WHERE E.did = D.did and D.dname = Toy EMP

More information

TotalCost = 3 (1, , 000) = 6, 000

TotalCost = 3 (1, , 000) = 6, 000 156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about

More information

CS 245 Midterm Exam Winter 2014

CS 245 Midterm Exam Winter 2014 CS 245 Midterm Exam Winter 2014 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 70 minutes

More information

IMPORTANT: Circle the last two letters of your class account:

IMPORTANT: Circle the last two letters of your class account: Fall 2001 University of California, Berkeley College of Engineering Computer Science Division EECS Prof. Michael J. Franklin FINAL EXAM CS 186 Introduction to Database Systems NAME: STUDENT ID: IMPORTANT:

More information

Optimizing logical query plans

Optimizing logical query plans Optimizing logical query plans Exercises Academic year 2017-2018 Algebraic laws Exercise 1. Consider the following relational schema: Hotel(id, name, address Room(rid, hid, type, price Booking(hid, gid,

More information

IMPORTANT: Circle the last two letters of your class account:

IMPORTANT: Circle the last two letters of your class account: Spring 2011 University of California, Berkeley College of Engineering Computer Science Division EECS MIDTERM I CS 186 Introduction to Database Systems Prof. Michael J. Franklin NAME: STUDENT ID: IMPORTANT:

More information

Optimization of Logical Queries

Optimization of Logical Queries Optimization of Logical Queries Task: Consider the following relational schema: Hotel(hid, name, address) Room(rid, hid, type, price) Booking(hid, gid, date from, date to, rid) Guest(gid, name, address)

More information

Final Review. CS 377: Database Systems

Final Review. CS 377: Database Systems Final Review CS 377: Database Systems Final Logistics May 3rd, 3:00-5:30 M 8 single-sided handwritten cheat sheets Comprehensive covering everything up to current class Focus slightly more on the latter

More information

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15 Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15 1 Query Optimization NOTE: SQL provides many ways to express a query. HENCE: System has many options for evaluating

More information

Database Management Systems (COP 5725) Homework 2

Database Management Systems (COP 5725) Homework 2 Database Management Systems (COP 5725) Homework 2 Instructor: Dr. Daisy Zhe Wang TAs: Yang Chen, Kun Li, Yang Peng yang, kli, ypeng@cise.uf l.edu October 8, 2013 Name: UFID: Email Address: Pledge(Must

More information

EECS 647: Introduction to Database Systems

EECS 647: Introduction to Database Systems EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009 External Sorting Today s Topic Implementing the join operation 4/8/2009 Luke Huan Univ. of Kansas 2 Review DBMS Architecture

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2009 Quiz I Solutions

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2009 Quiz I Solutions Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.830 Database Systems: Fall 2009 Quiz I Solutions There are 15 questions and 12 pages in this quiz booklet.

More information

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados -------------------------------------------------------------------------------------------------------------- INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados Exam 1 - Solution

More information

CompSci 516: Database Systems

CompSci 516: Database Systems CompSci 516 Database Systems Lecture 9 Index Selection and External Sorting Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 Announcements Private project threads created on piazza

More information

Storage and Indexing

Storage and Indexing CompSci 516 Data Intensive Computing Systems Lecture 5 Storage and Indexing Instructor: Sudeepa Roy Duke CS, Spring 2016 CompSci 516: Data Intensive Computing Systems 1 Announcement Homework 1 Due on Feb

More information

CS-245 Database System Principles

CS-245 Database System Principles CS-245 Database System Principles Midterm Exam Summer 2001 SOLUIONS his exam is open book and notes. here are a total of 110 points. You have 110 minutes to complete it. Print your name: he Honor Code

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

Lecture 19: Query Optimization (1)

Lecture 19: Query Optimization (1) Lecture 19: Query Optimization (1) May 17, 2010 Dan Suciu -- 444 Spring 2010 1 Announcements Homework 3 due on Wednesday in class How is it going? Project 4 posted Due on June 2 nd Start early! Dan Suciu

More information

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based

More information

Goal of Concurrency Control. Concurrency Control. Example. Solution 1. Solution 2. Solution 3

Goal of Concurrency Control. Concurrency Control. Example. Solution 1. Solution 2. Solution 3 Goal of Concurrency Control Concurrency Control Transactions should be executed so that it is as though they executed in some serial order Also called Isolation or Serializability Weaker variants also

More information

DATABASE MANAGEMENT SYSTEMS

DATABASE MANAGEMENT SYSTEMS www..com Code No: N0321/R07 Set No. 1 1. a) What is a Superkey? With an example, describe the difference between a candidate key and the primary key for a given relation? b) With an example, briefly describe

More information

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery CPSC 421 Database Management Systems Lecture 19: Physical Database Design Concurrency Control and Recovery * Some material adapted from R. Ramakrishnan, L. Delcambre, and B. Ludaescher Agenda Physical

More information

CS 222/122C Fall 2017, Final Exam. Sample solutions

CS 222/122C Fall 2017, Final Exam. Sample solutions CS 222/122C Fall 2017, Final Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100 + 15) Sample solutions Question 1: Short questions (15 points)

More information

RELATIONAL OPERATORS #1

RELATIONAL OPERATORS #1 RELATIONAL OPERATORS #1 CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Algorithms for relational operators: select project 2 ARCHITECTURE OF A DBMS query

More information

Query Processing. Introduction to Databases CompSci 316 Fall 2017

Query Processing. Introduction to Databases CompSci 316 Fall 2017 Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2

More information

CS222P Fall 2017, Final Exam

CS222P Fall 2017, Final Exam STUDENT NAME: STUDENT ID: CS222P Fall 2017, Final Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100 + 15) Instructions: This exam has seven (7)

More information

Transactions and Concurrency Control

Transactions and Concurrency Control Transactions and Concurrency Control Transaction: a unit of program execution that accesses and possibly updates some data items. A transaction is a collection of operations that logically form a single

More information

Midterm Review CS634. Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke

Midterm Review CS634. Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke Midterm Review CS634 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke Coverage Text, chapters 8 through 15 (hw1 hw4) PKs, FKs, E-R to Relational: Text, Sec. 3.2-3.5, to pg.

More information

CS 245 Midterm Exam Solution Winter 2015

CS 245 Midterm Exam Solution Winter 2015 CS 245 Midterm Exam Solution Winter 2015 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

CSE 190D Spring 2017 Final Exam

CSE 190D Spring 2017 Final Exam CSE 190D Spring 2017 Final Exam Full Name : Student ID : Major : INSTRUCTIONS 1. You have up to 2 hours and 59 minutes to complete this exam. 2. You can have up to one letter/a4-sized sheet of notes, formulae,

More information

Indexing. Chapter 8, 10, 11. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Indexing. Chapter 8, 10, 11. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Indexing Chapter 8, 10, 11 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Tree-Based Indexing The data entries are arranged in sorted order by search key value. A hierarchical search

More information

External Sorting Implementing Relational Operators

External Sorting Implementing Relational Operators External Sorting Implementing Relational Operators 1 Readings [RG] Ch. 13 (sorting) 2 Where we are Working our way up from hardware Disks File abstraction that supports insert/delete/scan Indexing for

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Assignment 6 Solutions

Assignment 6 Solutions Database Systems Instructors: Hao-Hua Chu Winston Hsu Fall Semester, 2007 Assignment 6 Solutions Questions I. Consider a disk with an average seek time of 10ms, average rotational delay of 5ms, and a transfer

More information

Relational DBMS Internals Solutions Manual. A. Albano, D. Colazzo, G. Ghelli and R. Orsini

Relational DBMS Internals Solutions Manual. A. Albano, D. Colazzo, G. Ghelli and R. Orsini Relational DBMS Internals Solutions Manual A. Albano, D. Colazzo, G. Ghelli and R. Orsini February 10, 2015 CONTENTS 2 Permanent Memory and Buffer Management 1 3 Heap and Sequential Organizations 5 4

More information

Announcements. Reading Material. Today. Different File Organizations. Selection of Indexes 9/24/17. CompSci 516: Database Systems

Announcements. Reading Material. Today. Different File Organizations. Selection of Indexes 9/24/17. CompSci 516: Database Systems CompSci 516 Database Systems Lecture 9 Index Selection and External Sorting Announcements Private project threads created on piazza Please use these threads (and not emails) for all communications on your

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Overview of Implementing Relational Operators and Query Evaluation

Overview of Implementing Relational Operators and Query Evaluation Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders

More information

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Last Name: First Name: Student ID: 1. Exam is 2 hours long 2. Closed books/notes Problem 1 (6 points) Consider

More information

CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein. Student ID: UCSC

CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein. Student ID: UCSC CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein Student Name: Student ID: UCSC Email: Final Points: Part Max Points Points I 15 II 29 III 31 IV 19 V 16 Total 110 Closed

More information

Review of Storage and Indexing

Review of Storage and Indexing Review of Storage and Indexing CMPSCI 591Q Sep 17, 2007 Slides adapted from those of R. Ramakrishnan and J. Gehrke 1 File organizations & access methods Many alternatives exist, each ideal for some situations,

More information
