University of California, Berkeley
College of Engineering
Computer Science Division, EECS
Fall 2001
Prof. Michael J. Franklin

FINAL EXAM
CS 186 Introduction to Database Systems

NAME:
STUDENT ID:

IMPORTANT: Circle the last two letters of your class account:

cs186   a b c d e f g h i j k l m n o p q r s t u v w x y z
        a b c d e f g h i j k l m n o p q r s t u v w x y z

DISCUSSION SECTION DAY & TIME:
TA NAME:

General Information:

This is a closed book examination but you are allowed two 8.5 x 11 sheets of notes (double sided). You have 2 hours and 45 minutes to answer as many questions as possible. Partial credit will be given. There are 100 points in all. You should read all of the questions before starting the exam, as some of the questions are substantially more time-consuming than others.

Write all of your answers directly on this paper. Be sure to clearly indicate your final answer for each question. Also, be sure to state any assumptions that you are making in your answers.

GOOD LUCK!!!

Problem                         Possible   Score
1. Logical Database Design         10
2. Physical Database Design         8
3. B+Trees                         12
4. Hashing                          8
5. Query Optimization              15
6. Query Estimation                 9
7. SQL                             15
8. Concurrency Control             15
9. Recovery                         8
TOTAL                             100

CS 186 Final Exam, December 18, 2001

Question 1 Logical and Physical Database Design [4 parts, 10 points total]:

An intramural softball league plays on Kleeburger field every Monday night from 6pm to 10pm. There are eight teams in the league. So, every Monday there are four one-hour games, each team playing in one of the games.

- Each team plays all the others exactly once during a seven week season.
- For each game, there is one Home team and one Visiting team.
- Every team has one member who serves as captain.
- The league has the phone numbers of the captains and calls the captain of the Home team if a game is cancelled for any reason.
- All team names and all phone numbers are unique.
- Players' names are not unique.

The data for the league is stored in the relation G = (D, T, H, V, C, P), where the attributes are date (D), time (T), home team (H), visiting team (V), captain name (C), and phone number (P).

Queries of the following form are frequently asked, and you must be able to answer them without computing a join:

- What is the phone number and name of the captain of team X?
- Given a date Y and a time Z, who is the home team and who is the visiting team?

a) [3 points] The following two FDs hold: 1) P -> CH 2) H -> CP. In addition, there are 6 other FDs that you must find based on the above information. Every one of them has DTHVCP as the right side. List the 6 candidate keys that make up the left sides of these FDs.

b) [2 points] Is the schema G in 3NF? If so, why? Else, give a specific reason (including a specific FD) why not.

c) [3 points] Design a lossless BCNF database schema for the intramural league that satisfies the query requirements stated above.

d) [2 points] Give an example of a query that is likely to run slower on this schema than on the relation G (English description is sufficient).
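When checking whether a set of attributes is a candidate key of G, the usual tool is the attribute closure. The following is a minimal sketch, not part of the exam: the function name `closure` and the representation of FDs as pairs of attribute strings are assumptions made for illustration, and only the two FDs stated in part a) are encoded.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("P", "CH") for P -> CH.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Only the two FDs given in part a); the remaining six are left to the reader.
given_fds = [("P", "CH"), ("H", "CP")]

print(sorted(closure("P", given_fds)))   # ['C', 'H', 'P']
print(sorted(closure("DT", given_fds)))  # ['D', 'T']
```

A set of attributes is a superkey of G exactly when its closure, computed under all FDs that hold, equals {D, T, H, V, C, P}; it is a candidate key if no proper subset has that property.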

Question 2 Physical Database Design [4 parts, 8 points total]:

Consider the following relation, with the primary key underlined:

Emp (eid: integer, sal: integer, age: real, deptid: integer)

There is a clustered index on eid and an unclustered index on age.

a) [2 points] How would you use the indexes to enforce the constraint that eid is a key?

b) [2 points] Give an example of an update that is definitely speeded up because of the available indexes. (English description is sufficient.)

c) [2 points] Give an example of an update that is definitely slowed down because of the indexes. (English description is sufficient.)

d) [2 points] Give an example of an update that is neither speeded up nor slowed down by the indexes.

Question 3 B+Trees [6 parts, 12 points total]:

For each of the following B+ Trees, decide whether it is a valid B+ Tree (i.e., one that could exist after numerous inserts and deletes) or if it is invalid. Circle your choice, and if it is invalid, describe in one sentence the single main reason why. The trees follow all rules in the book including merging on delete. All of the trees are of order d=2.

a) [2 points]
[Tree figure; node contents in extraction order, index keys unstarred, leaf entries starred]
20 45 10 17 25 32 56 70 90 2* 3* 5* 20* 22* 23* 10* 12* 14* 15* 26* 28* 30* 17* 18* 19* 32* 35* 44* 45* 47* 52* 58* 62* 68* 70* 72* 74* 85* 92* 95* 96* 98*
circle one: valid / invalid
If invalid, why?

b) [2 points]
[Tree figure; node contents in extraction order]
20 40 60 80 3* 8* 14* 17* 23 29 45 50 55 60* 62* 65* 75* 90* 93* 95* 99* 20* 21* 22* 24* 26* 27* 28* 31* 32* 39* 40* 42* 43* 45* 47* 48* 49* 50* 52* 54* 56* 57* 58*
circle one: valid / invalid
If invalid, why?

c) [2 points]
[Tree figure; node contents in extraction order]
10 2* 3* 5* 10* 12* 14* 15*
circle one: valid / invalid
If invalid, why?

d) [2 points]
[Tree figure; node contents in extraction order]
20 45 10 17 25 56 70 90 2* 3* 5* 10* 12* 14* 15* 17* 18* 19* 20* 22* 23* 26* 28* 30* 45* 47* 52* 58* 62* 68* 70* 72* 74* 85* 92*
circle one: valid / invalid
If invalid, why?

e) [2 points]
[Tree figure; node contents in extraction order]
24 47 10 17 24 36 47 70 91 2* 3* 5* 21* 22* 23* 10* 12* 14* 15* 26* 28* 30* 18* 19* 20* 36* 38* 44* 47* 49* 52* 58* 62* 68* 70* 72* 74* 85* 92* 95* 96*
circle one: valid / invalid
If invalid, why?

f) [2 points]
[Tree figure; node contents in extraction order]
40 84 30 37 50 65 73 90 95 20* 22* 25* 30* 32* 35* 36* 37* 38* 39* 41* 43* 46* 47* 50* 53* 62* 67* 68* 70* 72* 74* 76* 82* 83* 85* 86* 89* 91* 92* 94* 95* 98* 99*
circle one: valid / invalid
If invalid, why?

Question 4 Hashing [1 part, 8 points]:

Consider the following 5 update operations.

operation no.   operation   key value (binary)
1               insert      20 (10100)
2               insert      46 (101110)
3               delete      13 (1101)
4               insert      18 (10010)
5               insert      23 (10111)

Now, consider an extendible hash structure where each bucket can hold up to 4 entries, with a depth 2 and an initial state as shown below.

hash function h(n) = n mod

Draw the extendible hash structure and its contents after the 5 operations have occurred in the order shown.

We recommend that you do your scratch work on this page at first. But, this page will not be graded. You MUST put your final answer on the following page!!

[Initial state figure; pointers from directory entries to buckets are not recoverable]
Directory entries: 00 01 10 11 (global depth 2)
Bucket (local depth 2): 8 16
Bucket (local depth 1): 5 7 13 21
Bucket (local depth 2): 6 10 22

Final answer for Question 4 - Extendible Hashing:

Only this page will be graded for question 4. The final structure should have a directory of size 8 so use the template below.

- Show all buckets and pointers.
- Label the directory entries with their corresponding hash value (as on the previous page).
- Make sure to include local depths for all buckets and the global depth of the directory.
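The binary key values listed with the operations suggest that directory entries are matched against the low-order bits of the key. The sketch below illustrates only that lookup step. It assumes the common textbook convention of indexing the directory by the least significant `global_depth` bits (the modulus of h(n) is not given in the text), and the name `directory_slot` is hypothetical.

```python
def directory_slot(key, global_depth):
    """Return the directory index for key: its lowest global_depth bits.

    Assumption: the directory is indexed by least significant bits,
    which is equivalent to key mod 2**global_depth.
    """
    return key & ((1 << global_depth) - 1)


# Keys from the five operations in the question.
for key in [20, 46, 13, 18, 23]:
    at_depth_2 = format(directory_slot(key, 2), "02b")
    at_depth_3 = format(directory_slot(key, 3), "03b")
    print(key, at_depth_2, at_depth_3)
```

Under this assumption, key 18 (10010) maps to entry 10 while the directory has depth 2, and to entry 010 once the directory has doubled to the size of 8 that the answer template expects.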

Question 5 Query Plan Optimization [5 parts, 15 points total]:

Consider the following 2 relations:

                                 Sailors             Reserves
# of pages                       500                 5,000
# of tuples                      2,500               100,000
tuples/page                      5                   20
indexes                          B+ Tree on rating   Hash Table on sid
sorted by                        sid                 date
clustered by                     sid                 date
cost of sorting by any column    2,000 I/Os          30,000 I/Os

Find the # of I/Os that will be estimated for each join on the following pages.

Just to recap, here are the rules we're using.

- The indexes use Alternative 2.
- Do not include the cost of outputting the final result.
- Assume any duplicates exist together on the same page.
- The fudge factor, f, is 1. This means you can ignore it.
- The optimizer only knows how to use the simplest methods for Sort-Merge-Join and Hash-Join. No special optimizations are used for these.
- A Hash table lookup costs 1.2 I/Os to get the rid.
- A B-Tree lookup costs 3 I/Os to get the rid.

You have a buffer with 52 pages available. One page is used as the output buffer. That leaves 51 pages for you to work with. The arithmetic for this question should be very simple.

Please write neatly and circle your answer. All of these joins are on sid=sid.

a) [3 points] Cost of S Join R (S as the outer) using Index Nested Loops. If not possible, explain why.

b) [3 points] Cost of R Join S (R as the outer) using Index Nested Loops. If not possible, explain why.

c) [3 points] Cost of S Join R (S as the outer) using Block Nested Loops. If not possible, explain why.


d) [3 points] Cost of S Join R using Sort-Merge Join. If not possible, explain why.

e) [3 points] Cost of R Join S using Hash-Join. If not possible, explain why.
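As an illustration of how such an estimate is assembled, here is a rough sketch of the standard textbook cost formula for Block Nested Loops, as asked for in part c): read the outer relation once, and scan the inner relation once per block of the outer. This is not the official solution. The function name is hypothetical, and the choice of 50 pages per outer block (51 working pages minus one page for scanning the inner relation) is an assumption.

```python
import math


def block_nested_loops_cost(outer_pages, inner_pages, block_pages):
    """Estimated I/Os: one pass over the outer relation plus one scan
    of the inner relation per block of the outer relation."""
    outer_blocks = math.ceil(outer_pages / block_pages)
    return outer_pages + outer_blocks * inner_pages


# Sailors (500 pages) as the outer, Reserves (5,000 pages) as the inner.
# Assumption: 50 of the 51 working pages hold the current outer block.
print(block_nested_loops_cost(500, 5000, 50))  # 50500
```

With these inputs the outer relation splits into 10 blocks, so the estimate is 500 + 10 * 5,000 I/Os. A different buffer allocation would change the block count and therefore the total.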

Question 6 Query Plan Estimation [5 parts, 9 points total]:

Consider the following SQL Query:

SELECT B.name, S.name
FROM Boats B, Reserves R, Sailors S
WHERE B.bid = R.bid AND R.sid = S.sid AND B.color = 'Red' AND S.rating > 5

A (rather bad) query optimizer decides to use the following plan:

[Plan figure: a tree with the operators σ rating>5, σ color='Red', and π bid,sid over the relations S, B, and R]

The following is part of the system catalog:

Boats (500 tuples):
          size       min value   max value   distinct values
bid       4 bytes    1           100         100
color     10 bytes   Blue        Yellow      5
name      20 bytes   Anabelle    Zues        100

Reserves (100,000 tuples):
          size       min value   max value   distinct values
bid       4 bytes    20          90          50
sid       4 bytes    12          462         400
date      8 bytes    01/12/76    03/22/01    1,500

Sailors (2,500 tuples):
          size       min value   max value   distinct values
sid       4 bytes    1           500         500
name      25 bytes   Aaron       Wendy       400
rating    4 bytes    1           20          20

a) [1 point] What is the reduction factor of the selection on color?

b) [1 point] What is the reduction factor of the projection on bid,sid?

c) [1 point] What is the reduction factor of the selection on rating?

d) [3 points] The sailing club has a policy that only high ranking members are allowed to reserve red boats. Explain how the actual output size after reduction might differ from the one a System-R optimizer would calculate for the selection on rating, due to this policy.

e) [3 points] Use System-R estimation to find out how many tuples are in the final result.
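For the two selections, the commonly taught System R style estimates are 1/NKeys for an equality predicate and (High - value)/(High - Low) for a `column > value` predicate. The sketch below simply evaluates those two formulas on the catalog values from the question. The function names are hypothetical, and the course's intended answers may rest on slightly different conventions, so treat the numbers as illustrative.

```python
def equality_reduction_factor(distinct_values):
    """Estimate for `column = constant`, assuming uniformly distributed values."""
    return 1.0 / distinct_values


def greater_than_reduction_factor(value, low, high):
    """Estimate for `column > value`, assuming values spread evenly over [low, high]."""
    return (high - value) / (high - low)


# color has 5 distinct values in Boats.
print(equality_reduction_factor(5))             # 0.2
# rating ranges from 1 to 20 in Sailors; the predicate is rating > 5.
print(greater_than_reduction_factor(5, 1, 20))  # 15/19, about 0.789
```

Both formulas assume uniformity and independence between columns, which is exactly the kind of assumption part d) asks you to question.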

Question 7 SQL [3 parts, 15 points total]:

Use the following relational schema for an employee database (primary keys are underlined):

Employee(emp_SSN, emp_name, street, city, salary, manager_ssn)
Work(emp_SSN, proj_id, time)
Projects(proj_ID, project_name, budget)

Note, in the above relations, managers are also employees.

Express the following in SQL:

a) [5 points] Find the names of all employees who manage at least one other manager. Do not return duplicates.

b) [5 points] For each employee, return his/her SSN, name, and the number of projects that he/she works on. Sort your result by SSN. All employees should appear exactly once in the result. If an employee does not work on any projects, they should be returned with a count of zero.

c) [5 points] For each project that has more than 5 employees working on it, return the name of the project, its budget, and the number of workers working on it.

Question 8 Concurrency Control [4 parts, 15 points total]:

T1: W(A) W(B) COMMIT
T2: R(A) R(B) COMMIT
T3: W(B) R(?) COMMIT

I. Producible using 2 Phase Locking
II. Conflict Serializable

a) [4 points] If ? = A, this schedule is which of the following:
   a. I & II   b. I only   c. II only   d. neither I nor II

b) [4 points] If ? = B, this schedule is which of the following:
   a. I & II   b. I only   c. II only   d. neither I nor II

c) [4 points] If ? = C, this schedule is which of the following:
   a. I & II   b. I only   c. II only   d. neither I nor II

d) [3 points] In a system that implements intention locking, why is it ok to allow two separate transactions to each hold an IX lock on the same object?

Question 9 Recovery [2 parts, 8 points total]:

Your project partner decides to implement a database system that uses a buffer manager with a Steal/No Force policy and a recovery system similar to ARIES.

a) [4 points] Your project partner decides to implement the recovery system without using any checkpoints. What effect does this decision have on each of the three phases (Analysis, Redo, Undo)?

b) [4 points] Now your partner decides that the Undo phase is not needed because all the changes were lost at the time of crash. Explain why your partner is wrong.