CS 186 Databases. and SID:

Similar documents
CSE 444, Winter 2011, Midterm Examination 9 February 2011

Midterm 2: CS186, Spring 2015

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados

IMPORTANT: Circle the last two letters of your class account:

CPSC 310: Database Systems / CSPC 603: Database Systems and Applications Final Exam Fall 2005

IMPORTANT: Circle the last two letters of your class account:

Name Class Account UNIVERISTY OF CALIFORNIA, BERKELEY College of Engineering Department of EECS, Computer Science Division J.

University of California, Berkeley. (2 points for each row; 1 point given if part of the change in the row was correct)

CSE 444 Midterm Exam

Attach extra pages as needed. Write your name and ID on any extra page that you attach. Please, write neatly.

University of Waterloo Midterm Examination Sample Solution

6.830 Problem Set 3 Assigned: 10/28 Due: 11/30

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz I

Computer Science 304

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2004.

Topics in Reliable Distributed Systems

Principles of Data Management. Lecture #9 (Query Processing Overview)

CSE 544, Winter 2009, Final Examination 11 March 2009

Goals for Today. CS 133: Databases. Final Exam: Logistics. Why Use a DBMS? Brief overview of course. Course evaluations

CS-245 Database System Principles

CSE 190D Spring 2017 Final Exam Answers

Database Management Systems (COP 5725) Homework 3

CS186 Final Exam Spring 2016

CSE 190D Spring 2017 Final Exam

CS 564 Final Exam Fall 2015 Answers

Course No: 4411 Database Management Systems Fall 2008 Midterm exam

Weak Levels of Consistency

Introduction to Data Management. Lecture #26 (Transactions, cont.)

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15

CompSci 516: Database Systems

Midterm 1: CS186, Spring 2015

CSCE 4523 Introduction to Database Management Systems Final Exam Spring I have neither given, nor received,unauthorized assistance on this exam.

CS 245 Final Exam Winter 2016

6.830 Lecture Recovery 10/30/2017

CSE 444 Midterm Exam

Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Introduction. Storage Failure Recovery Logging Undo Logging Redo Logging ARIES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz II

L i (A) = transaction T i acquires lock for element A. U i (A) = transaction T i releases lock for element A

CS222P Fall 2017, Final Exam

Database Tuning and Physical Design: Execution of Transactions

Final Review. May 9, 2017

Evaluation of Relational Operations

6.830 Lecture Recovery 10/30/2017

Final Review. May 9, 2018 May 11, 2018

CISC437/637 Database Systems Final Exam

Database Management Systems Paper Solution

CS 245 Final Exam Winter 2017

CISC437/637 Database Systems Final Exam

Final Review. CS634 May 11, Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke

CSE 344 Final Review. August 16 th

Query Processing: A Systems View. Announcements (March 1) Physical (execution) plan. CPS 216 Advanced Database Systems

Database Systems Management

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2009 Quiz I Solutions

CS 161 Computer Security

Assignment 6 Solutions

Homework 2: Query Processing/Optimization, Transactions/Recovery (due February 16th, 2017, 9:30am, in class hard-copy please)

Introduction to Data Management. Lecture #25 (Transactions II)

Midterm 1: CS186, Spring I. Storage: Disk, Files, Buffers [11 points] SOLUTION. cs186-

9/8/2018. Prerequisites. Grading. People & Contact Information. Textbooks. Course Info. CS430/630 Database Management Systems Fall 2018

A7-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

Final Exam CSE232, Spring 97

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Administriva. CS 133: Databases. General Themes. Goals for Today. Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky

CSE 444, Winter 2011, Final Examination. 17 March 2011

Problems Caused by Failures

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM

CS 161 Computer Security

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

Transaction Processing: Concurrency Control. Announcements (April 26) Transactions. CPS 216 Advanced Database Systems

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados

University of California, Berkeley. CS 186 Introduction to Databases, Spring 2014, Prof. Dan Olteanu MIDTERM

INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados

Homework 6 (by Sivaprasad Sudhir) Solutions Due: Monday Nov 27, 11:59pm

Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems

Chapter 17: Recovery System

Failure Classification. Chapter 17: Recovery System. Recovery Algorithms. Storage Structure

UNIT 9 Crash Recovery. Based on: Text: Chapter 18 Skip: Section 18.7 and second half of 18.8

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Transaction Management and Concurrency Control. Chapter 16, 17

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

CSE332 Summer 2012 Final Exam, August 15, 2012

CSE 444, Fall 2010, Midterm Examination 10 November 2010

Concurrency Control & Recovery

IMPORTANT: Circle the last two letters of your class account:

CSE 562 Final Exam Solutions

CSE 344 MARCH 9 TH TRANSACTIONS

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

Intro to Transaction Management

The transaction. Defining properties of transactions. Failures in complex systems propagate. Concurrency Control, Locking, and Recovery

Transaction Management Overview

Spring 2013 CS 122C & CS 222 Midterm Exam (and Comprehensive Exam, Part I) (Max. Points: 100)

Midterm 1: CS186, Spring I. Storage: Disk, Files, Buffers [11 points] cs186-

More on Arrays CS 16: Solving Problems with Computers I Lecture #13

Evaluation of Relational Operations: Other Techniques

Transactions. Kathleen Durant PhD Northeastern University CS3200 Lesson 9

Problem Max Points Score Total 100

CS348: INTRODUCTION TO DATABASE MANAGEMENT (Winter, 2011) FINAL EXAMINATION

Transcription:

Prof. Brewer Fall 2015 CS 186 Databases Midterm 3 Print your name:, (last) (first) I am aware of the Berkeley Campus Code of Student Conduct and acknowledge that academic misconduct will be reported to the Center for Student Conduct. Sign your name: Print your class account login: cs186- and SID: Your TA s name: Your section time: Login of the person sitting to your left: Login of the person sitting to your right: You should receive a double-sided answer sheet in addition to this packet. Mark your name and login on the answer sheet, and in the blanks above. All answers must be marked on the answer sheet do not show any work unless asked. You have 80 minutes for this exam do not spend too much time on any single problem. Turn in both this packet and the answer sheet when you are done. Do not turn this page until your instructor tells you to do so. Question: 1 2 3 4 Total Points: 20 23 15 27 85 Score: Page 1 of 7

Problem 1 SQL and Query Optimization (20 points) For this problem, show your work for partial credit, and leave all numerical answers as simple expressions. We have provided plenty of room on the answer sheet, and some lines/whitespace may be unnecessary. As an aspiring paleontologist you decide to explore the exciting world of fossils while putting your databases knowledge to good use. You recall that our very own Campanile has a fine collection of fossils (numbering in the hundreds of thousands!), and lo and behold, you find the tables: 1. Collection, whichhasarecordforeveryfossilstoredinthecampanile, 2. Donors, whichhasarecordforeverydonorthatdonatedtothecollection,and 3. Fossils, which contains the sum total of all human knowledge on fossils. Collection [C] = 2 13 Donors [D] = 2 7 Fossils [F] = 2 17 p C =2 5 p D =2 6 p F =2 8 id int id int id int donor id int num donated int age int floor int [2-7]* donor since int type String condition int [1-100]*...... * = inclusive range of values, for which you may assume a uniform distribution Afewothernotes: We shorthand Collection as C, Donors as D, Fossils as F. C.donor id is a foreign key referencing D.id. The ids in C are strictly a subset of the ids in F. You also have a few indexes, which may or may not be useful: 1. A clustered index on F.id [2 10 pages, height = 3] 2. An unclustered index on F.age [2 8 pages, height = 1] 3. Aclusteredindexon(D.donorsince, D.num donated) [2 3 pages, height = 1] 1. [6 points] You want to find the average condition of fossils in the Campanile collection, grouped and sorted by age in ascending order. However, you only want to look at fossils on floors 5, 6, and 7, and from small-scale donors (fewer than 100 fossils donated) whohavenot donated prior to the year 2000. Writea SQL query that expresses this. Midterm 3 Page 2 of 7 CS 186 Fa 15

You then want to prove to the world (or just us) that you know how a System R query optimizer estimates plan costs for the SQL query above (**you can do this problem using only the description from Q1). In addition to what you already know from the query, you estimate that roughly half of donors have donated fewer than 100 fossils, and roughly 1/8th of donors started donating from 2000 or later. 2. [7 points] Naturally, you begin by considering single table plans. (a) For each table, calculate: the IO cost of a file scan the number of pages that will be passed down the pipeline ( downstream ) (b) Name all non-filescan single-table plans that are kept (be specific). Choose one plan (indicate your choice), and calculate its estimated cost. Recall that System R optimizers will consider interesting orders. 3. [7 points] You recall that System R optimizers never consider cross joins given alternatives, so you narrow your two-table plans to C./F, F./C, C./D,and D./C. (a) If we were to use Chunk-Nested Loop Joins for the four joins listed above, find the smallest and largest reduction in IOs that would result from a single pre-written temp file. Think through this problem before attempting any calculations. (b) Calculate (as a System R query optimizer) the cost of an Index-Nested Loop Join for C./F. Do include costs for reading C (the outer loop). (c) How many tuples will be passed down the pipeline for this join (Index-Nested Loop Join between C./F)? For a Chunk-Nested Loop Join? Midterm 3 Page 3 of 7 CS 186 Fa 15

Problem 2 Concurrency Control (23 points) 1. [14 points] Consider the following schedule S. Assume that each transaction commits or aborts immediately after its last operation, and that locks can be upgraded. For example, T1 aborts or commits immediately after timestamp 5. (a) Write the number of conflicting operations in S. (b) Draw the dependency graph for S. (c) Select all schedules below that are conflict equivalent to S. i. ii. iii. iv. (d) Write the timestamps of one possible set of operations that can be removed from S such that the resulting schedule adheres to the strict 2-phase locking protocol. Use the minimum number of operations possible. (e) Does S avoid cascading aborts? If not, state one transaction that would cause acascadingabortifaborted,andwhichoperationcanberemovedfromthe transaction to avoid that cascading abort. Midterm 3 Page 4 of 7 CS 186 Fa 15

2. [5 points] Consider a database where A is a table in the database and contains pages B and C. Page B contains tuples D and F, and page C contains tuples G and H. Assume you have the following locking schedule for transactions T1, T2, T3, and T4 that utilizes multiple granularity locks. Assume no locks are released within the given timeframe, and no other transactions are running. (a) Write all possible locks that the given transaction can acquire on the given object without having to wait for another transaction. i. At time 8 ii. At time 9 iii. At time 10 iv. At time 11 v. At time 12 (b) Provide a minimal set of locks from timestamps 8 through 12 that would force a deadlock in the schedule. If no lock at a particular timestamp a ects deadlock, write None. 3. [4 points] Consider the following Timestamp Ordered Multi-version concurrency control timeline for a tuple X. (a) What value of X would a transaction with timestamp 13 read? (b) What value of X would a transaction with timestamp 7 read? (c) At which timestamps would an attempted write result in an abort? Midterm 3 Page 5 of 7 CS 186 Fa 15

Problem 3 Iterator Model. (15 points) Given the following query plan, answer the following questions on the answer sheet: On the Appendix for the Iterator Model, there are method signatures and docstrings provided. Those are all the functions you re expected to use. There are implementation details left out. Here are a few tips and assumptions: The single table access plans are file scans (SeqScan) Specific operator predicates will not matter. This query will run to completion without interruption. Exceptions thrown are not important Join is an implementation of a Nested Loop Join. 1. After calling Query.execute(), what order will the open() calls be made? Fill out the answer sheet. We are only considering the calls made before we start iterating through the tuples. You may or may not use all the lines provided. Leave all answers in form of (class).open(); ie, Project.open(). You may ignore Query.start(). 2. In a proper, working implementation of SimpleDB, between calls of Query.next(), which of the following could be called? Circle on answer sheet. (a) Filter.rewind(); (b) Filter.open(); (c) Join.open(); (d) SeqScan.open(); (e) Filter.close(); (f) Filter.next(); (g) Join.next(); (h) SeqScan.next(); Midterm 3 Page 6 of 7 CS 186 Fa 15

Problem 4 Logging and Recovery (27 points) Please see Appendix: Recovery to find the log in reference. This is the log found on disk as we begin the recovery process. If it is attached to your exam, you may rip it o. 1. [3 points] Fill in the missing records denoted by???. Please follow the same format as the records provided. The system comes back up after the crash and starts performing recovery. 2. [4 points] Fill in the transaction table as it appears at the end of analysis. You may not need to use all rows in the table. 3. [4 points] Which pages are in the dirty page table at the end of analysis? You may not need to use all rows in the table. 4. [5 points] What actions are redone during the REDO phase? List the LSNs on the answer sheet in the order that they are redone, separated by commas. No new log records will be written during REDO. You do not need to include actions that do not change pages. 5. [7 points] What are the records written during the UNDO phase? Start writing new log records at LSN 240. The di erence between the LSNs of two adjacent log records should be 10. You may not need to use all rows in the table. 6. [4 points] In the ARIES protocol, recovery consists of 3 distinct, sequentially executed phases: Analysis, REDO, and UNDO. However, in hw5, we were able to perform Analysis and REDO operations at the same time, in a single pass of the log. Briefly explain the key di erence between ARIES and SimpleDB that allowed us to do this. Midterm 3 Page 7 of 7 CS 186 Fa 15