INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados


Exam 1, 16 June 2014

The duration of this exam is 2.5 hours. You may consult your own written materials, but the exam is to be done individually. You are not allowed to use computers, tablets, or mobile phones.

The maximum grade of the exam is 20 pts. Write your answers below the questions. Write your number and name at the top of each page. Present all calculations performed. After the exam starts, you can leave the room one hour after delivering the exam.

The following table is to be used by instructors only:

Question   1   2   3   4   5   SUM
Points     4   4   4   4   4   20

1. (4 pts) Indexing

1.1. (2.5 pts) Suppose that we are using extendable hashing on a file that contains records with the following search-key values:

Search-key value   Hash value
Ronaldo            00010
Messi              00011
Hernandez          00101
Iniesta            00111
Lahm               00011
Ibrahimovic        00001
Rooney             00011
Neymar             00111

Show the extendable hash structure for this file, if the hash values for each search key are as shown in the table above, and if buckets can hold up to three records. Use the least significant bits of the hash value.
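The insertion procedure behind question 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: keys are inserted in the order in which the table lists them, the directory is indexed by the least significant bits of the hash value, and the class and method names (`Bucket`, `ExtendableHash`, `insert`) are hypothetical, not taken from the course material.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth: number of low bits shared by all items
        self.items = []      # list of (key, hash_value) pairs


class ExtendableHash:
    def __init__(self, capacity):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]   # 2**global_depth entries

    def insert(self, key, h):
        # Directory slot = the global_depth least significant bits of h.
        slot = h & ((1 << self.global_depth) - 1)
        bucket = self.directory[slot]
        if len(bucket.items) < self.capacity:
            bucket.items.append((key, h))
            return
        # Bucket is full. If it already uses all directory bits, double the
        # directory: entry i + 2**d starts as a copy of entry i.
        if bucket.depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split the bucket on the next (more significant) bit.
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        split_bit = 1 << (bucket.depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & split_bit:
                self.directory[i] = sibling
        old_items, bucket.items = bucket.items, []
        for k, hv in old_items:
            (sibling if hv & split_bit else bucket).items.append((k, hv))
        # Retry; this may split again. More than `capacity` identical hash
        # values would need overflow chaining, which this sketch omits.
        self.insert(key, h)


# Hash values copied from the table in question 1.1.
records = [
    ("Ronaldo", 0b00010), ("Messi", 0b00011), ("Hernandez", 0b00101),
    ("Iniesta", 0b00111), ("Lahm", 0b00011), ("Ibrahimovic", 0b00001),
    ("Rooney", 0b00011), ("Neymar", 0b00111),
]
table = ExtendableHash(capacity=3)
for name, hash_value in records:
    table.insert(name, hash_value)

for i, b in enumerate(table.directory):
    print(format(i, "0{}b".format(table.global_depth)), b.depth,
          [k for k, _ in b.items])
```

The printed directory lists, for each bit pattern, the local depth and contents of the bucket it points to; several directory entries may share one bucket when that bucket's local depth is smaller than the global depth.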

1.2. (1 pt) Suppose that you have a sorted file and want to construct a dense primary B+-tree index on this file.

a) One way to accomplish this task is to scan the file, record by record, inserting each one using the B+-tree insertion procedure. What performance and storage utilization problems are there with this approach?
b) Explain how the bulk-loading algorithm improves upon this scheme.

1.3. (0.5 pts) What is the difference between a B+-tree index and a B+-tree file organization? Indicate one advantage of each scheme.

2. (4 pts) Query Processing and Optimization

2.1. (2.5 pts) Consider performing a natural join between the following two relations:

Client(Name, ID)
ClientDetails(ID, Property, Value)

Assume that the Client tuples are stored contiguously on 2000 disk blocks and that the ClientDetails tuples are stored contiguously on 400 blocks. Each block of Client or ClientDetails holds up to 50 tuples. There are 102 memory blocks available. Compute the I/O cost for each of the following join algorithms, justifying your result. Ignore the I/O cost of writing the output to disk. Unless stated otherwise, the tuples in the relations are not sorted.

a) Merge join, sorted relations (i.e., assume that both relations are sorted).
b) Merge join, unsorted relations (i.e., assume that both relations are unsorted).
c) Index join. Assume that there is an index on the ID column of Client. We read a block of ClientDetails and, for each tuple in this block, we use the index to find all matching tuples of Client. Each of these Client tuples is read into memory and joined with the tuples from ClientDetails. We repeat the process for all blocks of ClientDetails. Assume that the index is entirely in memory, and assume that, on average, each tuple of ClientDetails matches 4 tuples of Client.
d) Hash join. Assume Client as the build relation.
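The arithmetic behind question 2.1 can be laid out as a small Python sketch. It follows one common textbook cost model, which is an assumption and not necessarily the one used in this course: sort-merge join and partitioned hash join each read and write both relations once and then read them once more (3 I/Os per block), every matching Client tuple in the index join costs one block read, and all ClientDetails blocks are full. Other textbooks count differently, so the figures are only as good as these assumptions. All function names are hypothetical.

```python
import math

B_CLIENT = 2000          # blocks of Client
B_DETAILS = 400          # blocks of ClientDetails
TUPLES_PER_BLOCK = 50    # assumes every block is full
M = 102                  # memory blocks available
MATCHES = 4              # Client tuples matched per ClientDetails tuple


def merge_join_sorted(b_r, b_s):
    # Inputs already sorted: one sequential pass over each relation.
    return b_r + b_s


def merge_join_unsorted(b_r, b_s, m):
    # Phase 1 creates sorted runs of m blocks (read + write each block);
    # phase 2 merges all runs of both relations and joins (one more read).
    runs = math.ceil(b_r / m) + math.ceil(b_s / m)
    if runs > m - 1:
        raise ValueError("too many runs to merge in a single pass")
    return 3 * (b_r + b_s)


def index_join(b_outer, tuples_per_block, matches_per_tuple):
    # Read the outer relation once; the index is in memory, so only the
    # fetch of each matching inner tuple is charged (one block read each).
    return b_outer + b_outer * tuples_per_block * matches_per_tuple


def hash_join(b_build, b_probe, m):
    if b_build <= m - 2:
        # Build relation fits in memory: no partitioning needed.
        return b_build + b_probe
    # Partition both relations into m - 1 partitions (read + write),
    # then read each pair of partitions once to build and probe.
    if math.ceil(b_build / (m - 1)) > m - 2:
        raise ValueError("build partitions do not fit in memory")
    return 3 * (b_build + b_probe)


costs = {
    "a) merge join, sorted": merge_join_sorted(B_CLIENT, B_DETAILS),
    "b) merge join, unsorted": merge_join_unsorted(B_CLIENT, B_DETAILS, M),
    "c) index join": index_join(B_DETAILS, TUPLES_PER_BLOCK, MATCHES),
    "d) hash join": hash_join(B_CLIENT, B_DETAILS, M),
}
for name, cost in costs.items():
    print(name, cost, "block I/Os")
```

Keeping the feasibility checks (number of runs, partition size) next to the formulas makes explicit why the single-merge-pass and single-partitioning-pass variants apply with 102 memory blocks.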

2.2. (1 pt) Consider the following database relations:

Client(Name, Address, ClientID)
ClientSubscriptions(ClientID, SubscriptionType)
ClientID: Foreign Key(Client)

The relation Client has 200 tuples and the relation ClientSubscriptions has 600 tuples. Answer the following questions:

a) Estimate the number of tuples of Client X ClientSubscriptions.
b) Consider the selection: σ ClientID=2 (Client X ClientSubscriptions). Estimate the number of tuples returned by the selection.

2.3. (0.5 pts) What would change in the answer to question 2.2.b) if the selection condition were ClientID>2?

3. (4 pts) Transactions and Concurrency Control

3.1. (2.5 pts) Consider a multi-granularity locking system, with lock modes IX, X, IS, S and SIX. The objects are arranged in the following hierarchy:

            relation R
           /          \
      block A        block B
     /   |   \        /   \
   a1    a2   a3     b1    b2

Assume there are two active transactions T1 and T2, and there are no lock upgrades.

a) Transaction T1 has already obtained the following locks: IS lock on R, S on B. Transaction T2 wants to modify b2 (and nothing else) while T1 is active. What locks does T2 need to get? Which of these locks can T2 get at this point?
b) Transaction T1 has already obtained the following locks: IS lock on R, S on A. Transaction T2 wants to read b2 and modify a1 (and nothing else) while T1 is active. What locks does T2 need to get? Which of these locks can T2 get at this point?
c) Transaction T1 has already obtained the following locks: IX lock on R, IX on A, X on a1. Transaction T2 wants to read a2 and modify a3 (and nothing else) while T1 is active. What locks does T2 need to get? Which of these locks can T2 get at this point?
d) Transaction T1 has already obtained the following locks: IX lock on R, IX on A, X on a1. Transaction T2 wants to read a2 and modify a1 (and nothing else) while T1 is active. What locks does T2 need to get? Which of these locks can T2 get at this point?
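The reasoning pattern for question 3.1 (intention locks on every ancestor, then a compatibility check at each node, top-down) can be sketched in Python. The compatibility matrix is the standard one for IS, IX, S, SIX and X; the hierarchy assumes that a1, a2, a3 belong to block A and b1, b2 to block B, and the helper names (`locks_needed`, `grantable`) are hypothetical. Only scenario a) is worked through, as an illustration.

```python
# Standard multi-granularity compatibility: COMPATIBLE[held] is the set of
# modes another transaction may be granted on the same node.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}

# Assumed hierarchy: relation R, blocks A and B, records below them.
PARENT = {
    "R": None, "A": "R", "B": "R",
    "a1": "A", "a2": "A", "a3": "A",
    "b1": "B", "b2": "B",
}


def locks_needed(node, mode):
    """Locks for one access, root first: IS (for S) or IX (for X) on every
    ancestor, then the requested mode on the node itself."""
    intention = "IS" if mode == "S" else "IX"
    path = []
    ancestor = PARENT[node]
    while ancestor is not None:
        path.append((ancestor, intention))
        ancestor = PARENT[ancestor]
    path.reverse()
    path.append((node, mode))
    return path


def grantable(requests, held_by_other):
    """Acquire locks top-down and stop at the first conflict."""
    granted = []
    for node, mode in requests:
        if all(mode in COMPATIBLE[h] for h in held_by_other.get(node, [])):
            granted.append((node, mode))
        else:
            break
    return granted


# Scenario a): T1 holds IS on R and S on B; T2 wants to modify b2.
t1_locks = {"R": ["IS"], "B": ["S"]}
t2_requests = locks_needed("b2", "X")
t2_granted = grantable(t2_requests, t1_locks)
print("T2 needs:", t2_requests)
print("T2 can get now:", t2_granted)
```

Stopping at the first conflict reflects the rule that a lock on a node may only be requested once the appropriate intention lock on its parent is held.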

3.2. (1 pt) Consider the following two transactions:

T1: R1(X) W1(X) R1(Y) W1(Y)
T2: R2(Y) W2(Y)

Which of the following schedules would be allowed under the 2-Phase Locking protocol? Justify your answer. Assume that the L lock actions in the schedules are exclusive.

a) L1(X) L1(Y) L2(Y) R1(X) W1(X) R1(Y) W1(Y) R2(Y) W2(Y) U2(Y) U1(Y) U1(X)
b) L1(Y) L1(X) R1(Y) W1(Y) R1(X) W1(X) U1(X) U1(Y) L2(Y) R2(Y) W2(Y) U2(Y)
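The check asked for in question 3.2 can be mechanized with a short Python sketch. It assumes the notation used in the schedules (L = exclusive lock, U = unlock, R/W = read/write, followed by the transaction number and the item), and it reads the unnumbered W(Y) in schedule a) as W2(Y). The function names are hypothetical. A schedule is rejected if a lock is requested on an item that another transaction still holds, if a transaction locks after its first unlock, or if it accesses an item without holding the lock.

```python
import re


def parse(schedule):
    """Turn 'L1(X) R1(X) ...' into a list of (operation, txn, item)."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in re.finditer(r"([LURW])(\d+)\((\w+)\)", schedule)]


def allowed_under_2pl(schedule):
    holder = {}         # item -> transaction holding its exclusive lock
    shrinking = set()   # transactions that have already released a lock
    for op, txn, item in parse(schedule):
        if op == "L":
            if txn in shrinking:
                return False   # lock after unlock violates two-phase rule
            if item in holder:
                return False   # exclusive lock is held; the request would block
            holder[item] = txn
        elif op == "U":
            if holder.get(item) != txn:
                return False   # unlocking something not held
            del holder[item]
            shrinking.add(txn)
        else:                  # R or W
            if holder.get(item) != txn:
                return False   # access without holding the lock
    return True


schedule_a = ("L1(X) L1(Y) L2(Y) R1(X) W1(X) R1(Y) W1(Y) "
              "R2(Y) W2(Y) U2(Y) U1(Y) U1(X)")
schedule_b = ("L1(Y) L1(X) R1(Y) W1(Y) R1(X) W1(X) U1(X) U1(Y) "
              "L2(Y) R2(Y) W2(Y) U2(Y)")

result_a = allowed_under_2pl(schedule_a)
result_b = allowed_under_2pl(schedule_b)
print("a)", result_a)
print("b)", result_b)
```

In a real lock manager the conflicting request would wait rather than fail; the sketch only reports whether the interleaving, exactly as written, could have occurred.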

3.3. (0.5 pts) Consider the following schedule for three concurrent transactions and indicate whether it is possible under the timestamp-based protocol. Justify.

T1 T2 T3
R(A) R(C) W(A) W(B) W(B) W(C)

4. (4 pts) Recovery Management

4.1. (2.5 pts) Briefly answer the following questions regarding the ARIES recovery algorithm:

a) If the system fails repeatedly during recovery, what is the maximum number of log records that can be written (as a function of the number of update and other log records written before the crash) before restart completes successfully? Justify.
b) What is the oldest log record we need to retain? Justify.

4.2. (1 pt) In the context of the ARIES algorithm, explain the purpose of the checkpoint mechanism. Explain how the frequency of checkpoints affects:

(i) The system's performance when no failure occurs.
(ii) The time it takes to recover from a system crash.
(iii) The time it takes to recover from a disk crash (i.e., a crash on stable storage).

4.3. (0.5 pts) How does the recovery manager ensure atomicity of transactions? How does it ensure durability?

5. (4 pts) Miscellaneous

5.1. (1 pt) Discuss how the SQL Server DBMS supports data indexing, answering the following questions in particular:

a) Are clustered indexes on non-key attributes supported?
b) Are composite indexes supported? Besides composite indexes, is there any other way to implement the idea of "covering indexes" through non-clustered indexes?
c) Can the non-clustered indexes be sparse, or do they have to be dense?
d) How are tuple locators (i.e., the pointers to the actual tuples associated with the index keys) implemented in the case of non-clustered indexes (i.e., what is the relationship between clustered and non-clustered indexes)?
e) Why should clustered index keys use as few columns as possible?

5.2. (1 pt) Explain the two general strategies through which parallelism can be used in DBMS processing. Clearly indicate how DBMSs can make use of these two general strategies (i.e., what are the typical operations where each of the strategies can be employed).

5.3. (1 pt) Consider a database storing information about the expenses and revenues associated with the employees of a given company, specifically (i) the monthly revenues generated by each employee, and (ii) the yearly salary for each employee. Consider also the following two equivalent SQL queries, which return the employees whose yearly salary equals the revenue they generate for the company:

SELECT * FROM Employees WHERE salary = 12*revenue;
SELECT * FROM Employees WHERE salary/12 = revenue;

Explain which query is better (i.e., in which situations is one query better than the other, in terms of execution efficiency). Justify your answer.

5.4. (1 pt) Explain the difference between a system crash and a "disaster." Indicate what kinds of strategies are used in database management systems for handling both types of problems.