McGill April 2009 Final Examination Database Systems COMP 421


Wednesday, April 15, 2009, 9:00-12:00
Examiner: Prof. Bettina Kemme
Associate Examiner: Prof. Muthucumaru Maheswaran
Student name:
Student Number:

Do not answer on the exam paper.
Simple calculators are allowed (no programmable calculators).
You can bring 8 sheets of paper with handwritten notes.
Books are NOT allowed. Dictionaries are NOT allowed (except translation dictionaries).
This examination must be returned.
This exam comprises 12 pages, including the cover page.
You can achieve 145 points (plus 5 bonus points).

1 Data Models (24 Points)

1. (15 Points) Below is the E/R schema of question 1 of the midterm (slightly reduced and with some corrections).
(a) Translate the E/R schema into a relational schema. Write the tables in the form R(A1, ..., An). Indicate primary keys by underlining them. Indicate the foreign keys in each relation and show the relation to which they refer. You do NOT need to write SQL statements.
(b) The schema shows two participation constraints: one for entity set Paper in the relationship set write (a paper must have at least one author), the other for entity set Reviews in the relationship set writes (a review must have a reviewer). For each of these two participation constraints, indicate whether there is a way to express it with the CREATE TABLE statement and, if yes, how you would do it. (There is no need to show the entire SQL CREATE TABLE statement; a simple explanation, or simply showing the line in the statement that expresses the constraint, is enough.)

[Figure: E/R diagram with the labels: email, password, Users, name, affiliation, ISA, Authors, Reviewers, contactflag, write, content, post, writes, Paper, is about, Message, mid, Originality, pid, Reviews, of, pdf, title, id, Presentation, abstract, status, Detailed Comments, Technical Depth, Summary.]

2. (9 Points) Assume a driving school with the following information.
There are instructors. Each instructor is identified by an id. Additionally, he/she has a name.
There are students. Each student is identified by an id. Additionally, he/she has a name and an address.
Each student is assigned a subset of the instructors (one or more).
A driving lesson is at a certain date and time. It is of a certain type (city drive, country-side drive, highway drive, night drive, ...).
Each driving lesson is given by an instructor to a student. Only an instructor who is assigned to a student can give a driving lesson to this student.
Draw the E/R diagram for this specification.

2 Functional Dependencies (16 Points)

1. The E/R diagram below depicts some information to be stored in the database of a cafeteria. The cafeteria has a certain repertoire of menus it can produce. Each menu has a name and a price. Each day the cafeteria offers menus from its repertoire, and for each menu offered on a particular day, it keeps track of how often it is sold. Furthermore, a menu has a main dish, and for this main dish the recipe is kept in the database.

[Figure: E/R diagram with the labels: menuname, price, day, Menu, MenuInstance, numsold, Main-Dish, dishname, recipe.]

However, the designer of the database skipped the design process and simply created one big relation
Cafeteria(menuname, price, day, numsold, dishname, recipe)
(a) Indicate the functional dependencies that hold in the relation Cafeteria, given the textual description and the E/R schema above. (Hint: translate the E/R schema into the relational schema; that might make it easier for you to find the FDs.)
(b) Give an example of redundancy that can occur in Cafeteria. Explain, based on the FDs, why such redundancy can occur.

2. Given a relation R(a,b,c,d,e,f,g) and the following FDs
a,b e f g a d, f
Indicate the candidate key(s) of the relation R. Provide reasoning for your decision.

3 Relational Algebra and SQL (30 Points)

The travel agency "Discover the World" offers organized vacation tours. It has the following relational database schema (simplified). Below are some example records for each of the tables.

Tours (TourId, type, start-date, duration, price)
Specials (SpecId, name, price)
Reservations (ResId, TourId, cname, caddress, cost)
SpecialRes (ResId, SpecId)

Tours describes the tours that can be booked. Each tour has an identifier (TourId), is of a certain type ("Brazil jungle", "Kenia Safari", ...), has a start-date, a duration, and a basic price.
Specials lists all specials that can be booked as extras. This could be, for instance, a single room, vegetarian food, etc. Specials has an identifying attribute, a name, and a price attribute. The price gives the cost of this special per day (e.g., $20 per day for a single room).
Reservations contains information about each booked tour. It has an identifying attribute, a reference to the tour booked, and the name and address of the customer. Furthermore, it contains an attribute with the total cost of the booked tour, including the full costs of all booked specials.
SpecialRes contains, for each reservation, all the specials that are booked together with the tour. We can assume that when a special is booked, it is booked for every day of the tour.

Tours
TourId  type           start-date  duration  price
1       Brazil jungle  16-April    14        2229
2       Brazil jungle  30-April    21        2999
3       Kenia safari   30-April    21        3229
...

Specials
SpecId  name        price
11      single      20
12      vegetarian  5
...

Reservations
ResId  TourId  cname          caddress      cost
541    1       Bettina Kemme  Montreal      2579
542    2       your name      your address  3105
...

SpecialRes
ResId  SpecId
541    11
541    12
542    12
...
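The cost rule described above (tour price plus the per-day price of each booked special for every day of the tour) can be checked against the sample rows. The following is a minimal sketch using Python's built-in sqlite3 module; the column name start_date (instead of start-date), the chosen column types, and the variable name derived are assumptions made for illustration, not part of the exam.

```python
import sqlite3

# In-memory database loaded with the sample rows shown above.
# Assumption: start-date is renamed start_date to be a plain SQL identifier.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tours (TourId INTEGER PRIMARY KEY, type TEXT, start_date TEXT,
                    duration INTEGER, price INTEGER);
CREATE TABLE Specials (SpecId INTEGER PRIMARY KEY, name TEXT, price INTEGER);
CREATE TABLE Reservations (ResId INTEGER PRIMARY KEY, TourId INTEGER,
                           cname TEXT, caddress TEXT, cost INTEGER);
CREATE TABLE SpecialRes (ResId INTEGER, SpecId INTEGER,
                         PRIMARY KEY (ResId, SpecId));
INSERT INTO Tours VALUES (1, 'Brazil jungle', '16-April', 14, 2229),
                         (2, 'Brazil jungle', '30-April', 21, 2999),
                         (3, 'Kenia safari', '30-April', 21, 3229);
INSERT INTO Specials VALUES (11, 'single', 20), (12, 'vegetarian', 5);
INSERT INTO Reservations VALUES (541, 1, 'Bettina Kemme', 'Montreal', 2579),
                                (542, 2, 'your name', 'your address', 3105);
INSERT INTO SpecialRes VALUES (541, 11), (541, 12), (542, 12);
""")

# Derived cost = tour price + duration * (sum of per-day special prices).
# LEFT JOINs keep reservations that have no specials at all.
derived = dict(conn.execute("""
SELECT r.ResId, t.price + t.duration * COALESCE(SUM(s.price), 0)
FROM Reservations r
JOIN Tours t ON t.TourId = r.TourId
LEFT JOIN SpecialRes sr ON sr.ResId = r.ResId
LEFT JOIN Specials s ON s.SpecId = sr.SpecId
GROUP BY r.ResId, t.price, t.duration
"""))
print(derived)
```

For reservation 541 this reproduces the listed cost of 2579 (2229 + 14 * 25). For reservation 542 the same rule gives 3104 (2999 + 21 * 5), one less than the 3105 shown in the sample row.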

1. (4 Points) Write in Relational Algebra: Give the names of customers who have booked a tour of at least 21 days.

2. (4 Points) Look at the following two queries.
Query 1:
    SELECT r.cname
    FROM Reservations r, SpecialRes s
    WHERE r.resid = s.resid
Query 2:
    SELECT r.cname
    FROM Reservations r
    WHERE r.resid IN (SELECT s.resid FROM SpecialRes s)
Although the queries seem to produce the same answers, they don't. How do they differ? You can give example instances of Reservations and SpecialRes that show that these two queries can produce different answers.

3. (6 Points) Write the following query in SQL: Return ResId and cname for reservations that include neither the special "vegetarian" nor the special "single".

4. (6 Points) Write the following query in SQL: Return the ResId and the number of specials of reservations that have booked at least two specials.

5. (10 Points) The attribute cost of the Reservations table is a derived attribute. We assume that when a new reservation record is inserted, it contains as value for cost the price value of the corresponding tour (e.g., INSERT INTO Reservations (ResId, TourId, cname, caddress, cost) VALUES (x, 1, y, z, 2229)). After that, the cost attribute needs to be updated after further modifications of the database so that it always reflects the current price of all things booked within the reservation. This can be done, e.g., through triggers.
(a) Indicate all events for which a trigger needs to be created. Briefly describe the actions that have to be performed within each trigger.
(b) Write one of the triggers in SQL notation.
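For question 2 above, one way to explore how a join and an IN subquery can behave differently is to run both on the sample rows. The sketch below uses Python's sqlite3 module with only the columns the two queries touch; the variable names query1, query2, rows1, and rows2 are illustrative assumptions.

```python
import sqlite3

# Only the columns used by the two queries are modelled here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Reservations (ResId INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE SpecialRes (ResId INTEGER, SpecId INTEGER);
INSERT INTO Reservations VALUES (541, 'Bettina Kemme'), (542, 'your name');
INSERT INTO SpecialRes VALUES (541, 11), (541, 12), (542, 12);
""")

query1 = """SELECT r.cname
            FROM Reservations r, SpecialRes s
            WHERE r.ResId = s.ResId"""
query2 = """SELECT r.cname
            FROM Reservations r
            WHERE r.ResId IN (SELECT s.ResId FROM SpecialRes s)"""

# Sort so the comparison does not depend on the engine's row order.
rows1 = sorted(name for (name,) in conn.execute(query1))
rows2 = sorted(name for (name,) in conn.execute(query2))
print(rows1)  # the join emits one row per matching (reservation, special) pair
print(rows2)  # IN emits each qualifying reservation once
```

On this instance the join returns the customer of reservation 541 twice, because that reservation has two rows in SpecialRes, while the IN version returns each customer once.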

4 XML (15 Points + 5 BONUS Points)

The travel agency "Discover the World" of Question 3 might also store its data in XML format. Below is an XML document that describes the same information as in Question 3.

    <DiscoverTheWorld>
      <tour TourId="1">
        <type> Brazil jungle </type>
        <start-date> 16-April </start-date>
        <duration> 14 </duration>
        <price> 2229 </price>
      </tour>
      <tour TourId="2">
        <type> Brazil jungle </type>
        <start-date> 30-April </start-date>
        <duration> 21 </duration>
        <price> 2999 </price>
      </tour>
      <tour TourId="3">
        <type> Kenia safari </type>
        <start-date> 30-April </start-date>
        <duration> 21 </duration>
        <price> 3229 </price>
      </tour>
      <reservation ResId="541" TourId="1">
        <cname> Bettina Kemme </cname>
        <caddress> Montreal </caddress>
        <cost> 2579 </cost>
        <special price="5"> vegetarian </special>
        <special price="20"> single </special>
      </reservation>
      <reservation ResId="542" TourId="2">
        <cname> Your Name </cname>
        <caddress> Your Address </caddress>
        <cost> 3105 </cost>
        <special price="5"> vegetarian </special>
      </reservation>
    </DiscoverTheWorld>

1. An incomplete DTD for this document could look as follows:

    <!DOCTYPE DiscoverTheWorld [
    <!ELEMENT DiscoverTheWorld (tour*,reservation*)>
    <!ELEMENT tour ... >
    <!ELEMENT reservation ...>
    <!ATTLIST tour ... >
    <!ATTLIST reservation ...>
    <!ELEMENT type (#PCDATA) >
    <!ELEMENT start-date (#PCDATA) >
    <!ELEMENT duration (#PCDATA) >
    <!ELEMENT price (#PCDATA) >
    <!ELEMENT cname (#PCDATA) >
    <!ELEMENT caddress (#PCDATA) >
    <!ELEMENT cost (#PCDATA) >
    <!ELEMENT special (#PCDATA) >
    <!ATTLIST special price CDATA #REQUIRED>
    ]>

Complete the DTD in a way that reflects the semantics of the application. You only need to provide the four entries above that contain "...".

    <!ELEMENT tour (type, start-date, duration, price) >
    <!ELEMENT reservation (cname, caddress, cost, special*)>
    <!ATTLIST tour TourId ID #REQUIRED >
    <!ATTLIST reservation ResId ID #REQUIRED TourId IDREF #REQUIRED>

2. Write an XPath expression that returns each reservation that has at least two specials.

    /DiscoverTheWorld/reservation[count(special) > 1]

3. Write an XQuery that returns, for each reservation of the tour with TourId=1, the specials that are booked with it.

    for $x in doc("discover.xml")/DiscoverTheWorld/reservation[@TourId = 1]
    let $z := $x/special
    return $z

4. (5 Bonus points) Write an XQuery that returns, for each reservation of a Brazil jungle trip, the specials that are booked with it. More concretely, for the document above the following should be returned:

    <specials-booked>
      <special price="5"> vegetarian </special>
      <special price="20"> single </special>
    </specials-booked>
    <specials-booked>
      <special price="5"> vegetarian </special>
    </specials-booked>

    for $x in doc("discovertheworld.xml")/DiscoverTheWorld/tour[type="Brazil jungle"]/@TourId
    for $y in doc("discovertheworld.xml")/DiscoverTheWorld/reservation[@TourId = $x]
    let $z := $y/special
    return <specials-booked> {$z} </specials-booked>
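The logic of the XPath and XQuery expressions above can be mirrored in Python's standard xml.etree.ElementTree module, which supports only a small XPath subset, so the count and the join on TourId are written as ordinary Python. This is a rough sketch on a shortened copy of the document (only the elements the queries use); the names DOC, brazil_ids, specials_booked, and two_or_more are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the exam document: only elements the queries touch.
DOC = """<DiscoverTheWorld>
  <tour TourId="1"><type> Brazil jungle </type></tour>
  <tour TourId="2"><type> Brazil jungle </type></tour>
  <tour TourId="3"><type> Kenia safari </type></tour>
  <reservation ResId="541" TourId="1">
    <special price="5"> vegetarian </special>
    <special price="20"> single </special>
  </reservation>
  <reservation ResId="542" TourId="2">
    <special price="5"> vegetarian </special>
  </reservation>
</DiscoverTheWorld>"""

root = ET.fromstring(DOC)

# Counterpart of /DiscoverTheWorld/reservation[count(special) > 1].
two_or_more = [r.get("ResId") for r in root.findall("reservation")
               if len(r.findall("special")) > 1]

# Counterpart of the bonus query: TourIds of Brazil jungle tours first,
# then the specials of every reservation that references one of them.
# strip() removes the padding spaces around the element text.
brazil_ids = {t.get("TourId") for t in root.findall("tour")
              if t.findtext("type", "").strip() == "Brazil jungle"}
specials_booked = {
    r.get("ResId"): [(s.text.strip(), int(s.get("price")))
                     for s in r.findall("special")]
    for r in root.findall("reservation") if r.get("TourId") in brazil_ids
}
print(two_or_more)
print(specials_booked)
```

Each entry of specials_booked corresponds to one specials-booked element in the expected output above.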

5 Query Evaluation and Optimization (35 Points)

Given the relations R(a, b, c), S(d, e, f) and T(g, a, d, h). In T, a is a foreign key to R, and d is a foreign key to S. All attributes have data type integer, and an integer has 10 Bytes. Data pages have 4 KByte.
R has around 100,000 tuples distributed over 1000 pages. For 10,000 tuples of R, attribute b stores a unique name (e.g., product name) with values in [1,10000]. For the other 90,000 tuples, b contains 2000 different uniformly distributed values in [10001,12000] (e.g., it stores the product type and there are 2000 different types).
T has around 50,000 tuples distributed over 600 pages.
Index pages are 4 KByte, a pointer in an index page has 6 Bytes, and an rid has 10 Bytes.

1. (5 Points) Consider the relational algebra expression
    π_f(σ_{b=10}((S ⋈ T) ⋈ R))
For each of the expressions below, indicate whether it is equivalent to the expression above.
(a) π_f(σ_{b=10}(R) ⋈ (S ⋈ π_{a,d}(T)))
(b) π_f(σ_{b=10}(π_{a,b}(R) ⋈ (π_d(S) ⋈ π_{a,d}(T))))

2. (7 Points) Consider the following SQL query:
    SELECT R.b, S.d
    FROM R, S, T
    WHERE R.a = T.a AND S.d = T.d AND S.e = 500 AND R.c <> T.h
Give an equivalent expression in relational algebra. If your expression is not yet optimal, perform an algebraic optimization according to the rules discussed in class. There is no need to consider cardinalities. There is no need to indicate what kind of joins are going to be performed.

3. (8 Points) There is an unclustered B+-tree for the attribute b of relation R where each data entry of a leaf page can consist of a list of data entries. Leaf pages are filled to around 70%. Indicate the number of data entries, the number of rids per data entry, the size of data entries, and the number of leaf pages. Give a short explanation of how you derived the numbers.

4. (15 Points) Assume a clustered B+-tree on attribute b of relation R, and an unclustered B+-tree on attribute a of relation T. Assume that all inner pages of indices reside in main memory and only leaf pages and data pages require I/O. Assume that you have around 50 buffer pages for sorts and block-based loops. Now consider the query
    SELECT R.a, R.b, T.h
    FROM R, T
    WHERE R.a = T.a AND R.b <= X
    ORDER BY T.h
(a) Provide the best execution strategy for the query if X = 10000.
(b) Provide the best execution strategy if X = 50.
In both cases, give a rough estimate of the number of I/Os for this strategy. Indicate how you calculated your I/O estimate. This might also include estimates of the size of intermediate results (to justify that a result fits into main memory or that it has to be written out to disk, etc.). You can describe the execution strategy in informal English or by giving an execution plan (as long as I understand what you meant). If you provide a correct but expensive execution strategy, you will get points but maybe not the maximum.

6 Transactions (25 Points)

1. (15 Points) Given the following schedule:

[Figure: schedule over T1, T2 and T3, one column per transaction; the interleaving across columns is not preserved in this transcription. Operations per transaction, in order:
T1: w1(a) w1(b) c1
T2: r2(b) r2(c) w2(c) c2
T3: r3(b) w3(c) r3(b) c3]

(a) Is the execution serializable? If yes, provide an equivalent serial schedule. If not, indicate the conflicting pairs that lead to the schedule not being serializable.
(b) Is the schedule (i) recoverable, (ii) avoiding cascading aborts, (iii) strict?
(c) Assume now that the figure above does not depict the final schedule but the sequence of operations submitted to the system. Now assume a DBMS that uses strict 2PL for concurrency control. Assume the following behavior for shared locks: when a shared lock is requested and an exclusive lock is granted, then wait. If only shared locks are granted, then grant the new shared lock (even if some exclusive locks are waiting). The DBMS processes actions in the order shown. If a transaction is blocked, assume that all of its actions are queued until it is resumed; the DBMS continues with the next action (according to the listed sequence) of an unblocked transaction.
i. Describe how strict 2PL handles the sequence above. Add to the above sequence the lock request S_i(a) if T_i acquires a shared lock on object a, X_i(a) if T_i acquires an exclusive lock on object a, and the unlock request U_i(a) if T_i releases any lock on object a.
ii. Provide a serial schedule that is conflict-equivalent to the schedule produced under 2PL.
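Part (a) asks about serializability; the standard test for conflict serializability builds a precedence graph and checks it for cycles. The sketch below illustrates that test on two small hypothetical schedules (ok and bad), not on the exam schedule, whose interleaving is given only in the figure. Commits are left out of the model, and all function and variable names are assumptions for illustration.

```python
from itertools import combinations

def precedence_edges(schedule):
    """Return edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj.

    schedule is a list of (txn, op, obj) triples in execution order,
    with op being 'r' or 'w'. Two operations conflict if they belong to
    different transactions, touch the same object, and at least one writes.
    """
    edges = set()
    # combinations() keeps list order, so the first triple is the earlier one.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((t1, t2))
    return edges

def serial_order(schedule):
    """Return a conflict-equivalent serial order, or None if the graph has a cycle."""
    edges = precedence_edges(schedule)
    remaining = {t for t, _, _ in schedule}
    order = []
    while remaining:
        # Transactions with no incoming edge from another remaining transaction.
        free = sorted(t for t in remaining
                      if not any((u, t) in edges for u in remaining))
        if not free:
            return None  # every remaining transaction is on or behind a cycle
        order.append(free[0])
        remaining.remove(free[0])
    return order

# Hypothetical schedules for illustration only.
ok = [(1, "w", "a"), (2, "r", "b"), (1, "w", "b"), (2, "w", "c"), (3, "r", "c")]
bad = [(1, "r", "a"), (2, "w", "a"), (2, "r", "b"), (1, "w", "b")]
print(serial_order(ok))
print(serial_order(bad))
```

In the first hypothetical schedule T2 must precede both T1 and T3, so a serial order exists; in the second, T1 and T2 each precede the other, so none does.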

2. (10 Points) Assume the following concurrency protocol. For any update transaction, i.e., a transaction that has at least one write operation, strict 2PL is used for all its operations. Additionally, the system uses a multi-version scheme. That is, each write operation of T_i on x creates a new version x_i of x. Thus, we denote a write operation of T_i on x as w_i(x_i). Read-only transactions, i.e., transactions that only have read operations and no update operation, do not acquire any locks. Instead, whenever a read-only transaction T_i performs a read operation r_i(x), it reads the version x_j such that T_j was the last to write x and commit before T_i started.

The following figure illustrates this behavior. T_i reads the version x_2 created by T_2 because T_2 is the last to write x and commit before T_i started. T_i does not read x_1 because T_1 is not the last to write x and commit before T_i started. T_i does not read x_3, although x_3 was already committed when the read takes place, as T_3 is concurrent to T_i (it did not commit before T_i started). The figure also shows that the read of version x_2 can take place concurrently to the write w_4(x_4) of T_4, as no read lock is set and different versions are accessed.

[Figure: timeline (horizontal axis: Time) showing T1 with w1(x1), T2 with w2(x2), T3 with w3(x3), T4 with w4(x4), and Ti with ri(x2).]

For each of the anomalies dirty read, unrepeatable read, and phantom, indicate whether read-only transactions can experience this anomaly. Provide short reasoning.
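The version-selection rule for read-only transactions can be written down in a few lines. This is a minimal sketch, assuming logical timestamps for start and commit events; the function name version_read and the concrete timestamp values are hypothetical and chosen only to match the ordering described for the figure (T1 and T2 commit before T_i starts, T3 commits afterwards, T4 has not committed).

```python
def version_read(start_time, committed_writers):
    """Pick the version of x that a read-only transaction sees.

    start_time: logical time at which the read-only transaction started.
    committed_writers: list of (txn, commit_time) pairs for transactions
        that wrote x and have committed.
    Returns the writer whose commit is the latest one before start_time,
    or None if no version was committed before the reader started.
    """
    visible = [(commit, txn) for txn, commit in committed_writers
               if commit < start_time]
    return max(visible)[1] if visible else None

# Hypothetical timestamps mirroring the figure's ordering.
writers = [("T1", 1), ("T2", 2), ("T3", 4)]  # T4 is still running: not listed
print(version_read(3, writers))  # reader started at time 3
```

The result depends only on the reader's start time and on the commit times of the writers, not on the moment at which the read itself executes.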