
--------------------------------------------------------------------------------------------------------------
INSTITUTO SUPERIOR TÉCNICO
Administração e optimização de Bases de Dados
Exam 1 - Solution
1 June 2013
--------------------------------------------------------------------------------------------------------------

The duration of this exam is 2.5 hours. You can access your own written materials, but the exam is to be done individually. You are not allowed to use computers, tablets, or mobile phones. The maximum grade of the exam is 20 pts. Write your answers below the questions. Write your number and name at the top of each page. Present all calculations performed. After the exam starts, you may only leave the room one hour later, upon delivering the exam.

The following table is to be used by instructors ONLY:

Question:  1  2  3  4  5  SUM
Points:    4  4  4  4  4   20

1. (4 pts) Data Indexing

1.1. (2.5 pts) Consider the following B+ tree structure:

Root:   [13 | 17 | 24 | 30]
Leaves: [2 3 5 7]  [14 16]  [19 20 22]  [24 27 29]  [33 34 38 39]

Assume that when a leaf node is split, two values are copied to the left node and three to the right node. Show the intermediate trees resulting from inserting the entry with key 42, and then deleting the entry with key 16.

Insert 42: (tree diagram missing from the transcription)

and then, delete 16: (tree diagram missing from the transcription)

1.2. (1 pt) Taking into account the original tree, indicate a sequence of five search key values such that inserting them in the given order, and then deleting them in the opposite order (i.e., insert a, insert b, delete b, delete a), results in the original tree. Justify your choice.

Insert 13, 15, 23, 25, 28. The first four make no change to the number of nodes in the tree, because they fill empty slots in leaf nodes. Inserting 28 causes the leaf node (24, 25, 27, 29) to split into (24, 25) and (27, 28, 29), which in turn splits the parent node into (13, 17) and (27, 30) and adds a new root node (24). When we delete in the opposite order:
- deleting 28 makes no difference in the number of nodes;
- deleting 25 causes the two leaf nodes to merge into (24, 27, 29) and the parent nodes to merge back into (13, 17, 24, 30), leading to the original tree;
- deleting 23, 15 and 13 simply empties the slots they had filled.

1.3. (0.5 pt) Taking into account the original tree, indicate a sequence of five search key values such that inserting them in the given order, and then deleting them in the opposite order (i.e., insert a, insert b, delete b, delete a), results in a tree different from the original one. Justify your choice.

Insert 13, 15, 23, 28, 42. It is enough to look at the solutions of Questions 1.1 and 1.2 to understand why deleting the entries in the opposite order does not lead back to the original tree: inserting 42 splits the rightmost leaf (and the root), and deleting 42 afterwards leaves the new leaf (38, 39) with enough entries, so no merge occurs and the extra nodes remain.
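The leaf-split rule stated in Question 1.1 (two values stay in the left node, three go to the right, and the right node's first key is pushed to the parent) can be sketched in a few lines; the function name and list representation here are illustrative only, not part of the exam:

```python
def split_leaf(keys, new_key):
    """Split an overfull 4-slot B+ tree leaf after inserting new_key:
    two values stay in the left node, three go to the right node,
    and the right node's first key is the separator pushed to the parent."""
    merged = sorted(keys + [new_key])   # 5 keys in the overfull leaf
    left, right = merged[:2], merged[2:]
    return left, right, right[0]

# Inserting 28 into the full leaf (24, 25, 27, 29), as in the solution of 1.2:
print(split_leaf([24, 25, 27, 29], 28))  # ([24, 25], [27, 28, 29], 27)
```

The separator 27 is exactly the key that ends up in the new parent node (27, 30) in the solution above.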

2. (4 pts) Query Processing and Optimization

2.1. (2.5 pts) Consider the following database relations:

student(id, name, master-program, avg-grade)
takes(id, course, semester, year, grade)

The student relation has 5,000 tuples, and 1 page accommodates 50 tuples. The takes relation has 10,000 tuples, and 1 page accommodates 25 tuples. Answer the following questions, assuming the size of the memory available is 11 pages:

a) What is the cost, in terms of I/O, of joining student and takes using the merge join algorithm?
b) What is the cost, in terms of I/O, of joining the same tables using the hash join algorithm?

student: 5,000 tuples => 5,000 / 50 = 100 pages = b_student
takes: 10,000 tuples => 10,000 / 25 = 400 pages = b_takes
M = 11

a) Merge join: Cost = b_student + b_takes + cost(sorting student) + cost(sorting takes). We assume that neither relation is sorted by id (one could assume student is, because id is its primary key, but the same cannot be assumed for takes).

cost(sorting student) = b_student × (2 × ceiling(log_{M-1}(b_student / M)) + 1) = 100 × (2 × 1 + 1) = 300
cost(sorting takes) = b_takes × (2 × ceiling(log_{M-1}(b_takes / M)) + 1) = 400 × (2 × 2 + 1) = 2,000

Cost = 100 + 400 + 300 + 2,000 = 2,800

b) Hash join, with student as the build relation: the build relation must be split into about 100 / 11 ≈ 10 partitions so that each partition fits in memory; since 10 < 11 available buffers, a single partitioning pass suffices and no recursive partitioning is necessary.

Cost = 3 × (b_student + b_takes) = 3 × (100 + 400) = 1,500
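The cost arithmetic above can be double-checked with a short script; the helper names are ours, and the sort-cost expression is the one used in this solution:

```python
from math import ceil, log

def sort_cost(b, M):
    # b * (2 * ceiling(log_{M-1}(b / M)) + 1), the external-sort cost
    # formula used in the solution above
    return b * (2 * ceil(log(b / M, M - 1)) + 1)

def merge_join_cost(b_r, b_s, M):
    # sort both inputs, then read each once for the merge
    return sort_cost(b_r, M) + sort_cost(b_s, M) + b_r + b_s

def hash_join_cost(b_build, b_probe):
    # one partitioning pass (read + write both inputs), then read both
    return 3 * (b_build + b_probe)

b_student, b_takes, M = 100, 400, 11
print(merge_join_cost(b_student, b_takes, M))  # 2800
print(hash_join_cost(b_student, b_takes))      # 1500
```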

2.2. (1 pt) Suppose that, before applying the join operation of Question 2.1, a selection is applied to the takes relation, filtering out students with grade less than or equal to 12. Assuming you have no statistical information about the relations, other than what is presented in Question 2.1, estimate the number of tuples resulting from the join operation. Justify.

σ_{grade <= 12}(takes) ⋈ student

Since we do not have any statistical information about the selection, the estimated number of tuples it produces is equal to the number of tuples of takes / 2 = 5,000. Let us call takes' the result of the selection; takes' is one of the inputs of the join operation. The attribute in common between takes' and student is id, which is a foreign key in takes referencing student. So, the estimated number of tuples produced by the join is equal to the number of tuples of takes', which is 5,000.

2.3. (0.5 pt) What would be the answer to Question 2.2 if the selection condition were grade = 18 and we knew that the number of distinct values the grade attribute can take in relation takes is 20?

In this case, the number of tuples produced by the selection is equal to the number of tuples of takes / V(grade, takes) = 10,000 / 20 = 500, where V(grade, takes) is the number of distinct values grade can take in relation takes. For the same reason explained in 2.2, the estimated number of tuples produced by the join is 500.
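These two textbook estimates (half the input when nothing is known, input size divided by the number of distinct values for an equality predicate) can be captured directly; the function name is ours:

```python
def est_selection_size(n_tuples, v_distinct=None):
    """Estimated output size of a selection: n / 2 with no statistics,
    n / V(attr) for an equality predicate with V(attr) distinct values."""
    return n_tuples // 2 if v_distinct is None else n_tuples // v_distinct

n_takes = 10_000
print(est_selection_size(n_takes))      # 5000 (grade <= 12, no statistics)
print(est_selection_size(n_takes, 20))  # 500  (grade = 18, V(grade, takes) = 20)
```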

3. (4 pts) Concurrency Control

3.1. (2.5 pts) Consider a relation with the schema shopping-cart(name, price), and consider the following two transactions:

T1: begin transaction
    Q1: insert into shopping-cart values ('milk', 10)
    Q2: update shopping-cart set price = price + 5
    commit

T2: begin transaction
    Q3: insert into shopping-cart values ('butter', 5)
    Q4: select sum(price) from shopping-cart
    commit

Suppose the table shopping-cart is initially empty.

a) If both transactions execute successfully till the end with isolation level SERIALIZABLE, what are the possible values returned by Q4?
b) If both transactions execute successfully till the end with isolation level READ UNCOMMITTED, what are the possible values returned by Q4?
c) If both transactions execute with isolation level SERIALIZABLE, T2 commits, but T1 rolls back at some point before it commits, what are the possible values returned by query Q4?
d) If both transactions execute with isolation level READ UNCOMMITTED, T2 commits, but T1 rolls back at some point before it commits, what are the possible values returned by query Q4?
e) If both transactions execute successfully till the end, T1 uses isolation level REPEATABLE READ and T2 uses isolation level READ UNCOMMITTED, what are the possible values returned by query Q4?

a) T1 then T2: Q4 returns 20; T2 then T1: Q4 returns 5.
b) Q4: 5, 15, 20 or 25.
c) Q4: 5.
d) Q4: 5, 15, 20 or 25.
e) Q4: 5, 15 or 20.
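For part a), the two serial orders can be checked by simulating the transactions on an in-memory cart (a sketch; the dictionary representation is ours, not part of the exam):

```python
def run_serial(order):
    """Run T1 and T2 of Question 3.1 serially, in the given order,
    and return the value read by Q4 (sum of prices in shopping-cart)."""
    cart = {}  # name -> price; the table starts empty
    q4 = None
    for txn in order:
        if txn == 'T1':
            cart['milk'] = 10                            # Q1
            cart = {k: v + 5 for k, v in cart.items()}   # Q2: price = price + 5
        else:
            cart['butter'] = 5                           # Q3
            q4 = sum(cart.values())                      # Q4
    return q4

print(run_serial(['T1', 'T2']))  # 20: milk is 15 after T1, plus butter at 5
print(run_serial(['T2', 'T1']))  # 5: only butter is in the cart when Q4 runs
```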

3.2. (1 pt) Consider the following schedule for three concurrent transactions:

T1      T2      T3
--------------------------
                R(A)
R(C)
        W(A)
        W(B)
W(B)
                W(C)

a) Indicate whether it is a serializable schedule, and justify.
b) Is it possible to add lock and unlock requests to the individual operations such that the schedule satisfies the 2-Phase Locking protocol? Justify.

a) The precedence graph T1 -> T3 -> T2 -> T1 is cyclic (T1 reads C before T3 writes it, T3 reads A before T2 writes it, and T2 writes B before T1 writes it), so the schedule is not serializable.

b) It is not possible for the schedule to satisfy the 2PL protocol, because T3 would have to unlock_S(A) before T2 obtains lock_X(A), and only later request lock_X(C); acquiring a lock after releasing one violates 2PL.

3.3. (0.5 pt) Consider the schedule of Question 3.2 and indicate whether it is possible under the timestamp-based protocol. Justify.

It is not possible: when T1 tries to write B, its timestamp satisfies TS(T1) < W-timestamp(B) = TS(T2), so T1 is rolled back instead of completing the write.
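The cycle in a) can be verified mechanically. The sketch below builds the precedence graph from (transaction, operation, item) triples; the assignment of operations to transactions follows the justifications in the answers (T3 reads A and writes C, T2 writes A and B, T1 reads C and writes B):

```python
def precedence_graph(schedule):
    """Conflict edges between transactions: two operations on the same
    item conflict if they belong to different transactions and at least
    one of them is a write; the edge points from the earlier operation."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and 'W' in (op1, op2):
                edges.add((t1, t2))
    return edges

# The schedule of Question 3.2, in execution order:
s = [('T3', 'R', 'A'), ('T1', 'R', 'C'), ('T2', 'W', 'A'),
     ('T2', 'W', 'B'), ('T1', 'W', 'B'), ('T3', 'W', 'C')]
print(sorted(precedence_graph(s)))
# [('T1', 'T3'), ('T2', 'T1'), ('T3', 'T2')] -- the cycle T1 -> T3 -> T2 -> T1
```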

4. (4 pts) Recovery Management

Consider the execution shown in the following transaction log, and assume that the system crashed after the last record shown. Assume also that the Dirty Page Table (DPT) and the Active Transaction Table (TT) are both empty when the checkpoint is written to the log, at LSN 1.

LSN   LOG RECORD
1     Checkpoint
2     Update: T1 writes P3
3     Update: T2 writes P5
4     T2 abort
5     Update: T1 writes P2
6     Update: T3 writes P5
7     Update: T1 writes P4
8     T1 abort

Explain how the system would recover from the crash, using the ARIES recovery algorithm. Be precise about what happens in each step of the recovery algorithm, answering the following questions:

4.1. (2.5 pts) Regarding the recovery with the ARIES algorithm, state explicitly: (i) what is done during the ANALYSIS phase of the algorithm, and (ii) what is done during the REDO phase of the algorithm. Explain exactly how each of these steps is executed over the transaction log shown above.

The analysis stage starts by determining that the last checkpoint was at LSN 1. In the following explanation, Active Transaction Table entries are denoted (transID, lastLSN, status) and DPT entries are denoted (pageID, recLSN). The analysis phase runs until the last LSN shown in the log, and does the following:

LSN 2: Adds (T1, 2, Active) to the TT and (P3, 2) to the DPT.
LSN 3: Adds (T2, 3, Active) to the TT and (P5, 3) to the DPT.
LSN 4: Changes (T2, 3, Active) to (T2, 4, Aborted) in the TT.
LSN 5: Adds (P2, 5) to the DPT and changes (T1, 2, Active) to (T1, 5, Active) in the TT.
LSN 6: Adds (T3, 6, Active) to the TT. Does not change the P5 entry in the DPT (its recLSN remains 3).
LSN 7: Adds (P4, 7) to the DPT and changes (T1, 5, Active) to (T1, 7, Active) in the TT.
LSN 8: Changes (T1, 7, Active) to (T1, 8, Aborted) in the TT.

After the analysis, the Active Transaction Table has three entries: (T1, 8, Aborted), (T2, 4, Aborted) and (T3, 6, Active). The Dirty Page Table has four entries: (P3, 2), (P5, 3), (P2, 5) and (P4, 7).

After the analysis stage, the REDO stage of the ARIES algorithm performs the following actions, starting from LSN 2 (i.e., the smallest recLSN in the DPT):

LSN 2: P3 is retrieved and its pageLSN is checked. If the page had been written to disk before the crash (i.e., if pageLSN >= 2), nothing is redone; otherwise, the change is redone.
LSN 3: P5 is retrieved and its pageLSN is checked. If pageLSN >= 3, nothing is redone; otherwise, the change is redone.
LSN 4: No action (not an update record).
LSN 5: P2 is retrieved and its pageLSN is checked. If pageLSN >= 5, nothing is redone; otherwise, the change is redone.
LSN 6: P5 is retrieved and its pageLSN is checked. If pageLSN >= 6, nothing is redone; otherwise, the change is redone.
LSN 7: P4 is retrieved and its pageLSN is checked. If pageLSN >= 7, nothing is redone; otherwise, the change is redone.
LSN 8: No action.

4.2. (1 pt) After answering the first question, explain what is done during the UNDO phase of the ARIES recovery algorithm, and show the resulting log when the recovery procedure is complete, including all prevLSN and undoNextLSN values in the records that are eventually added to the log.

The UNDO stage starts at LSN 8 (i.e., the highest lastLSN in the Active Transaction Table). Initially, the ToUndo list consists of LSNs 8, 6 and 4, the lastLSNs of transactions T1, T3 and T2, respectively.

LSN 8: Remove 8 from ToUndo. This is T1's abort record, so its prevLSN, 7, is added. ToUndo = (7, 6, 4).
LSN 7: Remove 7 from ToUndo. Undo the change on P4 and add a CLR recording this undo (LSN 9, undoNextLSN = 5). ToUndo = (6, 5, 4).
LSN 6: Remove 6 from ToUndo. Undo the change on P5 and add a CLR (LSN 10, undoNextLSN = null, since T3 has no earlier record). ToUndo = (5, 4).
LSN 5: Remove 5 from ToUndo. Undo the change on P2 and add a CLR (LSN 11, undoNextLSN = 2). ToUndo = (4, 2).
LSN 4: Remove 4 from ToUndo. This is T2's abort record, so its prevLSN, 3, is added. ToUndo = (3, 2).
LSN 3: Remove 3 from ToUndo. Undo the change on P5 and add a CLR (LSN 12, undoNextLSN = null). ToUndo = (2).
LSN 2: Remove 2 from ToUndo. Undo the change on P3 and add a CLR (LSN 13, undoNextLSN = null). ToUndo is empty and the UNDO stage ends.

In the end, the log would contain the following records:

LSN   LOG RECORD
1     Checkpoint
2     Update: T1 writes P3
3     Update: T2 writes P5
4     T2 abort
5     Update: T1 writes P2
6     Update: T3 writes P5
7     Update: T1 writes P4
8     T1 abort
9     CLR: Undo T1 (page = P4, undoNextLSN = 5)
10    CLR: Undo T3 (page = P5, undoNextLSN = null)
11    CLR: Undo T1 (page = P2, undoNextLSN = 2)
12    CLR: Undo T2 (page = P5, undoNextLSN = null)
13    CLR: Undo T1 (page = P3, undoNextLSN = null)

4.3. (0.5 pt) When executing the operations registered in the log, and later on when performing the recovery with the ARIES algorithm, imagine that the buffer pool was always large enough that uncommitted data would never be forced to disk. In this scenario, would the UNDO stage still be necessary? How about the REDO stage? Justify your answers.

The UNDO stage would not be required, because uncommitted data would never be written to disk (notice, however, that we would have to ensure that, when performing the REDO stage of a recovery procedure, uncommitted data would again never be forced to disk). The REDO stage would still be required, because some committed data might not have been written to disk yet.
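A minimal sketch of the ANALYSIS pass over this log, assuming a simplified record format of our own (real ARIES records also carry prevLSN pointers and further record types); it rebuilds the transaction table and the Dirty Page Table exactly as in the solution of 4.1:

```python
def aries_analysis(log):
    """Rebuild the transaction table (TT) and Dirty Page Table (DPT)
    from the log records that follow the checkpoint."""
    tt, dpt = {}, {}
    for lsn, txn, kind, page in log:
        if kind == 'update':
            tt[txn] = (lsn, 'Active')
            dpt.setdefault(page, lsn)   # recLSN: first LSN that dirtied the page
        elif kind == 'abort':
            tt[txn] = (lsn, 'Aborted')
    return tt, dpt

log = [(2, 'T1', 'update', 'P3'), (3, 'T2', 'update', 'P5'),
       (4, 'T2', 'abort', None),  (5, 'T1', 'update', 'P2'),
       (6, 'T3', 'update', 'P5'), (7, 'T1', 'update', 'P4'),
       (8, 'T1', 'abort', None)]
tt, dpt = aries_analysis(log)
print(tt)   # {'T1': (8, 'Aborted'), 'T2': (4, 'Aborted'), 'T3': (6, 'Active')}
print(dpt)  # {'P3': 2, 'P5': 3, 'P2': 5, 'P4': 7}
```

The smallest recLSN in the DPT (2) is where REDO starts, and the lastLSNs in the transaction table (8, 6, 4) are exactly the LSNs that seed the ToUndo list of 4.2.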

5. (4 pts) Miscellaneous

5.1. (1 pt) Consider the following simple query, whose performance within a given DBMS is not satisfactory.

SELECT NAME, SALARY
FROM EMPLOYEE
WHERE SALARY / 12 = 4000;

Assume that there are few other operations being executed concurrently with the query, and assume that there is a non-clustering B+ tree index on the SALARY attribute. Indicate two possible reasons for the poor performance of the query, and indicate how you would address these two particular problems. Justify your answer.

Three possible reasons for the poor performance would be:

(i) The non-clustering B+ tree index on SALARY is not being used to answer the query, because of the arithmetic expression "SALARY / 12". On some systems, the query optimizer may not be able to replace the condition "SALARY / 12 = 4000" by the condition "SALARY = 4000 * 12", thus failing to use the available index. One can address this issue by rewriting the query, replacing the condition "SALARY / 12 = 4000" by the condition "SALARY = 4000 * 12".

(ii) The selection condition "SALARY / 12 = 4000" may still return many tuples from the table, in which case the non-clustering B+ tree index is not particularly useful (i.e., we would still need to follow many pointers in order to retrieve the NAME attribute from the tuples of the table). This issue could be addressed by using a clustered index instead, or by hinting the query optimizer towards an execution plan based on a full-table scan (depending on the number of tuples returned by the selection predicate, this may be preferable to using the non-clustered index).

(iii) The non-clustering B+ tree index may still not provide the required level of performance, and for an equality query it may be more interesting to use a hash-based index instead of a B+ tree. This strategy would be particularly useful if the condition "SALARY / 12 = 4000" is highly selective (see the explanation associated with the second possible reason).

5.2. (1 pt) Consider a relational database where a table storing EMPLOYEE data is stored on DISK1, a table storing CUSTOMER data is stored on DISK2, and the recovery log is on DISK2 as well. The EMPLOYEE table is smaller than the CUSTOMER table, but it is accessed more often. The applications using both these tables are essentially read-only, and they involve many scans. Through the usage of a profiling tool, you noticed that the operations also involve many disk seeks (i.e., many random accesses), besides the scans. In terms of the hardware specifications, you know that DISK2 is particularly efficient for random accesses, and it also supports more than twice the I/O rate of DISK1. Consider also that the owner of the database is not willing to buy a new disk. What would you do to tune the performance of the database, in terms of how the data is stored on the different physical storage units (i.e., on DISK1 and DISK2)? Justify.

Several ideas could apply. Probably the best thing to do is to change the data placement, putting the recovery log on DISK1 and the data from both tables on DISK2. The log should work better in this case (i.e., not stored on the same disk as the data), given that it is an essentially sequential storage medium for keeping recovery information, and the data files would also end up on the faster disk. Given that the applications essentially perform scans on the tables, and given that there is still a significant amount of random access to data on the disks, one can also consider reorganizing the data files on the disks so that they occupy large sequential portions of the disk, raising the prefetching levels, and increasing page utilization. All these measures should reduce the seek time and the number of random disk accesses.

Given that the EMPLOYEE table is accessed more often, a partially correct answer to the exercise could involve placing this table on DISK2 (i.e., the faster disk) and the CUSTOMER table on DISK1. Notice, however, that it is a better idea to keep the log separate from the data.

5.3. (1 pt) Recall the invited talk where an engineer from NovaBase discussed practical DBMS performance tuning issues, and consider the main differences between mission-critical databases of type "Rally" and of type "Formula 1". For each of the following tuning options, indicate and justify for which type of system (i.e., Rally or Formula 1) it would make more sense to consider the option:
(i) Tuning the locking granularity and adjusting the isolation level.
(ii) Creating more indexes over the tables, in order to improve query efficiency.

Option (i) would make more sense for Rally systems, given that for these systems we should focus on improving concurrent access to the data (i.e., they combine concurrent reads with concurrent updates). Option (ii) would make more sense for Formula 1 systems, given that for these systems we should focus on improving query performance. Rally systems, on the other hand, involve a large number of insertions/updates besides often complex queries, and adding more indexes would likely decrease performance for the updates.

5.4. (1 pt) In the context of RAID storage systems, explain the concept of striping and how it affects I/O performance. In the case of a DBMS whose table data is stored on a RAID system, for what type of query will striping produce the greatest performance improvement? Justify your answers.

Striping consists of segmenting logically sequential data (e.g., a database file) so that consecutive segments are stored on different physical storage devices. In some RAID configurations, striping is used as a way to boost read performance: multiple, independent requests for data pages can be serviced in parallel by separate disks, decreasing the queueing time seen by I/O requests. Striping will produce the greatest performance improvement for queries involving scans of very large tables, given that through striping one can perform the reads in parallel over the different storage units.
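The round-robin placement behind block-level striping can be sketched as follows (the block granularity and the 4-disk array are hypothetical, chosen only for illustration):

```python
def stripe_location(block, n_disks):
    """Map a logical block number to (disk, offset) under round-robin
    block-level striping: consecutive blocks land on consecutive disks,
    so a large sequential scan keeps all disks busy in parallel."""
    return block % n_disks, block // n_disks

# First eight logical blocks of a file striped across 4 disks:
print([stripe_location(b, 4) for b in range(8)])
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```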