Prof. Li-Yan Yuan
CMPUT 391: Database Management Systems
Solutions to Final Examination
December 15, 2005

It is a closed-book examination and the time for the test is 120 minutes. There are twelve (12) questions over three (3) pages. The value of each question is indicated in [ ] and the total is 70. Good luck to all of you.

1. A well-known program package in C, called the Berkeley DB, provides all the access methods to various hard disk files, including B+tree, hash tables, and Extended Linear Hashing. The Berkeley DB also provides full transactional support, database recovery, on-line backups, and separate access to locking, logging and shared memory caching subsystems. Is such a system considered a database management system? Explain. [5]

Solution: Berkeley DB cannot be considered a DBMS because it does not provide any query language such as SQL.

2. Consider the following relation schema

    Account (account number, customer, branch, balance)

and two transactions that contain the following two SQL commands respectively:

    select sum (balance)          insert into Account
    from Account                  values ('001', 'Tom', 'UofA', 1000)
    where branch = 'UofA'

Is there any potential conflict (in terms of database consistency and/or SQL isolation levels) between the two transactions? Why? [5]

Solution: Yes. The insert adds a row that matches the predicate of the select, so it might lead to the phantom phenomenon, which can be prevented only if the isolation level is set to SERIALIZABLE. (Recall Question 1 in the midterm.)

3. Give an example of a schedule at the READ COMMITTED isolation level in which a lost update occurs. By a lost update, we mean that the update by a transaction is effectively ignored by the database system. [5]

Solution: Any schedule, such as the one below, with a lost update but no dirty read is an answer to this question.

    T1            T2
    Read(A)
                  Write(A)
                  Commit
    Write(A)
    Commit

The above schedule has no dirty read, but the update by T2 is lost.

4. Suppose there is a relation R = ABCDE in the database.
Describe how the trigger mechanism can be used to impose the FD constraint AB -> C. You need to present one (or more) CREATE TRIGGER statement(s) in Oracle, SQL99, or some other similar language for this task. [10]

Solution: The following two triggers are needed. Note the difference between the two.

    CREATE TRIGGER fd_enforcer_update
    BEFORE UPDATE ON R
    FOR EACH ROW
    DECLARE
        counter INT;
    BEGIN
        SELECT COUNT(*) INTO counter
        FROM R
        WHERE R.A = NEW.A AND R.B = NEW.B AND R.C <> NEW.C
          AND NOT (R.A = OLD.A AND R.B = OLD.B AND R.C = OLD.C
                   AND R.D = OLD.D AND R.E = OLD.E);
        IF (counter > 0) THEN
            raise_exception('AB->C on R was violated');
        END IF;
    END;

    CREATE TRIGGER fd_enforcer_insert
    BEFORE INSERT ON R
    FOR EACH ROW
    DECLARE
        counter INT;
    BEGIN
        SELECT COUNT(*) INTO counter
        FROM R
        WHERE R.A = NEW.A AND R.B = NEW.B AND R.C <> NEW.C;
        IF (counter > 0) THEN
            raise_exception('AB->C on R was violated');
        END IF;
    END;

The update trigger additionally excludes the row being updated (the OLD tuple) from the check, since that row is about to be replaced.

5. Consider R = ABCDE. For each of the following instances of R, state whether (1) it violates the FD AE -> C, and (2) it violates the MVD AC ->> D: [5]

(a) an empty table
(b)
    A  B  C  D  E
    a  2  3  4  5
    2  a  3  5  5
    a  2  3  6  5

Solution:
(a) An empty table satisfies both dependencies.
(b) R satisfies AE -> C but not AC ->> D.

6. Consider R = ABCDEGHI and the following set F of functional dependencies:

    H  -> GD
    E  -> D
    HD -> CE
    BD -> A

(a) Find a lossless-join, dependency-preserving 3NF decomposition of R.
(b) Indicate whether your database schema is in BCNF with respect to F. Explain. [10]

Solution: (a) We first find a minimal cover of the FDs, as shown below.

    Right reduced    Left reduced    Minimal cover
    H  -> G          H  -> G         H  -> G
    H  -> D          H  -> D
    E  -> D          E  -> D         E  -> D
    HD -> C          H  -> C         H  -> C
    HD -> E          H  -> E         H  -> E
    BD -> A          BD -> A         BD -> A

In the left reduction, D is dropped from HD -> C and HD -> E because H -> D already holds. In the last step, H -> D is removed because it follows from H -> E and E -> D.

Then construct a database schema D = {HGCE, ED, BDA}. Now, we need to check whether D contains any candidate key. Since no FD in the minimal cover above contains H, B, or I on its right side, any candidate key must contain these three attributes. Further, it is not difficult to check that HBI is indeed a candidate key, since its closure under F is ABCDEGHI. Therefore, HBI is the only candidate key of R, and must be added to D.

Hence, D = {HGCE, ED, BDA, HBI} is a lossless-join, dependency-preserving 3NF decomposition of R.

(b) D is in BCNF, since for every non-trivial FD X -> A that holds in any relation Ri in D, X is a key of Ri.

7. In designing a relational database schema, why might we choose a non-BCNF design? [5]

Solution: In many cases, there exists no database schema that is both in BCNF and dependency preserving. If one prefers to have a dependency-preserving database schema, then one has to choose a normal form, such as 3NF, that is weaker than BCNF.

8. After a transaction is rolled back under the timestamp ordering protocol, it is usually assigned a new timestamp when it starts again. Can it keep its old timestamp? Explain. [5]

Solution: The resubmitted transaction cannot keep its old timestamp, simply because it would probably be rolled back again: with the old timestamp, its updates would still arrive too late for transactions with later timestamps that have already read the data.

9. Consider the following log information that involves three transactions. Assume that the immediate update protocol with checkpointing is used for crash recovery. Note that the log file does not necessarily record all operations of a transaction.

    T1:  R(A)  R(B)  R(C)
    T2:  R(B)  W(B)  R(D)  Commit
    T3:  R(A)  R(D)  W(D)  R(B)

    [Timeline figure: the operations are laid out along a Time axis, with the Check Point and the Crash Point marked.]

Is the schedule, as shown in the log, recoverable? Why? [5]

Solution: The schedule is not recoverable because T2 reads D, whose value was written by T3, but T2 commits before T3 commits.

10. Explain that, when evaluating possible association rules, the confidence is always larger than the support. [5]

Solution: More precisely, the confidence is never less than the support. Assume the possible association rule is X -> Y. Let N be the number of transactions, and let P(X) and P(X ∪ Y) be the numbers of transactions that contain X, and both X and Y, respectively. Obviously, P(X) <= N. The confidence is defined as P(X ∪ Y) / P(X), while the support is defined as P(X ∪ Y) / N. Since P(X) <= N, the confidence is always greater than or equal to the support.

11. Consider the following three documents, each of which is an English sentence. Construct an inverted index on these documents. [5]

    D1: it is an open book examination
    D2: she likes examination
    D3: is it a book

Solution: The inverted index is shown in the following table.

    word           documents
    a              D3
    an             D1
    book           D1, D3
    examination    D1, D2
    is             D1, D3
    it             D1, D3
    likes          D2
    open           D1
    she            D2

12. Consider the following XML document. Define a relational database schema suitable for storing the information in the document in Oracle, and populate your database according to the XML
document. You may use a table with column names to show both the schema and the instance of the table. The better the schema, the higher the mark. [5]

Solution: The database with the following three tables is a good choice for the given document. Obviously, it is in BCNF.

    Courses
    course number    department           title
    c291             Computing Science    File and Database Management Systems
    c391             Computing Science    Database Management Systems

    Registration
    student    course    grade
    12345      c291      8
    12345      c391      5
    54321      c391      6

    Student
    sid      name                 email
    12345    Charles M. Schulz    schulz@hotmail
    54321    Richard Brewka       brewka@www.com
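The mapping from the XML document to these three tables can be carried out mechanically. The following is only a minimal sketch in Python, using SQLite in place of Oracle. It works on a repaired copy of the exam's XML, and several things in it are assumptions rather than part of the exam: the `<courses>` wrapper element, the restored `<students>` and `<grade>` opening tags, the column names, and the helper name `load`. Because the document lists two different e-mail addresses for student 12345, the sketch simply keeps the first one it meets.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Repaired copy of the exam's XML. Assumptions: a hypothetical <courses> root
# is added, and the missing <students>/<grade> opening tags are restored.
XML_DOC = """
<courses>
  <course number="c391" department="computing science">
    <title> Database Management Systems </title>
    <students>
      <student sid="54321">
        <name> Richard Brewka </name>
        <email> brewka@cs.ualberta.ca </email>
        <grade> 6 </grade>
      </student>
      <student sid="12345">
        <name> Charles M. Schulz </name>
        <email> Schulz@cs.ualberta.ca </email>
        <grade> 5 </grade>
      </student>
    </students>
  </course>
  <course number="c291" department="computing science">
    <title> File and Database Management Systems </title>
    <students>
      <student sid="12345">
        <name> Charles M. Schulz </name>
        <email> schulz@cs.aublerta.ca </email>
        <grade> 8 </grade>
      </student>
    </students>
  </course>
</courses>
"""


def load(xml_text):
    """Shred the XML into Courses, Student and Registration tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Courses (number TEXT PRIMARY KEY, department TEXT, title TEXT);
        CREATE TABLE Student (sid TEXT PRIMARY KEY, name TEXT, email TEXT);
        CREATE TABLE Registration (
            student TEXT REFERENCES Student(sid),
            course  TEXT REFERENCES Courses(number),
            grade   INTEGER,
            PRIMARY KEY (student, course));
    """)
    for course in ET.fromstring(xml_text).iter("course"):
        number = course.get("number")
        db.execute("INSERT INTO Courses VALUES (?, ?, ?)",
                   (number, course.get("department"),
                    course.findtext("title").strip()))
        for s in course.iter("student"):
            sid = s.get("sid")
            # A student may appear under several courses; keep the first copy.
            db.execute("INSERT OR IGNORE INTO Student VALUES (?, ?, ?)",
                       (sid, s.findtext("name").strip(),
                        s.findtext("email").strip()))
            db.execute("INSERT INTO Registration VALUES (?, ?, ?)",
                       (sid, number, int(s.findtext("grade"))))
    return db


db = load(XML_DOC)
registrations = db.execute(
    "SELECT student, course, grade FROM Registration "
    "ORDER BY course, student").fetchall()
print(registrations)
```

Student data is stored once per `sid` and grades live only in Registration, so the repeated `<student>` element for 12345 does not produce a duplicated name or e-mail.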
<?xml version="1.0" encoding="utf-8"?>
<course number="c391" department="computing science">
  <title> Database Management Systems </title>
  <students>
    <student sid="54321">
      <name> Richard Brewka </name>
      <email> brewka@cs.ualberta.ca </email>
      <grade> 6 </grade>
    </student>
    <student sid="12345">
      <name> Charles M. Schulz </name>
      <email> Schulz@cs.ualberta.ca </email>
      <grade> 5 </grade>
    </student>
  </students>
</course>
<course number="c291" department="computing science">
  <title> File and Database Management Systems </title>
  <students>
    <student sid="12345">
      <name> Charles M. Schulz </name>
      <email> schulz@cs.aublerta.ca </email>
      <grade> 8 </grade>
    </student>
  </students>
</course>
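Returning to Question 6, the claims that HBI is a candidate key and that the minimal cover is equivalent to F can be checked with the standard attribute-closure procedure. The sketch below is illustrative only; the function name `closure` and the encoding of each FD as a pair of strings are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCDEGHI")
F = [("H", "GD"), ("E", "D"), ("HD", "CE"), ("BD", "A")]
MINIMAL_COVER = [("H", "G"), ("E", "D"), ("H", "C"), ("H", "E"), ("BD", "A")]

# HBI determines all of R, and no proper subset obtained by dropping one
# attribute does, so HBI is a candidate key.
is_superkey = closure("HBI", F) == R
is_minimal = all(closure(set("HBI") - {a}, F) != R for a in "HBI")

# The minimal cover is equivalent to F: each set implies every FD of the other.
covers_f = all(set(rhs) <= closure(lhs, MINIMAL_COVER) for lhs, rhs in F)
implied_by_f = all(set(rhs) <= closure(lhs, F) for lhs, rhs in MINIMAL_COVER)

print(is_superkey, is_minimal, covers_f, implied_by_f)
```

The same function can be used to test the BCNF condition in part (b) by computing the closure of each left-hand side within its relation schema.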