Solutions to Final Examination

Prof. Li-Yan Yuan
CMPUT 391: Database Management Systems
April 23, 2007

It is a closed-book examination and the time for the test is 120 minutes. There are ten (10) questions over two (2) pages. The value of each question is indicated in [ ] and the total is 100. Good luck to all of you.

1. Indicate whether each of the following statements is TRUE or FALSE. Each correct answer is worth 4 points, while an incorrect answer leads to a deduction of 2 points. [16]

(a) Statement: The K-means algorithm is used in clustering because the number of clusters (categories) to be determined is known in advance.
    Solution: FALSE.

(b) Statement: The ID3 algorithm for data classification is guaranteed to terminate.
    Solution: TRUE.

(c) Statement: The two-phase locking protocol uses rollbacks to resolve deadlock, while the timestamp protocol uses rollbacks to resolve starvation.
    Solution: FALSE.

(d) Statement: The FOR clause of XQuery is used to declare variables and bind each variable to its range.
    Solution: TRUE.

2. Consider a database with the following two tables with obvious meanings:

    course(course_id, course_name, no_credit)
    registration(student_id, course_id, grade)

Note that the underlined attributes are keys of the tables. Write the triggers for imposing the following foreign key constraint specified in the CREATE TABLE statement for registration: [12]

    FOREIGN KEY (course_id) REFERENCES course ON DELETE CASCADE

Solution: To enforce the foreign key constraint, both deletions on course and updates/insertions on registration must be monitored. Consequently, the following two triggers are needed.

The first implements the DELETE CASCADE:

    CREATE TRIGGER reference_constraint_on_course
    AFTER DELETE OR UPDATE ON course
    FOR EACH ROW
    BEGIN
      -- Remove dependent rows when the course is deleted or its key changes.
      IF (UPDATING AND :NEW.course_id <> :OLD.course_id) OR DELETING THEN
        DELETE FROM registration WHERE course_id = :OLD.course_id;
      END IF;
    END;

The second makes sure that any registered course is indeed recorded in the course table:
    CREATE TRIGGER foreign_key_on_registration
    BEFORE INSERT OR UPDATE ON registration
    FOR EACH ROW
    DECLARE
      counter INT;
    BEGIN
      -- Count the course rows matching the new registration's course_id.
      SELECT COUNT(*) INTO counter FROM course WHERE course_id = :NEW.course_id;
      IF (counter < 1) THEN
        RAISE_APPLICATION_ERROR(-20001, 'the foreign key constraint violated');
      END IF;
    END;

3. Consider a relation schema R = ABCDE, functional dependencies

    B → E
    E → A
    A → D
    D → E

and a decomposition D = {AB, BCD, ADE} of R. Is D dependency preserving? Explain. [6]

Solution: Yes, D is dependency preserving. This is because the given set of FDs is equivalent to F' = {B → A, E → A, A → D, D → E}, and D is dependency preserving with respect to F': B → A is embedded in AB, and E → A, A → D, and D → E are all embedded in ADE.

4. Consider the database schema R = ABCDEF, and the following FDs:

    ABF → C
    CF → B
    CD → A
    BD → AE
    C → F
    B → F

(a) Find a minimal cover of the given set of FDs. Show all steps.

Solution:

i. Right reducing: Replace BD → AE with BD → A and BD → E.

ii. Left reducing: ABF → C can be replaced with AB → C, while CF → B can be replaced with C → B. This is because
   (1) AB → C is entailed by ABF → C and B → F, and C → B is entailed by CF → B and C → F;
   (2) AB → C entails ABF → C, and C → B entails CF → B; and therefore
   (3) the set {AB → C, B → F} is equivalent to {ABF → C, B → F}, and {C → B, C → F} is equivalent to {CF → B, C → F}.
   That is, after left reducing, we have

    AB → C
    C → B
    CD → A
    BD → A
    BD → E
    C → F
    B → F

iii. Eliminate redundant FDs: C → F is entailed by C → B and B → F, while CD → A is entailed by C → B and BD → A. Therefore, a minimal cover is

    AB → C
    C → B
    BD → A
    BD → E
    B → F
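
The entailment arguments in Questions 3 and 4(a) can be checked mechanically with attribute closures. The following is a minimal sketch, not part of the original solution: the helper names (`closure`, `entails`, `equivalent`, `embedded`, `is_minimal`) are hypothetical, attribute sets are written as strings of single-letter attributes, and the FD set for Question 4 is assumed to include B → F, the FD on which the left-reduction argument relies.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def entails(fds, lhs, rhs):
    """True if `fds` entails the FD lhs -> rhs."""
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    """True if the FD sets `f` and `g` entail each other."""
    return (all(entails(g, l, r) for l, r in f)
            and all(entails(f, l, r) for l, r in g))


def embedded(fd, schemas):
    """True if all attributes of `fd` occur together in one schema."""
    lhs, rhs = fd
    return any(set(lhs + rhs) <= set(s) for s in schemas)


def is_minimal(fds):
    """True if no FD is redundant and no left-hand attribute is extraneous."""
    for i, (lhs, rhs) in enumerate(fds):
        rest = fds[:i] + fds[i + 1:]
        if entails(rest, lhs, rhs):
            return False
        for attr in lhs:
            reduced = lhs.replace(attr, "")
            if reduced and entails(fds, reduced, rhs):
                return False
    return True


# Question 3: the given FDs, the equivalent set used in the solution,
# and the decomposition.
F3 = [("B", "E"), ("E", "A"), ("A", "D"), ("D", "E")]
F3_ALT = [("B", "A"), ("E", "A"), ("A", "D"), ("D", "E")]
DECOMP3 = ["AB", "BCD", "ADE"]

# Question 4(a): the given FDs (B -> F is an assumption, see above)
# and the minimal cover from step iii.
F4 = [("ABF", "C"), ("CF", "B"), ("CD", "A"), ("BD", "AE"),
      ("C", "F"), ("B", "F")]
COVER4 = [("AB", "C"), ("C", "B"), ("BD", "A"), ("BD", "E"), ("B", "F")]

print(equivalent(F3, F3_ALT), all(embedded(fd, DECOMP3) for fd in F3_ALT))
print(equivalent(F4, COVER4), is_minimal(COVER4))
```

Representing attribute sets as strings keeps the sketch close to the notation of the exam; it only works while every attribute is a single letter.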

(b) Construct a join lossless, dependency preserving, and 3NF decomposition of R.

Solution: The 3NF decomposition that corresponds to this minimal cover is {ABC, CB, BDAE, BF}. Note that BDAE is a super key of R.

(c) Are all the schemas in the previous step in BCNF? Explain. [12]

Solution: R1 = ABC is not in BCNF because C → B but C is not a key of R1. All the rest are in BCNF.

5. Consider the database schema R = ABCDE and a set of MVDs M = {AB →→ D}. Prove or disprove that D = {ABE, ABCD} is a join lossless decomposition of R with respect to M. Note that a proof must follow the definitions of join lossless and MVD, while a counterexample is sufficient for a disproof. [10]

Solution: D is not join lossless with respect to M. Consider the following table r over R.

    A   B   C    D    E
    a   b   c1   d1   e1
    a   b   c2   d2   e2
    a   b   c1   d2   e1
    a   b   c2   d1   e2

Then r satisfies M, since for every pair of tuples the tuple that takes its D value from one and its C and E values from the other is also in r. However, r ≠ Π_ABE(r) ⋈ Π_ABCD(r): the join contains (a, b, c1, d1, e2), which is not in r.

6. Consider the following database schema with obvious meanings:

    prof(p_id, p_name, department)
    course(c_code, department, c_name)
    teaching(p_id, c_code, term)

and the following query

    SELECT c.c_name, p.p_name
    FROM prof p, teaching t, course c
    WHERE t.term = 'Fall 2007' AND p.department = 'cs'
      AND p.p_id = t.p_id AND t.c_code = c.c_code

(a) Show the unoptimized relational algebra expression that corresponds to the above SQL query.

Solution:

    Π_{course.c_name, prof.p_name}(σ_θ(prof ⋈ teaching ⋈ course))

where θ represents teaching.term = 'Fall 2007' ∧ prof.department = 'cs'.

(b) Draw the corresponding query tree for the relational algebra expression obtained in the previous step.

(c) Draw the query tree obtained from the previous tree by applying the heuristic optimization rules. [10]

7. Consider the following schedule S.

    T1 T2 T3 T4 write(x) read(y) write(z) write(y) read(z)

(a) Draw a precedence graph (called a serialization graph in the textbook) for S.

Solution: The precedence graph is given below.

        T1
       /  \
      /    \
     /      \
    T2 ---> T3 ---> T4

(b) If S is serializable, give an equivalent serial schedule; if not, explain why. [10]

Solution: Yes, S is serializable, and an equivalent serial schedule is T1, T2, T3, T4.

8. Can a deadlock occur in the timestamp concurrency control protocol? [4]

Solution: No, a deadlock cannot occur in the timestamp protocol because no transaction ever needs to wait.

9. SQL99 specifies four (4) different levels of isolation, i.e., READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and (ANOMALY) SERIALIZABLE.

(a) Present a sample schedule S such that S observes READ UNCOMMITTED but not READ COMMITTED.

Solution: A schedule with a DIRTY READ, shown below, observes READ UNCOMMITTED but not READ COMMITTED.

    T1          T2
    Write(X)
                Read(X)
    Rollback

(b) Demonstrate that your schedule will not be possible under the two-phase locking protocol. Indicate any reasonable assumptions you may have. [10]

Solution: The two-phase locking protocol prevents T2 from reading X before T1 rolls back or commits, assuming the strict variant in which T1 holds its exclusive lock on X until it commits or rolls back.

10. Given the fact table below:

    sales(market_id, item_id, sale)

and the following SQL statement:

    SELECT market_id, item_id, SUM(sale)
    FROM sales
    GROUP BY CUBE (market_id, item_id)

Write one SQL statement without using any GROUP BY CUBE clause that returns the same result set as the above SQL statement. [10]

Solution:

    SELECT market_id, item_id, SUM(sale) FROM sales GROUP BY market_id, item_id
    UNION
    SELECT market_id, NULL, SUM(sale) FROM sales GROUP BY market_id
    UNION
    SELECT NULL, item_id, SUM(sale) FROM sales GROUP BY item_id
    UNION
    SELECT NULL, NULL, SUM(sale) FROM sales
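
The rewrite in Question 10 can be tried out on a small table. The following is a minimal sketch, not part of the original solution: it uses Python's built-in sqlite3 module purely for convenience, the function name `cube_by_union` and the sample rows are hypothetical, and UNION ALL is used in place of UNION on the assumption that the stored key columns contain no NULLs, so the four groupings never produce identical rows. The sketch runs only the rewritten statement and compares it with totals computed by hand.

```python
import sqlite3


def cube_by_union(rows):
    """Emulate GROUP BY CUBE (market_id, item_id) with four plain groupings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (market_id INTEGER, item_id INTEGER, sale INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT market_id, item_id, SUM(sale) FROM sales
            GROUP BY market_id, item_id
        UNION ALL
        SELECT market_id, NULL, SUM(sale) FROM sales GROUP BY market_id
        UNION ALL
        SELECT NULL, item_id, SUM(sale) FROM sales GROUP BY item_id
        UNION ALL
        SELECT NULL, NULL, SUM(sale) FROM sales
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (market_id, item_id, sale).
SAMPLE = [(1, 10, 5), (1, 20, 7), (2, 10, 3), (1, 10, 2)]

# Totals worked out by hand for the sample data.
EXPECTED = {
    (1, 10, 7), (1, 20, 7), (2, 10, 3),   # per (market, item)
    (1, None, 14), (2, None, 3),          # per market
    (None, 10, 10), (None, 20, 7),        # per item
    (None, None, 17),                     # grand total
}

print(set(cube_by_union(SAMPLE)) == EXPECTED)
```

Each branch of the union corresponds to one of the four subsets of {market_id, item_id}, which is exactly the set of groupings that CUBE enumerates for two columns.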