Database Systems (資料庫系統)
Midterm exam, November 15, 2006
Time: 10:00 ~ 12:20

Name:
Student ID:

I herewith state that I understand and will adhere to the following academic integrity: I will not use or attempt to use any unauthorized assistance, material, or study aids in this midterm examination. If I violate academic integrity, I may receive a grade of 0.

Signature (5 points):

Question     Max Points   Points
Signature    5
1            7
2            4
3            12
4            12
5            6
6            4
7            4
8            6
9            6
10           6
11           6
12           6
13           10
14           6
Total        100

Question 1: (7 points)
Drugwarehouse.com has offered you a free lifetime supply of prescription drugs (no questions asked) if you design its database schema. Given the rising cost of health care, you agree. Here is the information that you gathered:
    Patients are identified by their SSN, and we also store their names and age.
    Doctors are identified by their SSN, and we also store their names and specialty.
    Each patient has one family doctor, and we want to know since when the patient has been with her family doctor.
    Each doctor has at least one patient.
Draw an ER diagram that captures the above information:
    Choice 1: [Figure: ER diagram]
    Choice 2: [Figure: ER diagram]

Question 2: (4 points)
Consider the following E/R diagram.
    [Figure: E/R diagram for relationship set R]
Suppose the key of entity set A is attribute A, the key of B is B, the key of C is C, and the key of D is D. If we translate relationship set R into a relation R(A,B,C,D), what are all the keys of R?
    (A) {A}
    (B) {B}, {C}, and {D}
    (C) {B,C,D}
    (D) {A,B,C}, {A,B,D}, and {A,C,D}
Answer: B

Question 3: (12 points)
Consider the following ER schema for the MOVIES database.
    [Figure: ER schema for the MOVIES database]
Assume that MOVIES is a populated database. Actor is used as a generic term and includes actresses. Given the constraints shown in the ER schema, respond to the following statements by circling True, False, or Maybe. Circle the response of Maybe to statements that, while not explicitly shown to be True, cannot be proven False based on the schema as shown.
    a. There are no actors in this database that have been in no movies. ( True / False / Maybe )
    b. There are some actors who have acted in more than ten movies. ( True / False / Maybe )
    c. Some actors have done a lead role in multiple movies. ( True / False / Maybe )
    d. A movie can have only a maximum of one lead actor. ( True / False / Maybe )
    e. Every director has been an actor in some movie. ( True / False / Maybe )
    f. No producer has ever been an actor. ( True / False / Maybe )
    g. A producer cannot be an actor in some other movie. ( True / False / Maybe )
    h. There are movies with more than a dozen actors. ( True / False / Maybe )
    i. Some producers have been a director as well. ( True / False / Maybe )
    j. Most movies have one director and one producer. ( True / False / Maybe )
    k. Some movies have one director but several producers. ( True / False / Maybe )
    l. No movie has a director that also acted in that movie. ( True / False / Maybe )

Questions 4 and 5 below refer to the following E/R design for a database that keeps track of buildings, rooms, and in particular, conference rooms.
    [Figure: E/R diagram of Buildings, Rooms, and ConferenceRooms]

Question 4: (12 points)
Convert the above E/R diagram into relations using the CREATE TABLE command.

    CREATE TABLE Buildings (
        name CHAR(20),
        year INTEGER,
        PRIMARY KEY (name)
    );

    CREATE TABLE Rooms_in (
        area CHAR(20),
        number INTEGER,
        name CHAR(20) NOT NULL,
        PRIMARY KEY (number, name),
        FOREIGN KEY (name) REFERENCES Buildings ON DELETE CASCADE
    );

    CREATE TABLE ConferenceRooms (
        capacity INTEGER,
        number INTEGER,
        name CHAR(20),
        PRIMARY KEY (number, name),
        FOREIGN KEY (number, name) REFERENCES Rooms_in
    );

Question 5: (6 points)
Which of the following statements are true according to the constraints encoded by the E/R diagram above? Do not make any assumptions other than those encoded by the E/R diagram.
    I. The number of entities in the Rooms entity set must be greater than or equal to the number of entities in the ConferenceRooms entity set.
    II. The number of entities in the Rooms entity set must be greater than or equal to the number of entities in the Buildings entity set.
    (A) I only
    (B) II only
    (C) Both I and II
    (D) Neither I nor II
Answer: A
If any statement is incorrect, provide a short explanation (e.g., a counterexample) for the incorrect one(s).
(II) is incorrect because there may be buildings with no rooms (there is no constraint stating that a building must have a room).
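
The counterexample for statement II in Question 5 can be tried out on a small in-memory database. The sketch below is an illustration only: it assumes Python's standard sqlite3 module as the engine, the building name 'Annex' and all values are made up, and the ConferenceRooms table is declared with a composite foreign key on (number, name), which is an assumption about how the ISA relationship should be enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Buildings (
        name CHAR(20),
        year INTEGER,
        PRIMARY KEY (name)
    );
    CREATE TABLE Rooms_in (
        area   CHAR(20),
        number INTEGER,
        name   CHAR(20) NOT NULL,
        PRIMARY KEY (number, name),
        FOREIGN KEY (name) REFERENCES Buildings ON DELETE CASCADE
    );
    CREATE TABLE ConferenceRooms (
        capacity INTEGER,
        number   INTEGER,
        name     CHAR(20),
        PRIMARY KEY (number, name),
        FOREIGN KEY (number, name) REFERENCES Rooms_in
    );
""")


def count_rows(table: str) -> int:
    """Return the number of rows in one of the three tables above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Statement II fails: a building with no rooms is accepted (hypothetical data).
conn.execute("INSERT INTO Buildings VALUES ('Annex', 1990)")
print(count_rows("Buildings"), count_rows("Rooms_in"))

# Statement I holds: a conference room without a matching room is rejected.
try:
    conn.execute("INSERT INTO ConferenceRooms VALUES (30, 101, 'Annex')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

After the first insert the database holds one building and zero rooms, which is exactly the situation that statement II rules out.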

Question 6: (4 points)
Suppose that two relations R(A,B) and S(A,B) have exactly the same schema. Which of the following equalities hold in relational algebra?
    I. R ∩ S = R − (R − S)
    II. R ∩ S = S − (S − R)
    III. R ∩ S = R ⋈ S    (⋈ is natural join)
    (A) I only
    (B) I and II only
    (C) I, II, and III
    (D) None of the above
Answer: C
If any equality is incorrect, provide a short explanation (e.g., a counterexample) for the incorrect one(s).
None

Question 7: (4 points)
Suppose we have two relations R(A, B) and S(A, B) with the same schema. The only key of R is {A}; the only key of S is {A} as well. Let relation T(A,B) be the set union of R and S, i.e., T = R ∪ S. What are the keys of T?
    (A) {A}
    (B) {B}
    (C) {A} and {B}
    (D) {A,B}
Answer: D

Questions 8 and 9 below refer to the following database schema:
    Person(SSN, employersymbol, salary)
    Holding(SSN, symbol, numshares)
A person is uniquely identified by a social security number (SSN). A company is uniquely identified by its stock ticker symbol. Each person is employed by exactly one company, but may hold any number of different stocks.

Question 8: (6 points)
Suppose we wish to find the SSNs of the persons who do not own stocks of their employers. Which of the following queries will return the correct set of SSNs?
    I. π_{SSN}(σ_{employersymbol ≠ symbol}(Person ⋈ Holding))
    II. π_{SSN}(π_{SSN,sym}(ρ_{P(SSN,sym,sal)}(Person)) − π_{SSN,sym}(ρ_{H(SSN,sym,num)}(Holding)))
    III. SELECT SSN
         FROM Person
         WHERE employersymbol <> ALL (SELECT symbol
                                      FROM Holding
                                      WHERE Person.SSN = Holding.SSN);
    (A) II only
    (B) I and II only
    (C) I and III only
    (D) II and III only
Answer: D
If any query is incorrect, provide a short explanation (e.g., a counterexample) for the incorrect one(s).
(I) is incorrect. If someone works for MSFT but owns both MSFT and YHOO stocks, his/her SSN will be returned, because the YHOO tuple satisfies the selection.

Question 9: (6 points)
Suppose we wish to find the average salary of the persons who own more than 100 shares of Microsoft (MSFT) or more than 100 shares of Yahoo! (YHOO). Which of the following queries will correctly compute the desired average?
    I. SELECT AVG(salary)
       FROM Person
       WHERE SSN IN (SELECT SSN
                     FROM Holding
                     WHERE (symbol = 'MSFT' OR symbol = 'YHOO')
                       AND numshares > 100);
    II. SELECT AVG(salary)
        FROM Person, Holding
        WHERE Person.SSN = Holding.SSN
          AND ((symbol = 'MSFT' AND numshares > 100)
            OR (symbol = 'YHOO' AND numshares > 100));
    (A) I only
    (B) II only
    (C) Both I and II
    (D) Neither I nor II
Answer: A
If any query is incorrect, provide a short explanation (e.g., a counterexample) for the incorrect one(s).
(II) is incorrect. Suppose John owns 200 shares of MSFT and 200 shares of YHOO. The join then produces two tuples for John, so John's salary is counted twice in the average.

Questions 10-13 below refer to the following database schema:
    Person(SSN, employersymbol, salary)
    Holding(SSN, symbol, numshares)
    StockPrice(symbol, date, price)
The Person and Holding relations are identical to the ones used by Questions 8 and 9. We have added a third relation, StockPrice, which tracks the closing price (per share) of each stock on each trading day.

Question 10: (6 points)
Write a relational algebra query to find the SSNs of all Microsoft (MSFT) employees who own more than 50 shares of Oracle (ORCL) stock.

    π_{SSN}(σ_{employersymbol = 'MSFT'}(Person) ⋈ σ_{symbol = 'ORCL' AND numshares > 50}(Holding))

Question 11: (6 points)
Write a SQL query to find the total number of shares of Oracle (ORCL) stock owned by Microsoft (MSFT) employees.

    SELECT SUM(numshares)
    FROM Person, Holding
    WHERE employersymbol = 'MSFT'
      AND Person.SSN = Holding.SSN
      AND symbol = 'ORCL';
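
The counterexamples given for Questions 8 and 9 can be checked on a tiny data set. The following is a minimal sketch, assuming Python's standard sqlite3 module; the three people, their salaries, and their holdings are invented for illustration. Relational algebra queries I and II of Question 8 are hand-translated into SQL (a join with <> and an EXCEPT, respectively), and query III is left out because SQLite does not support the ALL quantifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (SSN TEXT PRIMARY KEY, employersymbol TEXT, salary REAL);
    CREATE TABLE Holding (SSN TEXT, symbol TEXT, numshares INTEGER,
                          PRIMARY KEY (SSN, symbol));
    -- Hypothetical data: 111 owns the employer's stock plus YHOO,
    -- 222 owns only another company's stock, 333 owns nothing.
    INSERT INTO Person VALUES ('111', 'MSFT', 100.0),
                              ('222', 'ORCL', 50.0),
                              ('333', 'YHOO', 60.0);
    INSERT INTO Holding VALUES ('111', 'MSFT', 200),
                               ('111', 'YHOO', 200),
                               ('222', 'MSFT', 150);
""")


def ssns(sql: str) -> list:
    """Run a one-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# Question 8, query I: join, then keep tuples with employersymbol <> symbol.
q8_i = ssns("""
    SELECT DISTINCT Person.SSN
    FROM Person JOIN Holding ON Person.SSN = Holding.SSN
    WHERE employersymbol <> symbol
""")
# Question 8, query II: (SSN, employer) pairs minus (SSN, held symbol) pairs.
q8_ii = ssns("""
    SELECT SSN FROM (SELECT SSN, employersymbol AS sym FROM Person
                     EXCEPT
                     SELECT SSN, symbol FROM Holding)
""")

# Question 9, query I (subquery) and query II (join).
q9_i = conn.execute("""
    SELECT AVG(salary) FROM Person
    WHERE SSN IN (SELECT SSN FROM Holding
                  WHERE (symbol = 'MSFT' OR symbol = 'YHOO') AND numshares > 100)
""").fetchone()[0]
q9_ii = conn.execute("""
    SELECT AVG(salary) FROM Person, Holding
    WHERE Person.SSN = Holding.SSN
      AND ((symbol = 'MSFT' AND numshares > 100)
        OR (symbol = 'YHOO' AND numshares > 100))
""").fetchone()[0]

print(q8_i, q8_ii)
print(q9_i, q9_ii)
```

On this data, query I of Question 8 returns person 111 even though 111 holds MSFT, and it also misses person 333, who has no holdings at all. For Question 9, the join in query II contributes person 111's salary twice, so the two averages differ.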

Question 12: (6 points)
Write a relational algebra query to find the ticker symbols of all superstocks. A superstock is a stock whose closing price always rises on every trading day. You may compare date values using =, >, etc. (Hint: you do not need arithmetic on date values.)

    π_{symbol}(StockPrice) − π_{s1.symbol}(σ_{s1.symbol = s2.symbol AND s1.price >= s2.price AND s1.date < s2.date}(ρ_{s1}(StockPrice) × ρ_{s2}(StockPrice)))

Question 13: (10 points)
Let us define a widely-held stock to be one that is owned by more than 40% of the investors in our database. Write a SQL query to find the latest closing price for each widely-held stock. Note that some quotes may be delayed: for example, the latest closing price of Microsoft stored in our database might be one day old, while the latest closing price of Macrohard might be two days old.

    SELECT symbol, price
    FROM StockPrice S
    WHERE symbol IN (SELECT symbol
                     FROM Holding
                     GROUP BY symbol
                     HAVING COUNT(*) > (SELECT 0.4 * COUNT(DISTINCT SSN)
                                        FROM Holding))
      AND date >= ALL (SELECT date
                       FROM StockPrice
                       WHERE symbol = S.symbol);
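
The set-difference idea behind the Question 12 answer (all symbols, minus those with some earlier price that is not lower than a later one) can be sketched in SQL with EXCEPT. This is an illustration only: it assumes Python's standard sqlite3 module, the ticker symbols UP, FLAT, and DOWN and their prices are invented, and dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StockPrice (symbol TEXT, date TEXT, price REAL,
                             PRIMARY KEY (symbol, date));
    -- Hypothetical quotes: UP rises every day, FLAT repeats a price once,
    -- DOWN falls.
    INSERT INTO StockPrice VALUES
        ('UP',   '2006-11-13', 10.0),
        ('UP',   '2006-11-14', 11.0),
        ('UP',   '2006-11-15', 12.0),
        ('FLAT', '2006-11-13',  5.0),
        ('FLAT', '2006-11-14',  5.0),
        ('FLAT', '2006-11-15',  6.0),
        ('DOWN', '2006-11-13',  9.0),
        ('DOWN', '2006-11-14',  8.0);
""")

# All symbols, minus every symbol that has an earlier quote (s1) whose price
# is greater than or equal to a later quote (s2) of the same stock.
SUPERSTOCKS = """
    SELECT symbol FROM StockPrice
    EXCEPT
    SELECT s1.symbol
    FROM StockPrice AS s1, StockPrice AS s2
    WHERE s1.symbol = s2.symbol
      AND s1.price >= s2.price
      AND s1.date < s2.date
"""

superstocks = sorted(row[0] for row in conn.execute(SUPERSTOCKS))
print(superstocks)
```

Using >= rather than > in the subtracted part is what excludes FLAT: a price that merely stays the same does not count as rising.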

Question 14: (6 points)
Consider a relation Took(name, class, quarter) whose tuples record that a student took a given class in a given quarter. You may assume that no two students have the same name, and there is no key for this relation except all three attributes together. Write a SQL query to find all pairs of students who have never taken a class together, i.e., have never taken the same class in the same quarter. Make sure to return each pair of students only once. (For example, if your expression returns <Mary,Fred> then it should not also return <Fred,Mary> or any additional copies of <Mary,Fred>.) You may compare name values using =, >, etc. Your expression will be graded on simplicity as well as correctness.

    SELECT DISTINCT T1.name, T2.name
    FROM Took AS T1, Took AS T2
    WHERE T1.name > T2.name
    EXCEPT
    SELECT T3.name, T4.name
    FROM Took T3, Took T4
    WHERE T3.class = T4.class
      AND T3.quarter = T4.quarter
      AND T3.name > T4.name;
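
The Question 14 query can be exercised on a few rows to see how the name ordering keeps each pair from appearing twice. The sketch below assumes Python's standard sqlite3 module; the students, classes, and quarters are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Took (name TEXT, class TEXT, quarter TEXT,
                       PRIMARY KEY (name, class, quarter));
    -- Hypothetical enrollments: Mary and Fred share CS1 in F06; Ann and Fred
    -- both took CS2, but in different quarters.
    INSERT INTO Took VALUES ('Mary', 'CS1', 'F06'),
                            ('Fred', 'CS1', 'F06'),
                            ('Ann',  'CS2', 'F06'),
                            ('Fred', 'CS2', 'W07');
""")

# All ordered pairs (larger name first), minus pairs that shared a class
# in the same quarter.
NEVER_TOGETHER = """
    SELECT DISTINCT T1.name, T2.name
    FROM Took AS T1, Took AS T2
    WHERE T1.name > T2.name
    EXCEPT
    SELECT T3.name, T4.name
    FROM Took T3, Took T4
    WHERE T3.class = T4.class
      AND T3.quarter = T4.quarter
      AND T3.name > T4.name
"""

pairs = sorted(conn.execute(NEVER_TOGETHER).fetchall())
print(pairs)
```

The condition T1.name > T2.name fixes one orientation for every pair, so <Mary,Fred> and <Fred,Mary> can never both appear, and a student is never paired with herself.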