University of Texas at Austin
Department of Computer Science
Theory in Programming Practice, Spring 2005
6 May 2005

Test #3

Instructions. This is a 50-minute test. No electronic devices (including calculators) are permitted. The test is closed book and closed notes, except that one page of notes is permitted (both sides may be used). Solve as many of the questions as you can. There are a total of 50 points available, but the test will be graded out of 25 (i.e., the maximum possible score is 50 out of 25).

IMPORTANT: Unless you are explicitly asked to justify your answer, it is sufficient for you to simply provide a final answer.

Question   Score   Points
1                  6
2                  4
3                  8
4                  9
5                  5
6                  5
7                  4
8                  3
9                  6
Out of             25

Name:
1. (6 points total) In the following parts, let R denote the database relation containing all triples of integers x, y, and z such that 0 < x ≤ y ≤ z and x + y + z = 7.

(a) (2 points) Give a tabular representation of R. Hint: R contains four triples.

(b) (2 points) Give a tabular representation of the database relation σ_p(R), where p denotes the predicate x + y = 4.

(c) (2 points) Give a tabular representation of the database relation π_z(R).

2. (4 points) Explain why the following identity, in which p denotes a predicate and R and S denote relations, is useful in the context of query optimization.

    σ_p(R) ⋈ S = σ_p(R ⋈ S)
3. (8 points total) Consider a database with two tables. The first table is called parts and has columns partnum (a unique integer ID for a particular part) and price (a floating point value indicating the price of the part). The second table is called orders and has columns customer (a string specifying the name of the customer), partnum (the integer ID of the part being ordered), and quantity (a nonnegative integer indicating the order quantity). In each of the following parts, give a single SQL query that provides the specified information.

(a) (2 points) For each part with at least one associated order, list the part number and the total quantity ordered. The output columns should be entitled partnum and quantity.

(b) (3 points) For each order, list the customer name, the part number, and the total cost of the order. The output columns should be entitled customer, partnum, and cost.

(c) (3 points) For each customer, list the customer name and the total cost of all that customer's orders. The output columns should be entitled customer and total.
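For experimenting with the schema of Question 3, the following is a minimal sketch that builds the two tables in an in-memory SQLite database using Python's standard sqlite3 module. The sample rows, the helper names make_db and run, and the choice of SQLite are assumptions made for illustration; the test does not prescribe a particular database system or data set.

```python
import sqlite3


def make_db():
    """Build an in-memory database with the parts and orders tables of Question 3."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE parts (
            partnum INTEGER PRIMARY KEY,  -- unique integer ID of the part
            price   REAL NOT NULL         -- price of the part
        );
        CREATE TABLE orders (
            customer TEXT NOT NULL,                              -- customer name
            partnum  INTEGER NOT NULL REFERENCES parts(partnum), -- part ordered
            quantity INTEGER NOT NULL CHECK (quantity >= 0)      -- order quantity
        );
        """
    )
    # Hypothetical sample data; part 3 deliberately has no associated order.
    conn.executemany(
        "INSERT INTO parts VALUES (?, ?)",
        [(1, 2.50), (2, 10.00), (3, 0.75)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("alice", 1, 4), ("bob", 1, 1), ("alice", 2, 2)],
    )
    return conn


def run(conn, query):
    """Execute a single SQL query and return all result rows as a list of tuples."""
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = make_db()
    for row in run(conn, "SELECT customer, partnum, quantity FROM orders"):
        print(row)
```

A candidate query for any of the parts can be passed to run and checked by hand against the small sample data.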
4. (9 points total) In parts (a) and (b) below, you may assume that n is a power of 2.

(a) (2 points) Give a Θ-bound on the number of comparisons used by bitonic merge to merge two sorted sequences of length n into a single sorted sequence of length 2n.

(b) (2 points) Give a Θ-bound on the number of comparisons used by bitonic sort to sort an input of length n.

(c) (3 points) Explain why the following algorithm correctly sorts any 0-1 input sequence of length n: Take the first element of the sequence, call it x, and use it to partition the entire set of n elements into those elements which are less than x (output these elements first, in arbitrary order), those which are equal to x (output these next), and those which are greater than x (output these last).

(d) (2 points) Note that the algorithm of part (c) uses Θ(n) ≤-type comparisons. (For example, to determine whether x = y, we can check whether x ≤ y and y ≤ x.) Does the zero-one principle therefore imply that there is an oblivious compare-interchange algorithm for sorting arbitrary sequences of n integers using Θ(n) comparisons? Justify your answer.
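The partitioning procedure described in Question 4(c) can be sketched as follows. This is one possible reading of the description, not a reference implementation: the function name partition_by_first is hypothetical, and the "arbitrary order" within each group is taken here to be the input order.

```python
def partition_by_first(seq):
    """Partition seq around its first element x, as described in Question 4(c).

    Elements less than x come first, then those equal to x, then those
    greater than x. Within each group the input order is kept (the
    description allows any order).
    """
    if not seq:
        return []
    x = seq[0]
    less = [y for y in seq if y < x]
    equal = [y for y in seq if y == x]
    greater = [y for y in seq if y > x]
    return less + equal + greater


if __name__ == "__main__":
    print(partition_by_first([1, 0, 1, 0, 0]))  # a 0-1 input
    print(partition_by_first([2, 3, 1, 0]))     # an input with more than two values
```

On the 0-1 input above the result is [0, 0, 0, 1, 1]; on [2, 3, 1, 0] the result is [1, 0, 2, 3], which is worth keeping in mind when considering part (d).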
5. (5 points total) The following parts deal with the FFT algorithm as described in the course packet and lecture slides, i.e., over the complex numbers.

(a) (2 points) Give a Θ-bound on the number of arithmetic operations needed to execute the FFT algorithm on a powerlist of length n.

(b) (3 points) Suppose that the FFT of the powerlist ⟨a_0 a_1 a_2 a_3⟩ is the powerlist ⟨b_0 b_1 b_2 b_3⟩. Specify a polynomial p and a complex number z such that b_3 = p(z).

6. (5 points) Let n be a power of two and let p denote the powerlist ⟨p_0 p_1 p_2 ... p_{n−1}⟩. As in the lectures, define p→ as the powerlist ⟨0 p_0 p_1 p_2 ... p_{n−2}⟩. Solve the following equation in the powerlist variable z = ⟨z_0 z_1 z_2 ... z_{n−1}⟩, that is, express each z_i as a function of the p_j's:

    z = z→ + p

Briefly justify your answer.
7. (4 points total) Consider an instance of the exact string matching problem in which the text string is of length m and the pattern string is of length n, where m ≥ n.

(a) (2 points) Give an exact expression for the maximum possible number of matches, as a function of m and n.

(b) (2 points) Suppose we look for matches using the Rabin-Karp algorithm (RK), and there are no matches to be found. Further assume that RK uses an idealized hash function that maps every string to a uniformly random 32-bit integer. Give an exact expression (as a function of m and n) for the expected number of false positives that occur during the execution of RK. Remark: By a false positive, we mean a situation in which RK is forced to do a normal string comparison because the hash of the current window is equal to the hash of the pattern.

8. (3 points) Let A be a program fragment that involves two nonnegative integer variables i and j, and assume that execution of A from any program state has one of the two following effects on the values of i and j: (1) i is incremented and j is reduced by 9; (2) i is left unchanged and j is incremented. Specify an integer-valued function of i and j that is guaranteed to increase whenever A is executed. Briefly justify your answer.
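To make the notion of a false positive in Question 7(b) concrete, here is a minimal sketch of Rabin-Karp that counts how often a hash hit turns out not to be a real match. It uses a polynomial rolling hash modulo a small number rather than the idealized uniformly random 32-bit hash assumed in the question; the function name rabin_karp and the parameters base and mod are assumptions made for illustration.

```python
def rabin_karp(text, pattern, base=256, mod=101):
    """Return (match_positions, false_positive_count) for pattern in text.

    Uses a polynomial rolling hash modulo `mod` (an assumption of this
    sketch; the question instead assumes an idealized random hash).
    """
    m, n = len(text), len(pattern)
    if n == 0 or n > m:
        return [], 0

    high = pow(base, n - 1, mod)  # weight of the leading character of a window
    hp = hw = 0
    for i in range(n):
        hp = (hp * base + ord(pattern[i])) % mod  # hash of the pattern
        hw = (hw * base + ord(text[i])) % mod     # hash of the first window

    matches, false_positives = [], 0
    for s in range(m - n + 1):
        if hw == hp:
            # Hash hit: a normal string comparison is needed to confirm it.
            if text[s:s + n] == pattern:
                matches.append(s)
            else:
                false_positives += 1
        if s < m - n:
            # Slide the window: drop text[s], append text[s + n].
            hw = ((hw - ord(text[s]) * high) * base + ord(text[s + n])) % mod
    return matches, false_positives


if __name__ == "__main__":
    print(rabin_karp("abracadabra", "abra"))
    # With mod=1 every hash is 0, so every non-matching window is a false positive.
    print(rabin_karp("abcabc", "abc", mod=1))
```

Choosing mod=1 is a degenerate setting used only to force collisions; it makes every window a hash hit, so the counter reports every non-matching window.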
9. (6 points total) Let p, r, r′, s, and v be strings such that the following conditions hold:

(1) r is a prefix of p;
(2) s is a suffix of r;
(3) v is the length-(|p| − |r| + |s|) suffix of p;
(4) s ≺ v;
(5) c(v) ≠ s;
(6) r′ is the length-(|r| − |s| + |c(v)|) prefix of p.

(a) (3 points) Prove that |r′| > |r|.

(b) (3 points) Prove that s is a suffix of r′.