COMP 3403: Algorithm Analysis, Part 2 (Chapters 4 and 5). Jim Diamond, CAR 409, Jodrey School of Computer Science, Acadia University
Chapter 4 Decrease and Conquer
Chapter 4 43 Chapter 4: Decrease and Conquer
Idea:
(a) reduce an instance of the problem to one smaller instance of the same problem
(b) solve the smaller instance
(c) extend the solution of the smaller instance to a solution of the original instance
We consider three categories of decrease and conquer:
- decrease by a constant (which is usually 1): e.g., insertion sort, DFS, BFS
- decrease by a constant factor: e.g., binary search, exponentiation by squaring
- decrease by a variable amount: e.g., Euclid's algorithm
Chapter 4 44 Example: Exponentiation
Consider the problem of computing a^n
- Brute force: iteratively do n − 1 multiplications: a^n = a · a · ... · a
- Divide and conquer: a^n = a^⌊n/2⌋ · a^⌈n/2⌉ (n > 1)
- Decrease by a constant: a^n = a · a^(n−1)
- Decrease by a constant factor:
      a^n = (a^(n/2))^2            if n is an even number > 0
      a^n = (a^((n−1)/2))^2 · a    if n is an odd number > 1
      a^n = a                      if n = 1
Which, if any, of these have the same efficiency? Which, if any, is (are) the most efficient?
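To make the decrease-by-a-constant-factor scheme concrete, here is a short Python sketch (Python is not used in the slides; the function name is my own). It implements exactly the three-case recurrence above, using ⌈lg n⌉-ish multiplications rather than n − 1:

```python
def power(a, n):
    """Decrease-by-a-constant-factor exponentiation (n >= 1).

    Uses Theta(log n) multiplications instead of the n - 1 of brute force."""
    if n == 1:
        return a
    half = power(a, n // 2)   # note: n // 2 == (n - 1) // 2 when n is odd
    if n % 2 == 0:
        return half * half    # even case: (a^(n/2))^2
    return half * half * a    # odd case: (a^((n-1)/2))^2 * a
```

Each call halves n, so e.g. power(2, 1000) performs about 15 multiplications where brute force would perform 999.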
Chapter 4 45 Decrease by One: Insertion Sort
Algorithm Insertion Sort:
    InsertionSort(A[0..n-1])
        for i = 1 to n - 1
            v = A[i]
            j = i - 1
            while j >= 0 and A[j] > v
                A[j + 1] = A[j]
                j = j - 1
            A[j + 1] = v
Decrease by one: when sorting A[0..k+1], you make use of the fact that A[0..k] is already sorted
Quick and dirty analysis: there are loops nested to a depth of 2, each of which has O(n) iterations, so you might expect an overall complexity of O(n^2)
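A direct Python transcription of the pseudocode above (not from the slides), so it can be run and experimented with:

```python
def insertion_sort(a):
    """In-place insertion sort; returns the list for convenience."""
    for i in range(1, len(a)):
        v = a[i]
        j = i - 1
        while j >= 0 and a[j] > v:
            a[j + 1] = a[j]   # shift larger elements one slot to the right
            j -= 1
        a[j + 1] = v          # drop v into its sorted position
    return a
```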
Chapter 4 46 Insertion Sort: 2
E.g., after k iterations of the outer loop, we have a situation like this:
    2 5 13 | 8 3 34 21
That is, the numbers to the left of the bar are sorted, and the number in bold (here, 8) is the next number to be inserted in the sorted list
Similar to selection sort, but:
- finding the next element to insert into the sorted list: IS uses O(1) work (it is simply the next unsorted element), SS uses O(k) comparisons
- finding the location to insert the next element into the sorted list: IS uses O(k) comparisons, SS uses 0
- moves to insert the element into the sorted list: IS uses O(k) data moves, SS uses O(1)
Chapter 4 47 Insertion Sort: 3
Concern: IS uses more data moves (O(n^2)) than SS (O(n))
But: the number of comparisons + data moves in the inner loop of IS is data dependent, whereas SS must examine all unsorted elements to find the minimum remaining value
Thus, for IS, the average-case behaviour may be (and is!) better than its worst case
Consider the case of random data with no duplicate elements:
- on average, the new element will be inserted halfway down the currently-sorted list
- this cuts down the number of comparisons and the number of data moves by 1/2
- T_avg(n) ≈ n^2/4 ∈ Θ(n^2) for IS
Chapter 4 48 Insertion Sort: 4
An interesting case to consider is that of almost-sorted data; here IS really improves upon SS (and QS and MS and ...): T_best(n) = n − 1!!
Final thought: the book comments on how, by using a sentinel, we could write
    while A[j] > v
        A[j + 1] = A[j]
        j = j - 1
instead of
    while j >= 0 and A[j] > v
        ...
        j = j - 1
Question for those of you who recall your computer architecture: is that a bogus comment? Why or why not?
Chapter 4 49 Binary Insertion Sort
Observe that IS inserts an element into a sorted array
We could find the insertion location by using binary search
Issue: in general, binary search is better than linear search for finding an element's place in a sorted array, but in the case of sorted or almost-sorted arrays binary search turns out to be worse (why?)
GEQ: what are the worst-, average- and best-case complexities for binary insertion sort?
Chapter 4 50 Graphs: Review
A graph G = (V, E) is a (finite) set of vertices V and a set of edges E, where E ⊆ { {v1, v2} : v1 ∈ V, v2 ∈ V } (in simple graphs v1 ≠ v2)
- it is common to write n for the number of vertices and m for the number of edges
- if {v1, v2} ∈ E, we say v1 is adjacent to v2
Normally when we say "graph", we mean simple graph
- some people study multi-graphs, which allow multiple edges between pairs of vertices
- (figure: a multi-graph on vertices 1, 2, 3 with multiple edges and loops)
- in this class we will only be using simple graphs
Chapter 4 51 Graph Representation for Algorithms
In order to use graphs to solve problems with computer algorithms, we must be able to represent graphs in our programs:
- adjacency matrix
- adjacency list
An adjacency matrix (for some n-vertex graph G) is an n × n matrix A, where A[i][j] is 1 iff v_i is adjacent to v_j; otherwise A[i][j] is 0
An adjacency list (for some n-vertex graph G) is an array of n lists (one for each vertex), such that v_i is in v_j's list iff v_i is adjacent to v_j
Most graph algorithms favour the adjacency list approach, since the size of that representation is linear in the size of the graph: Θ(|V| + |E|)
- in other words, with an adjacency matrix representation, there is no hope of having an algorithm (for non-trivial problems) guaranteed to run in O(|G|) time
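A small Python sketch (my own, not from the slides) that builds both representations of the same undirected simple graph, which makes the size difference easy to see: the matrix is always n^2 entries, while the lists hold 2m entries in total:

```python
def adjacency_structures(n, edges):
    """Build both representations of an undirected simple graph.

    n: number of vertices (labelled 0..n-1); edges: iterable of (u, v) pairs."""
    matrix = [[0] * n for _ in range(n)]
    lists = [[] for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = matrix[v][u] = 1   # symmetric: the graph is undirected
        lists[u].append(v)
        lists[v].append(u)
    return matrix, lists
```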
Chapter 4 52 Graph Searching: Introduction
A common operation on graphs is to start searching at one vertex and continue until all (reachable) vertices have been visited, or until some other stopping condition is met (there are other possibilities, but these are the usual ideas)
The search proceeds by searching from an already-visited vertex v to some vertex w which is adjacent to v
- if there are multiple possible choices for w, the particular graph search may dictate which one(s) are valid choices
Chapter 4 53 Graph Searching: DFS and BFS
Two important graph search techniques are known as depth-first search (DFS) and breadth-first search (BFS)
Both techniques can be considered as decrease by one: e.g., starting from one of the n vertices in a graph, we do a search of a (sub-)graph consisting of n − 1 vertices
Both DFS and BFS are linear in the size of the graph representation
Chapter 4 54 DFS vs. BFS
The two algorithms are quite similar: DFS uses a stack, and BFS uses a queue, to keep track of the vertices to be visited next
Since recursion automagically implements a stack, it can be easier to write a DF search than a BF search
HOMEWORK!! Review the textbook's algorithms: the textbook's version does the initialization and then calls a function which can be recursive
- note also the problems which the textbook says these algorithms solve
Many important graph algorithms are based on these, particularly DFS, e.g., finding the bi-connected components of a graph
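The stack-vs-queue contrast above can be sketched in a few lines of Python (my own sketch, not the textbook's version; adjacency lists are given as a dict of lists). Note how DFS gets its stack "for free" from recursion, while BFS manages an explicit queue:

```python
from collections import deque

def dfs(adj, start):
    """Recursive depth-first search; returns vertices in visit order."""
    order, visited = [], set()
    def visit(v):
        visited.add(v)
        order.append(v)
        for w in adj[v]:          # recursion supplies the stack
            if w not in visited:
                visit(w)
    visit(start)
    return order

def bfs(adj, start):
    """Breadth-first search with an explicit queue."""
    order, visited = [], {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order
```

Both run in Θ(|V| + |E|) time on an adjacency-list representation, matching the claim on the previous slide.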
Chapter 4 55 Directed Graphs (Digraphs)
A directed graph G = (V, E) is a (finite) set of vertices V and a set of edges E, where E ⊆ { (v1, v2) : v1 ∈ V, v2 ∈ V }
- we say (v1, v2) is an edge from v1 to v2
A directed cycle in a digraph is a (finite) sequence of vertices v1, v2, ..., vk such that (v_i, v_{i+1}) ∈ E for 1 ≤ i < k, and (vk, v1) ∈ E
A digraph with no directed cycle is a directed acyclic graph (dag)
Algorithms on directed graphs are sometimes more complex than on undirected graphs
Chapter 4 56 Dags In Real Life
Dags can be used to represent situations in which there is an ordering or precedence among some items
Examples:
- manufacturing: component A must be completed before B can be started, but components C and D can be completed in either order, or in parallel
- car example: the engine can be completed at the same time the body is constructed, but the engine should be installed in the car before the hood is attached
- house construction: the foundation must be constructed before the walls are erected; the wiring and plumbing must be put in before the walls are finished, but the wiring and the plumbing can be installed in either order
Chapter 4 57 Topological Sorting: DFS Approach
A topological sorting of a digraph (V, E) with n vertices is an ordering v_{i1}, v_{i2}, ..., v_{in} of the vertices such that for all (v_j, v_k) ∈ E, v_j appears before v_k in this ordering
- clearly, only a dag can have a topological sorting; it is also true (if less clear) that every dag has a topological sorting
DFS topological sort (start at any vertex with in-degree 0): do a DFS and list the vertices in the reverse of the order in which they are popped off the (recursion) stack
Chapter 4 58 Topological Sorting: Vertex Deletion Approach
A decrease-by-one algorithm:
- find a source vertex (i.e., one with in-degree 0) and output it
- delete it and all of its outgoing edges
- wash, rinse, repeat
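The vertex-deletion ("source removal") approach above can be sketched in Python as follows (my own sketch; deleting a vertex is simulated by decrementing the in-degrees of its successors). As a bonus, it detects cycles: if the digraph is not a dag, some vertices never become sources:

```python
from collections import deque

def topological_sort(n, edges):
    """Source-removal topological sort of a digraph on vertices 0..n-1.

    Returns a topological ordering, or None if the digraph has a cycle."""
    indeg = [0] * n
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    sources = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while sources:
        u = sources.popleft()
        order.append(u)
        for w in adj[u]:          # "delete" u: its successors lose one in-edge
            indeg[w] -= 1
            if indeg[w] == 0:
                sources.append(w)
    return order if len(order) == n else None
```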
Chapter 4 59 Decrease by a Constant Factor: Fake Coin Problem
Problem: we have n identical-looking coins, but there is one counterfeit coin which weighs less than the real ones. Given a balance scale, how can we find the fake?
Solution 1: divide the coins into two piles of size ⌊n/2⌋ (with one coin left over if n is odd), and weigh the two piles
- if they are the same, it must be that n is odd and the left-over coin is the fake
- otherwise, the fake is in the lighter pile; repeat on that pile
The recurrence equation (for the number of weighings) is
    T(1) = 0
    T(n) = T(⌊n/2⌋) + 1 for n > 1
this gives T(n) = ⌊lg n⌋ (which is very fast!)
Chapter 4 60 Decrease by a Constant Factor: Russian Peasant Multiplication
Suppose you want to compute n · m
Brilliant observation: if n is even, n · m = (n/2) · (2m); if n is odd, n · m = ((n − 1)/2) · (2m) + m
So we can multiply 60 · 12 as follows:
     n     m    remainder
    60    12
    30    24
    15    48       48
     7    96       96
     3   192      192
     1   384      384
                  ---
                  720
Using this technique requires you to only know addition, division by 2, and multiplication by 2
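The halving/doubling table above translates into a small Python loop (my own transcription): each time n is odd, the current m joins the "remainder" column, and the sum of that column is the product:

```python
def russian_peasant(n, m):
    """Multiply n * m using only halving, doubling and addition (n >= 0)."""
    product = 0
    while n >= 1:
        if n % 2 == 1:    # odd n: this row's m goes in the remainder column
            product += m
        n //= 2           # halve n (discarding any remainder)
        m *= 2            # double m
    return product
```

This is the same decrease-by-a-constant-factor idea as binary exponentiation, with multiplication playing the role that exponentiation's squaring plays.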
Chapter 4 61 Decrease by a Variable Size: Booth's Algorithm
Booth's algorithm is used to speed up multiplication in hardware
Base 10 example: 999 × 624 = 1000 × 624 − 624 = 624000 − 624 = 623376
Strings of 9's don't occur frequently in base 10 numbers, but strings of 1's are common in binary
E.g., suppose the multiplier is 0011 1011 1100 0100; instead, use 0100 !100 0!00 1!00, where each ! represents a subtraction instead of an addition
- each string of the form 011...1 is replaced by (an equal-length string) of the form 100...0!
- bigger savings are available with longer words
Chapter 4 62 Euclid's Algorithm
Euclid's algorithm is based on repeated application of the equality gcd(m, n) = gcd(n, m mod n)
For example, gcd(80, 44) = gcd(44, 36) = gcd(36, 8) = gcd(8, 4) = gcd(4, 0) = 4
It can be shown that the size, measured by the second number, decreases at least by half after two consecutive iterations; therefore T(n) ∈ O(log n)
Idea of proof: assume m > n (GEQ: why can I do that?)
Consider k1 = m mod n:
- if k1 ≤ n/2, we are done (in only 1 iteration, but that's OK)
- otherwise k1 > n/2; on the next iteration, we compute n mod k1, which equals n − k1 < n/2
- thus k2 must be < n/2, as required
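Euclid's algorithm in Python (an iterative transcription of the equality above; not from the slides):

```python
def gcd(m, n):
    """Euclid's algorithm: repeatedly replace (m, n) by (n, m mod n)."""
    while n != 0:
        m, n = n, m % n
    return m
```

Note that the m > n assumption costs nothing here: if m < n, the first iteration simply swaps them, since m mod n = m.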
Chapter 4 63 A Quick Review(?) of QuickSort
Algorithm:
    Quicksort(A[0..n-1])
        if n > 1
            rearrange A so that
                all values in A[0..s-1] are <= A[s]     % partition 1
                all values in A[s+1..n-1] are > A[s]    % partition 2
            Quicksort(A[0..s-1])
            Quicksort(A[s+1..n-1])
Notice that there is no work to be done after the recursive calls return
If we can rearrange A in O(n) time, we get T(n) = T(s) + T(n − s − 1) + O(n) for n > 1
As will be seen in Chapter 5, the rearrangement step is crucial to good performance
Chapter 4 64 Selection Problem
Find the k-th smallest element in a list of n numbers
- easy for k = 1 or k = n. How?
- not as obvious for the median: k = ⌈n/2⌉
Example: 2 7 1 14 12 56 3 12 7 9999 18 16 7; median = ?
The median is commonly used in statistics as a representative value of a set of measurements
- in many cases the median is a better (more robust) indicator than the mean, which is sometimes used for the same purpose
- e.g., the mean salary at a company may be heavily influenced by the CEO's $4,000,000 salary; in this case the median (say, $10,000) is more representative of what the ordinary schmo is going to earn
Chapter 4 65 Algorithms for the Selection Problem
You could sort the data in (say) Θ(n lg n) time and use k as an index into the resulting array
Can we do better? Yes: take a lesson from quicksort: choose a pivot p (for example, choose the middle element as pivot) and partition the array into three parts: (numbers <= p) p (numbers > p)
Example: find the median of 2 7 1 14 12 56 3 12 7 9999 18 16 7: k = ⌈13/2⌉ = 7
- iteration 1: pivot = 3: 2 1 | 3 | 7 14 12 56 12 7 9999 18 16 7; the k-th element is in the right partition: reduce k by 3
- iteration 2: pivot = 12, k = 4: 7 12 7 7 | 12 | 14 9999 18 16 56; the k-th element is in the left partition
- iteration 3: pivot = 12, k = 4: 7 7 7 12; the k-th = 4-th element is the pivot, so the median is 12
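A Python sketch of this "quickselect" idea (my own; it uses a random pivot and builds the three parts explicitly rather than partitioning in place, to keep the logic visible):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest (1-based) element of a; average Theta(n)."""
    a = list(a)
    while True:
        pivot = random.choice(a)
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        if k <= len(less):
            a = less                      # answer is left of the pivot block
        elif k <= len(less) + len(equal):
            return pivot                  # answer is the pivot itself
        else:
            k -= len(less) + len(equal)   # discard the left part and pivot block
            a = [x for x in a if x > pivot]
```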
Chapter 4 66 Selection Problem: Algorithmic Efficiency
Consider the (relatively lucky) case where, on every iteration, the array is partitioned as evenly as possible: the master theorem tells us this is Θ(n), a very good result (GEQ: do you think the complexity of this problem is Ω(n)?)
In general, we won't get a reduction by half at each iteration
Sadly, the worst-case behaviour is Θ(n^2)
There is a Θ(n) worst-case algorithm known, although the textbook suggests it is mostly of theoretical interest
- I'm not sure I agree with that assessment: e.g., you might need the worst-case guarantee in a real-time system
Chapter 4 67 The Game of Nim (One-Pile Version)
Game: there is a pile of n chips. Two players take turns removing at least 1 and at most m chips from the pile; the winner is the player that takes the last chip
Q: who wins the game, the player moving first or second, if both players make the best moves possible?
A (idea): it's a good idea to analyze this and similar games backwards, i.e., starting with n = 0, n = 1, n = 2, ...
Chapter 4 68 Analyzing One-Pile Nim
Example: m = 3 (fix one variable, contemplate the other)
- if n is 4, the first player loses regardless of his move
- if n is 5, 6 or 7, the first player wins by choosing 1, 2 or 3 (respectively), putting the other player in a losing situation
Generalizing this, the first player has a winning strategy if n ≠ 4, 8, 12, 16, ...
More generally yet, the first player has a winning strategy if n mod (m + 1) ≠ 0
- specifically, the first player's winning strategy is to remove n mod (m + 1) chips on each move (where n is the current pile size)
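The backwards analysis above can be checked mechanically. Here is a Python sketch (my own) that compares the closed-form test against a brute-force game-tree search, mirroring the "analyze from n = 0 upwards" idea:

```python
def first_player_wins(n, m):
    """Closed-form test from the analysis above."""
    return n % (m + 1) != 0

def first_player_wins_bf(n, m):
    """Brute-force check: the player to move wins iff some legal move
    leaves the opponent in a losing position."""
    if n == 0:
        return False   # no chips left: the player to move has already lost
    return any(not first_player_wins_bf(n - take, m)
               for take in range(1, min(m, n) + 1))
```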
Chapter 4 69 Nim with l Piles
Game: there are l > 1 piles of chips, containing n1, n2, ..., nl chips
Q: if you play first, what is your winning strategy?
A (partial): consider l = 2
- clearly, taking all the chips from one pile is a losing strategy
- so is taking all but one chip from one pile. Why?
- consider n1 = 1, n2 = 1
- consider n1 = 2, n2 = 2: every move either loses a pile outright or reduces the problem to the n1 = 1, n2 = 1 case (you lose)
- hypothesis? Except in what particular case?
Chapter 4 70 Nim with 2 Piles
Hypothesis: for l = 2, having to choose when the two piles have the same number of chips means you are in a losing position
Proof (sketch): if n1 = n2 = 3, your opponent duplicates your move, either reducing the problem to a previously solved case (where you lose), if you remove fewer than 3 chips, or taking the whole remaining pile (and winning) if you remove all 3
- by pseudo-induction, you are in a losing position whenever n1 = n2
In summary, if n1 = n2, you lose
Q: if n1 ≠ n2, do you have a winning strategy? Q': if so, what is it?
Chapter 4 71 Nim with l > 2 Piles
This is a more difficult problem
But... an elegant solution exists: compute the binary digital sum (nim sum) of the pile sizes, i.e., the bits in each position are XORed together
E.g. with l = 3: n1 = 9, n2 = 12, n3 = 3:
    1001
    1100
    0011
    ----
    0110
The fact that the sum is non-zero means the first player has a winning strategy
- in this case, remove 2 chips from the second pile: 1001 ⊕ 1010 ⊕ 0011 = 0000
Note that our strategy for l = 2 does exactly this
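The nim-sum strategy in Python (my own sketch): a winning move reduces some pile n to n XOR s (where s is the nim sum), which is a legal move exactly when n XOR s < n:

```python
from functools import reduce
from operator import xor

def nim_sum(piles):
    """XOR of all pile sizes; non-zero means the player to move can win."""
    return reduce(xor, piles, 0)

def winning_move(piles):
    """Return (pile index, new pile size) that makes the nim sum zero,
    or None if the current position is already losing (nim sum = 0)."""
    s = nim_sum(piles)
    if s == 0:
        return None
    for i, n in enumerate(piles):
        if n ^ s < n:       # this pile can legally be reduced to n ^ s
            return i, n ^ s
```

On the slide's example (9, 12, 3) this finds the same move: reduce the second pile from 12 to 10, i.e., remove 2 chips. And for two equal piles the nim sum is 0, agreeing with the l = 2 analysis.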
Chapter 5 Divide and Conquer
Chapter 5 72 Chapter 5: Divide and Conquer
A More General Master Theorem
Given a recurrence equation T(n) = a T(n/c) + f(n), where f(n) ∈ Θ(n^d) and d ≥ 0:
    T(n) ∈ Θ(n^d)             if a < c^d
    T(n) ∈ Θ(n^d log n)       if a = c^d
    T(n) ∈ Θ(n^(log_c a))     if a > c^d
In our previous version, f(n) = bn, so d = 1, which simplifies the above to
    T(n) ∈ Θ(n)               if a < c
    T(n) ∈ Θ(n log n)         if a = c
    T(n) ∈ Θ(n^(log_c a))     if a > c
Chapter 5 73 Mergesort
Algorithm Mergesort:
    Mergesort(A[0..n-1])
        if n > 1
            B[] <- A[0..(n-1)/2]        % integer division!
            C[] <- A[(n-1)/2+1..n-1]
            Mergesort(B[0..(n-1)/2])
            Mergesort(C[0..n/2-1])
            A[] <- Merge(B[0..(n-1)/2], C[0..n/2-1])
In the worst case, Merge() requires n − 1 comparisons of data elements; in any case, n elements have to be moved, so it is Ω(n) (eh?); in the more general master theorem, f(n) ∈ Θ(n)
If n = 2^k, the master theorem says Mergesort() is in Θ(n log n)
This is very close to the theoretical minimum for comparison-based sorting algorithms
What is the disadvantage of Mergesort()? GEQ?
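A runnable Python version of the pseudocode above (my own transcription; the merge is written inline so the n − 1 worst-case comparisons are visible). It also makes the disadvantage easy to spot: the merge builds a new list, i.e., Θ(n) extra space:

```python
def mergesort(a):
    """Top-down mergesort; returns a new sorted list."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b, c = mergesort(a[:mid]), mergesort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):   # at most n - 1 comparisons in total
        if b[i] <= c[j]:
            merged.append(b[i]); i += 1
        else:
            merged.append(c[j]); j += 1
    return merged + b[i:] + c[j:]      # one side is exhausted; append the rest
```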
Chapter 5 74 Quicksort
Algorithm Quicksort:
    Quicksort(A[0..n-1])
        if n > 1
            rearrange A so that
                all values in A[0..s-1] are <= A[s]     % partition 1
                all values in A[s+1..n-1] are > A[s]    % partition 2
            Quicksort(A[0..s-1])
            Quicksort(A[s+1..n-1])
Q: how do we do the rearrangement?
- there are many variations on this theme
- all(?) involve moving all values which are <= the pivot to locations in the array which are before all values > the pivot (and making sure the pivot itself is at the end of the 1st partition)
Chapter 5 75 Quicksort: Choosing a Pivot
How do we choose the pivot?
- just pick the first element of A
- just pick the last element of A
- we could pick the middle element of A
  - pathological data sets can be constructed which cause this to perform badly
  - exercise: construct such a pathological data set
- we could randomly choose an element of A
- we could pick 2m + 1 elements of A (how?), find the median of those elements, and use that as the pivot
Chapter 5 76 Quicksort: Partitioning the Data Array
Here is one technique to partition A with respect to the pivot: a "two-finger" technique
This technique assumes the pivot is in A[0]:
- let i = 1 and j = n - 1
- put the left finger on A[i]; move it right until for some i we have A[i] > A[0]
- put the right finger on A[j]; move it left until for some j we have A[j] <= A[0]
- if the fingers have not crossed, swap A[i] with A[j] and continue
- once the fingers cross, swap A[0] with A[j]
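The two-finger partition can be fiddly to get right; here is a Python sketch of the steps above together with the quicksort driver (my own code, with the first element as pivot, so sorted input is a worst case, as the next slide shows):

```python
def partition(a, lo, hi):
    """Two-finger partition of a[lo..hi] around pivot a[lo].

    Returns the pivot's final index s, with a[lo..s-1] <= a[s] < a[s+1..hi]."""
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= hi and a[i] <= pivot:
            i += 1                    # left finger skips values <= pivot
        while a[j] > pivot:
            j -= 1                    # right finger skips values > pivot
        if i >= j:
            break                     # fingers have crossed
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]         # put the pivot between the partitions
    return j

def quicksort(a, lo=0, hi=None):
    """In-place quicksort using the two-finger partition."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)
        quicksort(a, lo, s - 1)
        quicksort(a, s + 1, hi)
    return a
```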
Chapter 5 77 Analysis of Quicksort: Worst Case
Worst case: the pivot is the largest or smallest element
T(0) = T(1) = 0; T(n) = n − 1 + T(n − 1)
so
    T(2) = 1 + T(1) = 1 + 0 = 1
    T(3) = 2 + T(2) = 2 + 1 = 3
    T(4) = 3 + T(3) = 3 + 3 = 6
    T(5) = 4 + T(4) = 10
    T(6) = 5 + T(5) = 15
Guess: T(n) = Σ_{i=1}^{n−1} i = n(n − 1)/2
So quicksort is Θ(n^2) in the worst case!
Chapter 5 78 Analysis of Quicksort: Best Case
Best case: the two partitions are (roughly) the same size
T(0) = T(1) = 0; T(n) = n − 1 + 2T(n/2)
We could solve this the hard way, or we could apply the master theorem and get T(n) ∈ Θ(n lg n). Nice!
Chapter 5 79 Analysis of Quicksort: Average Case (1)
This is the tricky case to solve, but also the most interesting case (and the most significant real-world case)
Assume that all permutations of the input are equally likely
After division into two lists (and the pivot) we have lists of size i (0 ≤ i ≤ n − 1) and n − i − 1
Applying the equally-likely assumption and basic probability theory, we get (for n > 1)
    T(n) = n − 1 + (1/n) Σ_{i=0}^{n−1} [ T(i) + T(n − i − 1) ]
Chapter 5 80 Analysis of Quicksort: Average Case (2)
We want to solve: T(0) = T(1) = 0,
    T(n) = n − 1 + (1/n) Σ_{i=0}^{n−1} [ T(i) + T(n − i − 1) ]
Note:
- the sum of those T(i)'s is T(0) + T(1) + ... + T(n − 1)
- the sum of those T(n − i − 1)'s is T(n − 1) + T(n − 2) + ... + T(0)
Thus
    T(n) = n − 1 + (2/n) Σ_{i=0}^{n−1} T(i)    (*)
Now what? Guess: T(n) ≤ c·n·ln n, n ≥ 1
Chapter 5 81 Analysis of Quicksort: Average Case (3)
To be shown: T(n) ≤ c·n·ln n, n ≥ 1
The proof is by induction on n (surprise!)
Base case (n = 1): c · 1 · ln 1 = 0; since T(1) = 0, the base case holds
Substitute our guess (the induction hypothesis) into (*):
    T(n) ≤ n − 1 + (2/n) Σ_{i=1}^{n−1} c·i·ln i    (note the lower index!)
How can we handle this sum? Note that the sum a1 + a2 + a3 is equal to the total area of 3 rectangles of width 1 and heights (respectively) a1, a2, a3
(figure: three unit-width rectangles of heights a1, a2, a3)
Chapter 5 82 Analysis of Quicksort: Average Case (4)
We can bound this area with an integral (calculus can be useful!):
    T(n) ≤ n − 1 + (2c/n) ∫_1^n x ln x dx          (note the upper limit)
         ≤ n − 1 + (2c/n) ( (n^2/2) ln n − n^2/4 )
         = n − 1 + c·n·ln n − (c/2)·n
         = c·n·ln n + n(1 − c/2) − 1               (**)
To show T(n) ≤ c·n·ln n, all we need to show is that the second and third terms of (**) are ≤ 0
- n(1 − c/2) is ≤ 0 whenever 1 − c/2 ≤ 0, i.e., 2 ≤ c; so choose c = 2 and we are done
Alternative approach: subtract the summation formula for (n − 1)T(n − 1) from the summation formula for nT(n) and watch most of the terms vanish
Chapter 5 83 Analysis of Quicksort: Final Thoughts
Intelligent choice of pivot can (usually) keep us from the Θ(n^2) situation
Other improvements:
- recall: a Θ(n^2) algorithm can be better than a Θ(n lg n) algorithm for small values of n
- eliminate recursion (by clever programming)
The textbook claims such improvements can decrease running time by 20-25%
Chapter 5 84 Binary Search
An efficient algorithm for searching in a sorted array
Two cases: the sought element K is in array A, or the sought element K is not in array A
    left = 0
    right = n-1
    while left <= right
        mid = (left + right) / 2    % truncation to integer
        if K == A[mid]
            return mid
        if K < A[mid]
            right = mid - 1         % look in left half
        else
            left = mid + 1          % look in right half
    return -1                       % flag for "not found"
GEQ: is this the optimum way to do the comparisons?
Note: if the array is really big, the calculation of mid can overflow in languages with fixed-width integers; writing mid = left + (right - left) / 2 avoids this
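The pseudocode above, transcribed into Python (my own transcription; Python's integers don't overflow, but the overflow-safe form of the mid calculation is shown anyway since it costs nothing):

```python
def binary_search(a, key):
    """Iterative binary search; returns an index of key in sorted a, or -1."""
    left, right = 0, len(a) - 1
    while left <= right:
        mid = left + (right - left) // 2   # overflow-safe form of (left+right)//2
        if key == a[mid]:
            return mid
        if key < a[mid]:
            right = mid - 1    # look in left half
        else:
            left = mid + 1     # look in right half
    return -1                  # flag for "not found"
```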
Chapter 5 85 Analysis of Binary Search
Count the number of comparisons of data items (bogosity: counting the 3-way comparison as 1 comparison)
Worst case: the sought item is not in the array
    T(1) = 1
    T(n) = 1 + T(n/2)
Solution (generalized master theorem (be careful!)): T(n) ∈ Θ(lg n)
This is optimal for searching a sorted array
Note that this is similar to the bisection algorithm for solving continuous equations of the form f(x) = 0
Chapter 5 86 Multiplying Large Integers
There are many applications in which arithmetic of (arbitrarily) large numbers is required. Example(s)?
Consider the standard way of multiplying two n-digit integers by hand:
                a_{n-1} a_{n-2} ... a_1 a_0
              × b_{n-1} b_{n-2} ... b_1 b_0
    ---------------------------------------
        c_{0,n} d_{0,n-1} d_{0,n-2} ... d_{0,1} d_{0,0}
      c_{1,n} d_{1,n-1} d_{1,n-2} ... d_{1,1} d_{1,0}
      ...
      c_{n-1,n} d_{n-1,n-1} ... d_{n-1,1} d_{n-1,0}
    ---------------------------------------
    p_{2n-1} p_{2n-2} ... p_{n-1} p_{n-2} ... p_1 p_0
In total, there are n^2 one-digit multiplications and a similar number of additions
Idea for long integers: break up the integers into smaller pieces which can be multiplied with one machine instruction (e.g., pieces < 10^5 on a 32-bit machine)
GEQ: is this still Θ(n^2), or is it now o(n^2)?
Chapter 5 87 Multiplication: Divide and Conquer?
Suppose we want to multiply n-digit numbers A and B; will it help to divide them into two pieces?
Write A = A1 A2 and B = B1 B2 (as digit strings), where A = A1·10^(n/2) + A2 and B = B1·10^(n/2) + B2
We get
    AB = (A1·10^(n/2) + A2)(B1·10^(n/2) + B2)
       = A1·B1·10^(2(n/2)) + (A1·B2 + A2·B1)·10^(n/2) + A2·B2
How many (single-digit) multiplications is this?
- (n/2)·(n/2) for the first term
- 2·(n/2)·(n/2) for the second term
- (n/2)·(n/2) for the third term
- in total, 4·(n/2)·(n/2) = n^2
No savings in this case! So what then?
Chapter 5 88 Multiplication: Divide and Conquer, Try 2
As seen, we can write
    AB = A1·B1·10^n + (A1·B2 + A2·B1)·10^(n/2) + A2·B2
and with a little thought we can also write
    (A1 + A2)(B1 + B2) = A1·B1 + (A1·B2 + A2·B1) + A2·B2
Rearranging, we get
    (A1·B2 + A2·B1) = (A1 + A2)(B1 + B2) − A1·B1 − A2·B2
But we have already computed A1·B1 and A2·B2, so we can compute (A1·B2 + A2·B1) with one n/2-digit multiplication instead of two (and four n/2-digit additions instead of one)
So counting the number of single-digit multiplications for two n-digit numbers, we get T(1) = 1, T(n) = 3T(n/2)
Solution: T(n) = 3^(lg n) = n^(lg 3) ≈ n^1.585
gnuplot-multiplication!!!
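This three-multiplications trick is Karatsuba's algorithm; here is a Python sketch of it (my own; it splits the numbers by decimal digits, as in the slides, and of course uses Python's own * at the single-digit base case):

```python
def karatsuba(a, b):
    """Multiply non-negative integers with three half-size products."""
    if a < 10 or b < 10:
        return a * b                      # base case: a one-digit factor
    n = max(len(str(a)), len(str(b)))
    half = n // 2
    a1, a2 = divmod(a, 10 ** half)        # a = a1 * 10**half + a2
    b1, b2 = divmod(b, 10 ** half)        # b = b1 * 10**half + b2
    high = karatsuba(a1, b1)              # A1*B1
    low = karatsuba(a2, b2)               # A2*B2
    middle = karatsuba(a1 + a2, b1 + b2) - high - low   # A1*B2 + A2*B1
    return high * 10 ** (2 * half) + middle * 10 ** half + low
```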
Chapter 5 89 Fast Matrix Multiplication
The straightforward algorithm to multiply two n × n matrices is Θ(n^3)
1969: Strassen discovered that by dividing each matrix into 4 n/2 × n/2 submatrices and doing some similar rearrangements, instead of using 8 multiplications of n/2 × n/2 matrices, only 7 are needed
This gives the recurrence equation T(1) = 1, T(n) = 7T(n/2) for n > 1
The solution to this is T(n) = n^(lg 7) ≈ n^2.81
More complex algorithms have been found; in 1987, Coppersmith and Winograd gave an O(n^2.376) algorithm
These algorithms have large/huge multiplicative constants: not practical unless the matrices are huge (but theoretically interesting)
Chapter 5 90 Closest-Pair Problem: Divide and Conquer
Recall that the brute-force solution was O(n^2)
Idea: divide the points into two equal subsets S1 and S2 based upon their x coordinates (sort the points using an O(n lg n) algorithm if needed)
Recursively find the closest pair in each of S1 and S2; if their distances are d1 and d2, let d = min(d1, d2)
We must now consider pairs of points (p1, p2) where p1 ∈ S1 and p2 ∈ S2, but... we need only consider pairs of points within distance d of the dividing line (call these sets C1 and C2)
Key observation: for each p1 in C1, there are at most 6 points close enough in C2 to be of interest!
The running time is T(n) = 2T(n/2) + M(n), where M(n) ∈ O(n)
It can be shown that this problem is Ω(n log n)
Chapter 5 91 Closest-Pair Problem: 2
As noted in the text, a key observation is the following: when considering each point in C1 for points in C2 closer than d, at most 6 points must be considered
However, we must be able to efficiently find those 6 points; the book talks about how we can sort the points to do this
GEQ: what, exactly, is the textbook saying?
GEQ: if we have to pre-sort (in O(n lg n) time), does this change the analysis? Why or why not?
Chapter 5 92 Convex Hull by Divide and Conquer ("Quickhull")
Notation: let P_a P_b denote the line segment from P_a to P_b
Assume we have n points S = { P_i : 1 ≤ i ≤ n }, where each P_i = (x_i, y_i)
These points are sorted according to increasing order of their x coordinates; any ties are broken according to increasing y coordinate
Choose the sorted list's first and last points, P_1 and P_n
Divide S into S_1 and S_2, according to P_1 P_n:
- points to the left of P_1 P_n are in S_1, points to the right of P_1 P_n are in S_2
- we will need an analytical test for "left" and "right"
The convex hull of S is the union of the convex hulls of S_1 (the "upper hull") and S_2 (the "lower hull")
Q: how can we efficiently divide S into S_1 and S_2?
Chapter 5 93 Upper Hull Construction
If S_1 = Ø, the upper hull is just P_1 P_n
Otherwise, find the point P ∈ S_1 which is furthest from P_1 P_n
Find the sets S_{1,1}, the points to the left of P_1 P, and S_{1,2}, the points to the left of P P_n
It can be shown that
- the points inside the triangle P_1 P P_n cannot be on the upper hull (and thus we can ignore them from now on)
- there are no points to the left of both P_1 P and P P_n
Recursively compute the upper hull of {P_1} ∪ S_{1,1} ∪ {P} and of {P} ∪ S_{1,2} ∪ {P_n}; the concatenation of these upper hulls gives the upper hull of {P_1} ∪ S_1 ∪ {P_n}
Chapter 5 94 Quickhull: Testing Whether a Point is Left of a Line
Let P_1 = (x_1, y_1), P_2 = (x_2, y_2) and P_3 = (x_3, y_3)
Amazing fact you might not know: the area of the triangle bounded by P_1, P_2 and P_3 is equal to 1/2 of the magnitude of this determinant:
    | x_1  y_1  1 |
    | x_2  y_2  1 |  =  x_1(y_2 − y_3) − x_2(y_1 − y_3) + x_3(y_1 − y_2)
    | x_3  y_3  1 |
The sign of this determinant is positive iff P_3 is to the left of the directed line P_1 P_2
Thus we can check in constant time whether a point is to the left of a line defined by two other points
Quickhull is Θ(n^2) in the worst case
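The determinant test above in Python (my own sketch; points are (x, y) tuples). The helper returns twice the signed area, which is all the hull algorithm needs, since only the sign (and relative magnitude, for "furthest from the line") matters:

```python
def signed_area2(p1, p2, p3):
    """Twice the signed area of triangle p1 p2 p3 (the 3x3 determinant)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return x1 * (y2 - y3) - x2 * (y1 - y3) + x3 * (y1 - y2)

def left_of(p1, p2, p3):
    """True iff p3 lies strictly to the left of the directed line p1 -> p2."""
    return signed_area2(p1, p2, p3) > 0
```

E.g., relative to the rightward-pointing segment from (0, 0) to (1, 0), points above the x-axis are "left" (positive determinant) and points below are "right" (negative).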