1 Introduction

1.1 Parallel Randomized Algorithms Using Sampling

A fundamental strategy used in designing efficient algorithms is divide-and-conquer, where the input data is partitioned into several subproblems which are solved recursively. The solutions to the subproblems are then merged to form the solution of the original problem. Two issues arise when designing a divide-and-conquer algorithm: how to partition the data and how to merge subproblems. Sometimes there is a tradeoff between ease of partitioning and the time needed for the merging step; careful partitioning may take more time but could lead to a more efficient method of combining solutions.

One method of partitioning the input is random sampling. This technique has been used successfully in designing efficient parallel algorithms for problems in computational geometry [?,?]. A random sample of the input points, typically of size n^ε for some 0 < ε < 1, is used to partition the data space so that each input point is included in one or more of the partitions. In some applications the input is not partitioned but rather decomposed into sets such that an input point may be placed in more than one set. If that is the case, the total size of all sets is kept bounded to avoid explosion of the work bounds. For sequential algorithms, Clarkson [?] showed that random sampling can be used to give a simple, general technique for building geometric structures incrementally and for fast processing of random input points.

In this paper we describe a useful technique that enables us to transform algorithms that use random sampling to partition the data into dynamic algorithms that allow insertion and deletion of data points. Dynamic (or incremental) algorithms update their output solution to a problem when the input is dynamically modified. Usually it is not practical to recompute the solution "from scratch", so special data structures are devised that can be updated at small cost.
Dynamic algorithms are very useful in interactive applications, including network optimization, VLSI, and computer graphics. Many dynamic data structures have been devised to deal with problems in computational geometry [3, 2]. Throughout the rest of this paper we will refer to algorithms with fixed input as static algorithms, and to algorithms with a dynamically changing input set as dynamic algorithms.

1.2 Dynamizing Static Randomized Algorithms

We can visualize a divide-and-conquer parallel algorithm as a tree. The input set is stored at the root. The root has one child for each partition of the input, and that child node stores the elements of that partition. Each child node is then recursively partitioned. Recursion stops when the number of data points stored at a leaf node drops below a specified bound, usually a polylog function of the input size. One of the requirements for an efficient dynamic algorithm is that this algorithm tree and the corresponding data structures remain balanced throughout the execution of the dynamic algorithm. Below we describe a general technique that achieves this goal. Note that this is a general technique, and details have to be tailored to suit specific implementations. Later in the paper we describe in detail how this technique can be used in a convex hull algorithm.

Consider a simple static algorithm A which, given an input set I of size n, uses information from a small random subset R of size n^ε of the input to partition its data set. Let the work performed by A be T(n). Now consider a new data point p being presented to the algorithm to be inserted into the input set (a similar argument holds for deleting a data point). Suppose I ∪ {p} was input to A. Since the partition of the data set depends only on a small sample of the input set, the probability that p would be included in R is small: n^ε/(n+1), to be exact. Thus with probability 1 − n^ε/(n+1), the same partition would be produced regardless of whether p is in the input set. By performing a single Bernoulli trial, we can decide whether or not the partition should be affected by the insertion or deletion of p. If the answer is "yes", we invoke the static algorithm A, repartitioning the input and recomputing the output from scratch. The probability of this happening is n^ε/(n+1). Since the cost of invoking A is at most T(n+1), the expected cost of adding the new input point p is at most (n^ε/(n+1)) · T(n+1).

If repartitioning does not take place, we look at the current partitioning of the data set and determine which partition p falls into. We then recursively repeat the process on that partition. The dynamic maintenance of the algorithm tree and corresponding data structures proceeds inductively. At each step, the following induction hypothesis holds: after a sequence of any number of updates to the input, the partitioning of the data points has the same probability distribution as a partitioning achieved by the static algorithm, had the same input been presented to it. With each new insertion or deletion, with probability at most n^ε/n the entire data set may be completely repartitioned by the static algorithm, using unbiased independent random sampling. After each step, the induction hypothesis still holds. Section 3 gives more details of this approach.

2 A Static, Randomized, Convex Hull Algorithm

We describe a randomized divide-and-conquer convex hull algorithm. In many respects it is similar to the deterministic convex hull algorithm of [?]. We give first a high-level description, followed by a more detailed one.

Algorithm 1
Input: A set P of n points in the plane.
Output: The convex hull of P.

1. Sort the input points by their x-coordinate.
2. Divide the points into an upper and a lower hull. Without loss of generality, the rest of the algorithm deals with the construction of the upper hull.
The lower hull is constructed similarly.
3. Let n_0 be the number of points in the input. Randomly select a sample R_0 containing s(n_0) = n_0^ε points of the input. These points split the plane into s(n_0) sectors; the borders of the sectors are the rays emanating from a central point and going through each of the points in the sample. The size of a sector is the number of points within it. The expected size of each sector is n_0^(1−ε).
4. Recursively compute the convex hull of each sector.
5. Combine the sector hulls into a single hull.

We now give a more detailed description of the recursive construction and the merge step.

2.1 Recursive construction of a sector hull

We define a partition tree of the point set as follows. The input points are stored in the leaves of a tree. The internal nodes of the tree define the recursive partitioning of the sectors. Each internal node at level i corresponds to a sector at the ith recursion level, and is the root of a subtree which contains the points included in that sector. The size of a subtree is the number of points stored in its leaves.
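As an illustration, the recursive partitioning behind Algorithm 1 and the partition tree defined above can be sketched as follows. This is a simplified sketch rather than the paper's implementation: the value of the sampling exponent EPS, the leaf-size constant, and the use of gaps between sample x-coordinates in place of rays through a central point are all assumptions made for brevity.

```python
import math
import random

EPS = 0.5         # sampling exponent (0 < EPS < 1); an assumed value
LEAF_BOUND_C = 2  # recursion stops when a sector holds <= log^c n points

def build_partition_tree(points, n_total):
    """Recursively partition `points` by random sampling.

    Each internal node stores its sample and one child per non-empty
    sector; leaves store the raw points.  Sectors are simplified to the
    ranges between consecutive sample x-coordinates.
    """
    n = len(points)
    if n <= math.log(max(n_total, 2)) ** LEAF_BOUND_C:
        return {"leaf": True, "points": points}
    s = max(1, round(n ** EPS))                 # sample size s(n) = n^eps
    sample = sorted(random.sample(points, s), key=lambda p: p[0])
    bounds = [p[0] for p in sample]
    sectors = [[] for _ in range(s + 1)]
    for p in points:
        lo, hi = 0, len(bounds)
        while lo < hi:                          # binary search for the sector
            mid = (lo + hi) // 2
            if p[0] < bounds[mid]:
                hi = mid
            else:
                lo = mid + 1
        sectors[lo].append(p)
    children = [build_partition_tree(sec, n_total) for sec in sectors if sec]
    return {"leaf": False, "sample": sample, "children": children}
```

Every input point lands in exactly one sector at each level, so the leaves of the resulting tree together hold the whole input set.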
Lemma 2.1 The expected size of a subtree rooted at level i of the partition tree is n_i = n^((1−ε)^i).

Partition the points in each sector as follows. At level i of the recursion, a sector contains n_i points and is partitioned into s_i subsectors by a random sample R_i of size s_i = n_i^ε. The root of the tree is considered to be level 0. The root has s_0 = n_0^ε children, each corresponding to a sector. A subtree rooted at level i has s_i = n_i^ε children, each corresponding to a subsector.

Lemma 2.2 The expected number of children of a node at level i of the partition tree is s_i = n^(ε(1−ε)^i).

Define L_i to be the number of internal nodes at level i of the tree, that is, the number of subsectors at the ith level of recursion.

Lemma 2.3 The expected value of L_i is n^(1−(1−ε)^i).

Proof: The number of internal nodes at a given level of the partition tree is the number of nodes one level higher times their expected number of children, giving the recursion L_i = L_{i−1} s_{i−1} = L_{i−1} n^(ε(1−ε)^{i−1}), with L_0 = 1 (the root). Solving this recurrence, we get

L_i = ∏_{j=0}^{i−1} n^(ε(1−ε)^j) = n^f(ε,i) = n^(1−(1−ε)^i),

where

f(ε,i) = ε ∑_{j=0}^{i−1} (1−ε)^j = ε · (1 − (1−ε)^i) / (1 − (1−ε)) = 1 − (1−ε)^i. □

Alternate proof: The leaves of the tree contain all n_0 = O(n) input points. From Lemma 2.1, the expected size of a subtree rooted at level i is n^((1−ε)^i). Since the subtrees define a partition, they are disjoint. Thus the expected number of subtrees is n / n^((1−ε)^i) = n^(1−(1−ε)^i). □

The recursion stops when the average sector size is log^c n for some constant c.

Lemma 2.4 The expected depth of the partition tree is O(log log n).

Proof: The size of a sector at level i of the partition is n^((1−ε)^i) (Lemma 2.1). To find the level at which this size becomes polylog, solve n^((1−ε)^k) = log^c n for k; taking logarithms twice gives k = O(log log n). □

Partitioning a sector into subsectors has two components: selecting the random sample, which can be done in constant time per sample point, and determining which subsector each point belongs to, which can be done in constant time per input point.
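A quick numeric illustration of Lemmas 2.1 and 2.4: iterating the sector-size recurrence n_{i+1} = n_i^(1−ε) until the size drops below log^c n takes very few levels even for enormous n. The function name and the sample values ε = 0.5, c = 2 are assumptions chosen for illustration.

```python
import math

def partition_depth(n, eps=0.5, c=2):
    """Count levels until the expected sector size n^((1-eps)^i)
    drops to the stopping bound log^c n (Lemma 2.4)."""
    depth, size = 0, float(n)
    while size > math.log(n) ** c:
        size = size ** (1 - eps)   # Lemma 2.1: n_{i+1} = n_i^(1-eps)
        depth += 1
    return depth

# The depth grows roughly like log log n: squaring the exponent of n
# adds only about one more level.
for n in (10**3, 10**6, 10**12, 10**24):
    print(n, partition_depth(n))
```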
Lemma 2.5 The sectors at level i of the recursion can be partitioned in O(n) time.

Proof: By Lemma 2.3, there are n^(1−(1−ε)^i) sectors at the ith level of the recursion. By Lemma 2.2, the sample selected in each of these sectors is of size n^(ε(1−ε)^i). Thus the time to select all the samples at this level is n^(1−(1−ε)^i) · n^(ε(1−ε)^i) = n^(1−(1−ε)^{i+1}). Each of the input points falls into one subsector (if a point lies exactly on the border between two sectors, arbitrarily assign it to one of the two). It is possible to determine the subsector a point lies in in O(1) time, requiring O(n) time to determine the subsectors for all input points. The total time to perform the ith partition level is therefore n^(1−(1−ε)^{i+1}) + n = O(n). □

Since the depth of the partition tree is O(log log n) (by Lemma 2.4), we have:

Lemma 2.6 The total time required to recursively partition the input point set is O(n log log n).
2.2 Merging of sector hulls

Build the hulls of the sectors at the leaves of the partition tree using any standard convex hull algorithm. Build the hull of a level i sector from the (already constructed) hulls of the sectors at level i+1 by merging pairs of adjacent hulls. Combine adjacent pairs using the upper tangent algorithm of [?], which is linear in the number of points whose hulls are to be merged. This reduces the number of hulls by half. Continue merging adjacent pairs until there is only one hull.

Lemma 2.7 The time required to construct one of the level i hulls is ε(1−ε)^i n^((1−ε)^i) log n.

Proof: By Lemmas 2.2 and 2.1, each level i sector is split into s_i = n^(ε(1−ε)^i) subsectors, each of size n_{i+1} = n^((1−ε)^{i+1}). Merging adjacent pairs, the merge process can be viewed as a tree of merges of depth log s_i. Each merge level requires s_i · n_{i+1} work (sequential time). Thus the total time required until all level i+1 sector hulls are merged into one level i hull is

s_i n_{i+1} log s_i = n^(ε(1−ε)^i) · n^((1−ε)^{i+1}) · log n^(ε(1−ε)^i) = ε(1−ε)^i n^((1−ε)^i) log n. □

Construction of the partition tree was a top-down process. Hulls are created at the leaves, and built bottom-up along the partition tree, with a merge step at every level. This leads to

Lemma 2.8 The time required to merge the hulls at the leaves into a single hull is O(n log n).

Proof: At level i there are L_i = n^(1−(1−ε)^i) sector hulls to construct (Lemma 2.3), each at the cost given by Lemma 2.7. The total merge work over all levels of the tree is therefore

∑_{i=0}^{O(log log n)} n^(1−(1−ε)^i) · ε(1−ε)^i n^((1−ε)^i) log n = ε n log n ∑_{i=0}^{O(log log n)} (1−ε)^i ≤ ε n log n · (1/ε) = O(n log n). □

2.3 Complexity analysis

Combining the results of the previous sections, we have:

Theorem 2.1 The algorithm computes the convex hull of a set of n points in the plane in time O(n log n).

In the next section, we show how to use the data structures constructed for this algorithm in a dynamic convex hull algorithm.

3 Dynamic Construction

Our goal is to maintain the convex hull of a dynamically changing point set.
We give algorithms for dynamic maintenance of the convex hull of a set P of points in the plane. The algorithm accepts a sequence of requests from an adversary. Each request is a pair (point, action), where an action may be to INSERT or DELETE the input point from P, or to answer a QUERY about the input point, e.g. determining whether it is on the convex hull of P, or searching for it in the partition.

Our algorithm proceeds inductively. Given a partition tree, the entire structure or a part thereof may be completely rebuilt using the static algorithm with each new insertion or deletion. When a request for insertion arrives, the input to that stage of the algorithm is a set of n−1 points P', arranged in a partition tree, the hull obtained using that partition tree, and one additional point p. Consider the situation where all n points are presented to the static algorithm described in the previous section. At the top level of the algorithm, the key step of the static algorithm which is important for our purpose is selecting the set R_0 which is used to determine the partitioning of the points into sectors. If |R_0| = n^ε, the probability that the new point would not have been included in R_0 is 1 − n^ε/n. Thus we perform a single Bernoulli trial with probability of success 1 − n^ε/n. Failure in this trial means that p needs to be included in the subset of points that determines the partition; if that is the case, we need to reconstruct the whole partition. Otherwise, we determine which sector the point belongs to, then recursively repeat this process, reconstructing the partition at level i with probability P_i = n_i^ε / n_i.

How should the reconstruction be done? A naive approach would be to call the static algorithm and redo everything. Since the probability of reconstructing the entire partition and the hull is small, one would expect the overall complexity to be reasonably bounded. However, this is not the case, and total reconstruction yields a polynomial update time. Note that even if no repartitioning needs to be done, the hull always has to be reconstructed at the leaves under this scheme. A better approach is to substitute an existing dynamic algorithm such as the O(log^2 n) dynamic update of Overmars and van Leeuwen [4]. The data structure used in their algorithm is relatively similar and easy to tailor to the partition tree (details below). Let

P_i be the probability of updating the partition at level i,
C_i be the cost of updating the partition and hull at level i,
F_i be the cost of finding which sector a point belongs to.
We get the following recursion for the total cost of an update:

T(n_i) ≤ P_i C_i + (1 − P_i)(F_i + T(n_{i+1})).

In this algorithm, P_i is the probability of p being included in the sample at level i, which is, by Lemma 2.2, n^(ε(1−ε)^i) / n^((1−ε)^i) = n^(−(1−ε)^{i+1}). The cost of finding the sector that a point belongs to is the cost of performing binary search through the sectors. Note that there is no need to search through all subsectors at a level: at the top level, search through the n^ε sectors; once the sector is found at that level, we need only search through its subsectors. Thus the cost of finding the subsector that a point belongs to at level i is F_i = log s_i = log n^(ε(1−ε)^i) = ε(1−ε)^i log n.

The cost of updating the partition at level i has two components: updating the sector hull and merging the sector hull up the partition tree (levels i−1, …, 0). Perform the update in polylog time using the modified update algorithm described below in Section 4. The merge step no longer involves all subsectors in a sector. Start by merging the modified subsector with its neighbors, and stop the merge when a subsector hull is not modified by the merge. Note that in the worst case all level i+1 subsectors in a level i sector are modified, and that other level i sectors may be modified in the next merge step, but that no other level i+1 subsectors outside the sector are modified.

4 Modified dynamic convex hull algorithm
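The top-down update walk described in Section 3 can be sketched as follows. This is a hypothetical rendering, not the paper's implementation: the node layout and the helper names (insert_point, sector_index, rebuild_static) are invented for illustration, and the polylog leaf hull update is omitted.

```python
import random

def insert_point(node, p, rebuild_static):
    """Insert p into a partition tree, deciding at each level (by one
    Bernoulli trial) whether p would have joined that level's sample.

    rebuild_static stands in for rerunning the static algorithm on a
    subtree's point set.  Nodes are dicts with fields 'leaf', 'points',
    'sample', 'children' (a hypothetical layout).
    """
    if node["leaf"]:
        node["points"].append(p)   # the leaf hull is updated separately
        return node
    n_i = subtree_size(node) + 1   # +1 accounts for the new point p
    s_i = len(node["sample"])
    # With probability P_i = s_i / n_i, p would have been sampled,
    # changing the partition here: rebuild this subtree from scratch.
    if random.random() < s_i / n_i:
        return rebuild_static(collect_points(node) + [p])
    # Otherwise locate p's sector by binary search and recurse into it.
    idx = sector_index(node, p)
    node["children"][idx] = insert_point(node["children"][idx], p,
                                         rebuild_static)
    return node

def subtree_size(node):
    return len(node["points"]) if node["leaf"] else sum(
        subtree_size(c) for c in node["children"])

def collect_points(node):
    return list(node["points"]) if node["leaf"] else [
        q for c in node["children"] for q in collect_points(c)]

def sector_index(node, p):
    # Simplified sectors: gaps between sample x-coordinates, sorted.
    bounds = sorted(q[0] for q in node["sample"])
    lo, hi = 0, len(bounds)
    while lo < hi:
        mid = (lo + hi) // 2
        if p[0] < bounds[mid]:
            hi = mid
        else:
            lo = mid + 1
    return min(lo, len(node["children"]) - 1)
```

Whichever branch the Bernoulli trial takes, the returned tree contains exactly the old points plus p, matching the induction hypothesis of Section 1.2.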
References

[1] Cole R, Goodrich MT. Optimal parallel algorithms for polygon and point-set problems. Tech Report 88-14, Department of Computer Science, Johns Hopkins University, 1988.

[2] Chiang Y, Tamassia R. Dynamic algorithms in computational geometry. Tech Report CS-91-24, Department of Computer Science, Brown University, 1991.

[3] Overmars M. The design of dynamic data structures. Lecture Notes in Computer Science, 156, Springer, 1983.

[4] Overmars M, van Leeuwen J. Maintenance of configurations in the plane. Journal of Computer and System Sciences, 23(2):166-204, 1981.