Complexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks


Journal of Computing and Information Technology - CIT 8, 2000, 1

Eunice E. Santos
Department of Electrical Engineering and Computer Science, Lehigh University

We consider the problem of designing optimal and efficient algorithms for solving tridiagonal linear systems with multiple right-hand side vectors on two-dimensional mesh interconnection networks. We derive asymptotic upper and lower bounds for these solvers using odd-even cyclic reduction. We present various important lower bounds on execution time for solving these systems, including general lower bounds which are independent of initial data assignment, and lower bounds based on classifications of initial data assignments, which classify assignments via the proportion of initial data assigned amongst processors. Finally, different algorithms are designed in order to achieve running times that are within a small constant factor of the lower bounds provided.

1. Introduction

In this paper, we consider the problem of designing algorithms for solving tridiagonal linear systems. Such algorithms are referred to as tridiagonal solvers. A method for solving these systems in which designers take particular interest is the well-known odd-even cyclic reduction method, which is a direct method. Due to this interest, we chose to focus our attention on odd-even cyclic reduction. Thus, our results will be applicable to tridiagonal solvers designed on a two-dimensional mesh utilizing cyclic reduction.

Much research has been spent on designing and analyzing algorithms that solve these systems on specific types of interconnection networks [1, 5, 7, 8, 9, 10, 11] such as the hypercube or butterfly. However, very little has been done on determining lower bounds for solving tridiagonal linear systems [4, 6, 10, 11] on any type of interconnection network or on specific general parallel models [9].
Moreover, few of these papers consider the case of multiple right-hand side (RHS) vectors, and we know of no papers which derive lower bounds on parallel run-time for odd-even cyclic solvers with multiple RHS vectors.

The main objective of this paper is to present asymptotic upper and lower bounds on the running time for solving tridiagonal systems with multiple right-hand sides which utilize odd-even cyclic reduction on 2-dimensional meshes. Our decision to work with meshes is based on the simple fact that they are very common and frequently used interconnection networks. The results obtained in this paper will provide not only a means for measuring the efficiency of existing algorithms but also a means of pinpointing algorithm design issues, data layouts and communication patterns in order to achieve optimal or near-optimal running times.

While we have already derived results for tridiagonal solvers with only a single vector on a 2-D mesh [10], the results were not easily adaptable to multiple vectors. In fact, multiple vectors produced several layers of complexity in the analysis of this problem. Some of the interesting results we shall show include the following: The skewness in the proportion of data will significantly affect running time. Another interesting result is the proof of the threshold for processor utilization. For some problem sizes, using common data layouts and straightforward communication patterns does not result in significantly higher complexities than assuming that all processors have access to all data items regardless of communication pattern. In many cases, common data layouts can be used to obtain optimal running times. However, there is no one algorithm, data layout or communication pattern which will provide optimal run-times for all cases. In fact, we present more than 6 algorithms/subalgorithms which are needed in order to show how to achieve optimal or near-optimal running times for all cases.

The paper is divided as follows. Section 2 contains a description of the 2-D mesh topology. In Section 3 we discuss the odd-even cyclic reduction method for solving tridiagonal systems. In Section 4, we derive various important lower bounds on execution time. First, we derive general lower bounds for solving tridiagonal systems, i.e. the bounds hold regardless of data assignment. We follow this by deriving lower bounds which rely on categorizing data layouts via the proportion of data assigned amongst processors. Furthermore, we describe commonly used data layouts designers utilize for this problem. Lastly, we present a variety of algorithms and subalgorithms which will produce running times within a small constant factor of the lower bounds derived. Section 6 gives the conclusion and summary of results.

Research partially supported by an NSF CAREER Grant.

2. 2-Dimensional Mesh Interconnection Network

A mesh network is a parallel model on $P$ processors in which processors are grouped as two types: border processors and interior processors. Each interior processor is linked with exactly four neighbors. All but four of the border processors have three neighbors, and the remaining four border processors have exactly two neighbors.
More precisely: Denote processors by $p_{i,j}$ where $1 \le i, j \le \sqrt{P}$.
- For $1 < i, j < \sqrt{P}$, the four neighbors of $p_{i,j}$ are $p_{i+1,j}$, $p_{i-1,j}$, $p_{i,j+1}$, and $p_{i,j-1}$.
- For $i = 1$ and $1 < j < \sqrt{P}$, the three neighbors of $p_{i,j}$ are $p_{i+1,j}$, $p_{i,j+1}$, and $p_{i,j-1}$.
- For $i = \sqrt{P}$ and $1 < j < \sqrt{P}$, the three neighbors of $p_{i,j}$ are $p_{i-1,j}$, $p_{i,j+1}$, and $p_{i,j-1}$.
- For $j = 1$ and $1 < i < \sqrt{P}$, the three neighbors of $p_{i,j}$ are $p_{i+1,j}$, $p_{i-1,j}$, and $p_{i,j+1}$.
- For $j = \sqrt{P}$ and $1 < i < \sqrt{P}$, the three neighbors of $p_{i,j}$ are $p_{i+1,j}$, $p_{i-1,j}$, and $p_{i,j-1}$.
- For $i = 1$ and $j = 1$, the two neighbors of $p_{i,j}$ are $p_{i+1,j}$ and $p_{i,j+1}$.
- For $i = 1$ and $j = \sqrt{P}$, the two neighbors of $p_{i,j}$ are $p_{i+1,j}$ and $p_{i,j-1}$.
- For $i = \sqrt{P}$ and $j = 1$, the two neighbors of $p_{i,j}$ are $p_{i-1,j}$ and $p_{i,j+1}$.
- For $i = \sqrt{P}$ and $j = \sqrt{P}$, the two neighbors of $p_{i,j}$ are $p_{i-1,j}$ and $p_{i,j-1}$.

Communication between neighbor processors requires exactly 1 time step. By this, we mean that if a processor transmits a message to its neighbor processor at the beginning of time $x$, its neighbor processor will receive the message at the beginning of time $x + 1$. Moreover, we assume that processors can be receiving [transmitting] a message while performing a local operation.

3. Odd-Even Cyclic Reduction Method

The problem: Given $M$, which is an $N \times N$ tridiagonal matrix, and vectors $b_1, b_2, \ldots, b_R$, each of size $N$. For each $s$ ($1 \le s \le R$), solve for $x_s$ where $M x_s = b_s$. We assume that $b_1, b_2, \ldots, b_R$ can be stored in one matrix $B$ of size $N \times R$ where the $s$-th column of $B$ represents vector $b_s$. We make an analogous assumption for $x_1, x_2, \ldots, x_R$. Moreover, we assume for the sake of simplicity that $1 < P = 2^{2k} \le NR$ where $k$ is an integer and $R = 2^r$.

An algorithm is simply a set of arithmetic operations such that each processor is assigned a sequential list of these operations. An initial assignment of data to the processors is called a data layout. A list of message transmissions and receptions between processors is called a communication pattern. These three components (algorithm, data layout, communication pattern) are needed in order to determine running time.

Fig. 1. Row dependency for odd-even cyclic reduction for $N = 15$ and $R = 1$.

Odd-even cyclic reduction [2, 3, 7] is a recursive method for solving tridiagonal systems of size $N = 2^n - 1$. This method is divided into two parts: reduction and back substitution. The first step of reduction is to remove each odd-indexed $x_{s,i}$ and create a tridiagonal system of size $2^{n-1} - 1$. We then do the same to this new system and continue in the same manner until we are left with a system of size 1. This requires $n$ phases. We refer to the tridiagonal matrix of phase $j$ as $M^j$ and the vector as $b^j_s$. The original $M$ and $b_s$ are denoted $M^0$ and $b^0_s$. The three non-zero items of each row $i$ in $M^j$ are denoted $l^j_i$, $m^j_i$, $r^j_i$ (left, middle, right). Below is the list of operations needed to determine the items of row $i$ in matrix $M^j$ (which we refer to as $M^j_i$):

$$
\begin{aligned}
e^j_i &= -\frac{l^{j-1}_i}{m^{j-1}_{i-2^{j-1}}}, \qquad
f^j_i = -\frac{r^{j-1}_i}{m^{j-1}_{i+2^{j-1}}}, \\
l^j_i &= e^j_i \, l^{j-1}_{i-2^{j-1}}, \qquad
r^j_i = f^j_i \, r^{j-1}_{i+2^{j-1}}, \\
m^j_i &= m^{j-1}_i + e^j_i \, r^{j-1}_{i-2^{j-1}} + f^j_i \, l^{j-1}_{i+2^{j-1}}, \\
b^j_{s,i} &= b^{j-1}_{s,i} + e^j_i \, b^{j-1}_{s,i-2^{j-1}} + f^j_i \, b^{j-1}_{s,i+2^{j-1}}.
\end{aligned}
$$

The results in this paper assume that determining the values of each $e^j_i$, $f^j_i$, $l^j_i$, $r^j_i$, $m^j_i$ or $b^j_{s,i}$ is atomic, i.e. if a processor is computing, for example, $e^j_i$, then the processor must compute all the operations needed to satisfy the equation for $e^j_i$ given above.

Clearly, each system is dependent on the previous systems. The lower half of Figure 1 shows the dependency between all the $M^j$'s in the reduction phase. The nodes in level $j$ of the figure represent the rows of $M^j$. A directed path from node $a$ in level $k$ to node $b$ in level $j$ implies that row $a$ in $M^k$ is needed in order to determine the entries for row $b$ in $M^j$. In fact, Figure 1 also pinpoints the dependencies between all the $b^j_{s,i}$'s.
Since there are no dependencies between some $b^j_{s_1,i}$ and some $b^y_{s_2,z}$ (where $s_1 \ne s_2$), the only dependencies remaining are those between the $M^j$'s and the $b^y_s$'s. From the definition of the problem, we see that for each $s$, $b^j_{s,i}$ is dependent on $M^j_i$.
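
The reduction operations listed above can be sketched in a few lines of Python. This is only an illustrative serial sketch, not one of the parallel algorithms of this paper: the function names `reduce_phase`, `reduce_to_single_equation` and `demo` are hypothetical, rows are stored in 1-based lists, and exact `Fraction` arithmetic is assumed so that the check at the end is not affected by rounding. Each phase with stride `h` equal to $2^{j-1}$ rewrites the rows whose index is a multiple of `2*h`, using the rows at distance `h` on either side, for the matrix once and for every right-hand side vector.

```python
from fractions import Fraction


def reduce_phase(l, m, r, B, h):
    """One reduction phase: update every row i that is a multiple of 2*h.

    l, m, r are 1-based lists (index 0 unused) with the left, middle and
    right nonzero entries of each row; B is a list of 1-based right-hand
    side vectors. Row i is combined with rows i - h and i + h, which
    eliminates the unknowns x_{i-h} and x_{i+h} from equation i.
    """
    n_rows = len(m) - 1
    for i in range(2 * h, n_rows + 1, 2 * h):
        e = -l[i] / m[i - h]
        f = -r[i] / m[i + h]
        m[i] = m[i] + e * r[i - h] + f * l[i + h]
        l[i] = e * l[i - h]
        r[i] = f * r[i + h]
        for b in B:  # the same e and f are reused for every vector
            b[i] = b[i] + e * b[i - h] + f * b[i + h]


def reduce_to_single_equation(l, m, r, B):
    """Run all reduction phases; return the index of the last remaining row."""
    n_rows = len(m) - 1
    h = 1
    while 2 * h <= n_rows:
        reduce_phase(l, m, r, B, h)
        h *= 2
    return h  # equals 2^(n-1) when N = 2^n - 1


def demo():
    """Build a small system with known solutions and reduce it."""
    N, R = 7, 2
    l = [Fraction(0)] * 2 + [Fraction(1)] * (N - 1)   # l[1] = 0
    m = [Fraction(0)] + [Fraction(4)] * N
    r = [Fraction(0)] + [Fraction(1)] * (N - 1) + [Fraction(0)]  # r[N] = 0
    # Known solutions, padded so that x[0] and x[N + 1] exist.
    X = [[Fraction(s + i) for i in range(N + 2)] for s in range(R)]
    B = [[Fraction(0)] + [l[i] * x[i - 1] + m[i] * x[i] + r[i] * x[i + 1]
                          for i in range(1, N + 1)] for x in X]
    mid = reduce_to_single_equation(l, m, r, B)
    solved = [b[mid] / m[mid] for b in B]
    expected = [x[mid] for x in X]
    return mid, solved, expected
```

Calling `demo()` reduces a system of size 7 with two right-hand sides to the single equation for the middle unknown, whose value can then be compared with the solution used to build the right-hand sides.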

In this paper, for simplicity, we assume that if an algorithm employs odd-even cyclic reduction, a processor computes the items of whole rows of a matrix (i.e. the three non-zero data items).

The back substitution phase is initiated after the system of one equation has been determined. We recursively determine the values of the $x_{s,i}$'s. The first operation is

$$x_{s,2^{n-1}} = \frac{b^{n-1}_{s,2^{n-1}}}{m^{n-1}_{2^{n-1}}}.$$

For the remaining variables $x_{s,i}$, let $j$ denote the last phase in the reduction step before $x_{s,i}$ is removed, then

$$x_{s,i} = \frac{b^{j-1}_{s,i} - l^{j-1}_i \, x_{s,i-2^{j-1}} - r^{j-1}_i \, x_{s,i+2^{j-1}}}{m^{j-1}_i}.$$

Again, we are assuming the operations to compute the $x_{s,i}$'s are atomic for $R = 1$. However, for the case of $R > 1$, we divide the operations for $x_{s,i}$ into the following atomic operations:

$$temp_i = \frac{l^{j-1}_i}{m^{j-1}_i} \, x_{s,i-2^{j-1}} + \frac{r^{j-1}_i}{m^{j-1}_i} \, x_{s,i+2^{j-1}}$$

and

$$x_{s,i} = \frac{b^{j-1}_{s,i}}{m^{j-1}_i} - temp_i.$$

Returning to Figure 1, we see that the back substitution is represented by the top half of the graph. In essence, the entire tridiagonal system's dependency graph is $R + 1$ copies of the graph in Figure 1 (one for the matrix $M$, and the remaining for the $R$ vectors) with some additional edges. Clearly, for any two vectors of $b$, their respective dependency graphs will not have any edges between them. In fact the only edges to be added are from the sub-dependency graph of $M$ to every vector of $b$. In other words, for all vectors $b_s$, each node in the $M$ sub-dependency graph will have an edge to the mirror node in the sub-dependency graph of $b_s$.

The serial complexity of this method is denoted by $S(N, R, P)$ where $P = 1$. $R = 1$ is a special case, i.e.

$$S(N, 1, 1) = 19N - 14\log(N + 1) - 4.$$

For $R > 1$,

$$S(N, R, 1) = 15N - 10n - 5 + R(6N - 4n - 1).$$

In Section 4, we derive several important lower bounds on running time for odd-even cyclic reduction algorithms on 2-D mesh networks.
Specifically, we derive general lower bounds independent of data layout, lower bounds which are based on categorization of data layouts, and lower bounds on running time for common data layouts for this problem. In Section 5.6 we present optimal algorithm running times on 2-D mesh topologies.

4. Lower bounds for Odd-Even Cyclic Reduction

Definition 4.1. The class of all communication patterns is denoted by $\mathcal{C}$. The class of all data layouts is denoted by $\mathcal{D}$. A data layout $D$ is said to be a single-item layout if each nonzero matrix-item is initially assigned to a unique processor.

In the following sections we shall provide lower bounds based on certain types of data layouts as well as general lower bounds. The lower bounds hold regardless of the choice of communication pattern.

A General Lower Bound for Odd-Even Cyclic Reduction

In this subsection we assume the data layout is the one in which each processor has a copy of every non-zero entry of $M^0$ and $b^0_s$. We denote this layout by $D$. Since $D$ is the best data layout possible, a lower bound based on this data layout will provide a general lower bound.

Remark 4.1. We will denote the (parallel) computation lower bound for odd-even cyclic reduction of size $N$ and $R$ and utilizing $P$ processors by $S(N, R, P)$.

Remark 4.2. Let $A$ be an odd-even cyclic reduction algorithm. A computation lower bound for $A$, given $P$ processors and assuming none of the processors are totally idle, is

$$S(N, R, P) = \Omega\!\left(\frac{NR}{P} + \log N\right).$$

Theorem 4.1. Let $A$ be an odd-even cyclic reduction algorithm. The following is a lower bound for $A$ regardless of data layout and communication pattern:

$$
\begin{cases}
\max\!\left(S(N, R, P),\ (P/R)^{1/2}\right) = \Omega\!\left(\dfrac{NR}{P} + \log N + (P/R)^{1/2}\right) & \text{if } P/R \le N^{2/3}, \\[2ex]
\Omega\!\left(N^{1/3}\right) & \text{otherwise.}
\end{cases}
$$

Proof. We will begin by assuming that all processors will be used in computation and/or communication. Clearly, $S(N, R, P)$ is a lower bound for the problem. Furthermore, we see that there are groups of processors which work together in order to solve the items in the dependency graph of $M$ or in any of the $b_s$'s graphs. Denote these (not necessarily disjoint) groups of processors by $M_1, M_2, \ldots, M_z$. Let $P' = \max_{i \le z} |M_i|$ ($\le \min(N, P)$). Since these processors must ultimately produce only one row of $M$ in the back substitution phase, it is clear there is some type of all-to-one broadcast of $P'$ processors in order to complete the back-substitution phase of $M$ and/or $b_s$. Thus, on a two-dimensional mesh, this will require communication of at least $\sqrt{P'}$. Moreover, this will also require computation of at least $N/P'$. Therefore, a lower bound, assuming all processors are used, is $\max(S(N, R, P),\ N/P',\ \sqrt{P'})$. This leads to the fact that $P' = \max(1, P/R)$ in order to minimize the lower bound. Furthermore, since $N/P'$ and $\sqrt{P'}$ balance at $P' = N^{2/3}$, we see that using more than $\Omega(N^{2/3} R)$ processors will reduce efficiency; i.e. the threshold for processor utilization is $\Omega(N^{2/3} R)$.

In Section 5.6 we will provide optimal algorithms, i.e. the running times are within a small constant factor of the lower bounds. Therefore, we note that our results show that when $P/R$ is sufficiently large, i.e. $P/R = \Omega(N^{2/3})$, using more than $N^{2/3} R$ processors will not lead to any substantial improvements in running time if utilizing data layout $D$.

Lower Bounds for $\mathcal{O}$ on $c$-data layouts

Many algorithms designed for solving tridiagonal linear systems assume that the data layout is single-item and that each processor is assigned roughly $\frac{1}{P}$th of the rows of $M^0$ and the items of $b$, where $P$ is the number of processors available. However, in order to determine whether skewness of the proportion of data has an effect on running time and, if so, exactly how much, we classify data layouts by initial assignment proportions to each processor. In other words, in this section, we consider single-item data layouts in which each processor is assigned at most a fraction $c$ of the rows of $M$ and items of $b$ where $\frac{1}{P} \le c \le \frac{NR + N - P + 1}{NR + N}$.

Definition 4.2. Consider $c$ where $\frac{1}{P} \le c < \frac{NR + N - P + 1}{NR + N}$. A data layout $D$ on $P$ processors is said to be a $c$-data layout if (a) $D$ is single-item, (b) no processor is assigned more than a fraction $c$ of the rows of $M^0$ or items of $b$, (c) at least one processor is assigned exactly a fraction $c$ of the rows of $M^0$ or items of $b$, and (d) each processor is assigned at least one row of $M^0$ or item of $b$. Denote the class of $c$-data layouts by $\mathcal{D}(c)$.

Theorem 4.2. If $D \in \mathcal{D}(c)$, then for any $A \in \mathcal{O}$ for $R$ vectors, a lower bound is

$$\max\!\left(S(N, R, P),\ \frac{\sqrt{P}}{8},\ \frac{Nc}{6}\right) = \Omega\!\left(\frac{NR}{P} + \log N + \sqrt{P} + Nc\right).$$

Proof. The computational lower bound must also be a lower bound for the problem regardless of communication. Thus $S(N, R, P)$ is a lower bound. Moreover, since at least one processor is assigned $Nc$ pieces of data, this processor (denoted as $p'$) can either choose to use a data item in a binary arithmetic operation or send out the item to another processor. Therefore, it is clear that $\frac{Nc}{6}$ must also be a lower bound.

We will now describe the argument for why $\frac{\sqrt{P}}{8}$ must also be a lower bound. We will use a contradiction argument. Consider any two data items from $M$ and/or $B$. These two items are said to be "related" if they are both required by some computation in order to provide final solutions. Clearly, any initial data item of $M$, i.e. of $M^0$, will be related to any other initial item of $M$. Suppose $\frac{\sqrt{P}}{8}$ is not a lower bound. This implies that no two processors which are both assigned initial items of $M$ can be at a distance of $\frac{\sqrt{P}}{4}$ or more from each other on the 2-D mesh. Furthermore, consider a processor $p$ which computes $M^{n-1}$. Processor $p$ will require data from all of the items of $M^0$. Clearly, only processors which are within a distance of at most $\frac{\sqrt{P}}{8} - 1$ from $p$ can be assigned items from $M^0$. Furthermore, expanding the distance from $p$ from $\frac{\sqrt{P}}{8} - 1$ to $\frac{\sqrt{P}}{4} - 2$ will now contain all the processors which will utilize any item from $M^0$ for computation. Finally, expanding the distance from $p$ to $\frac{3\sqrt{P}}{8} - 3$ will represent all the processors that are assigned data items from not only $M^0$ but also items of $B^0$ which are related to any item in $M^0$. Every processor is assigned at least one initial piece of data and the layout is single-item. Moreover, every item in $B^0$ is related to at least one item of $M^0$. Thus, every processor assigned an item of $B^0$ or $M^0$ must be inside this region. However, the region does not include all the processors in the mesh. Therefore, this is a contradiction.

Analyzing the result given in the above theorem, we see that $\frac{1}{P}$-data layouts will result in the best lower bounds for this class of data layouts, i.e.:

Corollary 4.1. If $D \in \mathcal{D}(\frac{1}{P})$, then for any $A \in \mathcal{O}$ for $R$ vectors, a lower bound is

$$\max\!\left(S(N, R, P),\ \frac{\sqrt{P}}{8}\right) = \Omega\!\left(\frac{NR}{P} + \log N + \sqrt{P}\right).$$

Comparing results against the general lower bound, we see that for sufficiently large $N$ (i.e. $N \gg P$, or more precisely $P \le (NR)^{2/3}$), the lower bound for $\frac{1}{P}$-data layouts is asymptotically equal to the general lower bound. The complexity of any algorithm $A \in \mathcal{O}$ using a $\frac{1}{P}$-data layout is $\Omega(\log N + \frac{NR}{P} + \sqrt{P})$.
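
The comparison between the two bounds can be checked numerically with a small sketch. It assumes that the general bound reads $NR/P + \log N + \sqrt{P/R}$ for $P/R \le N^{2/3}$ and $N^{1/3}$ otherwise, and that the bound for $\frac{1}{P}$-data layouts reads $NR/P + \log N + \sqrt{P}$; hidden constants are ignored, base-2 logarithms are an arbitrary choice, and the function names are hypothetical. When $P \le (NR)^{2/3}$ the extra term $\sqrt{P}$ is at most $NR/P$, so the single-item bound is at most twice the general one.

```python
import math


def general_bound(N, R, P):
    """Terms of the general lower bound, constants ignored."""
    per_vector = P / R
    if per_vector <= N ** (2 / 3):
        return N * R / P + math.log2(N) + math.sqrt(per_vector)
    return N ** (1 / 3)


def single_item_bound(N, R, P):
    """Terms of the 1/P-data-layout lower bound, constants ignored."""
    return N * R / P + math.log2(N) + math.sqrt(P)


def within_factor_two(N, R, P):
    """True if the single-item bound is at most twice the general bound."""
    return single_item_bound(N, R, P) <= 2 * general_bound(N, R, P)
```

For example, `within_factor_two(4095, 4, 64)` holds because $64 \le (4095 \cdot 4)^{2/3}$, while `within_factor_two(4095, 16, 65536)` fails because the $\sqrt{P}$ term dominates once $P$ exceeds $(NR)^{2/3}$.
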
In Section 5.6, we present algorithms that are within a small constant factor of the bounds presented in this and previous subsections.

5. Optimal Algorithms, Data Layouts, and Communication Schedules

In this section, we design the algorithms which will produce running times which are within a small constant factor of the lower bounds derived in the last section. We begin by discussing the data layouts we plan to use. We follow this with a discussion of some of the communication schedules (mostly dealing with distribution/broadcast of the data to processors that require this information for computation). Lastly, we present the optimal algorithms along with their running times. We will discuss under which circumstances one algorithm should be used over another in order to achieve optimal or near-optimal running times.

Definitions of Data Layouts

In the previous sections, we have derived various lower bounds on running time for tridiagonal solvers which utilize odd-even cyclic reduction on a 2-dimensional mesh interconnection topology. In order to design algorithms which are within a constant factor of these lower bounds, we must determine which types of data layouts we will use. In this section, we present definitions of data layouts in order to use these layouts in the next subsections in our algorithm designs. We would like to point out that while we can easily use the best data layout $D$ to obtain the general lower bounds, in this section we will present data layouts which are not so powerful but which still achieve the lower bounds stated. In fact, in the cases where $N \gg P$, no processor is assigned more than $O(\frac{NR}{P})$ data items. We begin by presenting partial layouts and, in each subsection, we discuss a data layout we will utilize in our optimal algorithms.

Definition 5.1. Let $P$ be an integer where $1 \le P \le N$ and $2^i - 1 \le P < 2^{i+1} - 1$.
A data layout on $P$ processors, $p_1, \ldots, p_P$, is an $M$-replicative-blocked data layout if for all $j \le 2^i - 1$, $p_j$ is assigned the nonzero items in rows $(j-1)2^{n-i} + 1$ to $(j+1)2^{n-i} - 1$ of $M^0$. We denote this layout by $D_M$.
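
The row ranges in Definition 5.1 can be tabulated with a short sketch. It assumes that $j$ runs over $1, \ldots, 2^i - 1$ and that $N = 2^n - 1$; the function name is hypothetical. Neighboring processors receive overlapping ranges, which is what makes the layout replicative.

```python
def replicative_blocked_rows(n, i):
    """Map processor index j to the (first, last) rows of M^0 it holds.

    Assumes N = 2**n - 1 rows and processors j = 1 .. 2**i - 1, with
    processor j holding rows (j-1)*2**(n-i) + 1 .. (j+1)*2**(n-i) - 1.
    """
    block = 2 ** (n - i)
    return {j: ((j - 1) * block + 1, (j + 1) * block - 1)
            for j in range(1, 2 ** i)}
```

With `n = 4` and `i = 2` (15 rows, three processors) the ranges are rows 1 to 7, 5 to 11, and 9 to 15, so every row is held by at least one processor and the last range ends at row $N$.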

Definition 5.2. Let $P$ be an integer where $1 \le P \le N$ and $2^i - 1 \le P < 2^{i+1} - 1$. A data layout on $P$ processors, $p_1, \ldots, p_P$, is a $b_s$-replicative-blocked data layout if for all $j \le 2^i - 1$, $p_j$ is assigned $b_{s,l}$ where $(j-1)2^{n-i} + 1 \le l \le (j+1)2^{n-i} - 1$. We denote this layout by $D_{b_s}$.

Definition 5.3. Let $P$ be an integer where $1 \le P \le N$ and $P = 2^i$. A data layout on $P$ processors, $p_1, \ldots, p_P$, is an $M$-single-item-blocked data layout if (a) for all $j < P - 1$, $p_j$ is assigned the nonzero items in rows $(j-1)\frac{N}{P-1} + 1$ to $j\frac{N}{P-1}$ of $M^0$, (b) $p_{P-1}$ is assigned the nonzero items in rows $(P-2)\frac{N}{P-1} + 1$ to $(P-1)\frac{N}{P-1} - 1$ of $M^0$, and (c) $p_P$ is assigned the nonzero items in row $N$ of $M^0$. We denote this layout by $D'_M$.

Definition 5.4. Let $P$ be an integer where $1 \le P \le N$ and $2^i - 1 \le P < 2^{i+1} - 1$. A data layout on $P$ processors, $p_1, \ldots, p_P$, is a $b_s$-single-item-blocked data layout if (a) for all $j < P - 1$, $p_j$ is assigned the vector items $(j-1)\frac{N}{P-1} + 1$ to $j\frac{N}{P-1}$ of $b_s$, (b) $p_{P-1}$ is assigned the vector items $(P-2)\frac{N}{P-1} + 1$ to $(P-1)\frac{N}{P-1} - 1$ of $b_s$, and (c) $p_P$ is assigned the vector item $N$ of $b_s$. We denote this layout by $D'_{b_s}$.

Data Layout 1

If $P/R \ge N^{2/3}$ then we will only use $2^{2\lfloor (n-1)/3 \rfloor} R$ processors, and, in this case, we will simply assume $P/R = 2^{2\lfloor (n-1)/3 \rfloor}$. This data layout will be utilized in order to achieve the general lower bounds when $R \le P$. We begin by partitioning the processors into $R$ groups of $\frac{P}{R} = 2^{2k-r}$ processors. For the sake of simplicity, we will assume that $r$ is even. These groups will be referred to as $P_{i,j}$ where $1 \le i, j \le \sqrt{R}$. The processors assigned to group $P_{s,t}$ are the $p_{i,j}$ where $(s-1)2^{k-\frac{r}{2}} + 1 \le i \le s\,2^{k-\frac{r}{2}}$ and $(t-1)2^{k-\frac{r}{2}} + 1 \le j \le t\,2^{k-\frac{r}{2}}$. We will create a new notation for the mesh processors in order to utilize the partial data layouts defined in this section.
Consider the processors in group $P_{s,t}$. These processors will be denoted by $p^{s,t}_1, p^{s,t}_2, \ldots, p^{s,t}_{P/R}$, where $p_{(s-1)2^{k-\frac{r}{2}}+i,\ (t-1)2^{k-\frac{r}{2}}+j}$ is denoted by
- $p^{s,t}_{(i-1)(P/R)^{1/2}+j}$ if $i$ is odd
- $p^{s,t}_{(i-1)(P/R)^{1/2}+(P/R)^{1/2}-j+1}$ if $i$ is even

For each group $P_{s,t}$, we will allocate the items of $M$ using data layout $D_M$. Moreover, this group of processors will be allocated vector $b_{(s-1)\sqrt{R}+t}$ using data layout $D_{b_{(s-1)\sqrt{R}+t}}$.

Data Layout 2

This data layout will be utilized in order to achieve the general lower bounds when $R > P$. We begin by partitioning the assignment of the $b_s$ vectors to the processors. Each processor will be assigned $\frac{R}{P}$ whole vectors. In particular $p_{i,j}$ will be assigned the items in vectors $b_l$ where

$$(i-1)\frac{R}{\sqrt{P}} + (j-1)\frac{R}{P} + 1 \ \le\ l\ \le\ (i-1)\frac{R}{\sqrt{P}} + j\,\frac{R}{P}.$$

Moreover, every processor will be assigned the items in matrix $M$.

Data Layout 3

This is a single-item data layout and will be utilized in order to achieve the single-item data layout lower bounds when $P < N$. We will create a new notation for the mesh processors in order to utilize the partial data layouts defined in this section. The processors will be denoted by $p_1, p_2, \ldots, p_P$ where $p_{i,j}$ is denoted by
- $p_{(i-1)\sqrt{P}+j}$ if $i$ is odd
- $p_{(i-1)\sqrt{P}+\sqrt{P}-j+1}$ if $i$ is even

We will allocate the items of $M$ using data layout $D'_M$. Moreover, the processors will be allocated each vector $b_s$ using data layout $D'_{b_s}$. Clearly, this is a $c$-data layout where $c$ is constant.
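
The renumbering used in Data Layout 3 follows the mesh rows in alternating directions, so that processors with consecutive indices are always mesh neighbors. The sketch below illustrates this; the names `snake_index` and `consecutive_are_neighbors` are hypothetical and `side` stands for $\sqrt{P}$.

```python
def snake_index(i, j, side):
    """Linear index of mesh processor (i, j), 1-based, on a side x side mesh.

    Odd rows are numbered left to right, even rows right to left.
    """
    if i % 2 == 1:
        return (i - 1) * side + j
    return (i - 1) * side + side - j + 1


def consecutive_are_neighbors(side):
    """Check that indices t and t + 1 always sit on adjacent mesh processors."""
    position = {snake_index(i, j, side): (i, j)
                for i in range(1, side + 1) for j in range(1, side + 1)}
    for t in range(1, side * side):
        (i1, j1), (i2, j2) = position[t], position[t + 1]
        if abs(i1 - i2) + abs(j1 - j2) != 1:
            return False
    return True
```

On a $4 \times 4$ mesh, for instance, processor $p_{1,4}$ receives index 4 and the processor directly below it, $p_{2,4}$, receives index 5, which is why neighbor-to-neighbor communication in this ordering costs a single time step.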

Data Layout 4

This is a single-item data layout and will be utilized in order to achieve the single-item data layout lower bounds when $P > N$ and $R \le P$. We begin by partitioning the processors into $R$ groups of $\frac{P}{R} = 2^{2k-r}$ processors. These groups will be referred to as $P_i$ where $1 \le i \le R$.

If $R \le \sqrt{P}$ then the processors assigned to group $P_l$ are the $p_{i,j}$ where $(l-1)2^{k-r} + 1 \le i \le l\,2^{k-r}$ and $1 \le j \le \sqrt{P}$. We will create a new notation for the mesh processors in order to utilize the partial data layouts defined in this section. Consider the processors in group $P_l$. These processors will be denoted by $p^l_1, p^l_2, \ldots, p^l_{P/R}$ where $p_{(l-1)2^{k-r}+i,\ j}$ is denoted by
- $p^l_{(i-1)\sqrt{P}+j}$ if $i$ is odd
- $p^l_{(i-1)\sqrt{P}+\sqrt{P}-j+1}$ if $i$ is even

However, if $R > \sqrt{P}$ then the processors assigned to group $P_l$, where $l = x\frac{R}{\sqrt{P}} + y$, are the $p_{i,j}$ where $i = x + 1$ and $(y-1)\frac{P}{R} + 1 \le j \le y\frac{P}{R}$. We will create a new notation for the mesh processors in order to utilize the partial data layouts defined in this section. Consider the processors in group $P_l$. These processors will be denoted by $p^l_1, p^l_2, \ldots, p^l_{P/R}$ where $p_{(l-1)2^{k-r}+i,\ j}$ is denoted by $p^l_{(i-1)\sqrt{P}+j}$.

Regardless of the size of $R$, for group $P_1$, we will allocate the items of $M$ using data layout $D'_M$. Moreover, each group $P_l$ will be allocated vector $b_l$ using data layout $D'_{b_l}$. Clearly, this is a $c$-data layout where $c$ is constant.

Data Layout 5

This is a single-item data layout and will be utilized in order to achieve the single-item data layout lower bounds when $R > P > N$. We begin by partitioning the assignment of the $b_s$ vectors to the processors. Each processor will be assigned $\frac{R}{P}$ full vectors. In particular $p_{i,j}$ will be assigned the items in vectors $b_l$ where

$$(i-1)\frac{R}{\sqrt{P}} + (j-1)\frac{R}{P} + 1 \ \le\ l\ \le\ (i-1)\frac{R}{\sqrt{P}} + j\,\frac{R}{P}.$$

Moreover, processor $p_{1,1}$ will be assigned the items in matrix $M$. Clearly, this is a $c$-data layout where $c$ is constant.

The data layouts presented in this subsection will be used to design optimal or near-optimal algorithms for solving tridiagonal systems on a 2-dimensional mesh.
We again note that while we could have simply assumed the best data layout for designing our algorithms to achieve the general lower bounds, we instead have listed data layouts in which no processor has been assigned more than $O(\frac{NR}{P})$ initial data items. Furthermore, we have presented several single-item data layouts which we will be using in our optimal algorithm designs in later subsections. In the following subsection, we present some communication schedules which we will use in our algorithm designs. These schedules are used to address data redistribution.

Communication Schedules

For some of the algorithms we will design in the next subsection, it is important that we are able to redistribute the data from one layout to another. In this section, we present communication schedules which do the needed redistributions.

Schedule 1

In this communication schedule, we are redistributing data layout 3 into the layout in which the matrix $M$ is distributed using $D_M$ and each $b_s$ is distributed using $D_{b_s}$. Clearly, only neighbor processors (as ordered in the data layout discussion for layout 3) need to communicate. This would be accomplished by running the following:

For all $j \le P - 2$, processors $p_j$ do in parallel
    send all initially assigned data items to processor $p_{j+1}$
For all $j \ge 2$, processors $p_j$ do in parallel
    send all initially assigned data items to processor $p_{j-1}$

Running Time: $O(\frac{NR}{P})$

Schedule 2

In this communication schedule, we are redistributing data layout 4 into the layout in which the matrix $M$ is distributed to each group using layout $D_M$. Moreover, we also wish to redistribute the $b$ vectors such that instead of $D'_b$ the distribution is $D_b$. Two slightly different schedules for this redistribution are needed: one for $R \le \sqrt{P}$ and one for $R > \sqrt{P}$. We will discuss the schedule for $R \le \sqrt{P}$. The redistribution schedule for the other case is very similar and, for brevity, we will omit its discussion.

Consider $R \le \sqrt{P}$. We begin by redistributing so that each group is assigned the matrix $M$ using $D'_M$. Clearly, only the processors above and below a given processor (on the mesh) need the same data, so only they need to communicate. Collecting the data items of $M$ in each column to the first processor in the column and then sending out the information to every other processor in the column would effectively accomplish the first step of the redistribution. The remaining redistributions can be easily taken care of using Schedule 1 above. This would be accomplished by running the following:

For all $j \le \sqrt{P}$ do in parallel
    For all $2 < i$, every processor $p_{i,j}$ do in parallel
        send all initial data items of $M$ to $p_{1,j}$ (pipelined broadcast)
    $p_{1,j}$ do a one-to-all (pipelined) broadcast of the items of $M$ to processors $p_{2,j}, \ldots, p_{\sqrt{P},j}$
For each group $P_l$ do in parallel
    Run schedule 1 for the assigned vector
    Run schedule 1 for $M$

Running Time: $O(\frac{NR}{P} + \sqrt{P})$

Schedule 3

In this communication schedule, we are redistributing data layout 5 into the layout in which the matrix $M$ is distributed to every processor. This is simply a one-to-all broadcast of $3(N - 1)$ items. Clearly, this can be accomplished in $O(N - 1 + \sqrt{P})$ steps.
Due to the assumptions made in layout 5, the running time is $O(\frac{NR}{P} + \sqrt{P} + \log N)$.

The Optimal Algorithms

In the previous subsections, we have discussed the data layouts and some communication redistribution schedules we plan to use in our algorithms. In this section, we present the algorithms which will produce running times within at most a small constant factor of the lower bounds presented in the previous section.

SubAlgorithm 1 for Best Data Layout

This algorithm solves a tridiagonal system by having each processor sequentially compute values based on values it has already computed itself rather than on values sent from another processor. This subalgorithm will be a building block for the full algorithms. Assume $y$ of the vectors of $b$ are assigned to this group of $P' = 2^q$ processors. Moreover, assume the processors are ordered using the ordering given in data layout 3. The items of row $i$ of level $j$ are considered to be $l^j_i$, $r^j_i$, $m^j_i$ and $b^j_{s,i}$ (of all $y$ vectors). Lastly, we will utilize only $P' - 1$ processors.

SubAlgorithm 1 for Data Layout $D$

Every processor $p_i$ ($1 \le i < 2^q$) do in parallel:
    for $s = 1$ to $n - q$ do
        for $z = 2^{n-q-s}(i-1) + 1$ to $2^{n-q-s}(i+1) - 1$ do
            compute (serially) the items of row $z$ of level $s$
    send items of row $i$ of level $n - q$ to $p_{i+1}$ and receive from $p_{i-1}$
        /* if $i = P' - 1$ do not send, and if $i = 1$ do not receive */
    send items of row $i$ of level $n - q$ to $p_{i-1}$ and receive from $p_{i+1}$
        /* if $i = 2^q - 1$ do not receive, and if $i = 1$ do not send */
    for $s = 1$ to $q - 1$ do
        if $i / 2^s$ is an integer then
            compute the items of row $2^{-s} i$ of level $n - q + s$
            if $s \le \frac{1}{2} \log P'$ then
                send items of row $i 2^{-s}$ of level $n - q + s$ to $p_{i+2^s}$ routed through the path $(p_{i+1}, p_{i+2}, \ldots, p_{i+k}, p_{i+k+1}, \ldots, p_{i+2^s-1})$ and receive from $p_{i-2^s}$
                    /* if $i + 2^s > 2^q - 1$ do not send, and if $i - 2^s < 1$ do not receive */
                send items of row $i 2^{-s}$ of level $n - q + s$ to $p_{i-2^s}$ routed through the path $(p_{i-1}, p_{i-2}, \ldots, p_{i-k}, p_{i-k-1}, \ldots, p_{i-2^s+1})$ and receive from $p_{i+2^s}$
                    /* if $i + 2^s > 2^q - 1$ do not receive, and if $i - 2^s < 1$ do not send */
            else ($s = \frac{1}{2} \log P' + k$)
                send items of row $i 2^{-s}$ of level $n - q + s$ to $p_{i+2^s}$ routed through the path $(p_{i+\sqrt{P'}}, p_{i+2\sqrt{P'}}, \ldots, p_{i+j\sqrt{P'}}, p_{i+(j+1)\sqrt{P'}}, \ldots, p_{i+(2^k-1)\sqrt{P'}})$ and receive from $p_{i-2^s}$
                    /* if $i + 2^s > 2^q - 1$ do not send, and if $i - 2^s < 1$ do not receive */
                send items of row $i 2^{-s}$ of level $n - q + s$ to $p_{i-2^s}$ routed through the path $(p_{i-\sqrt{P'}}, p_{i-2\sqrt{P'}}, \ldots, p_{i-j\sqrt{P'}}, p_{i-(j+1)\sqrt{P'}}, \ldots, p_{i-(2^k-1)\sqrt{P'}})$ and receive from $p_{i+2^s}$
                    /* if $i + 2^s > 2^q - 1$ do not receive, and if $i - 2^s < 1$ do not send */
    For back substitution, if $p_i$ was the last processor to handle items of row $j$ then $p_i$ computes all $x_{s,j}$ where $b_s$ is assigned to $p_i$ (and, if necessary, sends and receives the appropriate values of all $x_{s,m}$)

Running Time: $O(\frac{Ny}{P'} + y \log P')$

5.7. Algorithm 1 for Data Layout 1

This data layout is used when $R \le P$. This algorithm simply solves the tridiagonal system with one vector in each group. We have partitioned the processors into $R$ groups. Since the best data layout actually is not needed for SubAlgorithm 1 and in fact our data layout is sufficient, we simply call SubAlgorithm 1 for each group in parallel with $P' = \frac{P}{R}$, $y = 1$. Therefore the total running time is

$$O\!\left(\frac{NR}{P} + \log\frac{P}{R} + \sqrt{\frac{P}{R}}\right) = O\!\left(\frac{NR}{P} + \log N + \sqrt{\frac{P}{R}}\right).$$

As we stated in the subsection defining Data Layout 1, if $P/R \ge N^{2/3}$, we simply assume $P/R = N^{2/3}$.
Therefore the running time is clearly O(N^{1/3}).

5.8. Algorithm 2 for Data Layout 2

This data layout is used when >. This algorithm simply solves the tridiagonal system sequentially with right-hand side vectors. We have partitioned the vectors into groups. Each processor, in parallel, serially solves its assigned system. Therefore the total running time is O( N +log N)=O(N +log N +q ).

5.9. Algorithm 3 for Data Layout 3

This algorithm first calls Schedule 1, then simply solves the tridiagonal system with vectors. We consider the case < N / log N. Since the best data layout is not actually needed for SubAlgorithm 1, and our data layout is in fact sufficient (after Schedule 1), we simply call SubAlgorithm 1 with 0 =, y =. Therefore the total running time is the time for Schedule 1 plus the time for SubAlgorithm 1: O( N + log N + ). Since < N / log N, the run-time is clearly O( N + + log N).

For the case where N / log N, SubAlgorithm 1 is easily modified so that the tasks in the last phases of reduction and the first phases of back substitution are replicated, so that at least half the processors are always working. This will produce a running time of O( N + log N + ).

5.10. Algorithm 4 for Data Layout 4

This algorithm first calls Schedule 2, then simply solves the tridiagonal system with one vector. We have partitioned the processors into groups. Since the best data layout is not actually needed for SubAlgorithm 1, and our data layout is in fact sufficient (after Schedule 2), we simply call SubAlgorithm 1 for each group in parallel with 0 =, y = 1. Therefore the total running time is O( N + log N + ).

5.11. Algorithm 5 for Data Layout 5

This algorithm first calls Schedule 3, then simply solves the tridiagonal system sequentially with right-hand side vectors. We have partitioned the vectors into groups. Each processor, in parallel, serially solves its assigned system. Therefore the total running time is O( N + log N + ).

In this section, we designed algorithms for each of the data layouts, which together cover all problem sizes. Analyzing the running times of our algorithms, we see that they are all within a small constant factor of the lower bounds provided. In particular, the algorithms for Data Layouts 1 and 2 satisfy the general lower bounds. The algorithms for the remaining data layouts all satisfy the single-item bounds. Moreover, analyzing the design of the algorithms, we see that achieving the single-item bounds simply requires redistributing a single-item layout to the layouts used to achieve the general lower bounds. Furthermore, no layout assigned any processor more than O( N ) initial data items.

6. Conclusion

In this paper we examined the problem of designing optimal tridiagonal solvers on 2-dimensional mesh interconnection networks with multiple right-hand sides using odd-even cyclic reduction.
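As a point of reference for the method named above, the following is a minimal sequential sketch of odd-even cyclic reduction applied to several right-hand sides at once. It is not the mesh algorithm of Section 5: there is no processor assignment or routing, and the function name `cyclic_reduction` and the list-of-rows data layout are illustrative assumptions rather than anything defined in the paper. Each recursive call plays the role of one reduction level, and the loop after the recursive call is the matching back-substitution step.

```python
def cyclic_reduction(a, b, c, d):
    """Solve one tridiagonal matrix against several right-hand sides.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]; a[0] and c[-1]
    are ignored. Each d[i] is a list with one entry per right-hand side.
    No pivoting is done, so a well-conditioned (for example diagonally
    dominant) matrix is assumed.
    """
    n = len(b)
    a = [0.0] + list(a[1:])    # no unknown to the left of row 0
    c = list(c[:-1]) + [0.0]   # no unknown to the right of the last row
    if n == 1:
        return [[v / b[0] for v in d[0]]]

    # Reduction: use rows i-1 and i+1 to eliminate x[i-1] and x[i+1] from
    # every odd-indexed row i, leaving a half-size tridiagonal system.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = [v - alpha * left for v, left in zip(d[i], d[i - 1])]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d = [v - gamma * right for v, right in zip(new_d, d[i + 1])]
        ra.append(-alpha * a[i - 1])
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    x_odd = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: odd-indexed unknowns are known, so each even-indexed
    # row now has a single unknown left.
    x = [None] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = x_odd[k]
    for i in range(0, n, 2):
        row = list(d[i])
        if i > 0:
            row = [v - a[i] * left for v, left in zip(row, x[i - 1])]
        if i + 1 < n:
            row = [v - c[i] * right for v, right in zip(row, x[i + 1])]
        x[i] = [v / b[i] for v in row]
    return x
```

The sketch handles any system size, not only sizes of the form 2^q - 1, by treating a missing right neighbour as a zero coefficient. Every right-hand side reuses the same multipliers `alpha` and `gamma`, which is why the extra vectors add work per row but no extra reduction levels.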
This is the first paper to have tackled and solved this problem asymptotically. We were able to derive asymptotic lower bounds on the execution time. Moreover, we were able to provide algorithms whose running times are within a small constant factor of these lower bounds. Specifically, we proved that the complexity of solving tridiagonal linear systems regardless of data layout (i.e. a general parallel lower bound on a 2-dimensional mesh) is Ω(N^{1/3}) when = Ω(N^{2/3}); otherwise the bound is Ω( N +log N+q ). This shows that utilizing more than Ω(N^{2/3}) processors will not result in any substantial run-time improvements. In fact, utilizing more processors may result in much slower run-times.

When we added the realistic assumption that the data layouts are single-item and the number of data items assigned to a processor is bounded, we derived lower bounds for classes of data layouts. The asymptotic lower bound is Ω( N + log N + ). We provided three types of data layouts which yield optimal running times. In other words, we showed that the best types of layouts in general are those in which either (a) each processor is assigned an equal number of rows of M 0 and b, (b) the processors are grouped into groups in which every group is assigned of the vectors, or (c) each processor is assigned whole vectors of b. Therefore, we see that skew in the proportion of data will significantly affect performance. Comparing the lower bounds for the c-layouts with the general lower bounds, we see that restricting the proportion of data items assigned to a processor to N does not result in a significantly higher complexity than assuming all processors have all the data items, for sufficiently large N (i.e. N^{2/3}).

Lastly, we showed that there are algorithms, data layouts, and communication patterns whose running times are within a constant factor of the lower bounds provided. This provides us with the Θ-bounds stated above. To achieve the general lower bound, i.e. the complexity for these methods regardless of data layout, we could have used D, the best data layout, i.e. the data layout in which every processor is assigned all the data items, but we did not. We presented two data layouts in which no processor was assigned more than O( N ) data items. Two types of algorithms produce optimal running times: (a) algorithms in which processors are partitioned into sets such that each set is in essence an independent tridiagonal system, or (b) algorithms in which each processor sequentially solves of the vectors.

As stated previously, three different data layouts were provided in order to design optimal algorithms for single-item layouts. The three types of algorithms are: (a) algorithms which in essence utilize block layouts and solve the tridiagonal system times, (b) algorithms in which processors are partitioned into sets such that each set is in essence an independent tridiagonal system, or (c) algorithms in which each processor sequentially solves of the vectors. For (b)-(c), the algorithms provided are very similar to those used to achieve the optimal running times regardless of data layout. In fact, the difference in run-time is primarily due to the need to redistribute data in order to run the optimal algorithms. Finally, it is worthwhile to observe that for sufficiently large N, the single-item lower bounds are asymptotic to the general lower bound.

References

[1] C. AMODIO AND N. MASTRONARDI, A parallel version of the cyclic reduction algorithm on a hypercube, Parallel Computing, 19.
[2] D. HELLER, A survey of parallel algorithms in numerical linear algebra, SIAM J. Numer. Anal., 29(4).
[3] A. W. HOCKNEY AND C. R. JESSHOPE, Parallel Computers, Adam-Hilger.
[4] S. L. JOHNSSON, Solving tridiagonal systems on ensemble architectures, SIAM J. Sci. Stat. Comput., 8.
[5] S. P. KUMAR, Solving tridiagonal systems on the butterfly parallel computer, International J. Supercomputer Applications, 3.
[6] S. LAKSHMIVARAHAN AND S. D. DHALL, A Lower Bound on the Communication Complexity for Solving Linear Tridiagonal Systems on Cube Architectures, In Hypercubes 1987.
[7] S. LAKSHMIVARAHAN AND S. D. DHALL, Analysis and Design of Parallel Algorithms: Arithmetic and Matrix Problems, McGraw-Hill.
[8] F. T. LEIGHTON, Introduction to Parallel Algorithms and Architectures: Arrays-Trees-Hypercubes, Morgan Kaufmann.
[9] E. E. SANTOS, Optimal Parallel Algorithms for Solving Tridiagonal Linear Systems, In Springer-Verlag Lecture Notes in Computer Science #1300 (Proceedings of Euro-Par).
[10] E. E. SANTOS, Optimal Tridiagonal Solvers on Mesh Interconnection Networks, In Springer-Verlag Lecture Notes in Computer Science #1557 (Proceedings of ACPC).
[11] E. E. SANTOS, Solving Tridiagonal Linear Systems on a Torus, Proceedings of the International Conference on Parallel and Distributed Processing and Techniques.

Received: May 15, 1999
Accepted in revised form: January 21, 2000

Contact address: Eunice E. Santos, Department of Electrical Engineering and Computer Science, 19 Memorial Drive West, Lehigh University, Bethlehem, PA. Phone: +1 (610) Fax: +1 (610) santos@eecs.lehigh.edu

EUNICE E. SANTOS is an Assistant Professor in the Department of Electrical Engineering and Computer Science at Lehigh University. She joined the faculty at Lehigh in August of 1995 after receiving a Ph.D. in Computer Science from the University of California, Berkeley in May of that year. Dr. Santos has also obtained B.S. and M.S. degrees in both Mathematics and Computer Science. Dr. Santos's research interests include parallel and distributed processing, algorithm design and analysis, parallel complexity, scientific and numerical computing, optimization, computational science, and evolutionary computing. She has over 30 technical publications in parallel processing alone. In 1996, Dr. Santos was awarded an NSF CAREER Grant in order to pursue her research interests.


The R-LRPD Test: Speculative Parallelization of Partially Parallel Loops

The R-LRPD Test: Speculative Parallelization of Partially Parallel Loops The R-LRPD Test: Seculative Parallelization of Partially Parallel Loos Francis Dang, Hao Yu, Lawrence Rauchwerger Det. of Comuter Science, Texas A&M University College Station, TX 778- {fhd,hy89,rwerger}@cs.tamu.edu

More information

A joint multi-layer planning algorithm for IP over flexible optical networks

A joint multi-layer planning algorithm for IP over flexible optical networks A joint multi-layer lanning algorithm for IP over flexible otical networks Vasileios Gkamas, Konstantinos Christodoulooulos, Emmanouel Varvarigos Comuter Engineering and Informatics Deartment, University

More information

CMSC 425: Lecture 16 Motion Planning: Basic Concepts

CMSC 425: Lecture 16 Motion Planning: Basic Concepts : Lecture 16 Motion lanning: Basic Concets eading: Today s material comes from various sources, including AI Game rogramming Wisdom 2 by S. abin and lanning Algorithms by S. M. LaValle (Chats. 4 and 5).

More information

A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism

A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism Erlin Yao, Mingyu Chen, Rui Wang, Wenli Zhang, Guangming Tan Key Laboratory of Comuter System and Architecture Institute

More information

Near-Optimal Routing Lookups with Bounded Worst Case Performance

Near-Optimal Routing Lookups with Bounded Worst Case Performance Near-Otimal Routing Lookus with Bounded Worst Case Performance Pankaj Guta Balaji Prabhakar Stehen Boyd Deartments of Electrical Engineering and Comuter Science Stanford University CA 9430 ankaj@stanfordedu

More information

Ad hoc networks beyond unit disk graphs

Ad hoc networks beyond unit disk graphs Wireless Netw (008) 14:715 79 DOI 10.1007/s1176-007-0045-6 Ad hoc networks beyond unit disk grahs Fabian Kuhn Æ Roger Wattenhofer Æ Aaron Zollinger Published online: 13 July 007 Ó Sringer Science+Business

More information

Source Coding and express these numbers in a binary system using M log

Source Coding and express these numbers in a binary system using M log Source Coding 30.1 Source Coding Introduction We have studied how to transmit digital bits over a radio channel. We also saw ways that we could code those bits to achieve error correction. Bandwidth is

More information

The Spatial Skyline Queries

The Spatial Skyline Queries The Satial Skyline Queries Mehdi Sharifzadeh Comuter Science Deartment University of Southern California Los Angeles, CA 90089-078 sharifza@usc.edu Cyrus Shahabi Comuter Science Deartment University of

More information

A GPU Heterogeneous Cluster Scheduling Model for Preventing Temperature Heat Island

A GPU Heterogeneous Cluster Scheduling Model for Preventing Temperature Heat Island A GPU Heterogeneous Cluster Scheduling Model for Preventing Temerature Heat Island Yun-Peng CAO 1,2,a and Hai-Feng WANG 1,2 1 School of Information Science and Engineering, Linyi University, Linyi Shandong,

More information

Building Better Nurse Scheduling Algorithms

Building Better Nurse Scheduling Algorithms Building Better Nurse Scheduling Algorithms Annals of Oerations Research, 128, 159-177, 2004. Dr Uwe Aickelin Dr Paul White School of Comuter Science University of the West of England University of Nottingham

More information

Matlab Virtual Reality Simulations for optimizations and rapid prototyping of flexible lines systems

Matlab Virtual Reality Simulations for optimizations and rapid prototyping of flexible lines systems Matlab Virtual Reality Simulations for otimizations and raid rototying of flexible lines systems VAMVU PETRE, BARBU CAMELIA, POP MARIA Deartment of Automation, Comuters, Electrical Engineering and Energetics

More information

Communication-Avoiding Parallel Algorithms for Solving Triangular Matrix Equations

Communication-Avoiding Parallel Algorithms for Solving Triangular Matrix Equations Research Collection Bachelor Thesis Communication-Avoiding Parallel Algorithms for Solving Triangular Matrix Equations Author(s): Wicky, Tobias Publication Date: 2015 Permanent Link: htts://doi.org/10.3929/ethz-a-010686133

More information

Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming

Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming Weimin Chen, Volker Turau TR-93-047 August, 1993 Abstract This aer introduces a new technique for source-to-source code

More information

Submission. Verifying Properties Using Sequential ATPG

Submission. Verifying Properties Using Sequential ATPG Verifying Proerties Using Sequential ATPG Jacob A. Abraham and Vivekananda M. Vedula Comuter Engineering Research Center The University of Texas at Austin Austin, TX 78712 jaa, vivek @cerc.utexas.edu Daniel

More information

A Novel Iris Segmentation Method for Hand-Held Capture Device

A Novel Iris Segmentation Method for Hand-Held Capture Device A Novel Iris Segmentation Method for Hand-Held Cature Device XiaoFu He and PengFei Shi Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200030, China {xfhe,

More information

Constrained Path Optimisation for Underground Mine Layout

Constrained Path Optimisation for Underground Mine Layout Constrained Path Otimisation for Underground Mine Layout M. Brazil P.A. Grossman D.H. Lee J.H. Rubinstein D.A. Thomas N.C. Wormald Abstract The major infrastructure comonent reuired to develo an underground

More information

Constrained Empty-Rectangle Delaunay Graphs

Constrained Empty-Rectangle Delaunay Graphs CCCG 2015, Kingston, Ontario, August 10 12, 2015 Constrained Emty-Rectangle Delaunay Grahs Prosenjit Bose Jean-Lou De Carufel André van Renssen Abstract Given an arbitrary convex shae C, a set P of oints

More information

The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing

The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing Mikael Taveniku 2,3, Anders Åhlander 1,3, Magnus Jonsson 1 and Bertil Svensson 1,2

More information

The Edge-flipping Distance of Triangulations

The Edge-flipping Distance of Triangulations The Edge-fliing Distance of Triangulations Sabine Hanke (Institut für Informatik, Universität Freiburg, Germany hanke@informatik.uni-freiburg.de) Thomas Ottmann (Institut für Informatik, Universität Freiburg,

More information

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Stehan Baumann, Kai-Uwe Sattler Databases and Information Systems Grou Technische Universität Ilmenau, Ilmenau, Germany

More information

Selecting Discriminative Binary Patterns for a Local Feature

Selecting Discriminative Binary Patterns for a Local Feature BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 15, No 3 Sofia 015 rint ISSN: 1311-970; Online ISSN: 1314-4081 DOI: 10.1515/cait-015-0044 Selecting Discriminative Binary atterns

More information

Classes of K-Regular Semi ring

Classes of K-Regular Semi ring Secial Issue on Comutational Science, Mathematics and Biology Classes of K-Regular Semi ring M.Amala 1, T.Vasanthi 2 ABSTRACT: In this aer, it was roved that, If S is a K-regular semi ring and (S, +) is

More information

Efficient Computation of Morse-Smale Complexes for Three-dimensional Scalar Functions.

Efficient Computation of Morse-Smale Complexes for Three-dimensional Scalar Functions. Efficient omutation of Morse-Smale omlexes for Three-dimensional Scalar Functions. ttila Gyulassy Institute for Data nalysis and Visualization Det. of omuter Science University of alifornia, Davis Vijay

More information

An Efficient VLSI Architecture for Adaptive Rank Order Filter for Image Noise Removal

An Efficient VLSI Architecture for Adaptive Rank Order Filter for Image Noise Removal International Journal of Information and Electronics Engineering, Vol. 1, No. 1, July 011 An Efficient VLSI Architecture for Adative Rank Order Filter for Image Noise Removal M. C Hanumantharaju, M. Ravishankar,

More information

Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScript Objects

Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScript Objects Identity-sensitive Points-to Analysis for the Dynamic Behavior of JavaScrit Objects Shiyi Wei and Barbara G. Ryder Deartment of Comuter Science, Virginia Tech, Blacksburg, VA, USA. {wei,ryder}@cs.vt.edu

More information

Improving Parallel-Disk Buffer Management using Randomized Writeback æ

Improving Parallel-Disk Buffer Management using Randomized Writeback æ Aears in Proceedings of ICPP 98 1 Imroving Parallel-Disk Buffer Management using Randomized Writeback æ Mahesh Kallahalla Peter J Varman Detartment of Electrical and Comuter Engineering Rice University

More information

Computing the Convex Hull of. W. Hochstattler, S. Kromberg, C. Moll

Computing the Convex Hull of. W. Hochstattler, S. Kromberg, C. Moll Reort No. 94.60 A new Linear Time Algorithm for Comuting the Convex Hull of a Simle Polygon in the Plane by W. Hochstattler, S. Kromberg, C. Moll 994 W. Hochstattler, S. Kromberg, C. Moll Universitat zu

More information

CMSC 754 Computational Geometry 1

CMSC 754 Computational Geometry 1 CMSC 754 Comutational Geometry 1 David M. Mount Deartment of Comuter Science University of Maryland Fall 2002 1 Coyright, David M. Mount, 2002, Det. of Comuter Science, University of Maryland, College

More information

Grouping of Patches in Progressive Radiosity

Grouping of Patches in Progressive Radiosity Grouing of Patches in Progressive Radiosity Arjan J.F. Kok * Abstract The radiosity method can be imroved by (adatively) grouing small neighboring atches into grous. Comutations normally done for searate

More information