A Parallel Gauss-Seidel Algorithm for Sparse Power System Matrices

D. P. Koester, S. Ranka, and G. C. Fox


School of Computer and Information Science and
The Northeast Parallel Architectures Center (NPAC)
Syracuse University
Syracuse, NY 13244-4100

Abstract

We describe the implementation and performance of an efficient parallel Gauss-Seidel algorithm that has been developed for irregular, sparse matrices from electrical power systems applications. Although Gauss-Seidel algorithms are inherently sequential, by performing specialized orderings on sparse matrices it is possible to eliminate much of the data dependencies caused by precedence in the calculations. A two-part matrix ordering technique has been developed: first to partition the matrix into block-diagonal-bordered form using diakoptic techniques, and then to multi-color the data in the last diagonal block using graph coloring techniques. The ordered matrices often have extensive parallelism, while maintaining the strict precedence relationships in the Gauss-Seidel algorithm. We present timing results for a parallel Gauss-Seidel solver implemented on the Thinking Machines CM-5 distributed memory multi-processor. The algorithm presented here requires active message remote procedure calls in order to minimize communications overhead and obtain good relative speedup.

1 Introduction

Power system distribution networks are generally hierarchical, with limited numbers of high-voltage lines transmitting electricity to highly interconnected local networks that eventually distribute power to customers. Electrical power grids have graph representations which in turn can be expressed as matrices: electrical buses are graph nodes and matrix diagonal elements, while electrical transmission lines are graph edges, which can be represented as non-zero off-diagonal matrix elements. We show that it is possible to identify the hierarchical structure within a power system matrix using only the knowledge of the interconnection pattern, by tearing the matrix into partitions and coupling equations that yield a block-diagonal-bordered matrix.
Node-tearing-based partitioning identifies the basic network structure that provides parallelism for the majority of calculations within a Gauss-Seidel iteration. Graph multi-coloring has been used to order the last diagonal matrix block and subsequently identify available parallelism. We implemented explicit load balancing as part of each of the aforementioned ordering steps to maximize parallel algorithm efficiency.

We implemented the parallel Gauss-Seidel algorithm on the Thinking Machines CM-5 distributed memory multi-processor exclusively using explicit message passing based on Connection Machine active message layer (CMAML) remote procedure calls (RPCs). The communications paradigm we use throughout this algorithm employs CMAML RPCs to send individual values to destination processors as soon as values have been calculated. This paradigm greatly simplified the development and implementation of this parallel sparse Gauss-Seidel algorithm.

Parallel implementations of Gauss-Seidel have generally been developed for regular problems such as the solution of Laplace's equations by finite differences [4, 5], where red-black coloring schemes are used to provide independence in the calculations and some parallelism. This scheme has been extended to multi-coloring for additional parallelism in more complicated regular problems [5]; however, we are interested in the solution of irregular linear systems. There has been some research into applying parallel Gauss-Seidel to circuit simulation problems [11], although this work showed poor parallel speedup potential in a theoretical study. This reference also extended traditional Gauss-Seidel and Gauss-Jacobi techniques to waveform relaxation methods that trade overhead and convergence rate for parallelism. Other research with parallel Gauss-Seidel methods for power systems applications is presented in [7], although our research differs substantially from that work: our research utilizes a different matrix ordering paradigm, a different load balancing paradigm, and a different parallel implementation paradigm. Our work utilizes diakoptic-based matrix partitioning techniques developed initially for a parallel block-diagonal-bordered direct sparse linear solver [9, 10]. In reference [9] we examined load balancing issues associated with partitioning power systems matrices for parallel Cholesky factorization.

The paper is organized as follows. In section 2, we introduce the electrical power systems application that is the basis for this work. In section 3, we briefly review the Gauss-Seidel iterative method, and in section 4 we present a theoretical derivation of the available parallelism with Gauss-Seidel for a block-diagonal-bordered form sparse matrix. We discuss the preprocessing phase that orders the sparse matrices in section 5, and we describe our parallel Gauss-Seidel algorithm implementation in section 6. Analysis of parallel algorithm performance for actual power system load-flow matrices is presented in section 7. We present our conclusions in section 8.

2 Power System Applications

The underlying motivation for our research is to improve the performance of electrical power system applications to provide real-time power system control and real-time support for proactive decision making. This research has focused on matrices from load-flow applications [14]. Load-flow analysis examines steady-state equations based on the symmetric positive definite network admittance matrix that represents the power system distribution network. Load-flow analysis entails the solution of non-linear systems of simultaneous equations, which is performed by repeatedly solving sparse linear equations. Sparse linear solvers account for the majority of floating point operations encountered in load-flow analysis.
3 The Gauss-Seidel Method

We are considering an iterative solution to the linear system

$$ Ax = b, \tag{1} $$

where $A$ is an $(n \times n)$ sparse matrix, $x$ and $b$ are vectors of length $n$, and we are solving for $x$. Iterative solvers are an alternative to direct methods that attempt to calculate an exact solution to the system of equations. Iterative methods attempt to find a solution to the system of linear equations by repeatedly solving the linear system using approximations to the $x$ vector. Iterations continue until the solution is within a predetermined acceptable bound on the error.

The Gauss-Seidel method can be written as:

$$ x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i} a_{ij} x_j^{(k+1)} - \sum_{j>i} a_{ij} x_j^{(k)} \right), \tag{2} $$

where:
  $x_i^{(k)}$ is the $i$-th unknown in $x$ during the $k$-th iteration, $i = 1, \ldots, n$ and $k = 0, 1, \ldots$,
  $x_i^{(0)}$ is the initial guess for the $i$-th unknown in $x$,
  $a_{ij}$ is the coefficient of $A$ in the $i$-th row and $j$-th column,
  $b_i$ is the $i$-th value in $b$,

or

$$ x^{(k+1)} = (D + L)^{-1} \left[ b - U x^{(k)} \right], \tag{3} $$

where:
  $x^{(k)}$ is the $k$-th iterative solution to $x$, $k = 0, 1, \ldots$,
  $x^{(0)}$ is the initial guess at $x$,
  $D$ is the diagonal of $A$,
  $L$ is the strictly lower triangular portion of $A$,
  $U$ is the strictly upper triangular portion of $A$,
  $b$ is the right-hand-side vector.

The representation in equation 2 is used in the development of the parallel algorithm, while the equivalent matrix-based representation in equation 3 is used below in discussions of available parallelism.

We present a general sequential sparse Gauss-Seidel algorithm in figure 1. This algorithm calculates a constant number of iterations before checking convergence. It is very difficult to determine whether one-step iterative methods, like the Gauss-Seidel method, converge for general matrices. Nevertheless, it is possible to prove that the Gauss-Seidel method does converge and yields the unique solution $x$ for $Ax = b$ with any initial starting vector $x^{(0)}$ for both diagonally dominant and symmetric positive definite matrices [5]. These theorems prove that the Gauss-Seidel method converges for these matrix types; however, they say nothing about the rate of convergence.

    while ε > converge
        for k = 1 to n_iter
            for i = 1 to n
                x̃_i ← x_i
                x_i ← b_i
                for each j ≠ i such that a_ij ≠ 0
                    x_i ← x_i − (a_ij × x_j)
                x_i ← x_i / a_ii
        ε ← 0
        for i = 1 to n
            ε ← ε + abs(x̃_i − x_i)
    endwhile

Figure 1: Sparse Gauss-Seidel Algorithm

Symmetric sparse matrices can be represented by graphs, with elements in equations corresponding to undirected edges in the graph [6]. Ordering a symmetric sparse matrix is actually little more than changing the labels associated with nodes in an undirected graph. Modifying the ordering of a sparse matrix is simple to perform using a permutation matrix $P$ that generates elementary row and column exchanges. Applying a permutation matrix to the original linear system in equation 1 yields

$$ (PAP^T)(Px) = (Pb). \tag{4} $$

While ordering the matrix can greatly simplify accessing parallelism inherent within the matrix structure, ordering can have an effect on convergence [5]. In section 7, we present empirical data to show that in spite of the ordering to yield parallelism, convergence appears to be rapid for positive definite power systems load-flow matrices.

4 Available Parallelism

While Gauss-Seidel algorithms for dense matrices are inherently sequential, it is possible to identify sparse matrix partitions that do not have mutual data dependencies, so calculations can proceed in parallel while maintaining the strict precedence rules in the Gauss-Seidel technique. Entire sparse matrix partitions can be calculated in parallel without requiring communications. All parallelism in the Gauss-Seidel algorithm is derived from within the actual interconnection relationships between elements in the matrix. While much of the parallelism in this algorithm comes from the block-diagonal-bordered ordering of the sparse matrix, further ordering of the last diagonal block is required to provide parallelism in what would otherwise be a purely sequential portion of the algorithm. The last diagonal block represents the interconnection structure within the equations that couple the partitions in the block-diagonal portion of the matrix. These equations are rather sparse, so it is simple to color the graph representing this portion of the matrix.
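As a reference point for the parallel variants, the sequential sweep of figure 1 can be sketched in Python. This is only an illustrative sketch: the row storage (a list of `(column, value)` pairs per row), the function name `gauss_seidel`, the tolerance, and the `max_checks` guard are assumptions made for the example, not part of the original implementation, which was written in C.

```python
def gauss_seidel(rows, b, x0, n_iter=4, tol=1e-10, max_checks=100):
    """Sparse Gauss-Seidel in the style of figure 1 (illustrative sketch).

    rows[i] is a list of (j, a_ij) pairs for the non-zeros of row i,
    including the diagonal entry (i, a_ii).
    Returns the solution estimate and the last total error.
    """
    x = list(x0)
    eps = float("inf")
    for _ in range(max_checks):          # hypothetical guard against non-convergence
        for _ in range(n_iter):          # fixed number of sweeps between checks
            previous = list(x)           # plays the role of x-tilde in figure 1
            for i, row in enumerate(rows):
                acc, diag = b[i], None
                for j, a_ij in row:
                    if j == i:
                        diag = a_ij
                    else:
                        acc -= a_ij * x[j]   # new value for j < i, old value for j > i
                x[i] = acc / diag
        # total error: change in x over the most recent sweep
        eps = sum(abs(p - q) for p, q in zip(previous, x))
        if eps <= tol:
            break
    return x, eps


# Small diagonally dominant example whose exact solution is [1, 2, 3].
example_rows = [
    [(0, 4.0), (1, -1.0)],
    [(0, -1.0), (1, 4.0), (2, -1.0)],
    [(1, -1.0), (2, 4.0)],
]
example_x, example_eps = gauss_seidel(example_rows, [2.0, 4.0, 10.0], [0.0, 0.0, 0.0])
```

Because each `x[i]` is overwritten in place, row `i` automatically sees the new values of earlier rows and the old values of later rows, which is exactly the precedence that the orderings in this section must preserve.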
Separate graph colors represent rows where $x^{(k+1)}$ can be calculated in parallel, because within a color no two nodes are joined by an edge. To clearly identify the available parallelism in the block-diagonal-bordered Gauss-Seidel method, we define a block diagonal matrix partition, apply that partition to formula 3, and equate terms to identify available parallelism. We must also define a sub-partitioning of the last diagonal block to identify parallelism after multi-coloring.

4.1 Block-Diagonal-Bordered Matrices

We define a partitioning of the system of linear equations defined in equation 1, where the permutation matrix $P$ orders the matrix into block-diagonal-bordered form. We define

$$
PAP^T = \begin{pmatrix}
A_{1,1} & & & A_{1,m} \\
 & \ddots & & \vdots \\
 & & A_{m-1,m-1} & A_{m-1,m} \\
A_{m,1} & \cdots & A_{m,m-1} & A_{m,m}
\end{pmatrix} \tag{5}
$$

and $Px$ and $Pb$ are partitioned with similar dimensions. Equation 3 divides the $PAP^T$ matrix into a diagonal component $D$, a strictly lower triangular matrix $L$, and a strictly upper triangular matrix $U$ such that:

$$ PAP^T = D + L + U. \tag{6} $$

Derivation of the block-diagonal-bordered form of the $D$, $L$, and $U$ matrices is straightforward. Equation 3 requires the calculation of $(D + L)^{-1}$, which also is simple to determine explicitly, because this matrix has block-diagonal-lower-bordered form. Given these partitioned matrices, it is relatively straightforward to identify available parallelism. For $i = 1, \ldots, m-1$, we obtain:

$$ x_i^{(k+1)} = (D_{i,i} + L_{i,i})^{-1} \left[ b_i - U_{i,i} x_i^{(k)} - U_{i,m} x_m^{(k)} \right], \tag{7} $$

and for the lower border and last diagonal block we obtain:

$$ x_m^{(k+1)} = (D_{m,m} + L_{m,m})^{-1} \left[ b_m - \sum_{i=1}^{m-1} \left( L_{m,i} x_i^{(k+1)} \right) - U_{m,m} x_m^{(k)} \right]. \tag{8} $$

We can identify the parallelism in the block-diagonal-bordered portion of the matrix by examining equations 7 and 8. If the block-diagonal-bordered matrix partitions $A_{i,i}$, $A_{m,i}$, and $A_{i,m}$ $(1 \le i \le m-1)$ are assigned to the same processor, then there are no communications until $x_m^{(k+1)}$ is calculated. Note that the vector $x_m^{(k)}$ is required for the calculations in each partition; however, there is no violation of the strict precedence rules in the Gauss-Seidel method, because these values are not calculated until the last step. After calculating $x_i^{(k+1)}$ in the first $(m-1)$ partitions, the values of $x_m^{(k+1)}$ must be calculated using the lower border and last block. If we assign

$$ \hat{b} = b_m - \sum_{i=1}^{m-1} L_{m,i} x_i^{(k+1)}, \tag{9} $$

then the formulation of $x_m^{(k+1)} = \hat{x}^{(k+1)}$ looks similar to equation 3:

$$ \hat{x}^{(k+1)} = (D_{m,m} + L_{m,m})^{-1} \left[ \hat{b} - U_{m,m} \hat{x}^{(k)} \right]. \tag{10} $$

Figure 2 describes the calculation steps in the parallel Gauss-Seidel for a block-diagonal-bordered sparse matrix. This figure depicts four diagonal blocks, and data/processor assignments are listed for the data blocks.

[Figure 2: Block-Bordered-Diagonal Form Gauss-Seidel Method (m = 5). Steps shown: (1) solve for x in the diagonal blocks; (2) calculate the (matrix × vector) product and send; (3) solve for x in the last diagonal block.]

4.2 Multi-Colored Matrices

The ordering imposed by the permutation matrix $P$ includes multi-coloring-based ordering of the last diagonal block that produces sub-partitions with parallelism. We define the sub-partitioning as:

$$
A_{m,m} = \begin{pmatrix}
\hat{D}_{1,1} & \hat{A}_{1,2} & \cdots & \hat{A}_{1,c} \\
\hat{A}_{2,1} & \hat{D}_{2,2} & \cdots & \hat{A}_{2,c} \\
\vdots & \vdots & \ddots & \vdots \\
\hat{A}_{c,1} & \hat{A}_{c,2} & \cdots & \hat{D}_{c,c}
\end{pmatrix}, \tag{11}
$$

where $\hat{D}_{i,i}$ are diagonal blocks and $c$ is the number of colors. After forming $L_{m,m}$ and $U_{m,m}$, it is straightforward to prove that:

$$ \hat{x}_i^{(k+1)} = \hat{D}_{i,i}^{-1} \left[ \hat{b}_i - \sum_{j<i} \hat{A}_{i,j} \hat{x}_j^{(k+1)} - \sum_{j>i} \hat{A}_{i,j} \hat{x}_j^{(k)} \right]. \tag{12} $$

Calculating $\hat{x}_i^{(k+1)}$ in each sub-partition (color) of the last diagonal block does not require values of $\hat{x}_i^{(k+1)}$ within the sub-partition, so we can calculate the individual values within a color in any order and distribute these calculations to separate processors without concern for precedence. In order to maintain the strict precedence in the Gauss-Seidel algorithm, the values of $\hat{x}_i^{(k+1)}$ calculated in each step must be broadcast to all processors, and processing cannot proceed for any processor until it receives the new values of $\hat{x}_i^{(k+1)}$ from all other processors. Figure 3 illustrates the data/processor assignments in the last diagonal block.

[Figure 3: Multi-Colored Gauss-Seidel Method for the Last Diagonal Block (c = 3). Steps shown for colors C1, C2, and C3: (1) solve for x within a color; (2) broadcast new x values.]

5 The Preprocessing Phase

In the previous section, we developed the theory for parallel Gauss-Seidel methods; however, before such techniques can be implemented on real power systems matrices, we must be able to generate the permutation matrices, $P$, to produce block-diagonal-bordered/multi-colored sparse matrices. All available parallelism for our Gauss-Seidel algorithm is identified from the interconnection structure of elements in the sparse matrix during this preprocessing phase. Inherent in both preprocessing steps is explicit load balancing to determine processor/data mappings for efficient implementation of the Gauss-Seidel algorithm. This preprocessing phase incurs significantly more overhead than solving a single instance of the sparse matrix; consequently, the use of this technique will be limited to problems that have static matrix structures that can reuse the ordered matrix multiple times in order to amortize the cost of the preprocessing phase over numerous matrix solutions.

5.1 Ordering the Matrix into Block-Bordered-Diagonal Form

We require a technique that orders irregular matrices into block-diagonal-bordered form while limiting the number of coupling equations. Minimizing the number of coupling equations minimizes the size of the last diagonal block, and minimizes the amount of broadcast communications required when calculating values of $\hat{x}^{(k+1)}$. Minimizing the size of the last diagonal block has some drawbacks. We have found an inverse relationship between last block size and load imbalance between processors. This can affect potential parallelism if the resulting workload in the diagonal blocks cannot be distributed uniformly throughout a multi-processor [9]. When determining the optimal ordering for a sparse matrix, the size of the last diagonal block and the subsequent additional communications may be traded for an ordering that yields good load balance in the highly parallel portion of the calculations, especially for larger numbers of processors.

We have chosen node-tearing [, ], which is a specialized form of diakoptics, to order sparse power systems matrices into block-diagonal-bordered form. We have selected node-tearing nodal analysis because this algorithm determines the natural structure in the matrix while providing the means to minimize the number of coupling equations [12]. Tearing here refers to breaking the original problem into smaller sub-problems whose partial solutions can be combined to give the solution of the original problem. The node-tearing-based ordering algorithm has a user-selectable input parameter, max_DB, the maximum size of the diagonal blocks. Varying this input parameter permits the user to vary characteristics in the ordered diagonal blocks.
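The property that any such ordering must deliver can be stated as a small check: every edge either stays inside one diagonal block or touches a coupling (border) node. The sketch below is not the node-tearing algorithm itself; it only verifies a given partition and produces the corresponding row order. The names `BORDER`, `find_violations`, and `bordered_order` are hypothetical, and the graph is a toy example.

```python
BORDER = -1  # hypothetical label for coupling equations (the last diagonal block)


def find_violations(edges, block_of):
    """Return the edges that break block-diagonal-bordered form.

    edges    : iterable of (u, v) node pairs (off-diagonal non-zeros)
    block_of : dict mapping node -> diagonal block id, or BORDER
    An edge is allowed when both ends lie in the same diagonal block,
    or when at least one end is a border (coupling) node.
    """
    violations = []
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv and BORDER not in (bu, bv):
            violations.append((u, v))
    return violations


def bordered_order(block_of):
    """Row order for the permuted matrix: diagonal blocks in turn, border last."""
    interior = sorted((b, n) for n, b in block_of.items() if b != BORDER)
    border = sorted(n for n, b in block_of.items() if b == BORDER)
    return [n for _, n in interior] + border


# Two chains (0-1-2 and 3-4-5) coupled only through node 6.
toy_edges = [(0, 1), (1, 2), (3, 4), (4, 5), (2, 6), (5, 6)]
toy_blocks = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: BORDER}
```

With node 6 in the border, `find_violations(toy_edges, toy_blocks)` is empty; moving node 6 into block 0 makes the edge (5, 6) cross two diagonal blocks, which is the kind of coupling the ordering has to push into the last block.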
Empirical data is presented later in section 7 to illustrate parallel linear solver algorithm performance as a function of this parameter.

Load balancing for node-tearing-based ordering is performed with a simple pigeon-hole type algorithm that uses a metric based on the number of floating point multiply/add operations in a partition, instead of simply using the number of rows per partition. Load balancing examines the number of operations when calculating $x^{(k+1)}$ in the matrix partitions and the number of operations when calculating the sparse matrix vector products in preparation to solve for $\hat{x}^{(k+1)}$. This algorithm finds an optimal distribution for workload to processors; however, actual disparity in processor workload is dependent on the actual irregular sparse matrix structure.

5.2 Ordering the Last Diagonal Block

The multi-coloring algorithm we selected for this work is based on the saturation degree ordering algorithm [1]. We also require load balancing, a feature not commonly implemented within graph multi-coloring. The saturation degree ordering algorithm selects a node in the graph that has the largest number of differently colored neighbors. We have added the capability to the saturation degree ordering algorithm to select the color for a node in a manner that equalizes the number of nodes with a particular color. The graphs encountered for coloring in this work were very sparse, and often required three or fewer colors. Details of this graph multi-coloring algorithm are presented in [8].

6 Parallel Implementation

We have implemented a parallel version of a block-diagonal-bordered sparse Gauss-Seidel algorithm in the C programming language for the Thinking Machines CM-5 multi-computer using CMAML RPCs as the exclusive basis for interprocessor communications [13]. Underlying the whole concept of active messages is the paradigm that the user takes the responsibility for handling messages as they arrive at a destination.

The user writes a handler function that takes the data from a register and uses it in a calculation or assigns the data to memory. By assigning message handling responsibilities to the user, communications overhead can be significantly reduced. Significant improvements in the performance of the algorithm were observed for active messages, when compared to more traditional communications paradigms that use the standard blocking CMMD send and receive functions in conjunction with packing data into communications buffers. A significant portion of communications require each processor to send short data buffers to every other processor. For traditional message passing paradigms, the cost for communications increases drastically as the number of processors increases, because each message incurs the same latency regardless of the amount of data sent. As a result, performance for buffered communications quickly becomes unacceptable as the number of processors increases.

To significantly reduce communications overhead, we implemented each portion of the algorithm using CMAML remote procedure calls (CMAML_rpc). The communications paradigm we use throughout this algorithm is to send a double precision data value to the destination processor as soon as the value is calculated. Communications in the algorithm occur at distinct time phases, so polling for the active message handler function is efficient. An active message on the CM-5 has a four word payload, which is more than adequate to send a double precision floating point value and an integer vector position indicator. The use of active messages greatly simplified the development and implementation of this parallel sparse Gauss-Seidel algorithm, because there was no requirement to maintain and pack communications buffers.

This implementation uses implicit data structures based on vectors of C programming language structures to store and retrieve data efficiently within the sparse matrix. These data structures provide good cache locality, because non-zero data values and column location indicators are stored in adjacent physical memory locations.
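The storage idea, a column indicator kept directly beside its value, can be illustrated with Python's `struct` module. This is a sketch of the layout only: the field widths (a 32-bit integer and a 64-bit double), the absence of padding, and the names `ENTRY`, `pack_row`, and `row_dot` are assumptions for the example, and the original C structures are not shown in the source.

```python
import struct

# One non-zero: 32-bit column index followed by a 64-bit value, no padding.
ENTRY = struct.Struct("=id")


def pack_row(entries):
    """Pack the (column, value) pairs of one sparse row into one contiguous buffer."""
    return b"".join(ENTRY.pack(col, val) for col, val in entries)


def row_dot(buf, x):
    """Sum a_ij * x_j over the packed non-zeros of a row."""
    return sum(val * x[col] for col, val in ENTRY.iter_unpack(buf))


packed = pack_row([(0, -1.0), (2, 4.0)])
product = row_dot(packed, [1.0, 2.0, 3.0])   # -1.0 * 1.0 + 4.0 * 3.0
```

A single pass over the buffer reads each column index and its coefficient together, so the inner loop of a Gauss-Seidel row update never has to consult a separate index array.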
Data is stored as sparse vectors with implicit referencing, so only the SPARC processors on each node were used for calculations.

Our parallel Gauss-Seidel algorithm has the following distinct sections:

1. solve for $x^{(k+1)}$ in the diagonal blocks
2. calculate $\hat{b} = b_m - \sum_{i=1}^{m-1} L_{m,i} x_i^{(k+1)}$ by forming the (matrix × vector) products in parallel
3. solve for $\hat{x}^{(k+1)}$ in the last diagonal block
4. check convergence

A pseudo-code representation of the parallel Gauss-Seidel solver is presented in figure 4.

7 Empirical Results

Overall performance of our parallel Gauss-Seidel linear solver is dependent on both the performance of the matrix ordering in the preprocessing phase and the performance of the parallel Gauss-Seidel implementation. Because these two components of the parallel Gauss-Seidel implementation are inextricably related, the best way to assess the potential of this technique is to measure the speedup performance using real power system load-flow matrices. We first present speedup results for three separate power systems matrices:

  BCSPWR09: 1,723 nodes and 2,394 edges [2]
  BCSPWR10: 5,300 nodes and 8,271 edges [2]
  EPRI-6K: 4,180 nodes and 5,226 edges [3]

Matrices BCSPWR09 and BCSPWR10 are from the Boeing-Harwell series, and the EPRI-6K matrix is distributed with the Extended Transient-Midterm Stability Program (ETMSP) from EPRI. These matrices have been preprocessed using a sequential program that orders the matrix, load balances each ordering step, and produces the implicit data structures for the parallel Gauss-Seidel linear solver. The preprocessing was repeated for multiple values of max_DB, the input value to the node-tearing algorithm. Due to the static nature of the power system grid, such orderings could be reused for many hours or even days of calculations in real electrical power utility operations load-flow applications.

Empirical performance data was collected for each of the aforementioned power systems matrices using 1 through 32 processors on the Thinking Machines CM-5 at the Northeast Parallel Architectures Center at Syracuse University.
The NPAC CM-5 is configured with all 32 nodes in a single partition, so user software was required to define the number of processors used to actually solve a linear system. We present empirical speedup data collected on the parallel Gauss-Seidel algorithm for the three power systems matrices, and we also present a detailed performance analysis using actual run times for the individual subsections of the parallel Gauss-Seidel linear solver to illustrate the efficacy of the load balancing step in the preprocessing phase and to illustrate performance bottlenecks. All timing samples are for a combination of four iterations and a single convergence check.
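Before the timing results, the structure of the solver (diagonal blocks, border products, colored last block) can be emulated serially on a toy system to confirm that it reproduces an ordinary Gauss-Seidel sweep over the ordered matrix. Everything here is an illustrative assumption: the function names, the seven-node graph, and the use of plain Python lists in place of per-processor data and active messages.

```python
def build_rows(n, edges, diag=4.0, off=-1.0):
    """Symmetric sparse rows as lists of (column, value); diagonal entry first."""
    rows = [[(i, diag)] for i in range(n)]
    for u, v in edges:
        rows[u].append((v, off))
        rows[v].append((u, off))
    return rows


def plain_sweep(rows, b, x, order):
    """One ordinary Gauss-Seidel sweep visiting rows in the given order."""
    x = list(x)
    for i in order:
        acc, diag = b[i], None
        for j, a in rows[i]:
            if j == i:
                diag = a
            else:
                acc -= a * x[j]
        x[i] = acc / diag
    return x


def bordered_sweep(rows, b, x, blocks, colors):
    """One sweep organised as diagonal blocks, border products, colored last block."""
    last = {i for color in colors for i in color}
    # Section 1: each diagonal block touches only its own rows and old last-block values.
    for block in blocks:
        x = plain_sweep(rows, b, x, block)
    # Section 2: b_hat = b_m - (lower border) * (new diagonal-block values).
    b_hat = {i: b[i] - sum(a * x[j] for j, a in rows[i] if j not in last)
             for i in last}
    # Section 3: within a color, compute every value before publishing any of them.
    for color in colors:
        new = {}
        for i in color:
            acc, diag = b_hat[i], None
            for j, a in rows[i]:
                if j == i:
                    diag = a
                elif j in last:
                    acc -= a * x[j]
            new[i] = acc / diag
        for i, value in new.items():     # stands in for the broadcast step
            x[i] = value
    return x


# Blocks {0,1} and {2,3}; last block is the path 4-5-6 with colors {4,6} and {5}.
toy = build_rows(7, [(0, 1), (2, 3), (1, 4), (3, 5), (0, 6), (4, 5), (5, 6)])
rhs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

Applying `bordered_sweep(toy, rhs, x, [[0, 1], [2, 3]], [[4, 6], [5]])` and `plain_sweep(toy, rhs, x, [0, 1, 2, 3, 4, 6, 5])` to the same starting vector gives the same result up to rounding, and reversing the order of rows inside a color changes nothing, which is the independence that equation 12 relies on.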

    Node Program
    while ε > converge
        for k = 1 to n_iter
            /* solve for x^(k+1) in the diagonal blocks */
            for all rows i on this processor
                x̃_i ← x_i
                x_i ← b_i
                for each j ∈ [1, n], j ≠ i, such that a_ij ≠ 0
                    x_i ← x_i − (a_ij × x_j)
                x_i ← x_i / a_ii
            /* calculate L_{m,i} x_i^(k+1) */
            for all last-block rows i on this processor
                x̃_i ← x_i
                b̂_i ← b_i
            for all lower border non-zero rows i
                t ← 0
                for each j such that a_ij ≠ 0
                    t ← t + (a_ij × x_j)
                using active message rpc on the processor holding b̂_i
                    b̂_i ← b̂_i − t
            /* solve for x̂^(k+1) in the last diagonal block */
            for all colors c
                for all rows i in color c on this processor
                    x_i ← b̂_i
                    for each j ≠ i in the last block such that a_ij ≠ 0
                        x_i ← x_i − (a_ij × x_j)
                    x_i ← x_i / a_ii
                    using active message rpc broadcast x_i
                wait until all values of x in color c have arrived
        /* check convergence */
        ε ← 0
        for all rows i on this processor
            ε ← ε + abs(x̃_i − x_i)
        for all other processors p
            using active message rpc on processor p
                ε_p ← ε_p + ε
    endwhile

Figure 4: Parallel Sparse Gauss-Seidel Algorithm

[Figure 5: Relative Speedup for 1, 2, 4, 8, 16, and 32 processors. Plot of relative speedup for Gauss-Seidel versus number of processors, one curve each for BCSPWR09, BCSPWR10, and EPRI-6K.]

7.1 Performance Analysis

As an introduction to the performance of the parallel Gauss-Seidel algorithm, we present a graph that plots relative speedup versus the number of processors. Figure 5 plots the best speedup measured for each of the power systems matrices for 1, 2, 4, 8, 16, and 32 processors. These graphs show that performance for the EPRI-6K data set is the best of the three data sets examined. Speedup reaches a maximum of […].6 for 32 processors, and speedups of greater than […] were measured for 16 processors. Relative speedups for the BCSPWR09 and BCSPWR10 matrices are less than for the EPRI-6K matrix, but each has speedup in excess of 7 for 16 processors. For both the BCSPWR09 and BCSPWR10 matrices, the last diagonal block requires approximately […]5% of the total calculations, while the last block of the EPRI-6K matrix can be ordered so that only […]% of all calculations occur there.
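For readers who want to reproduce such plots from their own timings, relative speedup is conventionally the one-processor run time of the same parallel program divided by the run time on p processors, and efficiency is that ratio divided by p. The helper below is a minimal sketch; the function name and the sample timings are hypothetical and are not measurements from this work.

```python
def relative_speedup(times):
    """Map processor count -> (speedup, efficiency).

    times : dict {processors: run time}, which must include an entry for 1 processor.
    Speedup is T(1) / T(p); efficiency is speedup / p.
    """
    t1 = times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in sorted(times.items())}


# Hypothetical run times in milliseconds, for illustration only.
sample = relative_speedup({1: 64.0, 2: 32.0, 4: 20.0, 8: 16.0})
```

In this made-up sample the speedup keeps rising from 4 to 8 processors while the efficiency falls from 0.8 to 0.5, the same qualitative pattern of diminishing returns discussed for the larger processor counts.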
The likely cause for limited speedup with the Boeing-Harwell matrices is that communications overhead becomes a significant part of the overall processing time, because $\hat{x}^{(k+1)}$ values must be broadcast to other processors before processing can proceed to the next color. There are insufficient parallel operations when solving for $x^{(k+1)}$ in the diagonal blocks for these matrices to offset the effect of the communications overhead encountered in the last block.

A detailed examination of relative speedup is presented in figure 6 for the EPRI-6K data. This figure contains a graph with four curves plotting relative speedup for each of four maximum matrix partition sizes, 128, 192, 256, and 320 nodes, used in the node-tearing algorithm. The speedup curves for the various matrix orderings clearly illustrate the effects of load imbalance for some matrix orderings. For all four matrix orderings, speedup is nearly equal for 1 through 16 processors. However, the values for relative speedup diverge for 32 processors.

[Figure 6: Relative Speedup for EPRI-6K Data, 1, 2, 4, 8, 16, and 32 processors. One curve per maximum partition size.]

We can look further into the cause of the disparity in the relative speedup values in the EPRI-6K data by examining the performance of each of the four distinct sections of the parallel algorithm. Figure 7 contains four graphs that each have four curves that plot processing time in milliseconds versus the number of processors for each of four values of max_DB. These graphs are log-log scaled, so for perfect speedup, processing times should fall on a straight line with negative slope for repeated doubling of the number of processors. One or more curves on each of the performance graphs for the diagonal blocks and upper border, for updating the last diagonal block, and for convergence checks illustrate nearly perfect speedup with as many as 32 processors. Unfortunately, the performance for calculating values in the last block is not as good. The performance graph for the diagonal blocks and lower border clearly illustrates the causes for the load imbalance observed in the relative speedup graph in figure 6. For some matrix orderings, load balancing is not able to divide the work evenly for larger numbers of processors. This occurs for larger values of max_DB. Selecting small values of max_DB will provide better speedups for sixteen or more processors.

[Figure 7, first panel: Diagonal Blocks and Upper Border. Run time in milliseconds versus number of processors, one curve per max_DB value.]
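The straight-line criterion used to read the log-log timing graphs follows from the definition of perfect speedup. As a short worked derivation, writing $T(p)$ for the processing time on $p$ processors (a symbol introduced here for illustration):

```latex
\begin{align*}
T(p) &= \frac{T(1)}{p}
  && \text{(perfect speedup)} \\
\log T(p) &= \log T(1) - \log p
  && \text{(a line of slope $-1$ on log-log axes)} \\
T(2p) &= \tfrac{1}{2}\, T(p)
  && \text{(each doubling of $p$ halves the time)}
\end{align*}
```

A curve that flattens or bends upward away from this line therefore marks the processor count at which load imbalance or communications overhead starts to dominate that section of the algorithm.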
Updating the last block requires both calculations of sparse (matrix × vector) products and irregular communications, but yields good performance even for 32 processors. Update times are correlated to the size of the last diagonal block, which is inversely related to the magnitude of max_DB. The performance graph for checking convergence illustrates that the load balancing step does not assign equal numbers of rows to all processors. The number of rows on a processor varies as a function of the load balancing. While the curves on this graph are somewhat erratic, performance is improving with near perfect parallelism even for 32 processors.

[Figure 7: Timings for Algorithm Components, EPRI-6K Data, 1, 2, 4, 8, 16, and 32 processors. Remaining panels: Update Last Block; Last Block; Check Convergence. Each plots run time in milliseconds versus number of processors, one curve per max_DB value.]

We must reiterate that all available parallelism in this work is a result of ordering the matrix and identifying relationships in the connectivity pattern within the structure of the matrix. Power systems load-flow matrices are some of the most sparse irregular matrices encountered. For the EPRI-6K data, the most frequently encountered number of edges at a node is only two, and 8[…]% of the nodes have three or fewer edges. For the BCSPWR10 matrix, 7[…]% of the nodes have three or fewer edges. Consequently, power systems matrices pose significant challenges to producing efficient parallel sparse matrix algorithms.

In figure 8, we present a representative ordering of the EPRI-6K data with max_DB equal to 256 nodes. This matrix represents the adjacency structure of the network graph, and clearly illustrates sparsity. Non-zero entries in the matrix are represented as dots, and the matrix is delimited by a bounding box. This figure contains two matrices: the ordered sparse matrix and an enlargement of the last block after multi-coloring. This partitioned matrix has been load-balanced for eight processors. The number of nodes in the last diagonal block is […], the number of edges is only […], and this matrix partition is bipartite, requiring only two colors.

To obtain the full benefits of parallel processing speedup throughout a load-flow application, all data redistribution must be eliminated. Jacobian calculations when solving the systems of non-linear equations must consider the processor/data assignments from the sparse linear solver.
Otherwise, data redistribution overhead would nullify any speedup obtainable in the parallel linear solver.

7.2 Convergence

Convergence for a given data set is critical to the performance of an iterative linear solver. We have applied our solver to sample positive definite matrices that have actual power networks as the basis for the sparsity pattern, and random values for the entries. A sample of measured convergence data is presented in table 1. This table presents the total error and the maximum value for an iteration. All initial values, $x^{(0)}$, have been defined to equal […]. Convergence is rather rapid, and after four iterations, total error equals […]. We hypothesize that this good convergence rate is in part due to having good estimates of the initial starting vector. For actual solutions of power systems load flows, this solver would be used within an iterative non-linear solver, so good estimates of starting points for each solution would be readily available.

[Figure 8: Ordered EPRI-6K Matrix, max_DB = 256. The last diagonal block is shown enlarged.]

[Table 1: Convergence for EPRI-6K Data. Columns: Iteration; Total Error $\sum_{\forall i} \mathrm{abs}(x_i^{(k+1)} - x_i^{(k)})$; $\max_{\forall i} x_i^{(k+1)}$.]

8 Conclusions

We have developed a parallel sparse Gauss-Seidel solver with the potential for good relative speedup for the very sparse, irregular matrices encountered in electrical power system applications. Block-diagonal-bordered matrix structure offers promise for simplified implementation and also offers a simple decomposition of the problem into clearly identifiable subproblems. The node-tearing ordering heuristic has proven to be successful in identifying the hierarchical structure in the power systems matrices, and in reducing the number of coupling equations so that the graph multi-coloring algorithm can usually color the last block with only two or three colors. All available parallelism in our Gauss-Seidel algorithm is derived from within the actual interconnection relationships between elements in the matrix, and identified in the sparse matrix orderings. Consequently, available parallelism is not unlimited. Relative speedup tends to increase nicely until either load-balance overhead or communications overhead causes speedup to level off. We have shown that, depending on the matrix, relative efficiency declines rapidly after 8 or 16 processors, limiting the utility of applying large numbers of processors to a single parallel linear solver. Nevertheless, other dimensions exist in electrical power system applications that can be exploited to use large numbers of processors efficiently. While a moderate number of processors can be efficiently applied to a single power system simulation, multiple events can be simulated simultaneously.

Acknowledgments

We thank Alvin Leung, Nancy McCracken, Paul Coddington, and Tony Skjellum for their assistance in this research. This work has been supported in part by Niagara Mohawk Power Corporation, the New York State Science and Technology Foundation, the NSF under co-operative agreement No. CCR-9120008, and ARPA under contract #DABT63-91-K-0005.

References

[1] D. Brelaz. New Methods to Color the Vertices of a Graph. Comm. ACM, 22:251, 1979.
[2] I. S. Duff, R. G. Grimes, and J. G. Lewis. Users' Guide for the Harwell-Boeing Sparse Matrix Collection. Technical report, Boeing Computer Services, 1992.
[3] Electrical Power Research Institute, Palo Alto, California. Extended Transient-Midterm Stability Program: Version 3.[…], Volume […]: Programmers Manual, Part […], April 1993.
[4] G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. Solving Problems on Concurrent Processors. Prentice Hall, 1988.
[5] G. Golub and J. M. Ortega. Scientific Computing with an Introduction to Parallel Computing. Academic Press, Boston, MA, 1993.
[6] M. T. Heath, E. Ng, and B. W. Peyton. Parallel Algorithms for Sparse Linear Systems. In Parallel Algorithms for Matrix Computations, pages 83-124. SIAM, Philadelphia, 1990.
[7] G. Huang and W. Ongsakul. Managing the Bottlenecks in Parallel Gauss-Seidel Type Algorithms for Power Flow Analysis. Proceedings of the 18th Power Industry Computer Applications (PICA) Conference, pages 74-81, May 1993.
[8] D. P. Koester, S. Ranka, and G. C. Fox. A Parallel Gauss-Seidel Algorithm for Sparse Power Systems Matrices. Technical Report SCCS-63[…], NPAC, April 1994.
[9] D. P. Koester, S. Ranka, and G. C. Fox. Parallel Block-Diagonal-Bordered Sparse Linear Solvers for Electrical Power System Applications. In Proceedings of the Scalable Parallel Libraries Conference. IEEE Press, 1994.
[10] D. P. Koester, S. Ranka, and G. C. Fox. Parallel Cholesky Factorization of Block-Diagonal-Bordered Sparse Matrices. Technical Report SCCS-6[…], NPAC, January 1994.
[11] R. A. Saleh, K. A. Gallivan, M. Chang, I. N. Hajj, D. Smart, and T. N. Trick. Parallel Circuit Simulation on Supercomputers. Proceedings of the IEEE, 77(12):1915-1931, December 1989.
[12] A. Sangiovanni-Vincentelli, L. K. Chen, and L. O. Chua. Node-Tearing Nodal Analysis. Technical Report ERL-M582, Electronics Research Laboratory, College of Engineering, University of California, Berkeley, October 1976.
[13] T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: a Mechanism for Integrated Communication and Computation. In Nineteenth International Symposium on Computer Architecture, New York, 1992. ACM Press.
[14] Y. Wallach. Calculations and Programs for Power System Networks. Prentice-Hall, 1986.


BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT

DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT DESIGNING TRANSMISSION SCHEDULES FOR WIRELESS AD HOC NETWORKS TO MAXIMIZE NETWORK THROUGHPUT Bran J. Wolf, Joseph L. Hammond, and Harlan B. Russell Dept. of Electrcal and Computer Engneerng, Clemson Unversty,

More information

c 2009 Society for Industrial and Applied Mathematics

c 2009 Society for Industrial and Applied Mathematics SIAM J. MATRIX ANAL. APPL. Vol. 31, No. 3, pp. 1382 1411 c 2009 Socety for Industral and Appled Mathematcs SUPERFAST MULTIFRONTAL METHOD FOR LARGE STRUCTURED LINEAR SYSTEMS OF EQUATIONS JIANLIN XIA, SHIVKUMAR

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

Abstract Ths paper ponts out an mportant source of necency n Smola and Scholkopf's Sequental Mnmal Optmzaton (SMO) algorthm for SVM regresson that s c

Abstract Ths paper ponts out an mportant source of necency n Smola and Scholkopf's Sequental Mnmal Optmzaton (SMO) algorthm for SVM regresson that s c Improvements to SMO Algorthm for SVM Regresson 1 S.K. Shevade S.S. Keerth C. Bhattacharyya & K.R.K. Murthy shrsh@csa.sc.ernet.n mpessk@guppy.mpe.nus.edu.sg cbchru@csa.sc.ernet.n murthy@csa.sc.ernet.n 1

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

Parallel Inverse Halftoning by Look-Up Table (LUT) Partitioning

Parallel Inverse Halftoning by Look-Up Table (LUT) Partitioning Parallel Inverse Halftonng by Look-Up Table (LUT) Parttonng Umar F. Sddq and Sadq M. Sat umar@ccse.kfupm.edu.sa, sadq@kfupm.edu.sa KFUPM Box: Department of Computer Engneerng, Kng Fahd Unversty of Petroleum

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

A fault tree analysis strategy using binary decision diagrams

A fault tree analysis strategy using binary decision diagrams Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:

More information