IMPLEMENTATION OF UNSTRUCTURED GRID GMRES+LU-SGS METHOD ON SHARED-MEMORY, CACHE-BASED PARALLEL COMPUTERS

AIAA-2000-0927

Dmitri Sharov, Hong Luo, Joseph D. Baum
Science Applications International Corporation
1710 Goodridge Drive, MS 2-6-9, McLean, VA 22102, USA

and

Rainald Löhner
Institute for Computational Sciences and Informatics
George Mason University, Fairfax, VA 22030, USA

ABSTRACT

The implementation of an unstructured grid matrix-free GMRES+LU-SGS scheme on shared-memory, cache-based parallel machines is described. A special grid renumbering technique is used for the parallelization rather than the traditional method of partitioning the computational domain. The renumbering technique helps to avoid inter-processor data dependencies, cache-misses, and cache-line overwrite while allowing pipelining. The resulting source code can be used with maximum efficiency and without modifications on traditional (scalar) computers, vector supercomputers, and shared-memory parallel systems. Special attention has been paid to developing an optimally parallelized preconditioner for the GMRES scheme.

1. INTRODUCTION

Considerable progress has recently been made in the development of implicit schemes for unstructured grids. Implicit methods are widely used to accelerate convergence of steady-state problems as well as to improve the efficiency of unsteady solvers by advancing the solution with substantially larger time steps. The GMRES+LU-SGS implicit scheme proposed by Luo, Baum, and Löhner for steady-state solutions [1] as well as for unsteady problems [2] can improve the efficiency of traditional explicit methods by one to more than two orders of magnitude. The scheme uses the Lower-Upper Symmetric Gauss-Seidel (LU-SGS) scheme as a preconditioner for the Generalized Minimal Residual (GMRES) method [3]. The LU-SGS scheme was originally proposed by Jameson and Yoon [4] on structured grids, and has been successfully generalized and extended to unstructured meshes [5-7].

Copyright by the authors. Published by the American Institute of Aeronautics and Astronautics, Inc. with permission.

Another way to reduce turn-around time is to use multiple processors. With the advent of massively parallel machines, i.e. machines in excess of 500 nodes, the exploitation of parallelism in solvers has become a major focus of attention. Most of the applications ported successfully to parallel machines to date have followed the Single Program Multiple Data (SPMD) paradigm. For grid-based solvers, a spatial subdomain was stored and updated in each processor. For obvious reasons, load balancing [8-11] has been a major focus of activity. Despite the striking successes reported to date, only the simplest of all solvers, namely explicit timestepping or implicit iterative schemes, perhaps with multigrid added on, have been ported without major changes and/or problems to massively parallel computers with distributed memory. Many code options that are essential for realistic simulations are not easy to parallelize on this type of machine. Among these, we mention local remeshing [12], repeated h-refinement such as required for transient problems [13], contact detection and force evaluation [14], some preconditioners [15], applications where particles, flow, and chemistry interact, and applications with rapidly varying load imbalances. Even if 99% of all operations required by these codes can be parallelized, the maximum achievable gain will be restricted to 1:100. If we accept as a fact that for most large-scale codes we may not be able to parallelize more than 99% of all operations, the shared-memory paradigm, discarded for a while as non-scalable, makes a comeback. It is far easier to parallelize some of the more complex algorithms, as well as cases with large load imbalance, on a shared-memory machine (such as the SGI Origin 2000).

The objective of the present research effort is to implement the GMRES+LU-SGS scheme on shared-memory parallel computers. Here we use the shared-memory parallelization technique originally proposed by Löhner and implemented for explicit schemes [16]. This method is based on extensive mesh renumbering, which provides proper load balancing and avoids cache-misses and cache-line overwrite while allowing pipelining. The advantage of the method over the traditional approach, which is based on domain partitioning, is its ability to be easily used with repeated local mesh refinement and local or global remeshing.

The matrix-free GMRES+LU-SGS implicit scheme uses the LU-SGS approximate factorization as a preconditioner. The parallelization of the LU-SGS algorithm is not an obvious task because of its inherent data dependency. The LU-SGS algorithm can be vectorized for vector processors by using planes i+j+k=const for structured meshes, or by using hyperplane edge reordering for unstructured meshes [7]. Unfortunately, for the intended shared-memory parallelization approach, there are very severe penalties to start a loop [16]. Hence, a loop can be efficiently parallelized only if its vector length is large enough. For an explicit scheme, if scalability to even 16 processors is to be achieved, the vector loop lengths should be at least 16x1,000. For typical tetrahedral grids there are approximately 22 vector-length groups, indicating that we would need at least 22x16x1,000=352,000 edges to run efficiently. For the LU-SGS scheme this restriction is much more severe.
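
The two estimates used in this argument are simple arithmetic and can be checked with a short sketch. It assumes Amdahl's law for the parallel gain and takes 16 processors, a minimum vector length of 1,000 and 22 vector-length groups as illustrative inputs; the function names are hypothetical and not part of the solver.

```python
def amdahl_speedup(parallel_fraction, nproc):
    """Speedup on nproc processors when only parallel_fraction of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nproc)


def amdahl_limit(parallel_fraction):
    """Upper bound of the speedup for an unlimited number of processors."""
    return 1.0 / (1.0 - parallel_fraction)


def min_edges(ngroups, nproc, min_vector_length):
    """Edges needed so that every group gives each processor a full-length loop."""
    return ngroups * nproc * min_vector_length


if __name__ == "__main__":
    print("limit for 99% parallel work:", amdahl_limit(0.99))
    print("speedup on 16 processors   :", amdahl_speedup(0.99, 16))
    print("edges needed (22 groups)   :", min_edges(22, 16, 1000))
```

The bound for 99% parallel work is the 1:100 ratio quoted in the introduction, and the edge count is the product of the three inputs.
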
Since we usually have several hundred hyperplanes, many times more edges are required to run the code efficiently. Moreover, since the LU-SGS scheme is used as a preconditioner for the GMRES method, even a small inefficiency in the parallelization of the LU-SGS scheme may result in severe degradation of overall performance. Thus a comparison of different types of matrix-free parallelized preconditioners has been performed as part of the present effort.

2. GOVERNING EQUATIONS AND THEIR DISCRETIZATION

The Euler equations governing unsteady compressible inviscid flows can be expressed in the conservative form as

$$\frac{\partial Q}{\partial t} + \frac{\partial F_j}{\partial x_j} = 0, \tag{2.1}$$

where the summation convention has been employed. The unknown vector $Q$ and inviscid flux vector $F_j$ are defined by

$$Q = \begin{pmatrix} \rho \\ \rho u_i \\ \rho e \end{pmatrix}, \qquad F_j = \begin{pmatrix} \rho u_j \\ \rho u_i u_j + p\,\delta_{ij} \\ u_j(\rho e + p) \end{pmatrix}. \tag{2.2}$$

Here $\rho$, $p$, $e$ denote the density, pressure, and specific total energy of the fluid respectively, and $u_i$ is the velocity of the flow in the coordinate direction $x_i$. This set of equations is completed by the addition of an equation of state.

The governing equations are discretized by the finite volume method based on dual mesh cells associated with the nodes of the mesh, where the control volumes are non-overlapping dual cells constructed by the median planes of the tetrahedra. In the present study the numerical flux functions for inviscid fluxes at the dual mesh cell interface are computed using the AUSM+ (Advection Upwind Splitting Method) scheme [17]. Linear reconstruction of primitive variables is used with the Van Albada limiter. Equation (2.1) can be rewritten in a semi-discrete form as

$$V_i \frac{\partial Q_i}{\partial t} = R_i, \tag{2.3}$$

where $V_i$ is the volume of the dual mesh cell, and $R_i$ is the right-hand side residual, which equals zero for a steady-state solution.

3. SHARED MEMORY PARALLELIZATION TECHNIQUE

The parallelization technique for explicit schemes [16] will be generalized for implicit computations in this paper. The method requires no explicit domain decomposition, but is based on the combination of several renumbering and data regrouping techniques developed to avoid or considerably minimize cache-misses, cache-line overwrite, and memory contention.

Renumbering to minimize cache-misses

All unstructured CFD codes contain basic loops over nodes, edges, and elements. If a loop over the edges is considered and cache-misses are a concern, then the storage locations for the required point information should be as close as possible in memory when required by an edge. At the same time, as the loop progresses through the edges, the point information should be accessed as uniformly as possible. This may be achieved by first renumbering the points using a bandwidth-minimization technique such as the reverse Cuthill-McKee [18], wavefront [19], or Peano-Hilbert type space-filling curves [20], and subsequently renumbering the edges according to the minimum point number on each edge [19]. Figure 1a shows an example of the reordered edges and points. The same type of renumbering is done for all entities that serve as basic loops in the code (e.g. elements, boundary faces, etc.). All of these algorithms are of complexity O(N) or at most O(N log N), and well worth the effort.

Fig. 1. Edge and node renumbering. (a) Renumbering to minimize cache-misses. (b) Renumbering to avoid memory contention. (c) Renumbering for 2-processor machine.

Data and loop rearrangements to avoid memory contention

In order to achieve pipelining or vectorization, memory contention issues must be avoided. Memory contention can arise, for instance, in a loop over the edges while writing to the corresponding points. The following example is a typical simplified loop:

Loop 1

      DO 1600 IEDGE=1,NEDGE
        IPOI1=LNOED(1,IEDGE)
        IPOI2=LNOED(2,IEDGE)
        REDGE=F(IPOI1,IPOI2)
        RHSPO(IPOI1)=RHSPO(IPOI1)+REDGE
        RHSPO(IPOI2)=RHSPO(IPOI2)-REDGE
 1600 CONTINUE

Since one and the same point can be accessed more than once from different edges, the information in RHSPO may be corrupted in the pipeline. To make sure that no point is accessed more than once, the loop can be split into several contention-free loops over renumbered edges, see Fig. 1b:

Loop 2

      DO 1400 IPASS=1,NPASS
        NEDG0=EDPAS(IPASS)+1
        NEDG1=EDPAS(IPASS+1)
C$DIR IVDEP                           !PIPELINING DIRECTIVE
        DO 1600 IEDGE=NEDG0,NEDG1
          IPOI1=LNOED(1,IEDGE)
          IPOI2=LNOED(2,IEDGE)
          REDGE=F(IPOI1,IPOI2)
          RHSPO(IPOI1)=RHSPO(IPOI1)+REDGE
          RHSPO(IPOI2)=RHSPO(IPOI2)-REDGE
 1600   CONTINUE
 1400 CONTINUE

Data and loop rearrangements to avoid cache-line overwrite

An auto-parallelizing compiler can parallelize the inner loop in Loop 2. However, as has been mentioned in Ref. 16, such parallelization is not efficient because of both start-up penalties and cache-line overwrite. The start-up penalties are associated with the launching of a parallel loop. To minimize the penalties, the number of passes NPASS should be as small as possible and therefore the vector length should be large. However, when large vector lengths are used, the probability that different processors access the same cache-line is increased. If cache-line overwrite takes place, all processors must update this line, leading to a large increase of interprocessor communication, severe performance degradation, and non-scalability. To keep vector lengths short and enjoy a small start-up cost, a special edge renumbering has been proposed in Ref. 16. Figure 1c illustrates the idea of this renumbering for the case of two processors. The actual loop may look like:

Loop 3

      DO 1000 IMACG=1,NPASG,NPROC
        IMAC0=IMACG
        IMAC1=MIN(NPASG,IMAC0+NPROC-1)
C       PARALLELIZATION DIRECTIVE
C$DOACROSS LOCAL(IPASG)
        DO 1200 IPASG=IMAC0,IMAC1
          CALL LOOP3P(IPASG)
 1200   CONTINUE
 1000 CONTINUE

LOOP3P becomes a subroutine of the form:

      SUBROUTINE LOOP3P(IPASG)
      NPAS0=EDPAG(IPASG)+1
      NPAS1=EDPAG(IPASG+1)
      DO 1400 IPASS=NPAS0,NPAS1
        NEDG0=EDPAS(IPASS)+1
        NEDG1=EDPAS(IPASS+1)
C$DIR IVDEP                           !PIPELINING DIRECTIVE
        DO 1600 IEDGE=NEDG0,NEDG1
          IPOI1=LNOED(1,IEDGE)
          IPOI2=LNOED(2,IEDGE)
          REDGE=F(IPOI1,IPOI2)
          RHSPO(IPOI1)=RHSPO(IPOI1)+REDGE
          RHSPO(IPOI2)=RHSPO(IPOI2)-REDGE
 1600   CONTINUE
 1400 CONTINUE
      RETURN
      END

There is no doubt that this algorithm can be applied to any explicit CFD solver. In the sequel, we consider the extension of this method to an implicit scheme.

4. IMPLICIT TIME INTEGRATION

In order to obtain a steady-state solution, the spatially discretized equations must be integrated in time. Using Euler implicit time-integration, Eq. (2.1) can be written in discrete form as

$$\frac{V_i}{\Delta t}\,\Delta Q_i^n = R_i^{n+1}, \tag{4.1}$$

where $\Delta t$ is the time increment and $\Delta Q^n$ is the difference of the unknown vector between time levels $n$ and $n+1$; i.e.,

$$\Delta Q_i^n = Q_i^{n+1} - Q_i^n. \tag{4.2}$$

Equation (4.1) can be linearized in time as

$$\frac{V_i}{\Delta t}\,\Delta Q_i^n = R_i^n + \frac{\partial R_i^n}{\partial Q}\,\Delta Q^n, \tag{4.3}$$

where $R$ is the right-hand side residual and equals zero for a steady-state solution. Writing the equation for all nodes leads to the delta form of the backward Euler scheme

$$A\,\Delta Q = R^n, \tag{4.4}$$

where

$$A = \frac{V}{\Delta t}\,I - \frac{\partial R^n}{\partial Q}. \tag{4.5}$$

We use a simplified flux function to obtain the left-hand side Jacobian matrix,

$$R_i = -\sum_j \frac{1}{2}\Big[F(Q_i, n_{ij}) + F(Q_j, n_{ij}) - |\lambda_{ij}|\,(Q_j - Q_i)\Big]\,|s_{ij}|, \tag{4.6}$$

where $|s_{ij}|$ is the area of the cell interface between nodes $i$ and $j$, and $\lambda_{ij}$ is the spectral radius,

$$|\lambda_{ij}| = |v_{ij} \cdot n_{ij}| + c_{ij}, \tag{4.7}$$

where $n_{ij}$ is the unit vector normal to the cell interface (taken here as pointing out of cell $i$), $v_{ij}$ is the velocity vector, and $c_{ij}$ is the speed of sound.
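
The spectral radius and the simplified interface flux can be evaluated with a few lines of code. The sketch below is a minimal illustration, assuming that the caller supplies the state vectors and the already evaluated normal fluxes as plain Python sequences; the function names `spectral_radius` and `simplified_flux` are hypothetical.

```python
def spectral_radius(v, n, c):
    """Return |v . n| + c for velocity v, unit normal n and speed of sound c."""
    v_normal = sum(vi * ni for vi, ni in zip(v, n))
    return abs(v_normal) + c


def simplified_flux(f_i, f_j, q_i, q_j, lam):
    """Return 0.5 * (F_i + F_j - |lam| * (Q_j - Q_i)), component by component.

    f_i and f_j are the normal fluxes of the two states q_i and q_j at one
    cell interface. They are inputs here, so no equation of state is needed.
    """
    return [0.5 * (fi + fj - abs(lam) * (qj - qi))
            for fi, fj, qi, qj in zip(f_i, f_j, q_i, q_j)]


if __name__ == "__main__":
    lam = spectral_radius((3.0, 4.0, 0.0), (1.0, 0.0, 0.0), 340.0)
    print("spectral radius:", lam)
    # Identical states on both sides: the dissipation term vanishes.
    print(simplified_flux([1.0, 2.0], [1.0, 2.0], [5.0, 6.0], [5.0, 6.0], lam))
```

When both sides of the interface carry the same state the dissipation term drops out and the function returns the physical flux, which is a quick consistency check for any implementation.
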

Using an edge-based data structure, the left-hand side Jacobian matrix A is stored in lower, upper, and diagonal forms, which can be expressed as

$$A = L + U + D, \tag{4.8}$$

where

$$L = \frac{1}{2}\left[\frac{\partial F(Q_j, n_{ij})}{\partial Q_j} - |\lambda_{ij}|\,I\right]|s_{ij}|, \qquad j < i, \tag{4.9}$$

$$U = \frac{1}{2}\left[\frac{\partial F(Q_j, n_{ij})}{\partial Q_j} - |\lambda_{ij}|\,I\right]|s_{ij}|, \qquad j > i, \tag{4.10}$$

$$D = \frac{V}{\Delta t}\,I + \sum_j \frac{1}{2}\left[\frac{\partial F(Q_i, n_{ij})}{\partial Q_i} + |\lambda_{ij}|\,I\right]|s_{ij}|. \tag{4.11}$$

Equation (4.4) represents a system of linear simultaneous algebraic equations and needs to be solved at each time step. The most widely used methods to solve this linear system are iterative solution methods and approximate factorization methods. In Ref. 1 it has been shown that the matrix-free GMRES+LU-SGS method results in very good convergence for unstructured meshes. Since our goal is not to solve the system entirely by the LU-SGS approximate factorization but rather to use GMRES with an appropriate preconditioner, the preconditioner must be very fast, and at the same time it should resemble the original Jacobian matrix A as closely as possible. Preconditioning will be cost-effective only if the additional computational work incurred for each subiteration is compensated for by a reduction in the total number of iterations to convergence. Thus, even a moderate inefficiency in the parallelization of the preconditioner can be critical. Next, the following matrix-free methods are considered as candidates for the GMRES preconditioner:

1. The LU-SGS;
2. Data-Parallel Lower-Upper Relaxation (DP-LUR) method, which by its nature is a Jacobi iterative method;
3. Symmetric Gauss-Seidel (SGS) relaxation method. The LU-SGS approximate factorization scheme is just a subset of the SGS method and corresponds to the SGS scheme with $k_{max}=1$.

1. The LU-SGS approximate factorization is described as follows:

$$(D + L)\,D^{-1}(D + U)\,\Delta Q = R + (L D^{-1} U)\,\Delta Q. \tag{4.12}$$

Neglecting the last term on the right-hand side of Eq. (4.12), and assuming that

$$\frac{\partial F}{\partial Q}\,\Delta Q \approx \Delta F = F(Q + \Delta Q) - F(Q), \tag{4.13}$$

the system can be solved in two steps. First, a lower (forward) sweep:

$$(D + L)\,\Delta Q^{*} = R, \tag{4.14}$$

or, in matrix-free form:

$$\Delta Q_i^{*} = D^{-1}\Big[R_i - \frac{1}{2}\sum_{j \in L(i)} \big(\Delta F_j^{*} - |\lambda_{ij}|\,\Delta Q_j^{*}\big)\,|s_{ij}|\Big]; \tag{4.15}$$

and second, an upper (backward) sweep:

$$(D + U)\,\Delta Q = D\,\Delta Q^{*}, \tag{4.16}$$

or:

$$\Delta Q_i = \Delta Q_i^{*} - D^{-1}\,\frac{1}{2}\sum_{j \in U(i)} \big(\Delta F_j - |\lambda_{ij}|\,\Delta Q_j\big)\,|s_{ij}|. \tag{4.17}$$

The most remarkable feature of this approximation is that there is no need to store the upper and lower matrices U and L, which substantially reduces the memory requirements. It is found that this approximation does not compromise any numerical accuracy, and the extra computational cost is negligible. These sweeps can be vectorized with long vector lengths by using a special ordering technique [7], but parallelization of the LU-SGS algorithm is not straightforward due to inherent data dependencies.

2. The DP-LUR method has been successfully used in Ref. 21 as a substitute for the LU-SGS method for massively parallel computer implementation. The method has no inherent data dependencies, so it can be easily parallelized in the same way as an explicit scheme. The method can be described in the following way. The first subiteration:

$$\Delta Q^{0} = D^{-1} R. \tag{4.18}$$

Then the $k_{max}$ subiterations are made using

$$\Delta Q^{k+1} = D^{-1}\big(R - (U + L)\,\Delta Q^{k}\big), \tag{4.19}$$

which can be written in matrix-free form as

$$\Delta Q_i^{k+1} = D^{-1}\Big[R_i - \frac{1}{2}\sum_{j} \big(\Delta F_j^{k} - |\lambda_{ij}|\,\Delta Q_j^{k}\big)\,|s_{ij}|\Big], \tag{4.20}$$

where $k$ is the subiteration number. With this approach, the data required for each subiteration has already been computed during the previous subiteration. Therefore, the entire subiteration may be performed simultaneously, and there are no data dependencies.

3. Symmetric Gauss-Seidel relaxation. First, zero the array:

$$\Delta Q^{0} = 0. \tag{4.21}$$

Then the $k_{max}$ subiterations are made using a forward sweep:

$$(D + L)\,\Delta Q^{k+1/2} = R - U\,\Delta Q^{k}, \tag{4.22}$$

and then a backward sweep:

$$(D + U)\,\Delta Q^{k+1} = R - L\,\Delta Q^{k+1/2}, \tag{4.23}$$

which can be written in matrix-free form as forward sweep:

$$\Delta Q_i^{k+1/2} = D^{-1}\Big[R_i - \frac{1}{2}\sum_{j \in L(i)} \big(\Delta F_j^{k+1/2} - |\lambda_{ij}|\,\Delta Q_j^{k+1/2}\big)|s_{ij}| - \frac{1}{2}\sum_{j \in U(i)} \big(\Delta F_j^{k} - |\lambda_{ij}|\,\Delta Q_j^{k}\big)|s_{ij}|\Big], \tag{4.24}$$

and backward sweep:

$$\Delta Q_i^{k+1} = D^{-1}\Big[R_i - \frac{1}{2}\sum_{j \in U(i)} \big(\Delta F_j^{k+1} - |\lambda_{ij}|\,\Delta Q_j^{k+1}\big)|s_{ij}| - \frac{1}{2}\sum_{j \in L(i)} \big(\Delta F_j^{k+1/2} - |\lambda_{ij}|\,\Delta Q_j^{k+1/2}\big)|s_{ij}|\Big]. \tag{4.25}$$

For one subiteration ($k_{max}=1$), the SGS method is equivalent to the LU-SGS approximate factorization method.

5. PARALLELIZATION OF THE PRECONDITIONER

Since parallelization of the DP-LUR method is straightforward, we will discuss only the parallelization of the Symmetric Gauss-Seidel methods. There are two approaches to the solution of the problem:

1. Use a special scheduling algorithm which enables data parallelism by regrouping edges. This method has the advantage of producing exactly the same result as the single processor case, but it suffers from severe overhead penalties for parallel loop initiation, heavy interprocessor communications and poor load balance.
2. Split the computational domain into several non-overlapping regions according to the number of processors, and apply the SGS method inside of each region with (or without) some special interprocessor boundary treatment [23-27]. This approach may suffer from convergence degradation but takes advantage of minimal parallelization overhead and good load balance.

Our experience with the shared-memory SGI Origin 2000 computer has shown that the first method doesn't provide good scalability, so we will consider the second approach here.

For testing purposes we computed a transonic flow in a channel with a 10% circular bump on the lower wall. The length of the channel is 3, its height is 1, and its width is 0.5. The inlet Mach number is 0.675. This is a three-dimensional simulation of a two-dimensional flow. The tetrahedral mesh was automatically generated by the advancing front technique and contains 3,56 grid points, 64,595 elements, and 8,756 boundary triangles. The mesh and computed pressure contours are shown in Figs. 2a and 2b. All computations were run with an essentially infinite time step (CFL=$10^4$).

Fig. 2. Flow in channel with circular bump. (a) Surface mesh. (b) Computed pressure contours on the channel surface at M=0.675.
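
The relation between the three preconditioner candidates of Section 4 can be checked on a small linear system. The sketch below is a minimal illustration only: it assumes a dense matrix A = L + D + U stored as nested lists with a scalar diagonal, whereas the solver itself is matrix-free and edge-based. The function names are hypothetical.

```python
def dp_lur(A, r, kmax):
    """Jacobi-type relaxation: x0 = D^-1 r, then kmax updates with old data only."""
    n = len(r)
    x = [r[i] / A[i][i] for i in range(n)]
    for _ in range(kmax):
        x = [(r[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x


def sgs(A, r, kmax):
    """Symmetric Gauss-Seidel: start from zero, forward then backward sweep."""
    n = len(r)
    x = [0.0] * n
    for _ in range(kmax):
        for i in range(n):                       # forward sweep, newest data for j < i
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (r[i] - s) / A[i][i]
        for i in reversed(range(n)):             # backward sweep, newest data for j > i
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (r[i] - s) / A[i][i]
    return x


def lu_sgs(A, r):
    """Approximate factorization: (D+L) x* = r, then (D+U) x = D x*."""
    n = len(r)
    xs = [0.0] * n
    for i in range(n):
        xs[i] = (r[i] - sum(A[i][j] * xs[j] for j in range(i))) / A[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = xs[i] - sum(A[i][j] * x[j] for j in range(i + 1, n)) / A[i][i]
    return x


def residual_norm(A, x, r):
    """Maximum norm of r - A x."""
    n = len(r)
    return max(abs(r[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    r = [1.0, 2.0, 3.0]
    print("LU-SGS      :", lu_sgs(A, r))
    print("SGS, kmax=1 :", sgs(A, r, 1))
    print("DP-LUR, k=0 :", dp_lur(A, r, 0))
```

On this example one SGS subiteration reproduces the LU-SGS result to rounding error, and DP-LUR with no subiterations reduces to a diagonal scaling of the right-hand side.
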

There are several methods to obtain a good partitioning of the computational domain into blocks [28]. In our case we use the fact that the grid nodes are already renumbered to minimize bandwidth, so we cut the entire array of nodes into equally sized pieces corresponding to the number of processors. This technique is very simple and provides perfect load balancing. Though the method doesn't provide good control over minimization of the interprocessor boundary, it will be shown that this issue can be addressed by using alternative node renumbering techniques. In addition, since shared-memory platforms are considered, the interprocessor communication overhead is not tightly connected to the area of the interprocessor boundaries.

The partitioning into  blocks using the wavefront [19] renumbering is shown in Fig. 3a. Figure 3b shows a similar partitioning into 5 blocks. The wavefront renumbering results in very narrow slices, thus a Peano-Hilbert type space-filling curve was also considered to renumber the points. An example of such a curve is shown in Fig. 4a. This curve was obtained using Morton's algorithm. The  blocks partitioning corresponding to the Peano-Hilbert renumbering is shown in Fig. 4b, while the 5 blocks partitioning is shown in Fig. 4c.

Fig. 3. Blocks with the wavefront renumbering. (a)  blocks. (b) 5 blocks.

Fig. 4. (a) Peano-Hilbert-Morton space-filling curve. (b)  blocks obtained with the Peano-Hilbert renumbering. (c) 5 blocks obtained with the Peano-Hilbert renumbering.

Next, the implementation of the LU-SGS scheme on parallel non-overlapping blocks is considered. Figure 5a shows an example of a grid point surrounded by nodes belonging to the same block. All surrounding nodes are divided into two groups L and U for lower and upper matrix computations correspondingly (see Eqs. (4.24-25)). At first, the SGS is used locally on each processor without any contribution from interprocessor boundaries. Consider point i, which has neighbors belonging to different blocks (Fig. 5b). If there is no exchange between the blocks, the L and U sets will look as shown in Fig. 5b, and the contributions from the three gray-colored nodes of processor A are not computed. This approach has been tested using the LU-SGS scheme (without the GMRES) on  and 5 blocks, with the wavefront node renumbering. The test computation was performed on a single processor Pentium III PC. Figure 6 demonstrates that convergence severely degrades for the -block case and stalls for the 5-block case.

The second approach for parallelization is the so-called hybrid LU-SGS or HLU-SGS. A similar scheme was used in Ref. 27 for structured grids. This scheme uses the DP-LUR for interprocessor edges, and the regular SGS scheme for edges internal to each block. It is easier to consider Eqs. (4.24-25) to understand the method. Schematically, the method is shown in Figs. 5c and 5d. Figure 5c corresponds to the forward sweep, and Fig. 5d corresponds to the backward sweep of the SGS procedure. In the SGS scheme, when the forward sweep is performed, the upper matrix computation has no data dependency. Conversely, when the backward sweep is performed, the lower matrix computation has no data dependency.

Fig. 5. Stencil for Gauss-Seidel scheme. (a) Internal point. (b) Interface point without interprocessor communications. (c) Hybrid SGS forward sweep. (d) Hybrid SGS backward sweep.

The results of the HLU-SGS computation (with k=) on  and 5 blocks are shown in Fig. 6. It is demonstrated that the hybrid scheme has some advantages over the LU-SGS scheme.

Fig. 6. Convergence history for LU-SGS and hybrid LU-SGS schemes without GMRES on 1, , and 5 blocks with wavefront renumbering. Horizontal axis: CPU Time (s).

Next, consider how the LU-SGS, HLU-SGS, and DP-LUR schemes work as a preconditioner for the GMRES method. In our computations we used the same version of GMRES as in Ref. 1 with  search directions,  iterations and solution tolerance set to .. Results of the DP-LUR scheme as preconditioner are shown in Fig. 7a. The advantages of this method are its easy parallelization and lack of dependency on the number of processors. The influence of the number of subiterations $k_{max}$ is also shown in Fig. 7a. Note that $k_{max}=0$ is equivalent to a diagonal preconditioner. It is inefficient to use more than one subiteration, since an increase of $k_{max}$ yields no improvement in the convergence rate. When the DP-LUR scheme is not used as a preconditioner, the result is reversed: an increase in the number of subiterations improves performance.

Figure 7b illustrates the influence of the number of subiterations in the SGS preconditioner on convergence. These computations were performed using one single block. The test with k=1, which is equivalent to the LU-SGS preconditioner, gives the best performance overall, in contrast with the results obtained with the SGS scheme alone, which converges better when more subiterations are used (usually up to ). This can be explained by the fact that the GMRES iterative procedure is more efficient than the SGS iterations.

Next, increasing the number of blocks is considered. Previously, some authors [27] used several preconditioner subiterations to reduce its degradation. Our results using 5 blocks (Fig. 7c) show that increasing the number of subiterations actually leads to slower convergence.

It was demonstrated that with an increasing number of blocks, the LU-SGS scheme suffers from performance degradation. Let us check how this fact influences the behavior of the GMRES scheme. Figure 8a shows convergence rate comparisons for 1, , , and 5 blocks using a simple LU-SGS preconditioner without hybrid treatment of interprocessor boundaries and with the wavefront node renumbering. The diagonal preconditioner result is also shown because it represents the worst scenario of the LU-SGS scheme, when the number of blocks is equal to the number of grid points. Figure 8b shows the corresponding results for the hybrid LU-SGS scheme. The hybrid scheme is a better choice for a large number of blocks. The worst case for the hybrid scheme corresponds to the DP-LUR preconditioner with $k_{max}=1$.

Fig. 7. Convergence history. (a) GMRES+DP-LUR scheme, single block. (b) GMRES+SGS scheme, single block. (c) GMRES+Hybrid SGS scheme, 5 blocks. Horizontal axes: CPU Time (s).
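
The hybrid treatment of block interfaces can be illustrated on a small dense system. The sketch below is a minimal illustration under stated assumptions: the matrix is stored explicitly as nested lists with a scalar diagonal, `block[i]` is a hypothetical array giving the block that owns node i, and neighbors in the same block are treated Gauss-Seidel fashion while neighbors in other blocks always contribute values from the previous half-sweep.

```python
def hybrid_sgs(A, r, block, kmax=1):
    """Hybrid SGS sketch: Gauss-Seidel inside a block, Jacobi-type across blocks."""
    n = len(r)
    x = [0.0] * n
    for _ in range(kmax):
        old = x[:]                        # values of the previous subiteration
        half = old[:]
        for i in range(n):                # forward sweep
            s = 0.0
            for j in range(n):
                if j == i:
                    continue
                # Fresh data only from lower-numbered nodes of the same block.
                fresh = j < i and block[j] == block[i]
                s += A[i][j] * (half[j] if fresh else old[j])
            half[i] = (r[i] - s) / A[i][i]
        new = half[:]
        for i in reversed(range(n)):      # backward sweep
            s = 0.0
            for j in range(n):
                if j == i:
                    continue
                # Fresh data only from higher-numbered nodes of the same block.
                fresh = j > i and block[j] == block[i]
                s += A[i][j] * (new[j] if fresh else half[j])
            new[i] = (r[i] - s) / A[i][i]
        x = new
    return x


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    r = [1.0, 2.0, 3.0]
    print("one block        :", hybrid_sgs(A, r, [0, 0, 0]))
    print("two blocks       :", hybrid_sgs(A, r, [0, 0, 1]))
    print("one node per block:", hybrid_sgs(A, r, [0, 1, 2]))
```

With a single block the routine reduces to one ordinary SGS subiteration, and with one node per block it reduces to a diagonal scaling followed by one Jacobi-type update, which are the two limiting cases discussed in the text.
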

Fig. 8. Convergence history. (a) GMRES+LU-SGS with wavefront renumbering. (b) GMRES+hybrid LU-SGS with wavefront renumbering. (c) GMRES+LU-SGS with Peano-Hilbert renumbering. (d) GMRES+hybrid LU-SGS with Peano-Hilbert renumbering. Horizontal axes: CPU Time (s).

Figures 8c and 8d show the results of computations with the Peano-Hilbert renumbering. Comparison with the results shown in Figs. 8a and 8b shows that the gain from a good renumbering technique is much more important than the gain from the hybrid SGS scheme.

We conclude that the LU-SGS scheme used as a preconditioner for GMRES is not very sensitive to partitioning into blocks. Both the LU-SGS scheme and the HLU-SGS scheme can be used as a preconditioner. When a large number of processors is required, it is better to pay attention to the domain-splitting technique; in our case Peano-Hilbert reordering gives good results. If the number of processors doesn't exceed , it is not important how the domain is divided into blocks.
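
The block partitioning used in this section (renumber the nodes along a space-filling curve, then cut the node array into equal pieces) can be sketched as follows. This is a minimal illustration, assuming a plain Morton (Z-order) key built by bit interleaving of quantized coordinates; it is not the authors' renumbering code, and the function names and the `bits` parameter are hypothetical.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the lowest `bits` bits of three non-negative integer indices."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def morton_order(points, bits=10):
    """Return the node indices sorted along the Morton curve through `points`."""
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]
    cells = (1 << bits) - 1

    def quantize(p):
        # Map each coordinate to an integer cell index in [0, cells].
        return [int((p[d] - lo[d]) / (hi[d] - lo[d]) * cells) if hi[d] > lo[d] else 0
                for d in range(3)]

    return sorted(range(len(points)),
                  key=lambda i: morton_key(*quantize(points[i]), bits=bits))


def split_into_blocks(nnode, nblock):
    """Cut the renumbered node range [0, nnode) into nearly equal contiguous blocks."""
    bounds = [(b * nnode) // nblock for b in range(nblock + 1)]
    return [(bounds[b], bounds[b + 1]) for b in range(nblock)]


if __name__ == "__main__":
    pts = [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print("new node order:", morton_order(pts, bits=1))
    print("block ranges  :", split_into_blocks(10, 3))
```

Because the blocks are contiguous index ranges of the renumbered node array, their sizes differ by at most one node, which is the load-balancing property noted above; the shape of the blocks depends entirely on the renumbering.
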

6. APPLICATION OF THE GMRES+LU-SGS SCHEME TO LARGE-SCALE COMPUTATIONS

The computations were performed on an SGI Origin 2000 computer with R processors.

ONERA M6 Wing Configuration

The first application is an inviscid transonic flow over an ONERA M6 wing. The M6 wing has a leading-edge sweep angle of 30°, an aspect ratio of 3.8, and a taper ratio of 0.562. The airfoil section of the wing is the ONERA D airfoil, which is a 10% maximum thickness-to-chord ratio conventional section. The flow solutions are presented at a Mach number of 0.84 and an angle of attack of 3.06°. The mesh used in the computation consists of 74,95 elements, 36,5 grid points, and ,76 boundary points. The computed absolute velocity contours on the wing surface are displayed in Fig. 9. The upper surface contours clearly show the sharply captured lambda-type shock structure formed by the two inboard shock waves, which merge near the 80% semispan to form the single strong shock wave in the outboard region of the wing. The computed pressure coefficient distributions are compared with experimental data 3 in Fig. 11. We can observe that there is only one grid point within the shock structure; this demonstrates the sharp shock-capturing ability of the AUSM+ scheme. The results obtained compare closely with the experimental data. The convergence history for 1, 2, 4, 6, 8, 10, 12, and 16 processors is shown in Fig. 10. No serious convergence degradation is observed.

Fig. 9. ONERA M6 wing. Absolute velocity contours. M=0.84, angle of attack 3.06°.

Fig. 10. Residual convergence history versus time steps for ONERA M6.

Fig. 11. ONERA M6. Cp profile at 20% semispan (a) and 44% semispan (b). Axes: -Cp versus X/C. Curves: Computation, Experiment.

Wing/Pylon/Finned-Store (Eglin) Configuration

Another test case was conducted for the wing/pylon/finned-store configuration reported in Ref. 3, which consists of a clipped delta wing, 45° sweep, composed of a constant NACA 64 symmetric airfoil section. The wing has a root chord of 6 in, a semispan of 3 in, and a taper ratio of .34. The pylon is located at the midspan station. The width of the pylon is .94 in. A constant NACA8 airfoil section with a leading-edge sweep of 45° and a truncated tip defines the four fins of the store. The mesh used in the computation is shown in Fig. 12a. It contains ,39,694 elements, 39,547 grid points, and 7,359 boundary points. The flow solutions are presented at a Mach number of 0.95 and an angle of attack of 0°. Figures 12b and 12c show the pressure contours on the upper and lower wing surfaces, respectively. The convergence history for the computation with 6 processors is shown in Fig. 13.

The resulting speedups for the bump case, the ONERA M6 case, and the Eglin case are shown in Fig. 14. The speedup was measured by timing the CPU time of one time step on different numbers of processors. The performance degrades with the number of processors. This is to be expected, as the increasing number of passes results in higher relative loop costs. The resulting speedup is very similar to the one obtained for the explicit scheme, see Ref. 16.

Time Accurate Simulation of Aircraft Canopy Trajectory

For the time accurate implicit computation we use the same method applied in Ref. 2. The method is based on pseudo-timesteps. In this method, Eq. (2.3) is transformed to the following form:

$$\frac{\partial (VQ)}{\partial \tau} + \frac{V^{n+1}Q^{n+1} - V^{n}Q^{n}}{\Delta t} = \alpha R^{n+1} + (1 - \alpha) R^{n}, \tag{6.1}$$

where $\tau$ is the pseudo time variable and $n$ denotes the time level. If $\alpha=1$, the scheme is the backward Euler method.
If $\alpha=0.5$, the resulting scheme, known as the Crank-Nicolson method, is second-order in time. This yields the following system of linear equations

$$\left[V^{n+1}\left(\frac{1}{\Delta\tau} + \frac{1}{\alpha\,\Delta t}\right) I - \frac{\partial R^{m}}{\partial Q}\right]\Delta Q^{m} = R^{m} + \frac{1-\alpha}{\alpha}\,R^{n} - \frac{V^{n+1}Q^{m} - V^{n}Q^{n}}{\alpha\,\Delta t}, \tag{6.2}$$

where $m$ counts the pseudo-time iterations, $\Delta Q^{m} = Q^{m+1} - Q^{m}$, and the constant factor $1/\alpha$ multiplying the pseudo-time term has been absorbed into $\Delta\tau$. It has been shown in Ref. 2 that the fully implicit scheme is more accurate than its linearized counterpart, since it requires several subiterations to achieve convergence on each time step.

The new method was used to compute an F/A-18 C/D fighter canopy ejection. During the initial opening of the canopy, a number of topological changes occur in the geometry. The problem was computed with Mach number 0.76. The computation was started at t=6 ms, when the canopy just started to move, see Fig. 15, and ended at t=5 ms when the canopy has moved to the tail part of the plane. At the initial stage the canopy was hinged to the plane. After rotating 45 degrees (at t=4 ms), the canopy was released and allowed to move in response to the forces exerted on it. The mesh at several time instances as well as the velocity field is shown in Fig. 16. The average size of the mesh was 5, points and ,3, elements. Mesh size varied a little during the computation. More detailed data on this problem can be found in Ref. 33. A time step corresponding to the CFL number of 5 was used in the computation. The simulation required approximately 4 CPU hours on 8 processors of the SGI Origin 2000. The plot of CPU time vs. problem time is shown in Fig. 17. The curve is not straight but rather has 8 steps attributed to the global remeshing required by the algorithm. The remeshing was done by a new parallel algorithm described in Ref. 34. This computation is more than  times faster as compared to our explicit computation, see Ref. 33. Only 84 time steps were required (compare with approximately 3, time steps with the explicit scheme). This significant reduction in CPU requirements is attributed to the implicit GMRES scheme and parallel remeshing. Figure 18 shows the speedup of the time accurate computation. This speedup was computed by measuring the CPU time for a single timestep. The time accurate speedup result is very similar to those obtained for the steady-state cases.
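
The pseudo-timestep approach used for the time-accurate computation can be illustrated on a scalar model problem. The sketch below is a minimal illustration, assuming unit cell volume, the linear residual R(q) = -a q, and a pseudo-time iteration of the form reconstructed in Eq. (6.2); the function and parameter names are hypothetical and the routine is not the authors' solver.

```python
def dual_time_step(q_n, dt, alpha, a, dtau=1.0e30, tol=1.0e-12, max_iter=200):
    """Advance dq/dt = R(q) = -a*q by one physical step using pseudo-time iterations.

    alpha = 1 gives backward Euler, alpha = 0.5 gives Crank-Nicolson.
    """
    def residual(q):
        return -a * q

    dR_dq = -a                                    # exact Jacobian of the model residual
    q_m = q_n                                     # pseudo-time iterate, starts at level n
    for _ in range(max_iter):
        lhs = 1.0 / dtau + 1.0 / (alpha * dt) - dR_dq
        rhs = (residual(q_m)
               + (1.0 - alpha) / alpha * residual(q_n)
               - (q_m - q_n) / (alpha * dt))
        dq = rhs / lhs
        q_m += dq
        if abs(dq) < tol:                         # pseudo-time iteration has converged
            break
    return q_m


if __name__ == "__main__":
    for alpha in (1.0, 0.5):
        print("alpha =", alpha, "->", dual_time_step(1.0, 0.1, alpha, 2.0))
    # A finite pseudo-time step needs several subiterations for the same answer.
    print("finite dtau ->", dual_time_step(1.0, 0.1, 0.5, 2.0, dtau=0.5))
```

For this linear model a very large pseudo-time step converges in one iteration, while a finite one needs several subiterations per physical step but reaches the same converged value, which satisfies the discrete time-accurate equation.
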

Fig. 12. (a) Surface mesh used for Wing/Pylon/Finned-Store configuration. (b) Computed pressure contours on the upper surface. (c) Computed pressure contours on the lower surface.

Fig. 13. Residual convergence history versus time steps for Wing/Pylon/Finned-Store configuration on 6 processors.

Fig. 14. Speedups in computations of the channel with circular bump, ONERA M6, and Wing/Pylon/Finned-Store (Eglin) configurations. Axes: Speedup versus Processors. Curves: Perfect case, Bump, ONERA M6, Eglin.

Fig. 15. F/A-18 C/D fighter canopy ejection.

CONCLUSIONS

A parallelization technique for the matrix-free GMRES+LU-SGS unstructured grid method on shared-memory machines is proposed. The method requires no direct domain partitioning and can easily be combined with mesh refinement and remeshing procedures. Special attention is given to the parallel implementation of the GMRES preconditioner. It is shown that for a moderate number of processors, the LU-SGS method without interprocessor data exchange is a good choice. The hybrid LU-SGS scheme works slightly better for a higher number of processors. Proper node renumbering is critical to the efficiency of the method. For parallelization of the implicit scheme the Peano-Hilbert type renumbering demonstrated the best results. Even though the method's efficiency degrades with an increasing number of processors, the degradation is proven to be small and the method always maintains its stability, since the worst case corresponds to the GMRES scheme with diagonal preconditioning, which is proven to be stable for the Euler computations. The method has been successfully applied to several steady-state and time-accurate 3-D simulations. Significant savings in CPU time are achieved as compared to the previous version of the code, which utilized explicit Runge-Kutta time integration.

Fig. 6. F/A-18 C/D fighter canopy ejection. Surface mesh and absolute velocity contours.

Fig. 7. Canopy ejection. CPU time (h) on 8 processors versus problem time (ms).

Fig. 8. Speedup for the canopy ejection case (speedup versus number of processors).

REFERENCES

1. Luo, H., Baum, J.D., and Löhner, R., A Fast, Matrix-Free Implicit Method for Compressible Flows on Unstructured Grids, Journal of Computational Physics, Vol. 146, pp. 664-690, 1998.
2. Luo, H., Baum, J.D., and Löhner, R., An Accurate, Fast Matrix-Free Implicit Method for Computing Unsteady Flows on Unstructured Grids, AIAA.
3. Saad, Y., and Schultz, M.H., GMRES: a Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems, SIAM J. Sci. Stat. Comp., Vol. 7, No. 3 (1988), pp. 856-869.
4. Jameson, A., and Yoon, S., Lower-Upper Implicit Schemes with Multiple Grids for the Euler Equations, AIAA J., Vol. 25, No. 7, pp. 929-935, 1987.
5. Soetrisno, M., Imlay, S.T., and Roberts, D.W., A Zonal Implicit Procedure for Hybrid Structured-Unstructured Grids, AIAA.
6. Men'shov, I., Nakamura, Y., An Implicit Advection Upwind Splitting Scheme for Hypersonic Air Flows in Thermochemical Nonequilibrium, 6th Int. Symp. on CFD, pp. 85-8.
7. Sharov, D., Nakahashi, K., Reordering of Hybrid Unstructured Grids for Lower-Upper Symmetric Gauss-Seidel Computations, AIAA J., Vol. 36, No. 3, pp. 484-486, 1998.
8. Williams, D., Performance of Dynamic Load Balancing Algorithms for Unstructured Grid Calculations; CalTech Rep. C3P93 (99).
9. Simon, H., Partitioning of Unstructured Problems for Parallel Processing; NASA Ames Tech. Rep. RNR-9-8 (99).
10. Mehrotra, P., Saltz, J., Voigt, R. (eds.), Unstructured Scientific Computation on Scalable Multiprocessors; MIT Press (99).
11. Vidwans, A., Kallinderis, Y., Venkatakrishnan, V., A Parallel Load Balancing Algorithm for 3-D Adaptive Unstructured Grids; AIAA-9333-CP (1993).
12. Löhner, R., Three-Dimensional Fluid-Structure Interaction Using a Finite Element Solver and Adaptive Remeshing; Computer Systems in Engineering, , 577 (99).
13. Löhner, R., and Baum, J.D., Adaptive H-Refinement on 3-D Unstructured Grids for Transient Problems; Int. J. Num. Meth. Fluids, 4, pp. 479 (99).
14. Haug, E., Charlier, H., et al., Recent Trends and Developments of Crashworthiness Simulation Methodologies and their Integration into the Industrial Vehicle Design Cycle; Proc. Third European Cars/Trucks Simulation Symposium (ASIMUTH), Oct. 8 (99).
15. Ramamurti, R., and Löhner, R., Simulation of Flow Past Complex Geometries Using a Parallel Implicit Incompressible Flow Solver; pp. 49-5, Proc. 11th AIAA CFD Conf., Orlando, FL, July (1993).
16. Löhner, R., Renumbering Strategies for Unstructured-Grid Solvers Operating on Shared-Memory, Cache-Based Parallel Machines, AIAA 9745, 1997.
17. Liou, M.S., Progress towards an Improved CFD Method: AUSM+, AIAA 95-7 (1995).
18. Cuthill, E., and McKee, J., Reducing the Bandwidth of Sparse Symmetric Matrices; Proc. ACM Nat. Conf., New York 1969, pp. 157-172 (1969).
19. Löhner, R., Some Useful Renumbering Strategies for Unstructured Grids; Int. J. Num. Meth. Eng., 36, pp. 3597 (1993).
20. Sagan, H., Space-Filling Curves, Springer Verlag, New York, 1994.
21. Candler, G.V., and Wright, M.J., Data-Parallel Lower-Upper Relaxation Method for Reacting Flows, AIAA Journal, Vol. 32, No. 12, pp. 2380-2386, 1994.
22. Povitsky, A., Morris, P.J., Parallel Compact Multi-Dimensional Numerical Algorithm with Application to Aeroacoustics, AIAA 997 (1999).
23. Jenssen, C.B., Implicit Multiblock Euler and Navier-Stokes Calculations, AIAA Journal, 3, No. 9, pp. 88-84.
24. Sheng, C., Hyams, D., et al., Three-Dimensional Incompressible Navier-Stokes Flow Computations About Complete Configurations Using a Multiblock Unstructured Grid Approach, AIAA (1999).
25. Stoll, P., Gerlinger, P., Bruggemann, D., Domain Decomposition for an Implicit LU-SGS Scheme using Overlapping Grids, AIAA (1997).
26. Wissink, A.W., Lyrintzis, A.S., and Strawn, R.C., Parallelization of a Three-Dimensional Flow Solver for Euler Rotorcraft Aerodynamics Predictions, AIAA Journal, 34, No., pp. 76-83.
27. Wissink, A.W., Lyrintzis, A.S., Chronopoulos, A.T., A Parallel Newton-Krylov Method for Rotorcraft Flowfield Calculations, AIAA-97-49, pp. 6-7.
28. Flower, J., Otto, S., Salama, M., Optimal Mapping of Irregular Finite Element Domains to Parallel Processors; pp. 395 (99).
29. Venkatakrishnan, V., Simon, H.D., Barth, T.J., A MIMD Implementation of a Parallel Euler Solver for Unstructured Grids; NASA Ames Tech. Rep. RNR-94 (99).
30. Löhner, R., Ramamurti, R., A Load Balancing Algorithm for Unstructured Grids; Comp. Fluid Dyn., 5 (1995).
31. Schmitt, V., Charpin, F., Pressure Distributions on the ONERA M6 Wing at Transonic Mach Numbers, Experimental Data Base for Computer Program Assessment, AGARD AR8.
32. Heim, E.R., CFD Wing/Pylon/Finned Store Mutual Interference Wind Tunnel Experiment, AEDC-TSR-91-P4, Arnold Engineering Development Center, Arnold AFB, TN, Jan. 1991.
33. Baum, J.D., Löhner, R., Marquette, T.J., Luo, H., Numerical Simulation of Aircraft Canopy Trajectory, AIAA (1997).
34. Löhner, R., A Parallel Advancing Front Grid Generation Scheme, AIAA-5.

RECENT research on structured mesh flow solver for aerodynamic problems shows that for practical levels of

RECENT research on structured mesh flow solver for aerodynamic problems shows that for practical levels of A Hgh-Order Accurate Unstructured GMRES Algorthm for Invscd Compressble Flows A. ejat * and C. Ollver-Gooch Department of Mechancal Engneerng, The Unversty of Brtsh Columba, 054-650 Appled Scence Lane,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation Precondtonng Parallel Sparse Iteratve Solvers for Crcut Smulaton A. Basermann, U. Jaekel, and K. Hachya 1 Introducton One mportant mathematcal problem n smulaton of large electrcal crcuts s the soluton

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

An inverse problem solution for post-processing of PIV data

An inverse problem solution for post-processing of PIV data An nverse problem soluton for post-processng of PIV data Wt Strycznewcz 1,* 1 Appled Aerodynamcs Laboratory, Insttute of Avaton, Warsaw, Poland *correspondng author: wt.strycznewcz@lot.edu.pl Abstract

More information

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids)

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids) Structured meshes Very smple computatonal domans can be dscretzed usng boundary-ftted structured meshes (also called grds) The grd lnes of a Cartesan mesh are parallel to one another Structured meshes

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain AMath 483/583 Lecture 21 May 13, 2011 Today: OpenMP and MPI versons of Jacob teraton Gauss-Sedel and SOR teratve methods Next week: More MPI Debuggng and totalvew GPU computng Read: Class notes and references

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Wavefront Reconstructor

Wavefront Reconstructor A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

An Accurate Evaluation of Integrals in Convex and Non convex Polygonal Domain by Twelve Node Quadrilateral Finite Element Method

An Accurate Evaluation of Integrals in Convex and Non convex Polygonal Domain by Twelve Node Quadrilateral Finite Element Method Internatonal Journal of Computatonal and Appled Mathematcs. ISSN 89-4966 Volume, Number (07), pp. 33-4 Research Inda Publcatons http://www.rpublcaton.com An Accurate Evaluaton of Integrals n Convex and

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

a tree-dmensonal settng. In te presented approac, we construct te ne mes by renng an exstng coarse mes and updatng te nodes of te ne mes accordng to t

a tree-dmensonal settng. In te presented approac, we construct te ne mes by renng an exstng coarse mes and updatng te nodes of te ne mes accordng to t Parallel Two-Level Metods for Tree-Dmensonal Transonc Compressble Flow Smulatons on Unstructured Meses R. Atbayev a, X.-C. Ca a, and M. Parascvou b a Department of Computer Scence, Unversty of Colorado,

More information

Dynamic wetting property investigation of AFM tips in micro/nanoscale

Dynamic wetting property investigation of AFM tips in micro/nanoscale Dynamc wettng property nvestgaton of AFM tps n mcro/nanoscale The wettng propertes of AFM probe tps are of concern n AFM tp related force measurement, fabrcaton, and manpulaton technques, such as dp-pen

More information

OVERSET UNSTRUCTURED GRID METHOD FOR FLOW SIMULATION OF COMPLEX AND MULTIPLE BODY PROBLEMS

OVERSET UNSTRUCTURED GRID METHOD FOR FLOW SIMULATION OF COMPLEX AND MULTIPLE BODY PROBLEMS ICAS 2000 CONGRESS OVERSET UNSTRUCTURED GRID METHOD FOR FLOW SIMULATION OF COMPLEX AND MULTIPLE BODY PROBLEMS Kazuhro Nakahash, Fumya Togash Tohoku Unversty Keywords: CFD, Unstructured grd, overset grd

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Order of Accuracy Study of Unstructured Grid Finite Volume Upwind Schemes

Order of Accuracy Study of Unstructured Grid Finite Volume Upwind Schemes João Luz F. Azevedo et al. João Luz F. Azevedo joaoluz.azevedo@gmal.com Comando-Geral de Tecnologa Aeroespacal Insttuto de Aeronáutca e Espaço IAE 12228-903 São José dos Campos, SP, Brazl Luís F. Fguera

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

. Introducton. The system of unsteady compressble Naver-Stokes (N.-S.) equatons s a fundamental system n ud dynamcs. To be able to solve the system qu

. Introducton. The system of unsteady compressble Naver-Stokes (N.-S.) equatons s a fundamental system n ud dynamcs. To be able to solve the system qu VARIABLE DEGREE SCHWARZ METHODS FOR THE IMPLICIT SOLUTION OF UNSTEADY COMPRESSIBLE NAVIER-STOKES EQUATIONS ON TWO-DIMENSIONAL UNSTRUCTURED MESHES Xao-Chuan Ca y Department of Computer Scence Unversty of

More information

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices Hgh resoluton 3D Tau-p transform by matchng pursut Wepng Cao* and Warren S. Ross, Shearwater GeoServces Summary The 3D Tau-p transform s of vtal sgnfcance for processng sesmc data acqured wth modern wde

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

ONE very challenging goal in the field of Computational Aeroelasticity (CA) is to be able to study fluid-structure

ONE very challenging goal in the field of Computational Aeroelasticity (CA) is to be able to study fluid-structure 43rd AIAA Aerospace Scences Meetng and Exhbt AIAA 2005-926 10-13 January 2005, Reno, Nevada DEFORMATION OF UNSTRUCTURED VISCOUS GRIDS Dens B. Kholodar,* Scott A. Morton, and Russell M. Cummngs Department

More information

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION 24 CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION The present chapter proposes an IPSO approach for multprocessor task schedulng problem wth two classfcatons, namely, statc ndependent tasks and

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach Modelng, Manpulatng, and Vsualzng Contnuous Volumetrc Data: A Novel Splne-based Approach Jng Hua Center for Vsual Computng, Department of Computer Scence SUNY at Stony Brook Talk Outlne Introducton and

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Immersed Boundary Method for the Solution of 2D Inviscid Compressible Flow Using Finite Volume Approach on Moving Cartesian Grid

Immersed Boundary Method for the Solution of 2D Inviscid Compressible Flow Using Finite Volume Approach on Moving Cartesian Grid Journal of Appled Flud Mechancs, Vol. 4, No. 2, Specal Issue, pp. 27-36, 2011. Avalable onlne at www.jafmonlne.net, ISSN 1735-3572, EISSN 1735-3645. Immersed Boundary Method for the Soluton of 2D Invscd

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

THREE-DIMENSIONAL UNSTRUCTURED GRID GENERATION FOR FINITE-VOLUME SOLUTION OF EULER EQUATIONS

THREE-DIMENSIONAL UNSTRUCTURED GRID GENERATION FOR FINITE-VOLUME SOLUTION OF EULER EQUATIONS ICAS 2000 CONGRESS THREE-DIMENSIONAL UNSTRUCTURED GRID GENERATION FOR FINITE-VOLUME SOLUTION OF EULER EQUATIONS Karm Mazaher, Shahram Bodaghabad Sharf Unversty of Technology Keywords: Unstructured grd

More information

Topology Design using LS-TaSC Version 2 and LS-DYNA

Topology Design using LS-TaSC Version 2 and LS-DYNA Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool

More information

Solitary and Traveling Wave Solutions to a Model. of Long Range Diffusion Involving Flux with. Stability Analysis

Solitary and Traveling Wave Solutions to a Model. of Long Range Diffusion Involving Flux with. Stability Analysis Internatonal Mathematcal Forum, Vol. 6,, no. 7, 8 Soltary and Travelng Wave Solutons to a Model of Long Range ffuson Involvng Flux wth Stablty Analyss Manar A. Al-Qudah Math epartment, Rabgh Faculty of

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Simplification of 3D Meshes

Simplification of 3D Meshes Smplfcaton of 3D Meshes Addy Ngan /4/00 Outlne Motvaton Taxonomy of smplfcaton methods Hoppe et al, Mesh optmzaton Hoppe, Progressve meshes Smplfcaton of 3D Meshes 1 Motvaton Hgh detaled meshes becomng

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Electrical analysis of light-weight, triangular weave reflector antennas

Electrical analysis of light-weight, triangular weave reflector antennas Electrcal analyss of lght-weght, trangular weave reflector antennas Knud Pontoppdan TICRA Laederstraede 34 DK-121 Copenhagen K Denmark Emal: kp@tcra.com INTRODUCTION The new lght-weght reflector antenna

More information

Monte Carlo Rendering

Monte Carlo Rendering Monte Carlo Renderng Last Tme? Modern Graphcs Hardware Cg Programmng Language Gouraud Shadng vs. Phong Normal Interpolaton Bump, Dsplacement, & Envronment Mappng Cg Examples G P R T F P D Today Does Ray

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Complex Deformable Objects in Virtual Reality

Complex Deformable Objects in Virtual Reality Complex Deformable Obects n Vrtual Realty Young-Mn Kang Department of Computer Scence Pusan Natonal Unversty ymkang@pearl.cs.pusan.ac.kr Hwan-Gue Cho Department of Computer Scence Pusan Natonal Unversty

More information

Structured Grid Generation Via Constraint on Displacement of Internal Nodes

Structured Grid Generation Via Constraint on Displacement of Internal Nodes Internatonal Journal of Basc & Appled Scences IJBAS-IJENS Vol: 11 No: 4 79 Structured Grd Generaton Va Constrant on Dsplacement of Internal Nodes Al Ashrafzadeh, Razeh Jalalabad Abstract Structured grd

More information

Multiblock method for database generation in finite element programs

Multiblock method for database generation in finite element programs Proc. of the 9th WSEAS Int. Conf. on Mathematcal Methods and Computatonal Technques n Electrcal Engneerng, Arcachon, October 13-15, 2007 53 Multblock method for database generaton n fnte element programs

More information
