ETNA Kent State University


Electronic Transactions on Numerical Analysis. Volume 22, pp. 41-70, 2006. Copyright 2006, ISSN

A NETWORK PROGRAMMING APPROACH IN SOLVING DARCY'S EQUATIONS BY MIXED FINITE-ELEMENT METHODS

M. ARIOLI AND G. MANZINI

Abstract. We use the null space algorithm approach to solve the augmented systems produced by the mixed finite-element approximation of Darcy's laws. Taking into account the properties of the graph representing the triangulation, we adapt the null space technique proposed in [5], where an iterative-direct hybrid method is described. In particular, we use network programming techniques to identify the renumbering of the triangles and the edges, which enables us to compute the null space without floating-point operations. Moreover, we extensively take advantage of the graph properties to build efficient preconditioners for the iterative algorithm. Finally, we present the results of several numerical tests.

Key words. augmented systems, sparse matrices, mixed finite-element, graph theory

AMS subject classifications. 65F05, 65F10, 64F25, 65F50, 65G05

1. Introduction. The approximation of Darcy's Laws by Mixed Finite-Element techniques produces a finite-dimensional version of the continuous problem which is described by an augmented system. In this paper, we present an analysis of a null space method which uses a mixture of direct and iterative solvers applied to the solution of this special augmented system. The properties of this method, in the general case, have been studied in [5], where its backward stability in finite-precision arithmetic is proved and where a review of the bibliography on the topic is also presented. Here, we will take advantage of network programming techniques for the design of a fast algorithm for the direct solver part and for the building of effective preconditioners. The relationship between the graph properties of the mesh and the augmented system has been pointed out in [2].

Several authors used similar data structures and network techniques in a rather different context or for different purposes. In [1, 10, 28] similar techniques have been suggested in the area of
computational electromagnetics for gauging vector potential formulations. In the field of computational fluid dynamics, analogous methods have been applied to the finite-difference method for the solution of Navier-Stokes equations [, 24]. Finally, in [6], a similar approach in the approximation of a 3-D Darcy's Law by Hybrid Finite-Element techniques is studied.

The null space algorithm is a popular approach for the solution of augmented systems in the field of numerical optimization but is not widely used in fluid dynamics. For a review of other existing methods for the solution of saddle point problems we refer the reader to the comprehensive survey [8]. Among the possible alternative methods, we indicate the direct approach where a sparse decomposition of the symmetric augmented matrix is computed using preprocessing that helps to minimize the fill-in during the factorization, combined with one by one and two by two numerical pivot strategies [18, 17]. In our numerical experiments we will compare our approach with one of these direct solvers.

Received January 12, 2005. Accepted for publication September 5, 2005. Recommended by M. Benzi. The work of the first author was supported in part by EPSRC grants GR/R46641/01 and GR/S42170. The work of the second author was supported by EPSRC grant GR/R46427/01 and partially by the CNR Short-Term Mobility Programme, 2005.
Rutherford Appleton Laboratory, Chilton, Didcot, Oxfordshire, OX11 0QX, UK (M.Arioli@rl.ac.uk).
Istituto di Matematica Applicata e Tecnologia Informatica CNR, via Ferrata 1, Pavia, Italy (marco.manzini@imati.cnr.it).

We point out that our null space method is an algebraic approach to the computation of the finite-element approximation of the operator which characterizes the subspace of the divergence-free vector fields, [2, 7, 4, 5]. Nevertheless, we emphasize that

our method does not require the explicit storage of the null space and, therefore, of the related finite-element approximation of the operator characterizing the divergence-free fields.

Our approach is applicable when $RT_0$ and $BDM_1$ finite elements [11] are used. Nevertheless, we do not consider this a limitation in practical situations: the $RT_0$ and $BDM_1$ finite elements are widely used in the 3D simulation of physical phenomena, where higher order approximations have an excessive computational complexity and the indetermination in the evaluation of the physical parameters is high. For the sake of simplicity, we describe our approach only for the $RT_0$ finite elements.

In Section 2, we will briefly summarize the approximation process and describe the basic properties of the linear system and augmented matrix. In Section 3, the null space algorithm and its algebraic properties are presented. The direct solver is based on the LU factorization of the submatrix of the augmented system which approximates the divergence operator. We will see in Section 4 how the basic structures of the matrices involved are described in terms of graph theory and how the LU decomposition can be performed by classical Network Programming algorithms. In particular, we will use the Shortest Path Tree (SPT) algorithms to achieve a reliable fast decomposition. Furthermore, the same graph properties allow us to describe the block structure of the projected Hessian matrix on which we will apply the conjugate gradient algorithm. This will be used in Section 5 to develop the preconditioners. Finally, in Section 6, we show the results of the numerical tests that we conducted on selected experiments, in Section 7, we describe the possible extension of our techniques to the three dimensional domain case, and we give our conclusions in Section 8.

2. The analytical problem and its approximation.

2.1. Darcy's Law. We consider a simply connected bounded polygonal domain $\Omega$ in $\mathbb{R}^2$ which is defined by a closed one-dimensional curve $\Gamma$. The boundary $\Gamma$ is the union of the two distinct parts $\Gamma_D$ and $\Gamma_N$, i.e. $\Gamma = \Gamma_D \cup \Gamma_N$, where either Dirichlet- or Neumann-type boundary conditions are imposed. In the domain
$\Omega$, we formulate the mathematical model that relates the pressure field $p$ (the hydraulic head) and the velocity field $u$ (the visible effect) in a fully saturated soil with an incompressible soil matrix. This relationship is given by the steady Darcy's model equations; the assumption of soil incompressibility implies that the soil matrix characteristics, e.g. density, texture, specific storage, etc., be independent of time, space, and pressure. The Darcy's model equations read as

(2.1)  $u = -K \nabla p$  in $\Omega$,
(2.2)  $\operatorname{div} u = f$  in $\Omega$.

Equation (2.1) relates the vector field $u$ to the scalar field $p$ through the permeability tensor $K$, which accounts for the soil characteristics. Equation (2.2) relates the divergence of $u$ to the right-hand side source-sink term $f$. These model equations are supplemented by the following set of boundary conditions for $u$ and $p$:

(2.3)  $p|_{\Gamma_D} = g_D$  on $\Gamma_D$,
(2.4)  $u \cdot n|_{\Gamma_N} = g_N$  on $\Gamma_N$,

where $n$ is the unit vector orthogonal to the boundary $\Gamma$ and pointing out of $\Omega$, and $g_D$ and $g_N$ are two regular functions that take into account Dirichlet and Neumann conditions, respectively. For simplicity of exposition, $g_N$ is taken equal to zero.

2.2. Mixed finite-element method for Darcy's law. In this section, we shortly review some basic ideas underlying the mixed finite-element approach that is used in this work to approximate the model equations (2.1)-(2.2). The mixed weak formulation is formally obtained

in a standard way by multiplying equation (2.1) by the test functions

$v \in V = \{\, v \;|\; v \in (L^2(\Omega))^2,\ \operatorname{div} v \in L^2(\Omega),\ v \cdot n|_{\Gamma_N} = 0 \,\}$

and equation (2.2) by $q \in L^2(\Omega)$ and integrating by parts over the domain of computation. We refer to [11] for the definition and the properties of the functional space $V$. The weak formulation of (2.1)-(2.2) reads as: find $u \in V$ and $p \in L^2(\Omega)$ such that

$\int_\Omega K^{-1} u \cdot v \, dx - \int_\Omega p \operatorname{div} v \, dx = -\int_{\Gamma_D} g_D \, v \cdot n \, ds$  for every $v \in V$,
$\int_\Omega (\operatorname{div} u) \, q \, dx = \int_\Omega f q \, dx$  for every $q \in L^2(\Omega)$.

The discrete counterpart of the weak formulation that we consider in this work can be introduced in the following steps. First, we consider a family of conforming triangulations covering the computational domain $\Omega$ that are formed by the sets of disjoint triangles $\mathcal{T}_h = \{T\}$. Every family of triangulations is regular in the sense of Ciarlet, [1, page 12], i.e. the triangles do not degenerate in the approximation process for $h$ tending to zero. Additionally, we require that no triangle in $\mathcal{T}_h$ can have more than one edge on the boundary $\Gamma$ nor that a triangle can exist with a vertex on $\Gamma_D$ and any other vertex on $\Gamma_N$. The label $h$, which denotes a particular triangulation of this family, is the maximum diameter of the triangles in $\mathcal{T}_h$, i.e. $h = \max_{T \in \mathcal{T}_h} \operatorname{diam}(T)$. Then, we consider the functional spaces

$V_h = \{\, v_h(x) : \Omega \to \mathbb{R}^2,\ v_h(x)|_T = a_T x + b_T,\ a_T \in \mathbb{R},\ b_T \in \mathbb{R}^2,\ \forall T \in \mathcal{T}_h,\ v_h \cdot n|_{\Gamma_N} = 0 \,\} \cap V$,

i.e. the space of the lowest-order Raviart-Thomas vector fields defined on $\Omega$ by using $\mathcal{T}_h$, and

$Q_h = \{\, q_h(x) : \Omega \to \mathbb{R},\ q_h(x)|_T = \text{const},\ \forall T \in \mathcal{T}_h \,\}$,

i.e. the space of the piecewise constant functions defined on $\Omega$ by using $\mathcal{T}_h$. For any given $h$, the two functional spaces $V_h$ and $Q_h$ are finite-dimensional subspaces of $V$ and $L^2(\Omega)$, respectively, and are dense within these latter spaces for $h \to 0$ [11]. The discrete weak formulation results from substituting the velocity and pressure fields $u$ and $p$ by their discretized counterparts $u_h$ and $p_h$ and reformulating the weak problem in the spaces $V_h$ and $Q_h$ [11].

The dimension of the functional space $V_h$ is equal to the total number $N_e$ of internal and Dirichlet boundary edges. Any element of $V_h$ can be expressed as a linear combination of the basis functions $v_j$ for $j = 1, \dots, N_e$, where $e_j$ may be an internal edge or a boundary edge with a Dirichlet condition. The basis function $v_j$, which is associated to the edge $e_j$, is uniquely defined by

(2.5)  $\int_{e_i} v_j \cdot n_{e_i} \, ds = 1$ for $i = j$, and $0$ otherwise,

where $n_{e_i}$ is the unit vector with direction orthogonal to $e_i$ and arbitrarily chosen orientation. Likewise, the dimension of the functional space $Q_h$ is equal to the number $N_T$ of triangles. In fact, any element of $Q_h$ can be expressed as a linear combination of the basis functions $q_k$ for $k = 1, \dots, N_T$. The basis function $q_k$, which is associated to the triangle $T_k$, is such that $q_k = 1$ on $T_k$ and $q_k = 0$ on $\Omega \setminus T_k$.

The solution fields $u_h$ and $p_h$ of the discretized weak formulation are taken as approximation of the solution fields $u$ and $p$ of the original weak formulation. The solution fields $u_h$

and $p_h$ are expressed as linear combinations of the basis functions introduced in the previous paragraph. These expansions are formally given by

(2.6)  $p_h = \sum_{k=1}^{N_T} p_k q_k$  and  $u_h = \sum_{j=1}^{N_e} u_j v_j$.

The coefficient $p_k$ of the basis function $q_k$ in (2.6) can be interpreted as the approximation of the cell average of the pressure field $p$ on $T_k$. The coefficient $u_j$ of the basis function $v_j$ in (2.6) can be interpreted as the approximation of the flux, or the zeroth-order moment, of the normal component of $u$ on the edge $e_j$. We collect these coefficients in the two algebraic vectors $\mathbf{u} = \{u_j\}$ and $\mathbf{p} = \{p_k\}$. It turns out that these vectors are solving the augmented system of linear equations

(2.7)  $\begin{pmatrix} M & A \\ A^T & 0 \end{pmatrix} \begin{pmatrix} \mathbf{u} \\ \mathbf{p} \end{pmatrix} = \begin{pmatrix} -\mathbf{q} \\ \mathbf{b} \end{pmatrix}$,

where the elements of the matrices $M$ and $A$ of the augmented matrix of the left-hand side of (2.7) and the elements of the vectors $\mathbf{q}$ and $\mathbf{b}$ of the right-hand side of (2.7) are given by

(2.8)  $(M)_{ij} = \int_\Omega K^{-1} v_j \cdot v_i \, dx$,
(2.9)  $(A)_{ij} = -\int_\Omega \operatorname{div} v_i \; q_j \, dx$,
(2.10)  $(\mathbf{q})_i = \int_{\Gamma_D} g_D \, v_i \cdot n \, ds$,
(2.11)  $(\mathbf{b})_j = -\int_\Omega f q_j \, dx$.

The augmented system (2.7) has a unique solution because it holds that $\operatorname{Ker}(A^T) \cap \operatorname{Ker}(M) = \{0\}$. This property is an immediate consequence of the inf-sup condition, which is satisfied by the functional operators represented by (2.8) and (2.9) on the basis function sets $\{v_j\}$ and $\{q_k\}$. These functional operators are used to define the discrete weak formulation in the functional spaces $V_h$ and $Q_h$ (see [11] for more details).

Finally, we describe some major properties of the matrices $M$ and $A$ that are related to the triangulation $\mathcal{T}_h$. First, we assume that $e_i$ is an internal edge, and denote its two adjacent triangles by $T'_{e_i} \subset \Omega$ and $T''_{e_i} \subset \Omega$. As the support of the basis function $v_i$ is given by the union of $T'_{e_i}$ and $T''_{e_i}$, every row $M_{i*}$ of the matrix $M$ has at most 5 nonzero entries. These nonzero entries are on the columns corresponding to the internal or Dirichlet boundary edges of the triangles $T'_{e_i}$ and $T''_{e_i}$ (the Neumann boundary edges are not considered). Similarly, the row $A_{i*}$ of the matrix $A$ must have two different nonzero entries on the columns corresponding to the two triangles $T'_{e_i}$ and $T''_{e_i}$. Then,
we assume that $e_i$ is a Dirichlet boundary edge, and denote the unique triangle that $e_i$ belongs to by $T'_{e_i}$. In this second case, the row $M_{i*}$ has surely three nonzero entries on the columns corresponding to the three edges of $T'_{e_i}$. From our initial assumption on the triangulations, it follows that the triangle $T'_{e_i}$ cannot have more than one edge on the boundary $\Gamma$; furthermore, we are actually assuming that this boundary edge is of Dirichlet type. Therefore, the row $A_{i*}$ has one nonzero entry only, on the column corresponding to $T'_{e_i}$. Moreover, the rectangular matrix $A$ has maximum rank. By applying

the Gauss theorem to the integral in (2.9) restricted to $T_k$ (where $q_k = 1$) and taking into consideration (2.5), we deduce that the nonzero entries of the matrix $A$ must be either $1$ or $-1$. The sign depends on the relative orientation of the triangle edges with respect to the edge orientation that was arbitrarily chosen in (2.5). Furthermore, for every internal edge $e_i$ shared by the triangles $T'_{e_i}$ and $T''_{e_i}$, the sum of the two nonzero entries on the matrix row $A_{i*}$ is zero; formally, $A_{i k'} + A_{i k''} = 0$, where $k'$ and $k''$ are the column indices of $T'_{e_i}$ and $T''_{e_i}$.

Let us consider the matrix $A' = [A \,|\, r]$ that is obtained by augmenting $A$ with the column vector $r$, which is built as follows. The column vector $r$ only has nonzero entries for the row indices corresponding to the edges $e_i \subset \Gamma_D$, and these nonzero entries are such that $r_i + A_{i k'} = 0$. The matrix $A'$ is the incidence matrix of the edge-triangle graph underlying the given triangulation, and, then [1, 12], the matrix $A'$ is a totally unimodular matrix and has rank $N_T$. Therefore, the matrix $A$ has rank $N_T$ because every submatrix that is obtained by removing a column of a totally unimodular matrix has full rank [1, 12]. Furthermore, the pattern of $M$ is equal to the pattern of $|A|\,|A|^T$, because the nonzero entries of the row $M_{i*}$ must match the nonzero entries of the edges of the triangles $T'_{e_i}$ and $T''_{e_i}$, and from the unimodularity properties of $A$. By construction, the matrix $M$ is symmetric and positive semidefinite.

3. Null space algorithms. In this section, we take into account the classical null space algorithm for the minimization of linearly constrained quadratic forms, which is described for example in [21]. In particular, we choose the formulation of this algorithm that is based on the factorization of the matrix $A$. In order to simplify the notations, we use the letter $n$ instead of the symbol $N_e$ to indicate the total number of internal edges and Dirichlet boundary edges of the mesh, and $m$ instead of $N_T$ to indicate the total number of triangles of the mesh. It holds that $n \geq m$. Finally, we denote by $E_1$ and $E_2$ the $n \times m$ and $n \times (n-m)$ matrices

$E_1 = \begin{pmatrix} I_m \\ 0 \end{pmatrix}$  and  $E_2 = \begin{pmatrix} 0 \\ I_{n-m} \end{pmatrix}$,

where $I_m$ and $I_{n-m}$ are respectively the identity matrix of order $m$ and the identity matrix of order $n-m$.

As already stated in Section 2.2, $A$ is a full rank submatrix of a totally unimodular matrix of order $n \times (m+1)$, and its entries are either $\pm 1$ or $0$. It turns out that the LU factorization of $A$ is obtainable without any floating-point operations, through a pair of suitable permutation matrices $P$ and $Q$ (see [1, 12]). In Section 4, we will see how it is possible to determine $P$ and $Q$ efficiently by using network programming algorithms. The matrices $P$ and $Q$ allow us to write the permutation of the matrix $A$ as

(3.1)  $P A Q = \begin{pmatrix} L_1 \\ L_2 \end{pmatrix}$,

where $L_1$ is a nonsingular lower triangular matrix of order $m$. Without loss of generality, we assume that all the diagonal entries in $L_1$ are equal to $1$ and the nonzero entries below the diagonal are equal to $-1$. This choice simply corresponds to a symmetric diagonal scaling of the permuted augmented system. By introducing the lower triangular matrix

(3.2)  $L = \begin{pmatrix} L_1 & 0 \\ L_2 & I_{n-m} \end{pmatrix}$

and recognizing that $E_1$ plays the role of the matrix $U$, the LU factorization of $A$ is given by $P A Q = L E_1$.
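The identity $PAQ = LE_1$ and the claim that no floating-point operations are needed can be checked on a toy case. The sketch below is only an illustration under stated assumptions: the $6 \times 3$ matrix is a hypothetical, already permuted example (a triangle split into three cells by an interior point, all boundary edges of Dirichlet type), and the helper names `matmul`, `identity`, `build_L` and `forward_substitution_unit` are not taken from the paper.

```python
def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def build_L(L1, L2):
    """Assemble L = [[L1, 0], [L2, I]] as in (3.2)."""
    r = len(L2)
    top = [row + [0] * r for row in L1]
    bottom = [L2[i] + identity(r)[i] for i in range(r)]
    return top + bottom

def forward_substitution_unit(L1, rhs):
    """Solve L1 x = rhs for a unit lower triangular L1 with entries in {-1, 0, 1}.
    Only additions and subtractions are used, so integer data stays integer."""
    x = []
    for i, row in enumerate(L1):
        s = rhs[i]
        for j in range(i):
            if row[j] == 1:
                s -= x[j]
            elif row[j] == -1:
                s += x[j]
        x.append(s)
    return x

# Hypothetical permuted matrix PAQ = [L1; L2] (assumption, not data from the paper).
L1 = [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]   # in-tree rows
L2 = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]    # out-of-tree rows
A = L1 + L2
L = build_L(L1, L2)
E1 = identity(3) + [[0, 0, 0] for _ in range(3)]

print(matmul(L, E1) == A)                        # the factorization reproduces A
print(forward_substitution_unit(L1, [1, 2, 3]))  # integer-only triangular solve
```

Because every off-diagonal entry of $L_1$ is $0$ or $\pm 1$, the triangular solve reduces to sign-controlled accumulation, which is the property the section relies on.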

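In the rest of this section, we assume, for the sake of simplicity of exposition, that the matrices $M$ and $A$ and the vectors $\mathbf{q}$ and $\mathbf{b}$ have already been consistently permuted, and we omit to indicate explicitly the permutation matrices $P$ and $Q$. Thus, by identifying the matrix $A$ with $L E_1$ after the row and column permutations, the augmented matrix in (2.7) can be factorized as follows

(3.3)  $\begin{pmatrix} M & A \\ A^T & 0 \end{pmatrix} = \begin{pmatrix} L & 0 \\ 0 & I_m \end{pmatrix} \begin{pmatrix} L^{-1} M L^{-T} & E_1 \\ E_1^T & 0 \end{pmatrix} \begin{pmatrix} L^T & 0 \\ 0 & I_m \end{pmatrix}$.

We indicate the matrix $L^{-1} M L^{-T}$ by the symbol $\tilde{M}$. The inverse matrix and the transposed inverse matrix of $L$ are formally given by

(3.4)  $L^{-1} = \begin{pmatrix} L_1^{-1} & 0 \\ -L_2 L_1^{-1} & I_{n-m} \end{pmatrix}$  and  $L^{-T} = \begin{pmatrix} L_1^{-T} & -L_1^{-T} L_2^T \\ 0 & I_{n-m} \end{pmatrix}$.

The block-partitioning of $L^{-1}$ and $L^{-T}$ induces on $\tilde{M}$ the block-partitioned form

(3.5)  $\tilde{M} = \begin{pmatrix} \tilde{M}_{11} & \tilde{M}_{12} \\ \tilde{M}_{21} & \tilde{M}_{22} \end{pmatrix}$,

where, formally, $\tilde{M}_{ij} = E_i^T \tilde{M} E_j$, for $i, j = 1, 2$, and we exploited the symmetry of the matrix $\tilde{M}$ (and, of course, of $M$) to set $\tilde{M}_{21} = \tilde{M}_{12}^T$. Similarly, we introduce the block partition $(\mathbf{u}_1, \mathbf{u}_2)$ of the velocity vector $\mathbf{u}$ and denote, for consistency of notation, the pressure vector $\mathbf{p}$ by $\mathbf{u}_3$. Thus, the algebraic vector $(\mathbf{u}, \mathbf{p}) = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ denotes the solution of the (suitably permuted) linear problem (2.7) discussed in Section 2.2.

We use the decomposition (3.3) and take into account the block-partitioned definitions (3.4) and (3.5) of the matrices $L^{-1}$, $L^{-T}$ and $\tilde{M}$ to split the resolution of the linear problem (2.7) in the two following steps. First, we solve the linear system

(3.6)  $\begin{pmatrix} \tilde{M}_{11} & \tilde{M}_{12} & I_m \\ \tilde{M}_{21} & \tilde{M}_{22} & 0 \\ I_m & 0 & 0 \end{pmatrix} \begin{pmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \\ \mathbf{z}_3 \end{pmatrix} = \begin{pmatrix} L^{-1} & 0 \\ 0 & I_m \end{pmatrix} \begin{pmatrix} -\mathbf{q} \\ \mathbf{b} \end{pmatrix}$

for the auxiliary unknown vector $\mathbf{z} = (\mathbf{z}_1, \mathbf{z}_2, \mathbf{z}_3)$ and, then, we compute the unknown vector by

(3.7)  $\begin{pmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \mathbf{u}_3 \end{pmatrix} = \begin{pmatrix} L^{-T} & 0 \\ 0 & I_m \end{pmatrix} \begin{pmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \\ \mathbf{z}_3 \end{pmatrix}$.

Note that the left-hand side matrix of (3.6) can be put in a block-triangular form by interchanging the first and the third block of equations. The final computational algorithm, which is solved by the null space strategy, is formulated by introducing the vector $\mathbf{h} = -L^{-1} \mathbf{q}$. The vector $\mathbf{h}$ is consistently partitioned in the two subvectors $\mathbf{h}_1 = E_1^T \mathbf{h}$ and $\mathbf{h}_2 = E_2^T \mathbf{h}$ according to the block-partitioning (3.5).

NULL SPACE ALGORITHM:
(i) solve the block lower triangular system:

$\begin{pmatrix} I_m & 0 & 0 \\ \tilde{M}_{21} & \tilde{M}_{22} & 0 \\ \tilde{M}_{11} & \tilde{M}_{12} & I_m \end{pmatrix} \begin{pmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \\ \mathbf{z}_3 \end{pmatrix} = \begin{pmatrix} \mathbf{b} \\ \mathbf{h}_2 \\ \mathbf{h}_1 \end{pmatrix}$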

(ii) and, then, let

$\mathbf{u} = L^{-T} \begin{pmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{pmatrix}$,  $\mathbf{p} = \mathbf{z}_3$.

Note that in step (ii) we have $\mathbf{u}_1 = L_1^{-T}(\mathbf{b} - L_2^T \mathbf{z}_2)$ and $\mathbf{u}_2 = \mathbf{z}_2$ in view of the second formula in (3.4).

The null space algorithm as formulated above requires the formal inversion of the matrix $\tilde{M}_{22}$, which is the projected Hessian matrix of the quadratic form associated to the augmented matrix of (2.7). In order to solve the linear algebraic problem $\tilde{M}_{22} \mathbf{z}_2 = \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1$ for $\mathbf{z}_2$, we may proceed in two different ways. If $n - m$ is small, i.e. the number of constraints is very close to the number of unknowns, or $\tilde{M}_{22}$ is a sparse matrix, we may explicitly compute $\tilde{M}_{22}$ and then solve the linear system involving this matrix by using a Cholesky factorization. Nonetheless, the calculation of the product matrix $L^{-1} M L^{-T}$ might be difficult because of the high computational complexity, which is of order $O(n^3)$, or because the final matrix itself would be fairly dense despite the sparsity of $M$. In such cases, we prefer to solve the linear system $\tilde{M}_{22} \mathbf{z}_2 = \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1$ by using the preconditioned conjugate gradient algorithm [2]. The conjugate gradient algorithm does not require the explicit knowledge of the matrix $\tilde{M}_{22}$ but only the capability of performing the product of this matrix times a vector. The product of the block sub-matrix $\tilde{M}_{ij}$ of $\tilde{M}$ times a suitably sized vector $y$ is effectively performed by implementing the formula

(3.8)  $\tilde{M}_{ij}\, y = E_i^T L^{-1} \big( M \big( L^{-T} ( E_j y ) \big) \big)$  for $i, j = 1, 2$.

We point out that the formula (3.8) only requires the resolution of the two triangular systems with matrices $L$ and $L^T$, which can be easily performed by, respectively, forward and backward substitutions, and the product of the sparse matrix $M$ by an $n$-sized vector. Furthermore, we observe that the matrix-vector product given by (3.8) is backward stable [5].

If we use the conjugate gradient method to solve $\tilde{M}_{22} \mathbf{z}_2 = \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1$, it is quite natural to adopt a stopping criterion which takes advantage of the minimization property of this algorithm. At every step $k$ the conjugate gradient method minimizes, on a Krylov space, the energy norm of the error $\delta \mathbf{z}^{(k)} = \mathbf{z}_2 - \mathbf{z}_2^{(k)}$ between the exact solution vector $\mathbf{z}_2$ and the computed $k$-th
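iterative approximation $\mathbf{z}_2^{(k)}$. The energy norm of the vector $y \in \mathbb{R}^{n-m}$, which is induced by the matrix $\tilde{M}_{22}$, is defined by

$\| y \|_{\tilde{M}_{22}} = ( y^T \tilde{M}_{22}\, y )^{1/2}$.

This norm induces the dual norm

$\| y' \|_{\tilde{M}_{22}^{-1}} = ( y'^T \tilde{M}_{22}^{-1} y' )^{1/2}$

for the dual space vectors $y' \in \mathbb{R}^{n-m}$. We terminate the conjugate gradient iterations by the following stopping criterion

(3.9)  IF $\| \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1 - \tilde{M}_{22} \mathbf{z}_2^{(k)} \|_{\tilde{M}_{22}^{-1}} \leq \eta \, \| \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1 \|_{\tilde{M}_{22}^{-1}}$ THEN STOP,

where $\eta$ is an a priori defined threshold with value less than $1$. The choice of $\eta$ clearly depends on the properties of the problem that we want to solve; however, in many practical cases, $\eta$ can be taken much larger than the machine precision of the floating-point operations. In our experiments, we choose $\eta$ following the results in [4].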
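Steps (i)-(ii) of the null space algorithm, together with the matrix-free product (3.8), can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. It assumes dense lists of rows, a unit-diagonal $L_1$, and a hypothetical small test problem, and for brevity it stops the (unpreconditioned) conjugate gradient on the Euclidean norm of the residual rather than on the energy-norm criterion (3.9). All function names are illustrative.

```python
def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def solve_L(L1, L2, r):
    """Forward substitution with L = [[L1, 0], [L2, I]], L1 unit lower triangular."""
    m = len(L1)
    y = []
    for i in range(m):
        y.append(r[i] - sum(L1[i][j] * y[j] for j in range(i)))
    tail = [r[m + k] - sum(L2[k][j] * y[j] for j in range(m)) for k in range(len(L2))]
    return y + tail

def solve_LT(L1, L2, r):
    """Backward substitution with L^T = [[L1^T, L2^T], [0, I]]."""
    m = len(L1)
    x2 = list(r[m:])
    t = [r[i] - sum(L2[k][i] * x2[k] for k in range(len(L2))) for i in range(m)]
    x1 = [0.0] * m
    for i in reversed(range(m)):
        x1[i] = t[i] - sum(L1[j][i] * x1[j] for j in range(i + 1, m))
    return x1 + x2

def cg(apply, rhs, tol=1e-12, maxit=200):
    """Plain conjugate gradient; 'apply' returns the matrix-vector product."""
    x = [0.0] * len(rhs)
    r, p = list(rhs), list(rhs)
    rr = rr0 = sum(v * v for v in r)
    for _ in range(maxit):
        if rr <= tol * tol * rr0:   # Euclidean residual test (simplification of (3.9))
            break
        Ap = apply(p)
        alpha = rr / sum(a * b for a, b in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = sum(v * v for v in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def null_space_solve(M, L1, L2, q, b):
    """Solve [[M, A], [A^T, 0]] [u; p] = [-q; b] with A = [L1; L2] = L E_1."""
    m, n = len(L1), len(M)
    mtilde = lambda v: solve_L(L1, L2, matvec(M, solve_LT(L1, L2, v)))  # formula (3.8)
    h = solve_L(L1, L2, [-v for v in q])                                # h = -L^{-1} q
    z1 = [float(v) for v in b]                                          # first block row
    w = mtilde(z1 + [0.0] * (n - m))
    rhs2 = [h[m + k] - w[m + k] for k in range(n - m)]                  # h_2 - M~_21 z_1
    z2 = cg(lambda y: mtilde([0.0] * m + y)[m:], rhs2)                  # M~_22 z_2 = rhs2
    w = mtilde(z1 + z2)
    p = [h[i] - w[i] for i in range(m)]                                 # z_3
    u = solve_LT(L1, L2, z1 + z2)                                       # u = L^{-T} z
    return u, p

# Hypothetical data: 6 edges, 3 triangles, M symmetric and diagonally dominant.
L1 = [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]
L2 = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]
M = [[(2.0 + i) if i == j else (0.5 if abs(i - j) == 1 else 0.0) for j in range(6)]
     for i in range(6)]
q = [1.0, 0.0, 0.0, 0.0, 2.0, 0.0]
b = [1.0, 0.0, -1.0]
u, p = null_space_solve(M, L1, L2, q, b)
print(u, p)
```

The matrix $\tilde{M}_{22}$ is never formed: each conjugate gradient step costs one backward substitution, one product with $M$, and one forward substitution, as stated for (3.8).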

In order to use (3.9), we need some tool for estimating the value $\| \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1 - \tilde{M}_{22} \mathbf{z}_2^{(k)} \|_{\tilde{M}_{22}^{-1}}$. This goal can be achieved by using the Gauss quadrature rule proposed in [22] or the Hestenes and Stiefel rule [25, 4, 7, 8]. This latter produces a lower bound for the error $\| \delta \mathbf{z}^{(k)} \|_{\tilde{M}_{22}}$ using the quantities already computed during each step of the conjugate gradient algorithm with a negligible marginal cost. In [7, 8], its numerical stability in finite arithmetic has been proved. All these lower bound estimates can be made reasonably close to the value of $\| \delta \mathbf{z}^{(k)} \|_{\tilde{M}_{22}}$ at the price of $d$ additional iterations of the conjugate gradient algorithm. In [4, 22], a specific choice of $d$ is indicated as a successful compromise between the computational costs of the additional iterations and the accuracy requirements; several numerical experiments support this conclusion ([4, 22, 5, 7, 8]). Finally, following Reference [5], we estimate $\| \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1 \|_{\tilde{M}_{22}^{-1}}$ by taking into account that

$\| \mathbf{h}_2 - \tilde{M}_{21} \mathbf{z}_1 \|_{\tilde{M}_{22}^{-1}} = \| \mathbf{z}_2 \|_{\tilde{M}_{22}}$

and replacing $\| \mathbf{z}_2 \|_{\tilde{M}_{22}}$ with its current evaluation at the step $k$, so that the test in (3.9) becomes $\| \delta \mathbf{z}^{(k)} \|_{\tilde{M}_{22}} \leq \eta \, \| \mathbf{z}_2^{(k)} \|_{\tilde{M}_{22}}$.

4. Graph and Network properties. In this section, we first review some basic definitions relative to graph theory (more detailed information can be found in [1, 9]). We also discuss how these definitions are related to the graph structure underlying the triangulations of the mixed finite element formulation of Section 2.2. Then, we show how a strategy that relies on network programming can be used to determine the permutation matrices $P$ and $Q$ of the LU factorization of the matrix $A$ that was introduced in the previous sections. In particular, we show how these two matrices can be constructed in a fast and efficient way by exploiting the Shortest-Path Tree algorithm (SPT).

A graph $\mathcal{G} = \{\mathcal{N}, \mathcal{A}\}$ is made up of a set of nodes $\mathcal{N}$ and a set of arcs $\mathcal{A}$. An arc is identified by a pair of nodes $i$ and $j$; $\{i, j\}$ denotes an unordered pair and the corresponding undirected arc, and $[i, j]$ an ordered pair and the corresponding directed arc. Either $\mathcal{G}$ is an undirected graph, in which case all the arcs are
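undirected, or $\mathcal{G}$ is a directed graph, in which case all the arcs are directed. We can convert every directed graph $\mathcal{G}$ to an undirected one, called the undirected version of $\mathcal{G}$, by replacing each $[i, j]$ by $\{i, j\}$ and removing the duplicate arcs. Conversely, we can build the directed version of an undirected graph $\mathcal{G}$ by replacing every $\{i, j\}$ by the pair of arcs $[i, j]$ and $[j, i]$. In order to avoid repeating definitions, $(i, j)$ may denote either an undirected arc $\{i, j\}$ or a directed arc $[i, j]$, and the context resolves the ambiguity.

The nodes $i$ and $j$ in an undirected graph $\mathcal{G} = \{\mathcal{N}, \mathcal{A}\}$ are adjacent if $\{i, j\} \in \mathcal{A}$, and we define the adjacency of $i$ by

$\operatorname{Adj}(i) = \{\, j \;:\; \{i, j\} \in \mathcal{A} \,\}$.

Analogously, in a directed graph $\mathcal{G} = \{\mathcal{N}, \mathcal{A}\}$, we define

$\operatorname{Adj}(i) = \{\, j \;:\; [i, j] \in \mathcal{A} \text{ or } [j, i] \in \mathcal{A} \,\}$.

We define the adjacency of an arc as follows:

$\operatorname{Adj}(\{i, j\}) = \{\, \{i, k\} \in \mathcal{A} \,\} \cup \{\, \{j, k\} \in \mathcal{A} \,\}$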

for an undirected graph, and

$\operatorname{Adj}([i, j]) = \{\, [i, k] \in \mathcal{A} \,\} \cup \{\, [k, i] \in \mathcal{A} \,\} \cup \{\, [j, k] \in \mathcal{A} \,\} \cup \{\, [k, j] \in \mathcal{A} \,\}$

for a directed graph.

A path in a graph from node $i_1$ to node $i_k$ is a list of nodes $[i_1, \dots, i_k]$ such that $(i_l, i_{l+1})$ is an arc in the graph $\mathcal{G}$ for $l = 1, \dots, k-1$. The path contains the nodes $i_l$ for $l \in [1, \dots, k]$ and the arcs $(i_l, i_{l+1})$ for $l \in [1, \dots, k-1]$ and does not contain any other nodes and arcs. The nodes $i_1$ and $i_k$ are the ends of the path. The path is simple if all of its nodes are distinct. The path is a cycle if $k > 1$ and $i_1 = i_k$, and a simple cycle if all of its other nodes are distinct. A graph without cycles is acyclic. If there is a path from node $i$ to node $j$, then $j$ is reachable from $i$. An undirected graph is connected if every node of its undirected version is reachable from every other node, and disconnected otherwise. The maximal connected subgraphs of $\mathcal{G}$ are its connected components.

A rooted tree is an undirected graph that is connected and acyclic with a distinguished node, called root. A rooted tree with $k$ nodes contains $k-1$ arcs and has a unique simple path from any node to any other one. When appropriate we shall regard the arcs of a rooted tree as directed. A spanning rooted tree $\mathcal{S} = \{\mathcal{N}, \mathcal{A}_{\mathcal{S}}\}$ in $\mathcal{G}$ is a rooted tree which is a subgraph of $\mathcal{G}$ containing all the nodes of $\mathcal{G}$. If $i$ and $j$ are nodes in $\mathcal{S}$ and $i$ is in the path from the root to $j$ with $i \neq j$, then $i$ is an ancestor of $j$ and $j$ is a descendant of $i$. Moreover, if $i$ and $j$ are adjacent, then $i$ is the parent of $j$ and $j$ is a child of $i$. Every node, except the root, has only one parent, which will be denoted by parent($j$), and, moreover, it has zero or more children. A node with zero children is a leaf. We will call the arcs in $\mathcal{A}_{\mathcal{S}}$ in-tree and the arcs in $\mathcal{A} \setminus \mathcal{A}_{\mathcal{S}}$ out-of-tree. A forest is a node-disjoint collection of trees.

The depth($i$) of the node $i$ in a rooted tree is the (integer) top-down distance from the root node, which is recursively defined by

depth($i$) = depth(parent($i$)) + 1 if $i$ is not the root, and 0 otherwise,

and, similarly, the height($i$) is the (integer) down-top distance of the node $i$ from the deepest leaf, which is recursively defined by

height($i$) = 0 if $i$ is a leaf, and $\max\{\,\text{height}(j) : j \text{ is a child of } i\,\} + 1$ otherwise.

The subtree rooted at node $i$ is the rooted tree consisting of the subgraph induced by the descendants of $i$ and having root in $i$. The nearest common ancestor of two nodes is the deepest node that is an ancestor of both. A node $i$ is a branching node if the number of its children is greater than or equal to two. For each out-of-tree arc $[i, j]$, the cycle of minimal length or the fundamental cycle is the cycle composed by $[i, j]$ and the paths in $\mathcal{S}$ from $i$ and $j$ to their nearest common ancestor.

We can associate the graph $\mathcal{G}$ to the triangulation $\mathcal{T}_h$ as follows. First, we associate a distinct node of the graph to every triangle of the mesh, e.g. $i$ is the (unique) node corresponding to the triangle $T_i$, for $i = 1, \dots, m$. Then, the (directed or undirected) arc $(i, j)$ exists in the arc set $\mathcal{A}$ if and only if the triangles $T_i$ and $T_j$ share an edge. Furthermore, we add the root node, which represents the exterior world $\mathbb{R}^2 \setminus \Omega$, to the node set $\mathcal{N}$, and the arcs $(i, \text{root})$ for every node $i$ associated to a triangle $T_i$ with a boundary edge of Dirichlet type to the arc set $\mathcal{A}$. The incidence matrix of the graph $\mathcal{G}$ is the $n \times (m+1)$ totally unimodular matrix $A'$ that has been introduced at the end of Section 2.2; its rank is $m$.
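The construction of the graph from the mesh, its incidence matrix, and the fundamental cycle of an out-of-tree arc can be illustrated as follows. This is a sketch under stated assumptions: the mesh is a hypothetical triangle split into three cells by an interior vertex with all boundary edges of Dirichlet type, the spanning tree is a plain breadth-first tree (used here only as a stand-in for any rooted spanning tree), and `ROOT`, `triangle_graph`, `incidence_matrix`, `bfs_tree` and `fundamental_cycle` are illustrative names.

```python
from collections import deque

ROOT = -1  # hypothetical label for the root node (the exterior of the domain)

def triangle_graph(triangles, dirichlet_edges):
    """Arcs (i, j) between triangles sharing an edge; (i, ROOT) for Dirichlet edges."""
    owners = {}
    for t, tri in enumerate(triangles):
        for k in range(3):
            edge = frozenset((tri[k], tri[(k + 1) % 3]))
            owners.setdefault(edge, []).append(t)
    arcs = []
    for edge, ts in owners.items():
        if len(ts) == 2:
            arcs.append((ts[0], ts[1]))
        elif edge in dirichlet_edges:
            arcs.append((ts[0], ROOT))
    return arcs

def incidence_matrix(arcs, n_triangles):
    """Rows are arcs; columns are triangles, with the last column for ROOT."""
    rows = []
    for i, j in arcs:
        row = [0] * (n_triangles + 1)
        row[i] = 1
        row[n_triangles if j == ROOT else j] = -1
        rows.append(row)
    return rows

def bfs_tree(arcs):
    """Parent function of a breadth-first spanning tree rooted at ROOT."""
    neighbours = {}
    for i, j in arcs:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    parent = {ROOT: None}
    queue = deque([ROOT])
    while queue:
        i = queue.popleft()
        for j in neighbours.get(i, []):
            if j not in parent:
                parent[j] = i
                queue.append(j)
    return parent

def fundamental_cycle(parent, i, j):
    """Nodes of the cycle closed by the out-of-tree arc (i, j), from i to j."""
    def path_to_root(k):
        path = [k]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path
    pi, pj = path_to_root(i), path_to_root(j)
    # Strip the common tail so that both paths end at the nearest common ancestor.
    while len(pi) > 1 and len(pj) > 1 and pi[-2] == pj[-2]:
        pi.pop()
        pj.pop()
    return pi + pj[-2::-1]

# Hypothetical mesh: vertices 0, 1, 2 on the boundary, vertex 3 in the interior.
triangles = [(0, 1, 3), (1, 2, 3), (2, 0, 3)]
dirichlet = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 0))}
arcs = triangle_graph(triangles, dirichlet)
incidence = incidence_matrix(arcs, len(triangles))
parent = bfs_tree(arcs)
print(arcs)
print(fundamental_cycle(parent, 0, 1))
```

Each row of `incidence` has exactly one $+1$ and one $-1$, which is the structure of $A'$; dropping the last column gives a matrix with the sparsity of $A$. Only the `parent` dictionary is needed to walk the fundamental cycles.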

If we remove the column corresponding to the root from $A'$, we obtain the matrix $A$ of problem (2.7), which also has rank $m$. Moreover, every spanning tree of $\mathcal{G}$ with root in the root node induces a partition of the rows of $A$ in in-tree rows and out-of-tree rows. If we renumber the in-tree arcs first and the out-of-tree arcs last, we can permute the in-tree arcs and the nodes such that the permuted matrix $A$ has the form

$P A Q = \begin{pmatrix} L_1 \\ L_2 \end{pmatrix}$,

where $L_1 \in \mathbb{R}^{m \times m}$ is a lower triangular and nonsingular matrix, see Reference [1, 12]. As the matrix $A$ is obtained by simply removing the root column from the totally unimodular matrix $A'$, the entries of the matrix $L_1^{-1}$ must also be $\pm 1$ or $0$. The nonzeros of the matrix $L_2 L_1^{-1}$ are also equal to $\pm 1$ and its rows correspond to the out-of-tree arcs. The number of nonzeros of a row of this matrix is equal to the number of arcs in the cycle of minimal length that the corresponding out-of-tree arc forms with the in-tree arcs.

We recall (see Section 3) that, without loss of generality, all the diagonal entries in $L_1$ can be chosen equal to $1$ and, therefore, the entries outside the diagonal are $0$ or $-1$. This signifies selecting the directions of the arcs in $\mathcal{A}_{\mathcal{S}}$ as $[i, \text{parent}(i)]$ for every $i$. Given the out-of-tree arc $[i, j]$, the values of the nonzero entries in the corresponding row of $L_2 L_1^{-1}$ will be $1$ if both the nodes of the in-tree arc corresponding to the nonzero entry are ancestors of $i$, and will be $-1$ if both the nodes of the in-tree arc corresponding to the nonzero entry are ancestors of $j$. We now give some basic results, the proof of which is straightforward.

LEMMA 4.1. Let $\nu$ be a branching node with $k$ children. The descendants of $\nu$ can be partitioned in $k$ sets $\mathcal{D}_1, \dots, \mathcal{D}_k$ such that $\mathcal{D}_i \cap \mathcal{D}_j = \emptyset$, for $i \neq j$. Each $\{\mu_i, \mu_j\} \in \mathcal{A}$ with $\mu_i \in \mathcal{D}_i$ and $\mu_j \in \mathcal{D}_j$ is an out-of-tree arc.

If we define the adjacency of a set of nodes $\mathcal{N}' \subset \mathcal{N}$ and the adjacency of a set of arcs $\mathcal{A}' \subset \mathcal{A}$ as follows:

$\operatorname{Adj}(\mathcal{N}') = \bigcup_{i \in \mathcal{N}'} \operatorname{Adj}(i)$,  $\operatorname{Adj}(\mathcal{A}') = \bigcup_{(i,j) \in \mathcal{A}'} \operatorname{Adj}((i, j))$,

we have the following Corollary.

COROLLARY 4.1. Let $\nu$ be a branching node with $k$ children and $\mathcal{D}_1, \dots, \mathcal{D}_k$ be the partitioning of the descendants of $\nu$. Let $\alpha$ and $\beta$ be out-of-tree arcs, with the end nodes of $\alpha$ in $\mathcal{D}_i$ and the end nodes of $\beta$ in $\mathcal{D}_j$, $i \neq j$. The minimal length circuits $C_\alpha \subset \mathcal{A}_{\mathcal{S}} \cup \{\alpha\}$ and $C_\beta \subset \mathcal{A}_{\mathcal{S}} \cup \{\beta\}$ are disjoint:

$C_\alpha \cap C_\beta = \emptyset$,  $C_\alpha \cap \operatorname{Adj}(C_\beta) = \emptyset$,  $\operatorname{Adj}(C_\alpha) \cap C_\beta = \emptyset$.

Proof. From Lemma 4.1 the descendants of the children of the branching node form disjoint subtrees. If the two circuits $C_\alpha$ and $C_\beta$ had an arc in common, this arc should simultaneously be an in-tree arc of the two subtrees of nodes $\mathcal{D}_i$ and $\mathcal{D}_j$. Thus, this arc would close a circuit on the ancestor $\nu$, but this fact is in contradiction with the definition of a tree (which cannot contain closed circuits). Similarly, if $C_\alpha$ and $\operatorname{Adj}(C_\beta)$ had a common arc, this arc should be an in-tree arc because it would lie on $C_\alpha \cap \mathcal{A}_{\mathcal{S}}$. This fact is in contradiction with the results of Lemma 4.1 because this arc would connect two disjoint sets of

Finally, if the root of the spanning tree is a branching node with $k$ children $\nu_1, \dots, \nu_k$, the subtrees having $\nu_i$ as roots form a forest. Therefore, the matrix $L_1$ can be permuted in block diagonal form with triangular diagonal blocks.

We refer to [1, 9] for surveys of different algorithms for computing spanning trees. An optimal choice for the rooted spanning tree is the one minimizing the number of nonzero entries in the matrix $Z = L^{-T} E_2$, the columns of which span the null space of $A^T$. In [16], it is proved that the equivalent problem of finding the tree for which the sum of all the lengths of the fundamental circuits is minimal is an NP-complete problem. In [9, 14, 15, 16, 20, 27, ] several algorithms have been proposed which select a rooted spanning tree reducing or minimizing the number of nonzero entries in $Z$. In this paper, we propose two different approaches based on the introduction of a function cost($\cdot$) defined on each arc of the graph and describing some physical properties of the original problem.

From Corollary 4.1 it follows that a rooted spanning tree having the largest possible number of branching nodes normally has many disjoint circuits. The columns of $Z$ corresponding to these disjoint circuits are structurally orthogonal, i.e. the scalar product of each pair is zero, independent of the values of the entries. Moreover, we choose the function cost : $\mathcal{A} \to \mathbb{R}_+$ in the following way:

(4.1)  $\forall \alpha \in \mathcal{A}$:  cost($\alpha$) $= 0$ if there exists a node $i$ such that $\alpha = (i, \text{root})$, and cost($\alpha$) $= M_{\alpha\alpha}$ otherwise.

Using the cost($\cdot$) function, we can compute the spanning tree rooted in the root node that actually solves the SPT problem [19] on the graph. In particular, we have chosen to implement the heap version of the shortest path tree algorithm (see the SHEAP algorithm that is described in Reference [19]). The resulting spanning tree has the interesting interpretation that the path circumnavigates low permeability regions in the sense specified below.

In the presence of islands of very low permeability in $\Omega$, i.e. regions with very large values for $K^{-1}$, the paths from the root to a node corresponding to a triangle that lies
outside the islands of low permeability will be made of nodes corresponding to triangles that are also outside the islands. In this sense, we can state that the shortest path tree circumnavigates the islands. Therefore, the set of these paths reasonably identifies the lines where the flux is expected to be greater. Owing to the fact that we assume a null cost for the arcs connected to the root node, both strategies provide a forest if the number of zero cost arcs is greater than one.

We observe that we do not need to build the matrix $A$ explicitly: the tree (or forest) can be computed by using the graph only. Moreover, the solution of the lower and upper triangular systems can be performed by taking advantage of the parent function alone. This strategy results in a very fast and potentially highly parallel algorithm. For a problem with only Dirichlet conditions and an isotropic mesh (i.e. the number of triangles along each direction is independent of the direction), we may have a forest with a number of trees proportional to the number of triangles with a boundary edge. This number is $O(h^{-1})$, and, therefore, the matrix $L_1$ has $O(h^{-1})$ diagonal blocks of average size $O(h^{-1})$. Finally, we point out that most non-leaf nodes in the SPT have two children.

5. Preconditioning and quotient tree. In the presence of islands of very low permeability in $\Omega$, the values of $K$ can differ by many orders of magnitude in the different regions of the domain. In this case, the projected Hessian matrix $\tilde{M}_{22}$ has a condition number which is still very high. It is then necessary to speed up the convergence of the conjugate gradient by use of a preconditioner. Obviously, it is not usually practical to explicitly form $\tilde{M}_{22}$ to

12 ! ß ß ß : m ³ ß m ß m u ³ ³ _ etna@mcskentedu 52 M ARIOLI AND G MANZINI compute or dentfy any precondtoner In ths secton, we wll show how we can compute the block structure of usng only the tree and the graph nformaton We wll then use Ü the block structure to compute a precondtoner Denotng by the matrx ( u (, the projected Hessan matrx can be wrtten as Ü follows: (51) Ü UK VKPš Ü >(Ü(g :ž b (g 9( g} The row and column ndces of the matrx correspond to the out-of-tree arcs Moreover, we recall that the matrx has entres :ƒ?, ts row ndces correspond to the Ü outof-tree arcs, and the number of nonzeros n one of ts rows wll be the number of arcs n the cycle of mnmal length whch the correspondng out-of-tree arc forms wth the n-tree arcs LEMMA 51 Let and W be two out-of-tree arcs, and A R Æ "! #<5Â and AYX Æ "! #5ZW be ther correspondng crcuts of mnmal length, then (52) (5) R X A%RƒÃ?A X "[]\ç a W W º w R Ãhº w X G,^[]\Û A%RƒÃhº w H 2A X F, A X Ãhº w H 2A%R F, Proof Because the orentaton of the arcs n the graph s arbtrary and wll be determned by the sgn of the soluton, we wll use the undrected verson of the graph Let 9, Ẁ ` a ê ` a, where the graph nodes,, and respectvely correspond to the mesh cells! (,!,! and!/b The out-of-tree arc unquely corresponds to the common edge ë( between the trangles!1( and!, and the out-of-tree arc W unquely corresponds to the common edge b between the trangles! and! b From (28), we have that R X f and only f z 7U71 V ô Ã z 7?71 V ced G, Snce z 7?71 V ô ap!è(è5v! and z 7?7 V cfd d 5! b, then R X P f and only f the two supports share a mesh cell Assumng, wthout s the shared cell, (52) follows from the defnton of º w loss of generaltythat!1(<! 
In (5.3), the implication $\Leftarrow$ is trivial. We give a proof ab absurdo for the implication $\Rightarrow$. First of all, we note that the cardinal number of $\mathrm{Adj}(\cdot)$ is less than or equal to 3 (a triangle has three edges). If we assume that [...], there must exist a node $m$ such that $m$ belongs to both the paths in the tree linking the end nodes of $\alpha$ and the end nodes of $\beta$, respectively. Because each node in the path has only one parent and one child, $\mathrm{Adj}(m)$ contains the parent and the child of the first path and those of the second path. Because we assumed that [...], the cardinal number of $\mathrm{Adj}(m)$ would be 4, which is in contradiction with the upper bound on the cardinal number.

Thus, from (5.1) and the previous Lemma 5.1, we have the following Corollary.

COROLLARY 5.1. Let $\alpha$ and $\beta$ be two out-of-tree arcs, and let $C_\alpha$ and $C_\beta$ be their corresponding circuits of minimal length, each formed by in-tree arcs and by $\alpha$ and $\beta$, respectively. Then

[...]

and

(5.4)    [...]
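The circuits of minimal length used in Lemma 5.1 and Corollary 5.1 are the cycles that an out-of-tree arc closes with the in-tree arcs. As a minimal sketch, assuming the tree is stored through two hypothetical arrays `parent` and `depth` (names chosen here for illustration, not taken from the authors' code), such a circuit can be listed by climbing from both end nodes of the arc until the two walks meet:

```python
def fundamental_circuit(u, v, parent, depth):
    """Nodes of the cycle closed by the out-of-tree arc (u, v).

    parent[w] is the parent of node w in the rooted tree and depth[w]
    its distance from the root (both are illustrative assumptions).
    The number of arcs in the cycle equals the number of nodes returned.
    """
    left, right = [u], [v]
    while u != v:
        # Always climb from the deeper node, so both walks stop at the
        # first common ancestor.
        if depth[u] >= depth[v]:
            u = parent[u]
            left.append(u)
        else:
            v = parent[v]
            right.append(v)
    # left ends at the common ancestor; append the other walk reversed,
    # skipping the ancestor so that it is not listed twice.
    return left + right[-2::-1]


# Small illustrative tree: root 0 with children 1 and 2,
# node 1 with children 3 and 4, node 2 with child 5.
parent = [0, 0, 0, 1, 1, 2]
depth = [0, 1, 1, 2, 2, 2]
circuit_35 = fundamental_circuit(3, 5, parent, depth)
circuit_34 = fundamental_circuit(3, 4, parent, depth)
```

Only integer comparisons and pointer chasing are involved, which is consistent with the remark that the null space can be handled without floating-point operations; the length of each list gives the number of nonzeros of the corresponding row discussed before Lemma 5.1.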

Proof. The first part of the Corollary follows directly from the expansion of (5.1). The implication (5.4) follows from Lemma 5.1 taking into account that [...].

Corollary 5.1 gives the complete description of the pattern of $Z^T M Z$ in terms of the rooted spanning tree. However, we observe that the results (5.3) and (5.4) rely on the 2-D structure of the mesh and they cannot be generalized to a mesh in 3-D.

In the second part of this section, we build the block structure of $Z^T M Z$ using the Quotient Tree concept, without explicitly forming the matrix. In the following, we process the root node separately from the other nodes. The root node will always be numbered by [...], and if it is a branching node with $k$ children the tree will contain $k$ subtrees directly linked to the root.

Given a rooted spanning tree $T$, let $\mathcal{B}$ be the set of the branching nodes in $T$, and let $\mathcal{L}$ be the set of the leaves in $T$. We define the set $\mathcal{B} \cup \mathcal{L} = \{ j_1, \dots, j_q \}$. If [...], then $T$ is a binary tree. Otherwise, we can compute the following paths:

$P_i = \{ j_i, \mathrm{parent}(j_i), \mathrm{parent}(\mathrm{parent}(j_i)), \dots \}$ and [...].

The path $P_i$ connects $j_i$ to all its ancestors which are not branching nodes, and it can contain $j_i$ alone. The set $\{P_1, \dots, P_q\}$ is a partition of [...]. Therefore, we can build the quotient tree [...] and the quotient graph [...], where $\mathrm{Adj}_T$ is the restriction of the $\mathrm{Adj}$ operator to the graph $T$. For the sake of clarity, in the following we will call the quotient tree nodes Q-nodes. The root of the quotient tree is still the root node, and each subtree rooted at its children has a binary quotient tree.

5.1. Data structures and separators. In this subsection, we briefly review the basic data structures that we used to implement the null space method presented in the previous section. We also discuss some major details concerning the renumbering strategy that allows us to perform a nested dissection-like decomposition of the matrix $A$. This kind of permutation and the resulting sub-block decomposition make it possible to build the block-Jacobi preconditioner mentioned in the next subsection and considered in the numerical experiments of Section 6. More details on the data structure and algorithm implementation are reported in the final appendix.

The data structure that is used for the graph $G$ is based on a double representation of the sparse matrix $A$. This double representation implements the collection of compressed row and column sparse data which is described in [17] and allows us to access simultaneously matrix rows and columns. We assume that rows and columns are consistently permuted when we renumber the graph nodes. This special design facilitates and speeds up the algorithms that are reported in the appendix.

Trees, which are used to perform row/column permutations, are represented by storing the following data for any node:
- the parent node,
- the node children list,
- the chain index,
- the depth.

In Figure 5.1, we give a simple example of a tree and, in Table 5.1, we list the labels of each node. It is relevant to observe that the reordering obtained by the depth first search of $T$ renumbers the nodes such that the nodes forming a chain are consecutive. Therefore, we can directly access the list using vector arrays.

FIG. 5.1. Example of a spanning tree.

After the identification of the chains, we build the quotient tree, and, descending the quotient tree with a depth first search algorithm, we renumber the nodes and build its data structure. In the new data structure, we associate with each Q-node the following objects:
- Q-parent in the quotient tree,
- first and last node in $T$ of the chain corresponding to the Q-node,
- depth in the quotient tree,
- Q-last, the last Q-node of the subtree rooted in the current Q-node,
- Q-star, the list of the out-of-tree arcs in $G$ which have one extreme belonging to the Q-node,
- Q-children, the list of the Q-nodes children of the Q-node.
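The identification of the chains and of the Q-parent of each Q-node can be illustrated by a short sketch. It assumes a tree given only by a hypothetical `parent` array with the root handled separately, as in the text; the function name and the returned containers are illustrative choices and are not the data structures of the appendix.

```python
def quotient_tree(parent, root=0):
    """Group the non-root nodes of a rooted tree into chains (Q-nodes).

    parent[v] is the parent of node v. Each chain starts at a leaf or a
    branching node and climbs through its non-branching ancestors.
    The root is processed separately and gets the chain index -1.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for v in range(n):
        if v != root:
            children[parent[v]].append(v)

    branching = {v for v in range(n) if len(children[v]) > 1}

    chains = []
    for v in range(n):
        if v == root:
            continue
        is_leaf = not children[v]
        if not (is_leaf or v in branching):
            continue  # interior node of a chain, reached from below
        path = [v]
        u = parent[v]
        while u != root and u not in branching:
            path.append(u)
            u = parent[u]
        chains.append(path)

    chain_of = [-1] * n
    for q, path in enumerate(chains):
        for v in path:
            chain_of[v] = q

    # The Q-parent of a chain is the chain holding the parent of its top node.
    q_parent = [chain_of[parent[path[-1]]] for path in chains]
    return chains, chain_of, q_parent


# Illustrative tree: 0 -> 1 -> 2, node 2 branches into 3 and 4, and 3 -> 5.
parent = [0, 0, 1, 2, 2, 3]
chains, chain_of, q_parent = quotient_tree(parent)
```

Each chain is listed from its lowest node upwards. A depth first renumbering of the tree, as described above, would make the nodes of each chain consecutive, so that storing only the first and last node of the chain is enough.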

TABLE 5.1. Labels of nodes in the tree of Figure 5.1 (columns: node index, parent, children list, chain index, depth).

In Figure 5.2, we show the quotient tree relative to the example of Figure 5.1, and in Table 5.2, we list the labels of each Q-node.

FIG. 5.2. Quotient tree relative to the tree of Figure 5.1.

TABLE 5.2. Labels of the Q-nodes in Figure 5.2 (columns: Q-node, Q-parent, first and last node of chain, depth, Q-last, Q-star, Q-children).

Taking advantage of the data structures described above, we can order the out-of-tree arcs in the following way. Firstly, we identify the Q-nodes which are children of the external root, and the subtrees rooted in each of these Q-nodes. Then, we separate each of the subtrees from the others, marking the out-of-tree edges that connect it to the others. The out-of-tree arcs lying within one of these subtrees cannot have fundamental cycles with the out-of-tree arcs lying within one of the others, because of Corollary 4.1. This corresponds to a one-way dissection. Then, within each of the subtrees, we seek the out-of-tree arcs that separate the two subtrees rooted in the Q-nodes children of the Q-node root of the subtree containing both of them. This is equivalent to a nested dissection strategy applied to one of the diagonal blocks resulting from the previous one-way dissection phase.

In Figure 5.3 (a, b), we show the result of the one-way dissection on the matrix $A$ when the root node has only three descendant subtrees. Note that each subtree is now disconnected from the others, is a binary tree, and can be identified by the Q-node on which it is rooted. If the nested dissection process is recursively re-applied on these matrix sub-blocks, we obtain the matrix structure shown in Figure 5.3 (c).

FIG. 5.3. Example of a one-way dissection (a, b), and of a nested dissection (c) on the matrix $A$.

5.2. Preconditioners. The a priori knowledge of the structure of $Z^T M Z$ allows us to decide what kind of preconditioner we can afford. If we are subjected to a strong limitation of the size of the memory in our computer, we can choose among several alternative preconditioners. We have a choice ranging from the diagonal matrix obtained by using the diagonal part of $M_{22}$ (the block of $M$ associated with the out-of-tree arcs) to the block Jacobi preconditioner using the diagonal blocks corresponding to the separators. The possibility of using the simplest choice, the diagonal of $M_{22}$, is sensible because the SPT algorithm places on this diagonal the biggest entries of the diagonal of $M$. In Section 6, we will give numerical evidence that this choice is very efficient for several test problems. Nevertheless, in the presence of strong discontinuities and anisotropies in the permeability function $K$, we are obliged to use either the diagonal Jacobi preconditioner or a block diagonal Jacobi preconditioner.
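The simplest choice above can be sketched as a conjugate gradient iteration preconditioned by the inverse of a diagonal, with the projected Hessian applied as a product of three operators and never formed. This is a minimal illustration under stated assumptions: the small matrices `M` and `Z` below are made-up data with the lower block of `Z` equal to the identity, and the function `pcg` is a textbook implementation, not the authors' code.

```python
import numpy as np


def pcg(apply_h, b, inv_diag, tol=1e-10, max_iter=100):
    """Conjugate gradient method with a diagonal preconditioner.

    apply_h(v) returns the product of the SPD operator with v, and
    inv_diag holds the reciprocals of the preconditioner diagonal.
    """
    x = np.zeros_like(b)
    r = b.copy()
    z = inv_diag * r
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    iters = 0
    while iters < max_iter and np.linalg.norm(r) > tol * b_norm:
        hp = apply_h(p)
        alpha = rz / (p @ hp)
        x = x + alpha * p
        r = r - alpha * hp
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
        iters += 1
    return x, iters


# Made-up data: 6 arcs, the last 3 are taken as out-of-tree arcs.
M = np.diag([1.0, 2.0, 3.0, 1e3, 2e3, 5e2])
M += 0.1 * (np.eye(6, k=1) + np.eye(6, k=-1))
Z = np.vstack([np.array([[1.0, 0.0, -1.0],
                         [0.0, 1.0, 1.0],
                         [-1.0, 1.0, 0.0]]),
               np.eye(3)])


def apply_h(v):
    # Matrix-free product with the projected Hessian Z^T M Z.
    return Z.T @ (M @ (Z @ v))


inv_diag = 1.0 / np.diag(M)[3:]   # diagonal of the out-of-tree block of M
b = np.ones(3)
x, iters = pcg(apply_h, b, inv_diag)
```

Replacing `inv_diag` by the reciprocal of the diagonal of $Z^T M Z$ would give the classical Jacobi variant, at the price of computing that diagonal first.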

6. Numerical experiments.

6.1. Test problems. We generated the test problems using four different domains. The first two are square unit boxes, and in the second one we have four rectangular regions where the tensor $K$ assumes different values. In Figure 6.1, we plot the geometry of Domain 1 and the boundary conditions. In Figure 6.2, we plot the geometry of Domain 2 and the boundary conditions. The values of the tensor $K$ are chosen as follows:

(6.1)    $K = $ [...]

The two remaining domains have an L-shape geometry. In Figure 6.3, we plot the geometry and the boundary conditions of Domain 3. In Figure 6.4, we plot the geometry of the fourth and last domain, Domain 4, and the relative boundary conditions: within the domain, we have four rectangular regions where the tensor $K$ takes the same values defined for the second domain in (6.1). In (2.2), we take the right-hand side $f(\mathbf{x}) = $ [...] in all our test problems. For Domains 1 and 3, the tensor $K$ in (2.1) is isotropic. For a given triangulation, its values are constant within each triangle and this value is computed following the law [...], where [...] are numbers computed using a random uniform distribution.

For each domain, we generated 4 meshes using TRIANGLE [6]. In Tables 6.1 and 6.2, we report, for our domains, the number of triangles, the number of edges, the number of vertices of each mesh, the number of nonzero entries in the matrix $M$, the number of nonzero entries in the matrix $A$, and the corresponding value of $h$.

TABLE 6.1. Data relative to the meshes for Domains 1 and 2 (columns: Mesh 1, Mesh 2, Mesh 3, Mesh 4).

TABLE 6.2. Data relative to the meshes for Domains 3 and 4 (columns: Mesh 1, Mesh 2, Mesh 3, Mesh 4).

FIG. 6.1. Geometry of the first domain (unit square $\Omega$ with Dirichlet data $g_D = 0$ and $g_D = 1$ and Neumann data $g_N = 0$).

FIG. 6.2. Geometry of the second domain (unit square with subregions $\Omega_1, \dots, \Omega_4$, Dirichlet data $g_D = 0$ and $g_D = 1$ and Neumann data $g_N = 0$).

6.2. Practicalities. We analysed the reliability of the stopping criterion when we change the parameter $d$. In Figures 6.5 and 6.6, we display the behaviour of the estimates of the true relative energy norm of the error for Mesh 3, Domain 3 and Domain 4: for the other cases the behaviour is similar. In all our tests we choose [...]. The results show that the choice $d = $ [...] is the best compromise between reliability and cost. When convergence is slow and there are regions where the slope changes rapidly, the choice $d = $ [...] can be inaccurate. We reserve for future work the study of a self-adaptive technique which will change the value of $d$ with the slope of the convergence curve. In Section 3, we discussed the opportunity of starting the estimate of the relative error only when [...]. This makes it possible to reduce the number of additional matrix-vector products. Both figures show an initial phase where the estimates have not been computed because of the introduction of this check on the absolute value of the error.
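The role of the delay parameter $d$ can be illustrated with a small sketch. It assumes that the criterion of Section 3, which is not reproduced in this part of the text, relies on an estimate of Hestenes-Stiefel type, in which the squared energy norm of the error at step $k$ is approximated by the sum of the $d$ terms $\alpha_j \|r_j\|^2$ produced by steps $k, \dots, k+d-1$. The function name, the test matrix and the values of `d` and `n_steps` are illustrative choices only.

```python
import numpy as np


def cg_with_estimates(h, b, d, n_steps):
    """Plain conjugate gradient iteration that also returns delayed
    estimates of the energy norm of the error.

    estimates[k] approximates ||u - u_k||_H and becomes available only
    after step k + d has been performed (assumed Hestenes-Stiefel form).
    """
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    terms = []                 # alpha_j * ||r_j||^2
    iterates = [x.copy()]
    for _ in range(n_steps):
        hp = h @ p
        alpha = rr / (p @ hp)
        terms.append(alpha * rr)
        x = x + alpha * p
        r = r - alpha * hp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        iterates.append(x.copy())
    estimates = [np.sqrt(sum(terms[k:k + d]))
                 for k in range(n_steps - d + 1)]
    return iterates, estimates


# Illustrative SPD test matrix (1-D Laplacian plus a diagonal shift).
n = 40
h = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
h += np.diag(np.linspace(0.0, 1.0, n))
b = np.ones(n)
iterates, estimates = cg_with_estimates(h, b, d=4, n_steps=12)

u = np.linalg.solve(h, b)
errors = [np.sqrt((u - x) @ h @ (u - x)) for x in iterates]
```

In exact arithmetic the estimate is a lower bound of the true error, and the gap is the error remaining after $d$ further steps. This is why a small $d$ is cheap but can be unreliable when convergence is slow, while a large $d$ delays the decision to stop.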

FIG. 6.3. Geometry of the third domain (L-shaped domain $\Omega$ with Dirichlet data $g_D = 0$ and $g_D = 1$ and Neumann data $g_N = 0$).

FIG. 6.4. Geometry of the fourth domain (L-shaped domain with subregions $\Omega_1, \dots, \Omega_4$, Dirichlet data $g_D = 0$ and $g_D = 1$ and Neumann data $g_N = 0$).

Moreover, to avoid an excessive number of additional matrix-vector products in the stopping criterion, we choose to update the value of the denominator [...] every 10 steps of the conjugate gradient method. The energy norm of $u^{(k)}$ converges quite quickly to the energy norm of the solution and this justifies our choice. In Figures 6.7 and 6.8, we see that after 25% of the iterations, the ratio between the two energy norms is greater than [...].

6.3. Numerical results. We generated and ran all our test problems on a SPARC processor of a SUN ENTERPRISE 4500 (4 CPUs, 400 MHz, 2 GByte RAM). In our test runs, we compare the performance of our approach with the performance of MA47 of the HSL2000 library [26]. The package MA47 implements a version of the $LDL^T$ decomposition for symmetric indefinite matrices that takes advantage of the structure of the augmented system [18]. The package is divided into three parts corresponding to the symbolic analysis where the reordering of the matrix is computed, the factorization phase, and the final solution using the triangular matrices.

FIG. 6.5. Error energy norm $\|\delta u\|_M / \|u\|_M$ and its estimates for $d = 10$, $d = 5$, and $d = $ [...] for Mesh 3 and Domain 3 (axes: errors versus iterations).

FIG. 6.6. Error energy norm $\|\delta u\|_M / \|u\|_M$ and its estimates for $d = 10$, $d = 5$, and $d = $ [...] for Mesh 3 and Domain 4 (axes: errors versus iterations).

Similarly, the null space algorithm which we implemented can be subdivided into three phases: a first symbolic phase where the shortest path tree and the quotient tree are computed, a second phase where the projected Hessian system is solved by the conjugate gradient algorithm, and a final third phase where we compute the pressure. This enables us to compare the direct solver MA47 with the null space approach in each single phase. Generally, in the test runs that we will present, we fix the parameter $d$ in the stopping criterion to the value of [...]. Nevertheless, we will show the influence of different choices of the parameter $d$ on the stopping criterion using Mesh 3.

In Table 6.3, we give the CPU times (in seconds) and the storage (in MByte) required by MA47 and the CPU times (in seconds) of the null space algorithm where we use the diagonal of $M_{22}$ (see Section 5.2) to precondition the projected Hessian matrix within the conjugate gradient algorithm.
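The first, symbolic phase can be illustrated by a short shortest path tree computation. The sketch below is a generic label-setting method with a binary heap, written with Python's `heapq`; it is meant in the spirit of a heap-based SPT algorithm and is not the SHEAP routine of [19]. The arc list and its costs are made-up values in which two arcs cross a hypothetical island of low permeability.

```python
import heapq


def shortest_path_tree(adj, root=0):
    """Shortest path tree from the root on an undirected graph.

    adj[u] is a list of (v, cost) pairs with nonnegative costs.
    Returns the parent function and the distance of each node.
    """
    dist = {root: 0.0}
    parent = {root: root}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in done:
            continue          # stale heap entry
        done.add(u)
        for v, cost in adj[u]:
            nd = d_u + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent, dist


# Node 0 is the root; arcs leaving the root have zero cost.
# The arcs (1, 3) and (1, 4) are given a large cost (low permeability).
arcs = [(0, 1, 0.0), (0, 2, 0.0), (1, 3, 100.0),
        (2, 3, 1.0), (3, 4, 1.0), (1, 4, 100.0)]
adj = {v: [] for v in range(5)}
for u, v, c in arcs:
    adj[u].append((v, c))
    adj[v].append((u, c))

parent, dist = shortest_path_tree(adj)
out_of_tree = [(u, v) for u, v, _ in arcs
               if parent[v] != u and parent[u] != v]
```

In this example the tree reaches nodes 3 and 4 through node 2, so the two expensive arcs are left out of the tree; this mirrors the statement that the shortest path tree circumnavigates the islands of low permeability.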

FIG. 6.7. Convergence of the ratio between the energy norm of $u$ and the energy norm of $u^{(k)}$ for Mesh 3 and Domains 1 and [...] (axes: ratio versus iterations).

FIG. 6.8. Convergence of the ratio between the energy norm of $u$ and the energy norm of $u^{(k)}$ for Mesh 3 and Domains [...] and 4 (axes: ratio versus iterations).

From Table 6.3, we see that the null space algorithm performs better in the case of random permeability, which can be a realistic simulation of an underground situation. Nevertheless, the global CPU time of the null space algorithm can be 10 times more than the CPU time of the direct solver. We point out that the MA47 storage requirement for the $L$ and $D$ factors grows with the size of the problem, whereas the null space algorithm needs only the storage of the matrices $M$ and $A$. We forecast that this will become even more favourable to the null space algorithm when we want to solve 3-D simulations: for these problems the MA47 storage could become so large that we could be obliged to use an out-of-core implementation.

TABLE 6.3. MA47 vs null space algorithm: CPU times (in seconds) and storage (in MBytes). Columns: Mesh; Domain; MA47 (Symbolic, Factorization, Solver, Storage); null space algorithm (Symbolic, CG (#Iterations), Solve).

In Table 6.4, we display the behaviour of three different preconditioners on the conjugate gradient iteration number. We fixed the mesh (Mesh 3) and we use as a preconditioner one of the following matrices: $\mathrm{diag}(M_{22})$, where $M_{22}$ is the block of $M$ associated with the out-of-tree arcs, $\mathrm{diag}(Z^T M Z)$ (the classical Jacobi), and $\mathrm{blockdiag}(Z^T M Z)$ (block Jacobi), which has been computed using the quotient tree of the shortest path tree (see Section 5). For each preconditioner and each domain, we display the number of conjugate gradient iterations, the CPU time (in seconds) for the building of the matrix, and the CPU time (in seconds) spent by the conjugate gradient algorithm to solve the projected Hessian linear system. From the results of Table 6.4, we conclude that the simplest preconditioner $\mathrm{diag}(M_{22})$ is faster even if the conjugate gradient algorithm does more iterations than the conjugate gradient algorithm using the other preconditioners. The building cost of the Jacobi and the block Jacobi preconditioners is very high and overwhelms the good performance of the conjugate gradient algorithm.
