The International Journal on Advances in ICT for Emerging Regions (0) : 3-0

On the Performance of the Parallel Implementation of the Shallow Water Model on Distributed Memory Architectures

K. Ganeshamoorthy, D. N. Ranasinghe, K. P. M. K. Silva and R. Wait

Abstract: This paper presents a study of the impact of memory architectures, distributed memory (DM) and virtual shared memory (VSM), in the solution of parallel numerical algorithms on a multi-processor cluster. A parallel implementation of the shallow water equations to model a Tsunami is chosen as the case study. Data is partitioned into sub-domains, namely a four by three grid scheme and an eight by six grid scheme, which are used for the parallel implementation of this model. There are four versions of the parallel algorithm for each grid scheme: distributed memory without threads, distributed memory with threads, virtual shared memory without threads, and virtual shared memory with threads. These four parallel versions have been implemented on a high performance cluster, connected to the Nordugrid. Experiments are realized using the Message Passing Interface (MPI) library, the C/Linda, and the Linux pthreads. Subject to the availability of memory, the virtual shared memory version without threads performs best, and as the task is scaled up, the threaded version becomes efficient in both DM and VSM implementations.

A Tsunami is characterized as a shallow water wave. Shallow water waves are different from wind generated waves, the waves many of us have seen at the beach. Wind generated waves usually have a period of five to twenty seconds and a wavelength of about one hundred to two hundred meters. A tsunami can have a period in the range of ten minutes to two hours and a wavelength in excess of 500 km. It is because of their long wavelengths that tsunamis behave as shallow water waves. A wave is characterized as a shallow water wave when the ratio between the water depth and its wavelength gets very small [].

Index Terms: MPI, Linda, multi processors, Shallow water equations, tsunami model.

I. INTRODUCTION

A Tsunami is a series of waves generated in a body of water by an impulsive disturbance that vertically displaces the water column [], []. Earthquakes, landslides, volcanic eruptions and explosions can generate a tsunami. A tsunami can savagely attack coastlines, causing devastating property damage and loss of life.

Manuscript received March 6, 2009. Accepted July 7th, 2009. This research was funded by the National Science Foundation, Sri Lanka (Grant No. RG/2005/FR/07) and by SPIDER, the Swedish programme of ICT for developing Regions. K. Ganeshamoorthy is with the Department of Computation & Intelligent Systems, University of Colombo School of Computing, 35, Reid Avenue, Colombo-7, Sri Lanka. (e-mail: ganesh@webmail.cmb.ac.lk). D. N. Ranasinghe and K. P. M. K. Silva are also with the Department of Computation & Intelligent Systems, University of Colombo School of Computing, 35, Reid Avenue, Colombo-7, Sri Lanka. (e-mail: dnr@ucsc.cmb.ac.lk, mks@ucsc.cmb.ac.lk) R. Wait is with the International Science Programme and Department of Information Technology, Uppsala University, Uppsala, Sweden. (richard.wait@isp.uu.se)

Fig. 1. Typical mixed mode programming model [9].

The shallow water equations on a rotating sphere serve as a primary test problem for numerical methods used in modeling global atmospheric flows [3]. They describe the behaviour of a shallow homogeneous incompressible and inviscid fluid layer. They present the major difficulties found in the horizontal aspects of three-dimensional global atmospheric modeling. Thus, they provide a first test to weed out potentially non-competitive schemes without the effort of building a complete model. However, because they do not represent the complete atmospheric system, the shallow water equations are only a first test. Ultimately, schemes which look attractive based on these tests must be applied to the complete baroclinic problem. The existence of a standard test set for the shallow water equations will encourage the continued exploration
of alternative numerical methods and provide the community with a mechanism for judging the merits of numerical schemes and parallel computers for atmospheric flow calculations [3].

The current state-of-the-art in tsunami modeling still requires considerable quality control, judgment, and iterative, exploratory computations before a scenario is assumed reliable. This is why the efficient computation of many scenarios, for the creation of a database of pre-computed scenarios that have been carefully analyzed and interpreted by a knowledgeable and experienced tsunami modeler, is an essential first step in the development of a reliable and robust tsunami forecasting and hazard assessment capability. Using more advanced parallel algorithms, it may become technically feasible to execute real-time model runs for guidance as an actual event unfolds. However, this is not currently justified on scientific grounds; an operational real-time model forecasting capability must await improved and more detailed characterization of earthquakes in real-time, and verification that the real-time tsunami model computations are sufficiently robust to be used in an operational, real-time model [4].

Shallow water equations have been widely used to study the Tsunami phenomenon [5] [3], as a model of the basic fluid dynamics of the ocean. Solving partial differential equations numerically for real-life problems is computationally demanding; therefore, utilizing super-computers/clusters efficiently is important in order to achieve computational efficiency [6]. Strategies to improve the accuracy and overall quality of model predictions have been and continue to be of great interest to numerical model developers, but in addition to accuracy, the utility of a numerical model is greatly affected by the algorithm's efficiency [7]. Parallel computing provides a feasible and efficient approach to solve very large-scale prediction problems, but any redistribution of the data is a potentially time-consuming task for parallel architectures [8]. Fig. 1
shows the memory hierarchy that exists in most nodes of a modern cluster environment [9], where many nodes are linked together by a high-speed network; inside each node there may be two or more processors; and for each processor, memory access is either to a high speed cache memory unit or to the low speed main memory.

The aim of this paper is to study the impact of memory architectures associated with distributed memory and virtual shared memory in the extraction of multiple levels of parallelism, for the solution of numerical algorithms. The objective here is to evaluate the effects of the programming model on the scalability of this shallow water model. Computing the wave propagation in the tsunami model, where the entire ocean is the solution domain, is challenging, both due to the huge amount of computation needed and due to the fact that different physics applies in different regions [10].

The Message Passing Interface, MPI [11], and C/Linda [12] are alternative paradigms for communication between global nodes in the distributed memory environment and in the virtual shared memory environment, respectively. In the mixed mode programming model, both MPI and C/Linda are used for communication between global nodes, while inside each MPI process and within each C/Linda process POSIX threads [13] are used in order to extract further parallelism.

This paper is organized as follows: In the section Related Work, we present the work related to our study. The section on Linear Long Wave Theory describes the linear long wave theory used to simulate the tsunami model. In Algorithm Design, we describe our design principles. Implementation is presented next. Experimental results from both memory architectures on a high performance cluster are presented and discussed in Evaluation. Finally, in Conclusion we make the concluding remarks.

II. Related Work

A. Parallel simulators for Tsunami

Hybrid tsunami simulators that allow different sub-domains to use different mathematical models, spatial discretizations, local meshes, and serial codes have been proposed by Xing Cai [10].
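Boussinesq water wave equations given below are used for this purpose.

$$\frac{\partial \eta}{\partial t} + \nabla\cdot\left[(H + \alpha\eta)\nabla\phi + \varepsilon H\left(\frac{1}{6}\frac{\partial \eta}{\partial t} - \frac{1}{3}\nabla H\cdot\nabla\phi\right)\nabla H\right] = 0 \tag{1}$$

$$\frac{\partial \phi}{\partial t} + \frac{\alpha}{2}\nabla\phi\cdot\nabla\phi + \eta - \frac{\varepsilon}{2}H\,\nabla\cdot\left(H\nabla\frac{\partial \phi}{\partial t}\right) + \frac{\varepsilon}{6}H^{2}\nabla^{2}\frac{\partial \phi}{\partial t} = 0 \tag{2}$$

where η and φ are primary unknowns denoting, respectively, the water surface elevation and velocity potential. The water depth H is assumed to be a function of the spatial coordinates x and y. In equations (1) and (2) the weak effect of dispersion and nonlinearity is controlled by the two dimensionless constants ε and α respectively. The widely used linear shallow water equations can be derived by choosing ε = α = 0.

December 2009 The International Journal on Advances in ICT for Emerging Regions 0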

$$\frac{\partial \eta}{\partial t} + \nabla\cdot(H\nabla\phi) = 0 \tag{3}$$

$$\frac{\partial \phi}{\partial t} + \eta = 0 \tag{4}$$

The equations from (1) to (4) are used to model the tsunami. The equations (1) and (2) are resolved by unstructured meshes and finite element discretization, whereas structured meshes and finite differences are commonly used for equations (3) and (4). Such a parallelization strategy is most easily realized by using sub-domains, such that the entire spatial domain Ω is decomposed into a set of overlapping sub-domains $\{\Omega_s\}_{s=1}^{P}$. In a generic setting, where a partial differential equation (PDE) is expressed as $L_\Omega(u) = f_\Omega$, the Schwarz algorithm consists of an iterative process generating $u^0, u^1, u^2, \ldots, u^k$ as a series of approximate solutions. During Schwarz iteration k, each sub-domain first independently updates its local solution through

$$L_{\Omega_s}(u_s^k) = f_{\Omega_s}^k \tag{5}$$

where $f_{\Omega_s}^k$ refers to a right-hand side which is restricted within $\Omega_s$ and depends on the latest global approximation $u^{k-1}$. Then, the new global solution $u^k$ is composed by sewing together the sub-domain local solutions $u_1^k, u_2^k, u_3^k, \ldots, u_P^k$. Equation (5) thus opens the possibility of using different local solvers in different sub-domains. Taking the idea of additive Schwarz one step further, different mathematical models can be applied in different sub-domains. Therefore, different serial codes may be deployed region-wise. The proposed hybrid parallel tsunami simulator is implemented using object-oriented techniques and is based on an existing advanced C++ finite element solver named class Boussinesq, applicable for unstructured meshes, and a legacy F77 finite difference code applicable for uniform meshes. The resulting hybrid parallel tsunami simulator thus has full flexibility and extensibility [10].

B. Parallel computation of a highly nonlinear Boussinesq equation model through domain decomposition

Applications of the Boussinesq equations cover a broad spectrum of ocean and coastal problems of interest, from wind wave propagation in intermediate and shallow water depths to the study of tsunami wave propagation across large ocean basins.
In general, implementations of the Boussinesq wave model to calculate free surface wave evolution in large basins are computationally intensive, requiring large amounts of CPU time and memory. To facilitate such extensive computations, a parallel Boussinesq model has been developed by Khairil et al. [2], using the domain decomposition technique in conjunction with MPI. The parallel Boussinesq model developed is based on its serial counterpart. The governing equations consist of the two-dimensional depth-integrated continuity equation:

$$\frac{\partial H}{\partial t} + \nabla\cdot(H\mathbf{u}_\alpha) - \nabla\cdot\left\{H\left[\left(\tfrac{1}{6}\left(\eta^{2} - \eta h + h^{2}\right) - \tfrac{1}{2}z_\alpha^{2}\right)\nabla S + \left(\tfrac{1}{2}(\eta - h) - z_\alpha\right)\nabla T\right]\right\} = 0 \tag{6}$$

and the horizontal momentum equation:

$$\frac{\partial \mathbf{u}_\alpha}{\partial t} + \tfrac{1}{2}\nabla(\mathbf{u}_\alpha\cdot\mathbf{u}_\alpha) + g\nabla\eta + \frac{\partial}{\partial t}\left\{\tfrac{1}{2}z_\alpha^{2}\nabla S + z_\alpha\nabla T - \nabla\left(\tfrac{1}{2}\eta^{2}S\right) - \nabla(\eta T)\right\} + \nabla\left\{\frac{\partial \eta}{\partial t}(T + \eta S) + (z_\alpha - \eta)(\mathbf{u}_\alpha\cdot\nabla)T + \tfrac{1}{2}\left(z_\alpha^{2} - \eta^{2}\right)(\mathbf{u}_\alpha\cdot\nabla)S + \tfrac{1}{2}(T + \eta S)^{2}\right\} = 0 \tag{7}$$

where $S = \nabla\cdot\mathbf{u}_\alpha$, $T = \nabla\cdot(h\mathbf{u}_\alpha) + \partial h/\partial t$, h is depth, and η is free surface elevation. Equations (6) and (7) differ from the equations given by Wei et al. [14] in the inclusion of the time derivatives of the depth ($h_t$, $h_{tt}$) to account for temporal bottom profile changes that occur during a landslide or earthquake, which is one of several possible sources of tsunami.

The parallel approach has had three important aspects: domain decomposition, communication, and parallel solution of the tridiagonal system of simultaneous linear equations. Three different domain sizes have been considered: ( ), ( ), and ( ). The overall performance of the model has been very good. The efficiency of the model decreases as the number of processors increases, which is apparent in the case of the ( ) and ( ) domains. The rate of the efficiency decrease is faster for the smaller domain. This is due to the ratio of arithmetic operation time to communication time decreasing faster for domains with a smaller number of nodes. The performance of the model improves as the number of grid points increases; a favourable feature of a parallel model which is intended for simulation on ever-increasing domain sizes. Thus, this parallel model provides a future opportunity for large wave-resolving simulations in the near shore, with global domains of many millions of grid points, covering O(00km ) and greater basins. Further, real-time simulation with Boussinesq equations becomes a possibility.

C. Implicit Parallel FEM Analysis of Shallow Water Equations

Jiang Chunbo et al. [15] have solved the shallow water equations (SWEs) as the governing equation to model a river flow. SWEs are implemented on clustered workstations. For the parallel computation, the mesh is automatically partitioned using the
geometric mesh partitioning method. The governing equations are then discretized implicitly to form a large sparse linear system, which is solved using a direct parallel generalized minimum residual algorithm (GMRES). The shallow water equations (8) and (9) used here are obtained by integrating the conservation of mass and momentum equations assuming a hydrostatic pressure distribution in the vertical direction.

$$\frac{\partial \eta}{\partial t} + \frac{\partial q_i}{\partial x_i} = 0 \tag{8}$$

(9) [momentum equation for $q_i$, with a pressure-gradient term in gH, an eddy-viscosity term in ν, and surface and bottom shear-stress terms in $\tau^s$, $\tau^b$ and ρ, for i, j = 1, 2; the formula is not recoverable from the source text]

where η is the water elevation, H is the water depth, q = uH, u is the mean horizontal velocity, g is the gravitational acceleration, ν is the eddy viscosity, ρ is the water density, $\tau^s$ is the surface shear stress, and $\tau^b$ is the bottom shear stress. The finite element method for triangular elements is used for the spatial discretization. Three kinds of finite element meshes are used to simulate the velocity field in the two numerical examples, flow around a circular cylinder and flow in the Yangtze River. MPI has been used for the communication between nodes. The computed results agree well with the observed results. The good speedup and efficiency for the parallel computation show that the parallel computing technique is a good method to solve large-scale problems [15].

III. Linear Long Wave Theory

There are many different numerical methods for computing shallow water equations on a sphere. Therefore, a standard test suite of seven problems for evaluating numerical methods for the shallow water equations in spherical geometry was proposed by Williamson et al. [3] and accepted by the modeling community in order to compare newly proposed methods. The shallow water equations are widely used as a prototype to study phenomena like wave-vortex interactions that occur in more complicated models of large scale atmosphere/ocean dynamics [5]. Consider the sea to be a volume of incompressible water on a rotating sphere, with Coriolis force, fu.
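The horizontal coordinates are x and y, and the vertical coordinate is z, which is zero at the mean sea surface and positive upwards. The sea bed is located at z = -H, and the surface is located at z = h. The linear shallow water equations [16], [17], [18] consist of the continuity equation

$$\frac{\partial h}{\partial t} + \frac{\partial}{\partial x}\left[u(h + H)\right] + \frac{\partial}{\partial y}\left[v(h + H)\right] = 0 \tag{10}$$

and the conservation of horizontal momentum

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} - fv = -g\frac{\partial h}{\partial x} \tag{11}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + fu = -g\frac{\partial h}{\partial y} \tag{12}$$

where u and v are the velocity components in the x- and y-direction, h is the surface elevation and g is the acceleration due to gravity. Defining the equations in terms of the discharge fluxes U = u(h + H), V = v(h + H) leads to a discretization that always satisfies the conservation of mass. Then the conservation of horizontal momentum can be written as

$$\frac{\partial U}{\partial t} + \frac{\partial}{\partial x}\left(\frac{U^{2}}{h + H}\right) + \frac{\partial}{\partial y}\left(\frac{UV}{h + H}\right) - fV = -g(h + H)\frac{\partial h}{\partial x} \tag{13}$$

$$\frac{\partial V}{\partial t} + \frac{\partial}{\partial x}\left(\frac{UV}{h + H}\right) + \frac{\partial}{\partial y}\left(\frac{V^{2}}{h + H}\right) + fU = -g(h + H)\frac{\partial h}{\partial y} \tag{14}$$

The finite difference approximations, using centered differences in space and a leap-frog time discretization, are based on a staggered grid corresponding to an Arakawa C-grid [19], [20], with the continuity equation centered on the point $(x_i, y_j, t_{k+1/2})$ and the equations of motion centered on the points $(x_{i+1/2}, y_j, t_k)$ and $(x_i, y_{j+1/2}, t_k)$ respectively. Writing $D \equiv h + H$, and using upwind differences for the convection terms to maintain stability, it follows that

(15) [upwind difference approximation of the convection term: a weighted sum of $U^2/D$ at three neighbouring grid points with weights $\lambda_1$, $\lambda_2$, $\lambda_3$; the formula is not recoverable from the source text]

where, with up-winding, the weights λ are selected according to the sign of U, and similarly for the other terms.

The difference equations are defined on a rectangular grid in terms of spherical polar coordinates. An accurate representation of a tsunami running up the shore implies a grid spacing of no more than 00 meters in a region of about 4 km out from the shore. As this fine grid is not reasonable over the whole ocean, a succession of overlapping grids is necessary near the coast. A data decomposition scheme is applied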
for the parallel solution of the shallow water equations. In data decomposition, we keep the sequential formulation of the problem, but distribute the data and operations among the processors. The scalability of several data decomposition algorithms for finite difference atmospheric and ocean models has been analyzed by several authors [21], [22].

Rectangular physical or computational domain

The physical or computational domain used in this paper is rectangular in shape. In the domain decomposition method, the rectangular domain is divided into several smaller rectangular sub-domains, where the number of sub-domains is equal to the number of processors used. With four processors, for example, there are three possible ways of decomposing the domain into equal-area parts, as depicted in Fig. 2. The best decomposition depends not only on the architecture of the system being used, but more importantly on the memory limitations on each node, especially in commodity clusters. As such, an assignment like two by two is recommended.

Fig. 2. Three possible ways of decomposing a rectangular domain. The numbers in the sub-domains represent the processor number.

Several strategies exist within the data decomposition paradigm for dividing domains into sub-domains. In the two-dimensional grid, the computational domain is decomposed in both the x and y coordinate directions. In many cases, the computation is proportional to the volume of a sub-domain and the communication is proportional to the surface area. In such cases, a logical strategy is to partition the domain in such a way that it minimizes the surface area of each sub-domain relative to its volume. This keeps the computation-to-communication ratio high. In this study, two-dimensional decomposition is chosen. This involves assigning each sub-domain to a processor and solving the equations for that sub-domain on the respective processor. With the two-dimensional decomposition, no global information is required at any particular grid point and interprocessor communications are required only at the boundaries of the sub-domains.
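As a rough illustration of the surface-to-volume argument above, the following sketch compares the three ways of assigning four processors to a square domain. It is not the authors' code: the function names, the one-cell-wide halo, and the cost measure (boundary cells sent by the busiest sub-domain) are assumptions made for this example.

```python
def factor_pairs(n_procs):
    """All (px, py) processor grids with px * py == n_procs."""
    return [(px, n_procs // px) for px in range(1, n_procs + 1)
            if n_procs % px == 0]


def exchanged_cells(width, length, px, py):
    """Boundary cells the busiest sub-domain sends in one halo swap.

    Assumes a one-cell-wide halo and at most two neighbours along each
    coordinate direction (hypothetical cost model, not from the paper).
    """
    sub_w = width / px    # sub-domain extent in x
    sub_l = length / py   # sub-domain extent in y
    ew_neighbours = min(2, px - 1)  # east/west partners
    ns_neighbours = min(2, py - 1)  # north/south partners
    return ew_neighbours * sub_l + ns_neighbours * sub_w


def best_grid(width, length, n_procs):
    """Processor grid that minimises the exchanged boundary cells."""
    return min(factor_pairs(n_procs),
               key=lambda g: exchanged_cells(width, length, g[0], g[1]))


if __name__ == "__main__":
    for px, py in factor_pairs(4):
        print((px, py), exchanged_cells(100, 100, px, py))
    print("best:", best_grid(100, 100, 4))
```

Under this cost model the two strip layouts exchange twice as many cells as the two by two layout, which agrees with the recommendation in the text. Memory limits per node, which the text calls the more important factor, are not modelled here.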
The inner border of a sub-domain requires the outer border of the adjacent sub-domain during a time-step because of the spatial discretization [16], [17], [18].

IV. Algorithm Design

In our work, the domain decomposition method is used to parallelize the tsunami model. In this method, the parallel algorithm is very similar to the serial algorithm, with some additional routines added to facilitate the communication between processors. Using this method, all the processors involved in the parallel calculations basically perform the same computational operations. The only difference is in the data being processed in each processor.

Fig. 3. (A) Overview of the unstructured grid and assignment of worker nodes to sub-domains. (B) Arrows indicate the inter-worker communication.

An important aspect in decomposing the domain is load balancing, i.e. all processors should have an equal or almost equal amount of data to be processed. If the number of grid points is divisible by the number of processors, the number of grid points in each processor is simply the ratio of the number of grid points to the number of processors. If it is not, the remainder is distributed across the first m processors, where m is the remainder. The load should be balanced in both the x and y directions.

Fig. 4. Linear shallow water wave profiles at three different time-steps (ts) calculated using 48 processors.
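The remainder rule for load balancing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the row-major numbering of processors on the grid is an assumption.

```python
def split_counts(n_points, n_procs):
    """Grid points per processor along one direction.

    Every processor gets n_points // n_procs points; the remainder m is
    spread over the first m processors, one extra point each.
    """
    base, remainder = divmod(n_points, n_procs)
    return [base + 1 if p < remainder else base for p in range(n_procs)]


def split_offsets(n_points, n_procs):
    """Index of the first grid point owned by each processor."""
    offsets, start = [], 0
    for count in split_counts(n_points, n_procs):
        offsets.append(start)
        start += count
    return offsets


def subdomain_shape(nx, ny, px, py, k):
    """(points in x, points in y) owned by processor k.

    Assumes zero-based, row-major numbering on a px-by-py processor grid.
    """
    row, col = divmod(k, px)
    return split_counts(nx, px)[col], split_counts(ny, py)[row]


if __name__ == "__main__":
    print(split_counts(10, 4))    # counts along one direction
    print(split_offsets(10, 4))   # starting indices
    print(subdomain_shape(10, 7, 4, 3, 11))
```

Applying the same rule independently in x and y keeps the largest and smallest sub-domains within one grid line of each other in each direction.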

Two grid schemes, a 12 node (4 × 3) grid system and a 48 node (8 × 6) grid system, are used to parallelize the tsunami model with a domain size of (900 6). Portions of the domain are assigned to each of the worker nodes, as illustrated in Fig. 3, where each sub-domain is labeled with a processor (worker) number. Snapshots of the free surface evolution are shown in Fig. 4. Data for each sub-domain is stored with each processor, including water depth, coordinates, and initial disturbance. This data is read by each of the workers in a preprocessing stage, i.e., prior to initiating the time-integration loop. Since the model updates the solution explicitly using local data, each processor works independently of the others but requires data from neighbouring workers to update the solution along sub-domain boundaries.

The exchange of data between processors occurs several times per time-step. There are two key features to this exchange. First, only data along the boundaries between processors is exchanged. Second, each processor is only required to communicate with at most four other processors. The domain decomposition is performed with this second feature in mind, to avoid communication with more than four other processors. Communication occurs between two adjacent processors during message passing.

In passing the data from one processor to another, an efficient and safe communication must be developed. To efficiently exchange the data between adjacent processors, the data is first stored in contiguous memory prior to executing the sending processes. At the same time, contiguous buffers of the same size as those used in the sending processes are created to receive the data from the sending processes. At this point, the data is ready for the sending and receiving processes.
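The two features of the exchange, at most four neighbours and boundary data copied into a contiguous buffer before sending, can be sketched in plain Python as below. No MPI or C/Linda calls are used; the function names, the zero-based row-major processor numbering, and the one-cell halo stored in the first and last column of each local array are all assumptions made for this illustration.

```python
def neighbours(k, n_rows, n_cols):
    """Neighbour ranks of processor k, or None at the domain edge.

    Assumes zero-based, row-major numbering on an n_rows-by-n_cols grid.
    """
    row, col = divmod(k, n_cols)
    return {
        "north": k - n_cols if row > 0 else None,
        "south": k + n_cols if row < n_rows - 1 else None,
        "east": k + 1 if col < n_cols - 1 else None,
        "west": k - 1 if col > 0 else None,
    }


def pack_column(field, col):
    """Copy one (strided) column of a 2D field into a contiguous buffer."""
    return [row[col] for row in field]


def unpack_column(field, col, buffer):
    """Write a received buffer into one column of a 2D field."""
    for row, value in zip(field, buffer):
        row[col] = value


def send_east(sender, receiver):
    """Stand-in for one message: the sender's easternmost interior column
    fills the receiver's west halo column (column 0)."""
    buffer = pack_column(sender, len(sender[0]) - 2)
    unpack_column(receiver, 0, buffer)


if __name__ == "__main__":
    print(neighbours(5, 3, 4))
    left = [[0, 1, 2, 0], [0, 3, 4, 0]]
    right = [[0, 5, 6, 0], [0, 7, 8, 0]]
    send_east(left, right)
    print(right)
```

In a message-passing version, the list returned by `pack_column` would play the role of the contiguous send buffer, and a list of the same length would be allocated on the receiving side before the transfer.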
TABLE I
Run Time in Seconds (wot: without thread; wt: with thread)

Nodes    MPI (wot)    MPI (wt)    C/Linda (wot)    C/Linda (wt)

In this model, message passing occurs four times per time-step, and the MPI function MPI_Sendrecv and the C/Linda operations in and out are used to perform the message passing between global nodes. This corresponds to eight messages per time-step per processor, independent of the number of processors being used. Many more messages are sent during pre-processing, but these are ignored for run-time analysis purposes since time integration is by far the most time consuming element of the program. The parallel algorithm is outlined below.

1. Decompose the rectangular domain into load balanced rectangular sub-domains, where the width and length of a rectangular sub-domain are approximately W/N_w and L/N_l respectively, where W and L are the width and length of the domain, and N = N_w * N_l is the number of processors to be used in the (N_w × N_l) grid scheme.

2. Specify parallel language related parameters of the processors, such as locations, neighbours, sub-domain sizes, and file names. The location of the k-th processor is (r_k, c_k), where r_k = k/gdim2 + 1 (integer division) and c_k = k - (r_k - 1)*gdim2 + 1, with gdim1 and gdim2 being the dimensions of the grid. In the four by three grid scheme, gdim1 = 3 and gdim2 = 4, and in the eight by six scheme, gdim1 = 6 and gdim2 = 8. For k = 0, 1, 2, 3, 4, ..., (gdim1*gdim2 - 1), define north_k, south_k, east_k, and west_k to be the neighbours located to the north, south, east and west of the k-th processor; then:
   north_k = k - gdim2 if r_k > 1
   south_k = k + gdim2 if r_k < gdim1
   east_k = k + 1 if c_k < gdim2
   west_k = k - 1 if c_k > 1

3. Input data and initial conditions:
   (a) Each processor reads the water depth, coordinates, and initial disturbance from the text file assigned in step 2.
   (b) Each processor exchanges sub-domain boundary data from east_k to west_k, from west_k to east_k, from north_k to south_k, and from south_k to north_k, in order.

4. Set parameters and coefficients used at the open sea boundary.

5. Repeat the following steps for the pre-defined time-steps:
   (a) Each processor exchanges sub-domain boundary data from east_k to west_k.
   (b) Each processor exchanges sub-domain boundary data from north_k to south_k.
   (c) Computation of the equation of continuity.
   (d) Setting of the open sea boundary condition.
   (e) Each processor exchanges sub-domain boundary data from west_k to east_k.
   (f) Each processor exchanges sub-domain boundary data from south_k to north_k.

6. Gather computed results among all processors.

7. Compute the output on the processor where both are equal to null.

V. Implementation

Data is partitioned by sub-domains, where a four by three grid scheme and an eight by six grid scheme are used for the parallel implementation of this model. In each of the grid schemes, there are four parallel variations: (1) distributed memory without threads; (2) distributed memory with threads; (3) virtual shared memory without threads; (4) virtual shared memory with threads. Implementations use the Message Passing Interface (MPI) library [11], the C/Linda [12] and the pthread library [13].

The mixed-mode programming model, which uses thread programming in the shared memory layer and message passing programming in the distributed memory layer, is a method commonly used to utilize the hierarchy of memory resources efficiently [9]. In our mixed-mode programming model, MPI is used for the data communication between the global nodes, and within each MPI process, pthreads are used. In the virtual shared memory mixed-mode approach, C/Linda is used for the data communication between the global nodes, and within each C/Linda process, pthreads are used.

These four algorithms have been implemented on the Monolith cluster of Nordugrid [23], which consists of a high speed backbone interconnecting multi-processor nodes. The Monolith cluster has 396 nodes, all of which are i686 architecture with .0GHz Intel(R) XEON(TM) processors, dual processor nodes, 2048MB per node main memory and 5KB cache. The operating system is Linux version 2.4.34-cap-smp.

VI. Evaluation

Table I shows the timing values for the eight scenarios arising from the four parallel variations of the algorithm mentioned above. Consider the four by three grid scheme, with both threading and non-threading.
The parallel algorithms for the virtual shared memory exhibit better performance than the algorithms for distributed memory. Though the virtual shared memory implicitly passes messages, the replication management subsystem has been optimized in C/Linda compared to MPI [0], to yield better performance. In the four by three grid scheme, in both scenarios for distributed and virtual shared memory, non-threading algorithms exhibit better performance than threading algorithms. One possible explanation for this is that each node keeps nine two-dimensional floating point arrays of size equal to the sub-domain size. Because of the memory limitations it is not possible to declare local two-dimensional arrays for threads, causing all threads in a node to concurrently use global arrays for their own computations.

Now consider the eight by six grid scheme. Here too, for both threading and non-threading, the parallel algorithms for the virtual shared memory exhibit better performance than the algorithms for distributed memory. In contrast to the previous instance, for both distributed and virtual shared memory the threading algorithms exhibit better performance than the non-threading algorithms. This is because the sub-domain size allocated to each node is half the sub-domain size of the smaller four by three grid scheme. The allocation of local two-dimensional floating point arrays for threads in a node is now possible, unlike in the four by three grid system. Therefore, since all threads of a node work independently, the threading parallel algorithm shows better performance than the non-threading parallel algorithm.

Among the parallel variations not using threads in both memory architectures, the four by three grid scheme shows better performance than the eight by six grid scheme. This is due to the ratio of computation time to communication overhead decreasing faster for domains with smaller size.
However, when the task is scaled up, say up to the eight by six grid system, owing to the smaller sub-domain sizes aligning with the available memory in the node, the threaded versions become more efficient in both MPI and C/Linda implementations. In both threading and non-threading environments, the C/Linda version exhibits better performance than the MPI version.

VII. Conclusion

This paper has presented eight different parallel implementations of a tsunami model based on the shallow water equations. Each of these implementations uses a mixed-mode programming model from thread based shared memory, to
distributed memory and finally to virtual shared memory. Owing to node memory limitations, scalability issues become paramount, and threading becomes a significant bottleneck if sufficient node memory is not available, offsetting the middleware advantages. With sufficient node memory, however, C/Linda with threads outperforms MPI with threads, indicating the effectiveness of extracting parallelism over the multiple levels of virtual shared memory, distributed memory and shared memory for this class of problems.

Acknowledgment

This work was conducted at the Department of Computation and Intelligent Systems, University of Colombo School of Computing. This research was performed using computational resources at the National Supercomputer Centre, Linkoping University, Sweden, and the computational resources of the Research Laboratory, University of Colombo School of Computing.

References

[1] Wikipedia
[2] K. I. Sitanggang and P. Lynett, Parallel computation of a highly nonlinear Boussinesq equation model through domain decomposition, International Journal for Numerical Methods in Fluids, ISSN 0271-2091, vol. 49, 2005.
[3] D. L. Williamson, J. B. Drake, J. J. Hack, R. Jakob, and P. N. Swarztrauber, A Standard Test Set for Numerical Approximations to the Shallow Water Equations in Spherical Geometry, Journal of Computational Physics, 102(1), pp. 211-224, 1992.
[4] T. Vasily, G. Frank, and M. Hal, Tsunami Forecasting and Hazard Assessment Capabilities, Pacific Disaster Center (PDC) & Maui High Performance Computing Center (MHPCC) Tsunami Modeling, National Oceanic and Atmospheric Administration Center for Tsunami Research (2009).
[5] J. D. Paul and S. Rick, Shallow Water Equations with a Complete Coriolis Force and Topography, Physics of Fluids, September 2005.
[6] B. Ragnhild, Nested Parallelism in OpenMP with Application to Adaptive Mesh Refinement, University of Bergen, February 2003.
[7] F. S. Brett and C. P. John, Parallel Implementation of an Explicit Finite-volume Shallow-water Model, 16th ASCE Engineering Mechanics Conference, July 16-18, 2003, University of Washington, Seattle.
[8] S. L. Johnsson and C. Ho, Algorithms for Matrix Transposition on Boolean N-Cube Configured Ensemble Architectures, SIAM J. Matrix Analysis and Application, vol. 9(3), July 1988.
[9] W. Meng-Shiou, A. Srinivas, and A. K. Ricky, Mixed Mode Matrix Multiplication, Proceedings of the IEEE International Conference on Cluster Computing, IEEE Computer Society, Washington, DC, USA, 2002.
[10] C. Xing and P. L. Hans, Making Hybrid Tsunami Simulators in a Parallel Software Framework, LNCS, Applied Parallel Computing. State of the Art in Scientific Computing, vol. 4699/2008.
[11] MPI Forum, mcs.anl.gov/mpi/, 2009.
[12] Linda User Guide, Scientific Computing Associates Inc., One Century Tower, 65 Church Street, New Haven, CT USA, September, 2005. (about/index.html, 2009)
[13] Pthreads. threads.html, 2009.
[14] G. Wei, J. T. Kirby, S. T. Grilli, R. Subramanya, A fully nonlinear Boussinesq model for surface waves, Part I, Highly nonlinear unsteady waves, Journal of Fluid Mechanics, 1995; 294:71-92.
[15] J. Chunbo, L. Kai, L. Ning, and Z. Qinghai, Implicit Parallel FEM Analysis of Shallow Water Equations, Tsinghua Science & Technology, vol. 10, Issue 3, June 2005.
[16] atmos_spectral_shallow/shallow.pdf, 2009.
[17]
[18]
[19] A. Arakawa and V. Lamb, Computational design of the basic dynamical processes of the UCLA general circulation model, Methods in Computational Physics, vol. 17, Academic Press, 1977.
[20] C. Goto and Y. Ogawa, Dept. of Civil Engineering, Tohoku University, translated for the Tsunami Inundation Modelling Exchange Project by N. Shuto, Numerical method of tsunami simulation with the leap-frog scheme, UNESCO Intergovernmental Oceanographic Commission, Manuals and Guides, 35, 99.
[21] S. Roar, Scalability of Parallel Grid Point Limited Area Atmospheric Models I & II, Manuscript, Department of Mathematics Sciences, Norwegian University of Science and Technology, Trondheim, Norway, 1996.
[22] S. Thomas, J. Cote, A. Staniforth, I. Lie, and R. Skalin, A Semi-Implicit Semi-Lagrange Shallow-Water Model for Massively Parallel Processors, Proceedings of the 6th ECMWF Workshop on the Use of Parallel Processors in Meteorology, November 1994.
[23] The Grid middleware project in the Nordic countries.


More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.

More information

AADL : about scheduling analysis

AADL : about scheduling analysis AADL : about schedulng analyss Schedulng analyss, what s t? Embedded real-tme crtcal systems have temporal constrants to meet (e.g. deadlne). Many systems are bult wth operatng systems provdng multtaskng

More information

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach Modelng, Manpulatng, and Vsualzng Contnuous Volumetrc Data: A Novel Splne-based Approach Jng Hua Center for Vsual Computng, Department of Computer Scence SUNY at Stony Brook Talk Outlne Introducton and

More information

Distributed Resource Scheduling in Grid Computing Using Fuzzy Approach

Distributed Resource Scheduling in Grid Computing Using Fuzzy Approach Dstrbuted Resource Schedulng n Grd Computng Usng Fuzzy Approach Shahram Amn, Mohammad Ahmad Computer Engneerng Department Islamc Azad Unversty branch Mahallat, Iran Islamc Azad Unversty branch khomen,

More information

Parallel and Distributed Association Rule Mining - Dr. Giuseppe Di Fatta. San Vigilio,

Parallel and Distributed Association Rule Mining - Dr. Giuseppe Di Fatta. San Vigilio, Parallel and Dstrbuted Assocaton Rule Mnng - Dr. Guseppe D Fatta fatta@nf.un-konstanz.de San Vglo, 18-09-2004 1 Overvew Assocaton Rule Mnng (ARM) Apror algorthm Hgh Performance Parallel and Dstrbuted Computng

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Intra-Parametric Analysis of a Fuzzy MOLP

Intra-Parametric Analysis of a Fuzzy MOLP Intra-Parametrc Analyss of a Fuzzy MOLP a MIAO-LING WANG a Department of Industral Engneerng and Management a Mnghsn Insttute of Technology and Hsnchu Tawan, ROC b HSIAO-FAN WANG b Insttute of Industral

More information