A Multi-Core Numerical Framework for Characterizing Flow in Oil Reservoirs

Size: px
Start display at page:

Download "A Multi-Core Numerical Framework for Characterizing Flow in Oil Reservoirs"

Transcription

1 A Mult-Core Numercal Framework for Characterzng Flow n Ol Reservors The MIT Faculty has made ths artcle openly avalable. Please share how ths access benefts you. Your story matters. Ctaton As Publshed Publsher Leonard, Chrstopher R. et al. "A Mult-Core Numercal Framework for Characterzng Flow n Ol Reservors." n Papers of the 19th Hgh Performance Computng Symposum (HPC 2011) Boston, Massachusetts, USA Aprl 4 6, Socety for Modelng & Smulaton Internatonal Verson Author's fnal manuscrpt Accessed Wed Apr 24 21:51:12 EDT 2019 Ctable Lnk Terms of Use Creatve Commons Attrbuton-Noncommercal-Share Alke 3.0 Detaled Terms

2 A Mult-Core Numercal Framework for Characterzng Flow n Ol Reservors Chrstopher R. Leonard, Cvl and Envronmental Engneerng, Massachusetts Insttute of Technology, 77 Massachusetts Avenue, Cambrdge, MA chrsleo@mt.edu Davd W. Holmes, Department of Mechancal Engneerng, James Cook Unversty, Angus Smth Drve, Douglas, QLD 4811, Australa davd.holmes1@cu.edu.au John R. Wllams, Cvl and Envronmental Engneerng and Engneerng Systems, Massachusetts Insttute of Technology, 77 Massachusetts Avenue, Cambrdge, MA rw@mt.edu Peter G. Tlke, Department of Mathematcs and Modelng, Schlumberger-Doll Research Center, 1 Hampshre Street, Cambrdge, MA tlke@slb.com Keywords: Parallel computaton, mult-core, smoothed partcle hydrodynamcs, lattce Boltzmann method, enhanced ol recovery Abstract Ths paper presents a numercal framework that enables scalable, parallel executon of engneerng smulatons on mult-core, shared memory archtectures. Dstrbuton of the smulatons s done by selectve hash-tablng of the model doman whch spatally decomposes t nto a number of orthogonal computatonal tasks. These tasks, the sze of whch s crtcal to optmal cache blockng and consequently performance, are then dstrbuted for executon to multple threads usng the prevously presented task management algorthm, H-Dspatch. Two numercal methods, smoothed partcle hydrodynamcs (SPH) and the lattce Boltzmann method (LBM), are dscussed n the present work, although the framework s general enough to be used wth any explct tme ntegraton scheme. The mplementaton of both SPH and the LBM wthn the parallel framework s outlned, and the performance of each s presented n terms of speed-up and effcency. On the 24-core server used n ths research, near lnear scalablty was acheved for both numercal methods wth utlzaton effcences up to 95%. To close, the framework s employed to smulate flud flow n a porous rock specmen, whch s of broad geophyscal sgnfcance, partcularly n enhanced ol recovery. 1. INTRODUCTION The extenson of engneerng computatons from seral to parallel has had a profound effect on the scale and complexty of problems that can be modeled n contnuum and dscontnuum mechancs. Tradtonally, such parallel computng has almost exclusvely been undertaken wth dstrbuted memory parallel archtectures, such as clusters of sngle-processor machnes. A number of authors have reported on parallel partcle methods (of whch smoothed partcle hydrodynamcs, SPH, s an example) demonstratng scalablty on such archtectures, for example Walther and Sbalzarn [1] and Ferrar et al. [2]. The lattce Boltzmann method (LBM) has also been a popular canddate for dstrbuted computng whch s unsurprsng due to the naturally parallel characterstcs of ts tradtonally regular, orthogonal grd and local node operatons. For example, Vdal et al. [3] presented results ncorporatng fve bllon LBM nodes wth a speed-up effcency of 75% on 128 processors. Götz et al. [4] smulated dense partcle suspensons wth the LBM and a rgd body physcs engne on an SGI Altx system wth 8192 cores (based on dual-core processors). At 7800 processors an effcency of approxmately 60% s acheved n a smulaton featurng 15.6 bllon LBM nodes and 4.6 mllon suspended partcles. Bernasch et al. [5] utlzed a GPU mplementaton of a mult-component LBM and n 2D smulatons of 4.2 mllon nodes acheved a speed-up factor of 13 over ther benchmark CPU performance. Of partcular relevance to ths study s the work of Zeser et al. [6] n whch a parallel cache blockng strategy was used to optmally decompose space-tme of ther LBM smulatons. In addton, ths strategy was purported to be cacheoblvous so that the decomposed blocks were automatcally matched to the cache sze of the hardware used, mnmzng the latency of memory access durng the smulaton. Shared memory mult-core processors have emerged n the last fve years as a relatvely nexpensve commercaloff-the-shelf hardware opton for techncal computng. Ther development has been motvated by the current clockspeed lmtatons that are hnderng the advancement, at least n terms of pure performance, of sngle processors [7]. However, as a comparatvely young technology, there exsts lttle publshed work ([8] s one example) addressng the mplementaton of numercal codes on shared memory mult-core processors. Wth the expense and hgh demand for compute tme on cluster systems, mult-core represents an attractve and accessble HPC alternatve, but the known challenges of software development on such archtectures (.e. thread safety and memory bandwdth ssues [9, 10, 11, 12]) must be addressed. Mult-core technologes are mportantly begnnng to nfltrate all levels of computng, ncludng wthn each node of modern cross-machne

3 clusters. As such, the development of scalable programmng strateges for mult-core wll have wdespread beneft. A varety of approaches to programmng on mult-core have been proposed to date. Commonly, concurrency tools from tradtonal cluster computng lke MPI [13] and OpenMP [14] have been used to acheve fast take-up of the new technology. Unfortunately, fundamental dfferences n the cross-machne and mult-core archtectures mean that such approaches are rarely optmal for mult-core and result n poor scalablty for many applcatons. In response to ths, Chrysanthakopoulos and co-workers [15, 16], based on earler work by Stewart [17], have mplemented mult-core concurrency lbrares usng Port based abstractons. These mmc the functonalty of a message passng lbrary lke MPI, but use shared memory as the medum for data exchange, rather than exchangng seralzed packets over TCP/IP. Such an approach provdes flexblty n program structure, whle stll captalzng on the speed advantages of shared memory. Perhaps as a reflecton of the growng mportance of mult-core software development, a number of other concurrency lbrares have been developed such as Axum and Clk++. In addton, the latest.net Framework ncludes a Task Parallel Lbrary (TPL) whch provdes methods and types, wth varyng degrees of abstracton, whch can be used wth mnmal programmatc dffculty to dstrbute tasks on multple threads. In an earler paper [18] we have shown that a programmng model developed usng such port-based technques descrbed n [15, 16] provdes sgnfcant performance advantages over approaches lke MPI and OpenMP. Importantly, t was found that the H-Dspatch dstrbuton model facltated adustable cache-blockng whch allowed performance to be tuned va the computatonal task sze. In ths paper, we apply the proposed programmng model to the parallelzaton of both partcle based methods and fxed-grd numercal methods on mult-core. The unque challenges n parallel mplementaton of both methods wll be dscussed and the performance mprovements wll be presented. The layout of ths paper s as follows. In Secton 2 a bref descrpton of the mult-core dstrbuton model, H- Dspatch, s provded. Both the SPH and LBM numercal methods are outlned n Secton 3 and the relevant aspects of ther mplementaton n the mult-core framework, ncludng thread safety and cache memory effcency, are dscussed. Secton 4 presents performance test results from both the SPH and LBM smulators as run on a 24-core server and, fnally, an applcaton of the mult-core numercal framework to a porous meda flow problem relevant to enhanced ol recovery s presented n Secton MULTI-CORE DISTRIBUTION One of the hndrances to scalable, cross-machne dstrbuton of numercal methods s the communcaton of ghost regons. These regons correspond to neghborng sectons of the problem doman (resdent n memory on other cluster nodes) whch are requred on a cluster node for the processng of ts own sub-doman. In the LBM ths s typcally a 'layer' of grd ponts that encapsulates the local sub-doman, but n SPH the layer of neghborng partcles requred s equal to the radus of the compact support zone. In 3D t can be shown that, dependng on the sub-doman sze, the communcated fracton of the problem doman can easly exceed 50%. In ths stuaton Amdahl's Law [7], and the fact that tradtonal cross-machne parallelsm usng messagng packages s a seral process, dctates that ths type of dstrbuted memory approach wll scale poorly. If a problem s dvded nto spatal sub-domans for mult-core dstrbuton, ghost regons are no longer necessary because adacent data s readly avalable n shared memory. Further, the removal of relatvely slow network communcatons requred n cluster computng allows for an entrely new programmng paradgm. Subdomans can take on any smple shape or sze and threaded programmng means many small sub-domans can be processed on each core from an events queue rather than needng to approxmate a sngle large, computatonally balanced doman for each processor. Consequently, dynamc doman decomposton becomes unnecessary and a partcle's poston n a doman can be as smple as a spatal hashng, allowng advecton to proceed wth mnmal management. Such characterstcs mean that mult-core s perfectly suted to the parallel mplementaton of partcle methods, however, shared memory challenges such as thread safety and bandwdth lmtatons must be addressed. The decomposton of the spatal doman of a numercal method creates a number of computatonal tasks. Mult-core dstrbuton of these tasks requres the use of a coordnaton tool to manage them onto processng cores n a load balanced way. Whle such tasks could easly be dstrbuted usng a tradtonal approach lke scatter-gather, here the H- Dspatch programmng model of [18] has been used because of the demonstrated advantages for performance and memory effcency. A schematc llustratng the functonalty of the H- Dspatch programmng model s shown n Fgure 1. The fgure shows three endurng threads (correspondng to three processng cores) that reman actve through each tme step of the analyss. A smple problem space wth nne decomposed tasks s dstrbuted across these threads by H- Dspatch. The novel feature of H-Dspatch s the way n whch tasks are dstrbuted to threads. Rather than a scatter or push of tasks from the manager to threads, here threads request values when free. H-Dspatch manages requests and dstrbutes cells to the requestng threads accordngly. It s ths pull mechansm that enables the use of a sngle thread per core as threads only request a value when free, thus, there s never more than one task at a tme assocated wth a

4 gven endurng thread (and ts assocated local varable memory). Addtonally, when all tasks n the problem space have been dspatched and processed, H-Dspatch dentfes step completon (.e. synchronzaton) and the process can begn agan. Fgure 1. Schematc representaton of the H-Dspatch programmng model [18] used to dstrbute tasks to cores. An endurng processng thread s avalable for each core, whch s three n ths smplfed representaton, and H- Dspatch coordnates tasks to threads n a load balanced way over a number of tme steps. The key beneft of such an approach from the perspectve of memory usage s n the ablty to mantan a sngle set of local varables for each endurng thread. The numercal task assocated wth analyss on a sub-doman wll nevtably requre local calculaton varables, often wth sgnfcant memory requrements (partcularly for the case of partcle methods). Overwrtng ths memory wth each allocated cell means the number of local varable sets wll match core count, rather than total cell count. Consderng that most problems wll be run wth core counts n the 10's or 100's, but cell counts n the 1,000's or 10,000's, ths can sgnfcantly mprove the memory effcency of a code. Addtonally, n managed codes lke C#.NET and Java, because thread local varable memory remans actve throughout the analyss, t s not repeatedly reclamed by the garbage collector, a process that holds all other threads untl completon and degrades performance markedly (see [18]). 3. NUMERICAL METHODS The mult-core numercal framework featured n ths paper has been desgned n a general fashon so as to accommodate any explct numercal method, such as SPH, LBM, the dscrete element method (DEM), the fnte element method (FEM) or fnte dfference (FD) technques. It s worth notng that t could be adapted to accommodate mplct, teratve schemes wth the correct data structures for thread safety but that s not the focus of ths work. Instead, ths study wll focus on SPH and LBM, however the performance of the mult-core framework wth an FD scheme has been prevously reported [18] Smoothed Partcle Hydrodynamcs SPH s a mesh-free Lagrangan partcle method whch was frst proposed for the study of astrophyscal problems by Lucy [19] and Gngold and Monaghan [20], but s now wdely appled to flud mechancs problems [21]. A key advantage of partcle methods such as SPH (see also dsspatve partcle dynamcs (DPD) [22]) s n ther ablty to advect mass wth each partcle, thus removng the need to explctly track phase nterfaces for problems nvolvng multple flud phases or free surface flows. However, the management of free partcles brngs wth t the assocated computatonal cost of performng spatal reasonng at every tme step. Ths requres a search algorthm to determne whch partcles fall wthn the compact support (.e. nteracton) zone of a partcle and then processng each nteractng par. Nevertheless, n many crcumstances ths expense can be ustfed by the versatlty wth whch a varety of mult-physcs phenomena can be ncluded. SPH theory has been detaled wdely n the lterature wth varous formulatons havng been proposed. The methodology of authors such as Tartakovsky and Meakn [23, 24] and Hu and Adams [25] has been shown to perform well for the case of mult-phase flud flows. Ther partcle number densty varant of the conventonal SPH formulaton removes erroneous artfcal surface tenson effects between phases and allows for phases of sgnfcantly dfferng denstes. Such a method has been used for the performance testng n ths work. The dscretzed partcle number densty SPH equatons for some feld quantty, A, s gven as, A A W r r, h, (1) n along wth ts gradent, A A W r r, h, (2) n where n m Wr r, h s the partcle number densty term, whle W s the smoothng functon (typcally a Gaussan or some form of splne), h s the smoothng length and r and r are poston vectors. These expressons are appled to the Naver-Stokes conservaton equatons to determne the SPH equatons of moton. Computng densty drectly from (1) gves, m W r r, h, (3) where ths expresson conserves mass exactly, much lke the summaton densty approach of conventonal SPH. An approprate term for partcle velocty rate has been provded by Morrs et al. [26], and used by Tartakovsky and Meakn [23], where,

5 dv dt n whch vscosty, 1 m N 1 1 m P P dw F 2 2 d r N r r v v nn r r 1 P s the partcle pressure, v s the partcle velocty and 2 dw dr (4) s the dynamc F s the body th force appled on the partcle. Surface tenson s ntroduced nto the method va the supermposton of par-wse nter-partcle forces followng Tartakovsky and Meakn [24], 3 r r s cos r r, r r h F 2h r r, (5) 0, r h r wheren s s the strength of force between partcles and, whle h s the nteracton dstance of a partcle. By defnng s as beng stronger between partcles of the same phase, than between partcles of a dfferent phase, surface tenson manfests naturally as a result of force mbalances at phase nterfaces. Smlarly, s can be defned to control the wettablty propertes of a sold. Sold boundares n the smulator are defned usng rows of vrtual partcles smlar to that used by Morrs et al [26], and no-slp boundary condtons are enforced for low Reynolds number flow smulatons usng an artfcally mposed boundary velocty method developed n [27] and shown to produce hgh accuracy results Mult-Core Implementaton of SPH Because partcle methods necesstate the recalculaton of nteractng partcle pars at regular ntervals, algorthms that reduce the number of canddate nteractng partcles to check are crtcal to numercal effcency. Ths s acheved by spatal hashng, whch assgns partcles to cells or 'bns' based on ther Cartesan coordnates. Wth a cell sde length greater than or equal to the nteracton depth of a partcle, all canddates for nteracton wth some target partcle wll be contaned n the target cell, or one of the mmedately neghborng cells. The storage of partcle cells s handled usng hash table abstractons such as the Dctonary<Tkey, Tvalue> class n C#.NET [28], and parallel dstrbuton s performed by allocaton of cell keys to processors from an events queue. In cases where data s requred from partcles n adacent cells, t s addressed drectly usng the key of the relevant cell. Wth the descrbed partcle cell decomposton of the doman care must be taken to avod common shared memory problems lke race condtons and thread contenton. To crcumvent the problems assocated wth usng locks (coarse graned lockng scales poorly, whle fne graned lockng s tedous to mplement and can ntroduce deadlockng condtons [29]) the SPH data can be structured to remove the possblty of thread contenton altogether. By storng the present and prevous values of the SPH feld varables, necessary gradent terms can be calculated as functons of values n prevous memory, whle updates are wrtten to the current value memory. Ths reduces the number of synchronzatons per tme step from two (f the gradent terms are calculated before synchronzng, followed by the update of the feld varables) to one, and a rollng memory algorthm swtches the ndex of prevous and current data wth successve tme steps. An mportant advantage of the use of spatal hashng to create partcle cells s the ease wth whch the cell sze can be used to optmze cache blockng. By adustng the cell sze, the assocated computatonal task can be re-szed to ft n cache close to the processor (e.g. L1 or L2 cache levels). It can be shown that cells fttng completely n cache demonstrate a sgnfcantly better performance (15 to 30%) than those that overflow cache causng an ncrease n cache msses, because cache msses requre that data then be retreved from RAM wth a greater memory latency The Lattce Boltzmann Method The lattce Boltzmann method (LBM) (see [30] for a revew) has been establshed n the last 20 years as a powerful numercal method for the smulaton of flud flows. It has found applcaton n a vast array of problems ncludng magnetohydrodynamcs, multphase and multcomponent flows, flows n porous meda, turbulent flows and partcle suspensons. The prmary varables n the LBM are the partcle dstrbuton functons, f x,t, whch exst at each of the lattce nodes that comprse the flud doman. These functons relate the probable amount of flud partcles movng wth a dscrete speed n a dscrete drecton at each lattce node at each tme ncrement. The partcle dstrbuton functons are evolved at each tme step va the two-stage, collde-stream process as defned n the lattce-bhatnagar- Gross-Krook [31] equaton (LBGK), t eq f x ct, t t f x, t f x, t f x, t,(6) AG c n whch x defnes the node coordnates, t s the explct tme step, c x / t defnes the lattce veloctes, s the relaxaton tme, f eq x t, are the nodal equlbrum functons, G s a body force (e.g. gravty) and A s a massconservng constant. The collson process, whch s descrbed by the frst two terms n the RHS of (6), monotoncally relaxes the partcle dstrbuton functons towards ther respectve equlbra. The redstrbuted

6 functons are then adusted by the body force term, after whch the streamng process propagates them to ther nearest neghbor nodes. Spatal dscretzaton n the LBM s typcally based on a perodc array of polyhedra, but ths s not mandatory [32]. A choce of lattces s avalable n two and three dmensons wth an ncreasng number of veloctes and therefore symmetry. However, the benefts of ncreased symmetry can be offset by the assocated computatonal cost, especally n 3D. In the present work the D3Q15 lattce s employed, whose velocty vectors are ncluded n (7) c c (7) The macroscopc flud varables, densty, f, and momentum flux, f c, are calculated at each lattce node as velocty moments of the partcle dstrbuton functons. The defntons of the flud pressure and vscosty are by-products of the Chapman-Enskog expanson (see [34] for detals), whch shows how the Naver-Stokes equatons are recovered n the near-ncompressble lmt wth sotropy, Gallean nvarance and a velocty ndependent pressure. An sothermal equaton of state, p c 2 s, n whch c s c / 3 s the lattce speed of sound, s used to calculate the pressure drectly from the densty, whle the knematc vscosty, x, (8) 3 2 t s evaluated from the relaxaton and dscretzaton parameters. The requrement of postve vscosty n (8) mandates that 1 2 and to ensure near-ncompressblty of the flow the computatonal Mach number s lmted, Ma u cs 1. The most straghtforward approach to handlng wall boundary condtons s to employ the bounce-back technque. Although t has been shown to be generally frstorder n accuracy [35], as opposed to the second order accuracy of the lattce Boltzmann equaton at nternal flud nodes [30], ts operatons are local and the orentaton of the boundary wth respect to the grd s rrelevant. A number of alternatve wall boundary technques [36, 37] that offer generalzed second-order convergence are avalable n the LBM, however these are at the expense of the localty and smplcty of the bounce-back condton. In the present work, the mmersed movng boundary (IMB) method of Noble and Torczynsk [38] s employed to handle the hydrodynamc couplng of the flud and structure. In ths method the LBE s modfed to nclude an addtonal collson term whch s dependent on the proporton of the nodal cell that s covered by sold, thus mprovng the boundary representaton and smoothng the hydrodynamc forces calculated at an obstacle's boundary nodes as t moves relatve to the grd. Consequently, t overcomes the momentum dscontnuty of bounce-back and lnk-bounceback-based [39] technques and provdes adequate representaton of non-conformng boundares at lower grd resolutons. It also retans two crtcal advantages of the LBM, namely the localty of the collson operator and the smple lnear streamng operator, and thus facltate solutons nvolvng large numbers of rregular-shaped, movng boundares. Further detals of the IMB method and the couplng of the LBM to the DEM, ncludng an assessment of mxed boundary condtons n varous flow geometres, can be found n Owen et al. [40] Mult-Core Implementaton of the LBM Two characterstc aspects of the LBM often result n t beng descrbed as a naturally parallel numercal method. The frst feature s the regular, orthogonal dscretzaton of space, whch s typcal of Euleran schemes, and can smplfy doman decomposton. The second feature s the use of only local data to perform nodal operatons, whch consequently results n partcle dstrbuton functons at a node beng updated usng only the prevous values. However, t should be noted that ncluson of addtonal features such as flux boundary condtons and non- Newtonan rheology, f not mplemented carefully, can negate the localty of operatons. The obvous choce for decomposton of the LBM doman s to use cubc nodal bundles, as shown schematcally n Fgure 2. The bundles are analogous to the partcle cells that were used n SPH, and smlarly H- Dspatch s used to dstrbute bundle keys to processors. Data storage s handled usng a Dctonary of bundles, whch are n turn Dctonares of nodes. Ths technque s used, as opposed to a master Dctonary of all nodes, to overcome problems that can occur wth Collecton lmts (approxmately 90 mllon on the 64-bt server used here). By defnton, the LBM nodal bundles can be used to perform cache blockng ust as the partcle cells were n SPH. Wth the correct bundle sze, the assocated computatonal task can be stored sequentally n processor cache and the latency assocated wth RAM access can be mnmzed. Smlar technques for the LBM have been reported n [41, 42] and extended to perform decomposton of space-tme [6] (as opposed to ust space) n a way that s ndependent of cache sze (a recursve algorthm s used to determne the optmal block sze). To ensure thread safety, two copes of the LBM partcle dstrbuton functons at each node are stored. Nodal processng s undertaken usng the current values, whch are

7 then overwrtten and propagated to the future data sets of approprate neghbor nodes. Technques such as SHIFT [43] have been presented whch employ specalzed data structures that remove the need for storng two copes of the partcle dstrbuton functons, however ths s at the expense of the flexblty of the code. Note that the collde-push sequence mplemented here can easly be reordered to a pull-collde sequence, wth each havng ther own subtle convenences and performance benefts dependng on the data structure and hardware employed [44, 42]. mcrotomographc mages of ol reservor rock (see Secton 5). Approxmately 1.4 mllon partcles were used n the smulaton and the executon duraton was defned as the tme n seconds taken to complete a tme step, averaged over 100 steps. For the double-search algorthm a speed-up of approxmately 22 was acheved wth 24 cores, whch corresponds to an effcency of approxmately 92%. Ths, n conuncton wth the fact that the processor scalng response s near lnear, s an excellent result. Fgure 3. Mult-core, parallel performance of the SPH solver contrastng the counterntutve scalablty of the sngle-search and double-search algorthms. Fgure 2. Schematc representaton of the decomposton of the LBM doman nto nodal bundles. Data storage s handled usng a Dctonary of bundles, whch are n turn Dctonares of nodes. 4. PARALLEL PERFORMANCE OF METHODS The parallel performance of SPH and the LBM n the mult-core numercal framework was tested on a 24-core Dell Server PER900 wth Intel Xeon CPU, 2.40 GHz, runnng 64-bt Wndows Server Enterprse Here, two metrcs are used to defne the scalablty of the smulaton framework, namely SpeedUp t / t and 1proc nproc Effcency SpeedUp / N. Obvously, dealzed maxmum performance corresponds to a speed-up rato equal to the number of cores at whch pont effcency would be 100%. Fgure 3 graphs the ncreasng speed-up of the SPH solver wth ncreasng cores. The test problem smulated flow through a porous geometry determned from The results n Fgure 3 also provde an nterestng nsght nto the comparatve benefts of mnmzng computaton or mnmzng memory commt when usng mult-core hardware. The sngle-search result s attaned wth a verson of the solver that performs the spatal reasonng once per tme step and stores the results n memory for use twce per tme step. Conversely, the doublesearch results are acheved when the code s modfed to perform the spatal search twce per tme step, as needed. Intutvely, the sngle-search approach requres a greater memory commt but the double-search approach requres more computaton. However, t s counterntutve to see that double-search sgnfcantly outperforms sngle-search, especally as the number of processors ncreases. Ths can be attrbuted to better cache blockng of the second approach and the smaller amount of data experencng latency when loaded from RAM to cache. The fact that such performance gans only manfest when more than 10 cores are used, suggests that for less than 10 cores, RAM ppelne bandwdth s suffcent to handle a global nteracton lst. As n the SPH testng, the LBM solver was assessed n terms of speed-up and effcency. Fgure 4 graphs speed-up aganst the number of cores for a 3D duct flow problem on

8 a doman. Perodc boundares were employed on the n-flow and out-flow surfaces and the bounce-back wall boundary condton was used on the remanng surfaces. A constant body force was used to drve the flow. The mult-core framework, usng the SPH solver, was appled to numercally determne the porosty-permeablty relatonshp of a sample of Dolomte. The structural model geometry was generated from X-ray mcrotomographc mages of the sample, whch were taken from a 4.95mm dameter cylndrcal core sample, 5.43mm n length and wth an mage resoluton of 2.8m. Ths produced a voxelated mage set that s n sze. Current hardware lmtatons prevent the full sample from beng analyzed n one numercal experment, therefore sub-blocks (see the nsets n Fgure 5) of voxel dmensons were taken from the full mage set to carry out flow testng and map the porosty-permeablty relatonshp. Fgure 4. Mult-core, parallel performance of the LBM solver for varyng bundle and doman szes. The sde length, n nodes, of the bundles was vared between 10 and 50 and the dfference n performance for each sze can clearly be seen. Optmal performance s acheved at a sde length of 20, where the speed-up and effcency are approxmately 22 and 92%, respectvely, on 24 cores. Ths bundle sze represents the best cache blockng scenaro for the tested hardware. As the bundle sze s ncreased, performance degrades due to the computatonal tasks becomng too large for storage n cache whch results n slow communcaton wth RAM for data. The bundle sde length of 20 was then transferred to dentcal problems usng a and doman, and the results of these tests are also ncluded n Fgure 4. As n the smaller problem, near lnear scalablty s acheved and at 24 cores the speed ups are approxmately 23 and 22, and the effcences are approxmately 95% and 92% for and 400 3, respectvely. Ths s an mportant result, as t suggests that the optmum bundle sze for 3D LBM problems can be determned n an a pror fashon for a specfc archtecture. 5. APPLIED PERMEABILITY EXPERIMENT The permeablty of reservor rocks to sngle and multple flud phases s of mportance to many enhanced ol recovery procedures. Tradtonally, these data are determned expermentally from cored samples of rock. To be able to perform these experments numercally would present sgnfcant cost and tme savngs, and therefore t s the focus of the present study. Fgure 5. Results of the Dolomte permeablty tests undertaken usng SPH and the mult-core framework. Expermental data [45] are ncluded for comparson. The assemblage of SPH partcles was ntalzed wth a densty of approxmately one partcle per voxel. Flud partcles (.e. those located n the pore space) were assgned the propertes of water ( 0 = 10 3 kgm -3, = 10-6 m 2 s -1 ) and boundary partcles located further than 6h from a boundary surface were deleted for effcency. The four doman surfaces parallel to the drecton of flow were desgnated noflow boundares and the n-flow and out-flow surfaces were specfed as perodc. Due to the ncompatblty of the two rock surfaces at these boundares, perodcty could not be appled drectly. Instead, the expermental arrangement was replcated by addng a narrow volume of flud at the top and bottom of the doman. Fnally, all smulatons were drven from rest by a constant body force equvalent to an appled pressure dfferental across the sample. The results of the permeablty tests are graphed n Fgure 5 and expermental data [45] relevant to the gran sze of the sample are ncluded for comparson. It can be

9 seen that the numercal results le wthn the expermental band, suggestng that the presented numercal procedure s approprate. As expected, each sub-block exhbts a dfferent porosty and by analyzng a range of sub-blocks a porostypermeablty curve for the rock sample can be defned. 6. CONCLUDING REMARKS In ths paper a parallel, mult-core framework has been appled to the SPH and LBM numercal methods for the soluton of flud-structure nteracton problems n enhanced ol recovery. Important aspects of ther mplementaton, ncludng spatal decomposton, data structurng and the management of thread safety have been brefly dscussed. Near lnear speed-up over 24 cores was found n testng and peak effcences of 92% n SPH and 95% n LBM were attaned at 24 cores. The mportance of optmal cache blockng was demonstrated, n partcular n the LBM results, by varyng the dstrbuted computatonal task sze va the sze of the nodal bundles. Ths mnmzed the cache msses durng executon and the latency assocated wth accessng RAM. In addton, t was found that the optmal nodal bundle sze n the 3D LBM could be transferred to larger problem domans and acheve smlar performance, suggestng an a pror technque for determnng the best computatonal task sze for parallel dstrbuton. Fnally, the mult-core framework wth the SPH solver was appled n a numercal experment to determne the porosty-permeablty relatonshp of a sample of Dolomte (.e. a canddate reservor rock). Due to hardware lmtatons, a number of sub-blocks of the complete mcrotomographc mage of the rock sample were tested. Each sub-block was found to have a unque porosty and correspondng permeablty, and when these were supermposed on relevant expermental data the correlaton was excellent. Ths result provdes strong support for the numercal expermentaton technque presented. Future work wll extend testng of the mult-core framework to 64-core and 256-core server archtectures. However, the next maor numercal development les n the extenson of the flud capabltes n SPH and the LBM to multple flud phases. Ths wll allow the predcton of the relatve permeablty of rock samples whch s essental to dranage and mbbton processes n enhanced ol recovery. Acknowledgements The authors are grateful to the Schlumberger-Doll Research Center for ther support of ths research. References [1] J. H. Walther and I. F. Sbalzarn. Large-scale parallel dscrete element smulatons of granular flow. Engneerng Computatons, 26(6): , [2] A. Ferrar, M. Dumbser, E. F. Toro, and A. Armann. A new 3D parallel SPH scheme for free surface flows. Computers & Fluds, 38(6): , [3] D. Vdal, R. Roy, and F. Bertrand. A parallel workload balanced and memory effcent lattce-boltzmann algorthm wth sngle unt BGK relaxaton tme for lamnar Newtonan flows. Computers & Fluds, 39(8): , [4] J. Götz, K. Iglberger, C. Fechtnger, S. Donath, and U. Rüde. Couplng multbody dynamcs and computatonal flud dynamcs on 8192 processor cores. Parallel Computng, 36(2-3): , [5] M. Bernasch, L. Ross, R. Benz, M. Sbragagla, and S. Succ. Graphcs processng unt mplementaton of lattce Boltzmann models for flowng soft systems. Physcal Revew E, 80(6):066707, [6] T. Zeser, G. Wellen, A. Ntsure, K. Iglberger, U. Rude, and G. Hager. Introducng a parallel cache oblvous blockng approach for the lattce Boltzmann method. Progress n Computatonal Flud Dynamcs, 8(1-4): , [7] M. Herlhy and N. Shavt. The art of multprocessor programmng. Morgan Kaufman, [8] L. Valant. A brdgng model for mult-core computng. Lecture Notes n Computer Scence, 5193:13-28, [9] K. Poulsen. Software bug contrbuted to blackout. Securty Focus, [10] J. Dongarra, D. Gannon, G. Fox, and K. Kennedy. The mpact of multcore on computatonal scence software. CTWatch Quarterly, 3(1), [11] C. E. Leserson and I. B. Mrman. How to Survve the Multcore Software Revoluton (or at Least Survve the Hype). Clk Arts, Cambrdge, [12] N. Snger. More chp cores can mean slower supercomputng, Sanda smulaton shows. Sanda Natonal Laboratores News Release, Avalable: ore.html [13] W. Gropp, E. Lusk, and A. Skellum. Usng MPI: Portable Parallel Programmng Wth the Message-Passng Interface. MIT Press, Cambrdge, [14] M. Curts-Maury, X. Dng, C. D. Antonopoulos, and D. S. Nkolopoulos. An evaluaton of OpenMP on current and emergng multthreaded/multcore processors. In M. S. Mueller, B. M. Chapman, B. R. de Supnsk, A. D. Malony, and M. Voss, edtors, OpenMP Shared Memory Parallel Programmng, Lecture Notes n Computer Scence, 4315/2008: , Sprnger, Berln/Hedelberg, [15] G. Chrysanthakopoulos and S. Sngh. An asynchronous messagng lbrary for C#. In Proceedngs of the Workshop on Synchronzaton and Concurrency n Obect-Orented Languages, 89-97, San Dego, [16] X. Qu, G. Fox, G. Chrysanthakopoulos, and H. F. Nelsen. Hgh performance mult-paradgm messagng runtme on multcore systems. Techncal report, Indana

10 Unversty, Avalable: ptlupages/publcatons/ccraprl16open.pdf. [17] D. B. Stewart, R. A. Volpe, and P. K. Khosla. Desgn of dynamcally reconfgurable real-tme software usng portbased obects. IEEE Transactons on Software Engneerng, 23: , [18] D. W. Holmes, J. R. Wllams, and P. G. Tlke. An events based algorthm for dstrbutng concurrent tasks on mult-core archtectures. Computer Physcs Communcatons, 181(2): , [19] L. B. Lucy. A numercal approach to the testng of the fsson hypothess. Astronomcal Journal, 82: , [20] R. A. Gngold and J. J. Monaghan. Smoothed partcle hydrodynamcs: Theory and applcaton to non-sphercal stars. Monthly Notces of the Royal Astronomcal Socety, 181: , [21] G. R. Lu and M. B. Lu. Smoothed Partcle Hydrodynamcs: a meshfree partcle method. World Scentfc, Sngapore, [22] M. Lu, P. Meakn, and H. Huang. Dsspatve partcle dynamcs smulatons of multphase flud flow n mcrochannels and mcrochannel networks. Physcs of Fluds, 19:033302, [23] A. M. Tartakovsky and P. Meakn. A smoothed partcle hydrodynamcs model for mscble flow n threedmensonal fractures and the two-dmensonal Raylegh- Taylor nstablty. Journal of Computatonal Physcs, 207: , [24] A. M. Tartakovsky and P. Meakn. Pore scale modelng of mmscble and mscble flud flows usng smoothed partcle hydrodynamcs. Advances n Water Resources, 29: , [25] X. Y. Hu and N. A. Adams. A mult-phase SPH method for macroscopc and mesoscopc flows. Journal of Computatonal Physcs, 213: , [26] J. P. Morrs, P. J. Fox, and Y. Zhu. Modelng low Reynolds number ncompressble flows usng SPH. Journal of Computatonal Physcs, 136: , [27] D. W. Holmes, J. R. Wllams, and P. G. Tlke. Smooth partcle hydrodynamcs smulatons of low Reynolds number flows through porous meda. Internatonal Journal for Numercal and Analytcal Methods n Geomechancs, n/a. do: /nag.898, [28] J. Lberty and D. Xe. Programmng C# 3.0: 5 th Edton. O'Relly Meda, Sebastopol, [29] A. R. Adl-Tabataba, C. Kozyraks, and B. Saha. Unlockng concurrency. Queue, 4(10):24-33, [30] S. Chen and G. D. Doolen. Lattce Boltzmann method for flud flows. Annual Revew of Flud Mechancs, 30: , [31] H. Chen, S. Chen, and W. H. Matthaeus. Recovery of the Naver-Stokes equatons usng a lattce-gas Boltzmann method. Physcal Revew A, 45(8):R5339-R5342, [32] X. He and G. D. Doolen. Lattce Boltzmann method on a curvlnear coordnate system: Vortex sheddng behnd a crcular cylnder. Physcal Revew E, 56(1):434440, [33] U. Frsch, B. Hasslacher, and Y. Pomeau. Lattce-gas automata for the Naver-Stokes equaton. Physcal Revew Letters, 56(14): , [34] S. Hou, Q. Zou, S. Chen, G. Doolen, and A. C. Cogley. Smulaton of cavty flow by the lattce Boltzmann method. Journal of Computatonal Physcs, 118(2): , [35] R. Cornubert, D. d'humères, and D. Levermore. A Knudsen layer theory for lattce gases. Physca D, 47(1-2): , [36] T. Inamuro, M. Yoshno, and F. Ogno. A non-slp boundary condton for lattce Boltzmann smulatons. Physcs of Fluds, 7(12): , [37] D. R. Noble, S. Chen, J. G. Georgads, and R. O. Buckus. A consstent hydrodynamc boundary condton for the lattce Boltzmann method. Physcs of Fluds, 7(1): , [38] D. R. Noble and J. R. Torczynsk. A lattce-boltzmann method for partally saturated computatonal cells. Internatonal Journal of Modern Physcs C, 9(8): , [39] A. J. C. Ladd. Numercal smulatons of partculate suspensons va a dscretzed Boltzmann equaton. Part 1. Theoretcal foundaton. Journal of Flud Mechancs, 271: , [40] D. R. J. Owen, C. R. Leonard, and Y. T. Feng. An effcent framework for flud-structure nteracton usng the lattce Boltzmann method and mmersed movng boundares. Internatonal Journal for Numercal Methods n Engneerng, n/a. do: /nme.2985, [41] T. Pohl, F. Deserno, N. Thurey, U. Rude, P. Lammers, G. Wellen, and T. Zeser. Performance evaluaton of parallel large-scale Lattce Boltzmann applcatons on three supercomputng archtectures. In SC '04: Proceedngs of the 2004 ACM/IEEE conference on Supercomputng, 21-33, Washngton, [42] G. Wellen, T. Zeser, G. Hager, and S. Donath. On the sngle processor performance of smple lattce Boltzmann kernels. Computers & Fluds, 35(8-9): , [43] J. Ma, K. Wu, Z. Jang, and G. D. Couples. SHIFT: An mplementaton for lattce Boltzmann smulaton n lowporosty porous meda. Physcal Revew E, 81(5):056702, [44] T. Pohl, M. Kowarschk, J. Wlke, K. Iglberger, and U. Rüde. Optmzaton and proflng of the cache performance of parallel Lattce Boltzmann codes. Parallel Processng Letters, 13(4): , [45] R. M. Sneder and J. S. Sneder. New ol n old places. Search and Dscovery, 10007, Avalable: der/

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION?

S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION? S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION? Célne GALLET ENSICA 1 place Emle Bloun 31056 TOULOUSE CEDEX e-mal :cgallet@ensca.fr Jean Luc LACOME DYNALIS Immeuble AEROPOLE - Bat 1 5, Avenue Albert

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units Invted Artcle Computer Scence & Technology March 2012 Vol.57 No.7: 707 715 do: 10.1007/s11434-011-4908-y SPECIAL TOPICS: Effcent parallel mplementaton of the lattce Boltzmann method on large clusters of

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Dynamic wetting property investigation of AFM tips in micro/nanoscale

Dynamic wetting property investigation of AFM tips in micro/nanoscale Dynamc wettng property nvestgaton of AFM tps n mcro/nanoscale The wettng propertes of AFM probe tps are of concern n AFM tp related force measurement, fabrcaton, and manpulaton technques, such as dp-pen

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Simulation of a Ship with Partially Filled Tanks Rolling in Waves by Applying Moving Particle Semi-Implicit Method

Simulation of a Ship with Partially Filled Tanks Rolling in Waves by Applying Moving Particle Semi-Implicit Method Smulaton of a Shp wth Partally Flled Tanks Rollng n Waves by Applyng Movng Partcle Sem-Implct Method Jen-Shang Kouh Department of Engneerng Scence and Ocean Engneerng, Natonal Tawan Unversty, Tape, Tawan,

More information

AADL : about scheduling analysis

AADL : about scheduling analysis AADL : about schedulng analyss Schedulng analyss, what s t? Embedded real-tme crtcal systems have temporal constrants to meet (e.g. deadlne). Many systems are bult wth operatng systems provdng multtaskng

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Interaction Methods for the SPH Parts (Multiphase Flows, Solid Bodies) in LS-DYNA

Interaction Methods for the SPH Parts (Multiphase Flows, Solid Bodies) in LS-DYNA 13 th Internatonal LS-DYNA Users Conference Sesson: Flud Structure Interacton Interacton Methods for the SPH Parts (Multphase Flows, Sold Bodes) n LS-DYNA Jngxao Xu, Jason Wang Lvermore Software Technology

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain AMath 483/583 Lecture 21 May 13, 2011 Today: OpenMP and MPI versons of Jacob teraton Gauss-Sedel and SOR teratve methods Next week: More MPI Debuggng and totalvew GPU computng Read: Class notes and references

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Analysis on the Workspace of Six-degrees-of-freedom Industrial Robot Based on AutoCAD

Analysis on the Workspace of Six-degrees-of-freedom Industrial Robot Based on AutoCAD Analyss on the Workspace of Sx-degrees-of-freedom Industral Robot Based on AutoCAD Jn-quan L 1, Ru Zhang 1,a, Fang Cu 1, Q Guan 1 and Yang Zhang 1 1 School of Automaton, Bejng Unversty of Posts and Telecommuncatons,

More information

Wavefront Reconstructor

Wavefront Reconstructor A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

AVO Modeling of Monochromatic Spherical Waves: Comparison to Band-Limited Waves

AVO Modeling of Monochromatic Spherical Waves: Comparison to Band-Limited Waves AVO Modelng of Monochromatc Sphercal Waves: Comparson to Band-Lmted Waves Charles Ursenbach* Unversty of Calgary, Calgary, AB, Canada ursenbach@crewes.org and Arnm Haase Unversty of Calgary, Calgary, AB,

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

The Performance of Geothermal Field Modeling in Distributed Component Environment

The Performance of Geothermal Field Modeling in Distributed Component Environment Porows A., Peta A., Kowal A., Dane T.: The Performance of Geothermal Feld Modelng n Dstrbuted Component Envronment. In: Innovatons n Computng Scences and Software Engneerng, Sprnger,, pp 79-83. 79 Ths

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids)

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids) Structured meshes Very smple computatonal domans can be dscretzed usng boundary-ftted structured meshes (also called grds) The grd lnes of a Cartesan mesh are parallel to one another Structured meshes

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Finite Element Analysis of Rubber Sealing Ring Resilience Behavior Qu Jia 1,a, Chen Geng 1,b and Yang Yuwei 2,c

Finite Element Analysis of Rubber Sealing Ring Resilience Behavior Qu Jia 1,a, Chen Geng 1,b and Yang Yuwei 2,c Advanced Materals Research Onlne: 03-06-3 ISSN: 66-8985, Vol. 705, pp 40-44 do:0.408/www.scentfc.net/amr.705.40 03 Trans Tech Publcatons, Swtzerland Fnte Element Analyss of Rubber Sealng Rng Reslence Behavor

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach Modelng, Manpulatng, and Vsualzng Contnuous Volumetrc Data: A Novel Splne-based Approach Jng Hua Center for Vsual Computng, Department of Computer Scence SUNY at Stony Brook Talk Outlne Introducton and

More information

Optimizing for Speed. What is the potential gain? What can go Wrong? A Simple Example. Erik Hagersten Uppsala University, Sweden

Optimizing for Speed. What is the potential gain? What can go Wrong? A Simple Example. Erik Hagersten Uppsala University, Sweden Optmzng for Speed Er Hagersten Uppsala Unversty, Sweden eh@t.uu.se What s the potental gan? Latency dfference L$ and mem: ~5x Bandwdth dfference L$ and mem: ~x Repeated TLB msses adds a factor ~-3x Execute

More information

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Overvew 2 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Introducton Mult- Smulator MASIM Theoretcal Work and Smulaton Results Concluson Jay Wagenpfel, Adran Trachte Motvaton and Tasks Basc Setup

More information

Fitting: Deformable contours April 26 th, 2018

Fitting: Deformable contours April 26 th, 2018 4/6/08 Fttng: Deformable contours Aprl 6 th, 08 Yong Jae Lee UC Davs Recap so far: Groupng and Fttng Goal: move from array of pxel values (or flter outputs) to a collecton of regons, objects, and shapes.

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Topology Design using LS-TaSC Version 2 and LS-DYNA

Topology Design using LS-TaSC Version 2 and LS-DYNA Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Research Article Performance Optimization of 3D Lattice Boltzmann Flow Solver on a GPU

Research Article Performance Optimization of 3D Lattice Boltzmann Flow Solver on a GPU Hndaw Scentfc Programmng Volume 2017, Artcle ID 1205892, 16 pages https://do.org/10.1155/2017/1205892 Research Artcle Performance Optmzaton of 3D Lattce Boltzmann Flow Solver on a GPU Nhat-Phuong Tran,

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Automatic selection of reference velocities for recursive depth migration

Automatic selection of reference velocities for recursive depth migration Automatc selecton of mgraton veloctes Automatc selecton of reference veloctes for recursve depth mgraton Hugh D. Geger and Gary F. Margrave ABSTRACT Wave equaton depth mgraton methods such as phase-shft

More information

Chapter 1. Comparison of an O(N ) and an O(N log N ) N -body solver. Abstract

Chapter 1. Comparison of an O(N ) and an O(N log N ) N -body solver. Abstract Chapter 1 Comparson of an O(N ) and an O(N log N ) N -body solver Gavn J. Prngle Abstract In ths paper we compare the performance characterstcs of two 3-dmensonal herarchcal N-body solvers an O(N) and

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

SPH and ALE formulations for sloshing tank analysis

SPH and ALE formulations for sloshing tank analysis Int. Jnl. of Multphyscs Volume 9 Number 3 2015 209 SPH and ALE formulatons for sloshng tank analyss Jngxao Xu 1, Jason Wang 1 and Mhamed Soul*, 2 1 LSTC, Lvermore Software Technology Corp. Lvermore CA

More information

LS-TaSC Version 2.1. Willem Roux Livermore Software Technology Corporation, Livermore, CA, USA. Abstract

LS-TaSC Version 2.1. Willem Roux Livermore Software Technology Corporation, Livermore, CA, USA. Abstract 12 th Internatonal LS-DYNA Users Conference Optmzaton(1) LS-TaSC Verson 2.1 Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2.1,

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Kinematics of pantograph masts

Kinematics of pantograph masts Abstract Spacecraft Mechansms Group, ISRO Satellte Centre, Arport Road, Bangalore 560 07, Emal:bpn@sac.ernet.n Flght Dynamcs Dvson, ISRO Satellte Centre, Arport Road, Bangalore 560 07 Emal:pandyan@sac.ernet.n

More information

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA RFr"W/FZD JAN 2 4 1995 OST control # 1385 John J Q U ~ M Argonne Natonal Laboratory Argonne, L 60439 Tel: 708-252-5357, Fax: 708-252-3 611 APPLCATON OF A COMPUTATONALLY EFFCENT GEOSTATSTCAL APPROACH TO

More information

Assembler. Building a Modern Computer From First Principles.

Assembler. Building a Modern Computer From First Principles. Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Positive Semi-definite Programming Localization in Wireless Sensor Networks

Positive Semi-definite Programming Localization in Wireless Sensor Networks Postve Sem-defnte Programmng Localzaton n Wreless Sensor etworks Shengdong Xe 1,, Jn Wang, Aqun Hu 1, Yunl Gu, Jang Xu, 1 School of Informaton Scence and Engneerng, Southeast Unversty, 10096, anjng Computer

More information

Reproducing Works of Calder

Reproducing Works of Calder Reproducng Works of Calder Dongkyoo Lee*, Hee-Jung Bae*, Chang Tae Km*, Dong-Chun Lee*, Dae-Hyun Jung*, Nam-Kyung Lee*, Kyoo-Ho Lee*, Nakhoon Baek**, J. Won Lee***, Kwan Woo Ryu* and James K. Hahn*** *

More information

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation Precondtonng Parallel Sparse Iteratve Solvers for Crcut Smulaton A. Basermann, U. Jaekel, and K. Hachya 1 Introducton One mportant mathematcal problem n smulaton of large electrcal crcuts s the soluton

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

Application of Clustering Algorithm in Big Data Sample Set Optimization

Application of Clustering Algorithm in Big Data Sample Set Optimization Applcaton of Clusterng Algorthm n Bg Data Sample Set Optmzaton Yutang Lu 1, Qn Zhang 2 1 Department of Basc Subjects, Henan Insttute of Technology, Xnxang 453002, Chna 2 School of Mathematcs and Informaton

More information

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley

More information

Resource and Virtual Function Status Monitoring in Network Function Virtualization Environment

Resource and Virtual Function Status Monitoring in Network Function Virtualization Environment Journal of Physcs: Conference Seres PAPER OPEN ACCESS Resource and Vrtual Functon Status Montorng n Network Functon Vrtualzaton Envronment To cte ths artcle: MS Ha et al 2018 J. Phys.: Conf. Ser. 1087

More information

GPU Accelerated Blood Flow Computation using the Lattice Boltzmann Method

GPU Accelerated Blood Flow Computation using the Lattice Boltzmann Method GPU Accelerated Blood Flow Computaton usng the Lattce Boltmann Method Cosmn Nţă, Lucan Mha Itu, Constantn Sucu Department of Automaton Translvana Unversty of Braşov Braşov, Romana Constantn Sucu Corporate

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES. University, Beijing , China

COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES. University, Beijing , China COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES Bn Zhao 1, 2, He Xao 1, Yue Wang 1, Yuebao Wang 1 1 Department of Buldng Scence and Technology, Tsnghua Unversty, Bejng

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introducton 1.1 Parallel Processng There s a contnual demand for greater computatonal speed from a computer system than s currently possble (.e. sequental systems). Areas need great computatonal

More information

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility IJCSNS Internatonal Journal of Computer Scence and Network Securty, VOL.15 No.10, October 2015 1 Evaluaton of an Enhanced Scheme for Hgh-level Nested Network Moblty Mohammed Babker Al Mohammed, Asha Hassan.

More information

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan

More information

Real-time Fault-tolerant Scheduling Algorithm for Distributed Computing Systems

Real-time Fault-tolerant Scheduling Algorithm for Distributed Computing Systems Real-tme Fault-tolerant Schedulng Algorthm for Dstrbuted Computng Systems Yun Lng, Y Ouyang College of Computer Scence and Informaton Engneerng Zheang Gongshang Unversty Postal code: 310018 P.R.CHINA {ylng,

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

Loop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation

Loop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation Loop Transformatons for Parallelsm & Localty Last week Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Scalar expanson for removng false dependences Loop nterchange Loop

More information