Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units


Invited Article, Computer Science & Technology, SPECIAL TOPICS. March 2012, Vol.57 No.7

XIONG QinGang 1,2, LI Bo 1,2, XU Ji 1,2, FANG XiaoJian 1,2, WANG XiaoWei 1*, WANG LiMin 1*, HE XianFeng 1 & GE Wei 1

1 State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China
2 Graduate University of Chinese Academy of Sciences, Beijing, China
*Corresponding authors (email: xwwang@home.ipe.ac.cn; lmwang@home.ipe.ac.cn)

Received May 23, 2011; accepted October 19, 2011

Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulation on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested for two-dimensional Couette flow and the results are in good agreement with the analytical solution. For both the one- and two-dimensional decomposition of space, the algorithm performs well as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to the three-dimensional decomposition of space and to other modeling methods, including explicit grid-based methods.

Keywords: asynchronous execution, compute unified device architecture, graphic processing unit, lattice Boltzmann method, non-blocking message passing interface, OpenMP

Citation: Xiong Q G, Li B, Xu J, et al. Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units. Chin Sci Bull, 2012, 57

The Author(s). This article is published with open access at Springerlink.com (csb.scichina.com)

High-performance computing (HPC) on general-purpose graphical processing units (GPGPUs) has emerged as a competitive approach to demanding computations such as those of computational fluid dynamics (CFD) [1,2] and discrete particle simulations [3-5]. This is due, on the one hand, to the computational capacity of graphical processing units (GPUs), which is almost one order of magnitude higher than that of mainstream central processing units (CPUs) in terms of both peak performance and memory bandwidth, and, on the other hand, to the introduction of effective and convenient programming interfaces such as Compute Unified Device Architecture (CUDA).

The lattice Boltzmann method (LBM) [6] is a numerical method suitable for GPGPUs owing to its explicit numerical scheme, localized communication mode and the inherent additivity of its numerical operations. Hence, it is a powerful alternative to CFD methods such as finite difference and finite volume methods. Implementations of the LBM on a single GPU have been reported [7-10] with speedup ratios ranging from tens to above 100 relative to a single CPU core. In the case of multi-GPU implementations, Li et al. [11] performed LBM simulation of lid-driven cavity flow on an HPC system comprising both Nvidia and AMD GPUs, using CUDA and Brook+, respectively, and combining them via the Message Passing Interface (MPI). Myre et al. [12] implemented single-phase, multi-phase and multi-component LBMs on GPU clusters using Open Multi-Processing (OpenMP). In these implementations, data communication between GPUs is trivial or the GPUs are installed at the same node, so the real performance was almost unaffected by communication. However, this is not typical in engineering practice. Data in GPU memory cannot be accessed by the network directly; they have to be copied from the GPU to the CPU before sending and from the CPU to the GPU after receiving, through a PCIe bus with a bandwidth of about 10 GB/s currently (Gen 2), which is much lower than that of the GPU global memory. Therefore, communication between the CPU and GPU can be a bottleneck in some applications.

In this article, we integrate asynchronous computing and communication via the CUDA v3.1 framework [13], shared-memory parallelization using OpenMP and inter-node parallelization using non-blocking MPI to improve the performance of multi-GPU LBM simulations. Performance for both one- and two-dimensional decompositions is analyzed, and the implementation is found to hide most of the communication time. The consistency of our implementation on HPC systems with multiple GPUs at one node is emphasized.

1 The lattice Boltzmann method

The lattice BGK model [14] is one of the most frequently used schemes for the LBM. Depending on the dimensionality (D) and the number of discrete lattice velocities (Q), there are different variations, such as D2Q9, D3Q13, and D3Q19. The formulation of the lattice BGK model is

$$ f_i(\mathbf{x}+\mathbf{e}_i,\, t+1) = f_i(\mathbf{x},t) + \frac{1}{\tau}\left( f_i^{eq}(\mathbf{x},t) - f_i(\mathbf{x},t) \right), \tag{1} $$

where $f_i(\mathbf{x},t)$ is the density function of the $i$th direction at position $\mathbf{x}$ and time $t$, $\mathbf{e}_i$ is the $i$th discrete lattice velocity, and $\tau$ is the relaxation time related to the fluid molecular dynamic viscosity $\mu$. The term $f_i^{eq}(\mathbf{x},t)$ is approximated to second order as

$$ f_i^{eq}(\mathbf{x},t) = w_i \rho \left( 1 + 3\frac{\mathbf{e}_i\cdot\mathbf{u}}{c^2} + \frac{9}{2}\frac{(\mathbf{e}_i\cdot\mathbf{u})^2}{c^4} - \frac{3}{2}\frac{\mathbf{u}\cdot\mathbf{u}}{c^2} \right), \tag{2} $$

where

$$ \rho = \sum_i f_i, \qquad \rho\mathbf{u} = \sum_i \mathbf{e}_i f_i. \tag{3} $$

The D2Q9 scheme is illustrated in Figure 1 and further details were given by Qian et al. [14].

Figure 1 D2Q9 model with $w_i = 4/9$ when $i = 0$, $w_i = 1/9$ when $i = 1, 2, 3$ and $4$, and $w_i = 1/36$ when $i = 5, 6, 7$ and $8$.

To reduce the compressibility effect in the original lattice BGK model, He et al. [15] proposed revisions to the DdQq schemes and named them iDdQq. The evolution rules are the same but the equilibrium density function is different:

$$ f_i^{eq}(\mathbf{x},t) = \lambda_i\, p + \rho_0 w_i \left( 3\frac{\mathbf{e}_i\cdot\mathbf{u}}{c^2} + \frac{9}{2}\frac{(\mathbf{e}_i\cdot\mathbf{u})^2}{c^4} - \frac{3}{2}\frac{\mathbf{u}\cdot\mathbf{u}}{c^2} \right), \tag{4} $$

where $\lambda_0 = 3(w_0 - 1)/c^2$, $\lambda_i = 3 w_i/c^2$ for $i \neq 0$, $\rho_0$ is the reference fluid density for the initial state, and pressure $p$ and velocity $\mathbf{u}$ are expressed as

$$ p = \frac{c^2}{3(1-w_0)} \left( \sum_{i\neq 0} f_i - 1.5\, w_0\, \rho_0 \frac{\mathbf{u}\cdot\mathbf{u}}{c^2} \right), \qquad \mathbf{u} = \frac{1}{\rho_0}\sum_i \mathbf{e}_i f_i. \tag{5} $$

The iDdQq schemes introduce no further computational cost, and for GPU implementation the zeroth direction can be omitted, which makes them faster than the corresponding DdQq schemes. However, for iDdQq schemes, hiding data communication is more important, since the communication-to-computation ratio is higher than for DdQq: the amount of data to be transferred among GPUs is the same while the computation per cell is smaller.

2 Multi-GPU implementation of the D2Q9 scheme

The implementation of the LBM for a single GPU has been discussed extensively in [7,16]. We emphasize one point here. As the GPU is suitable for data-independent, computation-intensive tasks, the memory access mode is critical to performance. For this reason, the storage of LBM grid data must be aligned and accessed in a coalesced manner to make full use of the memory bandwidth. As long as global memory access is optimized, the performance of different implementations on the same single GPU varies little. However, for multi-GPU implementation, GPU-CPU data transfer and CPU-CPU communication may require a large portion of the wall time, and they have to be optimized as well.

In CUDA 3.1, the launch of a GPU kernel is asynchronous: control returns to the host before the kernel completes its computation. This feature enables the host CPU to perform other jobs while waiting for the GPU kernel to finish, e.g., copying data between a GPU and CPU and carrying out inter-CPU communication and arithmetic operations. For LBM simulations, this implies that collision and propagation of the density functions can run in parallel with copying boundary grid information to a CPU and then transferring it to neighboring CPUs. As shown in Figure 2, this is realized using the stream function and portable pinned memory in CUDA 3.1, OpenMP and non-blocking communications provided by MPI.

Figure 2 Schematic map of the overlapping of GPU computation and data communications. Boundary cells and inner cells are marked with different symbols; together they make up the entire grid executed in stream[1].

The flowchart of the parallel implementation of the LBM on a GPU cluster is given in Figure 3. At the beginning of each iteration, the collision operation on boundary cells is launched asynchronously by the kernel Boundary_Collision in stream[0]. In this kernel, the boundary grids are only subject to collision and not to propagation, and post-collision boundary information is written to sending buffers in the GPU global memory. The collision and propagation on the entire grid are launched by the kernel Collision_Propagation in stream[1] as soon as the launch of Boundary_Collision returns. The host can return before these asynchronous kernels complete, but operations in the same stream are carried out in series. Therefore, we issue the GPU-to-CPU copy cudaMemcpyAsync in stream[0] to ensure that the copy starts only after Boundary_Collision has completed. Although the operations in stream[0] are serial, they can proceed while Collision_Propagation is executing. To use the asynchronous cudaMemcpyAsync, the buffers on the host must be allocated as pinned memory. After the GPU-CPU copy operation, the communications between CPUs are ready to be carried out.
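As a point of reference for what the two kernels compute, the following is a minimal plain-Python sketch of the standard D2Q9 lattice BGK step of eqs. (1)-(3). It is not the authors' CUDA code: the function names, the periodic test grid, the ordering of the lattice velocities and the choice c = 1 are assumptions made for illustration. In the scheme described above, a kernel like Boundary_Collision would apply only `collide` to boundary cells, while Collision_Propagation would apply the full `collide_and_propagate` step.

```python
import math

# D2Q9 lattice BGK step in plain Python (illustrative sketch, lattice speed c = 1).
# Assumed ordering: i = 0 rest, i = 1..4 axis directions, i = 5..8 diagonals,
# which matches the weights listed in the Figure 1 caption.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium of eq. (2) for one cell."""
    uu = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(w * rho * (1.0 + 3.0 * eu + 4.5 * eu * eu - 1.5 * uu))
    return feq


def moments(f):
    """Density and velocity of one cell from eq. (3)."""
    rho = sum(f)
    ux = sum(fi * ex for fi, (ex, _) in zip(f, E)) / rho
    uy = sum(fi * ey for fi, (_, ey) in zip(f, E)) / rho
    return rho, ux, uy


def collide(f, tau):
    """BGK relaxation towards equilibrium, the right-hand side of eq. (1)."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi + (fe - fi) / tau for fi, fe in zip(f, feq)]


def collide_and_propagate(grid, tau):
    """One full step on a periodic nx-by-ny grid; grid[x][y] holds 9 values."""
    nx, ny = len(grid), len(grid[0])
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            post = collide(grid[x][y], tau)
            for i, (ex, ey) in enumerate(E):
                # Propagation: population i moves to the neighbour along e_i.
                new[(x + ex) % nx][(y + ey) % ny][i] = post[i]
    return new


def total_mass(grid):
    """Sum of all populations on the grid."""
    return sum(sum(cell) for column in grid for cell in column)


if __name__ == "__main__":
    # Small periodic test: a density bump relaxes while total mass is conserved.
    grid = [[equilibrium(1.0, 0.05, 0.0) for _ in range(6)] for _ in range(8)]
    grid[3][2] = equilibrium(1.2, 0.0, 0.0)
    before = total_mass(grid)
    for _ in range(10):
        grid = collide_and_propagate(grid, tau=0.8)
    print(math.isclose(total_mass(grid), before, abs_tol=1e-9))
```

Because collision touches only the nine values of one cell and propagation only writes to nearest neighbours, the boundary cells can be collided first and shipped out while the rest of the grid is still being processed, which is the property the stream-based overlap relies on.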
To confirm that the GPU-CPU data copy into host memory has finished, cudaStreamSynchronize(stream[0]) is called to ensure that all boundary information has been copied to the sending buffers in host memory. Non-blocking MPI_Isend and MPI_Irecv are then launched if the neighboring processors do not belong to the same node. These two MPI functions are non-blocking, so other CPU operations can proceed while data are being sent or received. MPI_Wait is needed to wait until data have been received. If neighboring processors are located on the same node, data can be transferred through the portable pinned memory in CUDA. This design reduces the amount of data passed through MPI and achieves a higher data transfer speed. The idea is realized using OpenMP for data communications within a node [17]. OpenMP threads control the GPU devices and make portable pinned memory visible to all GPU devices at the same node. Furthermore, a new technology, GPUDirect [18] for Tesla or Fermi GPUs, is adopted to improve communication performance. The improvement is achieved by removing the step of copying data from GPU-dedicated host memory to host memory available to InfiniBand devices to execute the RDMA communications. After the data communications, received data are copied to the GPU, again with cudaMemcpyAsync. Finally, the boundary information is updated by the data from receiving buffers in GPU global memory.

Figure 3 Flowchart of the hybrid implementation of the LBM on multi-GPUs [20].

3 Results and discussion

In the following, the algorithm is validated and its performance tested on our GPU cluster Mole-8.5 (cf. top500.org/list/2011/11/100), which consists of 362 nodes connected with Quad Data Rate InfiniBand. Most of the computing nodes are equipped with two quad-core CPUs and six Nvidia Tesla C2050 GPUs; the whole system is therefore configured with more than 2000 GPUs, resulting in a peak performance of 2 petaflops in single precision.

3.1 Validation

Numerical validation is important in GPU computing, although many authors [7,19] have declared that the results are insensitive to single precision. We consider the analytical solution for the classical case of two-dimensional Couette flow to evaluate the accuracy of our GPU implementation. The domain size is […] and the Reynolds number Re is 400. The simulation is run in parallel on four GPUs. The simulation results and the analytical solution are illustrated in Figure 4. The computational results of our GPU implementation agree very well with the analytical solution, with a maximum error of about 1.5%.

Figure 4 Velocity profiles at steady state for a two-dimensional Couette flow simulation with grid size […] (Reynolds number Re = UH/ν = 400).

3.2 Performance

Five cases of Couette flow are simulated, with the grid size for each GPU ranging from […] (A) to […] (B), […] (C), […] (D) and […] (E). The whole computation domain is partitioned in either one or two dimensions. All cases were run 10 times with […] iteration steps each, and the wall times were recorded after arithmetic averaging. In the following, unless otherwise specified, each node runs six GPUs concurrently.

Time costs of GPU computation, data transfer between the GPU and CPU, and communication between neighboring CPUs in cases using 12 GPUs for one- and two-dimensional decomposition with synchronous execution and blocking MPI are plotted in Figures 5 and 6, respectively. The time portions of GPU-CPU data transfer and CPU-CPU communication increase as the domain size for each GPU is reduced. In addition, as expected, the time percentage of GPU-CPU and CPU-CPU data transfer in two-dimensional decomposition is higher than that for one-dimensional decomposition, and sometimes this time even exceeds the time for GPU computing, which means there is more room to improve the efficiency by hiding data transfer between the GPU and CPU and communications between CPUs.

Figure 5 (a) Time component of each part of the algorithm with synchronous execution and blocking MPI but without OpenMP in one-dimensional decomposition; (b) time percentages of GPU-CPU data transfer and CPU-CPU communication.

Figure 6 (a) Time component of each part of the algorithm with synchronous execution and blocking MPI in two-dimensional decomposition; (b) time percentages of GPU-CPU data transfer and CPU-CPU communication.

Simulations deploying the proposed computation-communication overlapping algorithm in both one- and two-dimensional decomposition were carried out. The time costs for all cases are illustrated in Figures 7 and 8. The figures show that most of the time for data copy and communication is successfully hidden through overlapping with GPU computation, leading to an obvious reduction in the total time. In two-dimensional decomposition, the performance improvement is even greater than that in one-dimensional decomposition, since more time for GPU-CPU data transfer and communication is hidden.

Figure 7 (a) Time component for the algorithm with asynchronous execution, OpenMP and non-blocking MPI in one-dimensional decomposition; (b) time percentage of GPU-CPU copy and CPU-CPU communication.

Figure 8 (a) Time component for the algorithm with asynchronous execution, OpenMP and non-blocking MPI in two-dimensional decomposition; (b) time percentage of GPU-CPU copy and CPU-CPU communication.

To describe the performance improvement clearly, we take case E in one-dimensional decomposition using 12 GPUs as an example to compare the time components of five algorithms: (a) synchronous execution and blocking MPI without OpenMP; (b) synchronous execution and blocking MPI with OpenMP; (c) asynchronous execution and blocking MPI with OpenMP; (d) synchronous execution and non-blocking MPI with OpenMP; (e) asynchronous execution and non-blocking MPI with OpenMP. The time results are listed in Table 1. Because asynchronous execution and non-blocking MPI are not serial, the times required for asynchronous GPU execution and non-blocking MPI are difficult to separate. Therefore, the GPU computation time was assumed to be the same for the asynchronous cases. Table 1 shows that the time required for data delivery between the GPU and CPU is reduced by about 60%-70% and the time required for inter-CPU communication is reduced by 70%-80%, which gives a performance of 1192 million lattice updates per second for each GPU card in the multi-node, multi-GPU implementation.

Table 1 Comparison of time components for five algorithms in case E. Columns: Algorithm (a) to (e); GPU computation (s); GPU-CPU data transfer (s); CPU-CPU communication (s); Total (s). Values: […]
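The effect of overlapping on the wall time per step can be illustrated with a simple timing model. The sketch below is an idealisation and not a reproduction of Table 1: it assumes that the boundary kernel runs first, that the bulk kernel then overlaps perfectly with the copy and the MPI exchange, and that the final boundary update is negligible. All function names and the sample timings are hypothetical. The helper `mlups` shows how a figure such as the reported 1192 million lattice updates per second is conventionally computed from grid size, step count and wall time.

```python
def mlups(cells_per_gpu, steps, wall_time_s):
    """Million lattice updates per second achieved by one GPU."""
    return cells_per_gpu * steps / wall_time_s / 1.0e6


def step_time_synchronous(t_compute, t_copy, t_mpi):
    """Synchronous schedule: compute, copy and MPI run one after another."""
    return t_compute + t_copy + t_mpi


def step_time_overlapped(t_boundary, t_bulk, t_copy, t_mpi):
    """Idealised asynchronous schedule (assumption, see lead-in).

    The boundary collision runs first; afterwards the bulk kernel in one
    stream overlaps with the copy and MPI exchange issued from the other.
    """
    return t_boundary + max(t_bulk, t_copy + t_mpi)


def hidden_fraction(t_bulk, t_copy, t_mpi):
    """Share of the copy-plus-MPI time that does not show up in the wall time."""
    transfer = t_copy + t_mpi
    exposed = max(0.0, transfer - t_bulk)
    return 1.0 - exposed / transfer


if __name__ == "__main__":
    # Hypothetical per-step timings in milliseconds, chosen only for illustration.
    t_boundary, t_bulk, t_copy, t_mpi = 0.5, 3.0, 0.75, 1.25
    print("synchronous :", step_time_synchronous(t_boundary + t_bulk, t_copy, t_mpi))
    print("overlapped  :", step_time_overlapped(t_boundary, t_bulk, t_copy, t_mpi))
    print("hidden share:", hidden_fraction(t_bulk, t_copy, t_mpi))
```

In this model the transfer cost is fully hidden as long as the bulk kernel takes longer than copy plus MPI; once the subdomain becomes small enough that the transfer dominates, only part of it can be hidden, which is consistent with the trend reported for the smaller cases.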

To investigate the scalability of the implementation further, we change the number of GPUs in case E, ranging from 12 to […]. The corresponding time costs for communication are shown in Figure 9. The computation-communication overlapping algorithm still performs better than the original algorithms with blocking MPI as the number of GPUs increases. This shows that the optimization can be applied to hundreds or thousands of GPUs with good scalability.

Figure 9 Comparison of communication time between blocking and non-blocking MPI in large-scale LBM simulations.

3.3 Performance balance for multi-GPU nodes

In addition to the above performance discussions, we also run our GPU implementation using 12 GPUs for case E but with a varying number (one, two, three, four or six) of GPUs at each node, to test the balance of performance and economy for computing nodes integrating multiple GPUs. The bandwidth of the PCI-E bus is known to be a common bottleneck, because data transfer between the GPU and CPU is slow compared with GPU computing, so performance deteriorates when multiple GPUs at one node are engaged in a parallel computation and compete for PCI-E bandwidth. Owing to the use of CUDA portable pinned memory and OpenMP, the communication load of the processes within a node is theoretically equal, irrespective of how many GPUs are employed concurrently at a node. Therefore, we can ensure that there are negligible differences in the CPU-CPU communication time for the five configuration settings. The performance of our implementation is summarized in Table 2. Although the number of GPUs used at each node increases from one to six, the increase in the total computation time is almost negligible, as most of the time for communication and data transfer is hidden owing to the asynchronous execution. The time difference is mainly due to the GPU-CPU data transfer, as more data are transferred through the PCI-E bus when more GPUs are running on the same node.
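How the PCI-E traffic of a node grows with the number of GPUs it hosts can be estimated by counting the D2Q9 populations that leave a subdomain in one propagation step. The sketch below makes several assumptions that the text does not state: only populations that actually cross a subdomain face are exchanged, each GPU receives as many values as it sends, values are stored in single precision, and every GPU on the node has neighbours on all decomposed sides. The function names are hypothetical.

```python
# Lattice velocities of D2Q9 (rest, four axis directions, four diagonals).
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
BYTES_PER_VALUE = 4  # single precision (assumption)


def outgoing_values(nx, ny, two_dimensional):
    """Count populations that leave an nx-by-ny subdomain in one step.

    With one-dimensional decomposition only the x faces have neighbours;
    the y direction is assumed to be handled inside the subdomain.
    """
    count = 0
    for x in range(nx):
        for y in range(ny):
            for ex, ey in E:
                leaves_x = (x + ex) not in range(nx)
                leaves_y = two_dimensional and (y + ey) not in range(ny)
                if leaves_x or leaves_y:
                    count += 1
    return count


def pcie_bytes_per_node(nx, ny, two_dimensional, gpus_per_node):
    """Bytes crossing the PCI-E bus of one node per step (send plus receive)."""
    per_gpu = 2 * outgoing_values(nx, ny, two_dimensional) * BYTES_PER_VALUE
    return gpus_per_node * per_gpu


if __name__ == "__main__":
    # Hypothetical 64-by-64 subdomain; the paper's grid sizes are not used here.
    for gpus in (1, 2, 3, 4, 6):
        print(gpus, pcie_bytes_per_node(64, 64, False, gpus),
              pcie_bytes_per_node(64, 64, True, gpus))
```

Under these assumptions the count is 6·ny values for one-dimensional decomposition and 6·(nx + ny) − 4 for two-dimensional decomposition, so a square subdomain moves roughly twice as much data in the two-dimensional case, and the per-node traffic scales linearly with the number of GPUs sharing the bus.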
Therefore, we believe that nodes integrating more GPUs, like those of Mole-8.5, achieve a good balance between performance and economy for some applications with an efficient algorithm, considering the hardware cost and space occupation.

3.4 Application

Because of CUDA's interoperability with OpenGL, we couple the efficient GPU implementation of the LBM with a visualization framework developed by our group [20] to realize large-scale simulations. In this section, we conduct a direct numerical simulation of gas flowing upward through suspended solid particles under a two-dimensional doubly periodic boundary condition. The simulation domain is 11.5 cm × 46 cm, which is discretized by about one billion lattice cells. We simulate the gas-solid flow using 576 GPUs at 96 nodes with two-dimensional domain decomposition. In Figure 10, distinct regions of particle aggregation, which are called clusters in the chemical community, are reproduced. This large-scale simulation confirms that efficient multi-GPU parallel LBM simulation on a powerful GPU cluster is a promising tool for scientific or industrial modeling.

4 Conclusions and prospects

A hybrid parallel GPU implementation for LBM simulation was proposed. Asynchronous GPU execution was applied to overlap GPU-CPU data transfer with GPU computation, so that a large portion of the time for GPU-CPU copy can be hidden. Data transfer between CPUs is realized with MPI. To hide this inter-CPU communication cost, non-blocking MPI was used to enable concurrent execution of GPU computing and MPI sending and receiving. A shared-memory model, OpenMP, was applied to improve the performance of nodes integrated with multiple GPUs. In our test cases, the time required for GPU-CPU data transfer and inter-CPU communication was reduced by up to about 70% for one-dimensional decomposition and 80% for two-dimensional decomposition. These results show that the hybrid multi-GPU LBM implementation is a feasible way to improve efficiency.
Large-scale direct numerical simulation of an 11.5 cm × 46 cm two-dimensional doubly periodic gas-solid suspension was demonstrated by coupling the implementation with a visualization framework. The hybrid mode was easy to implement and can be extended to three-dimensional decomposition. Although our implementations were based on the LBM, other CFD methods such as the finite difference and finite volume methods can be incorporated into this hybrid mode and we believe that they will also perform well.

Table 2 Time costs for GPU-CPU data transfer and CPU-CPU communication with a varying number of GPUs at each node in case E. Columns: Number of GPUs in a node; GPU computation (s); GPU-CPU data transfer (s); CPU-CPU communication (s); Total (s). Values: […]

Figure 10 Large-scale direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million particles [20].

This work was supported by the National Natural Science Foundation of China ([…] and […]). We are grateful to Prof. Aibing Yu of the University of New South Wales for illuminating discussions. Two anonymous reviewers who gave valuable comments and suggestions that helped improve the quality of this article are gratefully acknowledged. Support from Nvidia through the CUDA Center of Excellence Program is also appreciated.

1 Kampolis I C, Trompoukis X S, Asouti V G, et al. CFD-based analysis and two-level aerodynamic optimization on graphics processing units. Comput Method Appl M, 2010, 199
2 Wang J, Xu M, Ge W, et al. GPU accelerated direct numerical simulation with SIMPLE arithmetic for single-phase flow. Chin Sci Bull, 2010, 55
3 Anderson J A, Lorenz C D, Travesset A. General purpose molecular dynamics simulations fully implemented on graphics processing units. J Comput Phys, 2008, 227
4 Chen F, Ge W, Li J. Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs. Sci China Ser B: Chem, 2009, 52
5 Xiong Q, Li B, Chen F, et al. Direct numerical simulation of sub-grid structures in gas-solid flow: GPU implementation of macro-scale pseudo-particle modeling. Chem Eng Sci, 2010, 65
6 McNamara G R, Zanetti G. Use of the Boltzmann equation to simulate lattice-gas automata. Phys Rev Lett, 1988, 61
7 Tolke J, Krafczyk M. TeraFLOP computing on a desktop PC with GPUs for 3D CFD. Int J Comput Fluid D, 2008, 22
8 Ge W, Chen F, Meng F, et al. Multi-scale Discrete Simulation Parallel Computing Based on GPU (in Chinese). Beijing: Science Press
9 Bernaschi M, Fatica M, Melchionna S, et al. A flexible high-performance lattice Boltzmann GPU code for the simulations of fluid flows in complex geometries. Concurr Comp-Pract E, 2010, 22
10 Kuznik F, Obrecht C, Rusaouen G, et al. LBM based flow simulation using GPU computing processor. Comput Math Appl, 2010, 59
11 Li B, Li X, Zhang Y, et al. Lattice Boltzmann simulation on Nvidia and AMD GPUs (in Chinese). Chin Sci Bull (Chin Ver), 2009, 54
12 Myre J, Walsh S, Lilja D, et al. Performance analysis of single-phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters. Concurr Comp-Pract E, 2010, 23
13 NVIDIA. NVIDIA CUDA compute unified device architecture Programming Guide Version 3.1
14 Qian Y, Humieres D, Lallemand P. Lattice BGK for Navier-Stokes equation. Europhys Lett, 1992, 17
15 He N, Wang N, Shi B. A unified incompressible lattice BGK model and its application to three-dimensional lid-driven cavity flow. Chin Phys, 2004, 13
16 Obrecht C, Kuznik F, Tourancheau B, et al. A new approach to the lattice Boltzmann method for graphics processing units. Comput Math Appl, 2011, 61
17 Yang C, Huang C, Lin C. Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters. Comput Phys Commun, 2011, 182
18 Mellanox. NVIDIA GPUDirect Technology: Accelerating GPU-based Systems
19 Komatitsch D, Erlebacher G, Goddeke D, et al. High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster. J Comput Phys, 2010, 229
20 Ge W, Wang W, Yang N, et al. Meso-scale oriented simulation towards virtual process engineering (VPE): The EMMS paradigm. Chem Eng Sci, 2011, 66

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Scheduling Remote Access to Scientific Instruments in Cyberinfrastructure for Education and Research

Scheduling Remote Access to Scientific Instruments in Cyberinfrastructure for Education and Research Schedulng Remote Access to Scentfc Instruments n Cybernfrastructure for Educaton and Research Je Yn 1, Junwe Cao 2,3,*, Yuexuan Wang 4, Lanchen Lu 1,3 and Cheng Wu 1,3 1 Natonal CIMS Engneerng and Research

More information

AADL : about scheduling analysis

AADL : about scheduling analysis AADL : about schedulng analyss Schedulng analyss, what s t? Embedded real-tme crtcal systems have temporal constrants to meet (e.g. deadlne). Many systems are bult wth operatng systems provdng multtaskng

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to:

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to: 4.1 4.2 Motvaton EE 457 Unt 4 Computer System Performance An ndvdual user wants to: Mnmze sngle program executon tme A datacenter owner wants to: Maxmze number of Mnmze ( ) http://e-tellgentnternetmarketng.com/webste/frustrated-computer-user-2/

More information

Speedup of Type-1 Fuzzy Logic Systems on Graphics Processing Units Using CUDA

Speedup of Type-1 Fuzzy Logic Systems on Graphics Processing Units Using CUDA Speedup of Type-1 Fuzzy Logc Systems on Graphcs Processng Unts Usng CUDA Durlabh Chauhan 1, Satvr Sngh 2, Sarabjeet Sngh 3 and Vjay Kumar Banga 4 1,2 Department of Electroncs & Communcaton Engneerng, SBS

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Design of Structure Optimization with APDL

Design of Structure Optimization with APDL Desgn of Structure Optmzaton wth APDL Yanyun School of Cvl Engneerng and Archtecture, East Chna Jaotong Unversty Nanchang 330013 Chna Abstract In ths paper, the desgn process of structure optmzaton wth

More information
