Fast Multipole Method on the GPU
- Phyllis Russell
1 Fast Multipole Method on the GPU, with application to the Adaptive Vortex Method
University of Bristol, Bristol, United Kingdom.
2 Introduction
Particle methods: highly parallel, computationally intensive.
Numerical challenge: the N-body problem.
Opportunity: clever algorithms and massively parallel architectures (GPUs).
Contribution: a mesh-less method, accelerated using clever algorithms (FMM), with an implementation for GPUs.
3 Overview of the presentation
Adaptive Vortex Method (brief introduction): algorithmic representation.
The Fast Multipole Method: introduction to the algorithm, GPU implementation.
Lessons learned.
Final remark.
4 Vortex Method for fluid simulation
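5 Vortex Method for fluid simulation
Incompressible Newtonian fluid (2D case):
$\frac{\partial u}{\partial t} + u \cdot \nabla u = -\frac{\nabla p}{\rho} + \nu \nabla^2 u$
Navier-Stokes equation in vorticity formulation, with $\omega = \nabla \times u$:
$\frac{\partial \omega}{\partial t} + u \cdot \nabla \omega = \omega \cdot \nabla u + \nu \nabla^2 \omega$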
6 Vortex Method for fluid simulation
Discretize the vorticity field into particles:
$\omega_\sigma(x, t) = \sum_{i=1}^{N} \gamma_i \zeta_\sigma(x - x_i)$
Each particle carries vorticity:
$\zeta_\sigma(x) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{|x|^2}{2\sigma^2}\right)$
Particles move with the fluid:
$\frac{dx_i}{dt} = u(x_i, t)$
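7 Vortex Method for fluid simulation
The velocity can be obtained from the vorticity field:
$\nabla^2 \psi = -\omega$
$u(x) = -\frac{1}{2\pi} \int \frac{(x - x') \times \omega(x') \hat{e}_z}{|x - x'|^2} \, dx'$
where $\omega$ is given by the discretized vorticity field, which results in an N-body problem:
$u_\sigma(x, t) = \sum_{i=1}^{N} \gamma_i K_\sigma(x - x_i)$
$K_\sigma(x) = \frac{1}{2\pi |x|^2} (-x_2, x_1) \left(1 - \exp\left(-\frac{|x|^2}{2\sigma^2}\right)\right)$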
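The N-body sum on this slide can be written out directly. The following is a minimal sketch of the O(N^2) evaluation, assuming the regularized kernel K_sigma(x) = (-x_2, x_1) (1 - exp(-|x|^2 / (2 sigma^2))) / (2 pi |x|^2); the function names `kernel` and `velocity` are illustrative and do not come from the slides.

```python
import math

def kernel(dx, dy, sigma):
    """Regularized 2D Biot-Savart kernel K_sigma for a Gaussian vortex blob."""
    r2 = dx * dx + dy * dy
    if r2 == 0.0:
        return 0.0, 0.0  # the regularized kernel vanishes at the blob center
    factor = (1.0 - math.exp(-r2 / (2.0 * sigma * sigma))) / (2.0 * math.pi * r2)
    return -dy * factor, dx * factor

def velocity(targets, particles, gammas, sigma):
    """Direct O(N^2) evaluation of u_sigma(x) = sum_i gamma_i K_sigma(x - x_i)."""
    result = []
    for (x, y) in targets:
        u = v = 0.0
        for (xi, yi), g in zip(particles, gammas):
            ku, kv = kernel(x - xi, y - yi, sigma)
            u += g * ku
            v += g * kv
        result.append((u, v))
    return result
```

Every target visits every particle, which is the quadratic cost that the FMM is meant to remove.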
8 Vortex Method Algorithm
9 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
Step 1, discretization: $\omega(x,t) \approx \omega_\sigma(x,t) = \sum_{i=1}^{N} \Gamma_i(t) \zeta_{\sigma_i}(x - x_i(t))$
10 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
Step 2, velocity evaluation: $u_\sigma(x,t) = \sum_{j=1}^{N} \Gamma_j K_\sigma(x - x_j)$
11 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
Step 3, convection: $\frac{dx_i}{dt} = u(x_i, t)$
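The convection step integrates dx_i/dt = u(x_i, t) for every particle. The slides do not say which time integrator is used, so the sketch below assumes a single forward-Euler step; `convect` and the test field `rigid_rotation` are hypothetical names, and the rigid rotation stands in for the N-body velocity sum.

```python
def convect(positions, velocity_fn, dt):
    """Advance particles one forward-Euler step: x_i <- x_i + dt * u(x_i, t)."""
    velocities = velocity_fn(positions)
    return [(x + dt * u, y + dt * v)
            for (x, y), (u, v) in zip(positions, velocities)]

def rigid_rotation(positions):
    """Hypothetical test velocity field u = (-y, x)."""
    return [(-y, x) for (x, y) in positions]

new_positions = convect([(1.0, 0.0), (0.0, 2.0)], rigid_rotation, dt=0.1)
```

In a full vortex method the `velocity_fn` argument would be the (FMM-accelerated) velocity evaluation, and a higher-order integrator would normally replace the Euler update.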
12 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
Step 4, diffusion: $\frac{d\omega}{dt} = \nu \nabla^2 \omega$
13 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
Step 5, spatial adaptation: $\omega(x,t) \approx \omega_\sigma(x,t) = \sum_{i=1}^{N} \Gamma_i(t) \zeta_{\sigma_i}(x - x_i(t))$
14 VM advantages: low numerical diffusion; no mesh; it adapts to the fluid.
VM challenges: efficient treatment of boundary conditions; numerical: solution of an N-body problem.
15 Fast Multipole Method
16 Fast summation problem
Accelerate the evaluation of problems of the form:
$f(y_j) = \sum_{i=1}^{N} c_i K(y_j - x_i), \quad j \in [1 \ldots N]$
For N evaluations the total amount of work is proportional to $N^2$.
We want to solve this kind of problem in less than $O(N^2)$ operations: we want an $O(N)$, highly accurate algorithm.
The FMM exchanges accuracy for speed, and we control the accuracy.
17 The Fast Multipole Method
The FMM is based on Multipole Expansions (ME) that approximate the kernel function when it is evaluated far away from the origin. An ME is an infinite series truncated after p terms; this truncation is how we control the accuracy of the approximation.
$K(y - x_c) = \sum_{m=0}^{p} a_m(x_c) f_m(y)$, where $a_m(x_c)$ are the coefficient terms.
[Figure: cluster center x_c, particle x_i, evaluation point y, radius r.]
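How the truncation order p controls accuracy can be seen on a concrete kernel. The sketch below is an illustration only: it assumes the complex-variable kernel 1/(z - z_i), for which 1/(z - z_i) = sum_k (z_i - z_c)^k / (z - z_c)^(k+1) whenever |z_i - z_c| < |z - z_c|. The slides do not state which expansion the authors use, and all function names are hypothetical.

```python
def multipole_coefficients(sources, strengths, center, p):
    """a_k = sum_i gamma_i (z_i - z_c)^k for k = 0..p-1."""
    return [sum(g * (z - center) ** k for z, g in zip(sources, strengths))
            for k in range(p)]

def evaluate_multipole(coeffs, center, z):
    """Far-field value sum_k a_k / (z - z_c)^(k+1)."""
    w = z - center
    return sum(a / w ** (k + 1) for k, a in enumerate(coeffs))

def direct_sum(sources, strengths, z):
    """Reference value sum_i gamma_i / (z - z_i)."""
    return sum(g / (z - zi) for zi, g in zip(sources, strengths))

def truncation_error(sources, strengths, center, z, p):
    """Absolute error of the p-term expansion at the evaluation point z."""
    coeffs = multipole_coefficients(sources, strengths, center, p)
    return abs(evaluate_multipole(coeffs, center, z) - direct_sum(sources, strengths, z))
```

Calling `truncation_error` with increasing p for a well-separated evaluation point shows the error shrinking roughly like (cluster radius / distance)^p, which is the accuracy knob the slide refers to.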
18 The Fast Multipole Method
The basic idea is to use this ME to approximate a cluster of particles as a single pseudo-particle. The greater the distance to the cluster, the bigger the pseudo-particles can be. Particles in the near field are evaluated directly.
[Figure: domain decomposition, showing particles, pseudo-particles, the evaluation point and the distance to it.]
19 The Fast Multipole Method
A Local Expansion (LE) is used to approximate the influence of a group of Multipole Expansions. An LE provides a local description of the influence of particles that are located far away. The far field is evaluated using a single Local Expansion.
21 The Fast Multipole Method
The computation related to the tree structure, in the O(N) algorithm:
Upward Sweep: create Multipole Expansions (P2M, M2M).
Downward Sweep: evaluate Local Expansions (M2L, L2L, L2P).
22 Fast Multipole Method on the GPU
23 Exposing task-level parallelism
Stages: Setup, Upward Sweep, Downward Sweep, Evaluation.
The Directed Acyclic Graph of the FMM shows task dependencies and exposes task-level parallelism.
1. Tree creation.
2. Particle clustering.
3. Listing of cluster interactions.
4. Particle to Multipole.
5. Multipole to Multipole.
6. Multipole to Local.
7. Local to Local.
8. Local to Particle.
9. Near-field evaluation.
10. Adding near- and far-field contributions.
24 FMM: Computational time per stage
The Downward Sweep (M2L) and particle evaluation together account for over 99% of the time. These two stages offer the biggest gains. Particle evaluation is easy to implement for the GPU, so the focus is on the Multipole-to-Local (M2L) operations.
[Figure: computational time, Time [sec] versus number of processors, for ME Initialization, Upward Sweep, Downward Sweep, Evaluation and Total time. Parallel FMM (PetFMM), 10 million particles, FMM level 9, 17 FMM terms.]
25 Accelerating the M2L
The M2L stage can take over 99% of the computation time. One LE is formed by several transformed MEs. In total, many LEs are produced, but only one per cluster (L=5 requires 27,648 M2L translations). The M2L transformation acts as a matrix-vector operation. The M2L implementation is matrix-free and computationally intensive.
[Figure: M2L transformation. M2L(t) applied to an ME gives an LE; several MEs (orange) are used to produce a single LE (blue).]
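The figure of 27,648 translations matches 27 x 4^5: a uniform 2D quadtree has 4^5 = 1024 boxes at level 5, and an interior box has at most 27 entries in its interaction list (the 36 children of the parent's neighbourhood minus the 9 adjacent boxes). That reading is an assumption, since the slide does not show the derivation. The sketch below counts both the upper bound and the actual interaction-list sizes, which are smaller for boxes next to the domain boundary; the function names are illustrative.

```python
def interaction_list_size(i, j, level):
    """Number of M2L source boxes for box (i, j) on a uniform 2D quadtree level (level >= 1)."""
    n_parent = 2 ** (level - 1)              # parent boxes per side
    pi, pj = i // 2, j // 2                  # parent of box (i, j)
    count = 0
    for qi in range(max(pi - 1, 0), min(pi + 1, n_parent - 1) + 1):
        for qj in range(max(pj - 1, 0), min(pj + 1, n_parent - 1) + 1):
            for ci in (2 * qi, 2 * qi + 1):  # children of the parent's neighbours
                for cj in (2 * qj, 2 * qj + 1):
                    if abs(ci - i) > 1 or abs(cj - j) > 1:
                        count += 1           # well separated: goes through M2L
    return count

def m2l_upper_bound(level):
    """27 translations for each of the 4^level boxes."""
    return 27 * 4 ** level

def m2l_actual(level):
    """Total interaction-list entries on one level, accounting for the boundary."""
    n = 2 ** level
    return sum(interaction_list_size(i, j, level)
               for i in range(n) for j in range(n))
```

Under this assumption `m2l_upper_bound(5)` reproduces the number on the slide, while `m2l_actual(5)` is somewhat lower because boundary boxes have shorter interaction lists.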
26 Accelerating the M2L
Work reorganization: from a hierarchical structure to a queue. Homogeneous units of work. Improved temporal locality.
Upward Sweep: create Multipole Expansions (P2M, M2M). Downward Sweep: evaluate Local Expansions (M2L, L2L, L2P).
27 Accelerating the M2L
Work reorganization: from a hierarchical structure to a queue. Homogeneous units of work. Improved temporal locality.
Upward Sweep: create Multipole Expansions (P2M, M2M). Downward Sweep: evaluate Local Expansions (M2L, L2L, L2P).
Reorganized task queue: M2L(A, c_1), M2L(A, c_2), M2L(A, c_3), M2L(B, c_1), M2L(B, c_2), M2L(B, c_3).
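The reorganization from a tree traversal to a flat queue can be sketched as follows. This is a minimal illustration, assuming the interaction lists are stored as a dictionary keyed by the first argument of M2L(., .); the slide does not say what A, B and c_1..c_3 denote, and `build_m2l_queue` and `batches` are hypothetical helpers.

```python
def build_m2l_queue(interaction_lists):
    """Flatten {key: [partners]} into an ordered list of (key, partner) M2L tasks."""
    queue = []
    for key in sorted(interaction_lists):    # fixed order keeps related tasks adjacent
        for partner in interaction_lists[key]:
            queue.append((key, partner))     # one homogeneous unit of work
    return queue

def batches(queue, size):
    """Split the queue into fixed-size chunks, e.g. one chunk per GPU launch."""
    return [queue[i:i + size] for i in range(0, len(queue), size)]

tasks = build_m2l_queue({"B": ["c1", "c2", "c3"], "A": ["c1", "c2", "c3"]})
```

Keeping all tasks that share the same first argument next to each other is one way to get the temporal locality mentioned on the slide, and it is work the CPU can do while the GPU computes.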
28 GPU kernel version 1
Each thread transforms one ME. Matrix-free multiplication. Efficient matrix creation and multiplication. No thread synchronization is required. Resource-intensive threads. Non-coalesced memory transactions.
[Figure: single-thread computation pattern, ME to LE.]
Result: 20 Giga-operations (1 C1060 card), 20x speedup.
29 GPU kernel version 2
Many threads transform one ME. One thread computes only one term. Less efficient in floating-point operations. More parallelism. Coalesced memory transactions. Fewer resources per thread. Other memory tricks.
[Figure: multiple-thread computation pattern, ME to LE.]
Result: 482 Giga-operations (1 C1060 card), 100x speedup.
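The "one thread computes one term" pattern can be emulated on the CPU. The sketch below is not the authors' kernel: it assumes the complex kernel 1/(z - z_i), whose multipole expansion sum_k a_k / (z - z_c)^(k+1) translates to a local expansion sum_l b_l (z - z_L)^l with b_l = sum_k a_k (-1)^(k+1) C(l+k, k) / t^(k+l+1) and t = z_c - z_L. Each call to `m2l_term` is the work one GPU thread would do, and the matrix entries are generated on the fly (matrix-free). All names are illustrative.

```python
from math import comb

def multipole_coefficients(sources, strengths, center, p):
    """a_k = sum_i gamma_i (z_i - z_c)^k for k = 0..p-1."""
    return [sum(g * (z - center) ** k for z, g in zip(sources, strengths))
            for k in range(p)]

def m2l_term(l, me_coeffs, t):
    """One local-expansion coefficient b_l: the work of a single 'thread'."""
    return sum(a * (-1) ** (k + 1) * comb(l + k, k) / t ** (k + l + 1)
               for k, a in enumerate(me_coeffs))

def m2l(me_coeffs, me_center, le_center):
    """Matrix-free M2L: every output term is independent of the others."""
    t = me_center - le_center
    return [m2l_term(l, me_coeffs, t) for l in range(len(me_coeffs))]

def evaluate_local(le_coeffs, le_center, z):
    """Evaluate sum_l b_l (z - z_L)^l near the local-expansion center."""
    u = z - le_center
    return sum(b * u ** l for l, b in enumerate(le_coeffs))
```

Each term recomputes powers and binomial coefficients that a single thread handling the whole ME could reuse, which mirrors the trade of operation efficiency for parallelism described on the slide.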
30 Lessons Learned
31 Paradigm shift
Start by exposing parallelism: think about homogeneous units of work, thousands of parallel operations, and smart usage of resources. Trade operation efficiency for more parallel and resource-efficient kernels.
Think about heterogeneous computing: GPUs are not a silver bullet. Use the CPU to reorganize work.
32 Conclusions
Heterogeneous computing: use all available hardware! Current FMM peak: 480 giga-ops.
Methodology: identify and expose parallelism; distribute work between CPU and GPU; use the best for each job!
Current work: parallel FMM library (many applications); multi-GPU implementation of the FMM.
33 Ongoing work
Particle methods map well to new architectures. However, particle methods have the disadvantage of not being as mature as mesh-based methods: much more research has been done on conventional mesh methods.
Ongoing work: a compromise between the two, hybrid particle-mesh methods on new architectures.
34 Final remark
Novel architectures and current applications: how do we bridge new technologies and current applications?
Re-developing algorithms can give large speedups but is far from trivial. Porting algorithms can give small speedups with less effort.
Cost-effective solution: research and development of heterogeneity-aware libraries.
35 Thanks for listening
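36 Velocity calculation: Gaussian particles, N-body problem
$\zeta_\sigma(x) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{|x|^2}{2\sigma^2}\right)$
Vorticity: $\omega_\sigma(x, t) = \sum_{i=1}^{N} \gamma_i \zeta_\sigma(x - x_i)$
Velocity: $u_\sigma(x, t) = \sum_{i=1}^{N} \gamma_i K_\sigma(x - x_i)$
with $K_\sigma(x) = \frac{1}{2\pi |x|^2} (-x_2, x_1) \left(1 - \exp\left(-\frac{|x|^2}{2\sigma^2}\right)\right)$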
37 Vortex sheet
Discontinuity in the velocity field. Represented by vortex elements.
γ(s) 1 π [ n [log x(s) x(s ) ] ρ 1(s) L ]γ(s )ds = 2 u slip ŝ
ω t ν ω =0, ω(t δt) =0, ν ω n = γ(s) δt
38 Vortex Method algorithm
Steps: 1. Discretization; 2. Velocity evaluation; 3. Convection; 4. Diffusion; 5. Spatial adaptation.
39 Vortex method algorithm with panel-free boundary conditions
Added steps: A. Vortex sheet calculation; B. Vortex sheet diffusion.
[Figure: flowchart from Start to End with steps 1 to 5 and the added steps A and B.]
42 Panel-free method
Discretize into points (particle discretization). The points are the control points. Boundary conditions are enforced at the control points. RBF solution.
43 Panel-free method
Discretize into points (particle discretization). The points are the control points. Boundary conditions are enforced at the control points. RBF solution:
$\gamma(x) \approx \sum_{i=1}^{N} \phi(|x - c_i|) \alpha_i$
44 Accelerating the M2L
M2L: two-stage computation.
Stage 1: transformation of the MEs.
Stage 2: reduction into the LE.
[Figure: several MEs transformed and reduced into one LE.]
45 PetFMM: Parallel extensible toolkit for the FMM
[Figure: parallelization strategy. A root tree down to level k with Sub-tree 1 to Sub-tree 8 below it; labels mark the M2M and L2L translations, the M2L transformation and the local domain.]
46 PetFMM: Parallel extensible toolkit for the FMM
[Figure: parallel work distribution, with labels w_i, c_ij and w_j.]
47 PetFMM: Parallel extensible toolkit for the FMM
[Figure: speedup of PetFMM for different test cases, speedup versus number of processors. Series: uniform 4ML8R5, uniform 10ML9R5, spiral 1ML8R5, spiral w/ space-filling 1ML8R5, perfect speedup.]
More informationSPH: Why and what for?
SPH: Why and what for? 4 th SPHERIC training day David Le Touzé, Fluid Mechanics Laboratory, Ecole Centrale de Nantes / CNRS SPH What for and why? How it works? Why not for everything? Duality of SPH SPH
More informationTransition modeling using data driven approaches
Center for urbulence Research Proceedings of the Summer Program 2014 427 ransition modeling using data driven approaches By K. Duraisamy AND P.A. Durbin An intermittency transport-based model for bypass
More informationNetwork traffic: Scaling
Network traffic: Scaling 1 Ways of representing a time series Timeseries Timeseries: information in time domain 2 Ways of representing a time series Timeseries FFT Timeseries: information in time domain
More information1 Past Research and Achievements
Parallel Mesh Generation and Adaptation using MAdLib T. K. Sheel MEMA, Universite Catholique de Louvain Batiment Euler, Louvain-La-Neuve, BELGIUM Email: tarun.sheel@uclouvain.be 1 Past Research and Achievements
More informationShallow Water Simulations on Graphics Hardware
Shallow Water Simulations on Graphics Hardware Ph.D. Thesis Presentation 2014-06-27 Martin Lilleeng Sætra Outline Introduction Parallel Computing and the GPU Simulating Shallow Water Flow Topics of Thesis
More informationOverview of research activities Toward portability of performance
Overview of research activities Toward portability of performance Do dynamically what can t be done statically Understand evolution of architectures Enable new programming models Put intelligence into
More informationParallel FFT Program Optimizations on Heterogeneous Computers
Parallel FFT Program Optimizations on Heterogeneous Computers Shuo Chen, Xiaoming Li Department of Electrical and Computer Engineering University of Delaware, Newark, DE 19716 Outline Part I: A Hybrid
More informationIntroducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method
Introducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method G. Wellein, T. Zeiser, G. Hager HPC Services Regional Computing Center A. Nitsure, K. Iglberger, U. Rüde Chair for System
More informationThe Immersed Interface Method
The Immersed Interface Method Numerical Solutions of PDEs Involving Interfaces and Irregular Domains Zhiiin Li Kazufumi Ito North Carolina State University Raleigh, North Carolina Society for Industrial
More informationFinite Volume Discretization on Irregular Voronoi Grids
Finite Volume Discretization on Irregular Voronoi Grids C.Huettig 1, W. Moore 1 1 Hampton University / National Institute of Aerospace Folie 1 The earth and its terrestrial neighbors NASA Colin Rose, Dorling
More informationA Novel Approach to High Speed Collision
A Novel Approach to High Speed Collision Avril Slone University of Greenwich Motivation High Speed Impact Currently a very active research area. Generic projectile- target collision 11 th September 2001.
More informationsmooth coefficients H. Köstler, U. Rüde
A robust multigrid solver for the optical flow problem with non- smooth coefficients H. Köstler, U. Rüde Overview Optical Flow Problem Data term and various regularizers A Robust Multigrid Solver Galerkin
More information3D ADI Method for Fluid Simulation on Multiple GPUs. Nikolai Sakharnykh, NVIDIA Nikolay Markovskiy, NVIDIA
3D ADI Method for Fluid Simulation on Multiple GPUs Nikolai Sakharnykh, NVIDIA Nikolay Markovskiy, NVIDIA Introduction Fluid simulation using direct numerical methods Gives the most accurate result Requires
More informationTechnical Report TR
Technical Report TR-2015-09 Boundary condition enforcing methods for smoothed particle hydrodynamics Arman Pazouki 1, Baofang Song 2, Dan Negrut 1 1 University of Wisconsin-Madison, Madison, WI, 53706-1572,
More informationReproducibility of Complex Turbulent Flow Using Commercially-Available CFD Software
Reports of Research Institute for Applied Mechanics, Kyushu University No.150 (47 59) March 2016 Reproducibility of Complex Turbulent Using Commercially-Available CFD Software Report 1: For the Case of
More informationLATTICE-BOLTZMANN AND COMPUTATIONAL FLUID DYNAMICS
LATTICE-BOLTZMANN AND COMPUTATIONAL FLUID DYNAMICS NAVIER-STOKES EQUATIONS u t + u u + 1 ρ p = Ԧg + ν u u=0 WHAT IS COMPUTATIONAL FLUID DYNAMICS? Branch of Fluid Dynamics which uses computer power to approximate
More informationFOURTH ORDER COMPACT FORMULATION OF STEADY NAVIER-STOKES EQUATIONS ON NON-UNIFORM GRIDS
International Journal of Mechanical Engineering and Technology (IJMET Volume 9 Issue 10 October 2018 pp. 179 189 Article ID: IJMET_09_10_11 Available online at http://www.iaeme.com/ijmet/issues.asp?jtypeijmet&vtype9&itype10
More information