Simulieren geht über Probieren


1 Simulieren geht über Probieren ("simulation beats trial and error") Ulrich Rüde, Lehrstuhl für Informatik 10 (Systemsimulation), Universität Erlangen-Nürnberg, www10.informatik.uni-erlangen.de. Ulm, May 17

2 Overview Motivation Three examples Material science and process technology: Metal Foams Nano Technology Biomedical Technology: The inverse EEG problem High Performance Computing Conclusions 2

3 The Two (now Three) Principles of Science: Experiment (observation and prototypes, the empirical sciences), Theory (mathematical models, differential equations, Newton), and Computational Science (simulation, optimization, (quantitative) virtual reality) 3

4 Part IIa Metal Foams In collaboration with the Institut für Werkstoffwissenschaften Lehrstuhl Werkstoffkunde und Technologie der Metalle WTM (R.F. Singer, C. Körner) 4

5 Examples of Foams: glass, ceramics, metals, polymers. Structural properties: stiffness, energy absorption, damping. Functional properties: burners, shock absorbers, heat exchangers, batteries; large, dynamic surface expansion 5

6 Towards Simulating Metal Foams Bubble growth, coalescence, collapse, drainage, rheology, etc. are still poorly understood Simulation as a tool to better understand, control and optimize the process 6

7 The Lattice-Boltzmann Method Real valued representation of particles Discrete velocities and positions Algorithm consists of two steps: Stream Collide 7

8 The Stream Step Move particle distribution functions along corresponding velocity vector Normalized time step, cell size and particle speed 8

9 The Collide Step Accounts for collisions of particles during movement Weight the equilibrium velocities against the velocities obtained from streaming, depending on the fluid viscosity 9

10 Free surfaces with LBM Metal foams contain huge gas volumes Simulate and track only the fluid motion; compute boundary conditions at the free surface 10

11 Boundary Conditions Problem: missing distribution functions at interface cells (on the liquid side) after streaming, because no values are stored in the gas phase! Reconstruction such that the macroscopic boundary conditions are satisfied. Körner et al., Lattice Boltzmann Model for Free Surface Flow, to be published in the Journal of Computational Physics 11

12 Curvature calculation (version I) Alternative approaches: Integrate normals over surface (weighted triangles) Level set methods (track surface as implicit function) 12

13 Free surface flow: Breaking Dam [video] 13

14 Visualization Ray tracing with refraction, reflection, and caustics About 15 min per frame, i.e. one day of rendering for 4 seconds of animation; about the same compute time as the flow simulation itself 14

15 Rising Bubbles [video] 15

16 More Rising Bubbles [video] 16

17 Simulation Verification by Experiment [video] Simulation and experiment: diploma thesis of N. Thürey 17

18 Verification of bubble dynamics (C. Körner) Stokes' law: climbing rate of a bubble exposed to gravity; ideal bubble, no boundaries, equilibrium state [plot of climb rate (velocity) vs. distance in lattice units; video] Example: R = 8, τ = 0.74, g = 10⁻⁴, σ = 2·10⁻²; relative error 2 %. The error is a function of the system size 18

19 True Foams with Disjoining Pressure [videos] 19

20 Data Set: Pulsating Blood Flow at an Aneurysm CE Elite Master thesis: Jan Götz [video] 20

21 Data Set: Pulsating Blood Flow at an Aneurysm Master thesis: Jan Götz, in collaboration with Neuroradiology (Prof. Dörfler); image processing, simulation, fluid mechanics [video] 21

22 Part III High Performance Computing 22

23 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? What is the speed of the fastest computer currently available (and where is it located)? What was the speed of the fastest computer in 1995? 2000? 2005? 23

24 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? Probably between 1 and 6.5 GFlops 24

25 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? What is the speed of the fastest computer currently available (and where is it located)? What was the speed of the fastest computer in 1995? 2000? 2005? 25

26 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? What is the speed of the fastest computer currently available (and where is it located)? 367 TFlops, a BlueGene/L in Livermore, California, with more than 100,000 processors 26

27 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? What is the speed of the fastest computer currently available (and where is it located)? What was the speed of the fastest computer in 1995? 2000? 2005? 27

28 A little quiz... 1 Kflops = 10³, 1 Mflops = 10⁶, 1 Gflops = 10⁹, 1 Tflops = 10¹², 1 Pflops = 10¹⁵ floating point operations per second What is the speed of your PC? What is the speed of the fastest computer currently available (and where is it located)? What was the speed of the fastest computer in 1995? 2000? 2005? ... TFlops (1995), 12.3 TFlops (2000), 367 TFlops (2005) ... and how much has the speed of cars/airplanes/... improved in the same time? Additional question: when do you expect computers to exceed 1 PFlops? 28

29 LSS Cluster (Fujitsu-Siemens) Compute nodes (8×4 CPUs): AMD Opteron, max. 4.4 GFlops, RAM: 16 GByte Interactive nodes (9×2 CPUs): AMD Opteron 248 High-speed network: InfiniBand, 10 GBit/s 29

30 Architecture example: Our Pet Dinosaur The Hitachi SR 8000 at the Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften: 8 processors and 8 GB per node; 1344 CPUs (168×8); 12 GFlops/node, 2016 GFlops total; Linpack: 1645 GFlops (82% of theoretical peak); very sensitive to data structures (No. 5 at the time of installation in 2000) To be replaced by a 6000-processor SGI in 1H 2006, with an upgrade to >70 TFlops in 2007 30

31 31

32 32

33 Supercomputer Performance: TOP 500 [figure: TOP 500 performance development] 33

34 [Figure: Moore's Law in semiconductor technology (F. Hossfeld): transistors per die vs. year for DRAM and Intel microprocessors (Pentium, Pentium Pro, Merced), with growth rates of 52% and 42% per year]

35 Semiconductor Technology [Figure: information density (atoms per bit) and energy dissipation per logic operation (pico-joules, approaching kT) vs. year; adapted by F. Hossfeld from C. P. Williams et al., 1998] 35

36 Parallelization of LBM Code Standard LBM-Code in C (1-D Partitioning): - excellent performance on single SR8000 node - almost linear speed-up - large partitions favorable Performance on SR8000 Ca. 30% of Peak Performance 36

37 Parallelization Standard LBM code: scalability Largest simulation: 1.08·10⁹ cells, 370 GByte memory Communication cost is significant because of the large data volume (64 MByte); efficiency ~ 75% Dissertation T. Pohl (2006) 37

38 Parallelization of the free surface LBM code Standard LBM: 1 sweep through the grid Free surface LBM: 5 sweeps through the grid (cell type changes, closed boundary for bubbles, initialization of modified cells, mass balance correction) 38

39 Parallelization of the free surface LBM code Standard LBM: 1 sweep through the grid, 1 row of ghost nodes Free surface LBM: 5 sweeps through the grid, 4 rows of ghost nodes 39

40 Performance: standard LBM code vs. free surface LBM code The free surface performance is lousy on a single node! Conditionals: 2.9 (standard LBM) vs. 51 (free surface LBM) Pentium 4: almost no degradation (~10%) SR 8000: enormous degradation (pseudo-vectorization requires predictable jumps) 40

41 Structured vs. Unstructured Grids (on Hitachi SR 8000) [Figure: MFlops rates for matrix-vector multiplication on one node of the Hitachi, stencil-based gridlib/HHG vs. highly tuned JDS results for sparse matrices, over the number of unknowns (up to 2,146,689); courtesy of G. Wellein, RRZE Erlangen] 41

42 Refinement example Input Grid 42

43 Refinement example Refinement Level one 43

44 Refinement example Refinement Level Two 44

45 Refinement example Structured Interior 45

46 Refinement example Structured Interior 46

47 Refinement example Edge Interior 47

48 Refinement example Edge Interior 48

49 HHG: Parallel Scalability [Table: number of processors (from 64 upward), DOFs ×10⁶, elements ×10⁶, input elements, GFlops, and time in seconds] Parallel scalability of a Poisson problem discretized by tetrahedral finite elements; machine: SGI Altix (Itanium 2). B. Bergen, F. Hülsemann, U. Rüde: Is 1.7·10¹⁰ Unknowns the Largest Finite Element System that Can Be Solved Today?, SuperComputing, Nov. 2005

50 Conclusions (1) High performance simulation still requires heroic programming, but we are on the way to making supercomputers more generally usable. "Parallel programming is easy, node performance is difficult" (B. Gropp). Which architecture?
- ASCI-type (custom CPU, massively parallel cluster of SMPs): nobody has been able to show that these machines scale efficiently, except on a few very special applications and with enormous human effort
- Earth-Simulator-type (vector CPUs, as many as affordable): impressive performance on vectorizable code, but this needs to be checked against more demanding data and algorithm structures
- Hitachi class (modified custom CPU, cluster of SMPs): excellent performance on some codes, but unexpected slowdowns on others; too exotic to have a sufficiently large software base
- Others: BlueGene, Cray X1, multithreading, PIM, reconfigurable computing, quantum computing, ... 50

51 Conclusions (2) Which data structures? Structured (inflexible), unstructured (slow), HHG (high development effort: even the prototype has 50 K lines of code), meshless (useful in niches). Where are we going?
- the end of Moore's law
- nobody builds CPUs with HPC-specific requirements high on the list of priorities
- petaflops means on the order of 100,000 processors, and we can hardly handle 1000
It's the locality, stupid! The memory wall: latency and bandwidth. Distinguish between algorithms whose control flow is data independent (latency hiding techniques such as pipelining and prefetching can help) and those that are data dependent. 51

52 In the Future? What's beyond Moore's Law? 52

53 Part VI Outlook: Other applications 3D-Animation Computational Steering Real-Time Simulation 53

54 Near-Real-Time Free-Surface LBM (N. Thürey) [video] 54

55 Free-Surface LBM with Adaptive Refinement (N. Thürey) [video] High-resolution animations; adaptive refinement/coarsening; visualization with a ray tracer Fluid simulation in Blender 2.4; Blender is a freely available 3D modeling program 55

56 Collaborators and Acknowledgements In Erlangen: WTM, LSE, LSTM, LGDV, RRZE, Neurozentrum, Radiologie, etc.; especially for foams: C. Körner (WTM) International: Utah, Technion, Constanta, Ghent, Boulder, ... Dissertation projects: U. Fabricius (AMG methods and software engineering for parallelization), C. Freundl (parallel expression templates for PDE solvers), J. Härtlein (expression templates for FE applications), N. Thürey (LBM, free surfaces), T. Pohl (parallel LBM), ... and 6 more; 19 Diplom/Master theses, Studien/Bachelor theses; especially for performance analysis and optimization of the LBM: J. Wilke, K. Iglberger, S. Donath ... and 23 more Funding: KONWIHR, DFG, NATO, BMBF, Elitenetzwerk Bayern Bavarian Graduate School in Computational Engineering (with TUM, since 2004); special international PhD program "Identifikation, Optimierung und Steuerung für technische Anwendungen" (with Bayreuth and Würzburg), to start in January 56

57 Talk is Over [video] Please wake up! 57

58 The Lattice-Boltzmann Method Based on cellular automata, introduced by von Neumann around 1940 Famous example: Conway's Game of Life Complex system behavior from simple rules: regular grid, local rules specifying the time evolution Intrinsically parallel in model and simulation, similar to elliptic PDE solvers 58

59 The Lattice-Boltzmann Method Weakly compressible approximation of the Navier-Stokes equations Easy implementation Applicable for small Mach numbers (< 0.1) Easy to adapt, e.g. for Complicated or time-varying geometries Free surfaces Additional physical and chemical effects 59

60 LBM Demonstration (Java applet) file:///users/ruede/doc/lehr/vorles/ws03/hppt/lbm/jlb-comp/start.html 60

61 Free surface implementation Before the stream step, compute the mass exchange across cell boundaries for interface cells; calculate bubble volumes and pressure; surface curvature for surface tension Change the topology if interface cells become full or empty; keep the layer of interface cells closed 61

62 Surface Tension (Version 2) Marching-cubes surface triangulation Compute a curvature for each triangle from the triangle normals and the local change of surface area with enclosed volume, κ ≈ dA / (2 dV) Associate with each LBM cell the average curvature of its triangles Complicated, but beats level sets for our applications (mass conservation) 62

63 Nano Technology Curved boundaries: particles approximated with spheres Improve the accuracy of LBM simulations by using curved boundary conditions Standard no-slip: reflect DFs at the cell boundary More accurate: take the distance to the boundary surface into account, then interpolate the DFs accordingly 63

64 What are hierarchical hybrid grids? Standard geometric multigrid approach: a purely unstructured input grid resolves the geometry of the problem domain; patch-wise regular refinement, applied repeatedly to every cell of the coarse grid, generates nested grid hierarchies naturally suitable for geometric multigrid algorithms New: modify the storage formats and the operations on the grid to reflect the generated regular substructures 64

65 What's the new thing here? Hierarchical hybrid grids (HHG) are not yet another block-structured grid: HHG are more flexible (unstructured, hybrid input grids). They are not yet another unstructured geometric multigrid package: HHG achieve better performance, since an unstructured treatment of regular regions does not improve performance 65

66 Simulation is Performance hungry and Memory intensive Parallel Supercomputing required 66

67 Current Challenge: Parallelism on All Levels and the Memory Wall "Parallel computing is easy, good (single) processor performance is difficult" (B. Gropp, Argonne); "There has been no significant progress in High Performance Computing over the past 5 years" (H. Simon, NERSC) Instruction-level parallelism; memory bandwidth and latency are the limiting factors; cache-aware algorithms Conventional complexity measures (based on operation counts) are becoming increasingly unrealistic 67

68 LSS Cluster computer: Fujitsu-Siemens HPC Line Programming methods: cache optimization, C++ expression templates, (parallel) algorithms Cooperations in materials science; mechanical, electrical, and chemical engineering; medical technology; ... 68
