Nektar++: Library design

Size: px
Start display at page:

Download "Nektar++: Library design"

Transcription

1 Nektar++: Library design Chris Cantwell, David Moxey, Mike Kirby, Spencer Sherwin Nektar++ Workshop Imperial College London 7th July 205

2 Overview Library design overview Collections Developer Practice User Guide and Developer Guide

3 Nektar++: Objectives Mathematical Construction Expose different discretisations (CG, DG) by combining and reusing low-level elemental mathematical constructs. Computational Implementation Challenge high-/low-order boundaries while maintaining efficiency. Cross-dimensional support: D, 2D and 3D Retain and exploit domain symmetries and embeddings (homogeneous, cylindrical, manifold) = = Bridge current and future hardware diversity. Achieve flexible HPC scalability and performance through hybrid parallelism.

4 Nektar++: Design APPLICATION DOMAIN Stack of libraries SolverUtils Build up the spectral/hp element method systematically SPECTRAL ELEMENT METHOD MultiRegions LocalRegions SpatialDomains StdRegions AUXILIARY LibUtilities - -

5 Nektar++: Design APPLICATION DOMAIN Stack of libraries SolverUtils Build up the spectral/hp element method systematically SPECTRAL ELEMENT METHOD MultiRegions Core mathematical abstractions at bottom LocalRegions Complete physical domain description and operators at top SpatialDomains StdRegions AUXILIARY LibUtilities - -

6 Nektar++: Design APPLICATION DOMAIN Stack of libraries SolverUtils Build up the spectral/hp element method systematically SPECTRAL ELEMENT METHOD MultiRegions Core mathematical abstractions at bottom LocalRegions Complete physical domain description and operators at top SpatialDomains StdRegions AUXILIARY LibUtilities - - Encapsulate complexities of the spectral/hp element formulation at the top levels to improve accessibility of methods Allow users to still engage at lower level depending on their requirements (e.g. for methods research)

7 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem:

8 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP:

9 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP: MultiRegions

10 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP: MultiRegions LocalRegions

11 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP: MultiRegions LocalRegions StdRegions - -

12 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP: MultiRegions LocalRegions SpatialDomains StdRegions - -

13 Nektar++: Operator construction Nektar++ is a spectral/hp element toolkit written in C++. Helmholtz problem: Weak form + IBP: MultiRegions Collections LocalRegions SpatialDomains StdRegions

14 Implementation strategies Global Strategy =

15 Implementation strategies Global Strategy = Local Strategy =

16 Implementation strategies Global Strategy = Local Strategy = Sum-factorisation =

17 Collections: Design Local Matrix = StdMat. Apply Jacobian (L) = 2. L3 Multiply by ref. matrix =

18 Collections: Design Local Matrix = StdMat. Apply Jacobian (L) = 2. L3 Multiply by ref. matrix = IterPerExp. Apply Jacobian (L) 2. L2 multiply by ref. matrix for i = :N =

19 Collections: Design Local Matrix = StdMat. Apply Jacobian (L) = 2. L3 Multiply by ref. matrix = IterPerExp. Apply Jacobian (L) 2. L2 multiply by ref. matrix for i = :N = SumFac (Quad). Apply Jacobian (L) 2. L2 multiply for first dim ref. matrix = 3. L2 multiply for second dim ref. matrix for i = :N =

20 Test Case Intercostal Pair 4k tets 2k prisms

21 Test Case LocalSumFac IterPerExp StdMat SumFac Tet BwdTrans LocalSumFac IterPerExp StdMat SumFac Tet IProduct Intercostal Pair 4k tets 2k prisms T / T IterPerExp P P Tet IProductWRTDerivBase Tet PhysDeriv LocalSumFac IterPerExp StdMat SumFac LocalSumFac IterPerExp StdMat SumFac 2 2 T / T IterPerExp P P

22 Test Case LocalSumFac IterPerExp StdMat SumFac Tet BwdTrans LocalSumFac IterPerExp StdMat SumFac Tet IProduct Intercostal Pair 4k tets 2k prisms T / T IterPerExp P P Tet IProductWRTDerivBase Tet PhysDeriv LocalSumFac IterPerExp StdMat SumFac LocalSumFac IterPerExp StdMat SumFac 2 2 T / T IterPerExp P P StdMat most effective at low order Collections less effective at high order for most operators PhysDeriv benefits from SumFac even at low P Similar trends observed for Prisms

23 Collection Size Dependence Limit size of collections to 0, 00 or 000 elements Max 0 Max 00 Max 000 All (4000) Tet IterPerExp PhysDeriv T (seconds) P Negligible dependence on collection size for < 000 elements Reduced performance for large numbers of elements

24 BLAS Implementation LocalSumFac IterPerExp StdMat SumFac Tet BwdTrans LocalSumFac IterPerExp StdMat SumFac Tet BwdTrans (OpenBLAS) T / T IterPerExp P P OpenBLAS performs best for large matrix operations Significant gains for StdMat and SumFac at high P Importance of auto-tuning

25 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++?

26 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab)

27 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab)

28 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab)

29 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab)

30 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab)

31 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot)

32 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot)

33 Developer Practice Buildssupport (latest a onlarge right) What development practices multi-plaform collaborative software project such as Nektar++? Success git clone Currently building Builders (different OSs / config) Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot) Failed Click for details

34 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot)

35 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot)

36 Developer Practice What development practices support a large multi-plaform collaborative software project such as Nektar++? git clone Version-control (Git + GitLab) Tests & Continuous Integration (Buildbot) Issue tracking (Trac) Documentation (PDF + doxygen)

37 Documentation All documentation versioned with code

38 Documentation All documentation versioned with code User guide

39 Documentation All documentation versioned with code User guide Installation

40 Documentation All documentation versioned with code User guide Installation Solver and Utility usage Tutorials Input file reference FAQs

41 Documentation All documentation versioned with code User guide Installation Solver and Utility usage Tutorials Input file reference FAQs Developer guide Programming concepts Library design Nektar++ structures Nektar++ algorithms Coding Standard

42 Documentation All documentation versioned with code User guide Installation Solver and Utility usage Tutorials Input file reference FAQs Developer guide Programming concepts Library design Nektar++ structures Nektar++ algorithms Coding Standard Code documentation Doxygen Detailed implementation specifics

43 Summary Design of the Nektar++ framework Stack of libraries Code design mirrors mathematical formulation Encapsulate complexities of spectral/hp element methods Improve accessibility Collections Multiple implementations for achieving performance across P Collates action of operators across multiple elements Improves efficiency on modern vector-capable CPUs Easy to use with auto-tuning Developer practice Version control (Git + Gitlab) Regression tests (also useful as examples!) Continuous integration (Buildbot) Issue tracking (Trac) Documentation (LaTeX + Doxygen) Further information Nektar++: An open-source spectral/hp element framework, C. D. Cantwell, D. Moxey, A. Comerford, et al., Computer Physics Communications, in press, 205

Optimizing the performance of the spectral/hp element method with collective linear algebra operations

Optimizing the performance of the spectral/hp element method with collective linear algebra operations Optimizing the performance of the spectral/hp element method with collective linear algebra operations D. Moxey a,,c.d.cantwell a,r.m.kirby b, S. J. Sherwin a a Department of Aeronautics, South Kensington

More information

Pre- and post-processing in Nektar++

Pre- and post-processing in Nektar++ Pre- and post-processing in Nektar++ D. Moxey, C. Cantwell, R. M. Kirby, S. Sherwin Department of Aeronautics, Imperial College London Nektar++ workshop 7 th July 2015 Outline Motivation Pre-processing

More information

High-order mesh generation for CFD solvers

High-order mesh generation for CFD solvers High-order mesh generation for CFD solvers M. Turner, D. Moxey, S. Sherwin, J. Peiró Department of Aeronautics, Imperial College London DiPaRT 2015 Annual Meeting, Bristol, UK 17 th November 2015 Overview

More information

From h to p Efficiently: Selecting the Optimal Spectral/hp Discretisation in Three Dimensions

From h to p Efficiently: Selecting the Optimal Spectral/hp Discretisation in Three Dimensions Math. Model. Nat. henom. Vol., No., 0, pp.-9 DOI: 0.0/mmnp/00 From h to p Efficiently: Selecting the Optimal Spectral/hp Discretisation in Three Dimensions C. D. Cantwell, S. J. Sherwin, R. M. Kirby and.

More information

Spectral/hp element methods: Recent developments, applications, and perspectives *

Spectral/hp element methods: Recent developments, applications, and perspectives * 1 Available online at https://link.springer.com http://www.jhydrod.com/ 2018,30(1):1-22 https://doi.org/10.1007/s42241-018-0001-1 Spectral/hp element methods: Recent developments, applications, and perspectives

More information

COFFEE: AN OPTIMIZING COMPILER FOR FINITE ELEMENT LOCAL ASSEMBLY. Fabio Luporini

COFFEE: AN OPTIMIZING COMPILER FOR FINITE ELEMENT LOCAL ASSEMBLY. Fabio Luporini COFFEE: AN OPTIMIZING COMPILER FOR FINITE ELEMENT LOCAL ASSEMBLY Fabio Luporini Florian Rathgeber, Lawrence Mitchell, Gheorghe-Teodor Bercea, Andrew T. T. McRae, J. Ramanujam, Ana Lucia Varbanescu, David

More information

Optimising Sparse Matrix Vector Multiplication for Large Scale FEM problems on FPGA

Optimising Sparse Matrix Vector Multiplication for Large Scale FEM problems on FPGA Optimising Sparse Matrix Vector Multiplication for Large Scale FEM problems on FPGA Paul Grigoraş, Pavel Burovskiy, Wayne Luk, Spencer Sherwin Department of Computing, Imperial College London Department

More information

High-order visualization with ElVis

High-order visualization with ElVis High-order visualization with ElVis J. Peiro, D. Moxey, B. Jordi, S.J. Sherwin, B.W. Nelson, R.M. Kirby and R. Haimes Abstract Accurate visualization of high-order meshes and flow fields is a fundamental

More information

Experiences from the Fortran Modernisation Workshop

Experiences from the Fortran Modernisation Workshop Experiences from the Fortran Modernisation Workshop Wadud Miah (Numerical Algorithms Group) Filippo Spiga and Kyle Fernandes (Cambridge University) Fatima Chami (Durham University) http://www.nag.co.uk/content/fortran-modernization-workshop

More information

Linear Algebra libraries in Debian. DebConf 10 New York 05/08/2010 Sylvestre

Linear Algebra libraries in Debian. DebConf 10 New York 05/08/2010 Sylvestre Linear Algebra libraries in Debian Who I am? Core developer of Scilab (daily job) Debian Developer Involved in Debian mainly in Science and Java aspects sylvestre.ledru@scilab.org / sylvestre@debian.org

More information

First tutorial session

First tutorial session First tutorial session Vincent Dumoulin January 15, 2015 Outline 1 Solution to the numpy + MNIST + MLP assignment 2 Git primer 3 Theano primer 4 Porting numpy + MNIST + MLP to theano + MNIST + MLP Solution

More information

The challenges of new, efficient computer architectures, and how they can be met with a scalable software development strategy.! Thomas C.

The challenges of new, efficient computer architectures, and how they can be met with a scalable software development strategy.! Thomas C. The challenges of new, efficient computer architectures, and how they can be met with a scalable software development strategy! Thomas C. Schulthess ENES HPC Workshop, Hamburg, March 17, 2014 T. Schulthess!1

More information

Fourier Spectral/hp Element Method: Investigation of Time-Stepping and Parallelisation Strategies

Fourier Spectral/hp Element Method: Investigation of Time-Stepping and Parallelisation Strategies Fourier Spectral/hp Element Method: Investigation of Time-Stepping and Parallelisation Strategies by Alessandro Bolis Imperial College London Department of Aeronautics This thesis is submitted for the

More information

Introduction to Parallel Programming for Multicore/Manycore Clusters Part II-3: Parallel FVM using MPI

Introduction to Parallel Programming for Multicore/Manycore Clusters Part II-3: Parallel FVM using MPI Introduction to Parallel Programming for Multi/Many Clusters Part II-3: Parallel FVM using MPI Kengo Nakajima Information Technology Center The University of Tokyo 2 Overview Introduction Local Data Structure

More information

Quasi-3D Computation of the Taylor-Green Vortex Flow

Quasi-3D Computation of the Taylor-Green Vortex Flow Quasi-3D Computation of the Taylor-Green Vortex Flow Tutorials November 25, 2017 Department of Aeronautics, Imperial College London, UK Scientific Computing and Imaging Institute, University of Utah, USA

More information

Recommender Systems New Approaches with Netflix Dataset

Recommender Systems New Approaches with Netflix Dataset Recommender Systems New Approaches with Netflix Dataset Robert Bell Yehuda Koren AT&T Labs ICDM 2007 Presented by Matt Rodriguez Outline Overview of Recommender System Approaches which are Content based

More information

IMPROVING I/O PERFORMANCE IN NEKTAR++

IMPROVING I/O PERFORMANCE IN NEKTAR++ IMPROVING I/O PERFORMANCE IN NEKTAR++ Rupert Nash (rupert.nash@ed.ac.uk) EPCC The University of Edinburgh Nektar++ Workshop, Imperial College, 7/7/15 ARCHER ecse programme Funded by EPSRC Run by EPCC as

More information

Embedding Formulations, Complexity and Representability for Unions of Convex Sets

Embedding Formulations, Complexity and Representability for Unions of Convex Sets , Complexity and Representability for Unions of Convex Sets Juan Pablo Vielma Massachusetts Institute of Technology CMO-BIRS Workshop: Modern Techniques in Discrete Optimization: Mathematics, Algorithms

More information

A Component Framework for HPC Applications

A Component Framework for HPC Applications A Component Framework for HPC Applications Nathalie Furmento, Anthony Mayer, Stephen McGough, Steven Newhouse, and John Darlington Parallel Software Group, Department of Computing, Imperial College of

More information

To CG or to HDG: A Comparative Study in 3D

To CG or to HDG: A Comparative Study in 3D DOI 10.1007/s10915-015-0076-6 To CG or to HDG: A Comparative Study in 3D Sergey Yakovlev 1 David Moxey 2 Robert M. Kirby 3 Spencer J. Sherwin 4 Received: 25 June 2014 / Revised: 8 June 2015 / Accepted:

More information

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK Multigrid Solvers in CFD David Emerson Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK david.emerson@stfc.ac.uk 1 Outline Multigrid: general comments Incompressible

More information

CUDA Development Using NVIDIA Nsight, Eclipse Edition. David Goodwin

CUDA Development Using NVIDIA Nsight, Eclipse Edition. David Goodwin CUDA Development Using NVIDIA Nsight, Eclipse Edition David Goodwin NVIDIA Nsight Eclipse Edition CUDA Integrated Development Environment Project Management Edit Build Debug Profile SC'12 2 Powered By

More information

Automatic Tuning of Sparse Matrix Kernels

Automatic Tuning of Sparse Matrix Kernels Automatic Tuning of Sparse Matrix Kernels Kathy Yelick U.C. Berkeley and Lawrence Berkeley National Laboratory Richard Vuduc, Lawrence Livermore National Laboratory James Demmel, U.C. Berkeley Berkeley

More information

Incorporation of Multicore FEM Integration Routines into Scientific Libraries

Incorporation of Multicore FEM Integration Routines into Scientific Libraries Incorporation of Multicore FEM Integration Routines into Scientific Libraries Matthew Knepley Computation Institute University of Chicago Department of Molecular Biology and Physiology Rush University

More information

Using GPUs for unstructured grid CFD

Using GPUs for unstructured grid CFD Using GPUs for unstructured grid CFD Mike Giles mike.giles@maths.ox.ac.uk Oxford University Mathematical Institute Oxford e-research Centre Schlumberger Abingdon Technology Centre, February 17th, 2011

More information

Points Addressed in this Lecture. Standard form of Boolean Expressions. Lecture 4: Logic Simplication & Karnaugh Map

Points Addressed in this Lecture. Standard form of Boolean Expressions. Lecture 4: Logic Simplication & Karnaugh Map Points Addressed in this Lecture Lecture 4: Logic Simplication & Karnaugh Map Professor Peter Cheung Department of EEE, Imperial College London Standard form of Boolean Expressions Sum-of-Products (SOP),

More information

Performance Analysis of BLAS Libraries in SuperLU_DIST for SuperLU_MCDT (Multi Core Distributed) Development

Performance Analysis of BLAS Libraries in SuperLU_DIST for SuperLU_MCDT (Multi Core Distributed) Development Available online at www.prace-ri.eu Partnership for Advanced Computing in Europe Performance Analysis of BLAS Libraries in SuperLU_DIST for SuperLU_MCDT (Multi Core Distributed) Development M. Serdar Celebi

More information

Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones

Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones What is machine learning? Data interpretation describing relationship between predictors and responses

More information

The GridPACK Toolkit for Developing Power Grid Simulations on High Performance Computing Platforms

The GridPACK Toolkit for Developing Power Grid Simulations on High Performance Computing Platforms The GridPACK Toolkit for Developing Power Grid Simulations on High Performance Computing Platforms Bruce Palmer, Bill Perkins, Kevin Glass, Yousu Chen, Shuangshuang Jin, Ruisheng Diao, Mark Rice, David

More information

Repairing a Boundary Mesh

Repairing a Boundary Mesh Tutorial 1. Repairing a Boundary Mesh Introduction TGrid offers several tools for mesh repair. While there is no right or wrong way to repair a mesh, the goal is to improve the quality of the mesh with

More information

Use of Synthetic Benchmarks for Machine- Learning-based Performance Auto-tuning

Use of Synthetic Benchmarks for Machine- Learning-based Performance Auto-tuning Use of Synthetic Benchmarks for Machine- Learning-based Performance Auto-tuning Tianyi David Han and Tarek S. Abdelrahman The Edward S. Rogers Department of Electrical and Computer Engineering University

More information

Viscous Hybrid Mesh Generation

Viscous Hybrid Mesh Generation Tutorial 4. Viscous Hybrid Mesh Generation Introduction In cases where you want to resolve the boundary layer, it is often more efficient to use prismatic cells in the boundary layer rather than tetrahedral

More information

Introduction to ANSYS FLUENT Meshing

Introduction to ANSYS FLUENT Meshing Workshop 02 Volume Fill Methods Introduction to ANSYS FLUENT Meshing 1 2011 ANSYS, Inc. December 21, 2012 I Introduction Workshop Description: Mesh files will be read into the Fluent Meshing software ready

More information

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić How to perform HPL on CPU&GPU clusters Dr.sc. Draško Tomić email: drasko.tomic@hp.com Forecasting is not so easy, HPL benchmarking could be even more difficult Agenda TOP500 GPU trends Some basics about

More information

When Modeling meets Productivity. Sven Efftinge - itemis

When Modeling meets Productivity. Sven Efftinge - itemis When Modeling meets Productivity Sven Efftinge - itemis I Eclipse JDT I GIT So what s the Problem? It s the Language not the Tooling! Level of Abstraction Reuse existing, proven technology and apply

More information

Algorithms, System and Data Centre Optimisation for Energy Efficient HPC

Algorithms, System and Data Centre Optimisation for Energy Efficient HPC 2015-09-14 Algorithms, System and Data Centre Optimisation for Energy Efficient HPC Vincent Heuveline URZ Computing Centre of Heidelberg University EMCL Engineering Mathematics and Computing Lab 1 Energy

More information

Fast Multipole and Related Algorithms

Fast Multipole and Related Algorithms Fast Multipole and Related Algorithms Ramani Duraiswami University of Maryland, College Park http://www.umiacs.umd.edu/~ramani Joint work with Nail A. Gumerov Efficiency by exploiting symmetry and A general

More information

Finite element methods in scientific computing. Wolfgang Bangerth, Texas A&M University

Finite element methods in scientific computing. Wolfgang Bangerth, Texas A&M University Finite element methods in scientific computing, Texas A&M University Implementing the finite element method A brief re-hash of the FEM, using the Poisson equation: We start with the strong form: Δ u=f...and

More information

Parallel Computing 35 (2009) Contents lists available at ScienceDirect. Parallel Computing. journal homepage:

Parallel Computing 35 (2009) Contents lists available at ScienceDirect. Parallel Computing. journal homepage: Parallel Computing 5 (09) 84 04 Contents lists available at ScienceDirect Parallel Computing journal homepage: www.elsevier.com/locate/parco Parallel performance of the coarse space linear vertex solver

More information

HPC Computer Aided CINECA

HPC Computer Aided CINECA HPC Computer Aided Engineering @ CINECA Raffaele Ponzini Ph.D. CINECA SuperComputing Applications and Innovation Department SCAI 16-18 June 2014 Segrate (MI), Italy Outline Open-source CAD and Meshing

More information

Spectral/hp Methods For Elliptic Problems on Hybrid Grids

Spectral/hp Methods For Elliptic Problems on Hybrid Grids Contemporary Mathematics Volume 18, 1998 B -818-988-1-31-7 Spectral/hp Methods For Elliptic Problems on Hybrid Grids Spencer J. Sherwin, Timothy C.E. Warburton, and George Em Karniadakis 1. Introduction

More information

Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

Privacy-Preserving Distributed Linear Regression on High-Dimensional Data Privacy-Preserving Distributed Linear Regression on High-Dimensional Data Borja Balle Amazon Research Cambridge (work done at Lancaster University) Based on joint work with Adria Gascon, Phillipp Schoppmann,

More information

Learning Objectives. In this chapter you will learn about:

Learning Objectives. In this chapter you will learn about: Ref. Page Slide 1/16 Learning Objectives In this chapter you will learn about: Basic operations performed by all types of computer systems Basic organization of a computer system Input unit and its functions

More information

Forest-of-octrees AMR: algorithms and interfaces

Forest-of-octrees AMR: algorithms and interfaces Forest-of-octrees AMR: algorithms and interfaces Carsten Burstedde joint work with Omar Ghattas, Tobin Isaac, Georg Stadler, Lucas C. Wilcox Institut für Numerische Simulation (INS) Rheinische Friedrich-Wilhelms-Universität

More information

Introduction to parallel Computing

Introduction to parallel Computing Introduction to parallel Computing VI-SEEM Training Paschalis Paschalis Korosoglou Korosoglou (pkoro@.gr) (pkoro@.gr) Outline Serial vs Parallel programming Hardware trends Why HPC matters HPC Concepts

More information

Software Ecosystem for Arm-based HPC

Software Ecosystem for Arm-based HPC Software Ecosystem for Arm-based HPC CUG 2018 - Stockholm Florent.Lebeau@arm.com Ecosystem for HPC List of components needed: Linux OS availability Compilers Libraries Job schedulers Debuggers Profilers

More information

Windows. Everywhere else

Windows. Everywhere else Git version control Enable native scrolling Git is a tool to manage sourcecode Never lose your coding progress again An empty folder 1/30 Windows Go to your programs overview and start Git Bash Everywhere

More information

CoreOS and Red Hat. Reza Shafii Joe Fernandes Brandon Philips Clayton Coleman May 2018

CoreOS and Red Hat. Reza Shafii Joe Fernandes Brandon Philips Clayton Coleman May 2018 CoreOS and Red Hat Reza Shafii Joe Fernandes Brandon Philips Clayton Coleman May 2018 Combining Industry Leading Container Solutions RED HAT QUAY REGISTRY ETCD PROMETHEUS RED HAT COREOS METERING & CHARGEBACK

More information

Learning ctools and GammaLib development in an hour

Learning ctools and GammaLib development in an hour Learning ctools and GammaLib development in an hour Introduction to 6 th ctools coding sprint Jürgen Knödlseder (IRAP) What I expect you know How to write C++ and/or Python code How to use Git Our GitLab

More information

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPUaccelerated

More information

Arm's role in co-design for the next generation of HPC platforms

Arm's role in co-design for the next generation of HPC platforms Arm's role in co-design for the next generation of HPC platforms Filippo Spiga Software and Large Scale Systems What it is Co-design? Abstract: Preparations for Exascale computing have led to the realization

More information

HPC Algorithms and Applications

HPC Algorithms and Applications HPC Algorithms and Applications Dwarf #5 Structured Grids Michael Bader Winter 2012/2013 Dwarf #5 Structured Grids, Winter 2012/2013 1 Dwarf #5 Structured Grids 1. dense linear algebra 2. sparse linear

More information

SWARM OPTIMIZATION MATLAB OPERATION MANUAL E-PUB

SWARM OPTIMIZATION MATLAB OPERATION MANUAL E-PUB 23 December, 2018 SWARM OPTIMIZATION MATLAB OPERATION MANUAL E-PUB Document Filetype: PDF 299.31 KB 0 SWARM OPTIMIZATION MATLAB OPERATION MANUAL E-PUB Genetic Algorithms and Particle Swarm Optimization

More information

PrimaryTools.co.uk 2012 MATHEMATICS KEY STAGE TEST C LEVEL 6 CALCULATOR ALLOWED PAGE TOTAL MARKS First Name Last Name School Prim

PrimaryTools.co.uk 2012 MATHEMATICS KEY STAGE TEST C LEVEL 6 CALCULATOR ALLOWED PAGE TOTAL MARKS First Name Last Name School Prim 2012 MATHEMATICS KEY STAGE 2 2001 TEST C LEVEL 6 CALCULATOR ALLOWED PAGE 3 5 7 9 11 12 TOTAL MARKS First Name Last Name School 2012 Instructions You may use a calculator to answer any questions in this

More information

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2

More information

Mathematical Methods in Fluid Dynamics and Simulation of Giant Oil and Gas Reservoirs. 3-5 September 2012 Swissotel The Bosphorus, Istanbul, Turkey

Mathematical Methods in Fluid Dynamics and Simulation of Giant Oil and Gas Reservoirs. 3-5 September 2012 Swissotel The Bosphorus, Istanbul, Turkey Mathematical Methods in Fluid Dynamics and Simulation of Giant Oil and Gas Reservoirs 3-5 September 2012 Swissotel The Bosphorus, Istanbul, Turkey Fast and robust solvers for pressure systems on the GPU

More information

HPC future trends from a science perspective

HPC future trends from a science perspective HPC future trends from a science perspective Simon McIntosh-Smith University of Bristol HPC Research Group simonm@cs.bris.ac.uk 1 Business as usual? We've all got used to new machines being relatively

More information

Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation

Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation Michael Lange 1 Gerard Gorman 1 Michele Weiland 2 Lawrence Mitchell 2 Xiaohu Guo 3 James Southern 4 1 AMCG, Imperial College

More information

AMS527: Numerical Analysis II

AMS527: Numerical Analysis II AMS527: Numerical Analysis II A Brief Overview of Finite Element Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao SUNY Stony Brook AMS527: Numerical Analysis II 1 / 25 Overview Basic concepts Mathematical

More information

Securing Cloud Applications with a Distributed Web Application Firewall Riverbed Technology

Securing Cloud Applications with a Distributed Web Application Firewall Riverbed Technology Securing Cloud Applications with a Distributed Web Application Firewall www.riverbed.com 2013 Riverbed Technology Primary Target of Attack Shifting from Networks and Infrastructure to Applications NETWORKS

More information

VISUAL APPLICATION CREATION AND PUBLISHING FOR ANYONE

VISUAL APPLICATION CREATION AND PUBLISHING FOR ANYONE Oracle Autonomous Visual Builder Cloud Service provides an easy way to create and host web and mobile applications in a secure cloud environment. An intuitive visual development experience on top of a

More information

Geometry. Prof. George Wolberg Dept. of Computer Science City College of New York

Geometry. Prof. George Wolberg Dept. of Computer Science City College of New York Geometry Prof. George Wolberg Dept. of Computer Science City College of New York Objectives Introduce the elements of geometry -Scalars - Vectors - Points Develop mathematical operations among them in

More information

Accelerating the Iterative Linear Solver for Reservoir Simulation

Accelerating the Iterative Linear Solver for Reservoir Simulation Accelerating the Iterative Linear Solver for Reservoir Simulation Wei Wu 1, Xiang Li 2, Lei He 1, Dongxiao Zhang 2 1 Electrical Engineering Department, UCLA 2 Department of Energy and Resources Engineering,

More information

Preliminaries. Chapter The FEniCS Project

Preliminaries. Chapter The FEniCS Project Chapter 1 Preliminaries 1.1 The FEniCS Project The FEniCS Project is a research and software project aimed at creating mathematical methods and software for automated computational mathematical modeling.

More information

Headline in Arial Bold 30pt. Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008

Headline in Arial Bold 30pt. Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008 Headline in Arial Bold 30pt Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008 Agenda Visualisation Today User Trends Technology Trends Grid Viz Nodes Software Ecosystem

More information

CURRENT RESEARCH ON EXPLORATORY LANDSCAPE ANALYSIS

CURRENT RESEARCH ON EXPLORATORY LANDSCAPE ANALYSIS CURRENT RESEARCH ON EXPLORATORY LANDSCAPE ANALYSIS HEIKE TRAUTMANN. MIKE PREUSS. 1 EXPLORATORY LANDSCAPE ANALYSIS effective and sophisticated approach to characterize properties of optimization problems

More information

HPC with Multicore and GPUs

HPC with Multicore and GPUs HPC with Multicore and GPUs Stan Tomov Electrical Engineering and Computer Science Department University of Tennessee, Knoxville COSC 594 Lecture Notes March 22, 2017 1/20 Outline Introduction - Hardware

More information

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation 1 Cheng-Han Du* I-Hsin Chung** Weichung Wang* * I n s t i t u t e o f A p p l i e d M

More information

The area of this square is 9 square units. This array has 2 rows and 5 columns. So, 2 5 = 10. The gas container has a capacity of 5 gallons.

The area of this square is 9 square units. This array has 2 rows and 5 columns. So, 2 5 = 10. The gas container has a capacity of 5 gallons. Third Grade 434 Area: The number of nonoverlapping units that cover a closed boundary (measured in square units). Array: An arrangement of objects in a regular pattern, usually rows and columns. Arrays

More information

Tackling Complexity in High Performance Computing Applications

Tackling Complexity in High Performance Computing Applications Int J Parallel Prog (2017) 45:402 420 DOI 10.1007/s10766-016-0422-9 Tackling Complexity in High Performance Computing Applications J. Darlington 1 A. J. Field 1 L. Hakim 1 Received: 10 September 2015 /

More information

MAGMA. LAPACK for GPUs. Stan Tomov Research Director Innovative Computing Laboratory Department of Computer Science University of Tennessee, Knoxville

MAGMA. LAPACK for GPUs. Stan Tomov Research Director Innovative Computing Laboratory Department of Computer Science University of Tennessee, Knoxville MAGMA LAPACK for GPUs Stan Tomov Research Director Innovative Computing Laboratory Department of Computer Science University of Tennessee, Knoxville Keeneland GPU Tutorial 2011, Atlanta, GA April 14-15,

More information

An Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language

An Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language An Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language Martin C. Rinard (martin@cs.ucsb.edu) Department of Computer Science University

More information

An HPC Implementation of the Finite Element Method

An HPC Implementation of the Finite Element Method An HPC Implementation of the Finite Element Method John Rugis Interdisciplinary research group: David Yule Physiology James Sneyd, Shawn Means, Di Zhu Mathematics John Rugis Computer Science Project funding:

More information

KNITRO NLP Solver. Todd Plantenga, Ziena Optimization, Inc. US-Mexico Workshop on Optimization January 2007, Huatulco, Mexico

KNITRO NLP Solver. Todd Plantenga, Ziena Optimization, Inc. US-Mexico Workshop on Optimization January 2007, Huatulco, Mexico KNITRO NLP Solver Todd Plantenga, Ziena Optimization, Inc. US-Mexico Workshop on Optimization January 2007, Huatulco, Mexico What is KNITRO? Commercial software for general NLP Fully embeddable software

More information

Why Use the GPU? How to Exploit? New Hardware Features. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. Semiconductor trends

Why Use the GPU? How to Exploit? New Hardware Features. Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. Semiconductor trends Imagine stream processor; Bill Dally, Stanford Connection Machine CM; Thinking Machines Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid Jeffrey Bolz Eitan Grinspun Caltech Ian Farmer

More information

MAGMA. Matrix Algebra on GPU and Multicore Architectures

MAGMA. Matrix Algebra on GPU and Multicore Architectures MAGMA Matrix Algebra on GPU and Multicore Architectures Innovative Computing Laboratory Electrical Engineering and Computer Science University of Tennessee Piotr Luszczek (presenter) web.eecs.utk.edu/~luszczek/conf/

More information

MAP4C Learning Goals and Success Criteria Overall Expectation By the end of this course students will be able to:

MAP4C Learning Goals and Success Criteria Overall Expectation By the end of this course students will be able to: MAP4C Learning Goals and Success Criteria Overall Expectation By the end of this course students will be able to: Learning Goal I will be able to: A. Mathematical Models (MM) UNIT TWO MM1 Solve Exponential

More information

Sparse Direct Solvers for Extreme-Scale Computing

Sparse Direct Solvers for Extreme-Scale Computing Sparse Direct Solvers for Extreme-Scale Computing Iain Duff Joint work with Florent Lopez and Jonathan Hogg STFC Rutherford Appleton Laboratory SIAM Conference on Computational Science and Engineering

More information

TGrid 5.0 Tutorial Guide

TGrid 5.0 Tutorial Guide TGrid 5.0 Tutorial Guide April 2008 Copyright c 2008 by ANSYS, Inc. All Rights Reserved. No part of this document may be reproduced or otherwise used in any form without express written permission from

More information

GPUML: Graphical processors for speeding up kernel machines

GPUML: Graphical processors for speeding up kernel machines GPUML: Graphical processors for speeding up kernel machines http://www.umiacs.umd.edu/~balajiv/gpuml.htm Balaji Vasan Srinivasan, Qi Hu, Ramani Duraiswami Department of Computer Science, University of

More information

MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures

MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures Stan Tomov Innovative Computing Laboratory University of Tennessee, Knoxville OLCF Seminar Series, ORNL June 16, 2010

More information

Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM

Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM Alexander Monakov, amonakov@ispras.ru Institute for System Programming of Russian Academy of Sciences March 20, 2013 1 / 17 Problem Statement In OpenFOAM,

More information

WinBioinfTools: Bioinformatics Tools for Windows High Performance Computing Server Mohamed Abouelhoda Nile University

WinBioinfTools: Bioinformatics Tools for Windows High Performance Computing Server Mohamed Abouelhoda Nile University WinBioinfTools: Bioinformatics Tools for Windows High Performance Computing Server 2008 joint project between Nile University, Microsoft Egypt, and Cairo Microsoft Innovation Center Mohamed Abouelhoda

More information

Hybrid MPI + OpenMP Approach to Improve the Scalability of a Phase-Field-Crystal Code

Hybrid MPI + OpenMP Approach to Improve the Scalability of a Phase-Field-Crystal Code Hybrid MPI + OpenMP Approach to Improve the Scalability of a Phase-Field-Crystal Code Reuben D. Budiardja reubendb@utk.edu ECSS Symposium March 19 th, 2013 Project Background Project Team (University of

More information

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization

More information

Intel Parallel Amplifier

Intel Parallel Amplifier Intel Parallel Amplifier Product Brief Intel Parallel Amplifier Optimize Performance and Scalability Intel Parallel Amplifier makes it simple to quickly find multicore performance bottlenecks without needing

More information

Unit 4, Lesson 14: Fractional Lengths in Triangles and Prisms

Unit 4, Lesson 14: Fractional Lengths in Triangles and Prisms Unit 4, Lesson 14: Fractional Lengths in Triangles and Prisms Lesson Goals Use multiplication and division to solve problems involving fractional areas and lengths in triangles. cubes with fractional edge

More information

SENSEI / SENSEI-Lite / SENEI-LDC Updates

SENSEI / SENSEI-Lite / SENEI-LDC Updates SENSEI / SENSEI-Lite / SENEI-LDC Updates Chris Roy and Brent Pickering Aerospace and Ocean Engineering Dept. Virginia Tech July 23, 2014 Collaborations with Math Collaboration on the implicit SENSEI-LDC

More information

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report ESPRESO ExaScale PaRallel FETI Solver Hybrid FETI Solver Report Lubomir Riha, Tomas Brzobohaty IT4Innovations Outline HFETI theory from FETI to HFETI communication hiding and avoiding techniques our new

More information

A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council. Perth, July 31-Aug 01, 2017

A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council. Perth, July 31-Aug 01, 2017 A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council Perth, July 31-Aug 01, 2017 http://levlafayette.com Necessary and Sufficient Definitions High Performance Computing: High

More information

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction

FOR P3: A monolithic multigrid FEM solver for fluid structure interaction FOR 493 - P3: A monolithic multigrid FEM solver for fluid structure interaction Stefan Turek 1 Jaroslav Hron 1,2 Hilmar Wobker 1 Mudassar Razzaq 1 1 Institute of Applied Mathematics, TU Dortmund, Germany

More information

Minion: Fast, Scalable Constraint Solving. Ian Gent, Chris Jefferson, Ian Miguel

Minion: Fast, Scalable Constraint Solving. Ian Gent, Chris Jefferson, Ian Miguel Minion: Fast, Scalable Constraint Solving Ian Gent, Chris Jefferson, Ian Miguel 1 60 Second Introduction to CSPs Standard Definition A CSP is a tuple V: list of variables D: a domain for each

More information

trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell 05/12/2015 IWOCL 2015 SYCL Tutorial Khronos OpenCL SYCL committee

trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell 05/12/2015 IWOCL 2015 SYCL Tutorial Khronos OpenCL SYCL committee trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell Khronos OpenCL SYCL committee 05/12/2015 IWOCL 2015 SYCL Tutorial OpenCL SYCL committee work... Weekly telephone meeting Define

More information

Computer Graphics Hands-on

Computer Graphics Hands-on Computer Graphics Hands-on Two-Dimensional Transformations Objectives Visualize the fundamental 2D geometric operations translation, rotation about the origin, and scale about the origin Learn how to compose

More information

1d Determining the derivative 1 Solving 2 First alternative: Determine the points where the graphs of and meet:

1d Determining the derivative 1 Solving 2 First alternative: Determine the points where the graphs of and meet: Marking scheme for the VWO Mathematics-B Exam of 3 July 04 Question. a. Determining the coordinates of point Taking the first derivative: Calculating the slope of the tangent line: Determining a formula

More information

NEW ADVANCES IN GPU LINEAR ALGEBRA

NEW ADVANCES IN GPU LINEAR ALGEBRA GTC 2012: NEW ADVANCES IN GPU LINEAR ALGEBRA Kyle Spagnoli EM Photonics 5/16/2012 QUICK ABOUT US» HPC/GPU Consulting Firm» Specializations in:» Electromagnetics» Image Processing» Fluid Dynamics» Linear

More information

Intelligent Reduction of Tire Noise

Intelligent Reduction of Tire Noise Intelligent Reduction of Tire Noise Matthias Becker and Helena Szczerbicka University Hannover Welfengarten 3067 Hannover, Germany xmb@sim.uni-hannover.de Abstract. In this paper we report about deployment

More information

Workshop 3: Cutcell Mesh Generation. Introduction to ANSYS Fluent Meshing Release. Release ANSYS, Inc.

Workshop 3: Cutcell Mesh Generation. Introduction to ANSYS Fluent Meshing Release. Release ANSYS, Inc. Workshop 3: Cutcell Mesh Generation 14.5 Release Introduction to ANSYS Fluent Meshing 1 2011 ANSYS, Inc. December 21, 2012 I Introduction Workshop Description: CutCell meshing is a general purpose meshing

More information

HPC code modernization with Intel development tools

HPC code modernization with Intel development tools HPC code modernization with Intel development tools Bayncore, Ltd. Intel HPC Software Workshop Series 2016 HPC Code Modernization for Intel Xeon and Xeon Phi February 17 th 2016, Barcelona Microprocessor

More information

ANNUAL REPORT Visit us at project.eu Supported by. Mission

ANNUAL REPORT Visit us at   project.eu Supported by. Mission Mission ANNUAL REPORT 2011 The Web has proved to be an unprecedented success for facilitating the publication, use and exchange of information, at planetary scale, on virtually every topic, and representing

More information