Paralution & ViennaCL

1 Paralution & ViennaCL Clemens Schiffer June 12, 2014 Clemens Schiffer (Uni Graz) Paralution & ViennaCL June 12, / 32

2 Introduction

3 Idea of Paralution
- Package for iterative solvers/preconditioners
- Additional abstraction layer between the user's preferred program and varying hardware
- Code independent of platform and hardware backend
- Futureproof

4 Installation
- No root access required
- Install using make/cmake
- Library & header based
- I had to specify the CUDA root directory:
    cmake -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda ..
- Set the environment variable:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/paralution/build/lib

5 Using Paralution: Basic Structure

    #include <paralution.hpp>

    using namespace paralution;

    int main(int argc, char* argv[]) {
      init_paralution();
      info_paralution();  // optional

      // your paralution code goes here

      stop_paralution();
      return 0;
    }

6 Compilation and Linking
Use

    g++ -O3 -Wall -I/paralution/build/inc -c main.cpp -o main.o
    g++ -o main main.o -L/paralution/build/lib/ -lparalution -lOpenCL

Or modify your Makefile:

    CXXFLAGS += -I/paralution/build/inc
    LINKFLAGS += -L/paralution/build/lib/ -lparalution -lOpenCL

7 Info Paralution Output

    Number of CPU cores: 8
    Host thread affinity policy - thread mapping on every core
    Number of GPU devices in the system: 1
    PARALUTION ver
    PARALUTION platform is initialized
    Accelerator backend: GPU(CUDA)
    OpenMP threads:8
    Selected GPU device:
    Device number: 0
    Device name: GeForce GTX 680
    totalglobalmem: 4095 MByte
    clockrate:
    compute capability: 3.0
    ECCEnabled:

8 Simple Example: Apply Matrix to Vector

    LocalVector<double> x;
    LocalVector<double> rhs;
    LocalMatrix<double> mat;

    mat.ReadFileMTX("my_matrix.mtx");
    x.ReadFileASCII("my_vector.dat");
    rhs.Allocate("rhs", mat.get_nrow());

    mat.Apply(x, &rhs);  // perform rhs <- Ax
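What mat.Apply(x, &rhs) computes is a sparse matrix-vector product. The following is a minimal plain C++ sketch of that operation, assuming the matrix is stored in CSR format. The names CsrMatrix and csr_apply are hypothetical and chosen for illustration only; they are not part of the Paralution API, and the library's own implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not a Paralution type.
struct CsrMatrix {
  std::size_t nrow;
  std::size_t ncol;
  std::vector<std::size_t> row_ptr;  // size nrow + 1, offsets into col_idx/val
  std::vector<std::size_t> col_idx;  // column index of each stored entry
  std::vector<double> val;           // value of each stored entry
};

// Computes y = A * x for a CSR matrix A.
std::vector<double> csr_apply(const CsrMatrix& a, const std::vector<double>& x) {
  assert(x.size() == a.ncol);
  assert(a.row_ptr.size() == a.nrow + 1);
  std::vector<double> y(a.nrow, 0.0);
  for (std::size_t i = 0; i < a.nrow; ++i) {
    // Row i owns the entries row_ptr[i] .. row_ptr[i + 1] - 1.
    for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
      y[i] += a.val[k] * x[a.col_idx[k]];
    }
  }
  return y;
}
```

Each row only touches its own stored nonzeros, which is why rows can be processed independently on multicore CPUs or GPUs.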

9 Simple Example: On the Accelerator (GPU, ...)

    LocalVector<double> x;
    LocalVector<double> rhs;
    LocalMatrix<double> mat;

    mat.ReadFileMTX("my_matrix.mtx");
    x.ReadFileASCII("my_vector.dat");
    rhs.Allocate("rhs", mat.get_nrow());

    mat.MoveToAccelerator();
    x.MoveToAccelerator();
    rhs.MoveToAccelerator();

    mat.Apply(x, &rhs);  // perform rhs <- Ax

10 Info Paralution Output
Calling mat.info(); will produce the output:

    LocalMatrix name=l100.mtx; rows=10000; cols=10000; nnz=49600; prec=64bit; asm=no; format=csr; host backend={cpu(openmp)}; accelerator backend={opencl}; current=opencl

If an operation cannot be performed efficiently on the accelerator:

    *** warning: LocalMatrix::ConvertTo() is performed on the host

12 Linear Solver: CG

    CG<LocalMatrix<double>, LocalVector<double>, double> ls;

    ls.SetOperator(mat);
    ls.Build();
    ls.Solve(rhs, &x);  // solve Ax = rhs
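For readers who want to see what a CG solve does internally, here is a minimal self-contained sketch of the unpreconditioned conjugate gradient iteration. It assumes a dense row-major symmetric positive definite matrix to keep the code short, whereas the solver in the talk works on sparse matrices and accelerator backends. The names cg_matvec, cg_dot and cg_solve, and the default tolerance, are hypothetical and not Paralution identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dense row-major matrix-vector product y = A * x (A is n x n).
Vec cg_matvec(const Vec& a, const Vec& x) {
  const std::size_t n = x.size();
  assert(a.size() == n * n);
  Vec y(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      y[i] += a[i * n + j] * x[j];
  return y;
}

double cg_dot(const Vec& u, const Vec& v) {
  assert(u.size() == v.size());
  double s = 0.0;
  for (std::size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
  return s;
}

// Unpreconditioned CG for a symmetric positive definite A.
// x holds the initial guess on entry and the approximate solution on exit.
// Returns the number of iterations performed.
int cg_solve(const Vec& a, const Vec& rhs, Vec& x,
             double rel_tol = 1e-10, int max_iter = 1000) {
  assert(rhs.size() == x.size());
  Vec r = rhs;
  const Vec ax = cg_matvec(a, x);
  for (std::size_t i = 0; i < r.size(); ++i) r[i] -= ax[i];  // r = rhs - A x
  Vec p = r;
  double rr = cg_dot(r, r);
  const double stop = rel_tol * rel_tol * rr;  // compare squared norms
  int it = 0;
  while (it < max_iter && rr > stop) {
    const Vec ap = cg_matvec(a, p);
    const double alpha = rr / cg_dot(p, ap);  // step length
    for (std::size_t i = 0; i < x.size(); ++i) {
      x[i] += alpha * p[i];
      r[i] -= alpha * ap[i];
    }
    const double rr_new = cg_dot(r, r);
    const double beta = rr_new / rr;  // makes the new direction A-conjugate
    for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
    rr = rr_new;
    ++it;
  }
  return it;
}
```

Per iteration the method needs one matrix-vector product and a handful of vector updates and dot products, which is what makes it a good fit for accelerator backends.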

13 Linear Solver: PCG

    CG<LocalMatrix<double>, LocalVector<double>, double> ls;
    Jacobi<LocalMatrix<double>, LocalVector<double>, double> p;

    ls.SetOperator(mat);
    ls.SetPreconditioner(p);
    ls.Build();
    ls.Solve(rhs, &x);  // solve Ax = rhs
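The Jacobi preconditioner is the simplest choice: it approximates A by its diagonal D, so applying it means computing z = D^{-1} r in every iteration. A minimal sketch, assuming a dense row-major matrix for brevity; jacobi_apply is a hypothetical name and not a Paralution function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies the Jacobi (diagonal) preconditioner: z = D^{-1} r, where D is
// the diagonal of the dense row-major n x n matrix a.
std::vector<double> jacobi_apply(const std::vector<double>& a,
                                 const std::vector<double>& r) {
  const std::size_t n = r.size();
  assert(a.size() == n * n);
  std::vector<double> z(n);
  for (std::size_t i = 0; i < n; ++i) {
    assert(a[i * n + i] != 0.0);  // Jacobi needs a nonzero diagonal
    z[i] = r[i] / a[i * n + i];
  }
  return z;
}
```

In preconditioned CG the vector z takes the place of the residual r when the search direction is updated, at the cost of one element-wise division per iteration.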

14 Custom Iteration Control

    CG<LocalMatrix<double>, LocalVector<double>, double> ls;
    Jacobi<LocalMatrix<double>, LocalVector<double>, double> p;

    ls.Init(1e-10,   // abs_tol
            1e-8,    // rel_tol
            1e+8,    // div_tol
            10000);  // max_iter

    ls.SetOperator(mat);
    ls.SetPreconditioner(p);
    ls.Build();
    ls.Solve(rhs, &x);  // solve Ax = rhs
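One plausible reading of the four Init() arguments is sketched below: stop as converged when the residual norm falls below abs_tol or below rel_tol relative to the initial residual, stop as diverged when the relative residual exceeds div_tol, and give up after max_iter iterations. This interpretation is an assumption made for illustration; IterStatus and check_iteration are hypothetical names, and the library's exact criteria may differ.

```cpp
#include <cassert>

enum class IterStatus { Continue, Converged, Diverged, MaxIterReached };

// Hypothetical stopping test mirroring the four Init() arguments.
// res is the current residual norm, res0 the initial residual norm.
IterStatus check_iteration(double res, double res0, int iter,
                           double abs_tol, double rel_tol,
                           double div_tol, int max_iter) {
  assert(res >= 0.0 && res0 > 0.0);
  if (res <= abs_tol) return IterStatus::Converged;         // absolute test
  if (res / res0 <= rel_tol) return IterStatus::Converged;  // relative test
  if (res / res0 >= div_tol) return IterStatus::Diverged;   // blow-up guard
  if (iter >= max_iter) return IterStatus::MaxIterReached;
  return IterStatus::Continue;
}
```

With the values from the slide, a residual that has dropped from 1e6 to 1e-3 counts as converged through the relative test even though it is still far above abs_tol.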

15 Available Solvers/Preconditioners

17 Switching the Backend
- No recompilation needed, just switch the library
- E.g. with a second installation in /paralution_cl, built with
    cmake -DSUPPORT_CUDA=OFF -DSUPPORT_OCL=ON ..
  just changing
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/paralution_cl/build/lib
  will make the executable run using OpenCL.

18 Matlab Plug-in
- Consists of an example file, paralution_pcg.cpp, that can be modified easily
- Compile it into a MEX-file
- It can then be called in MATLAB as a normal function

19 Matlab Plug-in: Details
Required some extra attention:
- Finding mex:
    export PATH=$PATH:/usr/local/MATLAB/R2013a/bin
- Using an older compiler:
    sudo rm /usr/bin/gcc
    sudo rm /usr/bin/g++
    sudo ln -s /usr/bin/gcc-4.4 /usr/bin/gcc
    sudo ln -s /usr/bin/g++-4.4 /usr/bin/g++
    cd /usr/local/MATLAB/R2013a/sys/os/glnxa64
    sudo unlink libstdc++.so.6
    sudo ln -s /usr/lib/libstdc++.so.6

20 Other Plugins
- FORTRAN
- OpenFOAM
- Deal.II
- Elmer
- Hermes/Agros2D

21 Pros
- Easy to use
- Portable
- Open Source
- Many preconditioners/solvers
Cons
- No MPI yet
- No stencils
- (No CUDA for CC < 2.0)
- In development
- Futureproof...?

22 Introduction

23 Idea of ViennaCL
- Linear algebra and iterative solvers/preconditioners
- Additional abstraction layer between the user's preferred program and varying hardware
- Code independent of platform and hardware backend: header based

24 Details
- More linear algebra: dense matrices, slicing, extraction, etc.
- Compatible with uBLAS: just changing the namespace is enough
- Completely header based, no installation needed
- CMake is only required to build the examples (highly recommended, though)

25 Simple Example

    #include "viennacl/scalar.hpp"
    // ...
    using namespace viennacl;
    // ...
    typedef float ScalarType;

    matrix<ScalarType> vcl_a(N, M);
    vector<ScalarType> vcl_x(M);
    vector<ScalarType> vcl_rhs(N);

    // standard vectors ensure linear memory -> fast_copy
    std::vector<ScalarType> stl_x(M);
    std::vector<ScalarType> stl_a(N * M);
    // ... fill with data

    fast_copy(&(stl_a[0]), &(stl_a[0]) + stl_a.size(), vcl_a);
    fast_copy(&(stl_x[0]), &(stl_x[0]) + stl_x.size(), vcl_x);

    vcl_rhs = linalg::prod(vcl_a, vcl_x);

26 Direct Solvers

    using namespace viennacl;

    matrix<ScalarType> vcl_a;
    vector<ScalarType> vcl_rhs;
    // ... fill with data

    // LU factorization, then substitution:
    linalg::lu_factorize(vcl_a);
    linalg::lu_substitute(vcl_a, vcl_rhs);
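The two-step pattern above (factorize once, then substitute) can be illustrated with a small plain C++ sketch. It assumes a dense row-major matrix and an in-place factorization without pivoting; whether the library pivots is not stated on the slide, so treat that as an assumption. The names lu_factorize_inplace and lu_substitute_inplace are hypothetical and are not the ViennaCL functions themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-place LU factorization without pivoting of a dense row-major n x n
// matrix. Afterwards the strict lower triangle holds L (unit diagonal
// implied) and the upper triangle including the diagonal holds U.
void lu_factorize_inplace(std::vector<double>& a, std::size_t n) {
  assert(a.size() == n * n);
  for (std::size_t k = 0; k < n; ++k) {
    assert(a[k * n + k] != 0.0);  // no pivoting: a zero pivot cannot be handled
    for (std::size_t i = k + 1; i < n; ++i) {
      a[i * n + k] /= a[k * n + k];  // multiplier l_ik
      for (std::size_t j = k + 1; j < n; ++j)
        a[i * n + j] -= a[i * n + k] * a[k * n + j];
    }
  }
}

// Solves L U x = b using the packed factors; b is overwritten with x.
void lu_substitute_inplace(const std::vector<double>& lu, std::size_t n,
                           std::vector<double>& b) {
  assert(lu.size() == n * n && b.size() == n);
  // Forward substitution: L y = b (unit diagonal, so no division).
  for (std::size_t i = 1; i < n; ++i)
    for (std::size_t j = 0; j < i; ++j)
      b[i] -= lu[i * n + j] * b[j];
  // Backward substitution: U x = y.
  for (std::size_t i = n; i-- > 0;) {
    for (std::size_t j = i + 1; j < n; ++j)
      b[i] -= lu[i * n + j] * b[j];
    b[i] /= lu[i * n + i];
  }
}
```

Separating the two steps pays off when several right-hand sides share one matrix: the O(n^3) factorization runs once and each further solve costs only O(n^2).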

27 Iterative Solvers

    using namespace viennacl::linalg;
    // ...
    compressed_matrix<ScalarType> vcl_matrix;
    // ... fill with data

    // conjugate gradient:
    vcl_result = solve(vcl_matrix, vcl_rhs, cg_tag());

    // BiCGStab:
    vcl_result = solve(vcl_matrix, vcl_rhs, bicgstab_tag());

    // GMRES:
    vcl_result = solve(vcl_matrix, vcl_rhs, gmres_tag());

28 Iteration Control

    using namespace viennacl::linalg;
    // ...
    compressed_matrix<ScalarType> vcl_matrix;
    // ... fill with data

    cg_tag custom_cg(1e-10, 100);  // relative tol, max_iter

    // conjugate gradient:
    vcl_result = solve(vcl_matrix, vcl_rhs, custom_cg);
    cout << "No. of iters: " << custom_cg.iters() << endl;
    cout << "Est. error: " << custom_cg.error() << endl;

    // BiCGStab:
    vcl_result = solve(vcl_matrix, vcl_rhs, bicgstab_tag());

    // GMRES:
    vcl_result = solve(vcl_matrix, vcl_rhs, gmres_tag());

29 Preconditioning

    using namespace viennacl::linalg;
    // ...
    // Incomplete LU factorization with threshold
    ilut_tag ilut_config(max_entries,  // # nz row elements in L/U
                         drop_tol,     // minimal value of L/U
                         true);        // level scheduling: substitution in parallel if possible

    ilut_precond<SparseMatrix> vcl_ilut(vcl_matrix, ilut_config);

    // PCG
    vcl_result = solve(vcl_matrix, vcl_rhs, cg_tag(), vcl_ilut);

Other preconditioners: ILU0, Block-ILU, Jacobi, Row Scaling
Experimental: AMG, SPAI

30 pyviennacl

    import pyviennacl as p
    import numpy as np
    from scipy import io
    from my_read_mtx import read_mtx
    # from util import read_mtx, read_vector

    # B = io.mmread("L20.mtx")  # not yet supported
    A = read_mtx("L20.mtx", dtype=np.float64)
    b = p.Vector(20*20, 1.0, dtype=np.float64)
    x = p.Vector(20*20, 1.0, dtype=np.float64)

    tag = p.gmres_tag(tolerance=1e-5, max_iterations=500, krylo
    # tag = p.cg_tag(tolerance=1e-8, max_iterations=150)
    x = p.solve(A, b, tag)

    # Show some info
    print("Num. iterations: %s" % tag.iters)
    print("Estimated error: %s" % tag.error)
    print("True error: %s" % (A*x-b).norm(2))

31 Pros
- Portable!
- Open Source
- More linear algebra (uBLAS, py)
Cons
- No MPI
- In development
- Futureproof...?

32 Thank you for your attention! Questions?

PARALUTION - a Library for Iterative Sparse Methods on CPU and GPU

PARALUTION - a Library for Iterative Sparse Methods on CPU and GPU - a Library for Iterative Sparse Methods on CPU and GPU Dimitar Lukarski Division of Scientific Computing Department of Information Technology Uppsala Programming for Multicore Architectures Research Center

More information

iennacl GPU-accelerated Linear Algebra at the Convenience of the C++ Boost Libraries Karl Rupp

iennacl GPU-accelerated Linear Algebra at the Convenience of the C++ Boost Libraries Karl Rupp GPU-accelerated Linear Algebra at the Convenience of the C++ Boost Libraries Karl Rupp Mathematics and Computer Science Division Argonne National Laboratory based on previous work at Technische Universität

More information

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER AVISHA DHISLE PRERIT RODNEY ADHISLE PRODNEY 15618: PARALLEL COMPUTER ARCHITECTURE PROF. BRYANT PROF. KAYVON LET S

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

Structure-preserving Smoothing for Seismic Amplitude Data by Anisotropic Diffusion using GPGPU

Structure-preserving Smoothing for Seismic Amplitude Data by Anisotropic Diffusion using GPGPU GPU Technology Conference 2016 April, 4-7 San Jose, CA, USA Structure-preserving Smoothing for Seismic Amplitude Data by Anisotropic Diffusion using GPGPU Joner Duarte jduartejr@tecgraf.puc-rio.br Outline

More information

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC

More information

Multi-GPU simulations in OpenFOAM with SpeedIT technology.

Multi-GPU simulations in OpenFOAM with SpeedIT technology. Multi-GPU simulations in OpenFOAM with SpeedIT technology. Attempt I: SpeedIT GPU-based library of iterative solvers for Sparse Linear Algebra and CFD. Current version: 2.2. Version 1.0 in 2008. CMRS format

More information

Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM

Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM Efficient Multi-GPU CUDA Linear Solvers for OpenFOAM Alexander Monakov, amonakov@ispras.ru Institute for System Programming of Russian Academy of Sciences March 20, 2013 1 / 17 Problem Statement In OpenFOAM,

More information

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015

AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 AmgX 2.0: Scaling toward CORAL Joe Eaton, November 19, 2015 Agenda Introduction to AmgX Current Capabilities Scaling V2.0 Roadmap for the future 2 AmgX Fast, scalable linear solvers, emphasis on iterative

More information

ViennaCL and PETSc Tutorial

ViennaCL and PETSc Tutorial ViennaCL and PETSc Tutorial Karl Rupp rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC 2013 May 23th, 2013 Part 1 iennacl Vienna Computing Library http://viennacl.sourceforge.net/

More information

OpenMP and MPI parallelization

OpenMP and MPI parallelization OpenMP and MPI parallelization Gundolf Haase Institute for Mathematics and Scientific Computing University of Graz, Austria Chile, Jan. 2015 OpenMP for our example OpenMP generation in code Determine matrix

More information

CUDA Accelerated Compute Libraries. M. Naumov

CUDA Accelerated Compute Libraries. M. Naumov CUDA Accelerated Compute Libraries M. Naumov Outline Motivation Why should you use libraries? CUDA Toolkit Libraries Overview of performance CUDA Proprietary Libraries Address specific markets Third Party

More information

Sparse Matrices. This means that for increasing problem size the matrices become sparse and sparser. O. Rheinbach, TU Bergakademie Freiberg

Sparse Matrices. This means that for increasing problem size the matrices become sparse and sparser. O. Rheinbach, TU Bergakademie Freiberg Sparse Matrices Many matrices in computing only contain a very small percentage of nonzeros. Such matrices are called sparse ( dünn besetzt ). Often, an upper bound on the number of nonzeros in a row can

More information

Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems

Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems Dr.-Ing. Achim Basermann, Melven Zöllner** German Aerospace Center (DLR) Simulation- and Software Technology

More information

Lessons Learned in Developing the Linear Algebra Library ViennaCL

Lessons Learned in Developing the Linear Algebra Library ViennaCL Lessons Learned in Developing the Linear Algebra Library ViennaCL Florian Rudolf 1, Karl Rupp 1,2, Josef Weinbub 1 http://karlrupp.net/ 1 Institute for Microelectronics 2 Institute for Analysis and Scientific

More information

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations Hartwig Anzt 1, Marc Baboulin 2, Jack Dongarra 1, Yvan Fournier 3, Frank Hulsemann 3, Amal Khabou 2, and Yushan Wang 2 1 University

More information

Report of Linear Solver Implementation on GPU

Report of Linear Solver Implementation on GPU Report of Linear Solver Implementation on GPU XIANG LI Abstract As the development of technology and the linear equation solver is used in many aspects such as smart grid, aviation and chemical engineering,

More information

Eigen Tutorial. CS2240 Interactive Computer Graphics

Eigen Tutorial. CS2240 Interactive Computer Graphics CS2240 Interactive Computer Graphics CS2240 Interactive Computer Graphics Introduction Eigen is an open-source linear algebra library implemented in C++. It s fast and well-suited for a wide range of tasks,

More information

nag sparse nsym sol (f11dec)

nag sparse nsym sol (f11dec) f11 Sparse Linear Algebra f11dec nag sparse nsym sol (f11dec) 1. Purpose nag sparse nsym sol (f11dec) solves a real sparse nonsymmetric system of linear equations, represented in coordinate storage format,

More information

FASP User Guide. FASP Developers. Version 2.0.5

FASP User Guide. FASP Developers. Version 2.0.5 FASP User Guide FASP Developers Version 2.0.5 Contents Contents 1 1 Introduction 3 1.1 General description..................................... 3 1.2 Roadmap: from basics to complex applications.....................

More information

OpenFOAM + GPGPU. İbrahim Özküçük

OpenFOAM + GPGPU. İbrahim Özküçük OpenFOAM + GPGPU İbrahim Özküçük Outline GPGPU vs CPU GPGPU plugins for OpenFOAM Overview of Discretization CUDA for FOAM Link (cufflink) Cusp & Thrust Libraries How Cufflink Works Performance data of

More information

Accelerated ANSYS Fluent: Algebraic Multigrid on a GPU. Robert Strzodka NVAMG Project Lead

Accelerated ANSYS Fluent: Algebraic Multigrid on a GPU. Robert Strzodka NVAMG Project Lead Accelerated ANSYS Fluent: Algebraic Multigrid on a GPU Robert Strzodka NVAMG Project Lead A Parallel Success Story in Five Steps 2 Step 1: Understand Application ANSYS Fluent Computational Fluid Dynamics

More information

fspai-1.0 Factorized Sparse Approximate Inverse Preconditioner

fspai-1.0 Factorized Sparse Approximate Inverse Preconditioner fspai-1.0 Factorized Sparse Approximate Inverse Preconditioner Thomas Huckle Matous Sedlacek 2011 08 01 Technische Universität München Research Unit Computer Science V Scientific Computing in Computer

More information

Performance of deal.ii on a node

Performance of deal.ii on a node Performance of deal.ii on a node Bruno Turcksin Texas A&M University, Dept. of Mathematics Bruno Turcksin Deal.II on a node 1/37 Outline 1 Introduction 2 Architecture 3 Paralution 4 Other Libraries 5 Conclusions

More information

NEW ADVANCES IN GPU LINEAR ALGEBRA

NEW ADVANCES IN GPU LINEAR ALGEBRA GTC 2012: NEW ADVANCES IN GPU LINEAR ALGEBRA Kyle Spagnoli EM Photonics 5/16/2012 QUICK ABOUT US» HPC/GPU Consulting Firm» Specializations in:» Electromagnetics» Image Processing» Fluid Dynamics» Linear

More information

Don t reinvent the wheel. BLAS LAPACK Intel Math Kernel Library

Don t reinvent the wheel. BLAS LAPACK Intel Math Kernel Library Libraries Don t reinvent the wheel. Specialized math libraries are likely faster. BLAS: Basic Linear Algebra Subprograms LAPACK: Linear Algebra Package (uses BLAS) http://www.netlib.org/lapack/ to download

More information

Introduction to PETSc KSP, PC. CS595, Fall 2010

Introduction to PETSc KSP, PC. CS595, Fall 2010 Introduction to PETSc KSP, PC CS595, Fall 2010 1 Linear Solution Main Routine PETSc Solve Ax = b Linear Solvers (KSP) PC Application Initialization Evaluation of A and b Post- Processing User code PETSc

More information

Due Date: See Blackboard

Due Date: See Blackboard Source File: ~/2315/45/lab45.(C CPP cpp c++ cc cxx cp) Input: under control of main function Output: under control of main function Value: 4 Integer data is usually represented in a single word on a computer.

More information

fspai-1.1 Factorized Sparse Approximate Inverse Preconditioner

fspai-1.1 Factorized Sparse Approximate Inverse Preconditioner fspai-1.1 Factorized Sparse Approximate Inverse Preconditioner Thomas Huckle Matous Sedlacek 2011 09 10 Technische Universität München Research Unit Computer Science V Scientific Computing in Computer

More information

NAG Library Function Document nag_sparse_nsym_sol (f11dec)

NAG Library Function Document nag_sparse_nsym_sol (f11dec) f11 Large Scale Linear Systems NAG Library Function Document nag_sparse_nsym_sol () 1 Purpose nag_sparse_nsym_sol () solves a real sparse nonsymmetric system of linear equations, represented in coordinate

More information

Highly Parallel Multigrid Solvers for Multicore and Manycore Processors

Highly Parallel Multigrid Solvers for Multicore and Manycore Processors Highly Parallel Multigrid Solvers for Multicore and Manycore Processors Oleg Bessonov (B) Institute for Problems in Mechanics of the Russian Academy of Sciences, 101, Vernadsky Avenue, 119526 Moscow, Russia

More information

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs

Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on GPUs Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, Stefan Turek Institut für Angewandte Mathematik TU Dortmund,

More information

Application of GPU technology to OpenFOAM simulations

Application of GPU technology to OpenFOAM simulations Application of GPU technology to OpenFOAM simulations Jakub Poła, Andrzej Kosior, Łukasz Miroslaw jakub.pola@vratis.com, www.vratis.com Wroclaw, Poland Agenda Motivation Partial acceleration SpeedIT OpenFOAM

More information

Part VI. Scientific Computing in Python. Alfredo Parra : Scripting with Python Compact Max-PlanckMarch 6-10,

Part VI. Scientific Computing in Python. Alfredo Parra : Scripting with Python Compact Max-PlanckMarch 6-10, Part VI Scientific Computing in Python Compact Course @ Max-PlanckMarch 6-10, 2017 63 Doing maths in Python Standard sequence types (list, tuple,... ) Can be used as arrays Can contain different types

More information

NAG Fortran Library Routine Document F11DSF.1

NAG Fortran Library Routine Document F11DSF.1 NAG Fortran Library Routine Document Note: before using this routine, please read the Users Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent

More information

VIENNACL - LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES

VIENNACL - LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES VIENNACL - LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES KARL RUPP, PHILIPPE TILLET, FLORIAN RUDOLF, JOSEF WEINBUB, ANDREAS MORHAMMER, TIBOR GRASSER, ANSGAR JÜNGEL, SIEGFRIED SELBERHERR

More information

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI 1 Akshay N. Panajwar, 2 Prof.M.A.Shah Department of Computer Science and Engineering, Walchand College of Engineering,

More information

OpenFOAM on GPUs. 3rd Northern germany OpenFoam User meeting. Institute of Scientific Computing. September 24th 2015

OpenFOAM on GPUs. 3rd Northern germany OpenFoam User meeting. Institute of Scientific Computing. September 24th 2015 OpenFOAM on GPUs 3rd Northern germany OpenFoam User meeting September 24th 2015 Haus der Wissenschaften, Braunschweig Overview HPC on GPGPUs OpenFOAM on GPUs 2013 OpenFOAM on GPUs 2015 BiCGstab/IDR(s)

More information

AMGCL Documentation. Release post189. Denis Demidov

AMGCL Documentation. Release post189. Denis Demidov AMGCL Documentation Release 1.2.0.post189 Denis Demidov Nov 13, 2018 Contents 1 Contents: 3 1.1 Getting started.............................................. 3 1.2 Components...............................................

More information

Masterpraktikum - Scientific Computing, High Performance Computing

Masterpraktikum - Scientific Computing, High Performance Computing Masterpraktikum - Scientific Computing, High Performance Computing Message Passing Interface (MPI) and CG-method Michael Bader Alexander Heinecke Technische Universität München, Germany Outline MPI Hello

More information

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks

Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Kevin Deweese 1 Erik Boman 2 1 Department of Computer Science University of California, Santa Barbara 2 Scalable Algorithms

More information

Sparse Matrix Libraries in C++ for High Performance. Architectures. ferent sparse matrix data formats in order to best

Sparse Matrix Libraries in C++ for High Performance. Architectures. ferent sparse matrix data formats in order to best Sparse Matrix Libraries in C++ for High Performance Architectures Jack Dongarra xz, Andrew Lumsdaine, Xinhui Niu Roldan Pozo z, Karin Remington x x Oak Ridge National Laboratory z University oftennessee

More information

GPU-based Parallel Reservoir Simulators

GPU-based Parallel Reservoir Simulators GPU-based Parallel Reservoir Simulators Zhangxin Chen 1, Hui Liu 1, Song Yu 1, Ben Hsieh 1 and Lei Shao 1 Key words: GPU computing, reservoir simulation, linear solver, parallel 1 Introduction Nowadays

More information

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2

More information

Iterative Methods for Linear Systems

Iterative Methods for Linear Systems Iterative Methods for Linear Systems 1 the method of Jacobi derivation of the formulas cost and convergence of the algorithm a Julia function 2 Gauss-Seidel Relaxation an iterative method for solving linear

More information

Iterative Sparse Triangular Solves for Preconditioning

Iterative Sparse Triangular Solves for Preconditioning Euro-Par 2015, Vienna Aug 24-28, 2015 Iterative Sparse Triangular Solves for Preconditioning Hartwig Anzt, Edmond Chow and Jack Dongarra Incomplete Factorization Preconditioning Incomplete LU factorizations

More information

Lecture 15: More Iterative Ideas

Lecture 15: More Iterative Ideas Lecture 15: More Iterative Ideas David Bindel 15 Mar 2010 Logistics HW 2 due! Some notes on HW 2. Where we are / where we re going More iterative ideas. Intro to HW 3. More HW 2 notes See solution code!

More information

VIENNACL LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES

VIENNACL LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES SIAM J. SCI. COMPUT. Vol. 38, No. 5, pp. S412 S439 c 216 Society for Industrial and Applied Mathematics VIENNACL LINEAR ALGEBRA LIBRARY FOR MULTI- AND MANY-CORE ARCHITECTURES KARL RUPP, PHILIPPE TILLET,

More information

THE DEVELOPMENT OF THE POTENTIAL AND ACADMIC PROGRAMMES OF WROCLAW UNIVERISTY OF TECH- NOLOGY ITERATIVE LINEAR SOLVERS

THE DEVELOPMENT OF THE POTENTIAL AND ACADMIC PROGRAMMES OF WROCLAW UNIVERISTY OF TECH- NOLOGY ITERATIVE LINEAR SOLVERS ITERATIVE LIEAR SOLVERS. Objectives The goals of the laboratory workshop are as follows: to learn basic properties of iterative methods for solving linear least squares problems, to study the properties

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26

More information

Introduction to Supercomputing

Introduction to Supercomputing Introduction to Supercomputing TMA4280 Introduction to development tools 0.1 Development tools During this course, only the make tool, compilers, and the GIT tool will be used for the sake of simplicity:

More information

A parallel direct/iterative solver based on a Schur complement approach

A parallel direct/iterative solver based on a Schur complement approach A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008

More information

Performance of Implicit Solver Strategies on GPUs

Performance of Implicit Solver Strategies on GPUs 9. LS-DYNA Forum, Bamberg 2010 IT / Performance Performance of Implicit Solver Strategies on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Abstract: The increasing power of GPUs can be used

More information

OP2 C++ User s Manual

OP2 C++ User s Manual OP2 C++ User s Manual Mike Giles, Gihan R. Mudalige, István Reguly December 2013 1 Contents 1 Introduction 4 2 Overview 5 3 OP2 C++ API 8 3.1 Initialisation and termination routines..........................

More information

Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA. NVIDIA Corporation

Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA. NVIDIA Corporation Porting the NAS-NPB Conjugate Gradient Benchmark to CUDA NVIDIA Corporation Outline! Overview of CG benchmark! Overview of CUDA Libraries! CUSPARSE! CUBLAS! Porting Sequence! Algorithm Analysis! Data/Code

More information

Lab 2: Pointers. //declare a pointer variable ptr1 pointing to x. //change the value of x to 10 through ptr1

Lab 2: Pointers. //declare a pointer variable ptr1 pointing to x. //change the value of x to 10 through ptr1 Lab 2: Pointers 1. Goals Further understanding of pointer variables Passing parameters to functions by address (pointers) and by references Creating and using dynamic arrays Combing pointers, structures

More information

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1.

BDDCML. solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) Jakub Šístek version 1. BDDCML solver library based on Multi-Level Balancing Domain Decomposition by Constraints copyright (C) 2010-2012 Jakub Šístek version 1.3 Jakub Šístek i Table of Contents 1 Introduction.....................................

More information

Sparse Linear Systems

Sparse Linear Systems 1 Sparse Linear Systems Rob H. Bisseling Mathematical Institute, Utrecht University Course Introduction Scientific Computing February 22, 2018 2 Outline Iterative solution methods 3 A perfect bipartite

More information

IBM Research. IBM Research Report

IBM Research. IBM Research Report RC 24398 (W0711-017) November 5, 2007 (Last update: June 28, 2018) Computer Science/Mathematics IBM Research Report WSMP: Watson Sparse Matrix Package Part III iterative solution of sparse systems Version

More information

CSE 599 I Accelerated Computing - Programming GPUS. Parallel Pattern: Sparse Matrices

CSE 599 I Accelerated Computing - Programming GPUS. Parallel Pattern: Sparse Matrices CSE 599 I Accelerated Computing - Programming GPUS Parallel Pattern: Sparse Matrices Objective Learn about various sparse matrix representations Consider how input data affects run-time performance of

More information

Little Motivation Outline Introduction OpenMP Architecture Working with OpenMP Future of OpenMP End. OpenMP. Amasis Brauch German University in Cairo

OpenMP Amasis Brauch German University in Cairo May 4, 2010 Simple Algorithm void incrementer(short array) { long i; for (i = 0; i < 1000000; i++) { array[i]++;

Profiling and Parallelizing with the OpenACC Toolkit OpenACC Course: Lecture 2 October 15, 2015

Oct 1: Introduction to OpenACC Oct 6: Office Hours Oct 15: Profiling and Parallelizing with the OpenACC Toolkit

On the Parallel Solution of Sparse Triangular Linear Systems. M. Naumov* San Jose, CA May 16, 2012 *NVIDIA

On the Parallel Solution of Sparse Triangular Linear Systems M. Naumov* San Jose, CA May 16, 2012 *NVIDIA Why Is This Interesting? There exist different classes of parallel problems Embarrassingly parallel

Computing with vectors and matrices in C++

CS319: Scientific Computing (with C++) Computing with vectors and matrices in C++ Week 7: 9am and 4pm, 22 Feb 2017 1 Introduction 2 Solving linear systems 3 Jacobi's Method 4 Implementation 5 Vectors 6

JCudaMP: OpenMP/Java on CUDA

Georg Dotzler, Ronald Veldema, Michael Klemm Programming Systems Group Martensstraße 3 91058 Erlangen Motivation Write once, run anywhere - Java Slogan created by Sun Microsystems

Andrew V. Knyazev and Merico E. Argentati (speaker)

1 Andrew V. Knyazev and Merico E. Argentati (speaker) Department of Mathematics and Center for Computational Mathematics University of Colorado at Denver 2 Acknowledgement Supported by Lawrence Livermore

ANSI C. Data Analysis in Geophysics Demián D. Gómez November 2013

ANSI C Data Analysis in Geophysics Demián D. Gómez November 2013 ANSI C Standards published by the American National Standards Institute (1983-1989). Initially developed by Dennis Ritchie between 1969

Separate Compilation of Multi-File Programs

1 About Compiling What most people mean by the phrase "compiling a program" is actually two separate steps in the creation of that program. The first step is proper compilation. Compilation is the translation

OPEN MP and MPI on Kingspeak chpc cluster

Command to compile the code with openmp and mpi /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/bin/mpicc -o hem hemhotlz.c -I /uufs/kingspeak.peaks/sys/pkg/openmpi/std_intel/include

Tomonori Kouya Shizuoka Institute of Science and Technology Toyosawa, Fukuroi, Shizuoka Japan. October 5, 2018

arxiv:1411.2377v1 [math.na] 10 Nov 2014 A Highly Efficient Implementation of Multiple Precision Sparse Matrix-Vector Multiplication and Its Application to Product-type Krylov Subspace Methods Tomonori

This offering is not approved or endorsed by OpenCFD Limited, the producer of the OpenFOAM software and owner of the OPENFOAM and OpenCFD trade marks.

Disclaimer This offering is not approved or endorsed by OpenCFD Limited, the producer of the OpenFOAM software and owner of the OPENFOAM and OpenCFD trade marks. Introductory OpenFOAM Course From 8th

CG solver assignment

David Bindel Nikos Karampatziakis 3/16/2010 Contents 1 Introduction 2 Solver parameters 3 Preconditioned CG 4 3D Laplace operator 5 Preconditioners for the Laplacian 5.1

Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity Problems

Mathematical Problems in Engineering Volume 2015, Article ID 147286, 7 pages http://dx.doi.org/10.1155/2015/147286 Research Article A PETSc-Based Parallel Implementation of Finite Element Method for Elasticity

OPENFOAM ON GPUS USING AMGX

Thilina Rathnayake Sanath Jayasena Mahinsasa Narayana ABSTRACT Field Operation and Manipulation (OpenFOAM) is a free, open-source, feature-rich Computational Fluid Dynamics

SkePU 2 User Guide For the preview release

August Ernstsson October 20, 2016 Contents 1 Introduction 2 License 3 Authors and Maintainers 3.1 Acknowledgements 4 Dependencies

CS2141 Software Development using C/C++ C++ Basics

Integers Basic Types Can be short, long, or just plain int C++ does not define the size of them other than short

Large Displacement Optical Flow & Applications

Narayanan Sundaram, Kurt Keutzer (Parlab) In collaboration with Thomas Brox (University of Freiburg) Michael Tao (University of California Berkeley) Parlab

Optimising the Mantevo benchmark suite for multi- and many-core architectures

Simon McIntosh-Smith Department of Computer Science University of Bristol 1 Bristol's rich heritage in HPC The University of

Efficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI

Efficient AMG on Hybrid GPU Clusters ScicomP 2012 Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann Fraunhofer SCAI Illustration: Darin McInnis Motivation Sparse iterative solvers benefit from

Figure 6.1: Truss topology optimization diagram.

6 Implementation 6.1 Outline This chapter shows the implementation details to optimize the truss, obtained in the ground structure approach, according to the formulation presented in previous chapters.

An Example of Porting PETSc Applications to Heterogeneous Platforms with OpenACC

Pi-Yueh Chuang The George Washington University Fernanda S. Foertter Oak Ridge National Laboratory Goal Develop an OpenACC

Performance Strategies for Parallel Mathematical Libraries Based on Historical Knowledgebase

CScADS workshop 29 Eduardo Cesar, Anna Morajko, Ihab Salawdeh Universitat Autònoma de Barcelona Objective Mathematical

ME964 High Performance Computing for Engineering Applications

Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach

Mini-workshop PHyLeaS associated team J. Gaidamour, P. Hénon July 9, 28 HIPS : an hybrid direct/iterative solver /

i486 or Pentium Windows 3.1 PVM MasPar Thinking Machine CM-5 Intel Paragon IBM SP2 telnet/ftp or rlogin

Hidehiko Hasegawa 1983: University of Library and Information Science, the smallest National university March 1994: Visiting Researcher at icl, University of Tennessee, Knoxville 1994-95 in Japan: a bad

CPS343 Parallel and High Performance Computing Project 1 Spring 2018

Assignment Write a program using OpenMP to compute the estimate of the dominant eigenvalue of a matrix Due: Wednesday March 21 The program

PyAMG. Algebraic Multigrid Solvers in Python Nathan Bell, Nvidia Luke Olson, University of Illinois Jacob Schroder, University of Colorado at Boulder

PyAMG Algebraic Multigrid Solvers in Python Nathan Bell, Nvidia Luke Olson, University of Illinois Jacob Schroder, University of Colorado at Boulder Copper Mountain 2011 Boot disc For Mac Press and hold

Nonsymmetric Problems. Abstract. The effect of a threshold variant TPABLO of the permutation

Threshold Ordering for Preconditioning Nonsymmetric Problems Michele Benzi 1, Hwajeong Choi 2, Daniel B. Szyld 2? 1 CERFACS, 42 Ave. G. Coriolis, 31057 Toulouse Cedex, France (benzi@cerfacs.fr) 2 Department

HPC with PGI and Scalasca

Stefan Rosenberger Supervisor: Univ.-Prof. Dipl.-Ing. Dr. Gundolf Haase Institut für Mathematik und wissenschaftliches Rechnen Universität Graz May 28, 2015 Stefan Rosenberger

S0432 NEW IDEAS FOR MASSIVELY PARALLEL PRECONDITIONERS

John R Appleyard Jeremy D Appleyard Polyhedron Software with acknowledgements to Mark A Wakefield Garf Bowen Schlumberger Outline of Talk Reservoir

Topic Notes: Message Passing Interface (MPI)

Computer Science 400 Parallel Processing Siena College Fall 2008 Topic Notes: Message Passing Interface (MPI) The Message Passing Interface (MPI) was created by a standards committee in the early 1990

Pragma-based GPU Programming and HMPP Workbench. Scott Grauer-Gray

Pragma-based GPU Programming and HMPP Workbench Scott Grauer-Gray Pragma-based GPU programming Write programs for GPU processing without (directly) using CUDA/OpenCL Place pragmas to drive processing on

Computational Graphics: Lecture 15 SpMSpM and SpMV, or, who cares about complexity when we have a thousand processors?

The CVDLab Team Francesco Furiani Tue, April 3, 2014 ROMA TRE UNIVERSITÀ DEGLI STUDI

Accelerated Test Execution Using GPUs

Vanya Yaneva Supervisors: Ajitha Rajan, Christophe Dubach Mathworks May 27, 2016 The Problem Software testing is time consuming Functional testing The Problem Software

ECE 574 Cluster Computing Lecture 10

Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 1 October 2015 Announcements Homework #4 will be posted eventually 1 HW#4 Notes How granular

Department of Informatics V. HPC-Lab. Session 4: MPI, CG M. Bader, A. Breuer. Alex Breuer

HPC-Lab Session 4: MPI, CG M. Bader, A. Breuer Meetings Date Schedule 10/13/14 Kickoff 10/20/14 Q&A 10/27/14 Presentation 1 11/03/14 H. Bast, Intel 11/10/14 Presentation 2 12/01/14 Presentation 3 12/08/14

Due Date: See Blackboard

Source File: ~/2315/11/lab11.(C CPP cpp c++ cc cxx cp) Input: Under control of main function Output: Under control of main function Value: 1 The purpose of this assignment is to become more familiar with

The performances of R GPU implementations of the GMRES method. Bogdan Oancea University of Bucharest

The performances of R GPU implementations of the GMRES method Bogdan Oancea University of Bucharest bogdan.oancea@faa.unibuc.ro Richard Pospisil Palacky University of Olomouc richard.pospisil@upol.cz Abstract

A Comparison of Algebraic Multigrid Preconditioners using Graphics Processing Units and Multi-Core Central Processing Units

Markus Wagner, Karl Rupp,2, Josef Weinbub Institute for Microelectronics, TU

Enhanced Oil Recovery simulation Performances on New Hybrid Architectures

Renewable energies Eco-friendly production Innovative transport Eco-efficient processes Sustainable resources Enhanced Oil Recovery simulation Performances on New Hybrid Architectures A. Anciaux, J-M.