How to get Access to Shaheen2? Bilel Hadri Computational Scientist KAUST Supercomputing Core Lab



2 Live Survey
Please log in with your laptop or mobile at http://tiny.cc/kslhpc and type the code VF9SKGQ6


4 Account Applications
Getting an account on Shaheen is a three-step process:
- Your organization or department must submit the Organisational Access Application (OAA), establishing a relationship between your home organization and the KAUST Supercomputing Laboratory (KSL).
- You must submit an Individual Account Application, supplying identification information from which we can generate login credentials.
- You (or your Principal Investigator) must submit a Project Proposal describing the work to be done and the resources your project will require.
Send all documents via projects@hpc.kaust.edu.sa, or use the Contact Us form to submit credentials or other private information.
FAQs on the process and requirements for obtaining access to the Shaheen systems are available here.

5 Computing Resource Allocation
Project Type
- Development Project: system familiarization, code porting, assessment of performance. Up to 2M core hours.
- Production Project: production runs after applications have been ported and optimized.
Project Review
- Project proposals are reviewed monthly by the RCAC (Resource Computing Allocation Committee).
- Three-step process:
  - Computational Readiness Review: performed by KSL. Justification for core hours, portability, scalability, impact on execution.
  - Scientific Readiness Review: performed by scientific peers in the discipline of the proposed project.
  - RCAC final review and recommendation.
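To get a feel for what a core-hour budget such as the 2M limit of a Development Project means in practice, the arithmetic can be sketched with two small shell helpers. The function names and the figure of 32 cores per node are assumptions for illustration (neither appears in the slides), as is the idea that whole nodes are charged for the full wall time; check the actual charging policy with KSL.

```shell
# Hypothetical helper: core hours consumed by one job,
# assuming whole nodes are charged for the full wall time.
#   $1 = number of nodes
#   $2 = cores per node (assumed value, e.g. 32)
#   $3 = wall-clock hours
core_hours() {
    echo $(( $1 * $2 * $3 ))
}

# Hypothetical helper: how many such jobs fit in a budget
# (integer division, so a partial job is not counted).
#   $1 = budget in core hours
#   $2 = core hours per job
jobs_in_budget() {
    echo $(( $1 / $2 ))
}
```

Under these assumptions, `core_hours 64 32 24` prints 49152, and `jobs_in_budget 2000000 49152` prints 40: a 64-node, 24-hour job can be run about 40 times within a 2M core-hour allocation.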


10 Timeframe
- Development Project: within 5 days once all documents are submitted.
- Production Project: by the end of the following month, after the RCAC meeting.
- Duration: up to one year.
- Extension: the RCAC may grant an extension of duration and core hours after a progress report (including publications and success stories).

11 Questions?

12 The Programming Environment Bilel Hadri Computational Scientist KAUST Supercomputing Core Lab

13 Compiler Driver Wrappers
All applications that will run in parallel on the Cray XC should be compiled with the standard language wrappers. The compiler drivers for each language are:
- cc: wrapper around the C compiler
- CC: wrapper around the C++ compiler
- ftn: wrapper around the Fortran compiler
These scripts automatically choose the required compiler version, target architecture options, scientific libraries and their include files from the currently loaded module environment.

14 Compiler Driver Wrappers
Use them exactly like you would use the original compiler, e.g. to compile prog.f90 run
> ftn -c <any_other_flags> prog.f90
These scripts choose which compiler to use from the loaded PrgEnv module:
PrgEnv        Description                   Real Compilers
PrgEnv-cray   Cray Compilation Environment  crayftn, craycc, crayCC
PrgEnv-intel  Intel Composer Suite          ifort, icc, icpc
PrgEnv-gnu    GNU Compiler Collection       gfortran, gcc, g++
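The rule "cc for C, CC for C++, ftn for Fortran" can be sketched as a small shell helper that picks the wrapper from a file's extension. The function names wrapper_for and compile_cmd and the list of extensions are assumptions for illustration, not part of the Cray environment. The sketch only prints the command, so it is safe to try on a machine without the wrappers.

```shell
# Hypothetical helper: pick the Cray compiler wrapper for a source file
# from its extension. The extension list is an assumption for illustration.
wrapper_for() {
    case "$1" in
        *.c)                  echo cc  ;;
        *.cpp|*.cxx|*.cc|*.C) echo CC  ;;
        *.f|*.f90|*.F|*.F90)  echo ftn ;;
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (rather than run) the compile command.
compile_cmd() {
    w=$(wrapper_for "$1") || return 1
    echo "$w -c $1"
}
```

For example, `compile_cmd prog.f90` prints `ftn -c prog.f90`. Whichever PrgEnv module is loaded then decides which real compiler the wrapper invokes.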

15 Compiler Driver Wrappers
- Use module swap to change PrgEnv, e.g.
  > module swap PrgEnv-cray PrgEnv-intel
- PrgEnv-cray is loaded by default at login.
- The Cray MPI module (cray-mpich) is loaded by default.
- Use module list to check what is currently loaded.
(Screenshot: current default modules)
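As a rough sketch of how module list and module swap fit together, the helpers below work on text captured from module list and print the swap command needed to reach a target PrgEnv. The function names are hypothetical, and the helpers only print a command instead of changing the environment.

```shell
# Hypothetical helper: find the loaded PrgEnv in text captured from
# `module list` (passed as $1) and print its name without the version.
current_prgenv() {
    printf '%s\n' "$1" | sed -n 's/.*\(PrgEnv-[a-z]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: print the module swap command needed to reach
# the target PrgEnv ($2), or nothing if it is already loaded.
swap_cmd() {
    cur=$(current_prgenv "$1")
    if [ -n "$cur" ] && [ "$cur" != "$2" ]; then
        echo "module swap $cur $2"
    fi
}
```

On a login node this might be used as `swap_cmd "$(module list 2>&1)" PrgEnv-intel`. The `2>&1` is there because module list typically writes to standard error; that detail is an assumption about the installed modules package.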

16 OpenMP
OpenMP is supported by all of the PrgEnvs. CCE (PrgEnv-cray) recognizes and interprets OpenMP directives by default. If you have OpenMP directives in your application but do not want to use them, disable OpenMP recognition with -hnoomp.
PrgEnv        Enable OpenMP  Disable OpenMP
PrgEnv-cray   -homp          -hnoomp
PrgEnv-intel  -openmp
PrgEnv-gnu    -fopenmp
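Because the OpenMP flag differs between programming environments, a build script may want to select it from the PrgEnv name. A minimal sketch, using only the flags listed on this slide; the function name openmp_flag is hypothetical.

```shell
# Hypothetical helper: OpenMP-enabling flag for a given PrgEnv,
# using the flags listed on the OpenMP slide.
openmp_flag() {
    case "$1" in
        PrgEnv-cray)  echo "-homp"    ;;  # already on by default with CCE
        PrgEnv-intel) echo "-openmp"  ;;
        PrgEnv-gnu)   echo "-fopenmp" ;;
        *) echo "unknown PrgEnv: $1" >&2; return 1 ;;
    esac
}
```

A compile line could then look like `cc $(openmp_flag PrgEnv-gnu) -o hello hello.c`, where hello.c stands for any OpenMP source file of your own.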

17 Compiler Man Pages
For more information on individual compilers:
PrgEnv        C           C++         Fortran
PrgEnv-cray   man craycc  man crayCC  man crayftn
PrgEnv-intel  man icc     man icpc    man ifort
PrgEnv-gnu    man gcc     man g++     man gfortran
Wrappers      man cc      man CC      man ftn
To verify that you are using the correct version of the compiler, use:
- the -V option on a cc, CC, or ftn command with Intel and Cray
- the --version option on a cc, CC, or ftn command with GNU
Cray Reference Manuals:
- C and C++: http://docs.cray.com/books/s /
- Fortran: http://docs.cray.com/books/s /

18 Compiler Flags
Feature                         Cray                          Intel                            GNU
Listing                         -hlist=a                      -opt-report3                     -fdump-tree-all
Free format (ftn)               -f free                       -free                            -ffree-form
Vectorization                   By default at -O1 and above   By default at -O2 and above      By default at -O3 or using -ftree-vectorize
Inter-procedural optimization   -hwp                          -ipo                             -flto (note: link-time optimization)
Floating-point optimization     -hfpN, N=0...4                -fp-model                        -f[no-]fast-math or -funsafe-math-optimizations
Suggested optimization          (default)                     -O2 -xAVX                        -O2 -mavx -ftree-vectorize -ffast-math -funroll-loops
Aggressive optimization         -O3 -hfp3                     -fast                            -Ofast -mavx -funroll-loops
Variable size (ftn)             -s real64 -s integer64        -real-size 64 -integer-size 64   -freal-4-real-8 -finteger-4-integer-8

19 Cray Scientific Libraries
- The compiler wrappers take care of not only the compiler but also libraries such as BLAS, ScaLAPACK, MPI, ...
- The Cray Scientific Libraries package, LibSci, is a collection of numerical routines optimized for best performance on Cray systems.
- LibSci is loaded by default in all programming environments.
- No user flags or options are required for compiling or linking.
- The LibSci collection contains BLAS, BLACS, LAPACK, ScaLAPACK, IRT, CRAFFT, CASE, FFT, FFTW2, FFTW3.
- FFTW: Cray's main FFT library is FFTW from MIT, with some additional optimizations for Cray hardware.
- Cray PETSc (with CASK, Cray Adaptive Sparse Kernels)
- Cray Trilinos (with CASK, Cray Adaptive Sparse Kernels)

20 Cray Scientific Libraries
- Cray TPSL (Third Party Scientific Libraries) contains a collection of outside mathematical libraries that can be used with PETSc and Trilinos.
- TPSL increases the flexibility of PETSc and Trilinos by providing users with multiple options for solving problems in dense and sparse linear algebra.
- The cray-tpsl module is automatically loaded when PETSc or Trilinos is loaded.
- The libraries included are MUMPS, SuperLU, SuperLU_DIST, ParMETIS, Hypre, SUNDIALS, and Scotch.
- Intel MKL: the Intel Math Kernel Library is an alternative to LibSci.
  - It features tuned performance for Intel CPUs as well.
  - Linking is quite complicated in general, but with the Intel compilers (PrgEnv-intel) it is usually straightforward.
  - You just need to load the module and compile your code.

21 Modules
- Useful module commands: module avail, module list, module load and module swap. Type man module to learn how to use them.
- The module avail list is too long. Useful options for filtering:
  -U: list all modulefiles of interest to a typical user
  -D: list only default versions of modulefiles
  -P: list all PrgEnv modulefiles
  -L: list all library modulefiles
- module avail -S <product>: list all available versions of <product>

22 Check the flyer, available online.

23 Summary
- Three compiler environments are available on the XC40: Cray, Intel, and GNU.
- All of them are accessed through the wrappers ftn, cc and CC; just do module swap to change compiler!
- There is no universally fastest compiler; performance depends on the application, and even on the input.
- Use the available modules for linking libraries. If you need a library, contact the CS team.

24 Questions? #HPCSAUDI17 Follow us on Twitter: twitter.com/kaust_hpc


More information

HPCF Cray Phase 2. User Test period. Cristian Simarro User Support. ECMWF April 18, 2016

HPCF Cray Phase 2. User Test period. Cristian Simarro User Support. ECMWF April 18, 2016 HPCF Cray Phase 2 User Test period Cristian Simarro User Support advisory@ecmwf.int ECMWF April 18, 2016 Content Introduction Upgrade timeline Changes Hardware Software Steps for the testing on CCB Possible

More information

Op#miza#on & Scalability

Op#miza#on & Scalability Op#miza#on & Scalability Carlos Rosales carlos@tacc.utexas.edu September 20 th, 2013 Parallel Compu#ng in Stampede What this talk is about Highlight main performance and scalability bo5lenecks Simple but

More information

Advanced OpenMP Vectoriza?on

Advanced OpenMP Vectoriza?on UT Aus?n Advanced OpenMP Vectoriza?on TACC TACC OpenMP Team milfeld/lars/agomez@tacc.utexas.edu These slides & Labs:?nyurl.com/tacc- openmp Learning objec?ve Vectoriza?on: what is that? Past, present and

More information

Introduction to GALILEO

Introduction to GALILEO Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Alessandro Grottesi a.grottesi@cineca.it SuperComputing Applications and

More information

Lessons Learned from Selected NESAP Applications. Helen He!! NCAR Multi-core 5 Workshop! Sept 16-17, 2015

Lessons Learned from Selected NESAP Applications. Helen He!! NCAR Multi-core 5 Workshop! Sept 16-17, 2015 Lessons Learned from Selected NESAP Applications Helen He!! NCAR Multi-core 5 Workshop! Sept 16-17, 2015 The Big Picture The next large NERSC produc6on system Cori will be Intel Xeon Phi KNL (Knights Landing)

More information

AASPI Software Structure

AASPI Software Structure AASPI Software Structure Introduction The AASPI software comprises a rich collection of seismic attribute generation, data conditioning, and multiattribute machine-learning analysis tools constructed by

More information

In 1986, I had degrees in math and engineering and found I wanted to compute things. What I ve mostly found is that:

In 1986, I had degrees in math and engineering and found I wanted to compute things. What I ve mostly found is that: Parallel Computing and Data Locality Gary Howell In 1986, I had degrees in math and engineering and found I wanted to compute things. What I ve mostly found is that: Real estate and efficient computation

More information

Scientific Computing. Some slides from James Lambers, Stanford

Scientific Computing. Some slides from James Lambers, Stanford Scientific Computing Some slides from James Lambers, Stanford Dense Linear Algebra Scaling and sums Transpose Rank-one updates Rotations Matrix vector products Matrix Matrix products BLAS Designing Numerical

More information

PROGRAMMING MODEL EXAMPLES

PROGRAMMING MODEL EXAMPLES ( Cray Inc 2015) PROGRAMMING MODEL EXAMPLES DEMONSTRATION EXAMPLES OF VARIOUS PROGRAMMING MODELS OVERVIEW Building an application to use multiple processors (cores, cpus, nodes) can be done in various

More information

Cerebro Quick Start Guide

Cerebro Quick Start Guide Cerebro Quick Start Guide Overview of the system Cerebro consists of a total of 64 Ivy Bridge processors E5-4650 v2 with 10 cores each, 14 TB of memory and 24 TB of local disk. Table 1 shows the hardware

More information

Ge#ng Started with Automa3c Compiler Vectoriza3on. David Apostal UND CSci 532 Guest Lecture Sept 14, 2017

Ge#ng Started with Automa3c Compiler Vectoriza3on. David Apostal UND CSci 532 Guest Lecture Sept 14, 2017 Ge#ng Started with Automa3c Compiler Vectoriza3on David Apostal UND CSci 532 Guest Lecture Sept 14, 2017 Parallellism is Key to Performance Types of parallelism Task-based (MPI) Threads (OpenMP, pthreads)

More information

Code Optimization. Brandon Barker Computational Scientist Cornell University Center for Advanced Computing (CAC)

Code Optimization. Brandon Barker Computational Scientist Cornell University Center for Advanced Computing (CAC) Code Optimization Brandon Barker Computational Scientist Cornell University Center for Advanced Computing (CAC) brandon.barker@cornell.edu Workshop: High Performance Computing on Stampede January 15, 2015

More information

XSEDE and XSEDE Resources

XSEDE and XSEDE Resources October 22, 2013 XSEDE and XSEDE Resources Dan Stanzione Deputy Director, Texas Advanced Computing Center Co-Director, iplant Collaborative Welcome to XSEDE! XSEDE is an exciting cyberinfrastructure, providing

More information

Vienna Scientific Cluster: Problems and Solutions

Vienna Scientific Cluster: Problems and Solutions Vienna Scientific Cluster: Problems and Solutions Dieter Kvasnicka Neusiedl/See February 28 th, 2012 Part I Past VSC History Infrastructure Electric Power May 2011: 1 transformer 5kV Now: 4-5 transformer

More information

Coding Tools. (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018

Coding Tools. (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018 Coding Tools (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018 1 University of Pennsylvania 2 Boston College Compilers Compilers

More information

Op#miza#on & Scalability

Op#miza#on & Scalability Op#miza#on & Scalability Carlos Rosales carlos@tacc.utexas.edu May 5 th, 2015 Parallel Compu#ng in Stampede What this talk is about Highlight main performance and scalability bo5lenecks Simple but efficient

More information

Mixed MPI-OpenMP EUROBEN kernels

Mixed MPI-OpenMP EUROBEN kernels Mixed MPI-OpenMP EUROBEN kernels Filippo Spiga ( on behalf of CINECA ) PRACE Workshop New Languages & Future Technology Prototypes, March 1-2, LRZ, Germany Outline Short kernel description MPI and OpenMP

More information

Evaluating Shifter for HPC Applications Don Bahls Cray Inc.

Evaluating Shifter for HPC Applications Don Bahls Cray Inc. Evaluating Shifter for HPC Applications Don Bahls Cray Inc. Agenda Motivation Shifter User Defined Images (UDIs) provide a mechanism to access a wider array of software in the HPC environment without enduring

More information

EE/CSCI 451 Introduction to Parallel and Distributed Computation. Discussion #4 2/3/2017 University of Southern California

EE/CSCI 451 Introduction to Parallel and Distributed Computation. Discussion #4 2/3/2017 University of Southern California EE/CSCI 451 Introduction to Parallel and Distributed Computation Discussion #4 2/3/2017 University of Southern California 1 USC HPCC Access Compile Submit job OpenMP Today s topic What is OpenMP OpenMP

More information

Blue Waters Programming Environment

Blue Waters Programming Environment December 3, 2013 Blue Waters Programming Environment Blue Waters User Workshop December 3, 2013 Science and Engineering Applications Support Documentation on Portal 2 All of this information is Available

More information

An Introduction to the Cray X1E

An Introduction to the Cray X1E An Introduction to the Cray X1E Richard Tran Mills (with help from Mark Fahey and Trey White) Scientific Computing Group National Center for Computational Sciences Oak Ridge National Laboratory 2006 NCCS

More information

Lecture 4: Build Systems, Tar, Character Strings

Lecture 4: Build Systems, Tar, Character Strings CIS 330:! / / / / (_) / / / / _/_/ / / / / / \/ / /_/ / `/ \/ / / / _/_// / / / / /_ / /_/ / / / / /> < / /_/ / / / / /_/ / / / /_/ / / / / / \ /_/ /_/_/_/ _ \,_/_/ /_/\,_/ \ /_/ \ //_/ /_/ Lecture 4:

More information