Guillimin HPC Users Meeting March 17, 2016
guillimin@calculquebec.ca
McGill University / Calcul Québec / Compute Canada, Montréal, QC, Canada

Outline
- Compute Canada News
- System Status
- Software Updates
- Training News
- Special Topic: Best Practices for Job Submission

Compute Canada News
- Compute Canada Account Renewals
  - Mandatory annual renewal process for all Compute Canada account holders, including faculty, researchers, students and Compute Canada staff
  - The renewal process includes gathering information about researchers and their research activities, as required by our funding agencies, including CFI
  - All Compute Canada users were contacted via email in early March to begin the renewal process
  - Deadline for renewals: March 31
  - Accounts not renewed will be de-activated in early April
  - Questions? renewals@computecanada.ca

System Status
- Network Instability
  - February 23 between 22:00-23:30 and February 25 late evening
  - VLAG connection issue between two site routers; restored after one router restart
  - Stable since then, but investigating further
- Security Updates
  - Rolling updates to all nodes, including login nodes, for CVE-2015-7547 (glibc)

Software Update
- New Installations
  - PyQt, rpy2, JupyterHub, IPython for Python 3.5.0 (via Lmod only)
- About the Lmod/EasyBuild-based module structure:
  - Now backwards compatible; opt in by doing: touch ~/.lmod_legacy
  - Default on March 22; opt out via ~/.lmod_disabled
  - Old modulefiles keep working, including those in $HOME/modulefiles
  - Most new modulefiles are accessed via: module load iomkl/2015b (loads GCC 4.9.3 + Intel 15.0.3 + OpenMPI 1.8.8 + MKL); see http://www.hpc.mcgill.ca/index.php/starthere/81-doc-pages/88-guillimin-modules
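As a quick illustration (an addition to the slide; module avail and module list are standard module commands), loading the recommended toolchain from a fresh shell looks like:
module avail                 # browse the new Lmod/EasyBuild module hierarchy
module load iomkl/2015b      # GCC 4.9.3 + Intel 15.0.3 + OpenMPI 1.8.8 + MKL
module list                  # confirm which components are loaded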

Training News
- See Training and Outreach at www.hpc.mcgill.ca for our calendar of training and workshops for 2016 and for links to registration pages
- Upcoming events: calculquebec.eventbrite.ca
  - March 22 - Two Python workshops based on Software and Data Carpentry material (U. Laval)
  - March 24 - Introduction to R (McGill U.)
  - March 31 - Profiling and Optimization Tools (McGill U.)
  - April, May - Suggestions for training? Please let us know!
  - June 6-10 - Calcul Québec ARC spring school
- All materials from previous workshops are available online: https://wiki.calculquebec.ca/w/formations/en
- Recently completed: February 17 - Introduction to Python (McGill U.)

User Feedback and Discussion
- Questions? Comments? We value your feedback. Contact us at: guillimin@calculquebec.ca
- Guillimin Operational News for Users - Status Pages:
  - http://www.hpc.mcgill.ca/index.php/guillimin-status
  - http://serveurscq.computecanada.ca (all CQ systems)
- Follow us on Twitter: http://twitter.com/mcgillhpc

Best Practices for Job Submission
March 17, 2016
McGill University / Calcul Québec / Compute Canada, Montréal, QC, Canada

Example for serial job submission
qsub script:
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -A xyz-123-aa
#PBS -N JobTest
#PBS -M email@example.ca
#PBS -m abe
cd $PBS_O_WORKDIR
module load iomkl/2015b
./your_app arg1 arg2 arg3 ... > output.txt
Note: no #PBS -V and no modules loaded in .bashrc gives a self-contained, more easily reproducible submission script.
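For context (an addition, not from the slides), a typical submit-and-monitor sequence with Torque, assuming the script above is saved as serial_job.sh, would be:
qsub serial_job.sh      # submit; qsub prints the job ID
qstat -u $USER          # check the status of your queued and running jobs
qdel <jobid>            # cancel the job if needed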

Example for parallel job submission
qsub script:
#!/bin/bash
#PBS -l nodes=3:ppn=12
#PBS -l pmem=2700m
#PBS -l walltime=00:10:00
#PBS -A xyz-123-aa
#PBS -N JobTest
cd $PBS_O_WORKDIR
module load iomkl/2015b
mpiexec -n 36 ./your_app arg1 arg2 arg3 ... > output.txt
Note: no #PBS -V and no modules loaded in .bashrc gives a self-contained, more easily reproducible submission script.
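As a sanity check (an addition, not from the slides), you can confirm inside the job that Torque allocated the expected 36 slots across 3 nodes before launching MPI:
# add before the mpiexec line in the script above
sort $PBS_NODEFILE | uniq -c        # should list 3 hostnames with 12 slots each
NPROCS=$(wc -l < $PBS_NODEFILE)     # 36 here; can also be used as: mpiexec -n $NPROCS ...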

Submission styles
- Serial: default memory pmem=2700m (2.7 GB per core)
  - #PBS -l nodes=1:ppn=m, m ≤ 12
  - Recommended: m ≤ 6, or m = 12 (full node)
- Serial (Sandy Bridge):
  - #PBS -l nodes=1:ppn=m:sandybridge, m < 12, or #PBS -l nodes=1:ppn=16
  - Recommended: m ≤ 8, or m = 16 (full node)
- Parallel (Westmere): default pmem=1700m
  - #PBS -l nodes=n:ppn=12, n > 1
- Parallel (Sandy Bridge): default pmem=3700m
  - #PBS -l nodes=n:ppn=16, n > 1
- Parallel (Any): default pmem=1700m
  - #PBS -l procs=m (m > 11; multiples of 48 are best)
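For illustration (an addition assembled from the slide above; the walltime is arbitrary), a resource request in the "Parallel (Any)" style with 48 processes and the default memory limit made explicit could look like:
#PBS -l procs=48
#PBS -l pmem=1700m
#PBS -l walltime=12:00:00
#PBS -A xyz-123-aa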

Submission styles (accelerators, debug)
- GPUs:
  - #PBS -l nodes=2:ppn=16:gpus=2
  - #PBS -l pmem=123200m
  - Reserves two full nodes with 2 GPUs each; note that pmem is per node for GPU jobs!
- Xeon Phi:
  - #PBS -l nodes=1:ppn=8:mics=1,pmem=29600m
- Queues:
  - Default queue: metaq; generally no need to specify a queue name
  - Exception: debug queue: #PBS -q debug, for test jobs (default walltime 30 minutes, maximum 2 hours)
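As a combined illustration (an addition; the single-node request and walltime are assumptions), a short GPU test job on the debug queue might be requested as:
#PBS -q debug
#PBS -l nodes=1:ppn=16:gpus=2       # remember that pmem is per node for GPU jobs
#PBS -l walltime=00:30:00
#PBS -A xyz-123-aa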

MPI/OpenMP hybrid jobs
- Challenges: special mpiexec syntax; you have to worry about processor affinity
- Example: one MPI process per node
  - Switch off affinity at the MPI level, otherwise MPI processes are often bound to one core only!
  - MVAPICH2 (for 2 nodes):
    export IPATH_NO_CPUAFFINITY=1
    mpiexec -n 2 -ppn 1 executable
  - OpenMPI < 1.8 (for 2 nodes):
    mpiexec -n 2 -npernode 1 executable
  - OpenMPI 1.8+ (for 2 nodes with 12 cores; a "slot" is a core on Guillimin, PE = processing element):
    mpiexec -n 2 -map-by slot:pe=12 executable
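Putting this together (a sketch; the OMP_NUM_THREADS setting is an addition the slide does not show), a one-rank-per-node hybrid job over 2 Westmere nodes with OpenMPI 1.8+ could look like:
#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=01:00:00
#PBS -A xyz-123-aa
cd $PBS_O_WORKDIR
module load iomkl/2015b                       # provides OpenMPI 1.8.8
export OMP_NUM_THREADS=12                     # one MPI rank per node, 12 OpenMP threads each
mpiexec -n 2 -map-by slot:pe=12 ./your_app > output.txt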

MPI/OpenMP hybrid jobs (2)
- Example: 4 MPI processes x 3 threads per node
  - Best to assign specific cores to each MPI process!
  - Example: MATLAB will otherwise spawn too many threads, leading to high load and inefficiencies.
  - MVAPICH2 (for 2 nodes):
    mpiexec -n 8 -ppn 4 -bind-to core:3 -map-by core executable
  - OpenMPI < 1.8 (for 2 nodes):
    mpiexec -n 8 -npernode 4 -cpus-per-proc 3 executable
  - OpenMPI 1.8+ (for 2 nodes):
    mpiexec -n 8 -map-by slot:pe=3 executable
- Use the --report-bindings option for OpenMPI to see how MPI processes are bound to cores, or export MV2_SHOW_CPU_BINDING=1 for MVAPICH2.
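For example (an addition; OMP_NUM_THREADS is assumed to control the threads per rank), the OpenMPI 1.8+ case above can be run with binding output as:
export OMP_NUM_THREADS=3                                  # 3 OpenMP threads per MPI rank
mpiexec -n 8 -map-by slot:pe=3 --report-bindings ./your_app > output.txt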

Job Scheduling Tetris
- On Guillimin, many nodes accept either (but not both!):
  - Parallel jobs, e.g. nodes=n:ppn=12 or 16 (n > 0), or procs=p (p > 11), or
  - Serial jobs, nodes=1:ppn=m (m < 12)
- Each colour = one job
- Some (single-node serial only!) jobs can be split on the cores axis; parallel jobs can only be split on node boundaries.
(Diagram: jobs packed into a grid of cores versus nodes over time, two 12-core nodes shown; unused cores appear as gaps.)

Job Scheduling Tetris (continued)
(Diagram only: the same cores-versus-time packing, with unused cores left as gaps.)

Job Scheduling Tetris (priorities)
(Diagram: the same packing with jobs marked as low priority, lower priority, and high priority (reservation); unused cores left as gaps.)

Job Scheduling Tetris (backfill)
- Backfill: a small, low-priority job can run when higher-priority jobs can't.
(Diagram: the backfilled job slotted into cores that would otherwise sit unused.)

Backfilling tips
- Submit short (30 minutes - 36 hours) jobs
- Design tasks for maximum scheduler flexibility:
  - Low memory per core (-l pmem=1700m)
  - Pack tasks into full nodes
- ~12000 cores available for short, low-memory ppn=12 jobs, shared with jobs of up to 30 days
- ~5000 cores available for short, low-memory ppn=1 jobs, but with a much faster churn rate
- Walltime < 36 hours: ppn = 12 (hbplus), ppn < 12 (serial-short)
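To make a job attractive for backfill (an illustration assembled from the tips above; the walltime is arbitrary), its resource request might look like:
#PBS -l nodes=1:ppn=12        # pack a full Westmere node (hbplus class)
#PBS -l pmem=1700m            # low memory per core
#PBS -l walltime=24:00:00     # well under the 36-hour backfill window
#PBS -A xyz-123-aa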

Data Parallel Jobs
- Data parallel: parallelize by processing each chunk of data as a separate task
- Strategies:
  - Job arrays
  - Background processing
  - GNU Parallel
- Note that each process will compete for resources (filesystem access, memory, CPUs, etc.)

Job Arrays
- Job arrays are useful for submitting a large number of related tasks at one time
- Example for qsub (Torque):
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=12
#PBS -t 0-31
SRC=$HOME/program_dir
LOWER_BOUND=$((12 * $PBS_ARRAYID))
UPPER_BOUND=$(($LOWER_BOUND + 11))
for i in $(seq $LOWER_BOUND $UPPER_BOUND)
do
  cd $SCRATCH/dir$i ; $SRC/prog > output &
done
wait
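For reference (an addition; the file name array_job.sh is hypothetical), the script above launches 32 sub-jobs (PBS_ARRAYID = 0..31), each processing 12 directories, so dir0 through dir383 are covered. It is submitted and monitored like any other Torque job:
qsub array_job.sh        # submits the 32 sub-jobs
qstat -t -u $USER        # -t expands the array so each sub-job is listed individually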

Background tasks
- The Linux operating system can run your process in the background so that your script continues without waiting for it to finish
- Use the ampersand symbol, &
- The wait command says to wait for all background processes to finish
Explicit version:
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=12
SRC=$HOME/program_dir
cd $SCRATCH/dir1 ; $SRC/prog > output &
cd $SCRATCH/dir2 ; $SRC/prog > output &
cd $SCRATCH/dir3 ; $SRC/prog > output &
...
cd $SCRATCH/dir12 ; $SRC/prog > output &
wait
Equivalent loop version:
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=12
SRC=$HOME/program_dir
for i in $(seq 12)
do
  cd $SCRATCH/dir$i ; $SRC/prog > output &
done
wait

GNU Parallel
- GNU Parallel is an easy-to-use tool for launching processes in parallel
- Example: loop to apply a command (file) to multiple files:
$ find x*.gz -type f -print0 | parallel -q0 file
xdc.gz: gzip compressed data, was "xdc", from Unix, last modified: Wed Apr 8 16:09:51 2015, max speed
xda.gz: gzip compressed data, was "xda", from Unix, last modified: Wed Apr 8 16:09:51 2015, max speed
xdb.gz: gzip compressed data, was "xdb", from Unix, last modified: Wed Apr 8 16:09:50 2015, max speed
...

GNU Parallel (2)
- Run different commands in parallel:
  $ parallel ::: hostname date "echo hello world"
- Input sources from a file:
  $ parallel -a input-file echo
- Input sources from the command line:
  $ parallel echo ::: A B C
- Input sources from STDIN:
  $ cat input-file | parallel echo
- Input from multiple sources (will operate on each pair of inputs):
  $ parallel -a abc-file -a def-file echo
  $ cat abc-file | parallel -a - -a def-file echo
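Tying this back to the batch system (an illustrative sketch; the -j option and the dir1..dir12 layout mirror the earlier background-task example but are not from this slide), GNU Parallel can replace the background-task loop inside a job script:
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=12
SRC=$HOME/program_dir
# run one task per core; parallel waits for all tasks before the script exits
seq 12 | parallel -j 12 "cd $SCRATCH/dir{} && $SRC/prog > output"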

Conclusion
- Documentation:
  - http://www.hpc.mcgill.ca/index.php/starthere/81-docpages/322-simple-job-submission
  - http://www.hpc.mcgill.ca/index.php/starthere/81-docpages/91-guillimin-job-submit
  - https://wiki.calculquebec.ca/w/running_jobs#tab=tab4
- For any other questions: guillimin@calculquebec.ca