HPC Introductory Training on Balena by Team HPC @ Bath


1 HPC Introductory Training on Balena by Team HPC @ Bath

2 What is HPC and why is it different to using your desktop?
"High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business." - insidehpc
Aggregated computing power
Very large problem sizes
Multiple problems simultaneously

3 Balena - HPC at Bath
General Information
Scheduler
Managing workloads
Visualisation

4 Objectives of this training
Login to Balena
Understand what storage is available on Balena
Find the software you want to use
Create your own jobscript and be familiar with the options
Submit and manage workloads on Balena
Use interactive nodes for Test and Development
Use visualisation tools
Know where to find more information and how to ask for help

5 General Information
What makes up the cluster (hardware/technology - servers, interconnects, storage, etc.)
Accessing the cluster: logging in, copying files
Data storage areas: /home and /beegfs partitions, expected performance, quotas
Module environment: setting up your working environment, compilers, libraries, software/applications

6 Balena Technical Specification
CPU cores: 3,072 Intel Ivy Bridge 2.6 GHz cores (16 per node)
Memory: 18 TB of main memory (mostly 4 or 8 GB/core, but 2 nodes have 512 GB)
Storage: ~50TB NFS for home area; ~200TB BeeGFS non-archival (parallel filesystem)
Network: Intel TrueScale 2:1 blocking (worst case) QDR InfiniBand fabric
Size: 202 nodes (including management)
Performance: 57 TFlops
Accelerators: 22 Xeon Phi (5110p), 24 GPUs (K20x), 2 S10K
Add. Services: Visualisation and Test & Development

Service levels and limits:
Free: 6 hrs per job, max 16 nodes, max 256 cores
Premium: 5 days per job, max 32 nodes, max 512 cores

7

8 Accessing Balena # Exercise 1
Note: Balena can only be accessed from within the campus network
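A minimal sketch of logging in and copying files from a campus machine, assuming the login address is balena.bath.ac.uk and <username> is your university username (both placeholders to confirm against the Balena wiki):

$ ssh <username>@balena.bath.ac.uk                       # log in to a login node (balena-01 or balena-02)
$ scp input.tar.gz <username>@balena.bath.ac.uk:         # copy a file into your home area
$ scp <username>@balena.bath.ac.uk:results.tar.gz .      # copy results back to your own machine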

9 # Exercise 2 Where can I keep my files/data? # Exercise 3

                                       /home ($HOME)     /beegfs/scratch ($SCRATCH)
Total capacity                         ~50TB             ~200TB
User quota                             5 GB              Unlimited
Peak performance from login node       <500MB/sec        1-3GB/sec
Peak performance from a compute node   <100MB/sec        1-3GB/sec (aggregate BW for all users in excess of 10GB/sec)
Data policy                            Backed up         Non-archival
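A minimal sketch of staging data onto the scratch area before a run, using only the $HOME and $SCRATCH variables described above (myproject and its contents are hypothetical names):

$ echo $SCRATCH                                       # location of your personal scratch area
$ du -sh $HOME                                        # check how much of the 5 GB home quota is used
$ mkdir -p $SCRATCH/myproject                         # create a working directory on the parallel filesystem
$ cp -r $HOME/myproject/input $SCRATCH/myproject/     # stage input data onto BeeGFS before running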

10 # Exercise 4 Modules: where is the software I want to run?
$ module avail
Applications: Ansys, VASP, Matlab, Gaussian, etc.
Compilers-langs: Intel Compiler suite, GNU compilers, Python, etc.
Libraries: MKL, MPI, CUDA, FFTW3, etc.
Tools: Allinea DDT/MAP, Intel VTune, Valgrind, etc.

11 [balena-01 ~]$ module avail

/apps/modules/balena
cluster-manager dot module-info teaching use.own cluster-tools/7.0 http_proxy slurm/ untested version

/apps/modules/applications
ansys/v150 cp2k/ssmp/3.0 gromacs/cpu/dp/5.0.2 phylobayes/4.1b ansys/v161 cplex/ gromacs/cpu/sp/5.0.2 phylobayes/4.1c ansys/v162 crystal09/intel/1.0.1 gromacs/cuda/5.0.2 phylobayes/mpi1.6j ansys/v170 crystal14/intel/1.0.3 hwloc/1.8.1 R/3.1.2 bonnie++/ esm/2.0 idl/8.5 R/3.3.0 bowtie/1.1.2 esm/3.0 lammps/intel/dec-2013 relion/1.4 CASTEP/16.11 espresso/5.1 llvm/3.4.1 stata/14 CASTEP/8.0 espresso/5.2.0 matlab/2014b vasp/intel/5.3.5 comsol/5.1 gaussian09/a.02 matlab/2015b vmd/1.9.2 cp2k/popt/3.0(default) gaussian09/intel/d.01 nwchem/6.5 cp2k/psmp/3.0 gaussian09/intel/d.01-linda openfoam/intel/2.3.x-svn

/apps/modules/compilers-langs
gcc/4.8.2 intel/compiler/64/ (default) intel/compiler/mic/ python/3.4.2(default) intel/compiler/64/ intel/compiler/64/ java/jdk/1.8.0 intel/compiler/64/14.0/2013_sp intel/compiler/mic/ python/

/apps/modules/libraries
acml/gcc/64/5.3.1 cuda/blas/ intel/mkl/64/11.3(default) lapack/intel/64/3.5.0 acml/gcc/fma4/5.3.1 cuda/blas/ intel/mkl/mic/11.2 mvapich2/gcc/2.1-qib acml/gcc/mp/64/5.3.1 cuda/blas/ intel/mkl/mic/11.3 mvapich2/icc/2.1-qib acml/gcc/mp/fma4/5.3.1 cuda/fft/ intel/mpi/64/ openblas/dynamic/0.2.8 acml/gcc-int64/64/5.3.1 cuda/fft/ intel/mpi/64/ (default) opencv/3.1.0 acml/gcc-int64/fma4/5.3.1 cuda/fft/ intel/mpi/mic/ openmpi/gcc/1.8.4 acml/gcc-int64/mp/64/5.3.1 cudnn/3.0 intel/mpi/mic/ openmpi/intel/1.8.4 acml/gcc-int64/mp/fma4/5.3.1 fftw3/intel/avx/3.3.4 intel-mpi/32/4.1.3/049 pcre/8.38 blas/gcc/64/1 fftw3/intel/sse/3.3.4 intel-mpi/64/4.1.3/049 wannier/1.2 blas/intel/64/1 gsl/2.1 intel-mpi/mic/4.1.3/049 xz/5.2.2 boost/gcc/ intel/mkl/64/ intel-tbb-oss/ia32/42_ oss zlib/1.2.8 boost/intel/ intel/mkl/64/11.1/2013_sp intel-tbb-oss/intel64/42_ oss bzip2/1.0.6 intel/mkl/64/11.2 lapack/gcc/64/

/apps/modules/tools
allinea/ddt-map/4.2 cmake/3.5.2 git/2.5.1 intel-cluster-runtime/ia32/3.6 allinea/ddt-map/6.0.2 cuda/nsight/ hdf5/ intel-cluster-runtime/intel64/3.6 allinea/reports/5.0 cuda/nsight/ hdf5_18/ intel-cluster-runtime/mic/3.6 allinea/reports/6.0.2 cuda/nsight/ htop iozone/3_420 anaconda/2.3.0 cuda/profiler/ intel/adviser/ netcdf/gcc/64/ anaconda3/2.5.0 cuda/profiler/ intel/inspector/ netperf/2.6.0 autotools/latest cuda/profiler/ intel/itac/ paraview/4.3.1 bonnie++/ cuda/tdk/ intel/mpss/runtime/3.4.3 valgrind/ cmake/ cuda/toolkit/ intel/mpss/sdk/3.4.3 cmake/3.2.3 cuda/toolkit/ intel/vtune/ cmake/3.3.1 cuda/toolkit/ intel-cluster-checker/2.1.2
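A minimal sketch of finding and loading one of the libraries listed above (fftw3/intel/avx/3.3.4 is taken from the list; check the exact version strings with module avail on the day):

$ module avail fftw                    # search the list for FFTW modules
$ module load fftw3/intel/avx/3.3.4    # load a specific version
$ module list                          # show what is currently loaded
$ module purge                         # start again with a clean environment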

12 Scheduler
Brief introduction to SLURM
How to discover how the scheduler is configured, listing the queues/partitions, fairshare
Essential scheduler commands: sinfo, squeue, sbatch, etc.
Understanding the topology, the effects of intra- and inter-IB-switch communication, and how to assign workloads to single or multiple switches

13 Simple Linux Utility for Resource Management (SLURM)
Terminology:
CPU - for multicore machines this will be the core
Task - a task is synonymous with a process, usually the number of MPI processes that are required
Partition (queue) - grouping of nodes based on features
Account - grouping of users
Job - an application/program submitted to the scheduler; each job gets a unique job identifier (jobid)

14 Essential SLURM commands
User commands                                               SLURM
View information about SLURM nodes and partitions           sinfo
List status of jobs in the queue                            squeue
  Jobs by user                                              squeue --user [userid]
  Jobs by jobid                                             squeue --job [jobid]
List all jobs (completed and running/pending) by the user   sacct
Submit a job                                                sbatch [jobscript]
Cancel a job                                                scancel [jobid]
Hold a job in the queue                                     scontrol hold [jobid]
Release a job that is held                                  scontrol release [jobid]
Get detailed information of a job in the queue              scontrol show job [jobid]
Get detailed information of a node                          scontrol show node [nodename]
Get licenses available on SLURM                             scontrol show license
Show fairshare information                                  sshare
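A minimal sketch tying these commands together for a typical workflow (my_job.slurm and <jobid> are placeholders):

$ sbatch my_job.slurm            # submit; SLURM prints "Submitted batch job <jobid>"
$ squeue --user $USER            # watch your jobs in the queue
$ scontrol show job <jobid>      # inspect where and how the job will run
$ scancel <jobid>                # cancel it if something is wrong
$ sacct --job <jobid>            # check the accounting record after it finishes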

15 # Exercise 5 Essential SLURM commands: squeue
[balena-01 ~]$ squeue
JOBID NAME USER ACCOUNT PARTITION ST NODES CPUS MIN_MEMORY START_TIME TIME_LEFT PRIORITY NODELIST(REASON)
CoO2-7Lhs hc722 free batch R K T10:21:55 4: node-sw-[ ,041,045]
V2OBv6L hc722 free batch R K T10:29:02 11: node-sw-[021,048,060,068]
V2OV4L hc722 free batch R K T10:42:22 25: node-sw-[ ]
V4OV9B hc722 free batch R K T10:53:01 35: node-sw-[124,127, ]
V4OV9C hc722 free batch R K T10:54:08 36: node-sw-[ ,072]
V4OV9D hc722 free batch R K T10:55:06 37: node-sw-[ ,032,034]
Li1RuO3-r3 hc722 free batch R K T10:57:09 39: node-sw-[ ,100,104]
VASP jf298 free batch R T10:57:57 40: node-sw-[058,061,155,160]
ompam_50_2 ide20 free batch-all R K T11:58:26 1:41: node-as-ngpu
ReederMC3 cgk26 free batch-all R K T12:32:29 2:15: node-as-phi
b120mc1 cgk26 free batch-all R K T12:34:32 2:17: node-sw
ReederMC6 cgk26 free batch-all R K T12:35:03 2:17: node-sw-125

Useful filters:
--name [jobname]           Filter based on job name
--partition [partition]    List jobs in a specific partition
--user [userid]            List jobs of a specific user in the queue
--job [jobid]              Status of a single job in the queue

16 Essential SLURM commands: sinfo # Exercise 5
[balena-01 ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
batch* up infinite 160 alloc node-sw-[ ]
batch-acc up infinite 21 alloc node-as-agpu-001,node-as-ngpu-[ ],node-as-phi-[ ]
batch-all up infinite 181 alloc node-as-agpu-001,node-as-ngpu-[ ],node-as-phi-[ ]
batch-512gb up infinite 2 idle node-sw-fat-[ ]
batch-64gb up infinite 80 alloc node-sw-[ ]
batch-128gb up infinite 80 alloc node-sw-[ ]
batch-devel up infinite 4 idle node-sw-[ ]
teaching up infinite 4 idle node-sw-[ ]
batch-micnative up infinite 4 idle node-as-phi-003-mic[0-3]

[balena-02 ~]$ sinfo -Nel --partition batch-acc --Format=nodelist,features,gres
Mon Aug 22 17:18:
NODELIST AVAIL_FEATURES GRES
node-as-agpu-001 s10k (null)
node-as-ngpu-001 k20x gpu:4
node-as-ngpu-002 k20x gpu:4
node-as-ngpu-003 k20x gpu:4
node-as-ngpu-004 k20x gpu:4
node-as-ngpu-005 k20x gpu:1
node-as-ngpu-006 k20x gpu:1
node-as-phi p,michost mic:4
node-as-phi p,michost mic:4
node-dw-ngpu-001 k20x gpu:1
node-dw-ngpu-002 k20x gpu:1
node-dw-ngpu-003 k20x gpu:1
node-dw-ngpu-004 k20x gpu:1
node-dw-phi p,michost mic:1
node-dw-phi p,michost mic:1
node-dw-phi p,michost mic:1
node-dw-phi p,michost mic:1

17 # Exercise 5 Essential SLURM commands: scontrol
[balena-01 ~]$ scontrol show job
JobId= Name=MAPbI3
UserId=jms70( ) GroupId=balena_ch(10307)
Priority= Nice=0 Account=free QOS=free
JobState=RUNNING Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 ExitCode=0:0
RunTime=00:47:01 TimeLimit=06:00:00 TimeMin=N/A
SubmitTime= T11:17:05 EligibleTime= T11:17:05
StartTime= T12:19:34 EndTime= T18:19:35
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=batch AllocNode:Sid=balena-01:
ReqNodeList=(null) ExcNodeList=(null)
NodeList=node-sw-111
BatchHost=node-sw-111
NumNodes=1 NumCPUs=16 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
Socks/Node=* NtasksPerN:B:S:C=16:0:*:* CoreSpec=0
MinCPUsNode=16 MinMemoryNode=62G MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/beegfs/scratch/user/e/jms70/MAPbI3/Cubic-Phono3py/Phono3py-16x16x16_ job
WorkDir=/beegfs/scratch/user/e/jms70/MAPbI3/Cubic-Phono3py
StdErr=/beegfs/scratch/user/e/jms70/MAPbI3/Cubic-Phono3py/StdErr.e.%j
StdIn=/dev/null
StdOut=/beegfs/scratch/user/e/jms70/MAPbI3/Cubic-Phono3py/StdOut.o

18 # Exercise 5 Essential SLURM commands: scontrol
$ scontrol show node node-sw-100
NodeName=node-sw-100 Arch=x86_64 CoresPerSocket=8
CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=16.03 Features=(null) Gres=(null)
NodeAddr=node-sw-100 NodeHostName=node-sw-100 Version=
OS=Linux RealMemory=64498 AllocMem=63488 Sockets=2 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=2015 Weight=1
BootTime= T14:52:50 SlurmdStartTime= T15:03:07
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

$ scontrol show node node-dw-ngpu-001
NodeName=node-dw-ngpu-001 Arch=x86_64 CoresPerSocket=8
CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=15.21 Features=k20x Gres=gpu:1
NodeAddr=node-dw-ngpu-001 NodeHostName=node-dw-ngpu-001 Version=
OS=Linux RealMemory= AllocMem=60000 Sockets=2 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=2015 Weight=1
BootTime= T14:50:49 SlurmdStartTime= T14:54:05
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

19 # Exercise 5 Essential SLURM commands: sshare
Lists the accounts to which a user has access
[balena-02 ~]$ sshare
Account User RawShares NormShares RawUsage EffectvUsage FairShare
free test-rtm

20 Topology
2:1 blocking fabric: 24 nodes per QDR switch and 12 uplinks to the core switches
All nodes on switch 1 talking to all nodes on switch 2 will effectively communicate at DDR speed, which is ~1.6GB/sec
24 nodes on a single switch communicate at full QDR speed
SLURM is configured to understand the InfiniBand topology
Request that your job only starts on a single switch with --switches=1 (see the sketch below)
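A minimal sketch of requesting single-switch placement in a job script; everything other than the --switches directive is reused from the example job scripts in this training, and testjob/my_mpi_app are placeholders:

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=16
#SBATCH --switches=1             # keep all 4 nodes on one InfiniBand switch for full QDR bandwidth
#SBATCH --time=01:00:00

module purge
module load slurm
module load intel/mpi

mpirun ./my_mpi_app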

21 Managing workloads
Creating job-scripts for SLURM
Requesting specific features: GPU (K20x and S10K) or Xeon Phi nodes, specific nodes, partitions
Managing workloads: submitting, cancelling
Running interactive sessions

22 Example hybrid code in C - hello_world.c

#include <stdio.h>
#include "mpi.h"
#include "omp.h"

int main(int argc, char *argv[]) {
    int numprocs, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int iam = 0, np = 1;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(processor_name, &namelen);

    /* Each MPI process opens an OpenMP parallel region */
    #pragma omp parallel default(shared) private(iam, np)
    {
        np = omp_get_num_threads();
        iam = omp_get_thread_num();
        printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
               iam, np, rank+1, numprocs, processor_name);
    }

    MPI_Finalize();
}

23 Compiling the example code # Exercise 6
Load the necessary modules. For our example, let's load:
Intel Compiler suite: module load intel/compiler
Intel MPI library: module load intel/mpi

$ module load intel/compiler/64/
$ module load intel/mpi/64/

Compile hybrid (MPI + OpenMP) code with: mpiicc -qopenmp source.c -o output

$ mpiicc -qopenmp hello_world.c -o hello_world
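For comparison, a hedged sketch of the same build with the GNU toolchain, assuming the gcc and openmpi/gcc modules listed earlier provide an mpicc wrapper (check module avail for the exact version strings):

$ module purge
$ module load gcc/4.8.2
$ module load openmpi/gcc/1.8.4
$ mpicc -fopenmp hello_world.c -o hello_world    # GCC uses -fopenmp rather than -qopenmp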

24 Anatomy of a job-script # Exercise 6
hash-bang: tells Linux which interpreter to use
SLURM directives
Job environment
Instructions to run your application

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --time=00:05:00
#SBATCH --error=hello-%j.err
#SBATCH --output=hello-%j.out

module purge
module load slurm
module load intel/compiler/64/
module load intel/mpi/64/

mpirun ./hello_world

25 Job-script: Submitting the job # Exercise 6
Submit the job to the queue:
[balena-02 ~]$ cd $SCRATCH/training/hello_world
[balena-02 ~]$ sbatch hello_world.slurm
Submitted batch job

View the job in the queue:
[balena-02 ~]$ squeue --job

View the job in the SLURM accounting log:
[balena-02 ~]$ sacct --job

If you want to cancel a job that is pending/running:
[balena-02 ~]$ scancel
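A minimal sketch of pulling a more readable accounting summary once the job has finished; the --format field list, the date, and <jobid> are illustrative placeholders:

$ sacct --job <jobid> --format=JobID,JobName,Partition,State,Elapsed,MaxRSS
$ sacct --user $USER --starttime 2016-08-01      # all of your jobs since a given date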

26 Job-script: Running the example code in OpenMP mode # Exercise 6

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --time=00:05:00
#SBATCH --error=hello_1-%j.err
#SBATCH --output=hello_1-%j.out

module purge
module load slurm
module load intel/compiler/64/
module load intel/mpi/64/

mpirun ./hello_world

Submission with: 1 MPI task, 16 OpenMP threads per task

Hello from thread 12 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 13 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 0 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 14 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 6 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 9 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 2 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 10 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 1 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 4 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 15 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 8 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 5 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 11 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 3 out of 16 from process 1 out of 1 on node-sw-165
Hello from thread 7 out of 16 from process 1 out of 1 on node-sw-165

27 Job-script: Running the example code in MPI mode # Exercise 6

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:05:00
#SBATCH --error=hello_2-%j.err
#SBATCH --output=hello_2-%j.out

module purge
module load slurm
module load intel/compiler/64/
module load intel/mpi/64/

mpirun ./hello_world

Submission with: 32 MPI tasks, 1 OpenMP thread per task (default)

Hello from thread 1 out of 1 from process 1 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 2 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 3 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 4 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 7 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 17 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 5 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 18 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 6 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 19 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 8 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 20 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 9 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 21 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 10 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 22 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 11 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 12 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 13 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 14 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 15 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 16 out of 32 on node-sw-166
Hello from thread 1 out of 1 from process 23 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 24 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 25 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 26 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 27 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 28 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 29 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 32 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 30 out of 32 on node-sw-167
Hello from thread 1 out of 1 from process 31 out of 32 on node-sw-167

28 Job-script: Running the example code in Hybrid mode # Exercise 6

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch-devel
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --time=00:05:00
#SBATCH --error=hello_3-%j.err
#SBATCH --output=hello_3-%j.out

module purge
module load slurm
module load intel/compiler/64/
module load intel/mpi/64/

mpirun ./hello_world

Submission with: 4 MPI tasks, 8 OpenMP threads per task

Hello from thread 5 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 2 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 4 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 1 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 6 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 3 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 1 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 5 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 8 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 7 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 8 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 4 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 1 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 5 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 2 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 2 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 3 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 7 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 8 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 4 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 7 out of 8 from process 1 out of 4 on node-sw-165
Hello from thread 6 out of 8 from process 2 out of 4 on node-sw-165
Hello from thread 3 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 8 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 4 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 6 out of 8 from process 4 out of 4 on node-sw-166
Hello from thread 5 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 1 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 6 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 7 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 2 out of 8 from process 3 out of 4 on node-sw-166
Hello from thread 3 out of 8 from process 3 out of 4 on node-sw-166

29 Job-script: SLURM directives

SLURM directive prefix: #SBATCH

Job specification          SLURM directive
Partition/Queue            --partition=[partition]
Job name                   --job-name=[name]
Wall clock limit           --time=[min] or [days-hh:mm:ss]
Node count                 --nodes=[no of nodes]
CPU count                  --ntasks=[count]
Email address              --mail-user=[address]
Event notification         --mail-type=[events] e.g. BEGIN, END, FAIL, ALL
Node features              --constraint=[feature] e.g. k20x, 5110p
Generic resources          --gres=[resource] e.g. gpu:4 or mic:2
Working directory          --workdir=[full_path_of_dir]
Licenses                   --licenses=[license_name:count]
Job arrays                 --array=[array_spec]
Job restart                --requeue OR --no-requeue
Standard output file       --output=[file_name]
Standard error file        --error=[file_name]

Example:
#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --error=slurm-%j.err
module load intel/mpi
mpirun -np $SLURM_NTASKS ./my_mpi_app

The filename pattern may contain one or more replacement symbols (a percent sign "%" followed by a letter, e.g. %j):
%A   Job array's master job allocation number
%a   Job array ID (index) number
%j   Job allocation number
%N   Node name (name of the first node in the job)
%u   User name
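A hedged sketch combining several of these directives in one script - a 10-element job array with email notification; the email address, array range, and input naming scheme are illustrative placeholders:

#!/bin/bash
#SBATCH --job-name=array-example
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --time=01:00:00
#SBATCH --array=1-10
#SBATCH --mail-user=user@bath.ac.uk
#SBATCH --mail-type=END,FAIL
#SBATCH --output=array-%A_%a.out

module purge
module load slurm

# Each array element processes its own input file, selected by the array index
./my_app input_${SLURM_ARRAY_TASK_ID}.dat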

30 Requesting specific resources on Balena

Different partitions:
--partition=batch           64GB and 128GB nodes
--partition=batch-acc       Accelerator nodes (GPUs and Xeon Phis)
--partition=batch-64gb      64GB nodes
--partition=batch-128gb     128GB nodes
--partition=batch-512gb     512GB nodes
--partition=batch-all       All the nodes except the 512GB nodes

Specific accelerators:
--gres=gpu:2                Nodes with 2 GPUs
--gres=mic:1                Nodes with 1 MIC

Specific features:
--constraint=k20x           Nodes with K20x GPUs
--constraint=5110p          Nodes with Intel Xeon Phi 5110p

Example:
#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch-acc
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --constraint=k20x
#SBATCH --error=slurm-%j.err
module load intel/mpi
mpirun -np $SLURM_NTASKS ./my_mpi_app
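By analogy, a hedged sketch of requesting a Xeon Phi (MIC) host node instead of a GPU node, reusing the directives above; my_offload_app is a placeholder for an application built with Xeon Phi support:

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch-acc
#SBATCH --nodes=1
#SBATCH --gres=mic:1            # one Xeon Phi card
#SBATCH --constraint=5110p      # land on a 5110p host node
#SBATCH --error=slurm-%j.err

module load intel/compiler
./my_offload_app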

31 Job Environment
The SLURM controller will set the following variables in the environment of the batch script.

OUTPUT environment variable    Description
$SLURM_JOB_ID                  The unique jobid for a job
$SLURM_JOB_NAME                Name of the job (--job-name)
$SLURM_JOB_NODELIST            Nodes allocated to the job
$SLURM_NTASKS                  Number of processes started for a job
$SLURM_ARRAY_JOB_ID            Job array's master job ID number
$SLURM_ARRAY_TASK_ID           Job array ID (index) number
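A minimal sketch of using these variables inside a job script, e.g. to keep each run's files in a per-job directory on $SCRATCH (the directory layout and my_mpi_app are only illustrative):

#!/bin/bash
#SBATCH --job-name=testjob
#SBATCH --account=free
#SBATCH --partition=batch
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:30:00

module purge
module load slurm
module load intel/mpi

# Record what was allocated and run from a per-job directory
echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) running on: $SLURM_JOB_NODELIST"
WORKDIR=$SCRATCH/runs/$SLURM_JOB_ID
mkdir -p $WORKDIR
cd $WORKDIR
mpirun -np $SLURM_NTASKS ~/my_mpi_app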

32 Interactive Testing and Development: sinteractive # Exercise 7
By default submitted to the ITD partition using the free account
SHARED resources (CPU, MEM, GPU, MIC) among other users on the node
Each user is limited to one interactive job on the ITD partition
For an EXCLUSIVE interactive session, request specific resources:
$ sinteractive --time=00:20:00 --gres=gpu:1
$ sinteractive --time=00:20:00 --gres=gpu:4 --partition=batch-acc

[user123@balena-01 ~]$ sinfo --partition itd
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
itd up infinite 2 mix itd-ngpu-[01-02]
itd up infinite 2 idle itd-phi-[01-02]
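A hedged sketch of a typical interactive workflow, assuming sinteractive drops you into a shell on the allocated ITD node (the node name in the prompt, the module version, and the build/run commands are illustrative):

$ sinteractive --time=00:20:00 --gres=gpu:1      # request a shared interactive session with one GPU
[user123@itd-ngpu-01 ~]$ module load cuda/toolkit
[user123@itd-ngpu-01 ~]$ nvcc my_kernel.cu -o my_kernel
[user123@itd-ngpu-01 ~]$ ./my_kernel             # test interactively before submitting batch jobs
[user123@itd-ngpu-01 ~]$ exit                    # give the resources back when done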

33 Monitoring workloads
Monitoring workloads: CPU usage, memory usage
Profiling workloads: Allinea perf-report and MAP, Intel profilers
Debugging issues with submission scripts

34 Monitoring workloads: top # Exercise 8
$ top

35 Monitoring workloads: htop # Exercise 8
$ module load htop
$ htop
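These tools are most useful on the node where the job is actually running. A hedged sketch, assuming SSH to a node on which you have a running job is permitted on Balena (if not, use the interactive nodes instead); <jobid> and the node name are placeholders:

$ squeue --job <jobid>         # note the entry in the NODELIST column, e.g. node-sw-100
$ ssh node-sw-100              # hop onto the compute node running your job
[node-sw-100 ~]$ top           # watch CPU and memory usage of your processes
[node-sw-100 ~]$ exit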

36 Monitoring workloads: perfquery # Exercise 8
$ watch "perfquery -C qib0 -r"

Every 2.0s: perfquery -C qib0 -r
# Port counters: Lid 211 port 1 (CapMask: 0x200)
PortSelect:...1
CounterSelect:...0x0000
SymbolErrorCounter:...0
LinkErrorRecoveryCounter:...0
LinkDownedCounter:...0
PortRcvErrors:...0
PortRcvRemotePhysicalErrors:...0
PortRcvSwitchRelayErrors:...0
PortXmitDiscards:...0
PortXmitConstraintErrors:...0
PortRcvConstraintErrors:...0
CounterSelect2:...0x00
LocalLinkIntegrityErrors:...0
ExcessiveBufferOverrunErrors:...0
VL15Dropped:...0
PortXmitData:
PortRcvData:
PortXmitPkts:
PortRcvPkts:

37 Allinea Performance Reports example # Exercise 9

$ mpiicc mpi_pi_reduce.c -o mpi_pi

#!/bin/bash -l
#SBATCH --job-name=testjob
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --time=00:05:00
#SBATCH --error=slurm-%j.err
#SBATCH --partition=batch

module purge
module load slurm
module load intel/compiler/64/
module load intel/mpi/64/
module load allinea/reports/6.0.2

perf-report mpirun ./mpi_pi

HTML output of the performance report:
Command:     mpirun ./mpi_pi
Resources:   2 nodes (16 physical, 16 logical cores per node)
Memory:      63 GB per node
Tasks:       4 processes
Machine:     node-sw-165
Started on:  Fri Jul 29 11:36:
Total time:  1 second (0 minutes)
Executable:  mpi_pi
Full path:   /beegfs/scratch/user/q/rtm25/training
Input file:
Notes:

Summary: mpi_pi is Compute-bound in this configuration
Compute: 92.9% ========
MPI:      7.1%
I/O:      0.0%
This application run was Compute-bound. A breakdown of this time and advice for investigating further is found in the CPU section below. As very little time is spent in MPI calls, this code may also benefit from running at larger scales.

CPU: A breakdown of the 92.9% total compute time:
Scalar numeric ops:   0.0%
Vector numeric ops:   0.0%
Memory accesses:    100.0% =========
The per-core performance is memory-bound. Use a profiler to identify time-consuming loops and check their cache performance. No time is spent in vectorized instructions. Check the compiler's

38 Scaling

39 Debugging and Profiling
Valgrind:           module load valgrind
Allinea:            module load allinea/ddt-map
                    module load allinea/reports
Intel:              module load intel/itac
Nvidia Profiler:    module load cuda/profiler
GDB:                available on all the nodes by default
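A hedged sketch of how these tools are typically invoked once the corresponding module is loaded (the executable names are placeholders; check each tool's documentation for the exact options):

$ module load valgrind
$ valgrind --leak-check=full ./my_app            # memory-error and leak checking for a serial run
$ module load allinea/ddt-map
$ map --profile mpirun -np 16 ./my_mpi_app       # collect an Allinea MAP profile of an MPI run
$ gdb ./my_app                                   # interactive debugging with GDB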

40 Visualisation
A brief on the technology behind the visualisation nodes (GPU virtualisation)
Multiple use-cases on how our researchers can exploit this technology
Setting the expectations/limits on the capability of our visualisation nodes
Interacting with the visualisation nodes
Example jobs
Sharing a visualisation session with another user

41 Technology overview and benefits
Components: VirtualGL, TurboVNC, web portal
Benefits:
Access to the BeeGFS parallel file system ($SCRATCH) at full speed
Usable from a low-end laptop/tablet

42 Access the Visualisation service
Balena portal: balena.bath.ac.uk
What do you need?
A low-latency connection with reasonable bandwidth to the cluster login nodes:
1MB/sec downstream for low-mid quality JPEG compression (perfectly usable from UK and European ADSL links and over wifi)
10MB/sec downstream (100Mbit) for a high-quality stream (within the university campus with wired network connections)
A web browser with either:
A working Java runtime
A TurboVNC client

43 # Exercise 10
(Balena portal job submission form: sharing options "no shared" / "viewonly", hold down CTRL to select multiple users, --account=prj-tst123, then Submit Job)

44 # Exercise 10
Let's try with a local VNC client (download a client from the HOME tab if you do not have one on the system already), then Connect.

45 # Exercise 10

46 # Exercise 10

47 Finding Help
Balena wiki
HPC Support

48 Objectives of this training
Login to Balena
Understand what storage is available on Balena
Find the software you want to use
Create your own jobscript and be familiar with the options
Submit and manage workloads on Balena
Use interactive nodes for Test and Development
Use visualisation tools
Know where to find more information and how to ask for help

49 Thank you
