Using the MaRC2 HPC Cluster
René Sitt / Manuel Haim, 09/2016
Get access rights and permissions
- A Students / Staff account is needed.
- Ask your workgroup leader whether MaRC2 is already being used:
  - he/she must accept the terms of use
  - he/she nominates people who may tell us to grant you access
- Ask the nominated people at your workgroup to get access.
- For further questions, terms of use etc., send us an email: marc@hrz.uni-marburg.de
Starting a terminal session (Linux, Mac)
- Start your favourite terminal application.
- Connect to MaRC2 via SSH (Secure Shell) by entering:
    ssh -p223 username@marc2.hrz.uni-marburg.de
  - replace username with your Students / Staff account name
  - remember that your account name is case-sensitive!
- You are then prompted for your account's password:
    username@marc2.hrz.uni-marburg.de's password:
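To avoid retyping the port and host every time, you can add an alias to your SSH client configuration. This is a sketch: the alias name marc2 and the username are placeholders, adjust them to your own account.

```shell
# Append a host alias to ~/.ssh/config so that a plain "ssh marc2"
# expands to "ssh -p223 username@marc2.hrz.uni-marburg.de".
mkdir -p ~/.ssh
cat >> ~/.ssh/config <<'EOF'
Host marc2
    HostName marc2.hrz.uni-marburg.de
    Port 223
    User username
EOF
chmod 600 ~/.ssh/config
```

Afterwards, ssh marc2 (and also scp/sftp with the same alias) picks up host, port, and user automatically.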
Starting a terminal session (Linux): Example (screenshot)
Starting a terminal session (Windows)
- Start an SSH client, e.g. PuTTY (www.putty.org).
- Select SSH, then insert host name and port number.
- Finally, click Open.
- A terminal window opens where you will be prompted for username (case-sensitive!) and password.
Generating SSH keys (Linux/Mac)
- Needed if you want to access compute nodes directly.
- Connect to MaRC2 and use ssh-keygen:
    [username@marc2-h1 ~]$ ssh-keygen
- The resulting key pair with default names id_rsa and id_rsa.pub will be created in the /home/username/.ssh folder.
- If wanted, put in a passphrase to protect the key (otherwise just leave it empty).
- Append your public key to the authorized_keys file:
    [username@marc2-h1 ~]$ cd .ssh
    [username@marc2-h1 .ssh]$ cat id_rsa.pub >> authorized_keys
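The same steps can be done non-interactively, e.g. from a setup script. A sketch, assuming an empty passphrase; the key file name marc2_key is an arbitrary choice, not a cluster convention:

```shell
# Generate an RSA key pair with an empty passphrase (-N '') and a
# chosen file name (-f), then authorize the public key so that
# password-less logins between nodes work.
mkdir -p ~/.ssh
ssh-keygen -q -t rsa -N '' -f ~/.ssh/marc2_key
cat ~/.ssh/marc2_key.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

Note that sshd is strict about permissions: authorized_keys (and ~/.ssh itself) must not be writable by group or others, hence the chmod.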
Generating SSH keys (Linux/Mac) (2)
- You should now be able to connect to MaRC2 compute nodes:
    [username@marc2-h1 ~]$ ssh node001
    The authenticity of host 'node001 (<ip address>)' can't be established.
    RSA key fingerprint is <rsa fingerprint>.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'node001,<ip address>' (RSA) to the list of known hosts.
    [username@node001 ~]$
- Don't use this to start calculations directly on a node! (see below how to submit jobs to the task scheduler)
Some hints
- To exit the terminal session, just enter exit or press Ctrl+D.
- By default, your login shell is set to /bin/bash. You may change this setting globally (and choose another login shell like csh, ksh or tcsh) for your account here:
  - Staff account: https://admin.staff.uni-marburg.de/aendshell.html
  - Students account: https://admin.students.uni-marburg.de/login_shell.html
Introduction to the BASH Shell (Bourne-again shell)
- Right after login, the BASH shell is started by default and shows a command prompt like this:
    [username@marc2-h1 ~]$
- The prompt shows your username, the server name, the current directory name (~ for home directory, / for root directory), and the input cursor ($ for a normal user, # for the superuser root).
- To exit, simply enter the command exit or press Ctrl+D.
First steps with BASH-builtin commands
- pwd print current (working) directory:
    [username@marc2-h1 ~]$ pwd
    /home/username
- cd change current directory:
  - cd ..          go to parent directory
  - cd <dirname>   go to subdirectory
    [username@marc2-h1 ~]$ cd ..
    [username@marc2-h1 home]$ pwd
    /home
    [username@marc2-h1 home]$ cd username
    [username@marc2-h1 ~]$ pwd
    /home/username
Additional commands
- ls list directory contents:
    [username@marc2-h1 ~]$ ls
    mydata  readme  script.sh
- ls -l list contents in long format:
    [username@marc2-h1 ~]$ ls -l
    drwxrwxr-x 2 username username 4096 18. Jun 11:12 mydata
    -rw-rw-r-- 1 username username  915 18. Jun 11:37 readme
    -rwxr-xr-x 1 username username   26 18. Jun 11:35 script.sh
- The columns show: file type (d=directory) and permissions (see later), link count, user, group, file size (bytes), date of last change, and file name.
Additional commands (2)
- ls -a show all files (even hidden ones):
    [username@marc2-h1 ~]$ ls -a
    .   .bash_history  .bash_profile  mydata  script.sh
    ..  .bash_logout   .bashrc        readme
  (files starting with . are usually not shown)
Additional commands (3)
- mkdir <dirname> make subdirectory:
    [username@marc2-h1 ~]$ mkdir temp
    [username@marc2-h1 ~]$ cd temp
    [username@marc2-h1 temp]$ pwd
    /home/username/temp
- rmdir <dirname> remove directory (must be empty):
    [username@marc2-h1 temp]$ cd ..
    [username@marc2-h1 ~]$ rmdir temp
- rm <filename>    remove (delete) file
- rm -R <dirname>  remove (delete) directory recursively!
- rm -Rf <dirname> just as above, but force, don't ask ;-)
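The directory commands above can be combined into one short session; everything here runs in any Bash shell:

```shell
mkdir -p temp/sub    # create a directory and a subdirectory in one go
cd temp
pwd                  # prints the absolute path ending in /temp
rmdir sub            # works, because sub is empty
cd ..
rm -Rf temp          # removes temp and anything still inside it
```

Be careful with rm -Rf: it deletes without asking, and on a shared filesystem there is no trash bin to recover from.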
Available directories on a typical Linux system
- The Unix/Linux filesystem is organized in a tree-like structure.
- The top-most directory / is called the root directory.
- All filesystem entries (including mounted filesystems) are branches of this root:
    /       root directory
    /bin    executable binaries
    /boot   boot files
    /dev    devices
    /etc    UNIX system configuration files
    /home   home directories
    /lib    libraries
    /usr    UNIX system resources
    ...
- /home/username is your personal home directory, abbreviated ~
Getting help
- man <command> show man(ual) page for a command:
    [username@marc2-h1 ~]$ man ls
  - Press Space for the next page, b for the previous page; arrow keys and PgUp/PgDn will also work nowadays.
  - Press q to quit.
- <command> --help or <command> -h show help (not supported by all commands)
Change file owner and permissions
- This is what ls -l shows:
    -rwxr-xr-x 1 username username 26 18. Jun 11:35 script.sh
  - the first character is the file type, followed by the permissions for user, group, and other
  - r = read, w = write, x = execute
- chmod <permissions> <file> change permissions, example:
    chmod u=rwx,g=rx,o=rx script.sh
- chown <user>:<group> <file> change owner and group
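A quick way to see the effect of chmod is to create an empty file and inspect its mode string afterwards (this uses GNU stat, which is available on Linux systems like MaRC2):

```shell
touch script.sh                     # create an empty file
chmod u=rwx,g=rx,o=rx script.sh     # same as the numeric form: chmod 755
stat -c '%A' script.sh              # prints the mode string: -rwxr-xr-x
```

The symbolic form (u=, g=, o=) sets the bits exactly; to add or remove single bits without touching the rest, use + and -, e.g. chmod g+w script.sh.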
Some tips and tricks
- Select text to copy, then paste with the middle mouse button (PuTTY: use the right mouse button).
- Tab auto-completes your input.
- Shift+PgUp/PgDn scrolls the screen buffer up and down (the maximum scrollback buffer size can be set in your terminal options).
- ArrowUp and ArrowDown let you navigate through your former input.
- Ctrl+R allows you to reverse-i-search your former input.
- Edit your .bashrc and .bash_profile for personal settings.
Using the Environment Modules package
Using the Environment Modules package: why?
- Each piece of software typically needs a set of environment variables to be set in order to run properly, e.g. the $PATH variable which points to the bin directories.
- On an HPC cluster like MaRC2, there may be many different toolsets and toolset versions with the same purpose and/or the same command names: conflict!
- With the Environment Modules package, you can dynamically modify your environment by loading so-called modulefiles.
- Use our predefined modulefiles, no need to create your own :-)
Using the Environment Modules package: how?
- module avail show available modules:
    [haimm@marc2-h1 ~]$ module avail
    ------------------- /usr/share/modules/modulefiles -------------------
    dot  module-cvs  module-info  modules  null  use.own
    ---------------------- /usr/share/moduleslocal -----------------------
    acml/gfortran-5.1.0(default)   openmpi/pgi/1.4.3-qlc
    acml/gfortran-5.2.0            openmpi-1.6.3/gcc/1.6.3
    acml/ifort-5.1.0               openmpi-1.6.3/icc/1.6.3
    acml/pgi-5.1.0                 openmpi-1.6.3/pgi/1.6.3
    gcc/4.4.6                      parastation/mpi2-gcc-5.0.27-1(default)
    gcc/4.6.2(default)             parastation/mpi2-gcc-mt-5.0.27-1
    gcc/4.7.2                      parastation/mpi2-intel-5.0.27-1
    intel/intelpsxe2011sp1-32      parastation/mpi2-intel-mt-5.0.27-1
    intel/intelpsxe2011sp1-64      parastation/mpi2-pgi-5.0.27-1
    intel/intelpsxe2013sp1-64      parastation/mpi2-pgi-mt-5.0.27-1
    openmpi/gcc/1.4.3-qlc          pgi/12.2(default)
    openmpi/intel/1.4.3-qlc        pgi/13.2
  (acml = AMD Core Math Library, pgi = Portland Group Inc. Compiler)
- module load <modulefile> load module. You do not need to specify the full path in order to load the default, e.g.:
    [haimm@marc2-h1 ~]$ module load gcc
Using the Environment Modules package: how? (2)
- module list show loaded modules:
    [haimm@marc2-h1 ~]$ module list
    Currently Loaded Modulefiles:
      1) acml/gfortran-5.1.0   2) gcc/4.6.2   3) parastation/mpi2-gcc-5.0.27-1
- module unload <modulefile> unload module. Again, you do not need to specify the full path:
    [haimm@marc2-h1 ~]$ module unload gcc
- module purge unload all loaded modules
Using the Environment Modules package: how? (3)
- module switch <module1> <module2> unload module1 and load module2 (works for conflicting modules!):
    [haimm@marc2-h1 ~]$ module switch gcc pgi
- More information: man module
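Under the hood, a modulefile mostly just adjusts environment variables in your current shell. The sketch below mimics what loading a compiler module effectively does; the path /opt/gcc/4.6.2 is purely illustrative and not the real MaRC2 install location:

```shell
# "module load gcc/4.6.2" roughly amounts to prepending the tool's
# directories to the search paths of the current shell:
export PATH="/opt/gcc/4.6.2/bin:$PATH"
export LD_LIBRARY_PATH="/opt/gcc/4.6.2/lib64:${LD_LIBRARY_PATH:-}"
# "module unload gcc/4.6.2" removes exactly these entries again.
echo "${PATH%%:*}"    # first PATH entry is now /opt/gcc/4.6.2/bin
```

This also explains why conflicting modules are a problem: two versions of gcc in $PATH would shadow each other, which is what module switch avoids.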
Compiling your programs
Compiling your programs: choose a compiler
- Several compilers are available:
    gcc        GNU C Compiler
    gfortran   GNU Fortran Compiler
    icc        Intel C Compiler
    ifort      Intel Fortran Compiler
    pgcc       Portland Group Inc. (PGI) C Compiler
    pgfortran  Portland Group Inc. (PGI) Fortran Compiler
- MPI compilers (OpenMPI or Parastation MPI, with GNU/Intel/PGI backends):
    mpicc      MPI C Compiler
    mpicxx     MPI C++ Compiler
    mpif77     MPI Fortran77 Compiler
A simple C program
- Create a file hello-world.c with the following content:
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        printf("Hello World!\n");
        exit(0);
    }
- Compile using gcc (remember to have the module loaded first!):
    [haimm@marc2-h1 ~]$ gcc hello-world.c -o hello-world
- Then test locally (don't do this with huge complex programs!):
    [haimm@marc2-h1 ~]$ ./hello-world
    Hello World!
We could also use a Makefile
- Create a file Makefile with the following content (note: each recipe line must be indented with a tab, not spaces!):
    PROG=hello-world
    SRC=hello-world.c
    CC=gcc

    all: clean build

    clean:
    	rm -f $(PROG)

    build:
    	$(CC) $(SRC) -o $(PROG)

    run: all
    	./$(PROG)
- Now just enter make or make run to compile/run your code.
Submitting jobs to the SUN Grid Engine (SGE)
SUN Grid Engine???
- The SUN Grid Engine (SGE) implements a queueing system.
- Users can submit jobs to the SGE queues and define boundary conditions (like runtime, memory etc.).
- The SGE permanently reviews the available resources, evaluates pending jobs and executes them on the compute nodes.
- The SGE is open-source software with a free-to-use license; its successor, the Oracle Grid Engine, is only available commercially.
How to submit a new job to the SGE
- First, write a start script for your software (e.g. hello-world.sh):
    #!/bin/bash
    #$ -S /bin/bash
    #$ -e ./stderr-file
    #$ -o ./stdout-file
    #$ -l h_rt=10
    echo "hello-world running on host $(hostname)"
    ./hello-world
- Then submit your start script (-cwd: execute from current working dir):
    $ qsub -cwd hello-world.sh
    Your job 144406 ("hello-world.sh") has been submitted
How to submit a new job to the SGE (2)
- Important: requesting h_rt (max. job runtime in seconds) is mandatory.
  - h_rt limits: 259200 (3 days) in the standard queues, 864000 (10 days) in the long queues.
- RAM requests are counted per core, so h_vmem=4g with 4 cores reserves 16G.
  - When using a full node, h_vmem>4g is impossible: 4G * 64 cores = 256G = node maximum.
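Because h_vmem is counted per core, the total reservation is h_vmem times the number of reserved slots. A one-line sanity check you can run before submitting:

```shell
# Per-core memory request times reserved cores = total reservation.
cores=4
h_vmem_gb=4
total=$(( cores * h_vmem_gb ))
echo "${total}G total reserved"
# With 64 cores and 4G each the total is 256G, i.e. the node maximum;
# asking for more per core on a full node cannot be scheduled.
```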
How to submit a new job to the SGE (3)
- To use the module environment, include
    . /etc/profile.d/modules.sh
  in your script before any module commands.
- If the script file was written on Windows, use
    dos2unix <yourscript>
  before submitting to enforce UNIX text formatting.
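Why Windows line endings break job scripts, and a fallback if dos2unix happens to be unavailable: the invisible carriage returns (\r) become part of commands and of the shebang line. Stripping them with tr achieves the same as dos2unix; a self-contained sketch:

```shell
# Create a script with Windows (CRLF) line endings, then convert it.
printf '#!/bin/bash\r\necho hello\r\n' > job-win.sh
tr -d '\r' < job-win.sh > job-unix.sh   # same effect as: dos2unix job-win.sh
chmod +x job-unix.sh
./job-unix.sh    # prints: hello
```

Running job-win.sh directly would fail with errors like "/bin/bash^M: bad interpreter", which is the typical symptom of a Windows-edited script.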
How to submit a new job to the SGE (4)
- For parallel applications, choose a parallel environment and the number of CPUs with -pe <parallel_environment> <# of cores>:
    -pe smp* 32
  reserves 32 cores on one node (smp = symmetric multiprocessing).
- For more than 64 cores, preferably use 'orte_sl64*'.
- Other PEs exist mostly to achieve specific process distributions.
How to submit a new job to the SGE (5)
- With many little calculations that can be indexed, consider using a task array: -t n[-m[:s]]
    -t n      n = # of tasks, IDs 1-n
    -t n-m    range of task indices, IDs n-m
    -t n-m:s  range of task indices n-m, stepsize s
- Example: -t 2-10:2 spawns jobs with SGE_TASK_ID = 2, 4, 6, 8, 10
- In the job script, get the index with ${SGE_TASK_ID}
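What the scheduler does with -t 2-10:2 can be simulated locally: every task runs the same script and differs only in its ${SGE_TASK_ID}. The input.N file names below are hypothetical, just to show the usual pattern of indexing input data by task ID.

```shell
# Locally simulate a task array submitted with: qsub -t 2-10:2 ...
# On the cluster, SGE sets SGE_TASK_ID itself, one value per task,
# and runs the tasks independently (possibly on different nodes).
for SGE_TASK_ID in $(seq 2 2 10); do
    echo "task ${SGE_TASK_ID} would process input.${SGE_TASK_ID}"
done
```

Inside a real job script you would use ${SGE_TASK_ID} directly, without the loop; the loop only exists here to show all five values at once.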
How to submit a new job to the SGE (6)
- Recommended: submit the job as 'binary' when using task arrays.
- Prerequisites: make the script executable, don't move or edit it as long as the job is running, and give all parameters on the command line:
    chmod ugo+x <jobscript>
    qsub -b y <parameters> <jobscript>
- Alternatively, leave the parameters in the script but use the 'arsub' wrapper (the script still needs to be executable!):
    module load tools/python-3.5 tools/arsub
    arsub.py -t <from-to:step> <jobscript>
Get information on your submitted jobs
- qstat list all your jobs:
    $ qstat
    job-id  prior    name      user   state  submit/start at      queue  slots  ja-task-id
    --------------------------------------------------------------------------------------
    144407  0.00000  start.sh  haimm  qw     06/19/2013 12:43:26         1
  - states: q=queued, w=waiting, r=running, E=Error
  - slots: 1 CPU slot (core)
  - SGE may take a few seconds to give your job a non-zero priority
- qstat -u "*" list all jobs from all users
- qstat -j <job-id> get information on a single job:
    $ qstat -j 144406
    ==============================================================
    job_number:      144406
    exec_file:       job_scripts/144406
    submission_time: Wed Jun 19 12:40:43 2013
    owner:           haimm
    uid:             30376
    ...
Get information on your submitted jobs (2)
- qalter -w v <job-id> get scheduler information for a job (may be helpful if your job won't run):
    $ qalter -w v 144406
    Job 144406 queue instance "parallel_long@node028.marc2" dropped because it is temporarily not available
    Job 144406 queue instance "parallel_long@node032.marc2" dropped because it is temporarily not available
    Job 144406 queue instance "parallel_long@node043.marc2" dropped because it is temporarily not available
    ...
Get information on your submitted jobs (3)
- qalter <options> <job-id> change submit options for a queued job.
  - To change memory and runtime requests:
    $ qalter -l h_rt=150000,h_vmem=2500m 144406
  - To change parallel environment and CPU reservation:
    $ qalter -pe orte_sl32* 128 144406
  - Can be used if a job won't run because of wrong submit parameters.
- qdel <job-id> delete (terminate) a submitted job
Get information on your submitted jobs (4)
- qhost -j -u <username> see on which nodes your jobs are running (MASTER is the spawning process):
    [sittr@marc2-h2 mpiperftest]$ qhost -j -u sittr
    HOSTNAME  ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
    ----------------------------------------------------------------
    global    -           -     -     -       -       -       -
    node001   lx24-amd64  64    0.02  252.4G  2.1G    8.0G    0.0
    <---snip---(node002-node013)--->
    node014   lx24-amd64  64    0.00  252.4G  2.7G    8.0G    2.4G
      job-id   prior    name      user   state  submit/start at      queue       master
      1985179  0.52597  mpiperf_  sittr  r      10/12/2015 14:33:34  parallel@n  SLAVE
      <---snip---(62 more SLAVE processes)--->
    <---snip---(node015-node072)--->
    node073   lx24-amd64  64    0.05  252.4G  2.4G    8.0G    19.9M
      1985179  0.52597  mpiperf_  sittr  r      10/12/2015 14:33:34  parallel@n  MASTER
      <---snip---(62 more SLAVE processes)--->
    (...)
Hint for Java users (memory boundaries)
- Java usually takes more memory than you think it takes.
- SGE's h_vmem value must be larger than Java's -Xmx value; you may test it with a simple Hello World program like this:
    #!/bin/bash
    #$ -S /bin/bash
    #$ -e ./stderr-file
    #$ -o ./stdout-file
    #$ -l h_vmem=9g,h_rt=60
    java -Xms5120m -Xmx5120m -cp ./hello.jar Hello
    exit 0
Performance optimization
- Code has to be optimized so that it does not waste energy, time, and money.
- Performance optimization is complex and sometimes hard to get right.
- How do we approach this?
Performance optimization (2)
- Parallel code analysis tools can help to identify problems.
- We mostly use the Scalasca suite and the Vampir trace analyzer for this.
- With potential performance bottlenecks identified, we can start making suggestions for code optimizations.
- If you don't have the time or in-depth knowledge to optimize on your own, we're glad to help!