Using the MaRC2 HPC Cluster


Using the MaRC2 HPC Cluster
René Sitt / Manuel Haim, 09/2016

Get access rights and permissions
A Students / Staff account is needed. Ask your workgroup leader whether MaRC2 is already being used: he/she must accept the terms of use, and he/she nominates the people who may tell us to grant you access. Then ask the nominated people in your workgroup to get access. For further questions, terms of use etc., send us an email: marc@hrz.uni-marburg.de

Starting a terminal session (Linux, Mac)
Start your favourite terminal application and connect to MaRC2 via SSH (Secure Shell) by entering:

    ssh -p223 username@marc2.hrz.uni-marburg.de

Replace username with your Students / Staff account name, and remember that your account name is case-sensitive! You are then prompted for your account's password:

    username@marc2.hrz.uni-marburg.de's password:

Starting a terminal session (Linux) Example

Starting a terminal session (Windows)
Start an SSH client, e.g. PuTTY (www.putty.org). Select SSH, then insert host name and port number. Finally, click Open. A terminal window opens where you will be prompted for username (case-sensitive!) and password.

Generating SSH keys (Linux/Mac)
Needed if you want to access compute nodes directly. Connect to MaRC2 and use ssh-keygen:

    ~]$ ssh-keygen

The resulting key pair with default names id_rsa and id_rsa.pub will be created in the /home/username/.ssh folder. If wanted, put in a passphrase to protect the key (otherwise just leave it empty). Append your public key to the authorized_keys file:

    ~]$ cd .ssh
    .ssh]$ cat id_rsa.pub >> authorized_keys

Generating SSH keys (Linux/Mac)
You should now be able to connect to MaRC2 compute nodes:

    ~]$ ssh node001
    The authenticity of host 'node001 (<ip address>)' can't be established.
    RSA key fingerprint is <rsa fingerprint>.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'node001,<ip address>' (RSA) to the list of known hosts.
    [username@node001 ~]$

Don't use this to start calculations directly on a node! (see below how to submit jobs to the task scheduler)

Some hints
To exit the terminal session, just enter exit or press Ctrl+D. By default, your login shell is set to /bin/bash. You may change this setting globally (and choose another login shell like csh, ksh or tcsh) for your account here:
Staff account: https://admin.staff.uni-marburg.de/aendshell.html
Students account: https://admin.students.uni-marburg.de/login_shell.html

Introduction to the BASH shell (Bourne-again shell)
Right after login, the BASH shell is started by default and shows a command prompt like this:

    [username@servername ~]$

It contains your username, the server name, and the current directory name (~ for home directory, / for root directory), followed by the input cursor: $ for a normal user, # for the superuser (root). To exit, simply enter the command exit or press Ctrl+D.

First steps with BASH-builtin commands
pwd: print current (working) directory:

    ~]$ pwd
    /home/username

cd: change current directory (cd .. goes to the parent directory, cd <dirname> to a subdirectory):

    ~]$ cd ..
    home]$ pwd
    /home
    home]$ cd username
    ~]$ pwd
    /home/username

Additional commands
ls: list directory contents:

    ~]$ ls
    mydata  readme  script.sh

ls -l: list contents in long format:

    ~]$ ls -l
    drwxrwxr-x 2 username username 4096 18. Jun 11:12 mydata
    -rw-rw-r-- 1 username username  915 18. Jun 11:37 readme
    -rwxr-xr-x 1 username username   26 18. Jun 11:35 script.sh

The columns show: file type (d = directory) and permissions (see later), user, group, file size (bytes), time of last change, and file name.

Additional commands (2)
ls -a: show all files (even hidden ones):

    ~]$ ls -a
    .  ..  .bash_history  .bash_logout  .bash_profile  .bashrc  mydata  readme  script.sh

(files starting with . are usually not shown)

Additional commands (3)
mkdir <dirname>: make subdirectory:

    ~]$ mkdir temp
    ~]$ cd temp
    temp]$ pwd
    /home/username/temp

rmdir <dirname>: remove directory (must be empty):

    temp]$ cd ..
    ~]$ rmdir temp

rm <filename>: remove (delete) file
rm -R <dirname>: remove (delete) directory recursively!
rm -Rf <dirname>: just as above, but force, don't ask ;-)

Available directories on a typical Linux system
The Unix/Linux filesystem is organized in a tree-like structure. The top-most directory / is called the root directory; all filesystem entries (including mounted filesystems) are branches of this root.

    /       root directory
    /bin    executable binaries
    /boot   boot files
    /dev    devices
    /etc    UNIX system configuration files
    /home   home directories
    /lib    libraries
    /usr    UNIX system resources
    ...

    /home/username   personal home directory (~)

Getting help
man <command>: show man(ual) page for command:

    ~]$ man ls

Press Space for the next page, b for the previous page (arrow keys and PgUp/PgDn will also work nowadays); press q to quit.
<command> --help or <command> -h: show help (not supported by all commands)

Change file owner and permissions
This is what ls -l shows:

    -rwxr-xr-x 1 username username 26 18. Jun 11:35 script.sh

The first column holds the permissions for user, group and other (r = read, w = write, x = execute); the following columns show the owning user and group.

chmod <permissions> <file>: change permissions, for example:

    chmod u=rwx,g=rx,o=rx script.sh

chown <user>:<group> <file>: change owner and group
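Permissions can also be given numerically (octal): each digit is the sum of r=4, w=2 and x=1 for user, group and other, so rwxr-xr-x is 755. A small sketch (the file name is just an illustration):

```shell
# Set rwxr-xr-x in two equivalent ways: symbolic and octal.
echo 'echo hi' > /tmp/perm-demo.sh
chmod u=rwx,g=rx,o=rx /tmp/perm-demo.sh   # symbolic form
stat -c %a /tmp/perm-demo.sh              # prints 755
chmod 755 /tmp/perm-demo.sh               # octal form, same permissions
```

The octal form is shorter to type; the symbolic form can change individual bits (e.g. chmod +x) without knowing the rest.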

Some tips and tricks
Select text to copy, then paste with the middle mouse button (PuTTY: use the right mouse button). Tab auto-completes your input. Shift+PgUp/PgDn scrolls the screen buffer up and down (the maximum scrollback buffer size can be set in your terminal options). ArrowUp and ArrowDown let you navigate through your former input, and Ctrl+R lets you reverse-i-search in it. Edit your .bashrc and .bash_profile for personal settings.

Using the Environment Modules package

Using the Environment Modules package: why?
Each piece of software typically needs a set of environment variables to be set in order to run properly, e.g. the $PATH variable, which points to the bin directories. On an HPC cluster like MaRC2, there may be many different toolsets and toolset versions with the same purpose and/or the same command names: conflict! With the Environment Modules package, you can dynamically modify your environment by loading so-called modulefiles. Use our predefined modulefiles, no need to create your own :-)
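Under the hood, a modulefile mostly just edits environment variables: loading a compiler module, for example, prepends its bin directory to $PATH. Roughly what happens, sketched by hand (the directory is hypothetical):

```shell
# What "module load" effectively does for you (hypothetical tool directory):
export PATH="/opt/mytool/bin:$PATH"
echo "$PATH" | cut -d: -f1    # prints /opt/mytool/bin, so this tool's
                              # commands now win over same-named ones
```

Unloading a module undoes these changes again, which is what makes switching between conflicting toolsets safe.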

Using the Environment Modules package: how?
module avail: show available modules

    [haimm@marc2-h1 ~]$ module avail
    ------------------- /usr/share/modules/modulefiles -------------------
    dot  module-cvs  module-info  modules  null  use.own
    ---------------------- /usr/share/moduleslocal -----------------------
    acml/gfortran-5.1.0(default)   openmpi/pgi/1.4.3-qlc
    acml/gfortran-5.2.0            openmpi-1.6.3/gcc/1.6.3
    acml/ifort-5.1.0               openmpi-1.6.3/icc/1.6.3
    acml/pgi-5.1.0                 openmpi-1.6.3/pgi/1.6.3
    gcc/4.4.6                      parastation/mpi2-gcc-5.0.27-1(default)
    gcc/4.6.2(default)             parastation/mpi2-gcc-mt-5.0.27-1
    gcc/4.7.2                      parastation/mpi2-intel-5.0.27-1
    intel/intelpsxe2011sp1-32      parastation/mpi2-intel-mt-5.0.27-1
    intel/intelpsxe2011sp1-64      parastation/mpi2-pgi-5.0.27-1
    intel/intelpsxe2013sp1-64      parastation/mpi2-pgi-mt-5.0.27-1
    openmpi/gcc/1.4.3-qlc          pgi/12.2(default)
    openmpi/intel/1.4.3-qlc        pgi/13.2

(acml = AMD Core Math Library, pgi = Portland Group Inc. compiler)

module load <modulefile>: load module. You do not need to specify the full path in order to load the default, e.g.:

    [haimm@marc2-h1 ~]$ module load gcc

Using the Environment Modules package: how? (2)
module list: show loaded modules

    [haimm@marc2-h1 ~]$ module list
    Currently Loaded Modulefiles:
      1) acml/gfortran-5.1.0   2) gcc/4.6.2   3) parastation/mpi2-gcc-5.0.27-1

module unload <modulefile>: unload module. Again, you do not need to specify the full path:

    [haimm@marc2-h1 ~]$ module unload gcc

module purge: unload all loaded modules

Using the Environment Modules package: how? (3)
module switch <module1> <module2>: unload module1 and load module2 (works for conflicting modules!)

    [haimm@marc2-h1 ~]$ module switch gcc pgi

More information: man module

Compiling your programs

Compiling your programs: choose a compiler
Several compilers are available:

    gcc        GNU C Compiler
    gfortran   GNU Fortran Compiler
    icc        Intel C Compiler
    ifort      Intel Fortran Compiler
    pgcc       Portland Group Inc. (PGI) C Compiler
    pgfortran  Portland Group Inc. (PGI) Fortran Compiler

MPI compilers (OpenMPI or Parastation MPI, GNU/Intel/PGI):

    mpicc      MPI C Compiler
    mpicxx     MPI C++ Compiler
    mpif77     MPI Fortran77 Compiler

A simple C program
Create a file hello-world.c with the following content:

    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        printf("Hello World!\n");
        exit(0);
    }

Compile using gcc (remember to have the module loaded first!):

    [haimm@marc2-h1 ~]$ gcc hello-world.c -o hello-world

Then test locally (don't do this with huge complex programs!):

    [haimm@marc2-h1 ~]$ ./hello-world
    Hello World!

We could also use a Makefile
Create a file Makefile with the following content (note: the indented command lines must start with a tab!):

    PROG=hello-world
    SRC=hello-world.c
    CC=gcc

    all: clean build

    clean:
    	rm -f $(PROG)

    build:
    	$(CC) $(SRC) -o $(PROG)

    run: all
    	./$(PROG)

Now just enter make or make run to compile/run your code.

Submitting jobs to the SUN Grid Engine (SGE)

SUN Grid Engine??? The SUN Grid Engine (SGE) implements a queueing system Users can submit jobs to the SGE queues and define boundary conditions (like runtime, memory etc.) The SGE permanently reviews the available resources, evaluates pending jobs and executes them on the compute nodes The SGE is open source software with a free-to-use license Its successor, the Oracle Grid Engine, is only available commercially

How to submit a new job to the SGE
First, write a start script for your software (e.g. hello-world.sh):

    #!/bin/bash
    #$ -S /bin/bash
    #$ -e ./stderr-file
    #$ -o ./stdout-file
    #$ -l h_rt=10

    echo "hello-world running on host $(hostname)"
    ./hello-world

Then submit your start script (-cwd: execute from current working dir):

    $ qsub -cwd hello-world.sh
    Your job 144406 ("hello-world.sh") has been submitted

How to submit a new job to the SGE (2)
Important: requesting h_rt (max. job runtime in seconds) is mandatory. The h_rt limits are 259200 (3 days) for standard queues and 864000 (10 days) for long queues. RAM requests are counted per core, so h_vmem=4g with 4 cores reserves 16G. When using a full node, h_vmem > 4g is impossible, since 4G * 64 cores = 256G = node maximum.
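Putting these limits together, a sketch of a job script requesting a full node might look like this (the program name is a placeholder; the #$ lines are read by qsub and treated as comments by the shell):

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -l h_rt=259200,h_vmem=4g   # 3 days runtime, 4G RAM per reserved core
#$ -pe smp* 64                # 64 cores * 4G = 256G = one full node
echo "job running on host $(hostname)"
if [ -x ./my-program ]; then ./my-program; fi   # placeholder binary
```

Because the directives are plain comments, the same script also runs unchanged outside the scheduler for a quick local test.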

How to submit a new job to the SGE (3)
To use the module environment, include

    . /etc/profile.d/modules.sh

in your script before any module commands. If the script file was written on Windows, use

    dos2unix <yourscript>

before submitting, to enforce UNIX text formatting.
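Should dos2unix happen to be unavailable, the Windows carriage returns can also be stripped by hand; a sketch (file names are illustrative):

```shell
# Remove CR characters so the shell (and SGE) can parse the script:
printf 'echo hello\r\n' > /tmp/script-windows.sh   # CRLF-damaged script
tr -d '\r' < /tmp/script-windows.sh > /tmp/script-unix.sh
bash /tmp/script-unix.sh                           # prints hello
```

The symptom of a CRLF script is usually a confusing "command not found" or "bad interpreter" error, because the stray \r becomes part of each command name.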

How to submit a new job to the SGE (4)
For parallel applications, choose a parallel environment and number of CPUs with -pe <parallel_environment> <# of cores>, e.g.:

    -pe smp* 32

This reserves 32 cores on one node (smp = symmetric multiprocessing). For more than 64 cores, preferably use 'orte_sl64*'. Other PEs exist mostly to achieve specific process distributions.

How to submit a new job to the SGE (5)
With many little calculations that can be indexed, consider using a task array: -t n[-m[:s]]

    -t n       n tasks, IDs 1 to n
    -t n-m     range of task indices, IDs n to m
    -t n-m:s   range of task indices n to m, step size s

For example, -t 2-10:2 spawns jobs with SGE_TASK_ID = 2, 4, 6, 8, 10. In the job script, get the index with ${SGE_TASK_ID}.
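For instance, a task-array job script could use the index to select its input file; a sketch (file names and the processing step are placeholders):

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -l h_rt=3600
#$ -t 1-10                          # ten tasks, SGE_TASK_ID = 1..10
INPUT="input-${SGE_TASK_ID}.dat"    # each task picks its own input file
echo "task ${SGE_TASK_ID} processing ${INPUT}"
```

The scheduler sets SGE_TASK_ID before each task starts, so one script serves all ten tasks without any per-task editing.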

How to submit a new job to the SGE (6)
Recommended: submit the job as 'binary' when using task arrays. Prerequisites: make the script executable, don't move or edit it as long as the job is running, and give all parameters on the command line:

    chmod ugo+x <jobscript>
    qsub -b y <parameters> <jobscript>

Alternatively, leave the parameters in the script but use the 'arsub' wrapper (the script still needs to be executable!):

    module load tools/python-3.5 tools/arsub
    arsub.py -t <from-to:step> <jobscript>

Get information on your submitted jobs
qstat: list all your jobs

    $ qstat
    job-id  prior    name      user   state  submit/start at      queue  slots  ja-task-id
    --------------------------------------------------------------------------------------
    144407  0.00000  start.sh  haimm  qw     06/19/2013 12:43:26         1

(SGE may take a few seconds to give your job a non-zero priority. States: q = queued, w = waiting, E = error, r = running; slots: 1 CPU slot (core).)

qstat -u "*": list all jobs from all users
qstat -j <job-id>: get information on a single job

    $ qstat -j 144406
    ==============================================================
    job_number:       144406
    exec_file:        job_scripts/144406
    submission_time:  Wed Jun 19 12:40:43 2013
    owner:            haimm
    uid:              30376
    ...

Get information on your submitted jobs (2)
qalter -w v <job-id>: get scheduler information for a job (may be helpful if your job won't run)

    $ qalter -w v 144406
    Job 144406 queue instance "parallel_long@node028.marc2" dropped because it is temporarily not available
    Job 144406 queue instance "parallel_long@node032.marc2" dropped because it is temporarily not available
    Job 144406 queue instance "parallel_long@node043.marc2" dropped because it is temporarily not available
    ...

Get information on your submitted jobs (3)
qalter <options> <job-id>: change submit options for a queued job. To change memory and runtime requests:

    $ qalter -l h_rt=150000,h_vmem=2500m 144406

To change parallel environment and CPU reservation:

    $ qalter -pe orte_sl32* 128 144406

This can be used if a job won't run because of wrong submit parameters.

qdel <job-id>: delete (terminate) a submitted job

Get information on your submitted jobs (4)
qhost -j -u <username>: see on which nodes your jobs are running (MASTER is the spawning process)

    [sittr@marc2-h2 mpiperftest]$ qhost -j -u sittr
    HOSTNAME  ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
    ----------------------------------------------------------------
    global    -           -     -     -       -       -       -
    node001   lx24-amd64  64    0.02  252.4G  2.1G    8.0G    0.0
    <---snip---(node002-node013)--->
    node014   lx24-amd64  64    0.00  252.4G  2.7G    8.0G    2.5G
       job-id   prior    name      user   state  submit/start at      queue       master  ja-task-id
       1985179  0.52597  mpiperf_  sittr  r      10/12/2015 14:33:34  parallel@n  SLAVE
       <---snip---(62 more SLAVE processes)--->
    <---snip---(node015-node072)--->
    node073   lx24-amd64  64    0.05  252.4G  2.4G    8.0G    19.9M
       1985179  0.52597  mpiperf_  sittr  r      10/12/2015 14:33:34  parallel@n  MASTER
       1985179  0.52597  mpiperf_  sittr  r      10/12/2015 14:33:34  parallel@n  SLAVE
       <---snip---(62 more SLAVE processes)--->
    node074   lx24-amd64  64    0.00  252.4G  2.5G    8.0G    941.7M
    (...)

Hint for Java users (memory boundaries)
Java usually takes more memory than you think it takes. SGE's h_vmem value must be larger than Java's -Xmx value; you may test it with a simple Hello World program like this:

    #!/bin/bash
    #$ -S /bin/bash
    #$ -e ./stderr-file
    #$ -o ./stdout-file
    #$ -l h_vmem=9g,h_rt=60

    java -Xms5120m -Xmx5120m -cp ./hello.jar Hello
    exit 0

Performance optimization Code has to be optimized to not waste energy, time, and money Performance optimization is complex and sometimes hard to get right How do we approach this?

Performance optimization (2) Parallel code analysis tools can help to identify problems We are mostly using the Scalasca suite and the Vampir trace analyzer to do this With potential performance bottlenecks identified, we can start making suggestions for code optimizations If you don't have the time or in-depth knowledge to optimize on your own, we're glad to help!