Bash for SLURM

Wesley Schaal
Pharmaceutical Bioinformatics, Uppsala University
wesley.schaal@farmbio.uu.se

Lab session: Pavlin Mitev (pavlin.mitev@kemi.uu.se)

Slides at http://uppmax.uu.se/support/courses-and-workshops/introductory-course-summer-2016/

Basic definitions

Bash
  Bourne Again SHell (replacement for the traditional Bourne shell)
  A shell is a command interpreter; often serves as a scripting language

SLURM
  Simple Linux Utility for Resource Management
  Can schedule and manage jobs so that many people can share a cluster
Useful commands

sbatch       submit and run a batch job script
             ex: sbatch my_job_script.job
interactive  start an interactive session
             ex: interactive -A g2016011
salloc       run a single command on the allocated cores/nodes
             ex: salloc -A g2016011 -n 1 -t 15:00 --qos=short
scancel      cancel one or more of your jobs
             ex: scancel 5798001

Useful commands

jobinfo          ex: jobinfo -u $USER
squeue           ex: squeue -u $USER
finishedjobinfo  ex: finishedjobinfo -j 999999
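scancel accepts several job IDs at once, so the IDs shown by squeue can be collected and cancelled in one call. A minimal sketch of that idea — the squeue output below is simulated (and the job numbers made up) so the snippet runs anywhere, and the scancel command is only printed, not executed:

```shell
#!/bin/bash
# Simulated output of `squeue -u $USER`: a header line plus one row per job.
# On a real cluster you would capture the command's output instead.
squeue_output='JOBID PARTITION NAME USER ST TIME NODES
5798001 core myjob1 alice PD 0:00 1
5798002 core myjob2 alice R 1:23 1'

# Skip the header, keep the first column (the job IDs).
jobids=$(echo "$squeue_output" | tail -n +2 | awk '{print $1}')

# Dry run: print the scancel command rather than running it.
echo scancel $jobids
```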
Ways to submit jobs through SLURM

Command line:
  sbatch -A g2016011 -p core -n 1 -t 12:00:00 -J jobname my_script_file.sh

Batch file:
  #!/bin/bash -l
  #SBATCH -A g2016011
  #SBATCH -p core
  #SBATCH -n 1
  #SBATCH -t 12:00:00
  #SBATCH -J jobname
  ... the actual job script code ...

  sbatch my_job.sh

How to use SLURM

Create a job file
  may be a list of commands or just call another script
Submit it
  run: sbatch jobfile.sh, with your file in place of "jobfile.sh"
  you should receive a job number almost immediately
Check on progress
  squeue -u $USER
  if necessary, log onto the node (after allocation)
Check the job log
  look for "slurm-99999.out", but with the number from your job submission
  this file holds anything that would normally be written to the terminal,
  as well as error messages for the job
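The job number is printed by sbatch itself, so a script can capture it and construct the name of the matching slurm-<jobid>.out log file. A sketch of that step, using a simulated sbatch reply (the job number here is made up) so it runs without a cluster:

```shell
#!/bin/bash
# sbatch replies with a line like the one simulated below;
# on a cluster you would use: reply=$(sbatch jobfile.sh)
reply='Submitted batch job 5798001'

# The job ID is the fourth word of the reply.
jobid=$(echo "$reply" | awk '{print $4}')

# Build the name of the log file SLURM will write for this job.
logfile="slurm-${jobid}.out"
echo "$logfile"
```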
Reasons to use a script for SLURM

Keep track of parameters
  time requested, environment, etc
Easier to rerun jobs
  correct a small error, use new data
Establish standard routines
  share methods within and between groups
Can launch multiple jobs
  scripts can start scripts

Convenience variables

$SNIC_TMP
  Path to node-local temporary disk space. Using local storage can be much
  faster than the shared file systems. It's automatically created before the
  job starts and automatically deleted when the job has finished.
$SLURM_JOB_ID
  $SNIC_TMP is equal to /scratch/$SLURM_JOB_ID. Not generally useful except
  to note that anything left in /scratch outside the specific folder for the
  running job can be deleted at any time.
$CLUSTER
  Name of the current cluster (eg, milou, tintin). Could be useful if other
  variables depend on the cluster but you otherwise want to use the same
  scripts.
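Because $SNIC_TMP only exists inside a running job, a script that should also work on a login node or a laptop often falls back to an ordinary temporary directory. A hedged sketch of that pattern — the fallback is a convention of this example, not something SLURM provides:

```shell
#!/bin/bash
# Use the node-local scratch area when running under SLURM,
# otherwise fall back to a fresh local temporary directory.
WORKDIR=${SNIC_TMP:-$(mktemp -d)}

# $CLUSTER and $SLURM_JOB_ID are likewise unset outside a job,
# so provide placeholder values for the log line.
echo "working in $WORKDIR on cluster ${CLUSTER:-unknown}, job ${SLURM_JOB_ID:-none}"
```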
Simple example

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 10:00:00
#SBATCH -J jour2

module load bioinfo-tools samtools/0.1.19 bwa

export SRCDIR=$HOME/baz/run3
cp $SRCDIR/foo.pl $SRCDIR/bar.txt $SNIC_TMP/
cd $SNIC_TMP

./foo.pl bar.txt

cp *.out $SRCDIR/out2

Simple example explained

#!/bin/bash -l
  starts the bash interpreter
  "-l" (login shell) is optional
#SBATCH -A g2016011
  "#" starts a comment that bash ignores
  "#SBATCH" is a special signal to SLURM
  "-A" specifies which account will be "charged"
#SBATCH -p core
  the "unit" of resources requested: core, node, etc
#SBATCH -n 1
  number of cores requested
Simple example explained

#SBATCH -t 10:00:00
  maximum time requested, in the format: days-hours:minutes:seconds
#SBATCH -J jour2
  name for this job, mainly for your convenience
module load bioinfo-tools samtools/0.1.19 bwa
  list of modules to be loaded (special note for "bioinfo-tools")
  can specify versions or use the default (risky)
export SRCDIR=$HOME/baz/run3
  variables can be defined

Simple example explained

cp $SRCDIR/foo.pl $SRCDIR/bar.txt $SNIC_TMP/
cd $SNIC_TMP
  working in node-local storage can be much faster than home or glob
./foo.pl bar.txt
  finally actually doing something
  can be just a simple command or a long list of actions with if-then, etc
cp *.out $SRCDIR/out2
  make certain to copy back any results you need, since the temp folders
  can be deleted when the job ends
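The copy-in / work / copy-out pattern above can be sketched with throwaway directories standing in for $SRCDIR and $SNIC_TMP, so it runs without a cluster. The file names match the slide, but tr stands in for the hypothetical foo.pl:

```shell
#!/bin/bash
SRCDIR=$(mktemp -d)        # stands in for $HOME/baz/run3
SCRATCH=$(mktemp -d)       # stands in for $SNIC_TMP
mkdir "$SRCDIR/out2"
echo "input data" > "$SRCDIR/bar.txt"

# Copy the input to fast local storage and work there.
cp "$SRCDIR/bar.txt" "$SCRATCH/"
cd "$SCRATCH"

# Stand-in for ./foo.pl bar.txt: uppercase the input into a .out file.
tr a-z A-Z < bar.txt > result.out

# Copy results back before the scratch area disappears.
cp *.out "$SRCDIR/out2/"
```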
Separate processes in same batch job

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 4
#SBATCH -t 2-00:00:00
#SBATCH -J br_para_02

module load gcc
export R_LIBS_USER=$HOME/lib/R/tintin
cd $HOME/glob/p2013141/para

./br_para_m.r std 1 &
./br_para_m.r std 2 &
./br_para_m.r std 3 &
./br_para_m.r std 4 &
wait

Script that spawns batch jobs

#!/bin/bash -l
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes

cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  sbatch star_index.job $v
  cd $IGEN
  sleep 1
done
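The "&" plus "wait" pattern in the first script works in any bash script, not just under SLURM: each "&" starts a background process (one per requested core in the real job), and "wait" blocks until all of them finish. A minimal local sketch, with a stand-in worker function in place of br_para_m.r:

```shell
#!/bin/bash
# Stand-in for ./br_para_m.r: record which task ran.
worker() { echo "task $1 done" >> results.txt; }

rm -f results.txt
worker 1 &
worker 2 &
worker 3 &
worker 4 &
wait   # without this, the script would exit while workers still run
```

Without the final wait, SLURM would see the batch script finish and could end the job while the background processes were still working.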
Script that spawns batch jobs: test

#!/bin/bash -l
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes

cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  #sbatch star_index.job $v
  #cd $IGEN
  pwd
  sleep 1
done

More general version

#!/bin/bash -l
CMD=$1
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes

cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  sbatch $CMD $v
  cd $IGEN
  sleep 1
done
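The dry-run idea above (comment out sbatch and print instead) can be exercised against a made-up directory tree. The species paths below are only examples of the kind of name the [[:upper:]]*/*/* glob matches in the igenomes layout:

```shell
#!/bin/bash
# Stand-in for /sw/data/uppnex/igenomes with two example genome paths.
IGEN=$(mktemp -d)
mkdir -p "$IGEN/Homo_sapiens/UCSC/hg19" "$IGEN/Mus_musculus/UCSC/mm10"

cd "$IGEN"
submitted=''
for v in [[:upper:]]*/*/*
do
    # Dry run instead of: sbatch star_index.job $v
    echo "would submit: $v"
    submitted="$submitted $v"
done
```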
Spawned batch job, part 1

#!/bin/bash -l
#SBATCH -A staff
#SBATCH -p core
#SBATCH -n 8
#SBATCH -t 2:00:00
#SBATCH -J igenomes_star

module load bioinfo-tools star

IGEN=/sw/data/uppnex/igenomes
VICT=$IGEN/$1
DEST=$VICT/Sequence/STARIndex
echo $VICT
mkdir $DEST || exit 1
cd $DEST
ln -s $VICT/Sequence/WholeGenomeFasta/genome.fa $SNIC_TMP/
ln -s ../WholeGenomeFasta/genome.fa .

Spawned batch job, part 2

GTF=''
if [ -e $VICT/Annotation/Genes/genes.gtf ]
then
  ln -s $VICT/Annotation/Genes/genes.gtf $SNIC_TMP/
  ln -s ../../Annotation/Genes/genes.gtf .
  GTF='--sjdbGTFfile genes.gtf --sjdbOverhang 100'
fi

cd $SNIC_TMP
echo "STAR --runMode genomeGenerate --runThreadN 8 --genomeDir ./ --genomeFastaFiles genome.fa $GTF"
STAR --runMode genomeGenerate --runThreadN 8 --genomeDir ./ --genomeFastaFiles genome.fa $GTF

rm genome.fa genes.gtf
cp -rp * $DEST/
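The "mkdir $DEST || exit 1" guard makes the job stop immediately if the target directory cannot be created (for instance, because an index was already built there), instead of spending hours redoing finished work. A minimal local sketch of the same guard, with a throwaway path in place of the real STARIndex directory:

```shell
#!/bin/bash
# Stand-in for $VICT/Sequence/STARIndex.
DEST=$(mktemp -d)/STARIndex

# Fail fast: abort the whole script if the target already exists.
mkdir "$DEST" || exit 1
echo "building index in $DEST"

# A second attempt fails, and the || branch runs instead of exiting here.
mkdir "$DEST" 2>/dev/null || echo "second mkdir failed as expected"
```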
Try it yourself

1. Create and submit a batch file to list the contents of the current folder.
   How much time did you give this job? Where did the output appear?
   Look at "finishedjobinfo" for this job.
2. Find out how to request a node with more memory.
   How/where did you learn this? Notice any other options?
3. Create a batch file containing an if-then construction. Submit if convenient.