Bash for SLURM. Author: Wesley Schaal, Pharmaceutical Bioinformatics, Uppsala University


Bash for SLURM
Author: Wesley Schaal, Pharmaceutical Bioinformatics, Uppsala University (wesley.schaal@farmbio.uu.se)
Lab session: Pavlin Mitev (pavlin.mitev@kemi.uu.se)
Slides at http://uppmax.uu.se/support/courses-and-workshops/introductory-course-summer-2016/

Basic definitions
Bash: the Bourne-again shell, a replacement for the traditional Bourne shell. A shell is a command interpreter; it often also serves as a scripting language.
SLURM: Simple Linux Utility for Resource Management. It schedules and manages jobs so that many people can share a cluster.

Useful commands
sbatch: submit and run a batch job script. ex: sbatch my_job_script.job
interactive: start an interactive session. ex: interactive -A g2016011
salloc: run a single command on the allocated cores/nodes. ex: salloc -A g2016011 -n 1 -t 15:00 --qos=short
scancel: cancel one or more of your jobs. ex: scancel 5798001

Useful commands
jobinfo: ex: jobinfo -u $USER
squeue: ex: squeue -u $USER
finishedjobinfo: ex: finishedjobinfo -j 999999
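These commands are typically combined into a short submit-and-monitor cycle at the terminal. A minimal sketch of that cycle, assuming a job script named my_job_script.job (the job ID shown is made up for illustration):

# Submit the script; sbatch prints the job ID back, e.g. "Submitted batch job 5798001"
sbatch my_job_script.job

# Watch your own jobs in the queue while the job waits and runs
squeue -u $USER

# If the job was submitted by mistake, cancel it by ID
scancel 5798001

# Once the job has finished, ask for a summary of the run
finishedjobinfo -j 5798001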

Ways to submit jobs through SLURM
Command line:
sbatch -A g2016011 -p core -n 1 -t 12:00:00 -J jobname my_script_file.sh

Batch file:
#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 12:00:00
#SBATCH -J jobname
... the actual job script code ...
then submitted with: sbatch my_job.sh

How to use SLURM
Create a job file: it may be a list of commands or just call another script.
Submit the job: run sbatch jobfile.sh, with your file in place of "jobfile.sh". You should receive a job number almost immediately.
Check on progress: squeue -u $USER; if necessary, log onto the node (after allocation).
Check the job log: look for "slurm-99999.out", but with the number from your job submission. This holds anything that would normally be written to the terminal, as well as error messages for the job.
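Putting those steps together, a minimal end-to-end sketch (the account g2016011 comes from the slides; the job name, time limit and commands are arbitrary placeholders):

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 5:00
#SBATCH -J hello_slurm
# Anything written to stdout/stderr ends up in slurm-<jobid>.out
echo "Running on $(hostname) at $(date)"

Submitted and inspected with:
sbatch jobfile.sh        # prints "Submitted batch job <jobid>"
squeue -u $USER          # wait for it to start and finish
cat slurm-<jobid>.out    # replace <jobid> with the number sbatch reported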

Reasons to use a script for SLURM
Keep track of parameters: time requested, environment, etc.
Easier to rerun jobs: correct a small error, use new data.
Establish standard routines: share methods within and between groups.
Can launch multiple jobs: scripts can start scripts.

Convenience variables
$SNIC_TMP: path to node-local temporary disk space. Using local storage can be much faster than the shared file systems. It is automatically created before the job starts and automatically deleted when the job has finished.
$SLURM_JOB_ID: $SNIC_TMP is equal to /scratch/$SLURM_JOB_ID. Not generally useful except to note that anything left in /scratch outside the specific folder for the running job can be deleted at any time.
$CLUSTER: name of the current cluster (e.g. milou, tintin). Could be useful if other variables depend on the cluster but you otherwise want to use the same scripts.
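As a rough sketch of how these variables tend to be used together (the file names and the per-cluster module choice below are invented for illustration, not taken from the slides):

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 1:00:00

# Hypothetical per-cluster switch: act differently depending on $CLUSTER
if [ "$CLUSTER" = "milou" ]; then
    module load bioinfo-tools
fi

# Stage input to fast node-local scratch, work there, then copy results home
cp $HOME/indata/input.txt $SNIC_TMP/
cd $SNIC_TMP
sort input.txt > sorted_$SLURM_JOB_ID.txt
cp sorted_$SLURM_JOB_ID.txt $HOME/results/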

Simple example
#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 10:00:00
#SBATCH -J jour2
module load bioinfo-tools samtools/0.1.19 bwa
export SRCDIR=$HOME/baz/run3
cp $SRCDIR/foo.pl $SRCDIR/bar.txt $SNIC_TMP/
cd $SNIC_TMP
./foo.pl bar.txt
cp *.out $SRCDIR/out2

Simple example explained
#!/bin/bash -l
  starts the bash interpreter; "-l" (login shell) is optional
#SBATCH -A g2016011
  "#" starts a comment that bash ignores; "#SBATCH" is a special signal to SLURM; "-A" specifies which account will be "charged"
#SBATCH -p core
  the "unit" of resources requested: core, node, etc.
#SBATCH -n 1
  number of cores requested

Simple example explained
#SBATCH -t 10:00:00
  maximum time requested, in the format days-hours:minutes:seconds
#SBATCH -J jour2
  name for this job; mainly for your convenience
module load bioinfo-tools samtools/0.1.19 bwa
  list of modules to be loaded (special note for "bioinfo-tools"); you can specify versions or use the default (risky)
export SRCDIR=$HOME/baz/run3
  variables can be defined

Simple example explained
cp $SRCDIR/foo.pl $SRCDIR/bar.txt $SNIC_TMP/
cd $SNIC_TMP
  working in node-local storage can be much faster than home or glob
./foo.pl bar.txt
  finally actually doing something; can just call a simple command or be a long list of actions with if-then, etc.
cp *.out $SRCDIR/out2
  make certain to copy back any results you need, since the temp folders can be deleted when the job ends
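One way to make that final copy-back step more robust, not shown in the slides but a common bash idiom, is to register it with trap so it also runs if the script exits early:

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 10:00:00

export SRCDIR=$HOME/baz/run3
cp $SRCDIR/foo.pl $SRCDIR/bar.txt $SNIC_TMP/
cd $SNIC_TMP

# Copy results back whenever the script exits, even after an error,
# so nothing is lost when $SNIC_TMP is wiped at the end of the job
trap 'cp *.out $SRCDIR/out2' EXIT

./foo.pl bar.txt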

Separate processes in the same batch job
#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 4
#SBATCH -t 2-00:00:00
#SBATCH -J br_para_02
module load gcc
export R_LIBS_USER=$HOME/lib/R/tintin
cd $HOME/glob/p2013141/para
./br_para_m.r std 1 &
./br_para_m.r std 2 &
./br_para_m.r std 3 &
./br_para_m.r std 4 &
wait

Script that spawns batch jobs
#!/bin/bash -l
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes
cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  sbatch star_index.job $v
  cd $IGEN
  sleep 1
done
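The four background launches followed by wait can equivalently be written as a loop; a minimal sketch, using the same script and arguments as the slide:

#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 4
#SBATCH -t 2-00:00:00
#SBATCH -J br_para_02

# Start one background process per requested core, then wait for all of them
for i in 1 2 3 4
do
    ./br_para_m.r std $i &
done
wait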

Script that spawns batch jobs: test version
#!/bin/bash -l
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes
cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  #sbatch star_index.job $v
  #cd $IGEN
  pwd
  sleep 1
done

More general version
#!/bin/bash -l
CMD=$1
TOOL=z_tools/3
IGEN=/sw/data/uppnex/igenomes
cd $IGEN
for v in [[:upper:]]*/*/*
do
  echo $v
  cd $TOOL
  sbatch $CMD $v
  cd $IGEN
  sleep 1
done
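The more general version takes the job script to submit as its first argument. Assuming it is saved as, say, spawn_jobs.sh (a name not given in the slides), the earlier behaviour is reproduced with:

bash spawn_jobs.sh star_index.job

and any other per-genome job script could be passed in its place.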

Spawned batch job, part 1
#!/bin/bash -l
#SBATCH -A staff
#SBATCH -p core
#SBATCH -n 8
#SBATCH -t 2:00:00
#SBATCH -J igenomes_star
module load bioinfo-tools star
IGEN=/sw/data/uppnex/igenomes
VICT=$IGEN/$1
DEST=$VICT/Sequence/STARIndex
echo $VICT
mkdir $DEST || exit 1
cd $DEST
ln -s $VICT/Sequence/WholeGenomeFasta/genome.fa $SNIC_TMP/
ln -s ../WholeGenomeFasta/genome.fa .

Spawned batch job, part 2
GTF=''
if [ -e $VICT/Annotation/Genes/genes.gtf ]
then
  ln -s $VICT/Annotation/Genes/genes.gtf $SNIC_TMP/
  ln -s ../../Annotation/Genes/genes.gtf .
  GTF='--sjdbGTFfile genes.gtf --sjdbOverhang 100'
fi
cd $SNIC_TMP
echo "STAR --runMode genomeGenerate --runThreadN 8 --genomeDir ./ --genomeFastaFiles genome.fa $GTF"
STAR --runMode genomeGenerate --runThreadN 8 --genomeDir ./ --genomeFastaFiles genome.fa $GTF
rm genome.fa genes.gtf
cp -rp * $DEST/

Try it yourself
1. Create and submit a batch file to list the contents of the current folder. How much time did you give this job? Where did the output appear? Look at "finishedjobinfo" for this job.
2. Find out how to request a node with more memory. How/where did you learn this? Notice any other options?
3. Create a batch file containing an if-then construction. Submit it if convenient.
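If you want something to compare against after attempting exercises 1 and 3 yourself, here is one possible sketch. The account is the course project used throughout the slides; the time limits, job names and file name are arbitrary choices, not prescribed answers.

Exercise 1, one possible answer:
#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 5:00
#SBATCH -J list_contents
# The listing appears in slurm-<jobid>.out in the submission directory
ls -l

Exercise 3, one possible answer:
#!/bin/bash -l
#SBATCH -A g2016011
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 5:00
#SBATCH -J if_then_demo
# Branch on whether an input file exists before doing any work
if [ -e input.txt ]
then
    wc -l input.txt
else
    echo "input.txt not found" >&2
fi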