June Workshop Series June 27th: All About SLURM University of Nebraska Lincoln Holland Computing Center. Carrie Brown, Adam Caprez

1 June Workshop Series June 27th: All About SLURM University of Nebraska Lincoln Holland Computing Center Carrie Brown, Adam Caprez

2 Setup Instructions
Please complete these steps before the lessons start at 1:00 PM.
Setup instructions:
If you need to use a demo account, please speak with one of the helpers.
If you need help with the setup, please put a red sticky note at the top of your laptop.
When you are done with the setup, please put a green sticky note at the top of your laptop.

3 June Workshop Series Schedule
June 6th: Introductory Bash
June 13th: Advanced Bash and Git
June 20th: Introductory HCC
June 27th: All About SLURM - Learn all about the Simple Linux Utility for Resource Management (SLURM), HCC's workload manager (scheduler), and how to select the best options to streamline your jobs.
Upcoming Software Carpentry Workshops:
UNL: HCC Kickstart (Bash, Git, and HCC Basics), September 5th and 6th
UNO: Software Carpentry (Bash, Git, and R), October 16th and 17th

4 Logistics
Name tags, sign-in sheet
Sticky notes: Red = need help, Green = all good
Link to Workshop Materials:
Etherpad:
Terminal commands are in this font. Any entries surrounded by <brackets> need to be filled in with your information. Example: <username>@crane.unl.edu becomes demo01@crane.unl.edu if your username is demo01.
Today we will be using the reservation hccjune for all jobs. Make sure your submit scripts include the line: #SBATCH --reservation=hccjune

5 What is a Cluster?

6 Exercises
1. If you aren't already, connect to the Crane cluster.
2. Navigate to your $WORK directory.
3. If you were not here last week, or do not have the tutorial directory, clone the files to your $WORK directory with the command: git clone
4. Make a new directory inside the tutorial directory (./HCCWorkshops/) named slurm; this is where we will put all of our tutorial files for today. (See the sketch below.)
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.
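
A rough sketch of the steps; <repository_url> is a placeholder for the repository address given in the workshop materials (left out of the transcription above):

cd $WORK                        # work in your $WORK directory
git clone <repository_url>      # placeholder: use the URL from the workshop materials
mkdir ./HCCWorkshops/slurm      # today's tutorial files go here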

7 SLURM
Simple Linux Utility for Resource Management
Open source, scalable cluster management and job scheduling system. Used on ~60% of the TOP500 supercomputers.
3 key functions:
Allocates exclusive or non-exclusive access to resources
Provides a framework for starting, executing, and monitoring work
Manages a queue of pending jobs
Uses a best-fit algorithm to assign tasks and the Fair Tree fairshare algorithm to prioritize them.

8 Slurm vs PBS
Submit a job: qsub <script_file> (PBS/SGE) -> sbatch <script_file> (Slurm)
Cancel a job: qdel <job_id> -> scancel <job_id>
Check the status of a job: qstat <job_id> -> squeue -j <job_id>
Check the status of all jobs by user: qstat -u <user_name> -> squeue -u <user_name>
Hold a job: qhold <job_id> -> scontrol hold <job_id>
Release a job: qrls <job_id> -> scontrol release <job_id>
More commands and schedulers:

9 sinfo
Shows a listing of all partitions on a cluster. Use #SBATCH --partition=<partition_name> to select one. All partitions have a 7-day run-time limit.
Publicly available partitions:
batch: Default partition. Limitations: 2000 max CPUs per user. Clusters: Crane, Tusker
guest: Uses free time on owned or leased InfiniBand (IB) or Omni-Path Architecture (OPA) nodes. Limitations: pre-emptable; max 158 IB CPUs and 2000 OPA CPUs per user. Clusters: Crane
highmem: High memory nodes (512 and 1024 GB). Limitations: 192 max CPUs per user. Clusters: Tusker
gpu_k20: GPU nodes with 3x Tesla K20m per node, with IB. Limitations: 48 max CPUs per user. Clusters: Crane
gpu_m2070: GPU nodes with 2x Tesla M2070 per node, non-IB. Limitations: 48 max CPUs per user. Clusters: Crane
gpu_p100: GPU nodes with 2x Tesla P100 per node, with OPA. Limitations: 40 max CPUs per user. Clusters: Crane

10 Fair Tree Fairshare Algorithm
Fair Tree prioritizes users such that if accounts A and B are siblings and A has a higher fairshare factor than B, then all children of A will have higher fairshare factors than all children of B.
Benefits:
All users in a higher-priority account receive a higher fairshare factor than all users from a lower-priority account
Users in a more active group have lower priority than users in a less active group
Users are sorted and ranked to prevent precision loss
Priority is calculated based on rank, not directly off of the Level FS value
New jobs are immediately assigned a priority
User ranking is recalculated at 5-minute intervals

11 Calculation of Level FS (LF)
LF = S / U
Where:
S = Shares Norm: the account's assigned shares, normalized to the shares assigned to itself and its siblings:
S = S_account / S_(account + sibling accounts)
U = Effective Usage: the account's usage, normalized to the usage of itself and its siblings:
U = U_account / U_(account + sibling accounts)
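
A hypothetical worked example (the numbers are invented for illustration): suppose an account is assigned 40 of the 100 shares at its level and has accrued 10% of the usage among itself and its siblings. Then:

S = 40 / 100 = 0.4
U = 0.10
LF = S / U = 0.4 / 0.10 = 4.0

A Level FS greater than 1 means the account has used less than its assigned share; less than 1 means it has used more.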

12 Fairshare Algorithm
Example tree: root -> groups (gprof1, gprof2) -> users (uprof1, ustudent3, uprof2, ucollab78, uphd17)
Uses a rooted plane tree (aka rooted ordered tree), sorted by Level FS descending from left to right. The tree is traversed depth-first; users are assigned a rank and given a fairshare factor.
Process:
1. Calculate Level FS for the subtree's children
2. Sort the children of the subtree
3. Visit the children in descending order and assign each user a fairshare factor:
fairshare factor = rank / total # of users

13 Exercises
1. You can check on the share division and usage on Holland clusters with the sshare command. The output of this command can be quite long; combine it with head or grep to see individual portions of it.
Can you write a command so you only see the first 10 lines of output?
Modify the previous command to use grep to find your user and group information.
Compare the amount of your EffectvUsage to your NormShares. Have you used more than your NormShares? How about your group overall? How does the group's EffectvUsage compare to its NormShares?
2. The sshare argument -l shows extended output, including the current calculated LevelFS values. Repeat the steps in #1, but with the -l argument this time.
How does your LevelFS value compare to your group's LevelFS value? Does the calculated LevelFS value correspond to the differences you observed in EffectvUsage?
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.
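
One possible set of commands for exercise 1 (a sketch; <group_name> and <username> are placeholders for your own group and user names):

sshare | head -n 10           # first 10 lines of the share listing
sshare | grep <group_name>    # your group's NormShares and EffectvUsage
sshare -l | grep <username>   # extended output, including LevelFS, for your user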

14 sbatch
Used to asynchronously submit a batch job to execute on allocated resources.
Sequence of events:
1. User submits a script via sbatch
2. When resources become available, they are allocated to the job
3. The script is executed on one node (the master node); the script must launch any other tasks on the allocated nodes itself. STDOUT and STDERR are captured and redirected to the output file(s)
4. When the script terminates, the allocation is released. Any non-zero exit will be interpreted as a failure

15 Submit Scripts
Shebang: The shebang tells Slurm what interpreter to use for this file. This one is for the shell (Bash).
Name of the submit file: This can be anything. Here we are using invert_single.slurm; the .slurm extension makes it easy to recognize that this is a submit file.
Commands: Any commands after the SBATCH lines will be executed by the interpreter specified in the shebang, similar to what would happen if you were to type the commands interactively.
SBATCH options: These must be immediately after the shebang and before any commands. The only required SBATCH options are time, nodes, and mem, but there are many more that you can use to fully customize your allocation.
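
A minimal sketch of this layout (the module load and the final command are placeholders, not the contents of the workshop's actual invert_single.slurm):

#!/bin/bash
#SBATCH --time=00:30:00            # walltime
#SBATCH --nodes=1                  # number of nodes
#SBATCH --mem=4gb                  # memory per node
#SBATCH --job-name=example         # name shown in the queue
#SBATCH --output=example.%j.out    # STDOUT (%j becomes the job id)
#SBATCH --error=example.%j.err     # STDERR

module load python/3.6             # hypothetical module; load your own software and version

python my_analysis.py              # placeholder for the actual analysis command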

16 Submit Files Best Practices
Put all module loads immediately after the SBATCH lines: quickly locate what modules and versions were used.
Specify versions on module loads: allows you to see what versions were used during the analysis.
Use a separate submit file for each analysis: instead of editing and resubmitting a submit file, copy a previous one and make changes to it. Keep a running record of your analyses.
Redirect output and error to separate files: allows you to see quickly whether a job completed with errors or not.
Separate individual workflow steps into individual jobs: avoid putting too many steps into a single job.

17 Shebang! - Interpreters
Must be included in the first line of the submit script. Must be an absolute path. Specifies which program is used to execute the contents of the script.
The shebang in the submit file can be one of the following:
#!/bin/bash - the most common shell and also the default shell at HCC
#!/bin/csh - symlink to tcsh
#!/usr/bin/perl
#!/usr/bin/python
Using Perl or Python interpreters can make loading modules difficult. Scripts that return anything but 0 will be interpreted as a failed job by Slurm.

18 Common SBATCH Options
--nodes: Number of nodes requested
--time: Maximum walltime for the job in DD-HH:MM:SS format; maximum of 7 days on the batch partition
--mem: Real memory (RAM) required per node; can use KB, MB, and GB units; default is MB. Request less memory than the total available on the node: the maximum available on a 512 GB RAM node is 500, and on a 256 GB RAM node is 250
--ntasks-per-node: Number of tasks per node; used to request a specific number of cores
--mem-per-cpu: Minimum memory required per allocated CPU; default is 1 GB
--output: Filename where all STDOUT will be directed; default is slurm-<jobid>.out
--error: Filename where all STDERR will be directed; default is slurm-<jobid>.out
--job-name: How the job will show up in the queue
For more information: sbatch --help
SLURM Documentation:

19 scancel
Use to cancel jobs prior to completion.
Usage: scancel <job_id>
Use other arguments to cancel multiple jobs at once, or combine them with a job id to prevent accidentally canceling the wrong job.
Other arguments:
--name=<job_name>: cancel jobs with this name
--partition=<partition>: cancel jobs in this partition
--user=<user_name>: cancel jobs of this user
--state=<job_state>: cancel jobs in this state (valid states: PENDING, RUNNING, and SUSPENDED)
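
For example (a sketch; the job id, user name, and job name are placeholders):

scancel 1234567                         # cancel a single job by id
scancel --user=demo01 --state=PENDING   # cancel only demo01's pending jobs
scancel --user=demo01 --name=test_run   # cancel only demo01's jobs named test_run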

20 Short qos
Increases a job's priority, allowing it to run as soon as possible. Useful for testing and developmental work.
Limitations:
6 hour runtime
1 job of 16 CPUs or fewer
Max of 2 jobs per user
Max of 256 CPUs in use for all short jobs from all users
To use, include this line in your submit script: #SBATCH --qos=short
For more information:

21 Exercise
1. Write a submit script from scratch (no copying previous ones!). The script should use the following parameters:
Uses 1 node
Uses 10 GB RAM
10 minutes runtime
Executes the command: echo "I can write submit scripts!"
Submit your script and watch for output. If you run into errors, copy the error to the Etherpad. If you were able to fix the error, add a brief note explaining how you did so.
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.

22 Exercise Solution
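
The solution was shown on the slide as a screenshot; a sketch matching the exercise parameters (the file and job names are arbitrary) looks roughly like this:

#!/bin/bash
#SBATCH --nodes=1                  # 1 node
#SBATCH --mem=10gb                 # 10 GB RAM
#SBATCH --time=00:10:00            # 10 minutes runtime
#SBATCH --job-name=first_script
#SBATCH --output=first_script.%j.out
#SBATCH --reservation=hccjune      # workshop reservation (see the logistics slide)

echo "I can write submit scripts!"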

23 squeue
Job ID: the ID number assigned to your job by Slurm
Name: the name you gave the job, as specified in the submit script
Time: the length of time the job has been running
Nodes: the number of nodes the job is running on
Partition: the partition the job is running on or assigned to
User: the user that owns the job
State: the current status of the job. Common states include: CD (Completed), CA (Canceled), F (Failed), PD (Pending), R (Running)
Nodelist: if the job is running, the names of the nodes the job is running on; if the job is pending, the reason the job is pending
For more information:

24 Common Reason Codes
Dependency: This job is waiting for a dependent job to complete.
NodeDown: A node required by the job is down.
PartitionDown: The partition (queue) required by this job is in a DOWN state and temporarily accepting no jobs, for instance because of maintenance. Note that this message may be displayed for a time even after the system is back up.
Priority: One or more higher-priority jobs exist for this partition or advanced reservation; other jobs in the queue have higher priority than yours.
ReqNodeNotAvail: No nodes can be found satisfying your limits, for instance because maintenance is scheduled and the job cannot finish before it starts.
Reservation: The job is waiting for its advanced reservation to become available.
More information: squeue --help

25 Common squeue Options
-j <job_list>: displays information about the specified job(s) *
-u <user_name> / --user=<user_name>: jobs owned by the specified user name(s) *
-p <part_list>: jobs in the specified partition(s) *
-t <state_list>: jobs in the specified state(s) {PD, R, S, CG, CD, CF, CA, F, TO, PR, NF} *
-i <interval> / --iterate=<interval>: report repeatedly at the specified interval (in seconds)
-S <sort_list> / --sort=<sort_list>: jobs sorted by the specified field(s) *
--start: pending jobs and their scheduled start times
* Indicates arguments that can take a comma-separated list
For more options:
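
For example (a sketch; the user name is a placeholder):

squeue -u demo01 -t PD --start   # demo01's pending jobs with estimated start times
squeue -p batch -t R             # running jobs in the batch partition
squeue -u demo01 -i 30           # refresh demo01's job listing every 30 seconds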

26 Exercise
1. Use the squeue command to determine the following. Hint: don't forget about wc -l
How many jobs are currently Running?
How many jobs are currently Pending?
The grid partition is composed of resources that are made available to the Open Science Grid. How many jobs are currently in the queue for this partition?
How many jobs are currently in the queue for the user root?
2. Edit the submit script you made previously. Add the following command to execute after the echo command: sleep 120
Submit the updated script file and monitor its progress with squeue. If it is pending for a while, use --start to see how much longer until it is expected to start. How accurate was the estimate?
Can you guess what sleep does just by how your job changes? If not, take a look at the documentation (sleep --help).
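
One possible approach to exercise 1 (remember that squeue prints a header line, so subtract 1 from each count):

squeue -t R | wc -l     # running jobs
squeue -t PD | wc -l    # pending jobs
squeue -p grid | wc -l  # jobs in the grid partition
squeue -u root | wc -l  # jobs belonging to root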

27 Customizing squeue output
Use the --Format argument (must be capitalized). Fields you want displayed are specified in a comma-separated list, without spaces, after the argument.
Fields of note: priority, reason, dependency, eligibletime, endtime, state / statecompact, submittime
Even more customization options are available for --Format and the --format flag; check out man squeue for more information.
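
For example, using fields from the list above (the user name is a placeholder; the exact field set depends on your Slurm version):

squeue -u demo01 --Format=jobid,state,priority,reason,submittime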

28 Environmental Variables and Replacement Symbols
Environmental variables can be used in the command section of a submit file (passed to scripts or programs via arguments), but cannot be used within an #SBATCH directive; use replacement symbols there instead.
Environment variables:
SLURM_JOB_ID: batch job id assigned by Slurm upon submission
SLURM_JOB_NAME: user-assigned job name
SLURM_NNODES: number of nodes
SLURM_NODELIST: list of nodes
SLURM_NTASKS: total number of tasks
SLURM_QUEUE: queue (partition)
SLURM_SUBMIT_DIR: directory of submission
SLURM_TASKS_PER_NODE: number of tasks per node
Replacement symbols:
%A: job array's master job allocation number
%a: job array ID (index) number
%j: job allocation number (job id)
%N: node name; will be replaced by the name of the first node in the job (the one that runs the script)
%u: user name
%%: the character %
A number can be placed between % and the following character to zero-pad the result. For example, for a job id of 123456, job%j.out would create job123456.out, and job%9j.out would create job000123456.out.
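
A sketch combining the two in one submit file (the resource values and the echo line are illustrative):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --output=run_%j.out    # replacement symbol: %j is filled in with the job id

echo "Job $SLURM_JOB_ID is using $SLURM_NNODES node(s): $SLURM_NODELIST"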

29 Additional sbatch Options
--begin=<time>: The controller will wait to allocate the job until the specified time. Specific time: HH:MM:SS. Specific date: MMDDYY, MM/DD/YY, or YYYY-MM-DD. Specific date and time: YYYY-MM-DD[THH:MM:SS]. The keywords now, today, and tomorrow can be used, and the time can also be relative, in the format now+<time>.
--deadline=<time>: Remove the job if it cannot finish before the deadline. Valid time formats: HH:MM[:SS] [AM|PM]; MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]; MM/DD[/YY]-HH:MM[:SS]; YYYY-MM-DD[THH:MM[:SS]]
--hold: Will hold the job in a held state until released manually using the command scontrol release <job_id>
--immediate: Will only allocate the job if the resources are immediately available
--mail-type=<type>: Notify the user by email when certain event types occur. Valid types include: BEGIN, END, FAIL, ALL, TIME_LIMIT, TIME_LIMIT_X (when X% of the time limit is up, where X is 90, 80, or 50)
--mail-user=<user_email>: Specify an email address to send event notifications to
--open-mode=<append|truncate>: Specify how to open output files; default is truncate
--test-only: Validates the script and returns a starting estimate based on the current queue and job requirements; does not submit the job
--tmp=<MB>: Minimum amount of temporary disk space on the allocated node
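
A few of these in context (a sketch; the email address is a placeholder):

#SBATCH --mail-type=END,FAIL              # email when the job ends or fails
#SBATCH --mail-user=demo01@example.com    # placeholder address
#SBATCH --begin=now+2hours                # wait two hours before becoming eligible to run

From the command line, sbatch --test-only <script_file> reports an estimated start time without actually submitting the job.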

30 Exercises
1. Edit the submit script you created previously to include at least two of the additional options we discussed. Submit the script to see how they work. Try changing some of the parameters (number of nodes, memory, or time) and use the #SBATCH --test-only argument to see how the estimated start time changes. Which parameter seems to affect it the most?
2. Using the cd command, navigate to the matlab directory inside of HCCWorkshops. Use less to view the contents of the invertrand.submit file. Can you find all of the environmental variables and replacement symbols used? What role does each of them play in this script?
3. Navigate back into the directory which contains the submit script you made today. Edit the script to include one environmental variable and one replacement symbol. Submit the script and check to see if your changes worked the way you expected.
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.

31 Array Job Submissions
Submits a specified number of identical jobs. Use environmental variables and replacement symbols to separate output.
Usage: #SBATCH --array=<array numbers or ranges>
The array list can be any combination of the following:
A comma-separated list of values. #SBATCH --array=1,5,10 submits 3 array jobs with array ids 1, 5, 10
A range of values with a - separator. #SBATCH --array=0-5 submits 6 array jobs with array ids 0, 1, 2, 3, 4, 5
A range of values with a : to indicate a step value. #SBATCH --array=1-9:2 submits 5 array jobs with array ids 1, 3, 5, 7, 9
A % can be used to specify the maximum number of simultaneous tasks (default is 1000). #SBATCH --array=1-10%4 submits 10 array jobs with at most 4 running simultaneously
To cancel array jobs:
Usage: scancel <job_id>_<array numbers>
Cancel all array jobs: scancel <job_id>
Cancel single array ids: scancel <job_id>_<array id>
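
A sketch of an array submit file (SLURM_ARRAY_TASK_ID is the standard per-task environment variable; the work each task does here is a placeholder):

#!/bin/bash
#SBATCH --array=1-5
#SBATCH --nodes=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --output=array_example_%A_%a.out   # %A = master job id, %a = array index

echo "This is array task $SLURM_ARRAY_TASK_ID"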

32 Exercises
1. Specify how many jobs these commands will create. What are their array ids? How many will run simultaneously?
#SBATCH --array=5-10
#SBATCH --array=0-4,15-20
#SBATCH --array=1,3-10:2
#SBATCH --array=0-20:2%10
2. When we looked at the output of the example array job, the output was not in numeric order. Can you think of a reason why that happens?
3. Edit the example array job to do the following:
Run 15 array tasks, each one with an odd array id
Run 5 array tasks, each one with a unique 3-digit id
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.

33 Job Dependencies
Allows you to queue multiple jobs that depend on the completion of one or more previous jobs.
When submitting a job, use the -d argument followed by a specification of which jobs to wait for and when to execute: <when_to_execute>:<job_id>
After successful completion: afterok:<job_id>
After non-successful completion: afternotok:<job_id>
Multiple job ids can be specified, separated with colons: afterok:<job_id1>:<job_id2>
Dependent jobs can use output and files created by previous jobs.
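
A sketch of chaining jobs at submission time (sbatch --parsable prints only the job id, which makes it easy to capture; the file names are placeholders):

JOBA=$(sbatch --parsable JobA.submit)    # capture JobA's job id
sbatch -d afterok:$JOBA JobB.submit      # JobB starts only if JobA completes successfully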

34 Exercises
1. Copy the JobB.submit script, calling the new one JobC.submit, and edit the contents accordingly (replace all instances of "B" with "C"). Using sbatch, queue JobA. Then queue JobB and JobC, setting them both to begin after the successful completion of JobA.
2. Using the previous three submit scripts, create a new submit script which will do the following:
Combine the output from both JobB and JobC into a text file called JobD.txt
Add the line "Sample job D output" to this new text file
3. Using these four submit scripts, run them so the jobs trigger in the order according to the diagram to the right.
Once you have finished, put up your green sticky note. If you have issues, put up your red sticky note and one of the helpers will be around to assist.

35 Exercise Solution
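
The solution was shown on the slide as a screenshot; one way to chain the four scripts (assuming JobD depends on both JobB and JobC, per exercise 2) is sketched below:

JOBA=$(sbatch --parsable JobA.submit)
JOBB=$(sbatch --parsable -d afterok:$JOBA JobB.submit)
JOBC=$(sbatch --parsable -d afterok:$JOBA JobC.submit)
sbatch -d afterok:$JOBB:$JOBC JobD.submit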

36 srun
Used to synchronously submit a single command. Commonly used to start interactive sessions.
Sequence of events:
1. User submits a command for execution. The command may include command-line arguments and will be executed exactly as specified.
2. If an allocation already exists, the job executes immediately; otherwise, the job will block until a new allocation is established.
3. n identical copies of the command are run simultaneously on the allocated resources as individual tasks. --pty enables pseudo-terminal mode: input and output are directed to the user's shell.
4. Once all tasks terminate, the srun session terminates. If the allocation was created by srun, it will be released.
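
For example, a sketch of requesting an interactive session (the resource values are illustrative):

srun --nodes=1 --ntasks-per-node=1 --mem=4gb --time=02:00:00 --pty bash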

37 Using srun to monitor batch jobs
1. Connect to the node running the job:
srun --jobid=<job_id> --pty bash (or top)
srun --nodelist=<node_id> --pty bash (or top)
2. Monitor:
top (if not already running): use to monitor core use; ideal for multi-core processes. Press u to search for your username.
cat /cgroup/memory/slurm/uid_<uid>/job_<job_id>/memory.max_usage_in_bytes: use to monitor memory use. To determine your uid, use: id -u <user_name>. Combine with watch -n <seconds> to specify a refresh interval (the default is 2 seconds).
CTRL + C to exit
