Graham vs legacy systems

Similar documents
Introduction to Joker Cyber Infrastructure Architecture Team CIA.NMSU.EDU

Deep Learning on SHARCNET:

Introduction to High-Performance Computing (HPC)

Using Compute Canada. Masao Fujinaga Information Services and Technology University of Alberta

Duke Compute Cluster Workshop. 3/28/2018 Tom Milledge rc.duke.edu

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT November 13, 2013

How to Use a Supercomputer - A Boot Camp

Choosing Resources Wisely. What is Research Computing?

Choosing Resources Wisely Plamen Krastev Office: 38 Oxford, Room 117 FAS Research Computing

Introduction to the SHARCNET Environment May-25 Pre-(summer)school webinar Speaker: Alex Razoumov University of Ontario Institute of Technology

Using a Linux System 6

An Introduction to Gauss. Paul D. Baines University of California, Davis November 20 th 2012

Introduction to High-Performance Computing (HPC)

Sherlock for IBIIS. William Law Stanford Research Computing

Slurm basics. Summer Kickstart June slide 1 of 49

Duke Compute Cluster Workshop. 11/10/2016 Tom Milledge h:ps://rc.duke.edu/

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012

Introduction to the NCAR HPC Systems. 25 May 2018 Consulting Services Group Brian Vanderwende

High Performance Computing Cluster Advanced course

For Dr Landau s PHYS8602 course

HPC Workshop. Nov. 9, 2018 James Coyle, PhD Dir. Of High Perf. Computing

New User Seminar: Part 2 (best practices)

Using Cartesius and Lisa. Zheng Meyer-Zhao - Consultant Clustercomputing

XSEDE New User Training. Ritu Arora November 14, 2014

How to run a job on a Cluster?

RHRK-Seminar. High Performance Computing with the Cluster Elwetritsch - II. Course instructor : Dr. Josef Schüle, RHRK

Duke Compute Cluster Workshop. 10/04/2018 Tom Milledge rc.duke.edu

HPC Introductory Course - Exercises

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop

Introduction to the Cluster

Submitting and running jobs on PlaFRIM2 Redouane Bouchouirbat

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides)

Exercises: Abel/Colossus and SLURM

Batch Usage on JURECA Introduction to Slurm. May 2016 Chrysovalantis Paschoulas HPS JSC

High Performance Computing Cluster Basic course

Compiling applications for the Cray XC

High Performance Computing (HPC) Using zcluster at GACRC

Introduction to HPC Using zcluster at GACRC

SCALABLE HYBRID PROTOTYPE

Workstations & Thin Clients

Introduction to HPC Using zcluster at GACRC

Introduction to HPC2N

Introduction to High Performance Computing and an Statistical Genetics Application on the Janus Supercomputer. Purpose

Introduction to Abel/Colossus and the queuing system

Introduction to HPC Using zcluster at GACRC

Singularity: container formats

Working with Shell Scripting. Daniel Balagué

Introduction to the ITA computer system

CSCS Proposal writing webinar Technical review. 12th April 2015 CSCS

Introduction to BioHPC

Slurm Birds of a Feather

Introduction to HPC Using zcluster at GACRC On-Class GENE 4220

Training day SLURM cluster. Context. Context renewal strategy

Introduction to CINECA HPC Environment

Introduction to the Cluster

CALMIP : HIGH PERFORMANCE COMPUTING

Name Department/Research Area Have you used the Linux command line?

Our Workshop Environment

ARMINIUS Brief Instructions

Introduction to GALILEO

Introduction to SLURM & SLURM batch scripts

HTC Brief Instructions

Genius Quick Start Guide

TITANI CLUSTER USER MANUAL V.1.3

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine

Introduction to CARC. To provide high performance computing to academic researchers.

CNAG Advanced User Training

Introduction. Kevin Miles. Paul Henderson. Rick Stillings. Essex Scales. Director of Research Support. Systems Engineer.

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

HPC DOCUMENTATION. 3. Node Names and IP addresses:- Node details with respect to their individual IP addresses are given below:-

GPUs and Emerging Architectures

LaPalma User's Guide

Introduction to High-Performance Computing (HPC)

Using SHARCNET: An Introduction

Számítogépes modellezés labor (MSc)

Beginner's Guide for UK IBM systems

Introduction to PICO Parallel & Production Enviroment

Introduction to Discovery.

Scientific Computing in practice

Computing with the Moore Cluster

Introduction to HPC2N

Linux Tutorial. Ken-ichi Nomura. 3 rd Magics Materials Software Workshop. Gaithersburg Marriott Washingtonian Center November 11-13, 2018

Submitting batch jobs

Intel C++ Compiler User's Guide With Support For The Streaming Simd Extensions 2

Introduction to GACRC Teaching Cluster PHYS8602

Power9 CTE User s Guide

How to access Geyser and Caldera from Cheyenne. 19 December 2017 Consulting Services Group Brian Vanderwende

UL HPC School 2017[bis] PS1: Getting Started on the UL HPC platform

Introduction to UBELIX

Introduction to GACRC Teaching Cluster

XSEDE New User Tutorial

Illinois Proposal Considerations Greg Bauer

Training day SLURM cluster. Context Infrastructure Environment Software usage Help section SLURM TP For further with SLURM Best practices Support TP

Guillimin HPC Users Meeting April 13, 2017

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization

Using the Yale HPC Clusters

2017 Resource Allocations Competition Results

Debugging and profiling of MPI programs

Introduction to GALILEO

A Hands-On Tutorial: RNA Sequencing Using High-Performance Computing

Transcription:

New User Seminar

Graham vs legacy systems This webinar only covers topics pertaining to graham. For the introduction to our legacy systems (Orca etc.), please check the following recorded webinar: SHARCNet New User Seminar for Legacy Systems available on our youtube channel, http://youtube.sharcnet.ca or read it online: https://www.sharcnet.ca/help/index.php/getting_started_with_sharcnet

Topics SHARCNET Where to look for information and get help Essentials What are available How to connect to graham How to transfer files How to compile programs How to submit jobs Manage files Do s and don t do s Q & A

What is SHARCNET? A consortium of 18 Ontario institutions providing advanced computing resources and support... Shared Hierarchical Academic Research Computing NETwork Member of Compute Canada and Compute Ontario > 3,000 Canadian and international users 50,000+ CPU cores 320 P100 GPUs 10 Gb/s network

Topics SHARCNET Where to look for information and get help Essentials What are available How to connect to graham How to transfer files How to compile programs How to submit jobs Manage files Do s and don t do s Q & A

Getting Help: SHARCNET web portal The SHARCNET web site (www.sharcnet.ca) provides extensive information about our systems and software. User-editable help wiki Help pages, tutorials, FAQ: Support > Wiki Software documentation: Facilities > Software System status System notices, present status: Facilities > Systems System notices are also sent via email and posted on RSS Ticketing system Online access (requires login): Support > Tickets Or send an email to help@sharcnet.ca

Getting Help: Compute Canada site Compute Canada web site (docs.computecanada.ca) contains a large collection of help pages for the national systems (Graham and Cedar). How-to guides Getting Started with the new National Systems (mini-webinar series) Detailed help pages on submitting jobs, software etc. Compute Canada s problem tracking system Email to support@computecanada.ca

Getting Help: SHARCNET or CC? Graham related issues Check both SHARCNET s and Compute Canada s help pages Submit a ticket to Compute Canada at support@computecanada.ca Help for legacy systems (orca etc) Use SHARCNET s help pages and ticketing system. Cedar and Niagara related issues Use Compute Canada s help pages and ticketing system.

Topics SHARCNET Where to look for information and get help Essentials What is available How to connect to graham How to transfer files How to compile programs How to submit jobs Manage files Do s and don t do s Q & A

Essentials: Computing Environment Systems Clusters, Cloud facilities Operating Systems Languages Linux (64-bit CentOS) C/C++, Fortran, Matlab/Octave, Python, R, Java, etc. Key Parallel Development Support MPI, pthreads, OpenMP, CUDA, OpenACC, OpenCL Software Modules select pre-built and configured software, as well as versions, with the module command Batch Scheduling SLURM scheduler

Essentials: Access to SHARCNET Connecting to clusters All systems are only accessible via secure shell (ssh): $ ssh user@graham.computecanada.ca Use your Compute Canada credentials to login to Graham. We recommend authenticating using an ssh key agent. See the SSH page in our help wiki for details Connection and file transfer programs Unix / Mac scp or sftp, rsync Windows MobaXterm Cygwin (A full Unix like suite) Any OS (from a browser) Globus

Essentials: File systems File system Quotas Backed up? Home Space /home Purged? Available by Default? 50 GB and 0.5M files per user Yes No Yes Yes Mounted on Compute Nodes? Scratch Space /scratch 20 TB and 1M files per user, can request increase to 100 TB No Yes, all files older than a certain number of days Yes Yes Project Space /project 1 TB and 5M files per group, can request increase to 10 TB Yes No Yes Yes Nearline Space 5 TB per group No No No No Run quota command on Graham/Cedar to find out if you are approaching or over the disk quota.

Essentials: Graham cluster Number of CPU cores: 33,448 Number of nodes: 1043 32 cpu cores per node Between 128 and 3072 GB of RAM per node Number of NVIDIA P100 GPUs: 320 Networking: EDR (cpu nodes) and FDR (GPU and cloud nodes) InfiniBand

Essentials: Managing jobs with SLURM All significant work shall be submitted to the system as a job, run in batch mode. Jobs are submitted using the sbatch command with a script #!/bin/bash #SBATCH --time=0-00:05 # Run time limit (DD-HH:MM) #SBATCH --account=def-bge #SBATCH --ntasks=32 # Number of MPI processes, default 1 #SBATCH --cpus-per-task=32 #SBATCH --gres=gpu:4 # Normally defined for threaded jobs # request GPU "generic resource", 4 on Cedar, 2 on Graham #SBATCH --mem=1024m #SBATCH --mem-per-cpu=1024m #SBATCH --job-name=hello #SBATCH --output=%x-%j.log./myprog # memory; default unit is megabytes # Optional, for user s reference # You give any name # Replace with mpiexec./myprog or srun./myprog for MPI jobs squeue: to list the status of submitted jobs. sacct: to show details of recent jobs. scancel: to kill jobs. Use command man scommand to see details.

Why is my job not starting? There may be multiple reasons Graham/Cedar are very busy clusters, with <20% of the cycles available to non-rac jobs. Tip: consider applying for RAC. Requesting much more resources (runtime, CPU cores, memory) than what is actually needed will result in a longer queue wait time, for no good reason. Tip: request only what the job needs, with a bit of leeway. If your job uses multiples of 32 cpu cores, sometimes the queue wait time can be much shorter if you do a by-node reservation, instead of the default by-core one. Tip: use --nodes=n and --ntasks-per-node=32 sbatch arguments to request the by-node reservation.

Topics SHARCNET Where to look for information and get help Essentials What are available How to connect to graham How to transfer files How to compile programs How to submit jobs Manage files Do s and don t do s Q & A

Common mistakes to avoid Do not run significant programs on login nodes, nor run programs directly on compute nodes. Do not specify a maximum job run time blindly (say, 7 days), or more memory than required for your program pick an appropriate value, eg. 150% of the measured/expected run time or memory per processor Do not create millions of tiny files, or large amounts (> GB) of uncompressed (eg. ASCII) output aggregate files with tar, use binary or compressed file formats

Q&A Submitting tickets to Compute Canada support@computecanada.ca for Graham/Cedar related RAC allocations Accounts And rest to SHARCNET at help@sharcnet.ca