HPC Resources at W&M...and How To Use Them


HPC Resources at W&M and How To Use Them

- What resources are available
- How to log into HPC machines
- Setting up your environment
- How to run a job on the cluster
- How to get more help

January 24, 2017

Laura Hild, System Engineer
Eric J. Walter, Manager of HPC
Jay Kanukurthy, Applications Analyst

High Performance Computing

Refers to one or more of the following:
- Advanced computer architecture: multi-processor / multi-core machines
- High-speed networks: Gigabit Ethernet / InfiniBand
  [Diagram: head node and compute nodes (each with multiple processors/cores) connected by fast and faster switches]
- Parallel algorithms: using multiple cores and/or nodes
- Numerical algorithms: computing efficiently, computing in parallel
- File storage: managing disk space, efficient I/O

Differences between a desktop/laptop and HPC:
- One processor vs. multi-processor
- Running interactively vs. batch jobs
- A single job vs. multiple jobs
- Preparing on a login/head node vs. working directly
- Point and click vs. command line

Recent events: HPC @ W&M

- Moved to ISC-3 in Q3/Q4 2016
  - Consolidated servers/nodes from Jones Hall, JLab, and (some from) Physics into hot-aisle containment (APC NetShelter): 24 racks, 250 kW power
  - Total cores available for general use: 2152
- Coming on-line soon (by end of Q1):
  - Bora subcluster: 30 nodes of 2x Intel E5-2640 v4 (40 cores, 128 GB / node), 1200 cores total
  - Meltemi subcluster: 100 nodes of Intel Xeon Phi / Knights Landing (64 cores, 32 GB / node), 6400 cores total; shared with W&M Physics
- More disk space: ~200 TB parallel file system
- New server room: ISC 1251

SciClone

Front-end/login hosts: typhoon, hurricane, vortex. For beginners: ssh to typhoon.sciclone.wm.edu to run a job on typhoon.

Main sub-clusters:
- Typhoon (ty01-ty72): 288 cores, 4 cores/node, Opteron Santa Rosa (2.6 GHz); 60 nodes with 8 GB/node, 12 with 24 GB/node (~2007)
- Hurricane (hu01-hu12): 96 cores, 8 cores/node, Intel X5672 (3.2 GHz); 48 GB/node (~2011)
- Whirlwind (wh01-wh52): 448 cores, 8 cores/node, Intel X5672 (3.2 GHz); 44 nodes with 64 GB/node, 8 with 192 GB/node (~2011)
- Vortex (vx01-vx36): 432 cores, 12 cores/node, Opteron Seoul (3.1 GHz); 28 nodes with 32 GB/node, 8 with 128 GB/node (~2015)

Storm sub-clusters:
- Wind (wi01-wi28): 448 cores, 12 cores/node, Opteron Magny-Cours (2.4 GHz); 32 GB/node (~2012)
- Ice (ice01, ice02): 48/32 cores, Opteron Magny-Cours (2.4 GHz); 96/64 GB/node (~2012)
- Hail (ha01-ha38): 304 cores, 8 cores/node, Opteron Shanghai (2.4 GHz); 28 nodes with 32 GB/node, 8 with 128 GB/node (~2011)

Necessary Topics for HPC

- How to log into HPC machines
- Linux shell / text editors (basic Linux skills)
- What software do I want to run? (do I need to compile?)
- What sub-cluster will I use?
- What file-system should I use?
- Using the batch system
- Where to get help?
  - HPC webpage: http://www.wm.edu/offices/it/services/hpc/atwm/index.php
  - HPC ticket system: mail hpc-help@wm.edu

Logging into HPC machines

You must use a Secure Shell (SSH) client:
- Linux / Mac: built-in (terminal)
- Windows: SSH Secure Shell Client / PuTTY

[ewalter@particle ~]$ ssh hurricane.sciclone.wm.edu
Password:
Last login: Tue Feb 2 13:57:59 2016 from particle.hpc.wm.edu
--------------------------------------------------------------------------------
William and Mary Information Technology / SciClone Cluster
1 [hurricane]

Questions to ask yourself:
- Am I on or off campus? If you are off-campus, log into stat.wm.edu first using your W&M username and password.
- Is my username the same as on my current machine? If it is different, use: ssh <username>@<host>.<domain>
- Do I need graphics? If yes, then log in with ssh -X.

Linux Shell Usage

Main things to learn about Linux and the shell:
- Logging in and out of the front-end servers: ssh, X forwarding, alternate user name
- Manipulating files/folders:
  cd     change directory
  cp     copy a file
  mv     move (rename) a file
  rm     remove a file
  mkdir  make a directory
  rmdir  remove a directory
  man    read a manual page
- Configuring your environment:
  env    print your environment variables
  editing your .cshrc.$PLATFORM file

http://www.wm.edu/offices/it/services/hpc/using/shell/index.php - page of Linux tutorials
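These file and directory commands are easiest to learn by experimenting in a throwaway directory; a minimal practice session (the directory and file names below are invented for illustration):

```shell
# Practice the basic file commands in a throwaway directory
# (directory and file names are invented for illustration).
mkdir practice                  # mkdir: make a directory
cd practice                     # cd: change directory
echo "some input" > notes.txt   # create a small file
cp notes.txt copy.txt           # cp: copy a file
mv copy.txt old.txt             # mv: move (rename) a file
rm notes.txt old.txt            # rm: remove files
cd ..
rmdir practice                  # rmdir: remove the (now empty) directory
```

Running man followed by any of these command names (e.g. man cp) shows the full option list.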

Linux Text Editors

Popular text editors: emacs or vi/vim
- emacs: huge, bloated, not installed by default, but the champion!
- vi/vim: tiny, always available, some users love it (?)
- nano: editor with training wheels; very easy to use, not very powerful

Emacs tutorials:
http://www.gnu.org/software/emacs/tour/
http://www.jesshamrick.com/2012/09/10/absolute-beginners-guide-to-emacs/
http://www2.lib.uchicago.edu/keith/tcl-course/emacs-tutorial.html

Vim tutorials:
http://www.vim.org/
http://vim.wikia.com/wiki/Tutorial
http://linuxconfig.org/vim-tutorial

Nano homepage:
http://www.nano-editor.org/

Using Software

All HPC machines use Environment Modules to select software.
- Environment variables determine a user's environment: what commands, applications, and libraries are available.
  - $PATH determines what can be executed
  - $LD_LIBRARY_PATH determines what libraries are available at run time
- Environment Modules make these easy to change on the command line.
- Before running applications on HPC machines, modules should be selected:
  - Default software modules can be selected (best) by editing the appropriate .cshrc.$PLATFORM
  - Modules can be changed on demand (not as good, but can be necessary)

Learning how to use Modules is important for using HPC.
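Under the hood, loading a module mostly prepends directories to variables like $PATH. A hand-rolled sketch of the same mechanism (shown in sh syntax; SciClone's default shell is tcsh, where you would use setenv instead; the directory and script names are invented for the demo):

```shell
# Emulate what a modulefile does: put a directory of executables
# at the front of $PATH (directory and script names are invented).
mkdir -p demo_bin
printf '#!/bin/sh\necho hello from demo\n' > demo_bin/hello-demo
chmod +x demo_bin/hello-demo

export PATH="$PWD/demo_bin:$PATH"   # roughly what 'module load' does
hello-demo                          # now found via $PATH lookup
```

This is why loading the right module before running matters: without the directory on $PATH, the shell simply cannot find the executable.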

Using Software: Modules

15 [hurricane] module avail
---------------------------------- /usr/local/modules/modulefiles -----------------
acml/5.1.0/gcc              mpc/0.8.2(default)
acml/5.1.0/pgi              mpfr/2.4.2
acml-int64/5.1.0/gcc        mpfr/2.4.2a(default)
acml-int64/5.1.0/pgi        mpi4py/1.3.1/gcc
acml-mp/5.1.0/gcc           mummer/3.23
acml-mp/5.1.0/pgi           mvapich2-ib/12x1/pgi
acml-mp-int64/5.1.0/gcc     mvapich2-ib/1.9/gcc
acml-mp-int64/5.1.0/pgi     mvapich2-ib/1.9/gcc-4.8.4
admb/11.2b/gcc              mvapich2-ib/1.9/pgi
allpathslg/47017/gcc        mvapich2-ib/1.9a2/gcc

16 [hurricane] module list
Currently Loaded Modulefiles:
  1) modules        2) maui/3.2.6p21   3) torque/2.3.7
  4) isa/nehalem    5) gcc/4.7.0       6) python/2.7.5/gcc

http://www.wm.edu/offices/it/services/hpc/using/modules/index.php - online module help

You can use the module load and module unload commands for the current shell; it is best to set modules in your startup files.

Configuring your Environment

The $PLATFORM variable:

11 [hurricane] echo $PLATFORM
rhel6-xeon

This means that startup is controlled by .cshrc.rhel6-xeon for the hurricane subcluster front-end.

sub-cluster  front-end  $PLATFORM       ISA          set?
typhoon      typhoon    sles10-opteron  amd64b       ---
hurricane    hurricane  rhel6-xeon      nehalem      ---
whirlwind    hurricane  rhel6-xeon      nehalem      ---
vortex       vortex     rhel6-opteron   seoul        ---
wind         storm      rhel6-storm     magny-cours  set_wind
ice          storm      rhel6-storm     magny-cours  set_ice
hail         storm      rhel6-storm     shanghai     set_hail

The .cshrc.$PLATFORM file controls what modules are loaded for a batch job.
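For example, a few lines in a user's .cshrc.rhel6-xeon could pin specific modules for hurricane/whirlwind logins and batch jobs. This is only a sketch: the module versions are taken from the module listing shown on these machines, and the $?MODULESHOME guard is an assumption about how Modules is initialized, not a documented SciClone convention.

```tcsh
# Hypothetical fragment of ~/.cshrc.rhel6-xeon: pin the modules you
# want on hurricane/whirlwind (read at login and at batch-job start).
if ( $?MODULESHOME ) then
    module load gcc/4.7.0          # a specific compiler version
    module load python/2.7.5/gcc   # python built with that compiler
endif
```

Because batch jobs read the same file, setting modules here keeps the interactive and batch environments consistent.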

HPC Filesystems / Backup

Three types of filesystems:
- home: backed up nightly; small; used for input files, code, etc.
- data: backed up weekly; large files; medium-term storage
- scratch: NOT backed up; large output files; short-term storage

146 [hurricane] df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/volgroup30-logvol31  917G  482G  390G  56% /sciclone/home00
tn00:/usr/local                  134G  113G   15G  89% /usr/local
tn00:/export                      46G   15G   30G  33% /import
mh00:/var/spool/mail              79G   43G   33G  57% /var/spool/mail
gfs00:/sciclone/home04           591G  429G  157G  74% /sciclone/home04
ty00:/sciclone/scr02             273G   81M  273G   1% /sciclone/scr02
tn00:/sciclone/scr10              79G   23G   53G  31% /sciclone/scr10
tn00:/sciclone/scr30              17T   13T   47T  72% /sciclone/scr30
gfs00:/sciclone/data10            16T   15T   19T  89% /sciclone/data10
tw00-i8:/sciclone/data20          73T   57T   16T  79% /sciclone/data20
/dev/md1                          81T   50T   28T  65% /sciclone/scr20
vx00:/sciclone/home10             27T  202M   26T   1% /sciclone/home10
vx00:/sciclone/scr00             318G   48G  297G   2% /sciclone/scr00

Use local scratch if you can! It will give the best performance.
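Before pointing large job output at a directory, you can check which filesystem it lives on and how full it is; df accepts a single path as well as listing everything (the /tmp comparison is just an illustration of a node-local filesystem):

```shell
# Show the filesystem (and its free space) behind a directory,
# e.g. to decide where large job output should go.
df -h .      # the filesystem holding the current directory
df -h /tmp   # a node-local filesystem, for comparison
```

The "Mounted on" column tells you whether you are writing to home, data, or scratch space.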

Using the Batch System

HPC uses Torque (PBS) to schedule and run jobs. Nodes are selected via the node type, and qsub submits the job to the batch system:

27 [vortex] qsub -I -l walltime=30:00 -l nodes=1:vortex:ppn=12
qsub: waiting for job 1552781.vortex.sciclone.wm.edu to start
qsub: job 1552781.vortex.sciclone.wm.edu ready
1 [vx01] python prog.py

An interactive job (-I) puts you on a node ready to work. All job ids have the form ###.vortex.sciclone.wm.edu, even on other subclusters.

There are many node types:
- The default node type is simply the sub-cluster name, e.g. vortex.
- It is also possible to select certain subsets within a cluster, or a collection of sub-clusters:
  - x5672: any hurricane or whirlwind node
  - c18b: only large-memory vortex nodes

See the online documentation or send email to hpc-help@wm.edu for more information:
http://www.wm.edu/offices/it/services/hpc/using/jobs/index.php

Using the Batch System II

You can also submit a batch job, which does not run interactively. First you must write a batch script:

34 [hurricane] cat run
#!/bin/tcsh
#PBS -N test
#PBS -l nodes=1:x5672:ppn=8
#PBS -l walltime=0:10:00
#PBS -j oe

cd $PBS_O_WORKDIR
python prog.py >& prog.out

- #!/bin/tcsh : interpret the following in tcsh syntax
- -N : name of the job
- -l : job specifications (walltime; nodespec)
- -j oe : combine stderr and stdout
- cd $PBS_O_WORKDIR : cd to where I submitted the job
- python prog.py : run the job

35 [hurricane] qsub run

Most widely used batch commands:
  qsub   submit a job
  qdel   delete a job
  qstat  list jobs
  qsu    list my jobs

148 [vortex] more test.o2785870
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
tput: No value for $TERM and no -T specified

Using the Batch System III: MATLAB example

107 [hurricane] more run
#!/bin/tcsh
#PBS -N test
#PBS -l nodes=1:c9:ppn=1
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -q matlab

cd $PBS_O_WORKDIR
module load matlab
matlab -nodisplay -r "readmatrix" >& OUT

Notes:
- you must add -q matlab for MATLAB jobs
- load the matlab module (if needed)
- redirect stdout and stderr to a file (here, OUT)

108 [hurricane] head readmatrix.m
tic
%parpool(8)
syms a b c d;
meshpoints = meshgenerator();
eigfile = fopen('eigfile.txt', 'wt');
count = 1;
count2 = 1;
%set(0, 'CurrentFigure', 1);
%plot3(0,0,0,'');
%grid on

http://www.wm.edu/offices/it/services/hpc/using/tutorials/index.php

Using the Batch System IV

Gaussian example: you can get test input files (.com files) and supplied answers from /usr/local/gaussian/g09/tests/com and /usr/local/gaussian/g09/tests/amd64.

#!/bin/tcsh
#PBS -N GaussianTest
#PBS -l nodes=1:vortex:ppn=1
#PBS -l walltime=1:00:00
#PBS -j oe

cd $PBS_O_WORKDIR
setenv GAUSS_SCRDIR /sciclone/scr10/ewalter
g09 < test0000.com > testout0000

- set $GAUSS_SCRDIR to something; it defaults to the current directory
- g09 is the name of the program
- test0000.com is the input file (from the com folder)
- testout0000 is the output file

Can we do something better than a serial job?

Gaussian Link 0 commands

From https://www.msi.umn.edu/sites/default/files/introtogaussian09.pdf:
- %mem=n : sets the amount of dynamic memory (n); the default is 32 MB. Units allowed: kb, mb, gb, kw, mw, or gw
- %nproc=n : sets the number of processors, n, to use
- %chk=file : location and name of the checkpoint file
- %rwf=file : location and name of the rwf file

17 [typhoon] head input
%nproc=1
%mem=6gb
%rwf=/sciclone/scr10/ewalter/test.rwf
%chk=/sciclone/data10/ewalter/test.chk
#p pbepbe/6-311g sparse test scf

Gaussian Shared-Memory Parallel

Gaussian 09 can use multiple cores within a node (useful for larger calculations). Example: test0445.com in the Gaussian tests directory, a DFT optimization.

It is always worth checking that parallelism is not slowing down the run!
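Putting the Link 0 commands together, a shared-memory input header is the serial one shown earlier with %nproc raised; this is only a sketch (the core count and scratch paths are illustrative, not a recommended configuration):

```
%nproc=8
%mem=6gb
%rwf=/sciclone/scr10/ewalter/test.rwf
%chk=/sciclone/data10/ewalter/test.chk
#p pbepbe/6-311g sparse test scf
```

The batch script should request a matching core count (e.g. nodes=1:vortex:ppn=8) so the scheduler reserves the same cores that Gaussian will use.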

Where to get help?

- HPC webpage: http://www.wm.edu/offices/it/services/hpc/atwm/index.php
- HPC ticket system: mail hpc-help@wm.edu

Using the ticket system is useful since it is monitored by three of us. WE'RE HERE TO HELP!