Introduction to Pacman


Introduction to Pacman. Don Bahls, User Consultant, dmbahls@alaska.edu (significant slide content from Tom Logan).

Overview: Connecting to Pacman, Hardware, Programming Environment, Compilers, Queuing System, Interactive Work, Software Stack. [Image 1: Picture of Pacman]

Connecting to Pacman. ARSC supports ssh, sftp and scp clients. Linux and OS X come with native command line versions of ssh, sftp and scp. Windows requires you to download a terminal and file transfer program (and optionally an X11 server). We have the most experience supporting PuTTY, FileZilla, and Xming.
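For example, from a Linux or OS X terminal a login and a file copy might look like the following sketch (the pacman1.arsc.edu hostname and the username are placeholders; use the login node name given with your account):

# Interactive login to a pacman login node
ssh username@pacman1.arsc.edu
# Copy a file to your $HOME on pacman
scp results.tar.gz username@pacman1.arsc.edu:
# Interactive file transfer session
sftp username@pacman1.arsc.edu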

Other File Transfer Alternatives. Mac and Linux have GUI-based file transfer programs (e.g. Fetch and FileZilla) which can be convenient. Other programs should work so long as they support either sftp or scp. Plain ftp will not work for transferring files to pacman. Rsync works too when started via ssh.

[Image 2: FileZilla on a Mac]
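As noted above, rsync runs over ssh; a hedged sketch of syncing a local directory to pacman (the hostname and directory names are placeholders):

# Recursively copy mydata/ to your home directory on pacman over ssh
rsync -av -e ssh mydata/ username@pacman1.arsc.edu:mydata/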

Other Windows Connection Options. If you prefer the command line environment, Cygwin offers command line ssh, sftp and scp commands as well as an X11 server. For more information on Cygwin see: http://www.arsc.edu/arsc/support/howtos/usingcygwin/index.xml

PACMAN Penguin Computing Cluster
88 sixteen-core compute nodes: 2 eight-core 2.3 GHz AMD Opteron processors, 64 GB of memory per node (4 GB per core)
14 twelve-core compute nodes: 2 six-core 2.2 GHz AMD Opteron processors, 32 GB of memory per node (2.6 GB per core)
3 large-memory nodes: 4 eight-core 2.3 GHz AMD Opteron processors, 256 GB of memory per node (8 GB per core)
2 GPU nodes: 2 quad-core 2.4 GHz Intel processors, 64 GB of memory per node (8 GB per core), 2 NVIDIA M2050 Fermi GPU cards
256 four-core compute nodes (NEW): 2 dual-core 2.6 GHz AMD Opteron processors, 16 GB of memory per node (4 GB per core)
Infiniband interconnects (QLogic and Voltaire)
278 TB center-wide Lustre file system
Operating system: Red Hat Enterprise Linux 5.7
12 twelve-core login nodes: 2 six-core 2.2 GHz AMD Opteron processors, 64 GB of memory per node (5.3 GB per core)
1 large-memory login node: 4 eight-core 2.3 GHz AMD Opteron processors, 256 GB of memory per node (8 GB per core)

Pacman Storage. Pacman supports the ARSC standard storage locations:
$HOME: small, backed up, not purged
$CENTER: large, not backed up, purged, available on fish and workstations
$ARCHIVE: no quota, backed up, not purged. NOTE: $ARCHIVE is not available from the compute nodes; it is only accessible from the pacman login nodes.
ARSC recommends that you avoid accessing $HOME in parallel jobs. Purging is not currently enabled on $CENTER but may be enabled at a future date (with ample warning).
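Because $HOME should be avoided in parallel jobs and $ARCHIVE is not visible on compute nodes, a common pattern is to stage work into $CENTER and run from there. A minimal sketch (the directory and file names are hypothetical):

# On a login node, before submitting the job:
mkdir -p $CENTER/myrun
cp $HOME/projects/myrun/input.nml $CENTER/myrun/

# Inside the job script, run from $CENTER rather than $HOME:
cd $CENTER/myrun
mpirun ./myprog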

Monitoring Storage Usage. Quotas are enabled on $HOME and $CENTER. Use the show_storage command to list your current usage:

pacman1 % show_storage
Current Filesystem Quotas as of Thu Jan 10 17:33:06 AKST 2013:
Filesystem    Used_GB   Soft_GB   Hard_GB      Files  Soft Files  Hard Files
==========   ========  ========  ========  =========  ==========  ==========
$ARCHIVE       415.35      0.00      0.00      30549           0           0
$CENTER        873.51   4294.97   5368.71     315939     5000000     5200000
$HOME            3.55      4.19     16.50      51523           0           0
/projects      741.57      0.00      0.00    1097165           0           0

Parallel Programming Models. Pacman supports:
Threads (OpenMP and pthreads)
MPI, using the OpenMPI distribution with Infiniband support
Serial applications
Autoparallelization (via the PGI compiler)
GPU acceleration (on several nodes)

Parallel Programming Models

Level    Model    Description
Node     Auto     Automatic parallelization of basic loops. Only available with the PGI compilers (use the -Mconcur option).
Node     OpenMP   Explicit shared memory model using directives to achieve loop level parallelism (use the -mp option with PGI compilers).
System   MPI      Most common and portable method for scalable distributed memory parallelism.
Node     GPU      Pacman has two nodes with NVIDIA GPU accelerators. The CUDA toolkit 4.0 is available.

Modules. Pacman uses a package called modules which allows you to quickly and easily switch from one software version to another. This makes it possible for us to provide multiple versions of a software package: one person can access the newest version of a package while someone else has an unchanging environment.

Modules. Modules do the following:
Set your $PATH, $MANPATH, and $LD_LIBRARY_PATH.
Set environment variables needed by packages (e.g. NCARG_ROOT for NCL, etc.).
Keep track of what they set, so the environment variables and aliases can be unset when a module is unloaded.
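To see exactly what a particular module would change before loading it, the standard module show (also spelled module display) subcommand prints the modulefile's actions. A quick sketch using the ncl module mentioned later in this talk:

# Print the PATH, MANPATH, and environment variable changes the module would make
pacman1 % module show ncl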

Modules. By default all accounts have a compiler stack loaded, which is set using the modules package. You can put alternate and additional module commands in your shell login files (a sketch follows the table below).

Module Name   Description
PrgEnv-pgi    Programming environment using the PGI compilers & OpenMPI stack (default module).
PrgEnv-gnu    Programming environment using GNU compilers & OpenMPI stack.
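If you always want a particular environment, module commands can go in a shell startup file. A minimal sketch assuming a bash login shell (the extra ncl module is only an illustration):

# in ~/.bashrc (or your shell's equivalent startup file)
source /etc/profile.d/modules.sh   # make the module command available, as in the job-script examples later
module load PrgEnv-gnu             # use the GNU stack instead of the default PrgEnv-pgi
module load ncl                    # any additional packages you use regularly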

Sample Module Commands

Command                  Example Use                            Purpose
module avail             module avail                           Lists all available modules for the system.
module load <mod>        module load PrgEnv-pgi                 Loads a module file into the environment.
module unload <mod>      module unload PrgEnv-pgi               Unloads a module file from the environment.
module list              module list                            Displays the modules which are currently loaded.
module switch old new    module switch PrgEnv-pgi PrgEnv-gnu    Replaces the module old with module new in the environment.
module purge             module purge                           Unloads all module settings, restoring the environment to the state before any modules were loaded.

Module Use. Currently available modules include: abaqus, cmake, ferret, git, grads, idl, jdk, matlab, nco, ncl, ncview, octave, petsc, python, totalview. More continue to be added to the system constantly. If the default version of a package is not what you want, check to see if a newer version is available. The module command lets you see all versions of a package (e.g. module avail ncl lists all ncl modules).

Module Command Examples

pacman3 % module avail PrgEnv-pgi
---------------------- /usr/local/pkg/modulefiles -----------------------------
PrgEnv-pgi/10.5            PrgEnv-pgi/12.5(default)
PrgEnv-pgi/11.2            PrgEnv-pgi/9.0.4
pacman3 % module list
No Modulefiles Currently Loaded.
pacman3 % module load PrgEnv-pgi
pacman3 % module list
Currently Loaded Modulefiles:
  1) pgi/12.5   2) openmpi-pgi/1.4.3   3) PrgEnv-pgi/12.5
pacman3 % module swap PrgEnv-pgi/12.5 PrgEnv-pgi/11.2
pacman3 % module list
Currently Loaded Modulefiles:
  1) PrgEnv-pgi/11.2   2) openmpi-pgi/1.4.3   3) pgi/11.2
pacman3 % which pgf90
/usr/local/pkg/pgi/pgi-11.2/linux86-64/11.2/bin/pgf90

Modules in Jobs. If you use an alternate module to compile, be sure to load that module in your job script so the correct shared libraries are available.

#!/bin/bash
#PBS -q standard_4
...
source /etc/profile.d/modules.sh
module purge
module load PrgEnv-pgi/11.2
mpirun -np 128 ./wrf.exe

Modules in Jobs. You could also use module switch or module swap:

#!/bin/bash
#PBS -q standard_4
...
source /etc/profile.d/modules.sh
module swap PrgEnv-pgi PrgEnv-pgi/11.2
mpirun -np 128 ./wrf.exe

Last Words on Modules. You can create your own modules if you want to maintain multiple versions of programs you build yourself. If you have questions about how to do it, let us know.
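A minimal sketch of a personal module (the mytool name, version, and install path are hypothetical; module use, the #%Module1.0 header, and prepend-path are standard features of the modules package):

# Create a private modulefile directory and a tiny modulefile
mkdir -p $HOME/modulefiles/mytool
cat > $HOME/modulefiles/mytool/1.0 <<'EOF'
#%Module1.0
prepend-path PATH $env(HOME)/apps/mytool-1.0/bin
EOF

# Tell the module command about your directory, then load as usual
module use $HOME/modulefiles
module load mytool/1.0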

Compilers on Pacman

Item                 MPI                        PGI          GNU
Fortran 77           mpif77                     pgf77        g77
Fortran 90/95        mpif90                     pgf90        gfortran
C                    mpicc                      pgcc         gcc / cc
C++                  mpicc                      pgcc         g++ / c++
Serial Debugger                                 pgdbg        gdb
Parallel Debugger    totalview                  totalview    totalview
Performance                                     pgprof       gprof
Default Module       PrgEnv-pgi / PrgEnv-gnu    PrgEnv-pgi   PrgEnv-gnu

A Few PGI Compiler Options

-c         Generate an object file but don't link.
-g         Add debugging information.
-O3        Higher level of optimization (default is -O2).
-fast      Higher level of optimization than -O3.
-Mipa      Perform interprocedural analysis.
-Mconcur   Enable autoparallelization.
-mp        Enables parallelization via OpenMP directives.
-Minfo     Provides additional information on optimizations done by the compiler. This flag has numerous options (e.g. -Minfo=mp,par).
-Mneginfo  Provides additional information on why optimizations are not done by the compiler.
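As a hedged illustration of how these options combine on a compile line with the compilers listed above (source and program names are placeholders):

# Optimized serial build with an optimization report
pgf90 -fast -Minfo -o mycode mycode.f90

# OpenMP build; -Minfo=mp reports which loops were parallelized
pgcc -mp -O3 -Minfo=mp -o myomp myomp.c

# MPI build through the OpenMPI wrapper (with PrgEnv-pgi loaded)
mpif90 -O3 -o mympi mympi.f90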

Batch System. Batch queuing:
Allows job scheduling on shared compute nodes
Requires users to specify resource requests
Ensures that nodes are shared fairly
Manages resources to maximize throughput
Pacman uses the Torque/Moab batch queuing system, which is based on PBS, the Portable Batch System.

Batch Script Contents
1. Queuing commands: shell dialect (e.g. bash, ksh), execution queue to use, job walltime, number of nodes required.
2. Commands to execute: file/directory manipulations, code to execute.

Torque Script - MPI Example

#!/bin/bash
#PBS -q standard_16
#PBS -l walltime=8:00:00
#PBS -l nodes=4:ppn=16
#PBS -j oe

cd $PBS_O_WORKDIR
mpirun ./myprog

Torque Script - MPI Example

#!/bin/bash
#PBS -q standard
#PBS -l walltime=8:00:00
#PBS -l nodes=6:ppn=16
#PBS -j oe

cd $PBS_O_WORKDIR
# Use only 2 PEs per node
mpirun -np 12 -npernode 2 ./myprog

Torque Script - OpenMP

#!/bin/bash
#PBS -q standard
#PBS -l walltime=8:00:00
#PBS -l nodes=1:ppn=16
#PBS -j oe

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=16
./myprog

Torque Script - Shared Node

#!/bin/bash
#PBS -q shared
#PBS -l walltime=8:00:00
#PBS -l nodes=1:ppn=1
#PBS -j oe

cd $PBS_O_WORKDIR
./myprog

Torque Script - Data Staging

#!/bin/bash
#PBS -q transfer
#PBS -l walltime=8:00:00
#PBS -l nodes=1:ppn=1
#PBS -j oe

cd $PBS_O_WORKDIR
batch_stage $ARCHIVE/mydataset/*
cp -r $ARCHIVE/mydataset/* . || exit 1
qsub mpi_job.pbs

Torque Script - MPI Example (4-core nodes)

#!/bin/bash
#PBS -q standard
#PBS -l walltime=8:00:00
#PBS -l nodes=4:ppn=4
#PBS -j oe

cd $PBS_O_WORKDIR
# --mca mpi_paffinity_alone 1 enables processor affinity.
mpirun -np 16 --mca mpi_paffinity_alone 1 ./myprog

Common PBS Commands
qsub job.pbs - queue the script job.pbs to be run by PBS.
qstat - list jobs which are queued, running or recently completed.
qdel jobid - delete the job with ID=jobid. The job ID is shown by the qstat command and is also printed when qsub is run.
qmap - show a graphical list of the current work on nodes.
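A short hedged sequence tying these together (the script name and job ID are placeholders; -u is a standard qstat option for listing only one user's jobs):

pacman1 % qsub job.pbs          # prints the job ID, e.g. 206782.scyld
pacman1 % qstat -u $USER        # show only your jobs
pacman1 % qdel 206782.scyld     # remove the job if you no longer need it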

Common Queues
standard_4 - regular work. Uses 4-core nodes. 240 hour max walltime.
standard (standard_16) - regular work. Uses 16-core nodes. 48 hour max walltime.
standard_12 - regular work. Uses 12-core nodes. 240 hour max walltime.
debug - for quick turnaround debugging work. Runs on 16-core nodes.
background - lowest priority queue, but doesn't require that you have an allocation.
shared - shared node jobs, for running serial codes on shared nodes.
transfer - for data transfer to and from $ARCHIVE. Be sure to bring all $ARCHIVE files online using batch_stage prior to the file copy.
Use 'news queues' to find out more details on the number of nodes and walltime allowed per queue.

Which Queue Should I Use? General recommendations:
standard_16 - for larger parallel jobs that don't need long run times.
standard_4 - for jobs that need fewer than 16 cores, or larger jobs that need long run times.
standard_12 - for single node jobs that need more than 4 cores and long run times.

Example Batch Queue Use

# Submit job
pacman1 % qsub run_hello.cmd
206782.scyld

# Check job status
pacman1 % qstat 206782.scyld
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
206782.scyld              run_hello.cmd    bahls           0        Q debug

# Check job status a bit later
pacman1 % qstat 206782.scyld
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
206782.scyld              run_hello.cmd    bahls           00:00:01 C debug

# Output from the job is returned to the working directory by default.
pacman1 % ls run_hello.cmd.o206782
run_hello.cmd.o206782

Select Torque Environment Variables. Torque sets environment variables for use within job scripts:
$PBS_JOBID - job ID assigned by Torque.
$PBS_NP - number of tasks assigned to the job.
$PBS_O_WORKDIR - original working directory on job submission.
$PBS_ARRAYID - set to the array index when -t is used.

Example Torque Environment Variable Use

#!/bin/bash
#PBS -l nodes=4:ppn=16
#PBS -q standard_16
#PBS -j oe
#PBS -l walltime=1:00:00

cd $PBS_O_WORKDIR
mpirun -np ${PBS_NP} ./a.out > job.${PBS_JOBID}.out 2>&1

More Torque Features. There are a number of other features that we aren't going to cover today, but that may be useful:
Job dependencies - make one job run after another.
Job arrays - let you submit an array of jobs.
Command line options - you can override settings in a Torque script when a job is submitted.
A brief sketch of the first two follows.
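A hedged sketch of job arrays and dependencies using standard Torque syntax (the script names and job ID are placeholders):

# Submit an array of 10 jobs; inside the script, $PBS_ARRAYID selects the input
pacman1 % qsub -t 1-10 array_job.pbs

# Submit post_process.pbs so it waits until job 206782.scyld completes successfully
pacman1 % qsub -W depend=afterok:206782.scyld post_process.pbs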

Interactive Work. We have a number of login nodes on pacman. The login nodes pacman3 through pacman9 have higher memory and CPU limits for interactive work. Check who or top to see if someone else is doing something that might not work well with what you want to do.

Example Interactive Work

# Load a programming environment
pacman4 % module load PrgEnv-gnu

# Compile the code
pacman4 % make
mpicc hello.c -O3 -o hello.exe

# Run it interactively (don't use -np greater than 12)
pacman4 % mpirun -np 4 ./hello.exe
pacman4: hello world- 0
pacman4: hello world- 3
pacman4: hello world- 1
pacman4: hello world- 2

Interactive Work Continued. If your code uses significant amounts of memory or you are interested in timing measurements, you can get dedicated interactive nodes (subject to node availability).

# Request a 2 node debug job.
pacman4 % qsub -l nodes=2:ppn=16 -q debug -I
qsub: waiting for job 206244.scyld to start
qsub: job 206244.scyld ready
n118 %
n118 % cd $PBS_O_WORKDIR
n118 % module load PrgEnv-gnu
n118 % mpirun -np 32 ./hello.exe
...
n118 % exit

Software Stack. Pacman has a software stack based on requests from people on campus. If we have a package installed, there is likely already someone on campus using that package. When possible we put example job scripts and test cases in $SAMPLES_HOME. If you want a test case for a package, let us know and we will make one!

Meteorology, Earth Sciences, and Oceanography Related Software: NetCDF library (versions 3 and 4), pnetcdf, NCO (NetCDF Operators), NCL (NCAR Command Language), HDF4, HDF5, GMT, GrADS, Python with add-ons, Octave with NetCDF support, Matlab*, IDL**, PyNGL, Ferret, NCView

Engineering & Chemistry Software. Engineering: OpenFOAM (CFD), Abaqus**, COMSOL**, Matlab, Octave. Chemistry: Gaussian, NWChem, NAMD.

Visualization Software: Ferret, GNUPlot, Grace, NCL, NCView, Python matplotlib, ImageMagick, Octave, VisIt

Other Software: Python with many add-ons, R, CUDA

Additional Information. Watch www.arsc.edu for upcoming training and additional information about ARSC. More info:
ARSC How-To: http://www.arsc.edu/arsc/support/howtos/usingpacman
PGI Compilers: http://www.pgroup.com/resources/docs.htm
GNU Compilers: http://gcc.gnu.org/

Exercises. If you would like some practice on the things covered today, see: $SAMPLES_HOME/training/introToPacman. This covers modules, interactive work and batch jobs.
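One hedged way to start is to copy the exercise material into your own space and work from there (the destination directory is just an example):

# Copy the exercises to your $CENTER area and take a look
pacman1 % cp -r $SAMPLES_HOME/training/introToPacman $CENTER/
pacman1 % cd $CENTER/introToPacman
pacman1 % ls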

Questions and Feedback. If you have additional questions, feel free to contact us:
ARSC Help Desk
Phone: (907) 450-8602
Email: consult@arsc.edu
Web: http://www.arsc.edu/support
We welcome your feedback on this class!