The JANUS Computing Environment


Research Computing, University of Colorado
The JANUS Computing Environment
Monte Lunacek, monte.lunacek@colorado.edu
rc-help@colorado.edu

What is JANUS? (November 2011)
1,368 compute nodes, 16,416 processors (2.8 GHz Intel Westmere)
~ 20 GB of available space, ~ 800 TB of storage
TFLOPS is a rate of execution: trillions of floating-point operations per second

NUMA architecture
Resource management and queues
Different architectures
Parallel file systems
Lots of ways to do something...
Explicit environment

Online resources www.rc.colorado.edu

Overview
Access: login, file system, data transfer
Software: supported software, dotkits, building software
Resource Management: queues, Moab, and Torque
Running Jobs: single-core, load-balanced, MPI, OpenMP
Questions

Access

Login Procedure: ssh <username>@login.rc.colorado.edu
Password: YubiKey or CryptoCard (one-time password)

RC Filesystem
Home directory: /home/<user_name>, 2 GB, Network File System (NFS)
Project space (build software here): /projects/<user_name>, 250 GB, NFS
Scratch space (run software here): /lustre/janus_scratch/<user_name>, no quota, no backup, Lustre file system
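
As a rough sketch of that split between project and scratch space, a build-then-run workflow might look like the following; "my_app", "input.dat", and "run_001" are placeholder names, not part of the RC documentation.

# build the code under /projects, run it out of Lustre scratch
cd /projects/$USER/my_app
make

RUNDIR=/lustre/janus_scratch/$USER/run_001
mkdir -p $RUNDIR
cp input.dat $RUNDIR/
cd $RUNDIR
/projects/$USER/my_app/simulator > output.log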

Snapshot
Did you accidentally remove a file or directory?
$HOME/.snapshot/hourly.[0-12]
$HOME/.snapshot/nightly.[0-6]
$HOME/.snapshot/weekly.[0-7]
Example:
rm $HOME/bugreport.csh
cp $HOME/.snapshot/weekly.0/bugreport.csh $HOME
Where? $HOME/.snapshot and /projects/<user_name>/.snapshot

Lustre: a scalable, POSIX-compliant parallel file system designed for large, distributed-memory systems.
Object Storage Targets (OST): store user file data.
Object Storage Servers (OSS): control I/O access and handle network requests.
Metadata Target (MDT): stores filenames, directories, permissions, and file layout.
Metadata Server (MDS): assigns the storage locations associated with each file in order to direct file I/O requests to the correct set of OSTs.

Diagram: Lustre components: metadata server (MDS) with its metadata target (MDT), and object storage servers (OSS) with their object storage targets (OST), connected over InfiniBand (IB).

File access (diagram): the compute node first requests the storage location from the MDS/MDT, then interacts directly with the OSTs.

Striping
A file is a contiguous sequence of bytes. Key feature: the Lustre file system can distribute these segments across multiple OSTs using a technique called file striping. A file is said to be striped when its contiguous sequence of bytes is separated into small chunks, or stripes, so that read and write operations can access multiple OSTs concurrently.

File I/O patterns:
Serial: one process writes /file
File-per-process: /file1, /file2, ..., /filen
Shared file: all processes write /file
Collective buffering: not currently supported on JANUS

Figure: single-processor write speed (Mb/s) versus stripe count (1 to 60), for transfer sizes of 1 MB and 32 MB.

Figure: file-per-process write speed (Mb/s) versus number of processors (one file each), from 1 to 2048.

Figure: shared-file-with-striping write speed (Mb/s) versus number of processors, from 1 to 1024.

Examples
bash-janus> mkdir temp_dir
bash-janus> lfs setstripe -c 3 temp_dir
bash-janus> touch temp_dir/temp_file
bash-janus> lfs getstripe temp_dir
temp_dir
stripe_count:  3  stripe_size:  33554432  stripe_offset:  -1
temp_dir/temp_file
lmm_stripe_count:   3
lmm_stripe_size:    33554432
lmm_stripe_offset:  18
     obdidx       objid       objid   group
         18    12787913    0xc320c9       0
          7    12863377    0xc44791       0
         23    12496893    0xbeaffd       0

Data transfer: https://www.rc.colorado.edu/crcdocs/file-transfer
GridFTP: a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks.
Globus Online: large file transfers with drag and drop; archiving to move data between long-term archival storage and compute systems.
scp, sftp, rsync: good for small files.
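
For the small-file case, a transfer from your workstation might look like the sketch below; the local file names and target directories are placeholders.

# copy a single archive into your project space
scp results.tar.gz <username>@login.rc.colorado.edu:/projects/<username>/

# mirror a local data directory into Lustre scratch
rsync -av data/ <username>@login.rc.colorado.edu:/lustre/janus_scratch/<username>/data/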

Access tips
Control sockets: one-time passwords make multiple terminal sessions and file transfer painful. Reuse one authenticated connection:
mkdir -p ~/.ssh/sockets
cat >> ~/.ssh/config << EOF
Host login.rc*
  ControlMaster auto
  ControlPath ~/.ssh/sockets/%r@%h:%p
EOF
Mount drive: http://macfusionapp.org/
Symbolic links: /project, /scratch

Software

Software support
Supported software: RC expertise is used to select state-of-the-art software; installation, verification, and training are provided.
Unsupported software: installation relies on user expertise.
Consulting: advice on installing your software and any dependencies.

Environment
To run an executable, you need to know where it is:
/opt/openmpi/1.4.4/bin/mpicxx
/opt/mpich2/1.5a2/bin/mpicxx
Which one does the command "which mpicxx" use? That is determined by PATH.
What about libraries?
/opt/openmpi/1.4.4/lib/libmpi.so
/opt/mpich2/1.5a2/lib/libmpi.so
That is determined by LD_LIBRARY_PATH.
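
A minimal shell sketch of how the two variables pick between installations; the directories are the ones named above, and the export lines are illustrative rather than a recommended JANUS setup.

# put the OpenMPI 1.4.4 install first on both search paths
export PATH=/opt/openmpi/1.4.4/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/1.4.4/lib:$LD_LIBRARY_PATH

which mpicxx    # now resolves to /opt/openmpi/1.4.4/bin/mpicxx;
                # at run time, libmpi.so is found through LD_LIBRARY_PATH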

Dotkit manages your environment variables:
use                    list packages in use
use -a                 list hidden packages in use
use <package_name>     add a package to your environment
unuse <package_name>   remove a package from your environment
use -la                list available packages
use -la <term>         list available packages that contain <term>
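
A brief example session; the kit name below is only an illustration of the naming scheme, not necessarily a package that exists on JANUS.

use -la hdf5                                  # list kits whose names contain "hdf5"
use .hdf5-1.8.8_openmpi-1.4.5_intel-12.1.4    # add one to your environment
use                                           # confirm it is now loaded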

Examples
bash-janus> use NCAR-Parallel-Intel
bash-janus> echo $PATH
The resulting PATH contains (one entry per line here for readability):
/curc/tools/free/redhat_5_x86_64/parallel-netcdf-1.2.0_openmpi-1.4.5_intel-12.1.4/bin
/curc/tools/free/redhat_5_x86_64/openmpi-1.4.5_intel-12.1.4/bin
/curc/tools/free/redhat_5_x86_64/torque-2.5.8/bin
/curc/tools/free/redhat_5_x86_64/netcdf-4.1.3_intel-12.1.4_hdf-4.2.6_hdf5-1.8.8_openmpi-1.4.5/bin
/curc/tools/free/redhat_5_x86_64/hdf5-1.8.8_openmpi-1.4.5_intel-12.1.4/bin
/curc/tools/nonfree/redhat_5_x86_64/intel-12.1.4/composer_xe_2011_sp1.10.319/bin/intel64
/curc/tools/free/redhat_5_x86_64/sun_jdk-1.6.0_23-x86_64/bin
/curc/tools/free/redhat_5_x86_64/hdf-4.2.6_ics-2012.0.032/bin
/curc/tools/free/redhat_5_x86_64/szip-2.1/bin
/curc/tools/nonfree/redhat_5_x86_64/moab-6.1.5/bin

Building Software
I need the Boost C++ library for my software. Where should I build this?
/home/molu8455/projects/software/boost/1.49.0
Build on a compute node (e.g. qsub -I); a sketch follows below.
Ideas: consider sharing this with your group. How about your own dotkit?
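
As a rough illustration of the "build on a compute node" advice, the session below builds Boost into a projects directory; the queue, walltime, install prefix, and exact Boost build steps are assumptions for this sketch, not taken from the RC documentation.

qsub -I -q janus-debug -l walltime=01:00:00,nodes=1:ppn=12   # get an interactive compute node
cd /projects/$USER/software
tar xzf boost_1_49_0.tar.gz
cd boost_1_49_0
./bootstrap.sh --prefix=/projects/$USER/software/boost/1.49.0
./b2 install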

Build your own dotkit
cat $HOME/.kits/TeachingHPC.dk
#c Teaching HPC
#d This contains the libraries I use for teaching HPC:
#d   .openmpi-1.4.3_gcc-4.5.2_torque-2.5.8_ib
#d   .hdf5-1.8.6

# Dependencies
dk_op -q .torque-2.5.8
dk_op -q .openmpi-1.4.3_gcc-4.5.2_torque-2.5.8_ib
dk_op -q .hdf5-1.8.6

# Variables
dk_alter HDF5_DIR /curc/tools/free/redhat_5_x86_64/hdf5-1.8.6
dk_alter BOOST_ROOT /home/molu8455/projects/software/boost/1.49.0
dk_alter LD_LIBRARY_PATH /home/molu8455/projects/software/boost/1.49.0/lib
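
Once the kit file is in place, it can be loaded like any other package, assuming your dotkit setup searches $HOME/.kits (an assumption here, not stated on the slide):

use TeachingHPC     # load the personal kit and its dependencies
echo $BOOST_ROOT    # variables set by the kit are now in the environment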

Resource Management

Figure: scheduling, jobs 1 through 7 laid out on nodes over time.

Figure: scheduling, the same jobs packed onto nodes over time by the scheduler.

Moab and Torque
Moab: the brains of the operation; comes up with the schedule.
Torque: reports information to Moab; receives direction from Moab; handles user requests; provides job query facilities.

Commands
showq -u <username>        Show jobs in the queue
canceljob <job_id> or ALL  Cancel your job(s)
checkjob <job_id>          Information about your job
qsub                       Submit jobs
showstart <job_id>         When will your job start?
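
Putting a few of these together, a typical cycle might look like the sketch below; job.pbs and the job id are placeholders.

qsub job.pbs          # returns a job id, e.g. 123456
showq -u $USER        # is it queued or running?
checkjob 123456       # details on why it is (or is not) starting
showstart 123456      # scheduler's estimate of the start time
canceljob 123456      # give up on it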

qsub: request resources for your job, either 1) batch or 2) interactive (an interactive request is sketched below).
Makes environment variables available to your job: PBS_O_*, PBS_O_WORKDIR, PBS_NODEFILE.
Options:
-q <queue_name>
-l <resource_list>
-I  (interactive)
-N <name>
-e <error_path>
-o <output_path>
-j <join_path>
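
For example, an interactive session on the debug queue might be requested roughly like this; the walltime and node counts are arbitrary choices for the sketch.

qsub -I -q janus-debug -l walltime=00:30:00,nodes=1:ppn=12
# once the prompt returns on a compute node:
cd $PBS_O_WORKDIR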

Queues
Name          Nodes    Max Time   Node Sharing
janus-debug   1-480    1 hour
janus-short   1-480    4 hours
janus-long    1-80     7 days
janus-small   1-20     1 day
janus-normal  21-80    1 day
janus-wide    81-480   1 day

Running Jobs

Process
How many processors do I need? Approximately how long will this take?
showstart 1024@30:00
showstart 16@16:00:00
Which queue best fits these criteria? (See the queue table above.)

Serial Jobs
#!/bin/bash
#PBS -N example_1
#PBS -q janus-debug
#PBS -l walltime=00:05:00
#PBS -l nodes=1:ppn=1
#PBS -e errfile
#PBS -o outfile

cd $PBS_O_WORKDIR

# run trial 1 of the simulator
./simulator 1 > sim.1

Pack the node
#!/bin/bash
#PBS -N example_2
#PBS -q janus-debug
#PBS -l walltime=0:00:30,nodes=1:ppn=12

cd $PBS_O_WORKDIR

./simulator 1 > sim.1 &
./simulator 2 > sim.2 &
./simulator 3 > sim.3 &
./simulator 4 > sim.4 &
./simulator 5 > sim.5 &
./simulator 6 > sim.6 &
./simulator 7 > sim.7 &
./simulator 8 > sim.8 &
./simulator 9 > sim.9 &
./simulator 10 > sim.10 &
./simulator 11 > sim.11 &
./simulator 12 > sim.12 &
wait
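
The twelve background launches above could equally be written as a loop; this is just a compact restatement of the same pattern, not a separate RC recommendation.

for i in $(seq 1 12); do
    ./simulator $i > sim.$i &    # one background process per core
done
wait                             # hold the job until all twelve finish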

Multi-node serial jobs? Consider using our load-balancing tool.
https://www.rc.colorado.edu/tutorials/loadbalance
#!/bin/bash
#PBS -N example_1
#PBS -q janus-debug
#PBS -l walltime=00:05:00
#PBS -l nodes=2:ppn=12

cd $PBS_O_WORKDIR
. /curc/tools/utils/dkinit
reuse LoadBalance
mpirun load_balance -f cmd_lines

cmd_lines contains one command per line:
./simulator 1 > sim.1
./simulator 2 > sim.2
./simulator 3 > sim.3
./simulator 4 > sim.4
./simulator 5 > sim.5
./simulator 6 > sim.6
./simulator 7 > sim.7
./simulator 8 > sim.8
./simulator 9 > sim.9
./simulator 10 > sim.10
...
./simulator 2000 > sim.2000
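
The cmd_lines file itself can be generated rather than typed; a small sketch, with the 2000 mirroring the example above:

# write one command per line into cmd_lines
for i in $(seq 1 2000); do
    echo "./simulator $i > sim.$i"
done > cmd_lines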

MPI
#!/bin/bash
#PBS -N example_4
#PBS -q janus-debug
#PBS -l walltime=0:10:00
#PBS -l nodes=3:ppn=12

cd $PBS_O_WORKDIR
reuse .openmpi-1.4.5_intel-12.1.4

# run trial 1 of the simulator
mpirun -np 36 ./simulator
# or let mpirun use every allocated slot
mpirun ./simulator

Non-Uniform Memory Access (NUMA)
Each socket has a dedicated memory area for high-speed access, and an interconnect to other sockets for slower access to the other sockets' memory.
Diagram: two sockets, each with its own memory controller and attached memory, joined by an inter-socket link.

MPI + OpenMP / High Memory
#!/bin/bash
#PBS -N example_5
#PBS -q janus-debug
#PBS -l walltime=0:10:00
#PBS -l nodes=3:ppn=12

cd $PBS_O_WORKDIR
. /curc/tools/utils/dkinit
reuse .openmpi-1.4.5_intel-12.1.4

# one MPI rank per node, 12 OpenMP threads each
export OMP_NUM_THREADS=12
mpirun --bind-to-core --bynode --npernode 1 ./simulator

# or one MPI rank per socket, 6 OpenMP threads each
export OMP_NUM_THREADS=6
mpirun --bind-to-socket --bysocket --npersocket 1 ./simulator

Summary
Access: use control sockets for login.
Filesystem: build software in /projects/<user_name>; run your jobs in /lustre/janus_scratch/<user_name>; recover files with .snapshot; consider striping when using shared-file access.
Data transfer: large files with Globus Online or GridFTP; smaller files with sftp, scp.

Software: build on a compute node; manage your environment with your own dotkits.
Resource management: familiarize yourself with the queues; when you have choices, use showstart.
Running jobs: request what you need and manage many serial tasks with LoadBalance; for OpenMP, be aware of NUMA; limit the number of processes per node for hybrid and high-memory jobs.

Questions?

Collective buffering
At large core counts, I/O performance can be hindered by MDS contention (file-per-process) and file system contention (shared-file).
Collective buffering uses a subset of application processes to perform I/O: it limits the number of files (file-per-process), limits the number of processes accessing file system resources (shared-file), and offloads work from the file system to the application. Only a subset of processors write, reducing contention.