1 Choosing Resources Wisely. Plamen Krastev, FAS Research Computing. Office: 38 Oxford, Room 117
2 Objectives
- Inform you of available computational resources
- Help you choose appropriate computational resources for your research
- Provide guidance for scaling up your applications and performing computations more efficiently
- More efficient use = more resources available to do research
- Enable you to work smarter, better, faster
3 Outline
- Choosing computational resources
- Overview of available RC resources
- Partition / queue
- Time
- Number of nodes and cores
- Memory
- Storage
- Examples
4 What resources do I need?
- Is my code serial or parallel?
- How many cores and/or nodes does it need?
- How much memory does it require?
- How long does my code take to run?
- How big is the input / output data for each run?
- How is the input data read by the code (e.g., hardcoded, keyboard, parameter/data file(s), external database/website, etc.)?
5 What resources do I need?
- How is the output data written by the code (standard output/screen, data file(s), etc.)?
- How many tasks/jobs/runs do I need to complete?
- What is my timeframe / deadline for the project (e.g., paper, conference, thesis, etc.)?
- What computational resources are available at Research Computing?
6 RC resources: Odyssey
Odyssey is a large-scale heterogeneous HPC cluster.
Compute:
- 60,000+ compute cores (and increasing)
- Cores per node: 8 to 64
- Memory per node: 12GB to 512GB (4GB/core)
- 1,000,000+ NVIDIA GPU cores
Storage:
- Over 35PB of storage
- Home directories: 100GB
- Lab space: initial 4TB at $0, with expansion available for purchase on a TB basis at $45/TB/year
- Local scratch: 270GB/node
- Global scratch: high-performance shared scratch, 1PB total, Lustre file system
7 RC resources: Odyssey
Software:
- CentOS
- SLURM job manager
- 1,000+ scientific tools and programs
Interconnect: 2 underlying networks connecting 3 data centers
- TCP/IP network
- Low-latency 56 Gb/s InfiniBand network: inter-node parallel computing, fast access to Lustre-mounted storage
Hosted machines:
- 300+ virtual machines
- Lab instrument workstations
8 Available Storage
Home directories:
- Size limit: 100GB
- Availability: all cluster nodes + desktop/laptop
- Backup: hourly snapshot + daily offsite
- Retention policy: indefinite
- Performance: moderate; not suitable for high I/O
- Cost: free
Lab storage:
- Size limit: 4TB+
- Availability: all cluster nodes + desktop/laptop
- Backup: daily offsite
- Retention policy: indefinite
- Performance: moderate; not suitable for high I/O
- Cost: 4TB free + expansion at $45/TB/yr
Local scratch:
- Size limit: 270GB/node
- Availability: local compute node only
- Backup: none
- Retention policy: job duration
- Performance: suited for small-file I/O intensive jobs
- Cost: free
Global scratch:
- Size limit: 1.2PB total
- Availability: all cluster nodes
- Backup: none
- Retention policy: 90 days
- Performance: appropriate for large-file I/O intensive jobs
- Cost: free
Persistent research data:
- Size limit: 3PB
- Availability: only IB-connected cluster nodes
- Backup: external repos; no separate backup
- Retention policy: 3-9 months
- Performance: appropriate for large I/O intensive jobs
- Cost: free
9 Partition / Queue
Time limits by partition:
- general: 7 days
- serial_requeue: 7 days
- interact: 3 days
- bigmem: no limit
- unrestricted: no limit
- Lab queues: no limit
Partitions also differ in number of nodes, cores per node, and memory per node (GB).
Batch jobs:
#SBATCH -p general  # Partition name
Interactive or test jobs:
srun -p interact OTHER_OPTIONS
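As a quick orientation (not on the slides themselves), here is a minimal sketch of the submit-and-monitor cycle using standard SLURM commands; the script name test_job.sh, the program name, and the partition choice are placeholders:

#!/bin/bash
#SBATCH -p serial_requeue   # partition from the table above
#SBATCH -t 0-00:10          # 10-minute time limit
#SBATCH -c 1                # one core
#SBATCH --mem=1000          # memory per node in MB
./my_program.x              # placeholder executable

# Submit and monitor from a login node:
sbatch test_job.sh          # prints "Submitted batch job <jobid>"
squeue -u $USER             # list your pending and running jobs
scancel <jobid>             # cancel a job by its ID if needed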
10 Time
How long does my code take to run?
Batch jobs:
#SBATCH -p serial_requeue
#SBATCH -t 0-02:00  # Time in D-HH:MM
Interactive or test jobs:
srun -t 0-02:00 -p interact OTHER_JOB_OPTIONS
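The slides use the D-HH:MM form; as a side note (standard SLURM time syntax rather than something shown on the slides), sbatch and srun accept several equivalent spellings of the limit:

#SBATCH -t 120          # minutes (two hours)
#SBATCH -t 02:00:00     # HH:MM:SS (two hours)
#SBATCH -t 0-02:00      # D-HH:MM, the form used on these slides (two hours)
#SBATCH -t 1-00:00:00   # D-HH:MM:SS (one full day)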
11 Number of nodes and cores
Is my code serial or parallel?
Serial (single-core) jobs
Batch jobs:
#SBATCH -p serial_requeue
#SBATCH -c 1  # Number of cores
Interactive or test jobs:
srun -c 1 -p interact OTHER_JOB_OPTIONS
Core / thread / process / CPU
12 Number of nodes and cores
Parallel shared-memory (single-node) jobs. Examples:
- OpenMP (Fortran, C/C++)
- MATLAB Parallel Computing Toolbox (PCT)
- Python (e.g., threading, multiprocessing)
- R (e.g., multicore)
Batch jobs:
#SBATCH -p general  # Partition
#SBATCH -N 1        # Number of nodes
#SBATCH -c 4        # Number of cores (per task)
srun -c 4 PROGRAM PROGRAM_OPTIONS
Interactive or test jobs:
srun -p interact -N 1 -c 4 OTHER_OPTIONS
13 Number of nodes and cores
Parallel distributed-memory (multi-node) jobs. Examples:
- MPI (openmpi, impi, mvapich) with Fortran or C/C++ code
- MATLAB Distributed Computing Server (DCS)
- Python (e.g., mpi4py)
- R (e.g., Rmpi, snow)
Batch jobs:
#SBATCH -p general  # Partition
#SBATCH -n 4        # Number of tasks
srun -n 4 PROGRAM PROGRAM_OPTIONS
Interactive or test jobs:
srun -p interact -n 4 OTHER_OPTIONS
14 Memory
Serial and parallel shared-memory (single-node) jobs
Batch jobs:
#SBATCH -p serial_requeue   # Partition
#SBATCH --mem=4000          # Memory per node in MB
Interactive or test jobs:
srun --mem=4000 -p interact OTHER_OPTIONS
Parallel distributed-memory (multi-node) jobs
Batch jobs:
#SBATCH -p general          # Partition
#SBATCH -n 4                # Number of tasks
#SBATCH --mem-per-cpu=4000  # Memory per core in MB
Interactive or test jobs:
srun --mem-per-cpu=4000 -n 4 -p interact OTHER_OPTIONS
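A quick arithmetic check of the difference between the two flags (my own illustration, using the values above and assuming the default of one CPU per task): --mem sets a per-node limit, while --mem-per-cpu is multiplied by the number of allocated cores.

#SBATCH --mem=4000          # single-node job: 4000 MB total on that node
#SBATCH -n 4
#SBATCH --mem-per-cpu=4000  # multi-node job: 4 tasks x 4000 MB = 16000 MB in total,
                            # spread across whichever nodes the tasks land on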
15 Memory
How much memory does my code require?
- Understand your code and how the algorithms scale analytically
- Run an interactive job and monitor memory usage (with the top Unix command)
- Run a test batch job and check memory usage after the job has completed (with the sacct SLURM command)
16 Memory
Know your code. Example: a real*8 (Fortran) or double (C/C++) matrix of dimension 100,000 x 100,000 requires ~80GB of RAM.
Data type sizes (Fortran / C, in bytes):
- integer*4 / int: 4
- integer*8 / long: 8
- real*4 / float: 4
- real*8 / double: 8
- complex*8 / float complex: 8
- complex*16 / double complex: 16
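A minimal sketch of that estimate done directly in the shell (my own arithmetic, matching the slide's example):

# 100,000 x 100,000 double-precision (8-byte) matrix:
echo $((100000 * 100000 * 8))             # 80000000000 bytes
echo $((100000 * 100000 * 8 / 1000**3))   # ~80 GB (decimal)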
17 Memory
Run an interactive job and monitor memory usage (with the top Unix command).
Example: check the memory usage of a matrix diagonalization code.
- Request an interactive bash shell session: srun -p interact -n 1 -t 0-02:00 --pty --mem=4000 bash
- Run the code, e.g., ./matrix_diag.x
- Open a new shell terminal and ssh to the compute node where the interactive job was dispatched, e.g., ssh holy2a18307
- In the new shell terminal run top, e.g., top -u pkrastev
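Put together as a paste-able sequence (the executable name and node are the slide's examples; the sstat line is my addition, a standard SLURM command for checking a still-running batch job without ssh-ing to the node):

# Terminal 1: interactive session on a compute node
srun -p interact -n 1 -t 0-02:00 --pty --mem=4000 bash
./matrix_diag.x

# Terminal 2: watch the process from a login node
ssh holy2a18307           # the node the interactive job landed on
top -u pkrastev           # the RES column shows resident memory per process

# Alternative for a running batch job (not on the slide):
sstat -j <jobid>.batch --format=JobID,MaxRSS,MaxVMSize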
18 Memory
Run 1: matrix dimension = 3,000 x 3,000 (real*8)
Needs 3,000 x 3,000 x 8 bytes = 72,000,000 bytes, i.e., ~72 MB of RAM
19 Memory
Run 2: input size changed. Doubling the matrix dimension quadruples the required memory.
Matrix dimension = 6,000 x 6,000 (real*8)
Needs 6,000 x 6,000 x 8 bytes = 288,000,000 bytes, i.e., ~288 MB of RAM
20 sacct overview
sacct queries the SLURM accounting database. Every 30 seconds each node collects the CPU and memory usage of all the process IDs belonging to a given job; after the job ends this data is sent to slurmdb.
Common flags:
- -j jobid or --name=jobname
- -S YYYY-MM-DD and -E YYYY-MM-DD (start and end of the time window)
- -o output_options, e.g.: JobID,JobName,NCPUS,Nnodes,Submit,Start,End,CPUTime,TotalCPU,ReqMem,MaxRSS,MaxVMSize,State,Exit,Node
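For example, a full query over a date range might look like the line below (a sketch; <jobid> is a placeholder, and I have spelled the last two fields as ExitCode and NodeList, the long forms sacct documents):

sacct -j <jobid> -S 2017-01-01 -E 2017-01-31 \
      -o JobID,JobName,NCPUS,NNodes,Submit,Start,End,CPUTime,TotalCPU,ReqMem,MaxRSS,MaxVMSize,State,ExitCode,NodeList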
21 Memory
Run a test batch job and check memory usage after the job has completed (with the sacct SLURM command). Example:
[pkrastev@sa01 Resources]$ sacct -o ReqMem,MaxRSS -j <jobid>
ReqMem is reported per node (the Mn suffix) and MaxRSS in KB; in this run ReqMem = 320MB, roughly 10% above the MaxRSS the job actually used.
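As a sketch of how you might act on those numbers (my own illustration, echoing the slide's example):

# Inspect the finished test job:
sacct -o JobID,ReqMem,MaxRSS,State -j <jobid>
#
# ReqMem ends in "n" (per node) or "c" (per core); MaxRSS is what the job
# actually used. If ReqMem is far above MaxRSS, lower --mem (or --mem-per-cpu)
# in the next submission; if the job was killed for exceeding its memory,
# raise it and rerun.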
22 Storage
- Home directories, /n/home*, and lab storage are not appropriate for I/O-intensive jobs or for large numbers of jobs. Typical use is job scripts, in-house analysis codes, and self-installed software.
- For jobs that create a high volume of small files (< 10 MB), use local scratch. You need to copy your input data to /scratch and move output data to a different location after the job completes (see the sketch below).
- For I/O-intensive jobs with large data files (> 100 MB) and/or a large number of data files (totaling 100s of MB), use the global scratch file system /n/regal.
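A minimal sketch of the stage-in / stage-out pattern for local scratch (the directory layout, lab name, and file names are placeholders, not FASRC-prescribed paths):

#!/bin/bash
#SBATCH -p serial_requeue
#SBATCH -t 0-01:00
#SBATCH -c 1
#SBATCH --mem=4000

WORKDIR=/scratch/$USER/$SLURM_JOB_ID    # per-job directory on the node-local disk
mkdir -p $WORKDIR
cp ~/input/data.in $WORKDIR/            # stage input in
cd $WORKDIR
./my_program.x data.in > data.out       # run against local scratch
cp data.out /n/regal/my_lab/$USER/      # stage results out before the job ends
rm -rf $WORKDIR                         # clean up local scratch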
23 Storage
- 60 Oxford St: initial lab shares (4TB), legacy equipment
- 1 Summer Street: personal home directories, purchased lab shares, older lab-owned compute nodes
- Holyoke, MA: global scratch high-performance filesystem, compute nodes from 2012 onward (33K+ cores)
Topology may affect the efficiency of your work! For best performance, storage needs to be close to compute.
24 Storage Utilization
Use the du Unix command to check disk usage, e.g.:
du -h $HOME
...
37G /n/home06/pkrastev
25 Examples
Serial application
#!/bin/bash
#SBATCH -J lapack_test
#SBATCH -o lapack_test.out
#SBATCH -e lapack_test.err
#SBATCH -p serial_requeue
#SBATCH -t 0-00:30
#SBATCH -N 1
#SBATCH -c 1
#SBATCH --mem=4000

# Load required modules
source new-modules.sh

# Run program
./lapack_test.x
26 Examples
Parallel OpenMP (single-node) application
#!/bin/bash
#SBATCH -J omp_dot
#SBATCH -o omp_dot.out
#SBATCH -e omp_dot.err
#SBATCH -p general
#SBATCH -t 0-02:00
#SBATCH -N 1
#SBATCH -c 4
#SBATCH --mem=16000

# Set up environment
source new-modules.sh
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Run program
srun -c $SLURM_CPUS_PER_TASK ./omp_dot.x
27 Examples
MATLAB Parallel Computing Toolbox (single-node) application
#!/bin/bash
#SBATCH -J parallel_monte_carlo
#SBATCH -o parallel_monte_carlo.out
#SBATCH -e parallel_monte_carlo.err
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 0-03:30
#SBATCH -p general
#SBATCH --mem=32000

# Load required software modules
source new-modules.sh
module load matlab/r2016a-fasrc01

# Run program
srun -n 1 -c 8 matlab-default -nosplash -nodesktop -r "parallel_monte_carlo;exit"
28 Examples
Parallel MPI (multi-node) application
#!/bin/bash
#SBATCH -J planczos
#SBATCH -o planczos.out
#SBATCH -e planczos.err
#SBATCH -p general
#SBATCH -t 30
#SBATCH -n 8
#SBATCH --mem-per-cpu=4000

# Load required modules
source new-modules.sh
module load intel/<version>-fasrc01
module load openmpi/1.8.3-fasrc02

# Run program
srun -n 8 --mpi=pmi2 ./planczos.x
29 Test first
Before diving right into submitting 100s or 1000s of research jobs, ALWAYS test a few first:
- ensure the job runs to completion without errors
- ensure you understand the resource needs and how they scale with different data sizes and input options
A simple test cycle is sketched below.
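A minimal sketch of such a test cycle, pulling together commands from the earlier slides (the script name and <jobid> are placeholders):

sbatch test_job.sh    # submit one representative job first
squeue -u $USER       # wait for it to leave the queue
sacct -j <jobid> -o JobID,State,Elapsed,CPUTime,TotalCPU,ReqMem,MaxRSS
# Check that State is COMPLETED, that Elapsed is well inside the -t limit,
# and that MaxRSS is comfortably below ReqMem, then scale up to the full set.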
30 Contact Information
Harvard Research Computing
Website:
Office Hours: Wednesdays noon-3pm, 38 Oxford Street, 2nd Floor Conference Room