Introduction to High Performance Computing
Advanced Research Computing


Outline
- What constitutes high performance computing (HPC)?
- When to consider HPC resources
- What kinds of problems are typically solved?
- What are the components of HPC?
- What resources are available?
- Overview of HPC resources at Virginia Tech

Should I Pursue HPC for My Problem?
- Are local resources insufficient to meet your needs?
  - Very large jobs
  - Very many jobs
  - Large data
- Do you have national collaborators?
  - Shared projects between different entities
  - Convenient mechanisms for data sharing

Who Uses HPC?
[Pie chart: allocations by research domain - Physics (91) 19%, Molecular Biosciences (271) 17%, Astronomical Sciences (115) 13%, Atmospheric Sciences (72) 11%, Materials Research (131) 9%, Chemical & Thermal Systems (89) 8%, Chemistry (161) 7%, Scientific Computing (60) 2%, Earth Sciences (29) 2%, Training (51) 2%]
- More than 2 billion CPU-hours allocated
- 1,400 allocations
- 350 institutions
- 32 research domains

Learning Curve
- Linux: command-line interface
- Scheduler: shares resources among multiple users
- Parallel computing: code must be parallelized to take advantage of a supercomputer's resources
  - Third-party programs or libraries make this easier

Popular Software Packages
- Molecular dynamics: Gromacs, LAMMPS
- CFD: OpenFOAM, Ansys
- Finite elements: deal.II, Abaqus
- Chemistry: VASP, Gaussian
- Climate: CESM
- Bioinformatics: Mothur, QIIME, mpiBLAST
- Numerical computing/statistics: R, MATLAB
- Visualization: ParaView, EnSight

WHAT IS PARALLEL COMPUTING?

Parallel Computing 101
- Parallel computing: the use of multiple processors or computers working together on a common task
  - Each processor works on its section of the problem
  - Processors can exchange information (see the sketch below)
[Diagram: a 2-D grid of the problem to be solved, divided among CPUs #1-#4; each CPU works on its own area and exchanges boundary data with its neighbors]
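
To make the exchange concrete, here is a minimal MPI sketch of this decomposition (reduced to 1-D for brevity): each rank owns a strip of the grid and swaps boundary values with its neighbors. The strip size and initial data are placeholders, not values from the slides.

```c
/* Minimal sketch of domain decomposition with MPI: each rank owns a
 * strip of the problem and exchanges boundary values with neighbors.
 * NLOCAL and the initial data are placeholder values. */
#include <mpi.h>
#include <stdio.h>

#define NLOCAL 1000  /* points owned by each rank (assumed) */

int main(int argc, char **argv) {
    int rank, nprocs;
    double u[NLOCAL + 2];  /* local strip plus one ghost cell per side */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (int i = 0; i < NLOCAL + 2; i++)
        u[i] = (double)rank;  /* placeholder data */

    /* neighbors; MPI_PROC_NULL turns edge communication into a no-op */
    int left  = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;

    /* the "exchange" arrows in the diagram: send my edge values,
     * receive the neighbors' edges into my ghost cells */
    MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                 &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[NLOCAL], 1, MPI_DOUBLE, right, 1,
                 &u[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* ... each rank now updates its own section of the problem ... */
    printf("rank %d of %d: ghost cells %.0f %.0f\n",
           rank, nprocs, u[0], u[NLOCAL + 1]);

    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched with mpirun -np 4 ./a.out (or the scheduler's wrapper, as shown in the batch-submission slide later).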

Why Do Parallel Computing?
- Limits of single-CPU computing:
  - performance
  - available memory
  - I/O rates
- Parallel computing allows us to:
  - solve problems that don't fit on a single CPU
  - solve problems that can't be solved in a reasonable time
- We can solve larger problems, faster, and run more cases

A Change in Moore's Law

Parallelism is the New Moore's Law
- Power and energy efficiency impose a key constraint on the design of microarchitectures
- Clock speeds have plateaued
- Hardware parallelism is increasing rapidly to make up the difference

WHAT DOES A MODERN SUPERCOMPUTER LOOK LIKE?

Essential Components of HPC
- Supercomputing resources
- Storage
- Visualization
- Data management
- Network infrastructure
- Support

Blade : Rack : System
- 1 node: 2 x 8 cores = 16 cores
- 1 chassis: 10 nodes = 160 cores
- 1 rack (frame): 4 chassis = 640 cores
- 1 system: 10 racks = 6,400 cores

Shared and Distributed Memory
Shared memory:
- All processors have access to a pool of shared memory
- Access times vary from CPU to CPU in NUMA systems
- Examples: SGI UV, the CPUs on a single node
Distributed memory:
- Memory is local to each processor
- Data exchange by message passing over a network
- Example: clusters with single-socket blades
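
In code, the shared-memory model corresponds to threading, e.g. OpenMP (named on the next slide): every thread works on the same array in the shared pool, so no messages are needed. A minimal sketch, assuming a compiler with OpenMP support (e.g. gcc -fopenmp); the problem size is a placeholder:

```c
/* Shared-memory sketch: OpenMP threads all see the same array,
 * so parallelizing the loop needs no explicit data exchange. */
#include <omp.h>
#include <stdio.h>

#define N 1000000  /* problem size (placeholder) */
static double a[N];

int main(void) {
    double sum = 0.0;

    /* threads split the iterations; 'a' lives in the shared pool */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("sum = %g (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}
```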

HPC Trends

Architecture:  Single core   Multicore          GPU             Cluster
Code:          Serial        OpenMP, Pthreads   CUDA, OpenACC   MPI

How Are Accelerators Different?

                 Intel Xeon E5-2670 (CPU)   Intel Xeon Phi 5110P (MIC)   Nvidia Tesla K20X (GPU)
Cores            8                          60                           14 SMX
Logical cores    16                         240                          2,688 CUDA cores
Frequency        2.60 GHz                   1.05 GHz                     0.74 GHz
GFLOPs (double)  333                        1,010                        1,317
Memory           64 GB                      8 GB                         6 GB
Memory B/W       51.2 GB/s                  320 GB/s                     250 GB/s
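
As a sanity check on the table, a theoretical peak is just cores x clock x FLOPs per cycle. The CPU column's 333 GFLOP/s is consistent with a dual-socket node of 8-core Sandy Bridge chips retiring 8 double-precision FLOPs per cycle with AVX (the dual-socket reading is my assumption, not stated on the slide):

$$ 2~\text{sockets} \times 8~\text{cores} \times 2.6~\text{GHz} \times 8~\tfrac{\text{FLOPs}}{\text{cycle}} \approx 333~\text{GFLOP/s} $$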

Multi-core Systems
- Current processors place multiple processor cores on a die
- Communication details are increasingly complex:
  - Cache access
  - Main memory access
  - QuickPath / HyperTransport socket connections
  - Node-to-node connection via network

Accelerator-based Systems
- Calculations made on both CPUs and graphics processing units (GPUs)
- No longer limited to single-precision calculations
- Load balancing is critical for performance
- Requires specific libraries and compilers (CUDA, OpenCL); see the directive-based sketch below
- Co-processor from Intel: MIC (Many Integrated Core)
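
One gentler route to the GPU is OpenACC, the directive-based model named on the HPC Trends slide. A minimal sketch, assuming an OpenACC-capable compiler (e.g. PGI's pgcc -acc); the vector length is a placeholder:

```c
/* Sketch of GPU offload with OpenACC directives: the compiler copies
 * x and y to GPU memory, runs the loop there, and copies y back. */
#include <stdio.h>

#define N (1 << 20)  /* vector length (placeholder) */
static float x[N], y[N];

int main(void) {
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    #pragma acc parallel loop copyin(x) copy(y)
    for (int i = 0; i < N; i++)
        y[i] = 2.0f * x[i] + y[i];  /* simple axpy on the accelerator */

    printf("y[0] = %f\n", y[0]);  /* expect 4.0 */
    return 0;
}
```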

Batch Submission Process
[Diagram: the user reaches the login node over the Internet via ssh, submits the job script with qsub, the job waits in the queue, then runs on compute nodes (C1, C2, C3) under a master node, which launches the MPI processes with mpirun -np or ibrun ./a.out]
- Queue: the job script waits for resources
- Master: the compute node that executes the job script and launches all MPI processes
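
A hypothetical job script for this workflow; the job name, resource request, and queue name below are placeholders (check the ARC documentation for the real values on each system). It is submitted from the login node with qsub job.sh:

```bash
#!/bin/bash
# Hypothetical PBS job script; adjust resources and queue per system.
#PBS -N myjob
#PBS -l nodes=2:ppn=16        # e.g. 2 BlueRidge-style nodes x 16 cores
#PBS -l walltime=1:00:00
#PBS -q normal_q              # assumed queue name

cd $PBS_O_WORKDIR             # run from the submission directory
mpirun -np 32 ./a.out         # master node launches all MPI processes
```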

ARC OVERVIEW

Advanced Research Computing (ARC)
- A unit within the Office of the Vice President of Information Technology
- Provides centralized resources for:
  - Research computing
  - Visualization
- Staff to assist users
- Website: http://www.arc.vt.edu/

Goals
- Advance the use of computing and visualization in VT research
- Centralize resource acquisition, maintenance, and support for the research community
- Provide support to facilitate usage of resources and minimize barriers to entry
- Enable and participate in research collaborations between departments

Personnel
- Associate VP for Research Computing: Terry Herdman
- Director, HPC: Vijay Agarwala
- Director, Visualization: Nicholas Polys
- Computational Scientists:
  - Justin Krometis
  - James McClure
  - Brian Marshall
  - Srinivas Yarlanki
  - Srijith Rajamohan

Personnel (Continued)
- System Administrators:
  - Tim Rhodes
  - Chris Snapp
  - Brandon Sawyers
- Vis & Virtual Reality Specialist: Wole Oyekoya
- Business Manager: Alana Romanella
- User Support GRAs: Umar Kalim and Di Zhang

Computational Resources

Name                         BlueRidge                HokieSpeed            HokieOne       Ithaca
Key features, uses           Large-scale CPU or MIC   GPU                   Shared memory  Beginners, MATLAB
Available                    March 2013               Sept 2012             Apr 2012       Fall 2009
Theoretical peak (TFlop/s)   398.7                    238.2                 5.4            6.1
Nodes                        408                      201                   N/A            79
Cores                        6,528                    2,412                 492            632
Cores/node                   16                       12                    N/A*           8
Accelerators/co-processors   260 Intel Xeon Phi,      408 Nvidia Tesla GPU  N/A            N/A
                             8 Nvidia K40 GPU
Memory size                  27.3 TB                  5.0 TB                2.62 TB        2 TB
Memory/core                  4 GB*                    2 GB                  5.3 GB         3 GB*
Memory/node                  64 GB*                   24 GB                 N/A*           24 GB*

Visualization Resources
- VisCube: 3-D immersion environment with three 10' x 10' walls and a floor of 1920 x 1920 stereo projection screens
- DeepSix: six tiled monitors with a combined resolution of 7680 x 3200
- ROVR stereo wall
- AISB stereo wall

Getting Started on ARC Systems
1. Review ARC's system specifications and choose the right system(s) for you
   a. Specialty software
2. Apply for an account online at the Advanced Research Computing website
3. When your account is ready, you will receive confirmation from ARC's system administrators

Resources
- ARC website: http://www.arc.vt.edu
- ARC compute resources & documentation: http://www.arc.vt.edu/resources/hpc/
- New Users Guide: http://www.arc.vt.edu/userinfo/newusers.php
- Frequently Asked Questions: http://www.arc.vt.edu/userinfo/faq.php
- Linux introduction: http://www.arc.vt.edu/resources/software/unix/

Thank you. Questions?