Stan Posey, NVIDIA, Santa Clara, CA, USA

Similar documents
NVIDIA Update and Directions on GPU Acceleration for Earth System Models

Programming NVIDIA GPUs with OpenACC Directives

Peter Messmer Developer Technology Group Stan Posey HPC Industry and Applications

GPU Developments for the NEMO Model. Stan Posey, HPC Program Manager, ESM Domain, NVIDIA (HQ), Santa Clara, CA, USA

RECENT UPDATES ON ACCELERATING COMPUTING PLATFORM PRADEEP GUPTA SENIOR SOLUTIONS ARCHITECT, NVIDIA

NVIDIA HPC Directions for Earth System Models. Stan Posey, HPC Program Manager, ESM Domain, NVIDIA (HQ), Santa Clara, CA, USA

Status and Directions of NVIDIA GPUs for Earth System Modeling

GPU Developments in Atmospheric Sciences

NVIDIA GPUs in Earth System Modelling Thomas Bradley

GPUs and the Future of Accelerated Computing Emerging Technology Conference 2014 University of Manchester

Running the FIM and NIM Weather Models on GPUs

Timothy Lanfear, NVIDIA HPC

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Accelerating High Performance Computing.

GPU COMPUTING AND THE FUTURE OF HPC. Timothy Lanfear, NVIDIA

Titan - Early Experience with the Titan System at Oak Ridge National Laboratory

Steve Scott, Tesla CTO SC 11 November 15, 2011

Transition to MOM6 (An Update Since the Breckenridge Workshop) and CICE Consortium. Gokhan Danabasoglu and Marika Holland

IBM CORAL HPC System Solution

Progress on GPU Parallelization of the NIM Prototype Numerical Weather Prediction Dynamical Core

Opportunities for HPC in Industry. Stan Posey HPC Applications and Industry Development NVIDIA, Santa Clara, CA, USA

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

PERFORMANCE PORTABILITY WITH OPENACC. Jeff Larkin, NVIDIA, November 2015

Parallelization of the NIM Dynamical Core for GPUs

Stan Posey, CAE Industry Development NVIDIA, Santa Clara, CA, USA

An Introduction to OpenACC

CLAW FORTRAN Compiler source-to-source translation for performance portability

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools

ENDURING DIFFERENTIATION. Timothy Lanfear

ENDURING DIFFERENTIATION Timothy Lanfear

Adapting Numerical Weather Prediction codes to heterogeneous architectures: porting the COSMO model to GPUs

Future Directions for CUDA Presented by Robert Strzodka

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.

GPU Computing with NVIDIA s new Kepler Architecture

Scaling in a Heterogeneous Environment with GPUs: GPU Architecture, Concepts, and Strategies

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design

Developments in Computing Technology: GPUs

CESM Projects Using ESMF and NUOPC Conventions

CPU GPU. Regional Models. Global Models. Bigger Systems More Expensive Facili:es Bigger Power Bills Lower System Reliability

Porting COSMO to Hybrid Architectures

InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014

HPC with the NVIDIA Accelerated Computing Toolkit Mark Harris, November 16, 2015

GPUS FOR NGVLA. M Clark, April 2015

ADVANCES IN EXTREME-SCALE APPLICATIONS ON GPU. Peng Wang HPC Developer Technology

John Levesque Nov 16, 2001

Directive-based Programming for Highly-scalable Nodes

The challenges of new, efficient computer architectures, and how they can be met with a scalable software development strategy.! Thomas C.

Preparing GPU-Accelerated Applications for the Summit Supercomputer

Performance Analysis of the PSyKAl Approach for a NEMO-based Benchmark

Porting and Tuning WRF Physics Packages on Intel Xeon and Xeon Phi and NVIDIA GPU

Shifter: Fast and consistent HPC workflows using containers

ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017

GPUs and Emerging Architectures

Application of KGen and KGen-kernel

HPC future trends from a science perspective

Debugging CUDA Applications with Allinea DDT. Ian Lumb Sr. Systems Engineer, Allinea Software Inc.

GPU- Aware Design, Implementation, and Evaluation of Non- blocking Collective Benchmarks

Barcelona Supercomputing Center

Quantum ESPRESSO on GPU accelerated systems

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

GPU Computing Ecosystem

Optimizing Weather Model Radiative Transfer Physics for the Many Integrated Core and GPGPU Architectures

Mathematical computations with GPUs

ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation

HPC with Multicore and GPUs

GPU Architecture. Alan Gray EPCC The University of Edinburgh

NVIDIA GPU TECHNOLOGY UPDATE

Arm in HPC. Toshinori Kujiraoka Sales Manager, APAC HPC Tools Arm Arm Limited

High Performance Computing with Accelerators

TESLA ACCELERATED COMPUTING. Mike Wang Solutions Architect NVIDIA Australia & NZ

WHAT S NEW IN CUDA 8. Siddharth Sharma, Oct 2016

Trends in HPC (hardware complexity and software challenges)

Deutscher Wetterdienst

Resources Current and Future Systems. Timothy H. Kaiser, Ph.D.

The Exascale Era Has Arrived

NOVEL GPU FEATURES: PERFORMANCE AND PRODUCTIVITY. Peter Messmer

NCAR s Data-Centric Supercomputing Environment Yellowstone. November 28, 2011 David L. Hart, CISL

Present and Future Leadership Computers at OLCF

Interactive Supercomputing for State-of-the-art Biomolecular Simulation

Experiences with CUDA & OpenACC from porting ACME to GPUs

n N c CIni.o ewsrg.au

THE LEADER IN VISUAL COMPUTING

X10 specific Optimization of CPU GPU Data transfer with Pinned Memory Management

GPU Computing. Axel Koehler Sr. Solution Architect HPC

19. prosince 2018 CIIRC Praha. Milan Král, IBM Radek Špimr

Portable and Productive Performance on Hybrid Systems with libsci_acc Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.

Revolutionizing Data-Centric Transformation

TESLA P100 PERFORMANCE GUIDE. Deep Learning and HPC Applications

RECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016

The Effect of In-Network Computing-Capable Interconnects on the Scalability of CAE Simulations

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software

The Arm Technology Ecosystem: Current Products and Future Outlook

IBM Power Systems HPC Cluster

Mapping MPI+X Applications to Multi-GPU Architectures

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

Accelerating Financial Applications on the GPU

INTRODUCTION TO OPENACC. Analyzing and Parallelizing with OpenACC, Feb 22, 2017

Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010

Transcription:

Stan Posey, sposey@nvidia.com NVIDIA, Santa Clara, CA, USA

NVIDIA Strategy for CWO Modeling (Since 2010) Initial focus: CUDA applied to climate models and NWP research Opportunities to refactor code with CUDA for performance speedups Current: Directives applied to research and operational models User community requests Fortran-only; portability; single source, etc. Direct investments in domain-specific applications engineering Current collaborations in 6 models and growing Continued development in broad base of libraries and OpenACC CUBLAS, CUSPARSE, collaborations with PGI, CAPS, Cray; PGI acquisition Collaborations with OEMs (development, benchmarks, etc.) Cray, IBM, HP, SGI, etc. 2

Important OEM Collaborations in CWO Modeling Collaboration on large deployments; OpenACC development TITAN ORNL/NOAA; Gaea NOAA/ORNL Blue Waters NCSA; Piz Daint CSCS Collaboration on strategic deployments; Member of Openpower Yellowstone (Geyser, Caldera) NCAR; Discover NASA GSFC Openpower Consortium: IBM, NVIDIA, Google, Mellanox, Tyan, others Strategic large deployments including Tsubame Tokyo Inst of Tech Strategic deployments including Pleadies NASA ARC 3

NVIDIA Application Engineering Investments Model CWO Focus GPU Approach Collaboration WRF NWP/Climate CUDA C, OpenACC NCAR, SSEC UW-M, NVIDIA COSMO NWP/Climate CUDA C, OpenACC CSCS, SCS, MeteoSwiss, NVIDIA CAM-SE Climate CUDA Ftn, OpenACC ORNL, NVIDIA NIM/FIM NWP/Climate F2C-ACC, OpenACC NOAA, OACC Vendors, NVIDIA GEOS-5 Climate CUDA Ftn, OpenACC NASA, NVIDIA NEMO Ocean GCM OpenACC NVIDIA, STFC (near future) Other Evaluations: GFS, COAMPS, MPAS-A, ROMS; ICON, UKMO GungHo; GRAPES (CN), OLAM (BR) Other Investments: Government and Research Institutes without direct NVIDIA collaboration 4

Example:NOPP/ONR-Funded Accelerated HPC http://www.onr.navy.mil/~/media/files/funding-announcements/baa/2013/13-011.ashx ONR BAA13-011: Advancing Air-Ocean-Land-Ice Global Coupled Prediction on Emerging Computational Architectures Predictive simulations on heterogeneous architectures Central Processing Unit (CPU), MIC, GPU: identification of representative code patterns that either look particularly amenable or intractable to refactoring; establishment of pathways to maintain single source code compatible with multiple platforms; and determination of mechanisms to achieve optimal performance and portability. Total of $3.75M funding distributed among 4 8 awards (closed Apr 2013) Models: GFS, HIRAM, NIM, MOM, Wavewatch3, CESM, MPAS, CICE, HYCOM, NUMA NOAA NCAR LANL NRL 5

Rapid OpenACC Growth Since 2011 Founding 26+ Application Developments HPC Industry Support Grows 2x Cloverleaf COSMO (Physics) HBM NIM S3D WRF Cloverleaf MiniMD COSMO (Physics) NICAM DNS NIM EMGS ELAN NEMO (GYRE) GAMESS PALM-GPU GENE Quantum Espresso GEOS RAMSES GTC ROMS Harmonie S3D HBM Seismic CPML ICON SPECFEM-3D LULESH UPACS MiniGhost WRF 2012 2013 2012 2013 6

DP GFLOPS per Watt NVIDIA GPU Roadmap (Details Require NDA) 32 16 8 4 Kepler Dynamic Parallelism Maxwell Unified Virtual Memory Volta Stacked DRAM 2 1 0.5 Tesla CUDA Fermi FP64 NVIDIA Roadmap Trends: * Increasing number of more flexible cores * Larger and faster memories (12 GB today) * Enhanced programming and standards * Tighter integration with OEM systems 2008 2010 2012 2014 7

NVIDIA Quadro K6000: Kepler-Based 12GB GPU Announced July 2013 Available Q3 2013 http://www.nvidia.com/object/quadro-desktop-gpus.html 8

ARM Support Now Available Since CUDA 5.5 ARM or x86 CPU Optimized for Few Serial Tasks GPU Accelerator Optimized for Many Parallel Tasks CUDA 5.5 Highlights Full Support for ARM Platforms Native compilation on ARM Optimized for MPI Faster Hyper-Q for all Linux distros MPI workload prioritization Guided Performance Analysis Step-by-step optimization Available Now http://developer.nvidia.com/cuda-toolkit 9