GPUs and the Future of Accelerated Computing Emerging Technology Conference 2014 University of Manchester

Similar documents
Accelerating High Performance Computing.

ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017

Scaling in a Heterogeneous Environment with GPUs: GPU Architecture, Concepts, and Strategies

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

The State of Accelerated Applications. Michael Feldman

The Effect of In-Network Computing-Capable Interconnects on the Scalability of CAE Simulations

GPU ACCELERATED COMPUTING. 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation

RECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016

Timothy Lanfear, NVIDIA HPC

GPU COMPUTING AND THE FUTURE OF HPC. Timothy Lanfear, NVIDIA

NVIDIA TECHNOLOGY Mark Ebersole

NVIDIA GPU TECHNOLOGY UPDATE

TESLA ACCELERATED COMPUTING. Mike Wang Solutions Architect NVIDIA Australia & NZ

TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING

TESLA P100 PERFORMANCE GUIDE. Deep Learning and HPC Applications

NOVEL GPU FEATURES: PERFORMANCE AND PRODUCTIVITY. Peter Messmer

GPU Computing with NVIDIA s new Kepler Architecture

ENDURING DIFFERENTIATION Timothy Lanfear

ENDURING DIFFERENTIATION. Timothy Lanfear

TESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications

Stan Posey, NVIDIA, Santa Clara, CA, USA

HPC with Multicore and GPUs

GPU Computing fuer rechenintensive Anwendungen. Axel Koehler NVIDIA

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

Technologies and application performance. Marc Mendez-Bermond HPC Solutions Expert - Dell Technologies September 2017

Pedraforca: a First ARM + GPU Cluster for HPC

TESLA V100 PERFORMANCE GUIDE May 2018

Arm in HPC. Toshinori Kujiraoka Sales Manager, APAC HPC Tools Arm Arm Limited

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING

World s most advanced data center accelerator for PCIe-based servers

Introduction to High Performance Computing. Shaohao Chen Research Computing Services (RCS) Boston University

GPU Computing Ecosystem

GPUS FOR NGVLA. M Clark, April 2015

EXTENDING THE REACH OF PARALLEL COMPUTING WITH CUDA

APPROACHES. TO GPU COMPUTING Libraries, OpenACC Directives, and Languages

Trends in systems and how to get efficient performance

THE LEADER IN VISUAL COMPUTING

The Fermi GPU and HPC Application Breakthroughs

TESLA V100 PERFORMANCE GUIDE. Life Sciences Applications

The Visual Computing Company

CALMIP : HIGH PERFORMANCE COMPUTING

Fast Hardware For AI

TR An Overview of NVIDIA Tegra K1 Architecture. Ang Li, Radu Serban, Dan Negrut

GPU Architecture. Alan Gray EPCC The University of Edinburgh

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

April 4-7, 2016 Silicon Valley INSIDE PASCAL. Mark Harris, October 27,

CME 213 S PRING Eric Darve

NVIDIA : FLOP WARS, ÉPISODE III François Courteille Ecole Polytechnique 4-June-13

WHAT S NEW IN CUDA 8. Siddharth Sharma, Oct 2016

Supercomputer and grid infrastructure! in Poland!

S8901 Quadro for AI, VR and Simulation

NVIDIA Update and Directions on GPU Acceleration for Earth System Models

HPC with GPU and its applications from Inspur. Haibo Xie, Ph.D

Exascale: challenges and opportunities in a power constrained world

VSC Users Day 2018 Start to GPU Ehsan Moravveji

TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING

Tesla GPU Computing A Revolution in High Performance Computing

19. prosince 2018 CIIRC Praha. Milan Král, IBM Radek Špimr

Accelerating Insights In the Technical Computing Transformation

NVIDIA GPUs in Earth System Modelling Thomas Bradley

SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA GPUS

A NEW COMPUTING ERA. DAVID B. KIRK, FELLOW NVIDIA AI Conference Singapore 2017

The Mont-Blanc approach towards Exascale

HPC Solution. Technology for a New Era in Computing

Carlos Reaño, Javier Prades and Federico Silla Technical University of Valencia (Spain)

Autonomous Driving Solutions

Comparative Benchmarking of the First Generation of HPC-Optimised Arm Processors on Isambard

Finite Element Integration and Assembly on Modern Multi and Many-core Processors

Peter Messmer Developer Technology Group Stan Posey HPC Industry and Applications

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

Trends in the Infrastructure of Computing

NVIDIA Tesla P100. Whitepaper. The Most Advanced Datacenter Accelerator Ever Built. Featuring Pascal GP100, the World s Fastest GPU

Accelerated Platforms: The Future of Computing. Marc Hamilton, VP Solutions Architecture & Engineering, NVIDIA Korea AI Conference 2018

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

The Visual Computing Company

Programming NVIDIA GPUs with OpenACC Directives

Clustering Optimizations How to achieve optimal performance? Pak Lui

Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance

Deep Learning: Transforming Engineering and Science The MathWorks, Inc.

Interconnect Your Future

ADVANCES IN EXTREME-SCALE APPLICATIONS ON GPU. Peng Wang HPC Developer Technology

Fra superdatamaskiner til grafikkprosessorer og

IBM CORAL HPC System Solution

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design

Future Directions for CUDA Presented by Robert Strzodka

IBM Power User Group - Atlanta

Quantum Chemistry (QC) on GPUs. Feb. 2, 2017

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

ANSYS Improvements to Engineering Productivity with HPC and GPU-Accelerated Simulation

Cyclone SGI Cloud Computing for HPC. Christian Tanasescu Vice President Software Engineering

Les dernières générations chez NVIDIA :

Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC?

A NEW COMPUTING ERA. Shanker Trivedi Senior Vice President Enterprise Business at NVIDIA

NVIDIA S VISION FOR EXASCALE. Cyril Zeller, Director, Developer Technology

unleashed the future Intel Xeon Scalable Processors for High Performance Computing Alexey Belogortsev Field Application Engineer

Deep Learning mit PowerAI - Ein Überblick

Building the Most Efficient Machine Learning System

GPU Computing and Its Applications

Mathematical computations with GPUs

Transcription:

NVIDIA GPU Computing A Revolution in High Performance Computing GPUs and the Future of Accelerated Computing Emerging Technology Conference 2014 University of Manchester John Ashley Senior Solutions Architect jashley@nvidia.com 1

Disclaimers These are my views, not NVIDIA s (although I do hope you all will agree by the time I m done!), and where I ve quoted others, any misrepresentation of their views is mine and mine alone. There are forward looking statements in here. You are warned that my crystal ball is no better than anyone else s, even if I m speaking about things like NVIDIA roadmaps, etc. It s a lawyer thing. Trademarks of other firms are their own, etc. 2

Agenda Who is NVIDIA? Why should I care about accelerated computing? What sort of difference can CUDA make? Where is NVIDIA going? 3

Who is NVIDIA? 4

NVIDIA products span the power-performance spectrum 5

GPUs Power World s 10 Greenest Supercomputers Green500 Rank MFLOPS/W Site 1 4,503.17 GSIC Center, Tokyo Tech 2 3,631.86 Cambridge University 3 3,517.84 University of Tsukuba 4 3,185.91 Swiss National Supercomputing (CSCS) 5 3,130.95 ROMEO HPC Center 6 3,068.71 GSIC Center, Tokyo Tech 7 2,702.16 University of Arizona 8 2,629.10 Max-Planck 9 2,629.10 (Financial Institution) 10 2,358.69 CSIRO 37 1959.90 Intel Endeavor (top Xeon Phi cluster) 49 1247.57 Météo France (top CPU cluster) 6

GPUs are becoming 100M CUDA Capable GPUs 150K CUDA Downloads 77 Supercomputing Teraflops 60 University Courses 4,000 Academic Papers 2008 2014 7 7

GPUs are becoming pervasive 100M CUDA Capable GPUs 150K CUDA Downloads 77 Supercomputing Teraflops 60 University Courses 4,000 Academic Papers 430M CUDA- Capable GPUs 50,000 Academic Papers 2.2M CUDA Downloads 41,700 Supercomputing Teraflops 738 University Courses 2008 2014 8 8

Why should I care about accelerated computing? 9

Moore s Law isn t what it used to be. 10

TeraFLOPS The future is here today Performance gains come from parallelism Systems are power limited (efficiency IS performance) Systems are communication limited (locality IS performance) 5 4 3 2 GPU 1 0 CPU 2003 2005 2007 2009 2011 2013 11

High Performance Computing is Going Hybrid CPU* Fast single threads (serial work) PCIe Sandy Bridge 32nm 690 pj/flop GPU Extreme power-efficiency (throughput work) Kepler 28nm 134 pj/flop Do most work by many cores optimized for extreme energy efficiency Still need a few cores optimized for fast serial work Amdahl s law continues to apply to all architectures *x86, ARM, Power 12

What sort of difference can CUDA make? 13

CUDA Parallel Computing Platform www.nvidia.com/getcuda Programming Approaches Libraries Drop-in Acceleration Directives Easily Accelerate Apps Programming Languages Maximum Flexibility Development Environment Nsight IDE Linux, Mac and Windows GPU Debugging and Profiling CUDA-GDB debugger NVIDIA Visual Profiler Open Compiler Tool Chain Enables compiling new languages to CUDA platform, and CUDA languages to other architectures Hardware Capabilities SMX Dynamic Parallelism HyperQ GPUDirect 14 1

Artificial Neural Network at a GOOGLE BRAIN Fraction of the Cost with GPUs Now You Can Build Google s $1M Artificial Brain on the Cheap -Wired 1,000 CPU Servers 2,000 CPUs 16,000 cores 600 kwatts $5,000,000 STANFORD AI LAB 3 GPU-Accelerated Servers 12 GPUs 18,432 cores 4 kwatts $33,000 15

VISIONWORKS COMPUTER VISION ON CUDA Your Code Driver Assistance Computational Photography Sample Pipelines Object Detection / Tracking Structure from Motion VisionWorks Primitives Classifier Corner Detection Augmented Reality Robotics CUDA Jetson TK1 16

Computer Vision on CUDA Feature Detection / Tracking ~30 GFLOPS @ 30 Hz 3D Scene Interpretation ~280 GFLOPS @ 30 Hz 17

Audi Self Driving Car Before & After CUDA 18

Solid Growth of GPU Accelerated Apps Top HPC Applications # of GPU-Accelerated Apps 300 250 200 150 113 182 272 Molecular Dynamics Quantum Chemistry Material Science Weather & Climate AMBER CHARMM DESMOND Abinit Gaussian CP2K QMCPACK COSMO GEOS-5 HOMME GROMACS LAMMPS NAMD GAMESS NWChem Quantum Espresso VASP CAM-SE NEMO NIM WRF Lattice QCD Chroma MILC 100 Plasma Physics GTC GTS 50 Structural Mechanics ANSYS Mechanical LS-DYNA Implicit MSC Nastran OptiStruct Abaqus/Standard 0 2011 2012 2013 Fluid Dynamics ANSYS Fluent Culises (OpenFOAM) 19 Accelerated, In Development

272 GPU-Accelerated Applications www.nvidia.com/appscatalog 20

Top Applications Now with Built-in GPU Support Digital Content Creation Molecular Dynamics Non-GPU Apps Adobe CS Other GPU Apps Autodesk 3dsMax Avid Media Composer Apple Final Cut Sony Vegas Pro Computer-Aided Engineering Application Market Share by Segment Non-GPU Apps AMBER DL_POLY NAMD LAMMPS GROMACS CHARMM Quantum Chemistry Non-GPU Apps ANSYS Altair Radioss MSC Nastran Simulia Abaqus Gaussian Non-GPU Apps GAMESS Quantum Espresso NWChem CP2K 272 GPU-Accelerated Applications www.nvidia.com/appscatalog 21

Performance on Leading Scientific Applications Structural Mechanics ANSYS ANSYS 14 SMP-V14sp-4 Physics CHROMA Chroma Molecular Dynamics AMBER AMBER - SPFP-Cellulose_production_NPT Material Science QMCPACK QMCPACK 4x4x1 Earth Science SPECFEM3D SPECFEM3D E5-2687W @ 3.10GHz Tesla K20X Tesla K40 0.0x 2.0x 4.0x 6.0x 8.0x 10.0x 12.0x 22

Where is NVIDIA going? 23

GFLOPS per Watt 32 16 Fast Paced CUDA GPU Roadmap Maxwell Pascal Volta 8 Kepler Higher Perf/Watt Unified Memory Stacked DRAM NVLINK 4 2 Fermi Dynamic Parallelism 1 0.5 Tesla CUDA FP64 2008 2010 2012 2014 2016 2018 24

Introducing NVLINK and Stacked Memory NVLINK GPU high speed interconnect 5-12x PCIe Gen 3 Bandwidth Drastically reduced energy/bit Stacked Memory 2-4x Capacity & Bandwidth 3-4x More Energy Efficient per bit Leaves more power for compute HBM HBM HBM HBM GP100 passive silicon interposer Package Substrate HBM HBM HBM HBM 25

IBM Partners with NVIDIA to Build Next- Generation Supercomputers + Tesla GPU POWER 8 CPU GPU-Accelerated POWER-Based Systems Available in 2014 26 2

Unify GPU and Tegra Architecture Tesla Fermi Kepler Tegra K1 Maxwell GPU ARCHITECTURE Tegra 3 Tegra 4 MOBILE ARCHITECTURE 27

Jetson TK1 Development Kit 32 Bit ARM+Kepler SMX SOC CUDA capable $192 from US retailers, cost and availability will vary elsewhere https://developer.nvidia.com/jetson-tk1 http://devblogs.nvidia.com/parallelforall/jetson-tk1-mobileembedded-supercomputer-cuda-everywhere/ 28 2

Summary Who is NVIDIA? A: The world s leading Visual Computing company, from consumer devices through to world class supercomputers Why should I care about accelerated computing? A: Because parallelism and heterogeneous computing is the future of big compute and big data What sort of difference can CUDA make? A: Order of magnitude improvements in performance and efficiency are possible with CUDA Where is NVIDIA going? A: Relentlessly forward to a CUDA enabled parallel & energy efficient computing future 29

Questions? 30