ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017

COMPUTING AFTER MOORE'S LAW
[Chart: 40 Years of CPU Trend Data. x-axis: 1980 to 2020; y-axis: log scale, 10^3 to 10^7. Trend labels: Single-threaded perf, 1.5X per year; 1.1X per year; GPU-Accelerated Computing.]
Domain-Specialized Accelerator
Pioneered CUDA Accelerated Computing
Extending Performance Post-Moore's Law
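The two yearly rates quoted on this slide compound very differently. Here is a minimal Python sketch of that arithmetic, assuming each rate is held constant over a ten-year window. The window length and the helper name `compound_growth` are illustrative choices, not taken from the slide.

```python
def compound_growth(rate_per_year: float, years: int) -> float:
    """Return the cumulative improvement factor after `years` at a fixed yearly rate."""
    return rate_per_year ** years


# Rates as quoted on the slide; the 10-year horizon is an assumption for illustration.
YEARS = 10
fast = compound_growth(1.5, YEARS)
slow = compound_growth(1.1, YEARS)

print(f"1.5X per year over {YEARS} years: about {fast:.1f}X")
print(f"1.1X per year over {YEARS} years: about {slow:.1f}X")
```

Under these assumptions the gap between the two rates is more than an order of magnitude per decade, which is the motivation the slide gives for accelerated computing.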

VOLTA TAKING OFF
21B transistors; 5,120 CUDA cores; 15.7 FP32 TFLOPS; 7.8 FP64 TFLOPS; 125 Tensor TFLOPS
Volta in Production: Every Cloud, Every Computer Maker
NVIDIA ACCELERATED COMPUTING

ANNOUNCING: NVIDIA TESLA V100 IN MICROSOFT AZURE
NCv3 Series with NVIDIA Tesla V100
Sign up today at www.aka.ms/ncv3

ANNOUNCING: JAPAN'S AIST ADOPTS NVIDIA VOLTA FOR ABCI SUPERCOMPUTER
Most Powerful AI Supercomputer in Japan
4,352 Tesla V100 GPUs
37 PetaFLOPS FP64 HPC Performance
0.55 ExaFLOPS AI Performance

109% Y-Y DATACENTER GROWTH FYQ3
[Bar chart: datacenter revenue by fiscal year. x-axis: FY10 to FY17, plus YTD 18; y-axis: $0 to $1.5B. Series: Full Fiscal Year, 1st Three Quarters.]

NVIDIA TESLA DATACENTER PLATFORM
Callouts: $12B Market; 80% of Apps by 2020; 20M Inference Servers; $25B Market; 600M Amazon Packages; $3T IT Industry
Segments: HPC; CSP Training; CSP Inference; Public Cloud; Industries; Enterprise
Software: CUDA; ComputeWorks; NVIDIA AI SDK (cuDNN, NCCL, TensorRT, DIGITS); Every Framework; NGC; GRID vPC; Quadro vWS

ARCHITECTING MODERN DATACENTERS
DEEP LEARNING COMES TO HPC
OPTIMIZED HPC SOFTWARE

ARCHITECTING MODERN DATACENTERS
[Diagram: Amdahl's Law. Axes: % Workload (% Sequential), Parallel Speed-Up. Labels: Strong Scaling, Weak Scaling, Deep Learning, Apps Accelerated.]
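The slide names Amdahl's Law without stating it. As a reminder, here is a minimal Python sketch of the standard formula, speed-up = 1 / ((1 - p) + p / s), where p is the fraction of the workload that can be accelerated and s is the speed-up of that fraction. The function name and the example values (p = 0.9, s = 10 and s = 1000) are illustrative assumptions, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction: float, parallel_speedup: float) -> float:
    """Overall speed-up when only `parallel_fraction` of the work runs `parallel_speedup` times faster."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_speedup <= 0.0:
        raise ValueError("parallel_speedup must be positive")
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / parallel_speedup)


# Hypothetical workload: 90% of the run time can be accelerated.
for s in (10.0, 1000.0):
    print(f"p = 0.9, s = {s:g}: overall speed-up {amdahl_speedup(0.9, s):.2f}X")
```

With 10% of the work left sequential, the overall speed-up stays below 10X however fast the accelerated part becomes. That is why the sequential share of the workload appears on the slide's axis.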

ARCHITECTING MODERN DATACENTERS
Strong Core CPU
Volta: 5,120 CUDA Cores
NVLink for Strong Scaling
125 TFLOPS Tensor Core

ARCHITECTING MODERN DATACENTERS
[Chart: AMBER Simulation of CRISPR on the Comet Supercomputer. x-axis: # of CPUs, 0 to 100; y-axis: ns/day, 0 to 80. Callout: 48 CPU Nodes.]

ARCHITECTING MODERN DATACENTERS
The Power of Accelerated Computing
[Chart: AMBER Simulation of CRISPR on the Comet Supercomputer. x-axis: # of CPUs, 0 to 100; y-axis: ns/day, 0 to 80. Callouts: 48 CPU Nodes; 1 Node with 4x V100 GPUs.]

ARCHITECTING MODERN DATACENTERS
[Diagram: Amdahl's Law. Axes: % Workload (% Sequential), Parallel Speed-Up. Labels: Apps Accelerated, Reach, % Accelerated Apps.]

70% OF THE WORLD'S SUPERCOMPUTING WORKLOAD ACCELERATED
Top 15 HPC Applications: VASP, AMBER, NAMD, GROMACS, Gaussian, Simulia Abaqus, WRF, OpenFOAM, ANSYS, LS-DYNA, BLAST, LAMMPS, ANSYS Fluent, Quantum Espresso, GAMESS
500+ Accelerated Applications
Source: Intersect360 Research, Nov 2017, HPC Application Support for GPU Computing

4X BETTER HPC SYSTEM TCO
Mixed Workload: Materials Science (VASP), Life Sciences (AMBER), Physics (MILC), Deep Learning (ResNet-50)
160 Self-hosted Servers, 96 kW

4X BETTER HPC SYSTEM TCO
Mixed Workload: Materials Science (VASP), Life Sciences (AMBER), Physics (MILC), Deep Learning (ResNet-50)
12 Accelerated Servers w/4 V100 GPUs, 20 kW
1/3 the Cost, 1/4 the Space, 1/5 the Power
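The two TCO slides give server counts and power draw for both configurations, so the power claim can be checked directly. The following Python sketch does only that arithmetic. The dictionary layout and variable names are illustrative, and the cost and space ratios are not recomputed because the slides give no underlying figures for them.

```python
# Figures as quoted on the two TCO slides.
cpu_only = {"servers": 160, "kilowatts": 96}
accelerated = {"servers": 12, "kilowatts": 20}

# How many times more servers and power the CPU-only configuration needs.
server_ratio = cpu_only["servers"] / accelerated["servers"]
power_ratio = cpu_only["kilowatts"] / accelerated["kilowatts"]

# Average power per server in each configuration, in kilowatts.
kw_per_cpu_server = cpu_only["kilowatts"] / cpu_only["servers"]
kw_per_accel_server = accelerated["kilowatts"] / accelerated["servers"]

print(f"Server count ratio: {server_ratio:.1f}X")
print(f"Power ratio: {power_ratio:.1f}X")
print(f"kW per server: {kw_per_cpu_server:.2f} (CPU-only) vs {kw_per_accel_server:.2f} (accelerated)")
```

The 4.8X power ratio is consistent with the slide's rounded "1/5 the Power". Each accelerated server draws more power than a CPU-only server, but far fewer of them are needed.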

NVIDIA POWERS WORLD'S FASTEST DEEP LEARNING PERFORMANCE
[Image of ResNet-50 network]

Time to Train   Organization         Date       Hardware
4.4 Hours       Preferred Networks   Feb '17    128 TitanX (Maxwell)
60 Mins         Facebook             June '17   256 Tesla P100
48 Mins         IBM                  Aug '17    256 Tesla P100
15 Mins         Preferred Networks   Nov '17    1024 Tesla P100

Network: ResNet-50. Dataset: ImageNet. Trained for 90 Epochs.

DEEP LEARNING COMES TO HPC
[Diagram: workflow loop. Stages: Simulation (FP64/FP32), Training (FP32/FP16), Regression Testing (FP16/INT8), Inference (FP16/INT8). Data flows: New Data, Training Set, Regression Set, Errors.]

DEEP LEARNING COMES TO HPC
UNSW, Physics: 14X Faster Bose-Einstein Condensate Creation
U. Florida & UNC, Drug Discovery: 300,000X Molecular Energetics Prediction
SLAC, Astrophysics: Gravitational Lensing, From Weeks to 10 ms
Princeton & ITER, Particle Physics: 90% Accuracy for Fusion Sustainment
U.S. DoE, Clean Energy: 33% More Accurate Neutrino Detection
U. Pitt, Drug Discovery: 35% Higher Accuracy for Protein Scoring

NVIDIA DRIVE AI CAR PLATFORM
[Diagram: workflow loop. Stages: 3D-Sim, Training, Re-Sim, Inference. Data flows: New Data, Training Set, Regression Set, Errors.]

NVIDIA SATURNV WITH VOLTA
40 PetaFLOPS Peak FP64 Performance
660 PetaFLOPS DL FP16 Performance
660 NVIDIA DGX-1 Server Nodes
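The system-level figures for SATURNV and ABCI are close to what you get by multiplying the GPU count by the per-GPU peak numbers on the Volta slide. The Python sketch below shows that arithmetic. It assumes eight V100 GPUs per DGX-1 node, a figure not stated on the slide, and it counts GPUs only. The talk does not say how its totals were derived, so this is a plausibility check rather than the actual method.

```python
# Per-GPU peak figures from the Volta slide (TFLOPS).
V100_FP64_TFLOPS = 7.8
V100_TENSOR_TFLOPS = 125.0

# Assumption not stated on the slide: each DGX-1 node holds 8 V100 GPUs.
GPUS_PER_DGX1 = 8


def aggregate_petaflops(gpu_count: int, tflops_per_gpu: float) -> float:
    """Sum of per-GPU peak throughput, converted from TFLOPS to PetaFLOPS."""
    return gpu_count * tflops_per_gpu / 1000.0


saturnv_gpus = 660 * GPUS_PER_DGX1
saturnv_tensor_pf = aggregate_petaflops(saturnv_gpus, V100_TENSOR_TFLOPS)
saturnv_fp64_pf = aggregate_petaflops(saturnv_gpus, V100_FP64_TFLOPS)

abci_gpus = 4352
abci_tensor_pf = aggregate_petaflops(abci_gpus, V100_TENSOR_TFLOPS)
abci_fp64_pf = aggregate_petaflops(abci_gpus, V100_FP64_TFLOPS)

print(f"SATURNV: {saturnv_gpus} GPUs, {saturnv_tensor_pf:.0f} PF tensor, {saturnv_fp64_pf:.1f} PF FP64")
print(f"ABCI: {abci_gpus} GPUs, {abci_tensor_pf:.0f} PF tensor, {abci_fp64_pf:.1f} PF FP64")
```

Under these assumptions the tensor totals match the quoted 660 PetaFLOPS and 0.55 ExaFLOPS. The GPU-only FP64 totals come out near 41 and 34 PetaFLOPS, against the quoted 40 and 37, and the slides do not explain the difference.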

NVIDIA GPU CLOUD
CLOUD CONTAINER REGISTRY
Containerized in NVDocker
Optimization Across the Full Stack
Always Up-to-Date
Fully Tested and Maintained by NVIDIA

ANNOUNCING: NVIDIA GPU CLOUD FOR HPC
CLOUD CONTAINER REGISTRY FOR ACCELERATED HPC APPS
Containerized in NVDocker
Optimized for GPU-accelerated Systems
Up-to-Date Containers
Available NOW. Sign up at nvidia.com/gpu-cloud

HPC APPS COMING TO NGC
Apps: GAMESS, GROMACS, CP2K, PYFR, LAMMPS, NAMD, QUANTUM ESPRESSO, QMCPACK, RELION, SPECFEM3D CARTESIAN, SPECFEM3D GLOBE
Domains: BIOINFORMATICS, CFD, MATERIAL SCIENCE, PHYSICS, SEISMIC
Availability: NOW, NEXT

VISUALIZATION IS VITAL TO SCIENCE

ANNOUNCING: NVIDIA GPU CLOUD FOR HPC VISUALIZATION
UNIFIED VISUALIZATION FOR LARGE DATA SETS
Large-scale Volumetric Rendering
Physically Accurate Ray Tracing
Production-quality Images
Seamless Integration with ParaView
ParaView with NVIDIA IndeX; ParaView with NVIDIA OptiX; ParaView with NVIDIA Holodeck
Early Access NOW. Sign up at nvidia.com/gpu-cloud

NVIDIA GPU CLOUD: OPTIMIZED HPC SOFTWARE
DEEP LEARNING; HPC APPS; HPC VIS

TO EXASCALE, AND BEYOND
Volta in Every Cloud, Every Computer Maker
Accelerated Computing's Time Has Come
Deep Learning Comes to HPC
500+ Accelerated Apps
NVIDIA GPU Cloud for HPC
NVIDIA GPU Cloud for HPC Vis
NVIDIA ACCELERATED COMPUTING