A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017

Similar documents
A NEW COMPUTING ERA. DAVID B. KIRK, FELLOW NVIDIA AI Conference Singapore 2017

A NEW COMPUTING ERA. Shanker Trivedi Senior Vice President Enterprise Business at NVIDIA

POWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017

NVIDIA PLATFORM FOR AI

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017

EFFICIENT INFERENCE WITH TENSORRT. Han Vanholder

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA

GTC Jensen Huang Founder & CEO

NEW NVIDIA PLATFORM FOR AI

ENDURING DIFFERENTIATION. Timothy Lanfear

ENDURING DIFFERENTIATION Timothy Lanfear

NVIDIA FOR DEEP LEARNING. Bill Veenhuis

GPU FOR DEEP LEARNING. 周国峰 Wuhan University 2017/10/13

Autonomous Driving Solutions

GTC was the introduction to the future of AI, a protector, a healer, a helper, a guardian, a visionary, and just a little slice of amazing.

S8822 OPTIMIZING NMT WITH TENSORRT Micah Villmow Senior TensorRT Software Engineer

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME

Fast Hardware For AI

ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017

NVIDIA DEEP LEARNING INSTITUTE

NVIDIA AI BRAIN OF SELF DRIVING AND HD MAPPING. September 13, 2016

NVIDIA DEEP LEARNING PLATFORM

Deep Learning: Transforming Engineering and Science The MathWorks, Inc.

MACHINE LEARNING WITH NVIDIA AND IBM POWER AI

MIOVISION DEEP LEARNING TRAFFIC ANALYTICS SYSTEM FOR REAL-WORLD DEPLOYMENT. Kurtis McBride CEO, Miovision

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME

Xilinx ML Suite Overview

An introduction to Machine Learning silicon

CME 213 S PRING Eric Darve

INVESTOR UPDATE. September 2018

TOWARDS ACCELERATED DEEP LEARNING IN HPC AND HYPERSCALE ARCHITECTURES Environnement logiciel pour l apprentissage profond dans un contexte HPC

Implementing Deep Learning for Video Analytics on Tegra X1.

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018

Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / ADD-IDAR

Deep learning in MATLAB From Concept to CUDA Code

Speculations about Computer Architecture in Next Three Years. Jan. 20, 2018

Object recognition and computer vision using MATLAB and NVIDIA Deep Learning SDK

Small is the New Big: Data Analytics on the Edge

WELCOME. Shawn Simmons, Investor Relations May 10, 2017

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG

THE NVIDIA DEEP LEARNING ACCELERATOR

Making Sense of Artificial Intelligence: A Practical Guide

SUPERCHARGE DEEP LEARNING WITH DGX-1. Markus Weber SC16 - November 2016

S CUDA on Xavier

Building the Most Efficient Machine Learning System

POINT CLOUD DEEP LEARNING

Wu Zhiwen.

GPU-Accelerated Deep Learning

Embedded GPGPU and Deep Learning for Industrial Market

MIXED PRECISION TRAINING: THEORY AND PRACTICE Paulius Micikevicius

Research Faculty Summit Systems Fueling future disruptions

In partnership with. VelocityAI REFERENCE ARCHITECTURE WHITE PAPER

HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads. Natalia Vassilieva, Sergey Serebryakov

Realtime Object Detection and Segmentation for HD Mapping

S INSIDE NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORK CONTAINERS

Introduction to Deep Learning in Signal Processing & Communications with MATLAB

Deploying Deep Learning Networks to Embedded GPUs and CPUs

MIXED PRECISION TRAINING OF NEURAL NETWORKS. Carl Case, Senior Architect, NVIDIA

GPU Coder: Automatic CUDA and TensorRT code generation from MATLAB

DGX UPDATE. Customer Presentation Deck May 8, 2017

Transforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell

DEEP LEARNING ALISON B LOWNDES. Deep Learning Solutions Architect & Community Manager EMEA

NVIDIA AI INFERENCE PLATFORM

SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA GPUS

Building the Most Efficient Machine Learning System

NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORKS

P I X E V I A : A I B A S E D, R E A L - T I M E C O M P U T E R V I S I O N S Y S T E M F O R D R O N E S

Accelerated Platforms: The Future of Computing. Marc Hamilton, VP Solutions Architecture & Engineering, NVIDIA Korea AI Conference 2018

World s most advanced data center accelerator for PCIe-based servers

Dr. Yassine Hariri CMC Microsystems

TESLA PLATFORM. Jan 2018

Comprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications

IBM Deep Learning Solutions

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

What s inside: What is deep learning Why is deep learning taking off now? Multiple applications How to implement a system.

Deep Learning Accelerators

Embarquez votre Intelligence Artificielle (IA) sur CPU, GPU et FPGA

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

Cisco UCS C480 ML M5 Rack Server Performance Characterization

TENSORRT. RN _v01 January Release Notes

Convolutional Neural Networks

S8901 Quadro for AI, VR and Simulation

High-Performance Data Loading and Augmentation for Deep Neural Network Training

Deep learning prevalence. first neuroscience department. Spiking Neuron Operant conditioning First 1 Billion transistor processor

DEEP NEURAL NETWORKS AND GPUS. Julie Bernauer

Tutorial on Keras CAP ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY

Unified Deep Learning with CPU, GPU, and FPGA Technologies

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks

Accelerating Implementation of Low Power Artificial Intelligence at the Edge

MoonRiver: Deep Neural Network in C++

Deep Learning on Arm Cortex-M Microcontrollers. Rod Crawford Director Software Technologies, Arm

Major Information Session ECE: Computer Engineering

IoT with 5G Technology

Deep Learning mit PowerAI - Ein Überblick

Data center: The center of possibility

Deep Learning Inferencing on IBM Cloud with NVIDIA TensorRT

Hello Edge: Keyword Spotting on Microcontrollers

Voice, Image, Video : AI in action with AWS. 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

AUTONOMOUS DRONE NAVIGATION WITH DEEP LEARNING

TESLA V100 PERFORMANCE GUIDE. Life Sciences Applications

Transcription:

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017

TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 6 10 5 1.1X per year 10 4 10 3 10 2 1.5X per year Single-threaded perf 1980 1990 2000 2010 2020 40 Years of Microprocessor Trend Data Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for 2010-2015 by K. Rupp

TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 6 10 5 1.1X per year 10 4 10 3 10 2 1.5X per year Single-threaded perf 1980 1990 2000 2010 2020 40 Years of Microprocessor Trend Data The Big Bang of Deep Learning Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for 2010-2015 by K. Rupp

RISE OF NVIDIA GPU COMPUTING 10 7 10 6 10 5 GPU-Computing perf 1.5X per year 1.1X per year 1000X by 2025 10 4 10 3 10 2 1.5X per year Single-threaded perf 1980 1990 2000 2010 2020 40 Years of Microprocessor Trend Data The Big Bang of Deep Learning Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for 2010-2015 by K. Rupp

RISE OF NVIDIA GPU COMPUTING 22,000 615,000 1.8M 2012 2017 Global GTC Attendees 10X in 5 Years 2012 2017 2012 GPU Developers 14X in 5 Years 2017 CUDA Downloads 5X in 5 Years, +800K in 1 Year

NVIDIA HOLODECK Photorealistic Models Physically Simulated Interaction Virtual Team Collaboration GPU-accelerated AI

AI ON THE RISE 6000 2017 4500 $6.6B 3,000 3000 2016 1500 2002 2012 2017 AI Startup Funding 12X in 5 Years 2014 2017 Deep Learning Papers Published 13X in 3 Years -50-25 0 25 NIPS Registration

NVIDIA Interactive Ray Tracing AMAZING ADVANCES IN AI

AMAZING ADVANCES IN AI NVIDIA Interactive Ray Tracing NVIDIA / Remedy Audio-driven Facial Animation

AMAZING ADVANCES IN AI NVIDIA Interactive Ray Tracing NVIDIA / Remedy Audio-driven Facial Animation WRNCH Pose Estimation

AMAZING ADVANCES IN AI NVIDIA Interactive Ray Tracing NVIDIA / Remedy Audio-driven Facial Animation WRNCH Pose Estimation University of Edinburgh Character Animation

AMAZING ADVANCES IN AI NVIDIA Interactive Ray Tracing NVIDIA / Remedy Audio-driven Facial Animation WRNCH Pose Estimation University of Edinburgh Character Animation UC Berkeley / OpenAI One-shot Imitation Learning

NVIDIA IS THE WORLD S LEADING AI PLATFORM ONE ARCHITECTURE CUDA

ANNOUNCING CHINA TOP CSPs ADOPT NVIDIA AI COMPUTING 2B USERS

ANNOUNCING CHINA ENTERPRISE IT LEADERS ADOPT NVIDIA AI COMPUTING $200B CHINA IT MARKET

NVIDIA AI FOR DEVELOPERS EVERYWHERE Every Framework NVIDIA Inception: 1,900 DL Startups Every Cloud and Data Center Other IT Services Financial Services Drones Manufacturing Smart City Automotive Healthcare NVIDIA AI PLATFORM

AI INFERENCE IS THE NEXT GREAT CHALLENGE Training DNN Model Inferencing

EXPLOSION OF NETWORK DESIGN Convolution Networks Recurrent Networks Generative Adversarial Networks Reinforcement Learning PRelu ReLu BatchNorm LSTM GRU Highway 3D-GAN Rank GAN Conditional GAN DQN Dueling DQN Concat Dropout Pooling Projection Embedding BiDirectional Coupled GAN Speech Enhancement GAN Latent space GAN A3C

EXPLOSION OF NETWORK DESIGN Convolution Networks Recurrent Networks Generative Adversarial Networks Reinforcement Learning PRelu ReLu BatchNorm LSTM GRU Highway 3D-GAN Rank GAN Conditional GAN DQN Dueling DQN Concat Dropout Pooling Projection Embedding BiDirectional Coupled GAN Speech Enhancement GAN Latent space GAN A3C

EXPLOSION OF NETWORK COMPLEXITY 350X Inception-v4 30X DeepSpeech 3 10X MoE GNMT AlexNet GoogLeNet ResNet-50 Inception-v2 DeepSpeech DeepSpeech 2 OpenNMT 2011 2012 2013 2014 2015 2016 2017 2013 2014 2015 2016 2017 2018 2014 2015 2016 2017 2018 Image Network Complexity GOPS * Bandwidth Speech Network Complexity GOPS * Bandwidth Translation Network Complexity GOPS * Bandwidth

EXPLOSION OF INTELLIGENT MACHINES 20M Inference Servers 100s of Millions of Autonomous Machines Trillions of IoT Devices

ANNOUNCING TESLA P4 NVIDIA TENSORRT 3 PROGRAMMABLE INFERENCE ACCELERATOR TensorRT JETSON TX2 Compile and Optimize Neural Networks Support for Every Framework Optimize for Each Target Platform DRIVE PX 2 NVIDIA DLA TESLA V100

ANNOUNCING NVIDIA TENSORRT 3 PROGRAMMABLE INFERENCE ACCELERATOR next input concat relu relu relu relu batch nm batch nm batch nm batch nm 1x1 conv 3x3 conv 5x5 conv 1x1 conv next input copy 3x3 CR 5x5 CR 1x1 CR Weight & Activation Precision Calibration Layer & Tensor Fusion Kernel Auto-Tuning relu batch nm 1x1 conv relu batch nm 1x1 conv max pool 1x1 CR input max pool Multi-Stream Execution input

6,000 5,700 600 550 ANNOUNCING 4,500 450 NVIDIA TENSORRT 3 PROGRAMMABLE INFERENCE ACCELERATOR 3,000 300 40x Speed-up on ResNet-50 140x Speed-up on OpenNMT 1,500 0-140 CPU + TensorFlow 300 V100 + TensorFlow V100 + TensorRT 150 0 4 CPU + Torch 25 V100 + Torch V100 + TensorRT Images/Sec (ResNet-50) Sentences/Sec (OpenNMT)

6,000 600 ANNOUNCING 4,500 450 3,000 14ms 300 280ms NVIDIA TENSORRT 3 PROGRAMMABLE INFERENCE ACCELERATOR 40x Speed-up on ResNet-50 140x Speed-up on OpenNMT 1,500 7ms 7ms 150 153ms 117ms 0- CPU + TensorFlow V100 + TensorFlow V100 + TensorRT 0 CPU + Torch V100 + Torch V100 + TensorRT Images/Sec (ResNet-50) Sentences/Sec (OpenNMT)

AI INFERENCE DRIVES DATACENTER TCO 160 CPU servers 45,000 images / second 65 KWatts

AI INFERENCE DRIVES DATACENTER TCO 1 NVIDIA HGX with 8 Tesla V100 GPUs 45,000 images / second 3 KWatts 4 Racks in a Box

INFERENCE ON IMAGES

INFERENCE ON SPEECH

ANNOUNCING CHINA CSPs ADOPT NVIDIA GPU-ACCELERATED INFERENCE PLATFORM

AI CITY SMARTER, SAFER CITIES 1B cameras WW by 2020 Finding lost people Improving traffic Enhancing law enforcement

NVIDIA TensorRT ANNOUNCING Hikvision Server with NVIDIA Tesla HIKVISION ADOPTS NVIDIA FOR AI CITY Training Inferencing Hikvision NVR with NVIDIA Jetson Hikvision Camera with NVIDIA Jetson

NVIDIA AI CITY SMARTER, SAFER CITIES IN CHINA 10X Larger Dataset for Face Recognition 100s of Virtual Security Guards 15% Reduction in Peak Traffic +90% Accuracy for Congestion Alerts

ADAPTING TO NEW USE CASES 20 Camera NVIDIA DIGITS NVIDIA TensorRT Improved Indoor Camera Improved Pre-Trained Network Optimized Network NVIDIA Jetson 360 Camera Pre-Trained Network NVIDIA DGX Station Improved Optimized Network

THE AUTONOMOUS VEHICLE REVOLUTION

NVIDIA DRIVE AV COMPUTING PLATFORM Perception Localization Planning DRIVE AV Level 3 to Level 5 ASIL-D Functional Safety DRIVEWORKS SDK DRIVE OS DRIVE PX

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

145 AV STARTUPS ON NVIDIA DRIVE

THE ERA OF AUTONOMOUS MACHINES

XAVIER WORLD S FIRST AUTONOMOUS MACHINE PROCESSOR Supercomputing Performance Extreme Energy Efficiency Rich High-Speed Sensor IOs 30 TOPS of CV, Deep Learning, and Parallel Computing

ANNOUNCING JD X SELECTS NVIDIA FOR AUTONOMOUS MACHINES

CREATING AUTONOMOUS MACHINES Project Isaac SIMULATION TRAINING AUTONOMOUS

A NEW COMPUTING ERA NVIDIA TensorRT NVIDIA The World s AI Computing Platform NVIDIA TensorRT 1st Programmable Inference Accelerator NVIDIA TensorRT AI Cities Platform NVIDIA DRIVE Open AV Platform for the Transportation Industry Xavier 1 st Autonomous Machine Processor