NVIDIA PLATFORM FOR AI

NVIDIA PLATFORM FOR AI. João Paulo Navarro, Solutions Architect

I AM AI. HTTPS://WWW.YOUTUBE.COM/WATCH?V=GIZ7KYRWZGQ

NVIDIA: GPU Computing. Gaming, VR, AI & HPC, Self-Driving Cars.

GPU COMPUTING AT THE HEART OF AI. Performance Beyond Moore's Law. Big Bang of Modern AI.
[Figure: 40 Years of CPU Trend Data, log-scale performance from 1980 to 2020. Single-threaded perf: 1.5X per year, later 1.1X per year. GPU-computing perf: 1.5X per year, 1000X by 2025.]
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
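The growth rates quoted on this slide can be checked with a little compound-growth arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the slide does not state the baseline year for the 1000X figure, so the code only asks how many years of 1.5X-per-year growth are needed to reach 1000X and what 1.1X per year delivers over the same span.

```python
import math


def years_to_reach(target_factor: float, annual_rate: float) -> float:
    """Years of compounding at annual_rate needed to reach target_factor."""
    return math.log(target_factor) / math.log(annual_rate)


def compound(annual_rate: float, years: float) -> float:
    """Total growth factor after compounding annual_rate for the given years."""
    return annual_rate ** years


# Rates taken from the slide: 1.5X per year (GPU computing), 1.1X per year
# (recent single-threaded CPU performance), and a 1000X target.
gpu_years = years_to_reach(1000.0, 1.5)
cpu_gain = compound(1.1, gpu_years)

print(f"Years at 1.5X/year to reach 1000X: {gpu_years:.1f}")
print(f"Gain at 1.1X/year over the same span: {cpu_gain:.1f}X")
```

Under these assumptions, roughly 17 years of 1.5X annual growth yields 1000X, while 1.1X per year yields only about 5X over the same period.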

AlexNet

CAMBRIAN EXPLOSION. Convolutional Networks, Recurrent Networks, Generative Adversarial Networks, Reinforcement Learning.

Convolutional Networks, Recurrent Networks, Generative Adversarial Networks, Reinforcement Learning, New Species.
There is a Cambrian explosion of neural networks. Since AlexNet, thousands of new models have emerged. With hundreds of layers and billions of parameters, their complexity has soared by 500X in just 5 years. The hyperscale datacenters that host them serve billions of people, cost billions to operate, and are among the most complex computers the world has ever made. Maintaining great quality of service while minimizing cost is incredibly difficult.
Jensen helps us remember with PLASTER: Programmability, Latency, Accuracy, Size, Throughput, Energy Efficiency, Rate of Learning.
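The "500X in just 5 years" claim can be restated as an average annual growth factor. This is a back-of-the-envelope sketch, assuming steady geometric growth over the period (the slide does not say how complexity was measured); `annual_factor` is a hypothetical helper name.

```python
def annual_factor(total_growth: float, years: float) -> float:
    """Average per-year multiplier that compounds to total_growth over years."""
    return total_growth ** (1.0 / years)


# Figures from the slide: model complexity up 500X in 5 years.
model_growth_per_year = annual_factor(500.0, 5.0)

print(f"Implied complexity growth: {model_growth_per_year:.2f}X per year")
```

On that assumption, model complexity has been growing at roughly 3.5X per year, well above the 1.5X-per-year hardware trend cited earlier in the deck.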

REVOLUTIONARY AI PERFORMANCE. Volta is the Most Advanced Data Center GPU Ever Built. Performance of up to 100 CPUs. 21 billion transistors. 5120 CUDA cores. New Tensor Core architecture inspired by the demands of deep learning.

MAXIMIZING PERFORMANCE ON VOLTA. Greater Than 10x Performance, K80 vs. V100.
[Figure: GPU Generational Training Scaling, K80 vs. V100 Tensor Core.]
ResNet-152 training, 8x K80 (16 GPUs total) compared with 8x V100 NVLink GPUs using NVIDIA 17.10 containers.

DEEP LEARNING

AI AND DEEP LEARNING

NVIDIA AI PLATFORM. Announcing NEW 32GB (2X): Tesla V100, DGX-1 and DGX Station. Every Cloud, Every Computer Maker, NVIDIA GPU Cloud, NVIDIA AI Inference, TITAN V.

DEEP LEARNING SOFTWARE. developer.nvidia.com/deep-learning

WHAT IS THE BEST DEEP LEARNING FRAMEWORK?

DL FRAMEWORKS: How to choose? Jeff Dean and François Chollet of Google have shared relevant statistics on DL framework adoption.

DL FRAMEWORKS: How to choose? https://developer.nvidia.com/deep-learning-frameworks

INFERENCE

AI INFERENCING AT THE SPEED OF LIGHT. HTTPS://WWW.YOUTUBE.COM/WATCH?V=-4UG6QFHPUM

THE BRAIN OF AI CARS. NVIDIA DRIVE is a scalable AI platform for the entire range of autonomous driving. 320+ companies have adopted DRIVE, for data centers and in vehicles, including automakers and suppliers, mapping and sensor companies, startups, and research organizations.

NVIDIA DRIVE AUTOMOTIVE PERCEPTION. HTTPS://WWW.YOUTUBE.COM/WATCH?V=D1JDS-KXXJA

NVIDIA TENSORRT: PROGRAMMABLE INFERENCE ACCELERATOR. [Diagram: Frameworks, TensorRT, Platforms. Platforms shown: TESLA P4, JETSON TX2, DRIVE PX 2, NVIDIA DLA, TESLA V100.]

TENSORRT. HTTPS://WWW.YOUTUBE.COM/WATCH?V=HTWOJXC_MQI

NVIDIA TENSORRT: 10X BETTER DATA CENTER TCO. 160 CPU servers: 45,000 images/second, 65 kW.

NVIDIA TENSORRT: 10X BETTER DATA CENTER TCO. 1 NVIDIA HGX with 8 Tesla V100 GPUs: 45,000 images/second, 3 kW. 1/6 the cost, 1/20 the power, 4 racks in a box.
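The power claim on these two slides can be reproduced from the quoted figures. The sketch below uses only the throughput and power numbers stated above; the `Deployment` class and its field names are hypothetical and exist just to organize the arithmetic. Cost is not modeled because the slides give only the 1/6 ratio, not absolute prices.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Throughput and power draw of one inference deployment (slide figures)."""
    name: str
    images_per_second: float
    power_kw: float

    @property
    def images_per_second_per_kw(self) -> float:
        # Energy efficiency: sustained throughput per kilowatt drawn.
        return self.images_per_second / self.power_kw


cpu = Deployment("160 CPU servers", 45_000, 65.0)
gpu = Deployment("1 NVIDIA HGX with 8 Tesla V100", 45_000, 3.0)

# Both deployments deliver the same throughput, so the power ratio
# equals the efficiency gain.
power_ratio = cpu.power_kw / gpu.power_kw
efficiency_gain = gpu.images_per_second_per_kw / cpu.images_per_second_per_kw

for d in (cpu, gpu):
    print(f"{d.name}: {d.images_per_second_per_kw:,.0f} images/s per kW")
print(f"Power ratio (CPU / GPU): {power_ratio:.1f}X")
```

The ratio works out to a little under 22X, which the slide rounds to "1/20 the power".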

TENSORRT: NVIDIA AI INFERENCE. 30M HYPERSCALE SERVERS.
Workloads: ASR (RNN++), SPEECH SYNTH (DGN, S2S), RECOMMENDER (MLP-NCF), NLP (RNN), IMAGE / VIDEO (CNN).
Releases: TensorRT (Sept 2016): CNNs. TensorRT 2 (Apr 2017): INT8. TensorRT 3 (Sept 2017): Tensor Core. TensorRT 4 (Apr 2018): TensorFlow Integration, Kaldi Optimization, ONNX, WinML.
Speed-ups: IMAGE / VIDEO, ResNet-50 with TensorFlow Integration: 190X. NLP, GNMT: 50X. RECOMMENDER, Neural Collaborative Filtering: 45X. SPEECH SYNTH, WaveNet: 36X. ASR, DeepSpeech 2 DNN: 60X.
All speed-ups are chip-to-chip, CPU to GV100.

BIG DATA & ANALYTICS

DATA DELUGE TO DATA HUNGRY AI.
[Figure: INCREASING DATA VARIETY as volume grows from GIGABYTES through TERABYTES, PETABYTES, and EXABYTES to ZETTABYTES. Stages: BUSINESS PROCESS, WEB, DIGITAL.]
Data sources shown: Sensors, Infotainment Systems, Streaming Video, User Generated Content, Web Logs, Offer Details, Purchase Detail, Social Network, A/B Testing, Segmentation, Support Contacts, User Click Stream, Offer History, Purchase Record, Payment Record, Mobile Web, Dynamic Pricing, Search Marketing, Sentiment, Behavioral Targeting, Dynamic Funnels, IoT Data, Business Data Feeds, Natural Language Processing, HD Video, Speech To Text, Product/Service Logs, SMS/MMS, Wearable Devices, Cyber Security Logs, Connected Vehicles, Machine Data.

WORKAROUNDS ARE NOT THE ANSWERS. Sampling misses the whole picture: EXPLORE THE OUTLIERS AND LONG-TAIL EVENTS. Pre-aggregation struggles at scale: RELY ON ACCURATE DATA. Scale-out on CPU infrastructure has tremendous hidden costs: SCALE WITH AN ROI.
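The point that sampling misses outliers and long-tail events can be made concrete with a small probability calculation. This is an illustrative sketch only: the population size, number of rare events, and sample size are made-up values, and `prob_sample_misses_all` is a hypothetical helper. It computes the chance that a uniform random sample drawn without replacement contains none of the rare records.

```python
def prob_sample_misses_all(population: int, rare_events: int, sample_size: int) -> float:
    """Probability that a uniform sample (without replacement) of sample_size
    rows from population rows contains none of the rare_events rows.

    This is the hypergeometric probability of zero successes, written as a
    product so it stays numerically stable for large populations.
    """
    p = 1.0
    for i in range(rare_events):
        p *= (population - sample_size - i) / (population - i)
    return p


# Hypothetical workload: 1,000,000 rows, 5 rare events, 1% sample.
p_miss = prob_sample_misses_all(1_000_000, 5, 10_000)

print(f"Chance a 1% sample contains none of the 5 rare events: {p_miss:.1%}")
```

With these assumed numbers, a 1% sample misses every rare event about 95% of the time, which is why the slide argues for querying the full dataset instead.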

NVIDIA ACCELERATED ANALYTICS. GPUs in the Data Center: ANALYZE, VISUALIZE, AI-ACCELERATE.

GPU FOR ANALYTICS: SOLUTIONS + ARCHITECTURES.
[Diagram: TRADITIONAL DATA CENTER compared with GPU-ACCELERATED DATA CENTER. Layers shown: DEEP LEARNING; VISUALIZATION and ACCELERATED VISUALIZATION; DATABASES and ACCELERATED DATABASES; CORE TECHNOLOGIES (Spark, Scheduler, Mesos). Hardware: NVIDIA Tesla GPUs, NVIDIA DGX Products, Cloud.]

GPU-ACCELERATION HAS NO LIMITS.
[Figure: benchmark panels for MapD, Kinetica, SQream, and BlazeGraph. Recoverable labels: "Leading In-Memory DB > 50x Slower"; "NoSQL DBs > 100x Slower"; "Aggregate of queries - Time (s), less is better"; "Speed-up over baseline Spark CPU configuration, higher is faster"; "GPUs 700X-800X faster than graphs in all cases". Configurations: 700M edges, single node Xeon 2650 vs 2 K80; 1.98B edges, 16 EC2 r3.xlarge vs 16 K40s; 1.98B edges, 16 EC2 r3.4xlarge vs 16 K40s; 1.98B edges, Spark CPU baseline. Bar values: 1843, 1403, 700, 1.]

GPU-ACCELERATION HAS NO LIMITS. MapD

MAPD: GPU Accelerated Database

ML ACROSS INDUSTRIES. Finance, Healthcare, Telco.

GPU ACCELERATED ML AND BIG DATA. gpuopenanalytics.com

H2O4GPU PERFORMANCE. GLM: 5x. XGBoost: 10x. K-Means: 40x.

NVIDIA VOLTA IN EVERY CLOUD, EVERY DATACENTER

NVIDIA GPU CLOUD. Optimized Stacks for Every Cloud. 20,000+ Registered Organizations. 30 Containers. NOW on AWS, GCP, AliCloud, Oracle Cloud, DGX.

HOW TO START? Develop on GeForce, Deploy on Tesla. GeForce: start development using GeForce. Cloud: scale out on cloud. Data Center: deploy on data center.

developer.nvidia.com

INCEPTION PROGRAM. https://www.nvidia.com/en-us/deep-learning-ai/startups/
