POWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017

Size: px
Start display at page:

Download "POWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017"

Transcription

1 POWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017

2 LIFE AFTER MOORE S LAW Years of Microprocessor Trend Data Transistors (thousands) 1.1X per year X per year 10 2 Single-threaded perf Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for by K. Rupp

3 RISE OF GPU COMPUTING APPLICATIONS ALGORITHMS SYSTEMS 10 7 GPU-Computing perf 1.5X per year X per year 1000X by 2025 CUDA X per year ARCHITECTURE Single-threaded perf Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for by K. Rupp

4 RISE OF GPU COMPUTING 7, ,000 1M GTC Attendees 3X in 5 Years GPU Developers 11X in 5 Years CUDA Downloads in 2016

5 ANNOUNCING PROJECT HOLODECK Photorealistic models Interactive physics Collaboration

6 ANNOUNCING PROJECT HOLODECK Photorealistic models Interactive physics Collaboration Early access in September

7 ERA OF MACHINE LEARNING The Master Algorithm Pedro Domingos A Quest for Intelligence Fei-Fei Li

8 BIG BANG OF MODERN AI Google Photo Arterys FDA Approved Reinforcement Learning AlphaGo Super Resolution Deep Voice Transfer Learning ImageNet Stanford & NVIDIA Large-scale DNN on GPU NVIDIA BB8 Captioning BRETT Auto Encoders Style Transfer NMT IDSIA CNN on GPU U Toronto AlexNet on GPU Baidu DuLight LSTM Superhuman ASR GAN

9 DEEP LEARNING FOR RAY TRACING

10 IRAY WITH DEEP LEARNING

11 BIG BANG OF MODERN AI 13,000 20,000 $5B NIPS, ICML, CVPR, ICLR Attendees 2X in 2 Years Udacity Students in AI Programs 100X in 2 Years AI Startups Investments 9X in 4 Years

12 POWERING THE AI REVOLUTION FRAMEWORKS GPU SYSTEMS GPU AAS HGX-1 NVAIL INTERNET SERVICES HEALTHCARE DGX-1 NVIDIA DEEP LEARNING SDK NVIDIA RESEARCH INCEPTION ENTERPRISE TESLA

13 NVIDIA INCEPTION 1,300 DEEP LEARNING STARTUPS HEALTHCARE RETAIL, ETAIL FINANCIAL SECURITY, IVA PLATFORMs & APIs DATA MANAGEMENT IOT & MANUFACTURING AUTONOMOUS MACHINES CYBER AEC DEVELOPMENT PLATFORMS BUSINESS INTELLIGENCE & VISUALIZATION

14 SAP AI FOR THE ENTERPRISE First commercial AI offerings from SAP Brand Impact, Service Ticketing, Invoice-to-Record applications Powered by NVIDIA GPUs on DGX-1 and AWS

15 MODEL COMPLEXITY IS EXPLODING 105 ExaFLOPS 8.7 Billion Parameters 7 ExaFLOPS 60 Million Parameters 20 ExaFLOPS 300 Million Parameters 2015 Microsoft ResNet 2016 Baidu Deep Speech Google NMT

16 ANNOUNCING TESLA V100 GIANT LEAP FOR AI & HPC VOLTA WITH NEW TENSOR CORE 21B xtors TSMC 12nm FFN 815mm 2 5,120 CUDA cores 7.5 FP64 TFLOPS 15 FP32 TFLOPS NEW 120 Tensor TFLOPS 20MB SM RF 16MB Cache 16GB 900 GB/s 300 GB/s NVLink

17 NEW TENSOR CORE New CUDA TensorOp instructions & data formats 4x4 matrix processing array D[FP32] = A[FP16] * B[FP16] + C[FP32] Optimized for deep learning Activation Inputs Weights Inputs Output Results

18 ANNOUNCING TESLA V100 GIANT LEAP FOR AI & HPC VOLTA WITH NEW TENSOR CORE Compared to Pascal 1.5X General-purpose FLOPS for HPC 12X Tensor FLOPS for DL training 6X Tensor FLOPS for DL inferencing

19

20 GALAXY

21 DEEP LEARNING FOR STYLE TRANSFER

22 ANNOUNCING NEW FRAMEWORK RELEASES FOR VOLTA CNN Training (ResNet-50) Multi-Node Training with NCCL 2.0 (ResNet-50) LSTM Training (Neural Machine Translation) 8x K80 8x P100 K80 8x P100 8x V100 P100 8x V100 64x V100 V Hours Hours Hours

23 MATT WOOD General Manager Deep Learning and AI, AWS

24 ANNOUNCING NVIDIA DGX-1 WITH TESLA V100 ESSENTIAL INSTRUMENT OF AI RESEARCH 960 Tensor TFLOPS 8x Tesla V100 NVLink Hybrid Cube From 8 days on TITAN X to 8 hours 400 servers in a box

25 ANNOUNCING NVIDIA DGX-1 WITH TESLA V100 ESSENTIAL INSTRUMENT OF AI RESEARCH 960 Tensor TFLOPS 8x Tesla V100 NVLink Hybrid Cube From 8 days on TITAN X to 8 hours 400 servers in a box

26 ANNOUNCING NVIDIA DGX-1 WITH TESLA V100 ESSENTIAL INSTRUMENT OF AI RESEARCH 960 Tensor TFLOPS 8x Tesla V100 NVLink Hybrid Cube From 8 days on TITAN X to 8 hours 400 servers in a box $149,000 Order today: nvidia.com/dgx-1

27 ANNOUNCING NVIDIA DGX STATION PERSONAL DGX 480 Tensor TFLOPS 4x Tesla V100 16GB NVLink Fully Connected 3x DisplayPort 1500W Water Cooled

28 ANNOUNCING NVIDIA DGX STATION PERSONAL DGX 480 Tensor TFLOPS 4x Tesla V100 16GB NVLink Fully Connected 3x DisplayPort 1500W Water Cooled $69,000 Order today: nvidia.com/dgx-station

29 ANNOUNCING HGX-1 WITH TESLA V100 VERSATILE GPU CLOUD COMPUTING 8x Tesla V100 with NVLINK Hybrid Cube 2C:8G 2C:4G 1C:2G Configurable NVIDIA Deep Learning, GRID graphics, CUDA HPC stacks

30 JASON ZANDER Corporate Vice President Microsoft Azure, Microsoft

31 ANNOUNCING TENSORRT next layer Relu Relu Relu Relu FOR TENSORFLOW COMPILER FOR DEEP LEARNING INFERENCING bias 1x1 conv bias 3x3 conv bias 5x5 conv bias 1x1 conv Graph optimizations for vertical and horizontal layer fusion GPU-specific optimizations Import models from Caffe and TensorFlow Relu bias 1x1 conv Relu bias 1x1 conv max pool previous layer

32 ANNOUNCING TENSORRT FOR TENSORFLOW COMPILER FOR DEEP LEARNING INFERENCING next layer Relu Relu Relu Relu bias bias bias bias 1x1 conv 3x3 conv 5x5 conv 1x1 conv next layer 1x1 CBR 3x3 CBR 5x5 CBR 1x1 CBR Graph optimizations for vertical and horizontal layer fusion GPU-specific optimizations Import models from Caffe and TensorFlow Relu bias 1x1 conv Relu bias 1x1 conv max pool 1x1 CBR 1x1 CBR max pool previous layer previous layer

33 ANNOUNCING TENSORRT FOR TENSORFLOW COMPILER FOR DEEP LEARNING INFERENCING concat 1x1 CBR 3x3 CBR 5x5 CBR 1x1 CBR next layer 3x3 CBR 5x5 CBR 1x1 CBR Graph optimizations for vertical and horizontal layer fusion GPU-specific optimizations 1x1 CBR 1x1 CBR max pool 1x1 CBR max pool Import models from Caffe and TensorFlow previous layer previous layer

34 images/sec ANNOUNCING TENSORRT FOR TENSORFLOW COMPILER FOR DEEP LEARNING INFERENCING next layer 3x3 CBR 5x5 CBR 1x1 CBR Inference Throughput and Latency (ResNet-50) 20ms 19ms 300 Graph optimizations for vertical and horizontal layer fusion GPU-specific optimizations 1x1 CBR max pool ms Import models from Caffe and TensorFlow previous layer 0 Broadwell K80 Skylake (projected) P100

35 images/sec ANNOUNCING TENSORRT FOR TENSORFLOW COMPILER FOR DEEP LEARNING INFERENCING Graph optimizations for vertical and horizontal layer fusion GPU-specific optimizations next layer 3x3 CBR 5x5 CBR 1x1 CBR Import models from Caffe and TensorFlow 0 1x1 CBR previous layer max pool Inference Throughput and Latency (ResNet-50) 20ms 19ms 10ms Broadwell K80 Skylake (projected) P100 7ms V100

36 ANNOUNCING TESLA V100 FOR HYPERSCALE INFERENCE 15-25X INFERENCE SPEED-UP VS SKYLAKE 150W FHHL PCIE

37 THE CASE FOR GPU ACCELERATED DATACENTERS 300K inf/s inf/s > 1K CPUs 1K CPUs > = $1.5M = 250KW Tesla V100 Reduce 15X 500 Nodes CPU Servers 33 Nodes GPU Accelerated Server

38 NVIDIA DEEP LEARNING STACK DEEP LEARNING FRAMEWORKS DEEP LEARNING LIBRARIES NVIDIA cudnn, NCCL, cublas, TensorRT CUDA DRIVER OPERATING SYSTEM GPU SYSTEM

39 ANNOUNCING NVIDIA GPU CLOUD GPU-ACCELERATED CLOUD PLATFORM OPTIMIZED FOR DEEP LEARNING Containerized in NVDocker Optimization across the full stack Always up-to-date Fully tested and maintained by NVIDIA NVIDIA GPU CLOUD Registry of Containers, Datasets, and Pre-trained models CSPs

40 ANNOUNCING NVIDIA GPU CLOUD GPU-ACCELERATED CLOUD PLATFORM OPTIMIZED FOR DEEP LEARNING Containerized in NVDocker Optimization across the full stack Always up-to-date Fully tested and maintained by NVIDIA Beta in July NVIDIA GPU CLOUD Registry of Containers, Datasets, and Pre-trained models CSPs

41 AMBER Performance (ns/day) Googlenet Performance (i/s) AMBER 16 CUDA cudnn 7 CUDA 9 NCCL 2 POWER OF GPU COMPUTING 16 AMBER 14 CUDA 5 AMBER 14 CUDA cudnn 6 CUDA 8 NCCL AMBER 12 CUDA cudnn 2 CUDA 6 cudnn 4 CUDA K20 K40 K80 P100 8x K80 8x Maxwell DGX-1 DGX-1V

42 POWERING THE AI REVOLUTION AI at the Edge NVAIL INTERNET SERVICES HEALTHCARE AUTOMOTIVE DRIVE PX NVIDIA DEEP LEARNING SDK NVIDIA RESEARCH INCEPTION ENTERPRISE PLACES ROBOTS JETSON TX

43 AI REVOLUTIONIZING TRANSPORTATION 280B Miles per Year 800M Parking Spots for 250M Cars in U.S. Domino s: 1M Pizzas Delivered per Day

44 NVIDIA DRIVE AI CAR PLATFORM 100 TOPS DRIVE PX Xavier Level 4/5 Localization Perception AI Path Planning 10 TOPS DRIVE PX 2 Parker Level 2/3 Computer Vision Libraries CUDA, cudnn, TensorRT 1 TOPS OS

45 NVIDIA DRIVE Mapping-to-Driving Co-Pilot Guardian Angel

46 ANNOUNCING TOYOTA SELECTS NVIDIA DRIVE PX FOR AUTONOMOUS VEHICLES

47 AI PROCESSOR FOR AUTONOMOUS MACHINES CPU FPGA General Purpose Architectures CUDA GPU Volta Pascal Domain Specific Accelerators DLA XAVIER 30 TOPS DL 30W Custom ARM64 CPU 512 Core Volta GPU 10 TOPS DL Accelerator Energy Efficiency

48 AI PROCESSOR FOR AUTONOMOUS MACHINES CPU General Purpose Architectures + CUDA GPU Volta Domain Specific Accelerators DLA XAVIER 30 TOPS DL 30W Custom ARM64 CPU 512 Core Volta GPU 10 TOPS DL Accelerator Energy Efficiency

49 Command Interface ANNOUNCING Tensor Execution Micro-controller XAVIER DLA NOW OPEN SOURCE Early Access July General Release September Input DMA (Activations and Weights) Unified 512KB Input Buffer Activations and Weights Sparse Weight Decompression Native Winograd Input Transform MAC Array 2048 Int8 or 1024 Int16 or 1024 FP16 Output Accumulators Output Postprocessor (Activation Function, Pooling etc.) Output DMA Memory Interface

50 ROBOTS SEE, HEAR, TOUCH LEARN & PLAN ACT

51 ROBOTS Credit: Yevgen Chebotar, Karol Hausman, Marvin Zhang, Sergey Levine

52 ANNOUNCING ISAAC ROBOT SIMULATOR Robot & Environment Definition ISAAC Robot Simulator OpenAI GYM NVIDIA GPU Computer Virtual Jetson

53 ANNOUNCING ISAAC ROBOT SIMULATOR SEE, HEAR, TOUCH Robot & Environment Definition ISAAC Robot Simulator OpenAI GYM LEARN & PLAN ACT NVIDIA GPU Computer Virtual Jetson

54 NVIDIA POWERING THE AI REVOLUTION NVIDIA GPU CLOUD CSPs NVIDIA GPU in Every Cloud NVIDIA GPU Cloud Tensor Core TensorRT Xavier DLA Open Source Tesla V100 DGX-1 and DGX Station ISAAC

55

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017 A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017 TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 6 10 5 1.1X per year 10 4 10 3 10 2 1.5X per year Single-threaded

More information

A NEW COMPUTING ERA. Shanker Trivedi Senior Vice President Enterprise Business at NVIDIA

A NEW COMPUTING ERA. Shanker Trivedi Senior Vice President Enterprise Business at NVIDIA A NEW COMPUTING ERA Shanker Trivedi Senior Vice President Enterprise Business at NVIDIA THE ERA OF AI AI CLOUD MOBILE PC 2 TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 5 1.1X

More information

A NEW COMPUTING ERA. DAVID B. KIRK, FELLOW NVIDIA AI Conference Singapore 2017

A NEW COMPUTING ERA. DAVID B. KIRK, FELLOW NVIDIA AI Conference Singapore 2017 A NEW COMPUTING ERA DAVID B. KIRK, FELLOW NVIDIA AI Conference Singapore 2017 TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 5 1.1X per year 10 3 1.5X per year Single-threaded

More information

NVIDIA PLATFORM FOR AI

NVIDIA PLATFORM FOR AI NVIDIA PLATFORM FOR AI João Paulo Navarro, Solutions Architect - Linkedin i am ai HTTPS://WWW.YOUTUBE.COM/WATCH?V=GIZ7KYRWZGQ 2 NVIDIA Gaming VR AI & HPC Self-Driving Cars GPU Computing 3 GPU COMPUTING

More information

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017 DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE Dennis Lui August 2017 THE RISE OF GPU COMPUTING APPLICATIONS 10 7 10 6 GPU-Computing perf 1.5X per year 1000X by 2025 ALGORITHMS 10 5 1.1X

More information

EFFICIENT INFERENCE WITH TENSORRT. Han Vanholder

EFFICIENT INFERENCE WITH TENSORRT. Han Vanholder EFFICIENT INFERENCE WITH TENSORRT Han Vanholder AI INFERENCING IS EXPLODING 2 Trillion Messages Per Day On LinkedIn 500M Daily active users of iflytek 140 Billion Words Per Day Translated by Google 60

More information

ENDURING DIFFERENTIATION. Timothy Lanfear

ENDURING DIFFERENTIATION. Timothy Lanfear ENDURING DIFFERENTIATION Timothy Lanfear WHERE ARE WE? 2 LIFE AFTER DENNARD SCALING 10 7 40 Years of Microprocessor Trend Data 10 6 10 5 10 4 Transistors (thousands) 1.1X per year 10 3 10 2 Single-threaded

More information

ENDURING DIFFERENTIATION Timothy Lanfear

ENDURING DIFFERENTIATION Timothy Lanfear ENDURING DIFFERENTIATION Timothy Lanfear WHERE ARE WE? 2 LIFE AFTER DENNARD SCALING GPU-ACCELERATED PERFORMANCE 10 7 40 Years of Microprocessor Trend Data 10 6 10 5 10 4 10 3 10 2 Single-threaded perf

More information

NEW NVIDIA PLATFORM FOR AI

NEW NVIDIA PLATFORM FOR AI NEW NVIDIA PLATFORM FOR AI Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) LinkedIn Solution Architect Manager Enterprise Latin America Global Oil & Gas Team "GTC 2017: 'I AM AI' OPENING IN KEYNOTE"

More information

ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017

ACCELERATED COMPUTING: THE PATH FORWARD. Jensen Huang, Founder & CEO SC17 Nov. 13, 2017 ACCELERATED COMPUTING: THE PATH FORWARD Jensen Huang, Founder & CEO SC17 Nov. 13, 2017 COMPUTING AFTER MOORE S LAW Tech Walker 40 Years of CPU Trend Data 10 7 GPU-Accelerated Computing 10 5 1.1X per year

More information

GPU FOR DEEP LEARNING. 周国峰 Wuhan University 2017/10/13

GPU FOR DEEP LEARNING. 周国峰 Wuhan University 2017/10/13 GPU FOR DEEP LEARNING chandlerz@nvidia.com 周国峰 Wuhan University 2017/10/13 Why Deep Learning Boost Today? Nvidia SDK for Deep Learning? Agenda CUDA 8.0 cudnn TensorRT (GIE) NCCL DIGITS 2 Why Deep Learning

More information

MACHINE LEARNING WITH NVIDIA AND IBM POWER AI

MACHINE LEARNING WITH NVIDIA AND IBM POWER AI MACHINE LEARNING WITH NVIDIA AND IBM POWER AI July 2017 Joerg Krall Sr. Business Ddevelopment Manager MFG EMEA jkrall@nvidia.com A NEW ERA OF COMPUTING AI & IOT Deep Learning, GPU 100s of billions of devices

More information

SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA GPUS

SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA GPUS SYNERGIE VON HPC UND DEEP LEARNING MIT NVIDIA S Axel Koehler, Principal Solution Architect HPCN%Workshop%Goettingen,%14.%Mai%2018 NVIDIA - AI COMPUTING COMPANY Computer Graphics Computing Artificial Intelligence

More information

Autonomous Driving Solutions

Autonomous Driving Solutions Autonomous Driving Solutions Oct, 2017 DrivePX2 & DriveWorks Marcus Oh (moh@nvidia.com) Sr. Solution Architect, NVIDIA This work is licensed under a Creative Commons Attribution-Share Alike 4.0 (CC BY-SA

More information

GTC Jensen Huang Founder & CEO

GTC Jensen Huang Founder & CEO GTC 2018 Jensen Huang Founder & CEO 2 3 4 SCREEN-SPACE AMBIENT OCCLUSION BAKED LIGHTING 5 GLOBAL ILLUMINATION 6 SCREEN-SPACE REFLECTIONS ENVIRONMENT MAPS 7 RAY TRACED REFLECTIONS 8 SCREEN-SPACE REFRACTION

More information

NVIDIA FOR DEEP LEARNING. Bill Veenhuis

NVIDIA FOR DEEP LEARNING. Bill Veenhuis NVIDIA FOR DEEP LEARNING Bill Veenhuis bveenhuis@nvidia.com Nvidia is the world s leading ai platform ONE ARCHITECTURE CUDA 2 GPU: Perfect Companion for Accelerating Apps & A.I. CPU GPU 3 Intro to AI AGENDA

More information

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA Inference Optimization Using TensorRT with Use Cases Jack Han / 한재근 Solutions Architect NVIDIA Search Image NLP Maps TensorRT 4 Adoption Use Cases Speech Video AI Inference is exploding 1 Billion Videos

More information

Building the Most Efficient Machine Learning System

Building the Most Efficient Machine Learning System Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide

More information

Building the Most Efficient Machine Learning System

Building the Most Efficient Machine Learning System Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide

More information

Fast Hardware For AI

Fast Hardware For AI Fast Hardware For AI Karl Freund karl@moorinsightsstrategy.com Sr. Analyst, AI and HPC Moor Insights & Strategy Follow my blogs covering Machine Learning Hardware on Forbes: http://www.forbes.com/sites/moorinsights

More information

INVESTOR UPDATE. September 2018

INVESTOR UPDATE. September 2018 INVESTOR UPDATE September 2018 SAFE HARBOR Forward-Looking Statements Except for the historical information contained herein, certain matters in this presentation including, but not limited to, statements

More information

Deep Learning: Transforming Engineering and Science The MathWorks, Inc.

Deep Learning: Transforming Engineering and Science The MathWorks, Inc. Deep Learning: Transforming Engineering and Science 1 2015 The MathWorks, Inc. DEEP LEARNING: TRANSFORMING ENGINEERING AND SCIENCE A THE NEW RISE ERA OF OF GPU COMPUTING 3 NVIDIA A IS NEW THE WORLD S ERA

More information

DGX UPDATE. Customer Presentation Deck May 8, 2017

DGX UPDATE. Customer Presentation Deck May 8, 2017 DGX UPDATE Customer Presentation Deck May 8, 2017 NVIDIA DGX-1: The World s Fastest AI Supercomputer FASTEST PATH TO DEEP LEARNING EFFORTLESS PRODUCTIVITY REVOLUTIONARY AI PERFORMANCE Fully-integrated

More information

TOWARDS ACCELERATED DEEP LEARNING IN HPC AND HYPERSCALE ARCHITECTURES Environnement logiciel pour l apprentissage profond dans un contexte HPC

TOWARDS ACCELERATED DEEP LEARNING IN HPC AND HYPERSCALE ARCHITECTURES Environnement logiciel pour l apprentissage profond dans un contexte HPC TOWARDS ACCELERATED DEEP LEARNING IN HPC AND HYPERSCALE ARCHITECTURES Environnement logiciel pour l apprentissage profond dans un contexte HPC TERATECH Juin 2017 Gunter Roth, François Courteille DRAMATIC

More information

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015 ACCELERATED COMPUTING: THE PATH FORWARD Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015 COMMODITY DISRUPTS CUSTOM SOURCE: Top500 ACCELERATED COMPUTING: THE PATH FORWARD It s time to start

More information

GTC was the introduction to the future of AI, a protector, a healer, a helper, a guardian, a visionary, and just a little slice of amazing.

GTC was the introduction to the future of AI, a protector, a healer, a helper, a guardian, a visionary, and just a little slice of amazing. GTC 20I8 GTC was the introduction to the future of AI, a protector, a healer, a helper, a guardian, a visionary, and just a little slice of amazing. IT Business Edge Press quote. Publication Clearly the

More information

SUPERCHARGE DEEP LEARNING WITH DGX-1. Markus Weber SC16 - November 2016

SUPERCHARGE DEEP LEARNING WITH DGX-1. Markus Weber SC16 - November 2016 SUPERCHARGE DEEP LEARNING WITH DGX-1 Markus Weber SC16 - November 2016 NVIDIA Pioneered GPU Computing Founded 1993 $7B 9,500 Employees 100M NVIDIA GeForce Gamers The world s largest gaming platform Pioneering

More information

S8822 OPTIMIZING NMT WITH TENSORRT Micah Villmow Senior TensorRT Software Engineer

S8822 OPTIMIZING NMT WITH TENSORRT Micah Villmow Senior TensorRT Software Engineer S8822 OPTIMIZING NMT WITH TENSORRT Micah Villmow Senior TensorRT Software Engineer 2 100 倍以上速く 本当に可能ですか? 2 DOUGLAS ADAMS BABEL FISH Neural Machine Translation Unit 3 4 OVER 100X FASTER, IS IT REALLY POSSIBLE?

More information

NVIDIA TESLA V100 GPU ARCHITECTURE THE WORLD S MOST ADVANCED DATA CENTER GPU

NVIDIA TESLA V100 GPU ARCHITECTURE THE WORLD S MOST ADVANCED DATA CENTER GPU NVIDIA TESLA V100 GPU ARCHITECTURE THE WORLD S MOST ADVANCED DATA CENTER GPU WP-08608-001_v1.1 August 2017 WP-08608-001_v1.1 TABLE OF CONTENTS Introduction to the NVIDIA Tesla V100 GPU Architecture...

More information

Xilinx ML Suite Overview

Xilinx ML Suite Overview Xilinx ML Suite Overview Yao Fu System Architect Data Center Acceleration Xilinx Accelerated Computing Workloads Machine Learning Inference Image classification and object detection Video Streaming Frame

More information

VOLTA: PROGRAMMABILITY AND PERFORMANCE. Jack Choquette NVIDIA Hot Chips 2017

VOLTA: PROGRAMMABILITY AND PERFORMANCE. Jack Choquette NVIDIA Hot Chips 2017 VOLTA: PROGRAMMABILITY AND PERFORMANCE Jack Choquette NVIDIA Hot Chips 2017 1 TESLA V100 21B transistors 815 mm 2 80 SM 5120 CUDA Cores 640 Tensor Cores 16 GB HBM2 900 GB/s HBM2 300 GB/s NVLink *full GV100

More information

DEEP LEARNING ALISON B LOWNDES. Deep Learning Solutions Architect & Community Manager EMEA

DEEP LEARNING ALISON B LOWNDES. Deep Learning Solutions Architect & Community Manager EMEA DEEP LEARNING ALISON B LOWNDES Deep Learning Solutions Architect & Community Manager EMEA 1 THE GPU-ACCELERATED WORLD HPC DEEP LEARNING PC VIRTUALIZATION CLOUD GAMING RENDERING 2 3 Why is Deep Learning

More information

TESLA V100 PERFORMANCE GUIDE. Life Sciences Applications

TESLA V100 PERFORMANCE GUIDE. Life Sciences Applications TESLA V100 PERFORMANCE GUIDE Life Sciences Applications NOVEMBER 2017 TESLA V100 PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important

More information

Deep Learning Accelerators

Deep Learning Accelerators Deep Learning Accelerators Abhishek Srivastava (as29) Samarth Kulshreshtha (samarth5) University of Illinois, Urbana-Champaign Submitted as a requirement for CS 433 graduate student project Outline Introduction

More information

NVIDIA AI BRAIN OF SELF DRIVING AND HD MAPPING. September 13, 2016

NVIDIA AI BRAIN OF SELF DRIVING AND HD MAPPING. September 13, 2016 NVIDIA AI BRAIN OF SELF DRIVING AND HD MAPPING September 13, 2016 AI FOR AUTONOMOUS DRIVING MAPPING KALDI LOCALIZATION DRIVENET Training on DGX-1 NVIDIA DGX-1 NVIDIA DRIVE PX 2 Driving with DriveWorks

More information

WELCOME. Shawn Simmons, Investor Relations May 10, 2017

WELCOME. Shawn Simmons, Investor Relations May 10, 2017 WELCOME Shawn Simmons, Investor Relations May 10, 2017 Safe Harbor Forward-Looking Statements Except for the historical information contained therein, certain matters in these presentations including,

More information

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018 Nvidia Jetson TX2 and its Software Toolset João Fernandes 2017/2018 In this presentation Nvidia Jetson TX2: Hardware Nvidia Jetson TX2: Software Machine Learning: Neural Networks Convolutional Neural Networks

More information

NVIDIA DEEP LEARNING PLATFORM

NVIDIA DEEP LEARNING PLATFORM TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance and Efficiency for AI Services, From the Data Center to the Network s Edge Introduction Artificial intelligence (AI), the dream

More information

DGX SYSTEMS: DEEP LEARNING FROM DESK TO DATA CENTER. Markus Weber and Haiduong Vo

DGX SYSTEMS: DEEP LEARNING FROM DESK TO DATA CENTER. Markus Weber and Haiduong Vo DGX SYSTEMS: DEEP LEARNING FROM DESK TO DATA CENTER Markus Weber and Haiduong Vo NVIDIA DGX SYSTEMS Agenda NVIDIA DGX-1 NVIDIA DGX STATION 2 ONE YEAR LATER NVIDIA DGX-1 Barriers Toppled, the Unsolvable

More information

S8765 Performance Optimization for Deep- Learning on the Latest POWER Systems

S8765 Performance Optimization for Deep- Learning on the Latest POWER Systems S8765 Performance Optimization for Deep- Learning on the Latest POWER Systems Khoa Huynh Senior Technical Staff Member (STSM), IBM Jonathan Samn Software Engineer, IBM Evolving from compute systems to

More information

GPU-Accelerated Deep Learning

GPU-Accelerated Deep Learning GPU-Accelerated Deep Learning July 6 th, 2016. Greg Heinrich. Credits: Alison B. Lowndes, Julie Bernauer, Leo K. Tam. PRACTICAL DEEP LEARNING EXAMPLES Image Classification, Object Detection, Localization,

More information

RECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016

RECENT TRENDS IN GPU ARCHITECTURES. Perspectives of GPU computing in Science, 26 th Sept 2016 RECENT TRENDS IN GPU ARCHITECTURES Perspectives of GPU computing in Science, 26 th Sept 2016 NVIDIA THE AI COMPUTING COMPANY GPU Computing Computer Graphics Artificial Intelligence 2 NVIDIA POWERS WORLD

More information

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI Overview Unparalleled Value Product Portfolio Software Platform From Desk to Data Center to Cloud Summary AI researchers depend on computing performance to gain

More information

Small is the New Big: Data Analytics on the Edge

Small is the New Big: Data Analytics on the Edge Small is the New Big: Data Analytics on the Edge An overview of processors and algorithms for deep learning techniques on the edge Dr. Abhay Samant VP Engineering, Hiller Measurements Adjunct Faculty,

More information

Object recognition and computer vision using MATLAB and NVIDIA Deep Learning SDK

Object recognition and computer vision using MATLAB and NVIDIA Deep Learning SDK Object recognition and computer vision using MATLAB and NVIDIA Deep Learning SDK 17 May 2016, Melbourne 24 May 2016, Sydney Werner Scholz, CTO and Head of R&D, XENON Systems Mike Wang, Solutions Architect,

More information

THE NVIDIA DEEP LEARNING ACCELERATOR

THE NVIDIA DEEP LEARNING ACCELERATOR THE NVIDIA DEEP LEARNING ACCELERATOR INTRODUCTION NVDLA NVIDIA Deep Learning Accelerator Developed as part of Xavier NVIDIA s SOC for autonomous driving applications Optimized for Convolutional Neural

More information

CME 213 S PRING Eric Darve

CME 213 S PRING Eric Darve CME 213 S PRING 2017 Eric Darve Summary of previous lectures Pthreads: low-level multi-threaded programming OpenMP: simplified interface based on #pragma, adapted to scientific computing OpenMP for and

More information

Turing Architecture and CUDA 10 New Features. Minseok Lee, Developer Technology Engineer, NVIDIA

Turing Architecture and CUDA 10 New Features. Minseok Lee, Developer Technology Engineer, NVIDIA Turing Architecture and CUDA 10 New Features Minseok Lee, Developer Technology Engineer, NVIDIA Turing Architecture New SM Architecture Multi-Precision Tensor Core RT Core Turing MPS Inference Accelerated,

More information

INTRODUCING THE DGX FAMILY. Marc Domenech May 8, 2017

INTRODUCING THE DGX FAMILY. Marc Domenech May 8, 2017 INTRODUCING THE DGX FAMILY Marc Domenech May 8, 2017 NVIDIA Pioneered GPU Computing Founded 1993 $7B 9,500 Employees 100M NVIDIA GeForce Gamers The world s largest gaming platform Pioneering AI computing

More information

S8901 Quadro for AI, VR and Simulation

S8901 Quadro for AI, VR and Simulation S8901 Quadro for AI, VR and Simulation Carl Flygare, PNY Quadro Product Marketing Manager Allen Bourgoyne, NVIDIA Senior Product Marketing Manager The question of whether a computer can think is no more

More information

Research Faculty Summit Systems Fueling future disruptions

Research Faculty Summit Systems Fueling future disruptions Research Faculty Summit 2018 Systems Fueling future disruptions Wolong: A Back-end Optimizer for Deep Learning Computation Jilong Xue Researcher, Microsoft Research Asia System Challenge in Deep Learning

More information

Deep learning in MATLAB From Concept to CUDA Code

Deep learning in MATLAB From Concept to CUDA Code Deep learning in MATLAB From Concept to CUDA Code Roy Fahn Applications Engineer Systematics royf@systematics.co.il 03-7660111 Ram Kokku Principal Engineer MathWorks ram.kokku@mathworks.com 2017 The MathWorks,

More information

IBM Deep Learning Solutions

IBM Deep Learning Solutions IBM Deep Learning Solutions Reference Architecture for Deep Learning on POWER8, P100, and NVLink October, 2016 How do you teach a computer to Perceive? 2 Deep Learning: teaching Siri to recognize a bicycle

More information

World s most advanced data center accelerator for PCIe-based servers

World s most advanced data center accelerator for PCIe-based servers NVIDIA TESLA P100 GPU ACCELERATOR World s most advanced data center accelerator for PCIe-based servers HPC data centers need to support the ever-growing demands of scientists and researchers while staying

More information

NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORKS

NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORKS TECHNICAL OVERVIEW NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORKS A Guide to the Optimized Framework Containers on NVIDIA GPU Cloud Introduction Artificial intelligence is helping to solve some of the most

More information

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME Twenty-five years ago, we set out to transform computer graphics. Fueled by the massive growth of the gaming market and its insatiable

More information

The Tesla Accelerated Computing Platform

The Tesla Accelerated Computing Platform The Tesla Accelerated Computing Platform Axel Koehler, Principal Solution Architect HPC Advisory Council Meeting Lugano 22 March 2016 Introduction TESLA Platform for HPC Agenda TESLA Platform for HYPERSCALE

More information

TESLA V100 PERFORMANCE GUIDE May 2018

TESLA V100 PERFORMANCE GUIDE May 2018 TESLA V100 PERFORMANCE GUIDE May 2018 TESLA V100 The Fastest and Most Productive GPU for AI and HPC Volta Architecture Tensor Core Improved NVLink & HBM2 Volta MPS Improved SIMT Model Most Productive GPU

More information

TESLA PLATFORM. Jan 2018

TESLA PLATFORM. Jan 2018 TESLA PLATFORM Jan 2018 A NEW ERA OF COMPUTING AI & IOT Deep Learning, GPU 100s of billions of devices MOBILE-CLOUD iphone, Amazon AWS 2.5 billion mobile users PC INTERNET WinTel, Yahoo! 1 billion PC users

More information

TACKLING THE CHALLENGES OF NEXT GENERATION HEALTHCARE

TACKLING THE CHALLENGES OF NEXT GENERATION HEALTHCARE TACKLING THE CHALLENGES OF NEXT GENERATION HEALTHCARE Nicola Rieke, Senior Deep Learning Solution Architect Healthcare EMEA Fausto Milletari, Senior Deep Learning Solution Architect Healthcare NALA INTRODUCTION

More information

S INSIDE NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORK CONTAINERS

S INSIDE NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORK CONTAINERS S8497 - INSIDE NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORK CONTAINERS Chris Lamb CUDA and NGC Engineering, NVIDIA John Barco NGC Product Management, NVIDIA NVIDIA GPU Cloud (NGC) overview AGENDA Using NGC

More information

Accelerated Platforms: The Future of Computing. Marc Hamilton, VP Solutions Architecture & Engineering, NVIDIA Korea AI Conference 2018

Accelerated Platforms: The Future of Computing. Marc Hamilton, VP Solutions Architecture & Engineering, NVIDIA Korea AI Conference 2018 Accelerated Platforms: The Future of Computing Marc Hamilton, VP Solutions Architecture & Engineering, NVIDIA Korea AI Conference 2018 Forces Shaping Computing 10 7 10 6 10 5 GPU PERFORMANCE CPU PERFORMANCE

More information

Cisco UCS C480 ML M5 Rack Server Performance Characterization

Cisco UCS C480 ML M5 Rack Server Performance Characterization White Paper Cisco UCS C480 ML M5 Rack Server Performance Characterization The Cisco UCS C480 ML M5 Rack Server platform is designed for artificial intelligence and machine-learning workloads. 2018 Cisco

More information

Bandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design

Bandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design Bandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design Song Yao 姚颂 Founder & CEO DeePhi Tech 深鉴科技 song.yao@deephi.tech Outline - About DeePhi Tech - Background - Bandwidth Matters

More information

GPU Coder: Automatic CUDA and TensorRT code generation from MATLAB

GPU Coder: Automatic CUDA and TensorRT code generation from MATLAB GPU Coder: Automatic CUDA and TensorRT code generation from MATLAB Ram Kokku 2018 The MathWorks, Inc. 1 GPUs and CUDA programming faster Performance CUDA OpenCL C/C++ GPU Coder MATLAB Python Ease of programming

More information

NVIDIA DATA LOADING LIBRARY (DALI)

NVIDIA DATA LOADING LIBRARY (DALI) NVIDIA DATA LOADING LIBRARY (DALI) RN-09096-001 _v01 September 2018 Release Notes TABLE OF CONTENTS Chapter Chapter Chapter Chapter Chapter 1. 2. 3. 4. 5. DALI DALI DALI DALI DALI Overview...1 Release

More information

Unified Deep Learning with CPU, GPU, and FPGA Technologies

Unified Deep Learning with CPU, GPU, and FPGA Technologies Unified Deep Learning with CPU, GPU, and FPGA Technologies Allen Rush 1, Ashish Sirasao 2, Mike Ignatowski 1 1: Advanced Micro Devices, Inc., 2: Xilinx, Inc. Abstract Deep learning and complex machine

More information

Deploying Deep Learning Networks to Embedded GPUs and CPUs

Deploying Deep Learning Networks to Embedded GPUs and CPUs Deploying Deep Learning Networks to Embedded GPUs and CPUs Rishu Gupta, PhD Senior Application Engineer, Computer Vision 2015 The MathWorks, Inc. 1 MATLAB Deep Learning Framework Access Data Design + Train

More information

Advancing State-of-the-Art of Autonomous Vehicles and Robotics Research using AWS GPU Instances

Advancing State-of-the-Art of Autonomous Vehicles and Robotics Research using AWS GPU Instances Advancing State-of-the-Art of Autonomous Vehicles and Robotics Research using AWS GPU Instances Adrien Gaidon - Machine Learning Lead, Toyota Research Institute Mike Garrison - Senior Systems Engineer,

More information

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME

SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME SUPERCHARGED COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME We pioneered a supercharged form of computing loved by the most demanding computer users in the world scientists, designers, artists,

More information

TENSORRT. RN _v01 January Release Notes

TENSORRT. RN _v01 January Release Notes TENSORRT RN-08624-030_v01 January 2018 Release Notes TABLE OF CONTENTS Chapter Chapter Chapter Chapter 1. 2. 3. 4. Overview...1 Release 3.0.2... 2 Release 3.0.1... 4 Release 2.1... 10 RN-08624-030_v01

More information

Deep Learning mit PowerAI - Ein Überblick

Deep Learning mit PowerAI - Ein Überblick Stephen Lutz Deep Learning mit PowerAI - Open Group Master Certified IT Specialist Technical Sales IBM Cognitive Infrastructure IBM Germany Ein Überblick Stephen.Lutz@de.ibm.com What s that? and what s

More information

NVIDIA AI INFERENCE PLATFORM

NVIDIA AI INFERENCE PLATFORM TECHNICAL OVERVIEW NVIDIA AI INFERENCE PLATFORM Giant Leaps in Performance and Efficiency for AI Services, from the Data Center to the Network s Edge Introduction The artificial intelligence revolution

More information

CafeGPI. Single-Sided Communication for Scalable Deep Learning

CafeGPI. Single-Sided Communication for Scalable Deep Learning CafeGPI Single-Sided Communication for Scalable Deep Learning Janis Keuper itwm.fraunhofer.de/ml Competence Center High Performance Computing Fraunhofer ITWM, Kaiserslautern, Germany Deep Neural Networks

More information

HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads. Natalia Vassilieva, Sergey Serebryakov

HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads. Natalia Vassilieva, Sergey Serebryakov HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads Natalia Vassilieva, Sergey Serebryakov Deep learning ecosystem today Software Hardware 2 HPE s portfolio for deep learning Government,

More information

Fuelling the AI Revolution with Gaming

Fuelling the AI Revolution with Gaming Fuelling the AI Revolution with Gaming ALISON B LOWNDES AI DevRel EMEA @alisonblowndes 11 days till Xmas! 1 The day job $ AUTOMOTIVE Auto sensors reporting location, problems COMMUNICATIONS Location-based

More information

Can FPGAs beat GPUs in accelerating next-generation Deep Neural Networks? Discussion of the FPGA 17 paper by Intel Corp. (Nurvitadhi et al.

Can FPGAs beat GPUs in accelerating next-generation Deep Neural Networks? Discussion of the FPGA 17 paper by Intel Corp. (Nurvitadhi et al. Can FPGAs beat GPUs in accelerating next-generation Deep Neural Networks? Discussion of the FPGA 17 paper by Intel Corp. (Nurvitadhi et al.) Andreas Kurth 2017-12-05 1 In short: The situation Image credit:

More information

S CUDA on Xavier

S CUDA on Xavier S8868 - CUDA on Xavier Anshuman Bhat CUDA Product Manager Saikat Dasadhikari CUDA Engineering 29 th March 2018 1 CUDA ECOSYSTEM 2018 CUDA DOWNLOADS IN 2017 3,500,000 CUDA REGISTERED DEVELOPERS 800,000

More information

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive)

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive) NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive) NVDLA NVIDIA DEEP LEARNING ACCELERATOR IP Core for deep learning part of NVIDIA s Xavier

More information

Deep Learning and HPC. Bill Dally, Chief Scientist and SVP of Research January 17, 2017

Deep Learning and HPC. Bill Dally, Chief Scientist and SVP of Research January 17, 2017 Deep Learning and HPC Bill Dally, Chief Scientist and SVP of Research January 17, 2017 A Decade of Scientific Computing with GPUs World s First Atomic Model of HIV Capsid GPU-Trained AI Machine Beats World

More information

Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / ADD-IDAR

Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / ADD-IDAR Defense Data Generation in Distributed Deep Learning System Se-Yoon Oh / 2017. 10. 31 syoh@add.re.kr Page 1/36 Overview 1. Introduction 2. Data Generation Synthesis 3. Distributed Deep Learning 4. Conclusions

More information

April 4-7, 2016 Silicon Valley INSIDE PASCAL. Mark Harris, October 27,

April 4-7, 2016 Silicon Valley INSIDE PASCAL. Mark Harris, October 27, April 4-7, 2016 Silicon Valley INSIDE PASCAL Mark Harris, October 27, 2016 @harrism INTRODUCING TESLA P100 New GPU Architecture CPU to CPUEnable the World s Fastest Compute Node PCIe Switch PCIe Switch

More information

In partnership with. VelocityAI REFERENCE ARCHITECTURE WHITE PAPER

In partnership with. VelocityAI REFERENCE ARCHITECTURE WHITE PAPER In partnership with VelocityAI REFERENCE JULY // 2018 Contents Introduction 01 Challenges with Existing AI/ML/DL Solutions 01 Accelerate AI/ML/DL Workloads with Vexata VelocityAI 02 VelocityAI Reference

More information

DEEP NEURAL NETWORKS AND GPUS. Julie Bernauer

DEEP NEURAL NETWORKS AND GPUS. Julie Bernauer DEEP NEURAL NETWORKS AND GPUS Julie Bernauer GPU Computing GPU Computing Run Computations on GPUs x86 CUDA Framework to Program NVIDIA GPUs A simple sum of two vectors (arrays) in C void vector_add(int

More information

GPU ACCELERATED COMPUTING. 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation

GPU ACCELERATED COMPUTING. 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation GPU ACCELERATED COMPUTING 1 st AlsaCalcul GPU Challenge, 14-Jun-2016, Strasbourg Frédéric Parienté, Tesla Accelerated Computing, NVIDIA Corporation GAMING PRO ENTERPRISE VISUALIZATION DATA CENTER AUTO

More information

An introduction to Machine Learning silicon

An introduction to Machine Learning silicon An introduction to Machine Learning silicon November 28 2017 Insight for Technology Investors AI/ML terminology Artificial Intelligence Machine Learning Deep Learning Algorithms: CNNs, RNNs, etc. Additional

More information

TENSORRT. RN _v01 June Release Notes

TENSORRT. RN _v01 June Release Notes TENSORRT RN-08624-030_v01 June 2018 Release Notes TABLE OF CONTENTS Chapter Chapter Chapter Chapter Chapter Chapter 1. 2. 3. 4. 5. 6. Overview...1 Release 4.0.1... 2 Release 3.0.4... 6 Release 3.0.2...

More information

Wu Zhiwen.

Wu Zhiwen. Wu Zhiwen zhiwen.wu@intel.com Agenda Background information OpenCV DNN module OpenCL acceleration Vulkan backend Sample 2 What is OpenCV? Open Source Compute Vision (OpenCV) library 2500+ Optimized algorithms

More information

Major Information Session ECE: Computer Engineering

Major Information Session ECE: Computer Engineering Major Information Session ECE: Computer Engineering Prof. Christopher Batten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/~cbatten What is Computer Engineering?

More information

Fra superdatamaskiner til grafikkprosessorer og

Fra superdatamaskiner til grafikkprosessorer og Fra superdatamaskiner til grafikkprosessorer og Brødtekst maskinlæring Prof. Anne C. Elster IDI HPC/Lab Parallel Computing: Personal perspective 1980 s: Concurrent and Parallel Pascal 1986: Intel ipsc

More information

Introduction to Deep Learning in Signal Processing & Communications with MATLAB

Introduction to Deep Learning in Signal Processing & Communications with MATLAB Introduction to Deep Learning in Signal Processing & Communications with MATLAB Dr. Amod Anandkumar Pallavi Kar Application Engineering Group, Mathworks India 2019 The MathWorks, Inc. 1 Different Types

More information

S8688 : INSIDE DGX-2. Glenn Dearth, Vyas Venkataraman Mar 28, 2018

S8688 : INSIDE DGX-2. Glenn Dearth, Vyas Venkataraman Mar 28, 2018 S8688 : INSIDE DGX-2 Glenn Dearth, Vyas Venkataraman Mar 28, 2018 Why was DGX-2 created Agenda DGX-2 internal architecture Software programming model Simple application Results 2 DEEP LEARNING TRENDS Application

More information

NVIDIA DEEP LEARNING INSTITUTE

NVIDIA DEEP LEARNING INSTITUTE NVIDIA DEEP LEARNING INSTITUTE TRAINING CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

More information

TESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications

TESLA P100 PERFORMANCE GUIDE. HPC and Deep Learning Applications TESLA P PERFORMANCE GUIDE HPC and Deep Learning Applications MAY 217 TESLA P PERFORMANCE GUIDE Modern high performance computing (HPC) data centers are key to solving some of the world s most important

More information

Characterization and Benchmarking of Deep Learning. Natalia Vassilieva, PhD Sr. Research Manager

Characterization and Benchmarking of Deep Learning. Natalia Vassilieva, PhD Sr. Research Manager Characterization and Benchmarking of Deep Learning Natalia Vassilieva, PhD Sr. Research Manager Deep learning applications Vision Speech Text Other Search & information extraction Security/Video surveillance

More information

GPUS FOR NGVLA. M Clark, April 2015

GPUS FOR NGVLA. M Clark, April 2015 S FOR NGVLA M Clark, April 2015 GAMING DESIGN ENTERPRISE VIRTUALIZATION HPC & CLOUD SERVICE PROVIDERS AUTONOMOUS MACHINES PC DATA CENTER MOBILE The World Leader in Visual Computing 2 What is a? Tesla K40

More information

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

More information

Data center: The center of possibility

Data center: The center of possibility Data center: The center of possibility Diane bryant Executive vice president & general manager Data center group, intel corporation Data center: The center of possibility The future is Thousands of Clouds

More information

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks Deep neural networks have enabled major advances in machine learning and AI Computer vision Language translation Speech recognition Question answering And more Problem: DNNs are challenging to serve and

More information

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs Ritchie Zhao 1, Weinan Song 2, Wentao Zhang 2, Tianwei Xing 3, Jeng-Hau Lin 4, Mani Srivastava 3, Rajesh Gupta 4, Zhiru

More information