S8901 Quadro for AI, VR and Simulation

Carl Flygare, PNY Quadro Product Marketing Manager
Allen Bourgoyne, NVIDIA Senior Product Marketing Manager

"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim."
Edsger Dijkstra

Intelligence Abounds in Nature: A very small sampling

Technological Intelligence: Homo sapiens' essential differentiator
Thalamocortical brain network: 3 million neurons, 476 million synapses
Full human brain: 106 billion neurons, 1,000 trillion synapses

Artificial Intelligence: Where We Stand Today
Google's IQ is slightly below a six-year-old human's
Google: 47.28 (78.42% increase since 2014)
Baidu: 32.92 (40.08% increase since 2014)
Microsoft Bing: 31.98
Apple Siri: 23.90
AI IQs are significantly lower than an 18-year-old's average score of 97
In 2014, two of the three researchers found Google had an IQ of 26.5 compared to Baidu's 23.5
Source: http://www.zdnet.com/article/google-ai-vs-siri-vs-bing-iq-tests-show-one-is-smartest-by-a-mile/
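The percentage increases quoted on this slide can be cross-checked against the 2014 scores it also cites. The following is a minimal sketch; the helper name `percent_increase` is ours, and only the four scores come from the slide.

```python
def percent_increase(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Scores as quoted on the slide (2014 values vs. the current values).
scores_2014 = {"Google": 26.5, "Baidu": 23.5}
scores_now = {"Google": 47.28, "Baidu": 32.92}

increase = {
    name: percent_increase(scores_2014[name], scores_now[name])
    for name in scores_2014
}

for name, pct in increase.items():
    print(f"{name}: {pct:.2f}% increase since 2014")
```

Both results agree with the slide's figures to within rounding in the second decimal place.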

NVIDIA Quadro: Every segment benefits from AI, VR and simulation
Segments: Manufacturing, CAE, Media and Entertainment, Automotive, AEC, Energy (Oil and Gas), Scientific and Technical, Healthcare

NVIDIA Quadro: AI, VR and Simulation Open New Possibilities
GP100 16 GB, GV100 32 GB: AI (Deep Learning) Development, Collaborative VR, CAE Simulations, Ultimate CAD Models, Photorealistic Rendering and GPGPU Compute
P6000 24 GB: Collaborative VR, Extremely Complex CAD Models, CAE, Photorealistic Rendering, DCC and VFX, Seismic Exploration, 3D Medical Imaging
P5000 16 GB: Professional VR, Very Complex CAD Models, CAE, Photorealistic Rendering, Advanced DCC and VFX, 3D Medical Imaging
P4000 8 GB: Professional VR, Complex CAD Models, CAE, Photorealistic Rendering, Complex DCC and VFX, Medical Imaging
P2000 5 GB, P1000 4 GB: Medium Size and Complexity CAD Models, PLM, Basic DCC, Medical Imaging
P620 2 GB, P400 2 GB: Small and Simple CAD Models, Entry PLM
Tiers, lowest to highest: Entry, Basic, Mid Range, Upper Range, High End, Ultra High End

NVIDIA Quadro GP100 and NVIDIA Quadro GV100: Reinventing the Workstation for AI

NVIDIA Quadro GP100 and NVIDIA Quadro GV100 x 2 with NVLink: Scalable Workstation AI

NVIDIA Quadro GV100 and NVLink: Scaling performance and memory*
High speed GPU and memory connection for GV100
NVLink combines two GV100s for twice the compute power and 64 GB of memory
Up to 200 GB/sec bidirectional bandwidth, a 25% improvement
Used in pairs, with two dedicated NVLink connectors on GV100 boards
Provides SLI functionality for GV100 boards
* Application support for NVLink required. A maximum of two GV100 boards can be connected with NVLink.

NVIDIA Quadro GV100: Technical specifications
GPU Architecture: Volta
CUDA and Tensor Cores: 2560 (FP64), 5120 (FP32), 640 (Tensor)
Memory Capacity: 32 GB HBM2
Peak Memory Bandwidth: 870 GB/sec
FP64 (Double Precision): 7.4 TFLOPS, 42% improvement
FP32 (Single Precision): 14.8 TFLOPS, 44% improvement
FP16 (Half Precision): 118.5 TFLOPS (Matrix Multiply with FP16 or 32 Accumulate)
INT8 (Integer): 59.3 TOPS, 26% improvement
System Interface: PCI Express Gen 3 x16
NVLink: 200 GB/sec Bidirectional, 25% improvement
Display Connectors: 4x DisplayPort 1.4 with HDCP 2.2
4K Display Support: 4x 4096 x 2160 at 120 Hz with HDR
5K Display Support: 4x 5120 x 2880 at 60 Hz with HDR
8K Display Support: 2x 7680 x 4320 at 60 Hz with HDR
VR Ready and Stereo: Yes, Stereo via 3-pin mini-din Connector Bracket
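The peak throughput figures above are consistent with the core counts if each core retires one fused multiply-add (2 floating-point operations) per clock. The sketch below backs out the clock that the FP32 figure implies and re-derives the FP64 and Tensor Core numbers from it. The slide states no clock frequency, so `implied_clock_ghz` is an inference, and the figure of 64 FMAs per Tensor Core per clock is our assumption about Volta rather than something this deck states.

```python
# Figures taken from the specification slide.
FP32_CORES = 5120
FP64_CORES = 2560
TENSOR_CORES = 640
FP32_TFLOPS = 14.8

# Assumptions (not stated on the slide).
FLOPS_PER_FMA = 2                    # one multiply plus one add
FMA_PER_TENSOR_CORE_PER_CLOCK = 64   # assumed 4x4x4 matrix FMA per clock

# Clock frequency implied by the FP32 peak: TFLOPS -> GFLOPS, then divide
# by the number of floating-point operations issued per clock.
implied_clock_ghz = FP32_TFLOPS * 1e3 / (FP32_CORES * FLOPS_PER_FMA)

# Re-derive the other peaks from the same clock.
fp64_tflops = FP64_CORES * FLOPS_PER_FMA * implied_clock_ghz / 1e3
tensor_tflops = (TENSOR_CORES * FMA_PER_TENSOR_CORE_PER_CLOCK
                 * FLOPS_PER_FMA * implied_clock_ghz / 1e3)

print(f"Implied clock: {implied_clock_ghz:.3f} GHz")
print(f"FP64 peak:     {fp64_tflops:.1f} TFLOPS")
print(f"Tensor peak:   {tensor_tflops:.1f} TFLOPS")
```

Under these assumptions the FP64 result matches the quoted 7.4 TFLOPS, and the Tensor Core result lands within about 0.1 TFLOPS of the quoted 118.5, which suggests the slide's figures were computed from a slightly less rounded clock value.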

NVIDIA Quadro GV100: Unmatched compute capabilities
FP64: 7.4 TFLOPS
FP32: 14.8 TFLOPS
FP16: 118.5 TFLOPS
INT8: 59.3 TOPS

NVIDIA Quadro GV100: Features and benefits relative to GP100
Feature | GP100 | GV100 | Benefit
GPU Architecture | Pascal | Volta | Most powerful, efficient and AI optimized GPU
CUDA Cores | 3584 | 5120 | Significantly greater compute and rendering performance
FP64 Performance | 5.2 TFLOPS | 7.4 TFLOPS | 1.4x greater FP64 compute performance
Memory Size | 16 GB HBM2 | 32 GB HBM2 | 2.0x memory capacity
Memory Bus Width | 4096-bit | 4096-bit | Radically advanced memory bus implementation
Peak Memory Bandwidth | 717 GB/sec | 870 GB/sec | Move data to and from GPU 1.2x faster
Display Support | 4x DP 1.4 + 1x DVI-D DL | 4x DP 1.4 and HDCP 2.2 | Supports four 4K, 5K or 8K displays, latest HDCP
HDR Image Support | Yes | Yes | More lifelike images
Advanced Display | Quadro Sync II | Quadro Sync II | Synchronize up to 8 GPUs per system
VR Ready | Yes | Yes | GV100 implements full suite of hardware optimizations
NVLink | NVLink (First Generation) | NVLink (Second Generation) | Higher performance means lower latency
Board Power | 235 W | 250 W | Better performance per Watt
Auxiliary Power Connector | 8-pin PCIe | 8-pin PCIe | Simplified power supply connectivity
Form Factor | 4.4" H x 10.5" L Dual Slot | 4.4" H x 10.5" L Dual Slot | No significant mechanical or thermal changes
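The multipliers in the Benefit column follow directly from the paired GP100 and GV100 values. A minimal sketch of that arithmetic, using only numbers from the comparison above (the dictionary layout and names are ours):

```python
# (GP100 value, GV100 value) pairs as listed in the comparison.
specs = {
    "CUDA cores": (3584, 5120),
    "Memory size (GB)": (16, 32),
    "Peak memory bandwidth (GB/sec)": (717, 870),
    "Board power (W)": (235, 250),
}

# GV100-to-GP100 ratio for each numeric feature.
ratios = {name: gv100 / gp100 for name, (gp100, gv100) in specs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The memory and bandwidth ratios round to the 2.0x and 1.2x quoted in the comparison, while board power rises by only about 6%.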

NVIDIA Quadro GV100: Redefines state of the art across essential solutions
Artificial Intelligence: Tensor processor cores; NVIDIA GPU deep learning stack; ISV DL and ML framework optimization; Iterate and innovate faster; Reduce training time
RTX Rendering: Unrivaled FP32 performance; Largest models in GPU memory; AI accelerated photorealistic rendering; Neural network character animation; Apply AI to simultaneous video streams
Compute: Industry leading HPC capabilities; Work with largest datasets; Integrate simulation into design process; Utilize generative design algorithms; Fastest FEA, CFD, CEM available
Immersive Visualization (VR): Includes VR hardware optimizations; Full NVIDIA VRWORKS support; Create new AI-augmented technologies; Visualize the largest datasets; Collaborative VR environments (Holodeck)
Connect two GV100 boards with NVLink to provide 64 GB of memory and twice the GPU processing power in standard workstation enclosures

NVIDIA Quadro GV100: RTX rendering lets you dream and create at the speed of thought
Architectural Design: Visualize cities or urban street scenes in every photorealistic detail
Product Design: Design with physically based lights and materials in real time
Media and Entertainment: Perfect every shot with GPU accelerated and AI enhanced rendering
Work at full fidelity, utilizing massive datasets with 2x larger memory capacity
Master rendering projects interactively with AI (Deep Neural Network) technology

NVIDIA Quadro: RTX supercharges rendering with AI accelerated denoising
[Image panels: Denoising On, 20 Frames; Denoising Off, 20 Frames; Denoising Off, 290 Frames]
High quality results with fluid visual interactivity throughout the design process

NVIDIA Quadro: Companies working with NVIDIA's OptiX AI denoiser technology
Image courtesy of Isotropix, rendered with Clarisse and denoised with NVIDIA OptiX.

NVIDIA Quadro: CAD and CAE workflow elements
Design (CAD), Pre-Processing, Simulation (CAE), Post-Processing

NVIDIA Quadro GV100: Benefit from the ultimate immersive experiences
RTX Rendered Graphics, Interactive Physics, Realtime Collaboration, GPU-Accelerated AI
2x larger memory capacity lets you work with high fidelity, massive datasets (vs. GP100)
Benefit from unconstrained Holodeck experiences with full-featured VR performance and capabilities

NVIDIA Quadro GV100: Realize new opportunities with AI
Development, NGC Aggregation, Inferencing At-The-Edge
Retail store inferencing with Quadro by DeepBlue Technology, China
32 GB or 64 GB capacity (NVLink) trains neural networks with massive datasets
Develop with NVIDIA optimized Deep Learning frameworks and deploy with NGC interoperability and scalability
Accelerate AI training and inferencing on workstations with Tensor cores and NVLink

NVIDIA Quadro GV100: AI Training Performance
Up to 2x improvement in Deep Learning training performance*
[Charts: Caffe ResNet-50 Training IPS and TensorFlow ResNet-50 Training IPS, comparing GP100 and GV100 at batch sizes 128, 256 and 512]
* Based on TensorFlow ResNet-50 Training. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.

NVIDIA Quadro GV100: Deep Learning Training Performance
Over 2x improvement in Deep Learning training and inference performance*
[Charts: TensorFlow ResNet-50 Training (GP100 at batch size 256; GV100 at batch sizes 256 and 512) and TensorRT ResNet-50 Inference (batch sizes 1, 2, 4 and 8)]
* Based on TensorFlow ResNet-50 Training and TensorRT ResNet-50 Inference tests. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.

NVIDIA Quadro GV100: Scientific Compute Performance
More than 2x improvement over the previous generation*
[Charts: CUDA Basic Linear Algebra Solver Benchmark (cuBLAS, 2560 x 2048 x 8192; FP32, FP64 and FP16) and LAMMPS Atomic Fluid Benchmark, comparing GP100 and GV100]
* Based on LAMMPS molecular modeling benchmark. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.

NVIDIA Quadro GV100: CAE Example
Significant ANSYS Mechanical 19 Acceleration*
Power Supply Module (V19cg-1), relative performance by configuration:
16 CPU Cores, Base + 12 HPC Licenses: 2.29
8 CPU Cores + GV100, Base + 5 HPC Licenses: 3.90
8 CPU Cores, Base + 4 HPC Licenses: 1.71
3 CPU Cores + GV100, Base License: 2.65
4 CPU Cores, Base License: 1.0
* Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
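One way to read the ANSYS Mechanical chart is to compare each GV100 configuration with the CPU-only configuration closest to it. The sketch below does that for the two pairs the slide provides; it assumes the bar values are speedups relative to the 4-core base-license run, and the pair labels are ours. The paired configurations are not identical in CPU core count or license level, so the ratios are indicative only.

```python
# (label, CPU-only relative performance, GV100 relative performance),
# using the bar values quoted for the Power Supply Module benchmark.
pairs = [
    ("Base License (4 CPU cores vs. 3 CPU cores + GV100)", 1.0, 2.65),
    ("8 CPU cores (Base + 4 HPC vs. Base + 5 HPC with GV100)", 1.71, 3.90),
]

# Uplift attributable to adding the GV100 within each pair.
uplift = {label: with_gpu / cpu_only for label, cpu_only, with_gpu in pairs}

for label, factor in uplift.items():
    print(f"{label}: {factor:.2f}x")
```

On these figures the GV100 more than doubles throughput in both pairs.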

NVIDIA Quadro GV100: CAE Example
Standout ANSYS Fluent 19 Acceleration*
Pipes Model, 9.6 Million Cells, relative performance by configuration:
32 CPU Cores + 2x GV100, Base + 2 HPC Packs: 5.55
32 CPU Cores, Base + 2 HPC Packs: 3.29
16 CPU Cores + 2x GV100, Base + 5 HPC Licenses: 4.71
16 CPU Cores, Base + 12 HPC Licenses: 2.74
8 CPU Cores + GV100, Base + 5 HPC Licenses: 2.67
8 CPU Cores, Base + 4 HPC Licenses: 1.78
3 CPU Cores + GV100, Base License: 1.53
4 CPU Cores, Base License: 1.0
* Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
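The CPU-only Fluent results also show why adding GPUs is attractive at higher core counts: parallel efficiency falls as cores are added. A minimal sketch, assuming the bar values are speedups relative to the 4-core baseline (the slide does not state this explicitly) and defining efficiency as measured speedup divided by ideal linear speedup:

```python
BASELINE_CORES = 4

# CPU-only relative performance for the Pipes model, keyed by core count.
cpu_only = {4: 1.0, 8: 1.78, 16: 2.74, 32: 3.29}

# Efficiency = measured speedup / ideal linear speedup over the baseline.
efficiency = {
    cores: speedup / (cores / BASELINE_CORES)
    for cores, speedup in cpu_only.items()
}

for cores, eff in efficiency.items():
    print(f"{cores:>2} CPU cores: {eff:.0%} parallel efficiency")
```

Under that reading, efficiency drops from 89% at 8 cores to roughly 41% at 32 cores, while the 32-core run with two GV100 boards reaches 5.55 on the same scale.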

NVIDIA Quadro GV100: Rendering Performance
SOLIDWORKS Visualize scales to over 29x faster than CPU*
[Chart: rendering speedup relative to CPU for 2x GV100, 2x GP100, GV100, GP100, P6000, P5000, P4000 and P2000]
* Based on 2x GV100, Xeon E5-2697 v3, 14 cores at 2.6 GHz, 32 GB DRAM, Win 10 Pro 64-bit Fall Creators Update and NVIDIA driver version 390.77. Tests run at 4K UHD (3840 x 2160) resolution.

NVIDIA Quadro GV100: Graphics Performance
Up to 1.3x better than previous generation*
[Chart: Quadro GV100 relative to Quadro GP100 for the geomean and the 3dsmax, catia, creo, energy, maya, medical, showcase, snx and sw viewsets]
* Based on SPECviewperf 12.2.2 results.