S8497 - INSIDE NVIDIA GPU CLOUD DEEP LEARNING FRAMEWORK CONTAINERS
Chris Lamb, CUDA and NGC Engineering, NVIDIA
John Barco, NGC Product Management, NVIDIA
AGENDA
NVIDIA GPU Cloud (NGC) overview
Using NGC on your PC, workstation or cloud provider
Understanding best practices for working with containers
Learn more: NGC meetup and sessions
Q&A
CHALLENGES WITH COMPLEX SOFTWARE
Current DIY GPU-accelerated AI and HPC deployments can be complex and time-consuming to build, test and maintain
Development of software frameworks by the community is moving very fast
Requires a high level of expertise to manage driver, library, and framework dependencies
[Figure: software stack with the labels Open Source Frameworks, NVIDIA Libraries, NVIDIA Docker, NVIDIA Driver, NVIDIA GPU]
WHY CONTAINERS?
Benefits of Containers:
Simplify deployment of GPU-accelerated software, eliminating time-consuming software integration work
Isolate individual deep learning frameworks and applications
Share, collaborate, and test applications across different environments
DEPLOY ACROSS MULTIPLE PLATFORMS
NVIDIA TITAN (powered by NVIDIA Volta or NVIDIA Pascal)
NVIDIA DGX-1 and DGX Station
Amazon EC2 P3 instances with NVIDIA Volta
VIRTUAL MACHINES VS. CONTAINERS
Motivation:
Packaging and deployment mechanism for applications
Consistent and reproducible deployment
Lightweight and lower overhead than VMs
Logical isolation from other applications
EXAMPLE NGC CONTAINER WORKFLOW
1. NVIDIA builds application image composed of layers of files
2. Image(s) tested and released to NGC repository hosted at URLs like nvcr.io/nvidia/tensorflow
3. User pulls image to a machine and runs it: $ docker run nvcr.io/
4. Image cached and OS-isolated set of resources allocated (container) in which to execute
5. Data & results accessed as a filesystem volume
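The pull-and-run part of the workflow above can be sketched as a small shell script. This is only an illustration under stated assumptions: the helper names `ngc_image_ref` and `run` and the `DRY_RUN` variable are hypothetical, and the `18.02` tag is borrowed from the later slides. By default the script prints the commands rather than executing them.

```shell
#!/bin/sh
# Compose a full image reference on the NGC registry.
# $1 = namespace, $2 = repository, $3 = tag (hypothetical helper, not an NGC tool)
ngc_image_ref() {
    printf '%s/%s/%s:%s\n' "nvcr.io" "$1" "$2" "$3"
}

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

IMAGE=$(ngc_image_ref nvidia tensorflow 18.02)

# Step 3: fetch the image layers; layers already cached locally are reused.
run docker pull "$IMAGE"

# Step 4: start an isolated container from the cached image.
run docker run --rm -it "$IMAGE"
```

Setting `DRY_RUN=0` on a machine with Docker installed and logged in to nvcr.io would execute the two commands instead of printing them.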
ANATOMY OF AN NGC CONTAINER IMAGE
R/W Layer
Image Layers (R/O):
fb91e851e672 Examples & Scripts
0c395732af81 DL Framework & Source
145c1bf7947a NVIDIA Deep Learning SDK
f2233041f557 NVIDIA CUDA SDK
ubuntu:16.04
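To make the layering concrete, here is a toy model of how a path is resolved in a stack of read-only layers topped by one writable layer: the topmost layer that contains the path wins. It is a simplification for illustration only. The directory names and the `lookup` helper are hypothetical, and real Docker relies on a union filesystem storage driver such as overlay2 rather than a loop like this.

```shell
#!/bin/sh
# Toy model: each layer is a plain directory under a temporary root.
ROOT=$(mktemp -d)
trap 'rm -rf "$ROOT"' EXIT

# Layers listed bottom to top; "rw" stands for the container's writable layer.
LAYERS="ubuntu cuda dlsdk framework examples rw"
for layer in $LAYERS; do
    mkdir -p "$ROOT/$layer"
done

echo "base"                > "$ROOT/ubuntu/os-release"
echo "framework default"   > "$ROOT/framework/config"
# A change made inside a running container lands in the R/W layer only.
echo "edited in container" > "$ROOT/rw/config"

# Resolve a path: walk bottom to top and keep the last (topmost) match.
lookup() {
    found=""
    for layer in $LAYERS; do
        if [ -e "$ROOT/$layer/$1" ]; then
            found="$ROOT/$layer/$1"
        fi
    done
    [ -n "$found" ] && cat "$found"
}

lookup config       # resolved from the writable layer
lookup os-release   # falls through to the base layer
```

The read-only copy of `config` in the framework layer is untouched, which is why many containers can share the same image layers. On a machine with Docker, `docker history <image>` lists the layers of a real image.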
ALWAYS UP-TO-DATE Monthly Releases from NVIDIA
BEST NVIDIA PERFORMANCE Over 6 months, up to 1.5X improvement with mixed-precision on ResNet-50
TARGET SYSTEM SETUP
NGC Virtual Machine Images (pre-installed): NVIDIA Deep Learning for Volta (AWS EC2 AMI)
NGC Container Ready BaseOS: on all DGX Systems
Self-Install: Setup Guide
Up-to-date Ubuntu Server OS, CUDA Drivers, NVIDIA Container Runtime
NGC Examples and Management Scripts: https://github.com/nvidia/ngc-examples
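For the self-install path, a quick check that the prerequisites are visible on the target machine might look like the sketch below. The `have` helper is hypothetical, and the check only looks for the `docker` and `nvidia-smi` commands on the PATH; it does not verify driver versions or that the NVIDIA container runtime is configured correctly.

```shell
#!/bin/sh
# Return success if the named command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report each prerequisite without stopping at the first missing one.
for tool in docker nvidia-smi; do
    if have "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# With both present, show the driver's view of the GPUs and Docker's
# configuration (the output of docker info includes the configured runtimes).
if have docker && have nvidia-smi; then
    nvidia-smi
    docker info
fi
```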
LOG INTO NGC, PULL AND RUN
1. Create Account / Log In
2. Get API Key
3. Browse For Image
4. Log in on Machine & Run
$ docker login nvcr.io
Username: $oauthtoken
Password: *******
$ docker run -it nvcr.io/nvidia/pytorch:18.02
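Step 4 can also be scripted. The sketch below assumes the API key from step 2 is stored in an environment variable named `NGC_API_KEY` (a hypothetical name) and passes it on standard input with Docker's `--password-stdin` option so that it does not appear in the process list. Note that the username is the literal text `$oauthtoken`, so it must be single-quoted in a script.

```shell
#!/bin/sh
# The registry username is the literal string $oauthtoken;
# single quotes stop the shell from expanding it as a variable.
NGC_USER='$oauthtoken'

ngc_login() {
    # NGC_API_KEY is an assumed variable name holding the key from step 2.
    printf '%s' "$NGC_API_KEY" |
        docker login --username "$NGC_USER" --password-stdin nvcr.io
}

if [ -n "${NGC_API_KEY:-}" ] && command -v docker >/dev/null 2>&1; then
    # Log in, then start the container shown on the slide.
    ngc_login && docker run -it nvcr.io/nvidia/pytorch:18.02
else
    echo "Set NGC_API_KEY and install Docker to log in as $NGC_USER"
fi
```

Writing `NGC_USER="$oauthtoken"` with double quotes would expand an unset variable and send an empty username, which is an easy mistake to make when moving from the interactive prompt to a script.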
RUNNING CONTAINERS WITH DATA
$ docker run --rm -it --volume /mnt/ssd/large_dataset:/workspace/large_dataset nvcr.io/nvidia/tensorflow:18.02
The host directory /mnt/ssd/large_dataset is mounted in the container at /workspace/large_dataset
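The volume mapping has the form HOST_PATH:CONTAINER_PATH, optionally followed by a mode such as `ro` for a read-only mount. The sketch below builds that argument with a hypothetical `volume_arg` helper and only starts the container when the host directory and Docker are both present. Mounting the dataset read-only is a suggestion added here, not something the slide prescribes.

```shell
#!/bin/sh
# Build a --volume argument: HOST:CONTAINER or HOST:CONTAINER:MODE.
# $1 = host path, $2 = container path, $3 = optional mode (e.g. ro)
volume_arg() {
    if [ -n "${3:-}" ]; then
        printf '%s:%s:%s\n' "$1" "$2" "$3"
    else
        printf '%s:%s\n' "$1" "$2"
    fi
}

HOST_DIR=/mnt/ssd/large_dataset
CONTAINER_DIR=/workspace/large_dataset
IMAGE=nvcr.io/nvidia/tensorflow:18.02

# Read-only mount so a training job cannot modify the source data.
VOLUME=$(volume_arg "$HOST_DIR" "$CONTAINER_DIR" ro)

if [ -d "$HOST_DIR" ] && command -v docker >/dev/null 2>&1; then
    docker run --rm -it --volume "$VOLUME" "$IMAGE"
else
    echo "would run: docker run --rm -it --volume $VOLUME $IMAGE"
fi
```

Results that should outlive the container need a second, writable mount, because `--rm` discards the container's own writable layer on exit.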
NVIDIA GPU CLOUD APPLICATIONS Access to a Comprehensive Catalog of GPU-Accelerated Software
NGC MEETUP SESSIONS
Meetup: Tues 7:30-9:30PM, Room 210E
Frameworks 101 - the NVIDIA special sauce (Joey Conway)
Verizon Case Study (Bryan Larish, Verizon)
The Future of Multi-GPU Training (Mike O'Connor)
Cloud Inferencing 101 with TensorRT (David Goodwin)
Kubernetes on NVIDIA GPUs (Ryan Olson)
NGC Sessions
1. How to Use NGC Containers on AWS
2. Inside NGC Deep Learning Framework Containers
3. Connect with the NGC Deep Learning Experts
4. Predicting 4G Wireless Network Quality with Deep-Learning Algorithm
5. NVIDIA IndeX 2.0 - Advanced Large-Scale Data Visualizations
6. GE's Evolution from HPC to AI in Healthcare
7. Quick and Easy DL Workflow Proof of Concept
8. Building Smart Handheld 3D Ultrasound Imaging System with GPU and NGC
GPU-ACCELERATED DEEP LEARNING SOFTWARE CONTAINERS
Deep Learning Everywhere, for Everyone
Innovation for Every Industry: Quickly tap into the power of NVIDIA AI, from automotive, to healthcare, to fintech, and more
Say Goodbye to DIY: Ready-to-run deep learning software containers, tuned, tested, and certified by NVIDIA
Stay Up To Date: Monthly updates to deep learning containers
NVIDIA GPU Cloud integrates GPU-optimized deep learning frameworks, runtimes, libraries, and OS into a ready-to-run container, available at no charge
GPU-OPTIMIZED DEEP LEARNING SOFTWARE
Tuned, Tested, Certified, and Maintained by NVIDIA
NVCaffe
Caffe2
Microsoft Cognitive Toolkit (CNTK)
DIGITS
MXNet
PyTorch
TensorFlow
Theano
Torch
CUDA (base level container for developers)
NVIDIA TensorRT inference accelerator with ONNX support
ALWAYS UP-TO-DATE
Monthly Updates from NVIDIA to Deep Learning Containers
[Figure labels: Containerized Applications (TF, CNTK, Caffe2, PyTorch, Other Frameworks and Apps), each with Tuned SW and CUDA RT; Docker Engine Utility for NVIDIA GPUs; Linux Kernel + CUDA Driver]