Nvidia GPU Support on Mesos: Bridging Mesos Containerizer and Docker Containerizer

Nvidia GPU Support on Mesos: Bridging Mesos Containerizer and Docker Containerizer MesosCon Asia - 2016 Yubo Li Research Stuff Member, IBM Research - China Email: liyubobj@cn.ibm.com 1

Yubo Li( 李玉博 ) Email: liyubobj@cn.ibm.com Slack: @liyubobj QQ: 395238640 Dr. Yubo Li is a Researcher Stuff Member at IBM Research, China. He is the architect of the GPU acceleration and deep-learning as a service (DlaaS) components of SuperVessel, an open-access cloud running OpenStack on OpenPOWER machines. He is currently working on GPU support for several cloud container technologies, including Mesos, Kubernetes, Marathon and OpenStack. 2

Why GPUs? GPUs are the tool of choice for many computation intensive applications Deep Learning Scientific Computing Genetic Analysis 3

Why GPUs? GPU can shorten a deep learning training from tens of days to several days 4

Why GPUs? Mesos users have been asking for GPU support for years First email asking for it can be found in the dev-list archives from 2011 The request rate has increased dramatically in the last 9-12 months 5

Interface Why GPUs? We have internal need to support cognitive solutions on Mesos User Web UI Monitoring Operation UI Cognitive API/UI Others Application Data pre-processing DL Training (Caffe, Theano, etc) DL Inference (Caffe, Theano, etc) Web Service Resource Management/Orchestration Mesos + Frameworks (Marathon, k8sm) Infrastructure Compute Resource (VM / Bare-metal) Container (docker, mesos container) Network Volume Hardware Resources Storage Compute Memory SSD/Flash Disk CPU GPU FPGA 6 Memory

Why GPUs? VM based GPU pass-through Exclusively occupied GPUs Container based GPU injection Flexible apply and release 7

Why GPUs? Mesos has no isolation guarantee for GPUs without native GPU support No built-in coordination to restrict access to GPUs Possible for multiple frameworks / tasks to access GPUs at the same time 8

Why GPUs? Enterprise users want to see GPU support on container cloud Deep learning / artificial intelligence need GPU as accelerator Traditional HPC users turn to micro-service arch. and container cloud 9

Why Docker? Extremely popular image format for containers Build once run everywhere Configure once run anything Source: DockerCon 2016 Keynote by Docker s CEO Ben Golub 10

Why Docker? Nvidia-docker Wrap around docker to allow GPUs to be used/isolated inside docker containers CUDA-ready docker images Exclusive CUDA toolkit Loose dependency Shared GPU/CUDA driver https://github.com/nvidia/nvidia-docker 11

Why Docker? Ready-to-use ML/DL images Get rid of tedious framework installation! 12

Why Docker? Our internal consideration We want to re-use so many existing docker images/dockerfiles Developers are familiar with docker 13

What We Want To Do? Test locally with nvidia-docker Deploy to production with Mesos 14

Talk Overview Challenges and our basic ideas GPU unified scheduling design Future works Demo: running cognitive application with Mesos/Marathon + GPU 15

Bare-metal vs. Container for GPU Bare-metal Application (Caffe/TF/ ) CUDA libraries nvidia base libraries Container1 Application (Caffe/TF/ ) CUDA libraries nvidia base libraries Container2 Application (Caffe/TF/ ) CUDA libraries nvidia base libraries nvidia-kernel-module Linux Kernel Loose couple between host and container is the most challenge! nvidia-kernel-module Linux Kernel 16

Challenges Container Application (Caffe/TF/ ) CUDA libraries nvidia base libraries (v1) Container1 Application (Caffe/TF/ ) CUDA libraries nvidia base libraries Container2 Application (Caffe/TF/ ) CUDA libraries nvidia base libraries nvidia-kernel-module (v2) Linux Kernel nvidia-kernel-module Linux Kernel Not work if nvidia libraries and kenel module versions are not match We also need GPU isolation control 17

How We Solve That? Container Application (Caffe/TF/ ) Container Application (Caffe/TF/ ) CUDA libraries nvidia base libraries (v1) CUDA libraries nvidia base libraries (v2) Volume injection nvidia-kernel-module (v2) Linux Kernel nvidia base libraries (v2) nvidia-kernel-module (v2) Linux Kernel Not work if nvidia libraries and kernel module versions are not match 18

How We Solve That? Mimic functionality of nvidia-docker-plugin Finds all standard nvidia libraries / binaries on the host and consolidates them into a single place as a docker volume (nvidia-volume) /var/lib/docker/volumes nvidia_xxx.xx (version number) bin lib lib64 Inject volume with ro to container if needed 19

How We Solve That? Determine whether nvidia-volume is needed Check docker image label: com.nvidia.volumes.needed = nvidia_driver Inject nvidia-volume to /usr/local/nvidia if the label found This label certificates following things: https://github.com/nvidia/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/dockerfile 20

How We Solve That? GPU isolation Currently we support physical-core level isolation Example Per card Isolation? GPU sharing is not supported No process capping mechanism from nvidia GPU driver GPU sharing is suggested for MPI/OpenMP case only Yes 1 core of Tesla K80 (dual-core) Yes 512 CUDA cores of Tesla K40 No 21

How We Solve That? GPU device control: /dev nvidia0 nvidia1 nvidiactl nvidia-uvm nvidia-uvm-tools (data interface for GPU0) (data interface for GPU1) (control interface) (unified virtual memory) (UVM control) Isolation Mesos containerizer: cgroups Docker containerizer: docker run devices 22

How We Solve That? Dynamic loading of nvml library Nvidia GDK (nvml library) Same Mesos binary works on both for GPU node and non-gpu node Needs on compile Mesos binary with GPU support Yes, with GPU Run Mesos on GPU node Nvidia GPU Driver Yes, without GPU Mesos binary without GPU support Yes, without GPU Yes, without GPU Run Mesos on non-gpu node 23

Apache Mesos and GPUs Multiple containerizer support Mesos (aka unified) containerizer (fully supported) Docker containerizer (code review, partially merged) Why support both? Many people are asking for docker containerizer support to bridge the feature gap People are already familiar with existing docker tools Unified containerizer needs time to mature 24

Apache Mesos and GPUs GPU_RESOURCES framework capability Frameworks must opt-in to receive offers with GPU resources Prevents legacy frameworks from consuming non-gpu resources and starving out GPU jobs Use agent attributes to select specific type of GPU resources Agents advertise the type of GPUs they have installed via attributes Only accept an offer if the attributes match the GPU type you want 25

Usage Usage Nvidia GPU and GPU driver needed Install Nvidia GPU Deployment Toolkit (GDK) Compile Mesos with flag:../configure --with-nvml=/nvml-header-path && make j install Build GPU images following nvidia-docker do: (https://github.com/nvidia/nvidia-docker) Run a docker task with additional such resource gpus=1 Mesos Containerier: --isolation="cgroups/devices,gpu/nvidia" 26

Apache Mesos and GPUs -- Evolution Containerizer API Isolator API Mesos Agent (Unified) Mesos Containerizer Mimics functionality of nvidia-docker-plugin Nvidia GPU Allocator Nvidia Volume Manager CPU Memory GPU Linux devices cgroup Nvidia GPU Isolator 27

Apache Mesos and GPUs -- Evolution Containerizer API Isolator API Mesos Agent (Unified) Mesos Containerizer CPU Memory GPU Nvidia GPU Allocator Nvidia Volume Manager Nvidia GPU Isolator Linux devices cgroup 28

Apache Mesos and GPUs -- Evolution Containerizer API Composing Containerizer Docker Containerizer GPU (Unified) Mesos Containerizer Nvidia GPU Allocator Isolator API CPU Memory GPU Nvidia Volume Manager 29

Apache Mesos and GPUs Mesos Agent mesos-docker-executor Mesos Containerizer Nvidia GPU Isolator Nvidia GPU Allocator Docker Daemon --device Docker Containerizer --volume Nvidia Volume Manager CPU Memory GPU GPU driver volume Docker image label check: com.nvidia.volumes.needed="nvidia_driver" Native docker arguments for GPU management 30

Release and Eco-syetems Release GPU for Mesos Containerizer: fully supported after Mesos 1.0 (supports both image-less and docker-image based containers) GPU for Docker Containerizer: Expected to release on Mesos 1.1 or 1.2 Eco-systems Marathon GPU support for Mesos Containerizer after Marathon v1.3 GPU support for Docker Containerizer ready for release (wait for Mesos support) K8sm On design 31

Mesos on IBM POWER8 Apache Mesos 1.0 and GPU feature perfectly supports IBM POWER8 IBM POWER8 Delivers Superior Cloud Performance with Docker 32

IBM LC 产品为 Power Systems 增添新活力 High Performance Computing S812LC S822LC 大器有为要多大才比的上它的认知高度帮助企业加速获得洞察 S822LC for HPC 大快人心要多快才比的上它的运算速度开启新一波加速浪潮 S821LC 大智若云要多智才比的上它的云端契合度开启数据中心与云的无缝模式 S822LC for Commercial Computing Storage rich single socket system for big data applications Memory Intensive workloads 以存储为中心高数据吞吐量工作负载的理想之选采用 2 个 POWER8 插槽, 用于处理大数据工作负载通过 CAPI 和 GPU 实现大数据加速引入了 CPU-GPU NVLink, 可将进入 GPU 加速器的带宽提升 2.5 倍 POWER8 与 NVIDIA NVLink 的完美结合 Intel x86 系统内存带宽的 2 倍内存密集型工作负载 2X memory bandwidth of Intel x86 systems Memory Intensive workloads 33

Special Thanks to Collaborators Kevin Klues Rajat Phull Seetharami Seelam Guangya Liu Qian Zhang Benjamin Mahler Vikrama Ditya Yong Feng 34

Demo Build a GPU-enabled cognitive web service in a minute! 35