Can we boost HPC performance further? Integrating IBM POWER servers with GPUs into an OpenStack environment


Can we boost HPC performance further? Integrating IBM POWER servers with GPUs into an OpenStack environment. Ankit Purohit, Takeaki Matsumoto. Transform your business, transcend expectations with our technologically advanced solutions.

Self-Introduction
Ankit Purohit (a.purohit@ntt.com), NTT Communications, Technology Development: High Performance Computing, GPU.
Takeaki Matsumoto (takeaki.matsumoto@ntt.com), NTT Communications, Technology Development: R&D for OpenStack, Ops for Private Cloud.

Previous talk at OpenPOWER Summit 2018, March 19, 2018, Las Vegas. OpenPOWER Summit website: https://openpowerfoundation.org/summit-2018-03-us/ Co-speaker: Yutaka Kawai, IBM Japan. Our talk's video: https://www.youtube.com/watch?v=l4g6smtgcou&feature=youtu.be Topics: KVM on POWER; many other benchmarks.

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

Background: NTT Communications is the largest telecommunications company in Japan, with subsidiaries and offices in over 110 cities worldwide, and is part of a Fortune Global 100 company. Our team provides a GPU cloud using OpenStack for in-house users' experimental usage, e.g. the AI communication engine COTOHA (http://www.ntt.com/en/services/application/cotoha.html) and Deep Learning training on customer data (time series).

Our OpenStack Environment: x86 servers (as compute nodes) with NVIDIA K10, M60, and P100 GPUs. (Diagram; image source: https://www.openstack.org/software/)

Motivation to try the IBM POWER system: even with the same GPU card, can a different server architecture bring better performance?
Intel-based system (DGX-1): CPU and GPU are connected via PCIe (32 GB/s); bandwidth between CPU sockets is 64 GB/s; bandwidth between CPU and memory is 76.8 GB/s.
IBM POWER8 system (Minsky): CPU and GPU are connected via NVLink (80 GB/s); bandwidth between CPU sockets is 76.8 GB/s; bandwidth between CPU and memory is 115 GB/s.
(Diagram of the two topologies.)

Goal: How can we boost performance further with POWER?

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

Benchmark program: nbody. nbody is one of the CUDA sample programs. It computes single- or double-precision N-body interactions on the GPU and reports the result in GFLOPS; it can also run on the CPU only.
$ ./nbody -benchmark -numbodies=2048000 -numdevices=1
-benchmark : run the benchmark to measure performance
-numbodies : number of bodies (>= 1) to run in the simulation (2048000 for the GPU benchmark, 20480 for the CPU benchmark)
-numdevices : number of CUDA devices (> 0) to use for the simulation
-cpu : run the n-body simulation on the CPU
-fp64 : use double-precision floating-point values for the simulation

Benchmark program: nbody. We use nbody to emulate a memory-intensive workload: the GPU directly accesses data from host memory (main memory) many times via zero-copy, so NVLink (or PCIe) can become the bottleneck. (Diagram: nbody data flow between main memory, the CPU, and the GPUs' memories.)

Benchmark Result: POWER8 baremetal (1/2), with the default server configuration. Workload: numbodies=2048000, FP32, on Minsky with RHEL 7.3. (Chart: 1 GPU, 2 GPUs, 2 GPUs, 4 GPUs.) When using 4 GPUs, performance is lower than with 2 GPUs because it does not scale; when using 2 GPUs, specifying different GPUs gives different performance. Why?
T. Kamenoue, M. Mitsugi, and Y. Kawai, "The optimization of nbody simulation on Multi-GPU environment," in Proc. the 80th National Convention of Information Processing Society of Japan (IPSJ), Tokyo, Japan, Mar. 2018, pp. 1-25,26.

A Solution: Memory Interleave. What does memory interleave actually do? It spreads allocations equally across the memory of all nodes (CPU sockets) in a round-robin way, so I/O access can be balanced; it works well for the nbody benchmark (FP32). How to execute:
$ numactl --interleave=all ./nbody
or
$ numactl -i all ./nbody
(Diagram: interleave disabled (default) vs. interleave enabled.)
T. Kamenoue, M. Mitsugi, and Y. Kawai, "The optimization of nbody simulation on Multi-GPU environment," in Proc. the 80th National Convention of Information Processing Society of Japan (IPSJ), Tokyo, Japan, Mar. 2018, pp. 1-25,26.
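As a quick sanity check (not shown in the slides), the host's NUMA layout and the benchmark's per-node memory usage can be inspected while it runs; numactl and numastat are the standard tools, and the pgrep pattern assumes the binary is named nbody:
$ numactl --hardware          # list the NUMA nodes and their memory sizes
$ numastat -p $(pgrep nbody)  # per-node memory usage of the running nbody process
With interleave enabled, the second command should show allocations spread roughly evenly across both nodes.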

What happens if interleave is disabled? Workload: FP32, numbodies=2048000, 4 GPUs, interleave disabled. GPU0 and GPU1 always read from the CLOSE memory; GPU2 and GPU3 always read from the FAR memory.
Elapsed time per iteration: GPU0: 4.3-4.4 s; GPU1: 4.3-4.4 s; GPU2: 9.2-9.10 s; GPU3: 9.2-9.10 s.
Benchmark result: 8673 GFLOP/s.
(Diagram: two POWER8 CPUs, each with 115 GB/s to its system memory and two P100 GPUs attached over 80 GB/s NVLink; timeline of one iteration.)
T. Kamenoue, M. Mitsugi, and Y. Kawai, "The optimization of nbody simulation on Multi-GPU environment," in Proc. the 80th National Convention of Information Processing Society of Japan (IPSJ), Tokyo, Japan, Mar. 2018, pp. 1-25,26.

What happens if interleave is enabled? Workload: FP32, numbodies=2048000, 4 GPUs, interleave enabled. GPU0 and GPU1 always read half the data from the CLOSE memory and half from the FAR memory; all GPUs read the same way.
Elapsed time per iteration: GPU0: 5.2-5.3 s; GPU1: 5.2-5.3 s; GPU2: 5.2-5.3 s; GPU3: 5.2-5.3 s.
Benchmark result: 15969 GFLOP/s.
(Diagram: same topology as the previous slide; timeline of one iteration.)
T. Kamenoue, M. Mitsugi, and Y. Kawai, "The optimization of nbody simulation on Multi-GPU environment," in Proc. the 80th National Convention of Information Processing Society of Japan (IPSJ), Tokyo, Japan, Mar. 2018, pp. 1-25,26.

Benchmark Result: POWER8 baremetal (2/2), with memory interleave enabled. Workload: numbodies=2048000, FP32, on Minsky with RHEL 7.3. (Chart: 1 GPU, 2 GPUs, 2 GPUs, 4 GPUs.) Now it scales: the 4-GPU case has become faster than the 2-GPU case.
T. Kamenoue, M. Mitsugi, and Y. Kawai, "The optimization of nbody simulation on Multi-GPU environment," in Proc. the 80th National Convention of Information Processing Society of Japan (IPSJ), Tokyo, Japan, Mar. 2018, pp. 1-25,26.

Benchmark Result: POWER8 vs. DGX-1 baremetal. nbody results as the number of GPUs increases. Workload: numbodies=2048000, FP32. (Chart: GFLOP/s for 1, 2, and 4 GPUs, POWER8 vs. DGX-1.) The current Intel-architecture machine cannot benefit from memory interleave because of its narrower I/O bandwidth.

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

How to integrate POWER8 into OpenStack. (Diagram: a controller (x86) running nova-api, nova-scheduler, and nova-conductor; nova-compute running on two x86 compute nodes and one ppc64le compute node.)

How to integrate POWER8 into OpenStack: Linux can run on POWER8; KVM can run on POWER8; OpenStack can run on POWER8 (a Cloud Archive repository is available). Basically, the same procedure as on x86 can be used.

How to integrate POWER8 into OpenStack. For the GPUs we need KVM PCI passthrough. KVM support: the qemu package changelog (1:2.6.1+dfsg-0ubuntu2, xenial) notes "Enable GPU Passthru for ppc64le" (https://launchpad.net/bugs/1541902). IOMMU (like Intel VT-d): on POWER servers, the IBM Translation Control Entry (TCE) is available.

How to integrate POWER8 into OpenStack. Environment: OpenPOWER IBM S822LC for HPC "Minsky"; CPU: 20 cores (logical: 160 cores); MEM: 1 TB; GPU: 4x NVIDIA P100 (with NVLink). OS: Ubuntu 16.04.4 (kernel 4.15.0-13-generic). Software: KVM 2.11, Nova 17.0.1 (Queens).

How to integrate POWER8 into OpenStack. Configuration:
Kernel parameters: vfio-pci.disable_idle_d3=1
Disable SMT:
$ ppc64_cpu --smt=off
Disable the nouveau driver:
$ cat /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
$ sudo update-initramfs -u
$ reboot
$ lsmod | grep nouveau

How to integrate POWER8 into OpenStack. Nova configuration.
Compute node: ensure the PCI device ID:
$ lspci -nn | grep -i nvidia
0002:01:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:15f9] (rev a1)
nova.conf:
[DEFAULT]
pci_passthrough_whitelist = {"vendor_id":"10de","product_id":"15f9"}
Controller node, nova.conf:
[DEFAULT]
pci_alias = {"vendor_id":"10de", "product_id":"15f9", "name": "P100"}
[filter_scheduler]
enabled_filters = ...,PciPassthroughFilter
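After editing nova.conf, the affected services have to be restarted for the whitelist and alias to take effect; a minimal sketch for the Ubuntu packages used here (service names may differ in other deployments):
# On the ppc64le compute node:
$ sudo systemctl restart nova-compute
# On the controller node:
$ sudo systemctl restart nova-api nova-scheduler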

Our OpenStack Environment after integration: x86 servers (NVIDIA K10, M60, and P100 GPUs) plus POWER8 servers (NVIDIA P100 GPUs). (Diagram; image source: https://www.openstack.org/software/)

Benchmark of the OpenStack-integrated VM.
Instance flavor: vCPU: 16; Mem: 120 GB; Disk: 160 GB.
Metadata:
pci_passthrough:alias=p100:4
hw:mem_page_size=16384
hw:numa_nodes=2
GPU environment: NVIDIA driver 390.12, CUDA 9.1.
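A minimal sketch of how such a flavor could be created with the OpenStack CLI (the flavor name gpu.p100x4 is illustrative, and the alias in pci_passthrough:alias must match the pci_alias name configured in nova.conf above):
$ openstack flavor create --vcpus 16 --ram 122880 --disk 160 gpu.p100x4
$ openstack flavor set gpu.p100x4 \
    --property "pci_passthrough:alias"="P100:4" \
    --property "hw:mem_page_size"=16384 \
    --property "hw:numa_nodes"=2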

Benchmark of the OpenStack-integrated VM: nbody benchmark results.
$ numactl -i all ./nbody -benchmark -numbodies=2048000
(Chart: results for 1, 2, and 4 GPUs.)

Benchmark of the OpenStack-integrated VM: CPU-GPU memory bandwidth benchmark results.
$ ./bandwidthTest
(Chart.)

Benchmark of the OpenStack-integrated VM: CPU-GPU memory bandwidth benchmark results.
$ ./bandwidthTest
(Chart: the measured bandwidth is lower than expected.) Why?

Benchmark of the OpenStack-integrated VM: NVLink implementation. (Diagram: physically, the CPU connects to the GPU over NVLink, about 2.5x PCIe bandwidth; Linux recognizes this as the GPU on a PCI bus plus separate "NVLink Device" PCI devices.)

Benchmark of the OpenStack-integrated VM: OpenStack attaches only the GPU. (Diagram: with plain PCI passthrough the VM sees the GPU over a PCIe x8 link, while the NVLink devices stay behind on the host.)

Benchmark of the OpenStack-integrated VM: does passing through all three devices (the GPU plus its two NVLink devices) solve this issue? (Diagram: VM with the GPU and both NVLink devices passed through.)

Benchmark of the OpenStack-integrated VM: GPU loc-code.
$ lspci -d 10de:15f9
0002:01:00.0 3D controller: NVIDIA Corporation Device 15f9 (rev a1)
0003:01:00.0 3D controller: NVIDIA Corporation Device 15f9 (rev a1)
000a:01:00.0 3D controller: NVIDIA Corporation Device 15f9 (rev a1)
000b:01:00.0 3D controller: NVIDIA Corporation Device 15f9 (rev a1)
$ cat /sys/bus/pci/devices/0002\:01\:00.0/of_node/ibm\,loc-code
GPU1
$ cat /sys/bus/pci/devices/0003\:01\:00.0/of_node/ibm\,loc-code
GPU2
$ cat /sys/bus/pci/devices/000a\:01\:00.0/of_node/ibm\,loc-code
GPU3
$ cat /sys/bus/pci/devices/000b\:01\:00.0/of_node/ibm\,loc-code
GPU4

Benchmark of the OpenStack-integrated VM: NVLink devices and their connections.
$ lspci -d 1014:04ea
0004:00:00.0 Bridge: IBM Device 04ea
0004:00:00.1 Bridge: IBM Device 04ea
0004:00:01.0 Bridge: IBM Device 04ea
0004:00:01.1 Bridge: IBM Device 04ea
0005:00:00.0 Bridge: IBM Device 04ea
0005:00:00.1 Bridge: IBM Device 04ea
0005:00:01.0 Bridge: IBM Device 04ea
0005:00:01.1 Bridge: IBM Device 04ea
$ cat /sys/bus/pci/devices/0004\:00\:00.0/of_node/ibm\,loc-code
GPU2
$ cat /sys/bus/pci/devices/0004\:00\:00.1/of_node/ibm\,loc-code
GPU2
$ cat /sys/bus/pci/devices/0004\:00\:01.0/of_node/ibm\,loc-code
GPU1
$ cat /sys/bus/pci/devices/0004\:00\:01.1/of_node/ibm\,loc-code
GPU1
$ cat /sys/bus/pci/devices/0005\:00\:00.0/of_node/ibm\,loc-code
GPU4
$ cat /sys/bus/pci/devices/0005\:00\:00.1/of_node/ibm\,loc-code
GPU4
$ cat /sys/bus/pci/devices/0005\:00\:01.0/of_node/ibm\,loc-code
GPU3
$ cat /sys/bus/pci/devices/0005\:00\:01.1/of_node/ibm\,loc-code
GPU3
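A small helper (hypothetical, not from the slides) that prints the loc-code of every GPU and NVLink bridge in one pass, using the same sysfs paths as above, makes it easy to see which two bridges belong to each GPU:
$ for dev in $(lspci -D -d 10de:15f9 | awk '{print $1}') $(lspci -D -d 1014:04ea | awk '{print $1}'); do printf '%s -> ' "$dev"; cat "/sys/bus/pci/devices/$dev/of_node/ibm,loc-code"; echo; done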

Benchmark of the OpenStack-integrated VM: add the NVLink devices (by hand) to instance-000000xx.xml:
~~~
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0002' bus='0x01' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x8' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0004' bus='0x00' slot='0x01' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x9' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0004' bus='0x00' slot='0x01' function='0x1'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x9' function='0x1'/>
</hostdev>
~~~
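One way to apply such hand edits (a sketch, not necessarily how the authors did it) is to edit the domain with virsh and restart the instance so the extra <hostdev> entries take effect:
$ virsh edit instance-000000xx      # add the <hostdev> entries shown above
$ virsh shutdown instance-000000xx
$ virsh start instance-000000xx
Note that Nova regenerates the libvirt XML on hard reboots and migrations, so edits like this do not persist; the wrapper-script approach shown a few slides later avoids that.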

Benchmark of the OpenStack-integrated VM: CPU-GPU memory bandwidth benchmark results with the NVLink devices added. (Chart.)

Benchmark of the OpenStack-integrated VM: nbody benchmark results with the NVLink devices added. (Chart: 1, 2, and 4 GPUs.)

How can we manage NVLink devices? OpenStack doesn't care about device connections: it keeps one pool of 1014:04ea NVLink devices and one pool of 10de:15f9 GPUs, so a request like P100:1,NVLink:2 can be satisfied with NVLink devices that belong to a different GPU. (Diagram of the two independent device pools.)

How can we manage NVLink devices? Ideally there would be a device_set_p100 pool in which each entry groups a GPU with its own two NVLink devices, and a request would simply ask for device_set_p100:1. (Diagram of the grouped pool.)

How can we manage NVLink devices? Our solution: add a simple script between libvirt and QEMU. Rename qemu-system-ppc64 to qemu-system-ppc64.orig and install the script as qemu-system-ppc64. (Diagram: Nova requests a P100 -> libvirt invokes the script -> the script adds the NVLink device parameters -> QEMU launches the VM with the P100 and its NVLink devices.)
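The script itself is not shown in the slides; below is a minimal bash sketch of the idea, with a hard-coded mapping for GPU1 taken from the loc-code listings above (a real script would cover all four GPUs and ideally build the mapping dynamically):
#!/bin/bash
# Installed as /usr/bin/qemu-system-ppc64; the real binary was renamed to *.orig.
REAL_QEMU=/usr/bin/qemu-system-ppc64.orig
EXTRA=()
# If libvirt passed through GPU1 (0002:01:00.0), append its two NVLink bridges.
case "$*" in
  *0002:01:00.0*)
    EXTRA+=(-device vfio-pci,host=0004:00:01.0)
    EXTRA+=(-device vfio-pci,host=0004:00:01.1)
    ;;
esac
exec "$REAL_QEMU" "$@" "${EXTRA[@]}"
The other three GPUs would get analogous case branches mapping them to their own NVLink bridge addresses.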

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

Conclusion. How can we boost performance further with POWER? Memory interleave may be required to get maximum performance. Add POWER as a compute node into OpenStack, and specify the GPU and its NVLink devices to pass through to the VM. POWER8 gives better performance than x86 in some cases, since it has a powerful NVLink CPU-GPU connection. With OpenStack, some limitations exist: SMT is not available, and NVLink requires extra device allocation that OpenStack doesn't support today.

Agenda: Background (Our OpenStack GPU cloud; Motivation for using POWER server) / Goal (Can we boost performance further with POWER?) / Approach (Unleash POWER's full performance as a baremetal server; Integrate the POWER server into the OpenStack cloud) / Conclusion / Another choice: Kubernetes

Another option: what about containers?

Another option: how do we manage containers and GPUs?

Another option: Kubernetes. It schedules containers, can integrate with OpenStack, and supports GPUs. GPU scheduling requirements: NVIDIA drivers ~= 361.93, the Device Plugin feature, the NVIDIA device plugin for Kubernetes, and nvidia-docker.

Another option. (Diagram: the GPU container stack, from top to bottom: Device Plugin feature, NVIDIA device plugin for Kubernetes, nvidia-docker, NVIDIA driver, NVIDIA GPU.)

Another option: the Device Plugin feature.
For Kubernetes <= 1.9, add a kubelet exec parameter: "--feature-gates=DevicePlugins=true". Example for a cluster deployed by kubeadm:
$ cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep KUBELET_EXTRA_ARGS=
Environment="KUBELET_EXTRA_ARGS=--feature-gates=DevicePlugins=true"
For Kubernetes >= 1.10, the Device Plugins feature is Beta and enabled by default.
Note: if you deploy Kubernetes using kubeadm and the controller is x86, you have to do something like:
$ docker tag gcr.io/google_containers/kube-proxy-ppc64le:v1.9.2 gcr.io/google_containers/kube-proxy:v1.9.2

Another option: the NVIDIA device plugin for Kubernetes (https://github.com/nvidia/k8s-device-plugin). Build the image for ppc64le:
$ docker build . -t nvidia/k8s-device-plugin:1.9

Another option: nvidia-docker (2.0) supports NVLink devices, but ppc64le packages are not available yet. nvidia-docker depends on the following packages: libnvidia-container (https://github.com/nvidia/libnvidia-container) and nvidia-container-runtime (https://github.com/nvidia/nvidia-container-runtime). They can now be installed from the NVIDIA official repository: https://nvidia.github.io/nvidia-docker/

Another option. Change the default runtime:
$ cat /etc/docker/daemon.json
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
Enable the NVIDIA device plugin:
$ kubectl create -f https://raw.githubusercontent.com/nvidia/k8s-device-plugin/v1.9/nvidia-device-plugin.yml
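The daemon.json contents are not reproduced in this transcription; a typical nvidia-docker 2.0 configuration that makes the NVIDIA runtime the default looks like the following (check the nvidia-docker documentation for the authoritative form), and Docker itself also needs a restart (sudo systemctl restart docker) after changing it:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}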

Another option. Ensure the GPU resource is available:
$ kubectl describe node

Another option. Run a test workload to confirm the GPU resource is usable:
$ kubectl apply -f bandwidth-test.yml
$ kubectl logs bwt-pod
(bandwidth-test.yml is shown on the slide.)
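bandwidth-test.yml itself appears only as an image in the slides; a minimal pod spec of this kind requests one GPU through the device plugin's nvidia.com/gpu resource. The image name and test command below are illustrative assumptions, not taken from the slides:
apiVersion: v1
kind: Pod
metadata:
  name: bwt-pod
spec:
  restartPolicy: Never
  containers:
  - name: bandwidth-test
    image: nvidia/cuda-ppc64le:9.1-devel     # illustrative ppc64le CUDA image
    command: ["./bandwidthTest"]             # assumes a pre-built CUDA sample binary in the image
    resources:
      limits:
        nvidia.com/gpu: 1                    # resource advertised by the NVIDIA device plugin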

Another option: CPU-GPU memory bandwidth benchmark results. (Chart.)

Thank you!

References
OpenStack Docs: Attaching physical PCI devices to guests - https://docs.openstack.org/nova/pike/admin/pci-passthrough.html
Device Plugins - Kubernetes - https://kubernetes.io/docs/concepts/cluster-administration/device-plugins/
Feature Gates - Kubernetes - https://kubernetes.io/docs/reference/feature-gates/
GitHub - NVIDIA/k8s-device-plugin - https://github.com/nvidia/k8s-device-plugin
GitHub - NVIDIA/nvidia-docker - https://github.com/nvidia/nvidia-docker