Opportunities for container environments on Cray XC30 with GPU devices


Opportunities for container environments on Cray XC30 with GPU devices. Cray User Group 2016, London. Sadaf Alam, Lucas Benedicic, T. Schulthess, Miguel Gila. May 12, 2016

Agenda
Motivation
Container technologies: Docker & Shifter
Use case: High Energy Physics with containers on XC
Use case: GPUs with containers on XC
Conclusion

Motivation

Why do we at CSCS want to use containers on HPC? Containerizing applications provides an easy and portable way of packaging complex application setups, i.e. libraries, OS dependencies, etc. This allows us to support a wider range of scientific applications and workflows on our Cray systems, and to reach communities beyond the common HPC use cases. It can also help us consolidate our users and customers on fewer but more elastic resources. But containers are not the solution for everything: existing HPC workloads don't need to be containerized.

Container technologies: Docker

What is Docker and how does it work? Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries, anything you can install on a server. This guarantees that it will always run the same, regardless of the environment it is running in. [*]
[*] https://www.docker.com/what-docker

What is Docker and how does it work? A Docker image is a file that contains a set of files, usually libraries and application binaries. Docker leverages the namespaces feature of the kernel to isolate processes. In its simplest form, Docker basically:
1. Pulls an image to the local system
2. Creates a chroot-like environment from the image (= the container)
3. Runs our application in the container, isolated from the host thanks to kernel namespaces
However, it can also do other things: isolate the network by creating NAT or bridge devices, and offer a nice GUI.
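The three basic steps above map directly onto the Docker CLI. A minimal sketch, assuming a Docker daemon is available (the image tag is chosen purely for illustration):

```shell
# Pull an image to the local system, then run a command inside a
# container created from it; --rm removes the container on exit.
docker pull centos:6
docker run --rm centos:6 cat /etc/redhat-release
```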

Docker in HPC environments. Docker is a nice tool, but it's not built for HPC environments, because it: does not integrate well with workload managers; does not isolate users on shared filesystems; requires running a daemon on all nodes; is not designed to run on diskless clients; uses NAT networking by default. Building Docker is done within a Docker container; it can be done outside, but that is a complex task (Go language, seriously?). But after all, a sysadmin can make anything work on a cluster, right? We can create (and hopefully maintain) monstrous wrappers to run Docker containers.

Container technologies: Shifter

What is Shifter and how does it work? Shifter is a container-based solution designed from the ground up for HPC environments and, in particular, Cray systems. It is an open-source tool created by NERSC and available on GitHub. It leverages the existing Docker ecosystem by using Docker images to create containers. Shifter uses loop devices and chroot mechanisms, as well as namespaces, to provide the container environment.

Shifter in HPC environments. Shifter is a great tool built for HPC! It integrates very well with workload managers (Slurm!). All user code runs in userland. No need for local storage: it can run off any mountpoint (/dsl/opt or /apps). The network and anything under /dev is exposed to containers. Building Shifter is very easy. Per-node caching (sparse XFS filesystems are awesome).
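Image management happens through the shifterimg client, which asks the image gateway to pull and convert a Docker Hub image. A sketch, with an illustrative image tag:

```shell
# Request conversion of a Docker Hub image into a Shifter image,
# then list the images the gateway knows about.
shifterimg pull docker:centos:6
shifterimg images
```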

Architecture. Shifter consists of two components: the imagegateway, which runs on an external server and converts any Docker image into a Shifter image (built in Python; requires Redis & MongoDB), and the udiroot/runtime, which runs on any compute node and chroots our application to run within the image constraints (good old C). [Diagram: the udiroot runtime on the compute nodes queries the imagegateway's RESTful API on an external node; the gateway pulls from Docker Hub or a private registry and stores converted images on scratch.]

Using Shifter:
1. The user creates a Docker image and pushes it to Docker Hub.
2. The imagegateway converts the Docker image into a Shifter image.
3. The user runs a Slurm job (via a SPANK plugin) whose containers are based on that image.
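Step 3 might look like the following Slurm batch script; the --image option is the one provided by Shifter's SPANK plugin, and the image tag is illustrative:

```shell
#!/bin/bash
#SBATCH --image=docker:centos:6
#SBATCH --nodes=1
# Each task launched through the shifter wrapper runs inside a
# container based on the converted image.
srun shifter cat /etc/redhat-release
```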

Use case: High Energy Physics with containers on XC

WLCG Swiss Tier-2. CSCS operates the cluster Phoenix on behalf of CHIPP, the Swiss Institute of Particle Physics. Phoenix runs Tier-2 jobs for ATLAS, CMS and LHCb, three experiments of the LHC at CERN and part of the WLCG (Worldwide LHC Computing Grid). WLCG jobs need and expect a RHEL-compatible OS: all software is precompiled and exposed on a cvmfs [*] filesystem. But Cray XC compute nodes run CLE, a modified version of SLES 11 SP3. So, how do we get these jobs to run on a Cray?
[*] https://cernvm.cern.ch/portal/filesystem

How do we get WLCG jobs to run on a Cray? Mount cvmfs natively on the compute nodes using a preloaded cache: the cvmfs process runs on the compute nodes, within the DSL environment, and the cvmfs cache is located on scratch and contains everything that CERN exposes. Compute nodes see /var/cvmfs/{atlas,cms,lhcb}.cern.ch and get 100% cache hits. Use Shifter to confine the environment of a job to a RHEL-compatible image with a set of Grid packages (globus* and a few others) installed. Connect all this to the WLCG with ARC.
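As a sketch, the cvmfs client configuration on the compute nodes could look roughly like this; all paths and values are illustrative assumptions, not CSCS's actual settings:

```shell
# /etc/cvmfs/default.local (hypothetical excerpt)
CVMFS_REPOSITORIES=atlas.cern.ch,cms.cern.ch,lhcb.cern.ch
CVMFS_CACHE_BASE=/scratch/cvmfs-cache    # cache preloaded on scratch
CVMFS_HTTP_PROXY=DIRECT                  # no proxy; cache is preloaded
```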

That's it! Using Shifter, we are able to run unmodified ATLAS, CMS and LHCb production jobs on a Cray XC TDS. Jobs see standard CentOS 6 containers. Nodes are shared: multiple single-core and multi-core jobs, from different experiments, can run on the same compute node. Job efficiency is comparable on both systems.
JOBID USER     ACCOUNT NAME           NODELIST ST REASON START_TIME END_TIME     TIME_LEFT  NODES CPU
82471 atlasprd atlas   a53eb5f8_34f0_ nid00043 R  None   15:03:33   Thu 15:03    1-23:54:18 1     8
82476 cms04    cms     gridjob        nid00043 R  None   15:08:39   Tomorr 03:08 11:59:24   1     2
82451 lhcbplt  lhcb    gridjob        nid00043 R  None   15:00:10   Tomorr 03:00 11:50:55   1     2
82447 lhcbplt  lhcb    gridjob        nid00043 R  None   14:59:31   Tomorr 02:59 11:50:16   1     2
82448 lhcbplt  lhcb    gridjob        nid00043 R  None   14:59:31   Tomorr 02:59 11:50:16   1     2
82449 lhcbplt  lhcb    gridjob        nid00043 R  None   14:59:31   Tomorr 02:59 11:50:16   1     2
82450 lhcbplt  lhcb    gridjob        nid00043 R  None   14:59:31   Tomorr 02:59 11:50:16   1     2
82446 lhcbplt  lhcb    gridjob        nid00043 R  None   14:49:01   Tomorr 02:49 11:39:46   1     2
82444 lhcbplt  lhcb    gridjob        nid00043 R  None   14:48:01   Tomorr 02:48 11:38:46   1     2
82445 lhcbplt  lhcb    gridjob        nid00043 R  None   14:48:01   Tomorr 02:48 11:38:46   1     2

Use case: GPUs with containers on XC

Containerizing GPU applications. Containerizing GPU applications provides an easy and portable way of packaging complex application setups, with several benefits: it facilitates collaboration, gives reproducible builds, isolates individual GPU devices, allows running across heterogeneous CUDA toolkit environments, and requires only the NVIDIA driver to be installed on the host.

The Docker catch. Docker containers are both hardware-agnostic and platform-agnostic by design. This is not the case when using GPUs, since the application is using specialized hardware (which shows up on the system as special character devices), and it requires the NVIDIA kernel driver to be installed.

Collaboration with NVIDIA. Mutual engineering and testing efforts to find a solution for Docker and Shifter. An early prototype solution was to fully install the kernel driver inside the container, but: the version of the host driver had to exactly match the driver version installed in the container; container images had to be built locally on each machine, i.e., could not be shared; and one needs to adhere to intellectual-property regulations, i.e., not embed proprietary code in a potentially shareable image without proper consent.

Collaboration with NVIDIA: the solution. Shifter already provides the required character devices (/dev/nvidiaX) to the container. The driver files are mounted when starting the container on the target machine, using the pre-mount hooks made available by Shifter. We then alter the runtime library search configuration to make the container aware of the newly available libraries. This makes images agnostic to the NVIDIA driver and capable of running in our environment without embedding any driver in the image.
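In Shifter's site configuration this mechanism can be sketched with the siteFs and sitePreMountHook settings; the paths and the hook script below are hypothetical, shown only to illustrate the idea:

```shell
# udiRoot.conf (illustrative excerpt)
# Bind-mount the host's NVIDIA driver libraries into every container...
siteFs=/opt/nvidia/lib:/opt/shifter/nvidia
# ...and run a hypothetical hook that adds that path to the image's
# ld.so.conf and reruns ldconfig inside the image root.
sitePreMountHook=/etc/shifter/premount-nvidia.sh
```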

No overhead. Testing done so far shows no overhead in terms of GPU performance when running within Shifter containers. STREAM benchmark within Shifter:
lucasbe@santis01 ~/shifter-gpu> sbatch ./nvidia-docker/samples/cuda-stream/benchmark.sbatch
Submitted batch job 496
lucasbe@santis01 /scratch/santis/lucasbe/jobs> cat shifter-gpu.out.log
Launching GPU stream benchmark on nid00012...
STREAM Benchmark implementation in CUDA
Array size (double precision) = 1073.74 MB
using 192 threads per block, 699051 blocks
Function  Rate (GB/s)  Avg time(s)  Min time(s)  Max time(s)
Copy:     184.3169     0.01167758   0.01165104   0.01170397
Scale:    183.1849     0.01175387   0.01172304   0.01178598
Add:      180.3075     0.01790012   0.01786518   0.01792288
Triad:    180.1056     0.01790700   0.01788521   0.01794291

A common approach. The NVIDIA DGX-1 uses the same engineered solution for the management of its software stack: https://github.com/nvidia/nvidia-docker
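On a plain Docker host, nvidia-docker applies the same idea: a thin wrapper that mounts the host's driver files and GPU devices into the container at start time. A minimal sketch, assuming the wrapper and an NVIDIA driver are installed (image tag illustrative):

```shell
# The wrapper injects /dev/nvidia* and the driver libraries, so the
# image itself ships no driver.
nvidia-docker run --rm nvidia/cuda nvidia-smi
```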

Conclusion

Conclusion. We like Shifter! It allows us to run workloads that have traditionally been difficult to port to Cray systems. It helps our users package complex applications and reproduce results over time. And it's easy to use. But this is not all or nothing: things that already run on our Cray systems don't need to change.

Questions?

Thank you for your attention.

Extra slides