Deploy containers on your cluster - A proof of concept


What is an HPC cluster (in my world!)
Where do I come from? I run and maintain a bioinformatics cluster at the Bioinformatics Research Centre (BiRC), Aarhus University. E-mail: anders.dannesboe@birc.au.dk
The setup:
- 3000+ cores
- 3.5 PB parallel file system (henceforth known as /faststorage)
- SLURM as our scheduler

What is an HPC cluster (in my world!)
A bunch of servers connected together, with access to a shared file system.
- Pipelines are split into parallel pieces and run on multiple nodes at once, to achieve an accumulated speedup
- A multiuser system. Pipelines are run by unprivileged users (no root!)
- Everything is orchestrated by a scheduler, which takes care of resource sharing (a minimal batch script is sketched below). E.g.:
  - Kills jobs that take too long
  - Enforces the core+memory limits of each job
  - Packs multiple jobs from multiple users onto as few nodes as possible
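For reference, a minimal SLURM batch script of the kind our users submit; the job name, paths and tool are made up for illustration:

#!/bin/bash
#SBATCH --job-name=align-sample42   # hypothetical job name
#SBATCH --cpus-per-task=4           # the scheduler enforces this core limit...
#SBATCH --mem=8G                    # ...and this memory limit
#SBATCH --time=12:00:00             # jobs running longer than this get killed

# one parallel piece of a pipeline, reading and writing on shared storage
tool_a /faststorage/input/sample42 -o /faststorage/output/sample42.a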

What is an HPC cluster (in my world!)
What kinds of jobs do we run?
- Lots of data
  - Large input datasets, large shared reference datasets
  - Sensitive data
- Lots of different software by lots of different people
  - Versions keep on changing
- Work-in-progress pipelines
  - Batches are seldom run twice. But a batch can have 50,000 of the same job type
Everything is in flux.

Docker
Docker: A Revolutionary Change in Cloud Computing


Docker
Docker's focus: make software run the same anywhere.
- Use containers to make software OS independent
- Take over networking, to make containers datacenter-environment independent: no static/fixed IPs
- One storage model, to make it backing independent: image/container content just fills in your filesystem
Docker takes care of many of the nitty-gritty details and lets you focus on packaging your software once and for all.

What are Linux containers?
Chroot on steroids:
- Each container comes with its own OS. Spawning a container runs a new init. Every running container on the host is an independent OS running on the system
- Uses features in the Linux kernel to achieve process isolation:
  - Cgroups for resource management
  - Linux namespaces for process isolation
- Leverages OverlayFS in the data/deployment model: spawn multiple containers from the same template without copying a thing (see the sketch below)
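The OverlayFS trick is what makes the "without copying a thing" claim work: a read-only lower layer acts as the template, and each container gets its own writable upper layer. A minimal sketch (directory names are made up; requires a kernel with the overlay module):

$:> mkdir lower upper work merged   # lower = read-only template, upper = per-container writes
$:> sudo mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merged
$:> touch merged/newfile            # the write lands in upper/; nothing in lower/ is ever copied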

What are Linux containers?
Linux namespaces:
- PID namespace
- Network namespace
- UTS namespace (hostname)
- User namespace (uid/gid)
- Mount namespace
Has been a long time underway. Full support under anything but Ubuntu/Debian can be tricky.
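You can poke at namespaces directly from the shell with util-linux's unshare; a quick illustration (run as root; the hostname is made up):

$:> sudo unshare --pid --fork --mount-proc --uts /bin/bash
#:> hostname sandbox    # UTS namespace: changes the hostname only in here
#:> ps aux              # PID namespace: this bash is PID 1, host processes are invisible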

What are Linux containers?
Why is this powerful?
- Containers will work the same anywhere
- Each container is isolated. Allow unprivileged users to run anything; let them become root
- Utilize OverlayFS:
  - Spawn a new full OS in under a second
  - Spawning multiple containers from the same template takes up no extra space
- No hypervisor, just native performance. No need for syscall translation => no overhead. Run 100+ containers on one host
Back to Docker =>

Docker
Docker is by far the most popular container implementation, and its design philosophy has been adopted wholesale:
- Create docker images through recipes (Dockerfile)
- Running containers are ephemeral
- Make docker images reusable by others. Images are easy to publish, download and use
- Split your software stack into smaller units by containerizing one service at a time

Docker
Docker has gained serious traction amongst companies/developers working in the cloud. Here Docker and its philosophy help:
- Plan, structure, develop and deploy the software stack
- Lots of effort has been put into containerizing existing software stacks (also in academia)
- Restructure code under a better, more scalable model
- Cloud ready
- Get in while the buzz is hot

Docker
Some of the heavy hitters. From academia: Björn Grüning (bgruening) from the University of Freiburg.

Meanwhile in HPC...

Can we get Docker into our HPC clusters?
How can we capitalize?
- A lot of software has already been dockerized. Projects like: https://github.com/biodocker/containers
- Or easy to containerize: https://github.com/mulled/mulled
- And the list of container resources grows every day
How can we deploy all these containers, with ready-to-use software inside, on our HPC cluster?

Merging containers into cluster computing
Let's look at the pipeline:
- Individual pieces of software strung together in a chain*
- Each link in the chain takes output from the previous link and uses it as input
Instead of the actual software being the link, how about using containers? To rephrase: split your pipeline into smaller units by containerizing one link at a time.
- Makes your pipelines cluster independent**
- Much of the development can be done off-cluster, on your own system
- Write your awesome software once, and everybody can use it. #citations
- Reuse others' (a little bit less awesome) software in your pipeline
*A lattice I guess, or else we wouldn't be doing stuff in parallel
**Well, no. But a step in the right direction

Use case - The cluster user
Missing a piece of software?
- Search the web for existing images:
  - https://hub.docker.com
  - https://github.com/biodocker/containers
  - https://github.com/mulled/mulled
  - https://docker-ui.genouest.org/app/#/containers
- Or query from the cmd*:
  $:> docker search bowtie2
- Or find a link in a research paper
*This does require mulled, biodocker etc. to be set up as repos

Use case - The cluster user
No luck? Build your own container.
$:> mkdir bowtie2 && cd bowtie2
$:> vim Dockerfile
FROM ubuntu

RUN apt-get update -qq --fix-missing
RUN apt-get install -qq -y wget unzip
RUN wget -q -O bowtie2.zip https://sourceforge.net/.../bowtie2-2.2.9-linux-x86_64.zip/download
RUN unzip bowtie2.zip -d /opt/
RUN ln -s /opt/bowtie2-2.2.9 /opt/bowtie2
RUN rm bowtie2.zip

ENV PATH $PATH:/opt/bowtie2
$:> docker build -t bowtie2-2.2.9 .
$:> docker images
REPOSITORY      TAG     IMAGE ID      CREATED         SIZE
bowtie2-2.2.9   latest  49c23f71b287  9 seconds ago   289 MB
ubuntu          latest  c73a085dc378  5 days ago      127 MB
$:> docker run --rm -it bowtie2-2.2.9 bowtie2 -h
Bowtie 2 version 2.2.9 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> -U <r>} [-S <sam>] ...

Use case - The cluster user
Push our own work to Docker Hub for others to re-use:
$:> docker push bowtie2-2.2.9
- Docker images can be pushed to repositories (Docker Hub being one), and automatically pulled in when needed
- Docker Hub can monitor git repositories and rebuild a docker image on commits
- Set up a (private) docker repository on your local network that pulls content from the most relevant global repos (a sketch follows below). Each docker daemon can stream in >1 GB docker images within seconds
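One way to run such a local repository is the official registry image acting as a pull-through cache of Docker Hub; a minimal sketch (the hostname and port are examples, not part of the original setup):

# on one server on the local network:
$:> docker run -d -p 5000:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
    registry:2
# on every node, point the daemon at the mirror:
$:> docker daemon --registry-mirror=http://registry.local:5000 &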

What would we like to achieve?
- Make your life as a user easier, by reusing existing and working docker images from papers, colleagues, previous projects
- Make your life as an administrator easier, by not maintaining a plethora of software compiled from source to custom specifications
- Make our pipelines easier to rerun on a different cluster, by packaging the software into docker images that can run everywhere

What do we need?
1. Mapping of data
   Enable containers to work on the data (massive in size) on the HPC filesystem like any piece of software (within reason ;))
2. Resource limiting
   A way for the docker daemon to run under the resource management of SLURM, so that the scheduler can do resource sharing
3. Maintain security
   A cluster user should never be able to achieve privilege escalation (of any sort).
   Alice should only be able to run as alice.
   No one but Alice should be able to run as alice.

Mapping of data
Map data from host to container via bind mounts:
docker run -v /storage:/storage debian /bin/bash
Idea: make a 1-1 map of the shared storage into the container. File paths are the same outside and inside a container. Easy to work with.
Example:
#sbatch
tool_a /storage/input -o /storage/output.a
tool_b /storage/output.a -o /storage/output.b
cat /storage/output.b
#sbatch
docker run -v /storage:/storage tool_a /storage/input -o /storage/output.a
docker run -v /storage:/storage tool_b /storage/output.a -o /storage/output.b
docker run -v /storage:/storage cat /storage/output.b

Mapping of data
Problem solved. Let's crack on!

Mapping of data
Problem solved. Let's crack on... A major breach of requirement 3: Maintain security.
Docker defaults:
- Containers run as root
- Anyone in the docker group can spawn containers
- All are equal in the eyes of the daemon: alice gets to spawn just as much as root does

Mapping of data
Evil Alice: by mapping part of the host OS into a container, Alice can act like root in the host OS. What about:
docker run -v /storage/sensitive_data:/unsensitive_data debian /bin/bash
And even worse:
docker run -v /etc/shadow:/root/shadow debian /bin/bash
Read-write access to our password file!

Mapping of data
Unprivileged containers:
- Any storage that is mapped inside a container retains the restrictions of the spawning user
- Filesystems don't have multiple and separate UID/GID ranges
- Utilize the size of the UID/GID space, and shift containers into unused UIDs/GIDs to isolate them. UIDs/GIDs get translated back and forth at the container boundary
- Unprivileged containers have existed and been used in LXC for a while. A fairly new (and little-known) option in Docker

Mapping of data
How does it work?
- Assign an isolated UID space and GID space to a user
- Two new files: /etc/subuid and /etc/subgid
- Use these UIDs/GIDs inside the container*
$:> usermod --add-subuids 100000-165536 alice
$:> usermod --add-subgids 100000-165536 alice
$:> docker daemon --userns-remap alice:alice &
$:> docker run --rm -it -v /etc/shadow:/root/shadow debian /bin/bash
#:> touch /etc/shadow
#:> touch /root/shadow
touch: cannot touch '/root/shadow': Permission denied
*Available in Ubuntu since 14.04. But not in CentOS 7 yet.
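You can convince yourself of the remapping from the host side; something like this, reusing the bowtie2 image from earlier (output abridged):

$:> docker run -d bowtie2-2.2.9 sleep 1000
$:> ps -eo uid,cmd | grep 'sleep 1000'
100000 sleep 1000    # root inside the container is plain UID 100000 on the host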

Mapping of data
That was a step too far! What about reference data, input data and output data?
Solution: shift UIDs and GIDs into boring isolation, but keep the UID of the user and the GID of the project.
$:> cat /etc/subuid
alice:100000:1000
alice:1000:1
alice:101001:64535
$:> cat /etc/subgid
plants:100000:10000
plants:10000:1
plants:110000:64535
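Each line reads <name>:<first host ID>:<range length>, and the ranges are applied in order to container IDs counting from 0. So the subuid file above means (my annotation):

alice:100000:1000    # container UIDs 0-999      -> host UIDs 100000-100999 (boring isolation)
alice:1000:1         # container UID 1000        -> host UID 1000 (alice herself!)
alice:101001:64535   # container UIDs 1001-65535 -> host UIDs 101001-165535 (more isolation)

The subgid file does the same for the plants project group.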

Mapping of data
Success!
$:> docker daemon --userns-remap alice:plants &
$:> docker run --rm -it \
    -v /etc/shadow:/root/shadow \
    -v /storage:/storage debian /bin/bash
#:> touch /root/shadow
touch: cannot touch '/root/shadow': Permission denied
#:> cd /storage
#:> ls
humans lost+found plants
#:> ls humans/
ls: cannot open directory humans/: Permission denied
#:> ls plants/
some_plant.gene

Mapping of data
What did we need?
- Edit /etc/subuid and /etc/subgid to shift anything but the user UID and project GID into an isolated UID/GID range
- Multiple running docker daemons. One per <user>:<group> mapping
- Add --userns-remap to restrict container file access
- Add --group to restrict access to the docker daemon
docker daemon \
    --graph=/mnt/scratch/$user.$project/docker \
    --pidfile=/mnt/scratch/$user.$project/docker.pid \
    -H unix:///mnt/scratch/$user.$project/docker.sock \
    --group=$user_id \
    --userns-remap=$user_id:$group_id
Your users are now able to run containers on your filesystem!
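Since each of these daemons listens on its own socket, the user's docker client has to be pointed at the right one, e.g. (socket path follows the daemon invocation above, for a hypothetical alice/plants job):

$:> export DOCKER_HOST=unix:///mnt/scratch/alice.plants/docker.sock
$:> docker run --rm -v /storage:/storage bowtie2-2.2.9 bowtie2 -h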

Resource limiting
In any HPC cluster, the scheduler must have total resource control:
- Jobs are run with the privileges of the user
- Processes are subprocesses of slurmd
But:
- The docker daemon must be spawned by root
- Containers run as subprocesses of the docker daemon
So:
1. An unprivileged user must be able to start the docker daemon
2. The scheduler must be able to monitor/control the resources of docker
3. When a job is killed, all containers spawned by that job must die

Resource limiting
SLURM already uses cgroups. And that is all we need:
- Write a setuid script, start_docker, that asserts permissions and forks off a docker daemon locked to the <user>:<project>
- Run start_docker inside a job to use containers. The cgroup stays with the daemon, monitoring/limiting its resources
- Use SLURM's epilog hook to clean up afterwards (a rough sketch follows):
  - Kills the docker daemon and containers if still running
  - Deletes any container leftovers
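The epilog itself can be a few lines of shell; a sketch only, not the actual script (paths follow the daemon invocation earlier; how the project name is derived is site specific):

#!/bin/bash
# SLURM epilog: tear down the per-job docker daemon and its leftovers
scratch=/mnt/scratch/$SLURM_JOB_USER.$project   # $project: hypothetical, derived by the site
if [ -f "$scratch/docker.pid" ]; then
    kill "$(cat "$scratch/docker.pid")"         # killing the daemon also stops its containers
fi
rm -rf "$scratch/docker"                        # delete any container leftovers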

Resource limiting
Check the process tree (pstree 20238 -a, run inside the job):
slurmstepd
  └─bash
      ├─pstree 20238 -a
      └─sudo docker_daemon plants
          └─docker_daemon /usr/local/bin/docker_daemo...
              └─dockerd --graph=/mnt/scratch/alice.pl...
                  └─docker-containe -l unix:///var/ru...
7*[{docker-containe}] 14*[{dockerd}] 5*[{slurmstepd}]
And the cgroup:
alice@vm47:~$ cat /proc/self/cgroup
11:name=systemd:/user/0.user/6.session
10:hugetlb:/user/0.user/6.session
...
alice@vm47:~$ cat /proc/`pidof dockerd`/cgroup
11:name=systemd:/user/0.user/6.session
10:hugetlb:/user/0.user/6.session
...

Limitations
This is a proof of concept.
- Docker locks /etc/passwd and /etc/group. No way to inject user/project names; only UID and GID available
- Docker's --userns-remap limits a user to one project at a time. Limitations in the kernel make this unlikely to change
- Limitations in the kernel allow no more than 5 lines in subgid(!?)
  * "There is an (arbitrary) limit on the number of lines in the file. As at Linux 3.18, the limit is five lines." - user_namespaces manpage

Limitations
How about networking? How to communicate with containers on different nodes? How about RDMA?
Docker is still in very active development:
- Docker 1.8 - August 12, 2015
- Docker 1.9 - November 3, 2015
- Docker 1.10 - February 4, 2016
- Docker 1.11 - April 13, 2016
- Docker 1.12 - June 20, 2016
All saw major changes and the introduction of new concepts and features. Not all features are supported in the major distributions: Ubuntu/Debian, Arch Linux, CentOS.