Singularity: Containers for High-Performance Computing. Grigory Shamov Nov 21, 2017

Outline
- Software and High Performance Computing: installation and maintenance of the HPC software stack
- Why containers, and what they are
- What Singularity is and how it can be useful
- Version history
- Workflows and use cases
- Using Singularity on selected use cases and examples
- New features of Singularity 2.4

Resources
- Everything is on a hub: GitHub, DockerHub, SingularityHub.
- Singularity examples: https://github.com/singularityware/singularity/tree/master/examples
- Singularity documentation: http://singularity.lbl.gov/user-guide
- DockerHub: https://hub.docker.com/explore/
- SingularityHub: https://www.singularity-hub.org/
Access to Singularity:
- WestGrid users: use Grex.
  ssh your_wg_user@grex.westgrid.ca ; module load singularity/2.3.2 or module load singularity/2.4.0
- ComputeCanada new systems: use Cedar or Graham.
  ssh your_cc_user@cedar.computecanada.ca or ssh your_cc_user@graham.computecanada.ca
  Singularity will be updated there soon; for now, module load singularity gives 2.3.1.
- Some things need Singularity on a local machine or VM: install it from GitHub.

High Performance Computing
HPC usage model / architecture:
- Bare-metal performance, fast interconnect
- Clusters under the control of a resource manager/scheduler
- Scalable, parallel POSIX filesystems (Lustre, GPFS, etc.)
- Centralized HPC software stack
- Untrusted users directly access the system; flat UID space
Some of the issues with HPC:
- Stable (that is, old and server-centered) Linux distribution as the OS
- The centralized HPC software stack is great, but less than flexible
- Little provision for multi-tenancy (filesystems, user access)

HPC Software Stack
- Operating system (needs root access): kernel, glibc, OS packages (Linux distro-specific: CentOS RPMs on CC machines; APT/DEBs on Ubuntu, Pacman on Arch, etc.)
- Sysadmin-installed HPC applications: MPI, BLAS/LAPACK, HDF5/NetCDF. Many dependencies there!
  - Trying to support software for each and every scientific area and community
  - Many versions; Modules/Lmod to manage them
  - ComputeCanada has a distributed stack (CVMFS)
  - Installed (mainly) by HPC analysts
  - (On CC machines, Nix can provide OS-level packages)
- HPC applications have dynamic dependencies on the other stack components (check with ldd). On most Linux distros, at least the glibc dependency is present.
- Dynamic languages (Python, Perl, R) have their own package management systems
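The dynamic dependencies mentioned on this slide can be listed with ldd. A minimal sketch, assuming glibc-style ldd output; the helper names resolved_libs and missing_libs are hypothetical, made up for this illustration:

```shell
# Hypothetical helpers that filter `ldd` output read from stdin.
# Assumes glibc-style lines such as:
#     libc.so.6 => /lib64/libc.so.6 (0x00007f...)
#     libhdf5.so.10 => not found

# Print "name path" for every shared library that was resolved.
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $1, $3 }'
}

# Print the names of libraries the dynamic linker could not find.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" { print $1 }'
}
```

For example, `ldd ./a.out | missing_libs` would show which libraries an application built elsewhere fails to find on the current system, which is the kind of breakage a container is meant to avoid.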

Scientific Software, problems
- Dependencies on a particular Linux distribution
  - Developers like recent desktop Linux (Ubuntu)
  - HEPhysics likes recent CentOS / Scientific Linux
- Release early, release often
  - Quick access to new software
  - Raw software depending on errors in a particular dev. environment
- Dynamic languages have complex dependencies
  - Python, R, Julia: rpython, rpy2, rjava
  - Anaconda/Miniconda will bring its own everything (MPICH)
- Installation and maintenance of a large HPC software stack is a lot of work!
  - Modules work, but only to a point
  - Virtualenv/wheels work, to a point
  - Software pipelines/workflows are a nightmare
- Reproducibility is a concern
- Mobility: it would be nice to move software more easily between HPC systems
- Empower users: less waiting for HPC analysts to do things

VMs or containers?
We'd want to be able to pack the software environment, freeze it, move it, and perhaps automate/define/document it via scripts, all while staying in the traditional HPC context.
Virtual machines:
- Overhead; harder to use in an HPC context
- Root access to global/parallel filesystems
- Better isolation/security, but less control for HPC
Containers:
- Chroot on steroids
- OS-level isolation; uses the host's kernel
- Namespaces (PID, devices, network, etc.)
- Cgroups for resource control
- More of a security risk: root access escalation

Docker
- Docker is the most popular container product
- Good for (micro)services; enterprise/developer workflow
- Efficient layering model, minimal images
- Network virtualization
- Uses cgroups for resource management
- Popular with developers, including scientific ones
- Very many images are available on DockerHub
Docker-based HPC projects, some of them:
- Shifter (Docker frontend, HPC backend)
- Adaptive Docker support in Torque 6 / Moab 9
- Limiting user privileges: Udocker (Guillimin/CQ, but not only!), Socker (UOslo)
Somehow, Docker is not yet an easy fit for HPC!

Again, HPC conditions
HPC usage model / architecture:
- Bare-metal performance, fast interconnect
- Clusters under the control of a resource manager/scheduler ← (Docker's cgroups)
- Scalable, parallel POSIX filesystems (Lustre, GPFS, etc.)
- Centralized HPC software stack
- Untrusted users directly access the system; flat UID space ← (Docker's root escalation)
Some of the issues with HPC:
- Stable (that is, old and server-centered) Linux distribution as the OS ← (Docker's need for a recent kernel)
- The centralized HPC software stack is great, but less than flexible
- Little provision for multi-tenancy (filesystems, user access) ← (Docker's root escalation)
While root escalation with Docker is only a potential possibility, avoiding it still requires a recent kernel with namespace support and proper configuration, which limits mobility.

Singularity
- Singularity! Developed from scratch for HPC by Greg Kurtzer, LBNL.
- The container is run as the user and can mount the cluster's local and parallel filesystems
- Does not require special plumbing; can run under any scheduler/RM
- Can import Docker containers into its own format(s)
- Web: http://singularity.lbl.gov
- Git: https://github.com/singularityware/singularity
- Shub: https://www.singularity-hub.org/
History: exactly two years now!
- Version 1 (Nov 2015): detect and pack application dependencies. A single file / chroot; a small setuid routine to run it as the user. (runc, Flatpak, Snappy: not so HPC!)
- Versions 2 to 2.3: packing Linux OS images. Supports bind mounts to the cluster. Requires root access to build the images, but not to run them. Converts Docker containers to Singularity ones.
- Version 2.4 (most recent, Oct. 2017): lots of new features! (Getting more like Docker.)
We will be using 2.3, and then covering the new features of 2.4.

Singularity workflow (2.3-ish)
Demo: run containers; demo: TensorFlow.

Singularity 2.x definition/recipe
Examples at Git: https://github.com/singularityware/singularity/tree/master/examples
From legacy/2.2/centos.def:
=========
# The header: specifies the OS bootstrap source and method
BootStrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum

%runscript
    # %runscript determines what happens on "singularity run"
    echo "This is what happens when you run the container..."

%post
    # %post is just Bash, executed within the new container after bootstrap
    echo "Hello from inside the container"
    yum -y install vim-minimal
    mkdir -p /global/scratch
===========
$ singularity create --size 1024 centos-7.img
$ sudo singularity bootstrap centos-7.img centos.def
$ singularity run centos-7.img
$ singularity shell centos-7.img
$ singularity exec centos-7.img bash

Singularity 2.x definition/recipe
Examples at Git: https://github.com/singularityware/singularity/tree/master/examples
From legacy/2.2/ubuntu.def:
=========
# The header: specifies the OS bootstrap source and method
BootStrap: debootstrap
OSVersion: trusty
MirrorURL: http://us.archive.ubuntu.com/ubuntu/

%runscript
    # %runscript determines what happens on "singularity run"
    echo "This is what happens when you run the container..."

%post
    # %post is just Bash, executed within the new container after bootstrap
    echo "Hello from inside the container"
    sed -i 's/$/ universe/' /etc/apt/sources.list
    apt-get -y update
    apt-get -y --force-yes install vim
===========
$ singularity create --size 1024 ubuntu.img
$ sudo singularity bootstrap ubuntu.img ubuntu.def
$ singularity run ubuntu.img
$ singularity shell ubuntu.img
$ singularity exec ubuntu.img bash

Singularity workflow / Hubs
- An example with TensorFlow: pulling containers from DockerHub.
- Pulling uses a cache directory. Do not keep it in $HOME: the Grex $HOME quota is small, so images can easily exhaust it.
  export SINGULARITY_CACHEDIR=/global/scratch/$USER/.cache/
  or
  export SINGULARITY_CACHEDIR=/dev/shm/$USER
- Pulling images from DockerHub does not require root, nor does the image need to be "create"d first. Bootstrap the image from a DockerHub URL with:
  $ singularity pull docker://tensorflow/tensorflow:latest
- Use cases: Python/TensorFlow. To exec or to run the container? Do we need to pull first, or can we exec/run right away?
- Containers from SingularityHub can be accessed the same way: shub://
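The cache setup and pull from this slide can be wrapped in a small script. This is a sketch only: the function names pick_cache_dir and pull_tensorflow are hypothetical, and the fallback rule (use /dev/shm when the scratch root is not a writable directory) is an assumption added for illustration.

```shell
# Hypothetical helper: choose a cache directory outside $HOME.
# $1 is a scratch root such as /global/scratch; if it is not a writable
# directory, fall back to /dev/shm (both locations come from the slide).
pick_cache_dir() {
    user=${USER:-$(id -un)}
    if [ -d "$1" ] && [ -w "$1" ]; then
        printf '%s/%s/.cache\n' "$1" "$user"
    else
        printf '/dev/shm/%s\n' "$user"
    fi
}

# Point Singularity at that cache, then pull the TensorFlow image.
# Nothing is executed until this function is called.
pull_tensorflow() {
    SINGULARITY_CACHEDIR=$(pick_cache_dir /global/scratch)
    export SINGULARITY_CACHEDIR
    mkdir -p "$SINGULARITY_CACHEDIR"
    singularity pull docker://tensorflow/tensorflow:latest
}
```

Calling pull_tensorflow on a login node should leave the image in the current directory and the downloaded Docker layers in the cache directory rather than in $HOME.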

Singularity workflow: MPI?
- Single-node, serial or SMP/threaded jobs are straightforward.
- Can we do MPI jobs within one node? Can we go parallel across the nodes?
- Choices for where MPI lives:
  - Within the container; needs a way to kickstart
  - Outside, with MPI running containerized processes
- The outside way is expected to work with OpenMPI 2.1.x, and also with the MPICH-based ones (IntelMPI, MVAPICH2).
  $ mpirun -np 4 singularity exec [various options]... container.img ./a.out
- Containers might need access to the interconnect, which limits mobility and is difficult for average users.
  https://github.com/singularityware/singularity/issues/876
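The "outside" model from this slide can be sketched as a thin wrapper around the host mpirun. The function name mpi_exec_in_container and the DRY_RUN switch are hypothetical. The sketch also assumes that the MPI library inside the image is compatible with the host's.

```shell
# Hypothetical wrapper for the "MPI outside the container" model:
# the host mpirun starts the ranks, and each rank is a containerized process.
# Usage: mpi_exec_in_container NP IMAGE PROGRAM [ARGS...]
# With DRY_RUN set, the command line is printed instead of executed.
mpi_exec_in_container() {
    np=$1
    image=$2
    shift 2
    ${DRY_RUN:+echo} mpirun -np "$np" singularity exec "$image" "$@"
}
```

For example, `mpi_exec_in_container 4 container.img ./a.out` runs the same command line as shown on the slide, without the optional Singularity flags.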

Singularity 2.4, new features
- SquashFS immutable images and/or sandboxes instead of the ext3 chroot
- Layered image format; Docker-style layering is possible
- bootstrap is deprecated (but still there) in favor of the new build command
- Breaks image format compatibility with 2.3.
- New container standard proposed: %app definitions, per-app dependencies
- Easier to run services within the container, network virtualization: %startscript, singularity instance.start, singularity instance.list
- Automated build service on SingularityHub 2.0, integrated with Git:
  https://www.singularity-hub.org/collections/280
  https://github.com/gshamov/shub-test/blob/master/singularity
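A minimal sketch of how the %app and %startscript sections might look in a 2.4 recipe, written out by a shell function. The base image (ubuntu:16.04), the app name "hello", the instance name "demo1" and the file names are illustrative assumptions, not taken from the talk.

```shell
# Write a small 2.4-style recipe to the path given as $1.
write_example_recipe() {
    cat > "$1" <<'EOF'
Bootstrap: docker
From: ubuntu:16.04

%post
    apt-get -y update

%startscript
    # Placeholder for a real service; runs when an instance is started
    echo "instance started" >> /tmp/instance.log

%apprun hello
    echo "Hello from the hello app"
EOF
}

# Build the image, run the app, and manage an instance (needs Singularity 2.4).
# Nothing is executed until this function is called.
demo_apps_and_instances() {
    write_example_recipe Singularity.example
    sudo singularity build example.simg Singularity.example
    singularity run --app hello example.simg
    singularity instance.start example.simg demo1
    singularity instance.list
    singularity instance.stop demo1
}
```

Each %apprun section defines a separate entry point selected with --app, while %startscript is what instance.start executes.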

Singularity 2.4, the new build
singularity build [flags] target source
- Can bootstrap ext3 images as before, with --writable
- Can pull from DockerHub and SingularityHub as before
- build can also be used to convert local containers between formats (squashfs to/from ext3, etc.)
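How the target format maps to build flags in 2.4 can be sketched as follows. The helper build_cmd is hypothetical and only prints the command line, so it can be inspected before running. Depending on the source and target, the real command may need sudo.

```shell
# Hypothetical helper: print the 2.4 build command for a target format.
# Usage: build_cmd FORMAT TARGET SOURCE
#   squashfs : default immutable image
#   ext3     : writable image (--writable)
#   sandbox  : writable directory tree (--sandbox)
build_cmd() {
    case $1 in
        squashfs) echo "singularity build $2 $3" ;;
        ext3)     echo "singularity build --writable $2 $3" ;;
        sandbox)  echo "singularity build --sandbox $2 $3" ;;
        *)        echo "unknown format: $1" >&2; return 1 ;;
    esac
}
```

For instance, `build_cmd squashfs production.simg development.img` prints the command that would convert an existing ext3 image into an immutable SquashFS one, and `build_cmd ext3 ubuntu.img docker://ubuntu:16.04` prints the one that builds a writable image from DockerHub.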

Singularity workflow, 2.4
(Figure: workflow diagram. Source: Singularity 2.4 documentation.)

Thank you for your attention! Questions?