Shifter Configuration Guide 1.0


Contents

About Shifter Configuration Guide
Configure the Docker Daemon
Configure Shifter
Rebuild the Compute Node initramfs and cpio Files
Configure the TORQUE Prologue and Epilogue Files

About Shifter Configuration Guide

This publication includes software configuration procedures for Shifter and Docker.

This is the initial release of this publication, February 18, 2016. This version includes procedures to support Cray software release CLE 5.2.UP04 for XC and XE systems.

Typographic Conventions

Monospace            Indicates program code, reserved words, library functions, command-line prompts, screen output, file/path names, keystrokes (e.g., Enter and Alt-Ctrl-F), and other software constructs.

Monospaced Bold      Indicates commands that must be entered on a command line or in response to an interactive prompt.

Oblique or Italics   Indicates user-supplied values in commands or syntax definitions.

Proportional Bold    Indicates a graphical user interface window or element.

\ (backslash)        At the end of a command line, indicates the Linux shell line-continuation character (lines joined by a backslash are parsed as a single line). Do not type anything after the backslash or the continuation feature will not work correctly.

Scope and Audience

This publication is written for Cray internal employees and Cray customers.

Feedback

Visit the Cray Publications Portal at http://pubs.cray.com and make comments online using the Contact Us button in the upper-right corner, or email pubs@cray.com. Your comments are important to us, and we will respond within 24 hours.

Configure the Docker Daemon

The Shifter system allows users to run commands in a user-defined image (UDI). A UDI has many of the features of a Linux container. Some features (e.g., a separate PID namespace) are missing because the UDI is intended to be used on a compute node where only one user's applications run at a time.

A UDI is created from a Docker image. A user may download an image that has been exported by another site from the Docker Hub. This image can then be unpacked and packaged into a UDI for the local user to use.

Both the Docker daemon and Shifter require file system space for the local image cache and the packaged UDIs, respectively. For the Docker daemon, this space can be either a file or a disk device. For Shifter, this space is a directory on a large file system.

The Docker daemon uses a thin-pool (sparse) device for the image cache and the containers. This device/file only needs to be accessible to the daemon; it is not shared with any other processes. If the site does not have a disk available for this purpose, a regular file on a Lustre file system that is accessible on the compute nodes will work; it is attached via the loopback device.

Pick one node with an Internet connection to run the Docker daemon. In the following configuration instructions, node 29 (nid00029), a login node, is used. Run the following commands:

nid00029# mkdir /lus/scratch/udi
nid00029# cd /lus/scratch/udi
nid00029# dd if=/dev/zero of=thin-pool bs=1024k count=22532

This creates a 22 GiB + 4 MiB file (22532 MiB); the recommended starting size is 22 GiB.

If using the loopback device:

nid00029# DEV=$( losetup --find --show thin-pool )

If not using a loopback device, set the environment variable DEV to the device being used:

nid00029# DEV=/dev/sda3

Whether using a loopback device or not, use the following commands. The volume group and logical volume names must be used as given:

nid00029# pvcreate $DEV
nid00029# vgcreate vg_docker $DEV
nid00029# lvcreate --thinpool vg_docker/docker_pool --size 20G
nid00029# lvchange --activate=n vg_docker/docker_pool

If a loopback device was used, it can be detached using the following command:

nid00029# losetup --detach $DEV
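Before continuing, it can be worth confirming that the thin pool was created and left deactivated. A minimal sketch, assuming the vg_docker/docker_pool names used above; the exact attribute string varies across LVM2 versions, so the output shown is illustrative:

nid00029# lvs -o lv_name,lv_attr,lv_size vg_docker
  LV          Attr       LSize
  docker_pool twi---tz-- 20.00g

A leading "t" in the attribute field marks a thin pool, and a "-" in the fifth position (instead of "a") shows the volume is inactive, as intended at this point.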

Pre and Post Scripts

The Docker daemon start script, /etc/init.d/cray-docker, runs a script (if it exists) before the daemon starts and another script (if it exists) after the daemon is stopped.

There are templates for these files in /opt/cray/docker/default/templates. They can be copied from there for convenience and to avoid copy-and-paste errors.

The pre-start script, /etc/opt/cray/docker/docker-start, allows actions to be taken before the daemon starts, such as setting up the thin-pool file (as done previously). Create this script in the shared root:

smw:~ # ssh root@boot
boot:~ # xtopview
default/:/ # vi /etc/opt/cray/docker/docker-start
#!/bin/bash
set -e
# Wait until the thin-pool file is accessible; its file system may
# not yet be mounted when this script runs.
POOL=/lus/scratch/udi/thin-pool
while true; do
    [[ -e $POOL ]] && break
    sleep 30
done
# Attach the thin-pool file to a loopback device and activate the
# logical volume.
FILE=$( readlink -e $POOL )
LOOP_DEV=$( losetup --find --show $FILE )
[[ -z $LOOP_DEV ]] && exit 1
pvscan $LOOP_DEV
vgscan
lvscan
lvchange --activate=y vg_docker/docker_pool
true

The POOL variable should be set to the site's thin-pool file location. The loop in the script allows the thin-pool file to reside on a file system that is not mounted when the script is run; the script waits until the file is accessible. Even if a loopback file is not used, a script can be used to make sure that the physical device being used is accessible and that the logical volume is active.

Similarly, the post-stop script, /etc/opt/cray/docker/docker-stop, allows actions to be taken after the daemon stops, such as detaching the thin-pool file from the loopback device. Create the file:

default/:/ # vi /etc/opt/cray/docker/docker-stop
#!/bin/bash
set +e
POOL=/lus/scratch/udi/thin-pool
# Deactivate the logical volume if it is still active.
LINE=$( lvscan | grep vg_docker/docker_pool | grep ACTIVE )
[[ -n $LINE ]] && lvchange --activate=n vg_docker/docker_pool
# Find the loopback device backing the thin-pool file and detach it.
FILE=$( readlink -e $POOL )
LINE=$( losetup --all | grep $FILE )
[[ -n $LINE ]] && LOOP_DEV=$( echo "$LINE" | sed -e 's/:.*$//' )
[[ -n $LOOP_DEV ]] && losetup --detach $LOOP_DEV

The POOL variable should be changed to the site's thin-pool file. This script deactivates the logical volume and then detaches the loopback device.
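The detach logic in docker-stop depends on the format of losetup --all output, which prints the loop device name followed by a colon and the backing-file details. A minimal illustration of what the sed expression extracts (the sample line is illustrative, not captured from a real system):

nid00029# echo '/dev/loop0: [0802]:12345 (/lus/scratch/udi/thin-pool)' | \
sed -e 's/:.*$//'
/dev/loop0

Everything from the first colon onward is stripped, leaving only the device name to pass to losetup --detach.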

These scripts must be executable to run:

default/:/ # chmod 750 /etc/opt/cray/docker/docker-*

Please note the following administrative items:

While the Docker daemon is running, the device/file system that holds the thin-pool file will be "busy." If the thin-pool file is a regular file in a file system, it will not be possible to unmount that file system until the Docker daemon is stopped and the thin-pool file is no longer in use.

If the thin-pool file is a regular file and an unmount fails, the lsof command will not show the thin-pool file as the one keeping the file system busy; the losetup --all command will show it.

Add System Group and Enable Services

A docker group needs to be created for the Docker daemon. The following commands add the group docker as a system group in the shared root:

default/:/ # groupadd --system docker
default/:/ # exit

The Docker daemon requires several services to be enabled and a symlink to be created. This should be done in the node-specialized view for the Docker daemon node:

boot:~ # xtopview -n 29
node/29:/ # insserv boot.cgroup
node/29:/ # insserv xinetd
node/29:/ # insserv cray-docker
node/29:/ # ln -s /var/lib/docker/etc /etc/docker
node/29:/ # exit

On the Docker daemon node itself, run the following:

nid00029# mkdir -p /var/lib/docker/etc
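A quick readiness check on the daemon node can catch a missed step before moving on. A minimal sketch, assuming SLES-style chkconfig output and the group/service names configured above:

nid00029# getent group docker >/dev/null || echo "missing: docker group"
nid00029# for svc in boot.cgroup xinetd cray-docker; do \
chkconfig --list $svc 2>/dev/null | grep -qw on || echo "not enabled: $svc"; \
done
nid00029# [[ -L /etc/docker ]] || echo "missing: /etc/docker symlink"

No output means all three checks passed.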

Configure Shifter

Several files need to be configured for Shifter to work properly. This section enumerates those files and indicates what their contents should be. Each of these files should be created or edited in the shared root using the default view of xtopview:

boot:~ # xtopview

Since Shifter uses the alternate-root feature of ALPS to chroot to the UDI root, the configuration file /etc/opt/cray/cnrte/roots.conf needs to have a line added to it:

default/:/ # echo "UDI=/var/udiMount" >> /etc/opt/cray/cnrte/roots.conf

There are templates for the files mentioned in this section in /opt/cray/shifter/default/templates. They can be copied from there for convenience and to avoid copy-and-paste errors.

The Shifter configuration file, /etc/opt/cray/shifter/shifter.conf, needs to be created and populated with the following contents:

default/:/ # vi /etc/opt/cray/shifter/shifter.conf
imagepath=/lus/scratch/udi
udirootpath=/opt/cray/shifter/default
etcpath=/var/opt/cray/shifter/etc
sitemounthook=/etc/opt/cray/shifter/mount.sh
imagegateway=nid00029:7777

The imagepath value should be changed to reflect the site's choice for the location of the packaged UDIs (e.g., scratch). This location needs to be accessible from the node running the Docker daemon as well as from the compute nodes; a directory on a global file system with free space (e.g., Lustre) is a good location. The imagegateway value should be changed to a canonical name (e.g., nid00029) for the node running the Docker daemon; it must be a name that is present in the /etc/hosts file on the compute nodes. The other lines should remain untouched.

An /etc/opt/cray/shifter/image-path file needs to be created. It should use the same value for imagepath as used in the shifter.conf file. This file is read when the Docker gateway is run, and its contents are used as the location of the packaged UDIs.

default/:/ # vi /etc/opt/cray/shifter/image-path
/lus/scratch/udi
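Because shifter.conf and the image-path file must carry the same path, a small consistency check can prevent a subtle misconfiguration. A minimal sketch, assuming the key=value format shown above:

default/:/ # conf=$( sed -n 's/^imagepath=//p' /etc/opt/cray/shifter/shifter.conf )
default/:/ # file=$( cat /etc/opt/cray/shifter/image-path )
default/:/ # [[ "$conf" == "$file" ]] || echo "mismatch: $conf vs. $file"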

The /etc/opt/cray/shifter/mount.sh file needs to be created. It should contain any commands that the site wishes to have run when the UDI environment is set up. The mount script should include any file systems that should be available to all applications using Shifter. The three instances of lus/scratch in the example below need to be changed to the site's actual file system name.

default/:/ # vi /etc/opt/cray/shifter/mount.sh
#!/bin/sh
# These are the arguments passed to this script:
UDI_ROOT=$1
USERNAME=$2
set -e
# These lines generate the passwd and group files for the given user.
# This is necessary for ssh to function.
mkdir -p /var/opt/cray/shifter/etc
cp -rl /etc/opt/cray/shifter/etc_files/* /var/opt/cray/shifter/etc
/opt/cray/shifter/default/sbin/gen-auth-files $USERNAME /var/opt/cray/shifter/etc
# Create mount points inside the UDI and bind-mount the external
# file systems onto them.
mkdir -p lus/scratch
mkdir -p ufs
mkdir -p var/opt/cray/alps
mount --bind /lus/scratch lus/scratch
mount --bind /ufs ufs
mount --bind /var/opt/cray/alps var/opt/cray/alps
ln -s ufs/home home

This example does the following:

- Generates the passwd and group files with information for the given user. If the site wants the user to be able to ssh into any compute node(s) that are part of the user's job, these lines are necessary.
- Creates three directories in the UDI environment that will be used as mount points.
- Bind-mounts three file systems from outside the UDI environment to inside the environment. The /var/opt/cray/alps mount is necessary for ALPS to work properly.

Shared root configuration is now complete:

default/:/ # chmod 750 /etc/opt/cray/shifter/mount.sh
default/:/ # exit

Compute Node Configuration

The mount.sh and shifter.conf files need to be copied to their corresponding locations in the compute node image:

smw # mkdir -p /opt/xt-images/templates/default/etc/opt/cray/shifter/etc_files
smw # cd /opt/xt-images/templates/default/etc/opt/cray/shifter
smw # scp -p root@boot:/rr/current/.shared/base/default/etc/opt/cray\
/shifter/mount.sh .
smw # scp -p root@boot:/rr/current/.shared/base/default/etc/opt/cray\
/shifter/shifter.conf .

The group, passwd, and nsswitch.conf files can be customized for the site. They reside in /etc/opt/cray/shifter/etc_files and need to be present in both the compute node initramfs and the shared root:

smw # scp -p root@boot:/rr/current/.shared/base/default/etc/opt/cray\
/shifter/etc_files/group ./etc_files/
smw # scp -p root@boot:/rr/current/.shared/base/default/etc/opt/cray\
/shifter/etc_files/passwd ./etc_files/

smw # scp -p root@boot:/rr/current/.shared/base/default/etc/opt/cray\
/shifter/etc_files/nsswitch.conf ./etc_files/

The files in the /etc/opt/cray/shifter/etc_files directory are copied to /etc in the UDI. If there are files other than the three mentioned above, they will be copied as well. This allows other configuration information (e.g., LDAP) to be present and get copied.

mount.sh has already been copied into the compute node image with the scp commands above. Modify the lines shown below by adding the /sbin/chroot /dsl prefix:

smw:~ # vi mount.sh
#!/bin/sh
# These are the arguments passed to this script:
UDI_ROOT=$1
USERNAME=$2
set -e
# These lines generate the passwd and group files for the given user.
# This is necessary for ssh to function.
mkdir -p /var/opt/cray/shifter/etc
/sbin/chroot /dsl cp -rl /etc/opt/cray/shifter/etc_files/* \
/var/opt/cray/shifter/etc
/sbin/chroot /dsl /opt/cray/shifter/default/sbin/gen-auth-files $USERNAME \
/var/opt/cray/shifter/etc
mkdir -p lus/scratch
mkdir -p ufs
mkdir -p var/opt/cray/alps
mount --bind /lus/scratch lus/scratch
mount --bind /ufs ufs
mount --bind /var/opt/cray/alps var/opt/cray/alps
ln -s ufs/home home

Except for this change, the compute node copies and the shared root copies of these files must match each other; if they do not, Shifter may not work properly.
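A quick way to confirm that the chroot edit landed, and that nothing else changed, is to list the chroot-prefixed lines in the compute node copy:

smw:~ # grep -n '/sbin/chroot /dsl' mount.sh

Exactly the two commands shown above (the cp and the gen-auth-files invocations) should appear; any other output, or none, means the edit did not land as intended.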

Rebuild the Compute Node initramfs and cpio Files

The initramfs needs to be repackaged and converted into a boot image cpio. This is done on the SMW. Rebuild the boot image using the usual methods at the site, for example:

smw # /var/opt/cray/install/shell_bootimage_LABEL.sh -c -d \
-b /bootimagedir/bootimage.cpio

Here LABEL is the site-specific label in the script name. Once this step has been completed, the compute nodes can be rebooted with the new image. The Docker daemon service node needs to be rebooted as well. These reboots must be done before Shifter is usable.

Configure the TORQUE Prologue and Epilogue Files

On the TORQUE mom node(s), in /var/spool/torque/mom_priv, the prologue and epilogue files should be modified either to match the following or to have the Shifter part added.

There are templates for the files mentioned in this section in /opt/cray/shifter/default/templates/wlm/torque. They can be copied from there for convenience and to avoid copy-and-paste errors.

torque-mom# vi /var/spool/torque/mom_priv/prologue
#!/bin/bash
shifter_prologue=/opt/cray/shifter/default/libexec/cray-shifter-prologue
if [[ -x $shifter_prologue ]]; then
    $shifter_prologue $*
    [[ $? -ne 0 ]] && exit 1
fi

torque-mom# vi /var/spool/torque/mom_priv/epilogue
#!/bin/bash
shifter_epilogue=/opt/cray/shifter/default/libexec/cray-shifter-epilogue
if [[ -x $shifter_epilogue ]]; then
    $shifter_epilogue $*
fi

torque-mom# chmod 700 /var/spool/torque/mom_priv/epilogue
torque-mom# chmod 700 /var/spool/torque/mom_priv/prologue

For DataWarp and Moab TORQUE

If using DataWarp and Moab TORQUE, the following lines must be included in the prologue and epilogue scripts, respectively:

/usr/local/bin/ac_dw_prologue $@
/usr/local/bin/ac_dw_epilogue $@

torque-mom# vi /var/spool/torque/mom_priv/prologue
#!/bin/bash
/usr/local/bin/ac_dw_prologue $@
shifter_prologue=/opt/cray/shifter/default/libexec/cray-shifter-prologue
if [[ -x $shifter_prologue ]]; then
    $shifter_prologue $*
    [[ $? -ne 0 ]] && exit 1
fi

torque-mom# vi /var/spool/torque/mom_priv/epilogue
#!/bin/bash
shifter_epilogue=/opt/cray/shifter/default/libexec/cray-shifter-epilogue
if [[ -x $shifter_epilogue ]]; then
    $shifter_epilogue $*
fi
/usr/local/bin/ac_dw_epilogue $@

torque-mom# chmod 700 /var/spool/torque/mom_priv/epilogue
torque-mom# chmod 700 /var/spool/torque/mom_priv/prologue
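TORQUE invokes the prologue and epilogue with job information as positional arguments (job id, user name, and group name are the first three), which the wrappers above forward via $*. Before trusting the hooks with real jobs, they can be exercised by hand. A minimal sketch with illustrative argument values; for a job that requests no UDI, the Shifter hook should normally be a no-op and exit zero:

torque-mom# /var/spool/torque/mom_priv/prologue 123.sdb testuser testgroup
torque-mom# echo $?
0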

Configure Manager Roles for Mom Nodes

The root user must be given TORQUE "managers" access on each mom node. This applies whether or not the system has DataWarp configured.

# Connect to the TORQUE server for the host (sdb in this case)
% ssh root@sdb
sdb% module load torque
sdb% qmgr -c 'set server managers += root@nid00029'

Repeat this step for each nid that hosts a TORQUE mom.
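To confirm that the grants took effect, list the managers attribute from the TORQUE server; every mom-hosting nid should appear:

sdb% qmgr -c 'print server' | grep managers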