Using Docker in High Performance Computing in OpenPOWER Environment

Zhaohui Ding, Senior Product Architect
Sam Sanjabi, Advisory Software Engineer
IBM Platform Computing
Join the conversation at #OpenPOWERSummit

Agenda
- What is LSF?
- What is Docker?
- LSF & Docker
- Next steps
- Q & A

What is LSF?
Users: USER 1 (Project A), USER 2 (Project B), USER 3 (Project C)
A typical user request: "I need access to 32 RHEL 6.1 host computers connected via InfiniBand, licensed to run LS-DYNA with Platform MPI and having at least 4 GB of RAM per node, for four hours."

What is LSF?
user2% bsub -n 32 -P ProjectB \
    -R "rhel61 && mem>4000 && ibswitch && lsdyna" \
    -app lsdyna -k "chkpointdir method=lsdyna" \
    -R "rusage[mem=3500:lsdyna=32]" \
    dyna_simulation.sh

LSF Architecture
LSF supports multiple scheduling policies that can be combined in flexible ways.
[Diagram: IBM Platform LSF scheduler. Labels: Submitted Workloads; esub; Pre-exec; jobstarter; Post-exec; Host Workload Manager; Cluster Workload Manager; Load Information Manager (lim); Multicluster; Local Clusters; Remote Clusters. Plug-in schedulers: Preemption, Fairshare, Allocation Units, Resource Reservation, Advance Reservation, Parallel, User-Defined.]

What is Docker?
- A lightweight container technology built on top of Linux containers, cgroups and AUFS
- A hot topic in cloud and big data: develop, ship and run applications anywhere, and solve "dependency hell" challenges

Docker Architecture
- The Docker Daemon
- The Docker Client
- Inside Docker: Docker Images, Docker Registries, Docker Containers
[Figure: Docker architecture diagram showing Docker Hub repositories and images such as ubuntu. Source: http://docs.docker.com/article-img/architecture.svg]

OpenPOWER

Containers in HPC
- Not a new idea
    - Workload resource isolation
    - Process tracking
    - Job control
    - Checkpoint/restart, live migration
- Examples:
    - IBM AIX WLM
    - Linux control groups
    - Virtual machines

Docker in HPC

Potential Benefits
- Resource guarantee and performance isolation
    Docker leverages Linux control groups.
- Application encapsulation and cloud mobility
    Docker makes it easy.
- Application lifecycle management
    Different versions of applications can coexist in the same environment.
- Consistency, repeatability and compliance
    Docker provides a way to run applications in a known, predefined environment.
- Lightweight, fast and transparent
    Application performance identical to bare-metal systems, and fast management operations.

How to use it?

Docker environment deployed (the host reports the "docker" resource):

[lsfadmin@ppcrhel7e bin]# lshosts
HOST_NAME   type     model   cpuf   ncpus  maxmem  maxswp  server  RESOURCES
ppcrhel7e   LINUXPP  POWER8  250.0  19     128G    3.9G    Yes     (mg docker)

Submit an interactive job, passing the Docker image name in the -a option:

[lsfadmin@ppcrhel7e ~]$ bsub -Is -a "docker(apollos/ubuntu-ppcle:trusty)" /bin/sh
Job <105> is submitted to default queue <interactive>.
<<Waiting for dispatch...>>
<<Starting on ppcrhel7e>>
$ echo $LSB_JOBID
105
$ hostname
lsf_docker-105
$ cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.1 LTS"

The container is named after the cluster name and the LSF job ID:

[root@ppcrhel7e bin]# lsrun -m ppcrhel7e docker ps
CONTAINER ID   IMAGE                         COMMAND      CREATED         STATUS         PORTS   NAMES
f006a5a8dd7c   apollos/ubuntu-ppcle:trusty   "/bin/bash"  2 minutes ago   Up 2 minutes           lsf_docker-105

How does it work?
- Leverages the existing IBM Platform LSF plug-in mechanisms (esub, job starter, job control, etc.)
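As a rough illustration of what a job-starter-style wrapper might do, the sketch below parses an application string of the form docker(image), as passed to bsub -a in the earlier session, and assembles a docker run command whose container name and hostname follow the <cluster name>-<job ID> pattern seen in the docker ps output. The function names and the parsing rule are assumptions made for illustration; they are not taken from the actual LSF integration.

```python
import re
import shlex

# Hypothetical pattern for the string given to `bsub -a`, e.g. "docker(repo/image:tag)".
APP_PATTERN = re.compile(r"^docker\((?P<image>[^()\s]+)\)$")


def parse_docker_app(app_string):
    """Return the image name embedded in a 'docker(<image>)' string."""
    match = APP_PATTERN.match(app_string.strip())
    if match is None:
        raise ValueError(f"not a docker application string: {app_string!r}")
    return match.group("image")


def build_docker_command(app_string, cluster_name, job_id, job_command):
    """Assemble a `docker run` argument list for one job.

    The container is named <cluster_name>-<job_id>, mirroring the
    'lsf_docker-105' name shown in the slides. How the real integration
    builds its command line is not documented here; this is a guess.
    """
    image = parse_docker_app(app_string)
    container_name = f"{cluster_name}-{job_id}"
    return [
        "docker", "run",
        "--name", container_name,       # name listed by `docker ps`
        "--hostname", container_name,   # hostname seen inside the container
        image,
        *job_command,
    ]


if __name__ == "__main__":
    command = build_docker_command(
        "docker(apollos/ubuntu-ppcle:trusty)", "lsf_docker", 105, ["/bin/sh"]
    )
    print(shlex.join(command))
```

Returning an argument list rather than a single string avoids shell quoting problems if the job command contains spaces.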

Resource Requirements & Enforcement
- CPU shares
    Translate the IBM Platform LSF total number of slots into a relative CPU share in Docker.
    bsub -n <num>  ->  docker run -c <num>
- Memory fencing
    Enable the IBM Platform LSF cgroup feature and translate -M/-v into the Docker memory option.
    bsub -M/-v  ->  docker run -m
- Affinity & NUMA-aware scheduling
    IBM Platform LSF selects the hosts and cores/sockets and passes this information to Docker, which enforces it.
    bsub -n 4 -R "span[hosts=1] affinity[core]"  ->  docker run --cpuset
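The three translations above can be sketched as a single function that turns a job's resource request into docker run options. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, the memory limit is assumed to be expressed in KB, and the long option names --cpu-shares, --memory and --cpuset-cpus are the spellings used by current Docker releases for the -c, -m and cpuset options on the slide.

```python
def docker_resource_options(num_slots, mem_limit_kb=None, cpu_ids=None):
    """Translate an LSF-style resource request into `docker run` options.

    num_slots     -- slots requested with `bsub -n`
    mem_limit_kb  -- memory limit from `bsub -M` (assumed to be in KB)
    cpu_ids       -- CPU IDs chosen by the scheduler for affinity, if any
    """
    if num_slots < 1:
        raise ValueError("a job needs at least one slot")

    # CPU shares are relative weights, so only the ratio between
    # containers on the same host matters (slide: -n <num> -> -c <num>).
    options = ["--cpu-shares", str(num_slots)]

    # Memory fencing: cap the container's memory (slide: -M/-v -> -m).
    if mem_limit_kb is not None:
        options += ["--memory", f"{mem_limit_kb}k"]

    # Affinity: pin the container to the cores picked by the scheduler.
    if cpu_ids:
        pinned = ",".join(str(cpu) for cpu in sorted(cpu_ids))
        options += ["--cpuset-cpus", pinned]

    return options


if __name__ == "__main__":
    # A 4-slot job with a memory limit, pinned to cores 0-3.
    print(docker_resource_options(4, mem_limit_kb=3500000, cpu_ids=[0, 1, 2, 3]))
```

The options are returned as a list so they can be spliced directly into a docker run argument list.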

Next Steps
- Tighter integration with the IBM Platform LSF job resource collection framework
    Leverage the built-in runtime IBM Platform LSF resource collection and reporting mechanism instead of using bpost.
- Docker image-aware scheduling
    Docker images can be large and take time to download and install, so prefer hosts where the image is already installed.
- Local image management
    Images should be cleaned up periodically when no longer required, to save disk space.
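The image-aware scheduling and image cleanup ideas above could look roughly like the following. This is only a sketch of the policy, not a description of how LSF implements or plans to implement it; the function names, the host names and the shape of the input data (a mapping from host to cached images, and from image to last-use timestamp) are all assumptions.

```python
def order_hosts_by_image(candidate_hosts, images_on_host, image):
    """Put hosts that already hold `image` first, keeping the original order otherwise.

    images_on_host maps a host name to the set of image names cached there.
    sorted() is stable, so the scheduler's own ordering is preserved within
    the "has image" and "needs download" groups.
    """
    return sorted(
        candidate_hosts,
        key=lambda host: image not in images_on_host.get(host, set()),
    )


def stale_images(last_used, now, max_idle_seconds):
    """Return images not used for more than max_idle_seconds, as cleanup candidates.

    last_used maps an image name to the timestamp (in seconds) of its last use.
    """
    return sorted(
        image for image, used_at in last_used.items()
        if now - used_at > max_idle_seconds
    )


if __name__ == "__main__":
    # Hypothetical hosts and cache contents.
    hosts = ["hostA", "hostB", "hostC"]
    cache = {"hostB": {"apollos/ubuntu-ppcle:trusty"}, "hostC": {"other:latest"}}
    print(order_hosts_by_image(hosts, cache, "apollos/ubuntu-ppcle:trusty"))
    print(stale_images({"old:1": 0, "recent:1": 90}, now=100, max_idle_seconds=50))
```

Treating the cached image as a preference rather than a hard requirement keeps jobs schedulable when no host has the image yet.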

Please Download and Try It Out
http://www.ibm.com/developerworks/servicemanagement/tc/plsf/downloads.html
White paper: https://ibm.biz/bdfzpy

Q & A
Thank You