Site presentation: CSCS

Similar documents
Using EasyBuild and Continuous Integration for Deploying Scientific Applications on Large Scale Production Systems

Stable Cray Support in EasyBuild 2.7. Petar Forai

Deploying (community) codes. Martin Čuma Center for High Performance Computing University of Utah

Software stack deployment for Earth System Modelling using Spack

Python ecosystem for scientific computing with ABINIT: challenges and opportunities. M. Giantomassi and the AbiPy group

Handbook for System Administrators (and Users)

ReFrame: A Regression Testing Framework Enabling Continuous Integration of Large HPC Systems

EasyBuild on Cray Linux Environment (WIP) Petar Forai

Install your scientific software stack easily with Spack

Building and Installing Software

The Arm Technology Ecosystem: Current Products and Future Outlook

CSCS Proposal writing webinar Technical review. 12th April 2015 CSCS

Controlling Software Environments with GNU Guix

EasyBuild: Building Software With Ease

Using Maali to Efficiently Recompile Software Post-CLE Updates on a CRAY XC System

Writing Easyconfig Files: The Basics

My operating system is old but I don't care : I'm using NIX! B.Bzeznik BUX meeting, Vilnius 22/03/2016

Status of the COSMO GPU version

Intel Distribution for Python* 2018 Beta

Overview. What are community packages? Who installs what? How to compile and install? Setup at FSU RCC. Using RPMs vs regular install

Maintaining Large Software Stacks in a Cray Ecosystem with Gentoo Portage. Colin MacLean

The Cray Programming Environment. An Introduction

The Why and How of HPC-Cloud Hybrids with OpenStack

Guillimin HPC Users Meeting December 14, 2017

Installing the new LOFAR Software on a fresh Ubuntu 14.04

Python based Data Science on Cray Platforms Rob Vesse, Alex Heye, Mike Ringenburg - Cray Inc C O M P U T E S T O R E A N A L Y Z E

Shifter: Fast and consistent HPC workflows using containers

HP Storage and UMCG

Intel Distribution for Python* и Intel Performance Libraries


VIP Documentation. Release Carlos Alberto Gomez Gonzalez, Olivier Wertz & VORTEX team

CLAW FORTRAN Compiler source-to-source translation for performance portability

Programming Environment 4/11/2015

HPCF Cray Phase 2. User Test period. Cristian Simarro User Support. ECMWF April 18, 2016

Introduction to HPC Programming 4. C and FORTRAN compilers; make, configure, cmake. Valentin Pavlov

The Cray Programming Environment. An Introduction

Simplifying the contribution process for both contributors & maintainers

Using Spack to Manage Software on Cray Supercomputers

Intel Distribution for Python* 2019

CONTAINERIZING JOBS ON THE ACCRE CLUSTER WITH SINGULARITY

Combining CVMFS, Nix, Lmod, and EasyBuild at Compute Canada. Bart Oldeman, McGill HPC, Calcul Québec, Compute Canada

Singularity: container formats

SALOME Maintenance release announcement

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design

Intel Distribution for Python* 2018 Update 1

Genius Quick Start Guide

KISTI TACHYON2 SYSTEM Quick User Guide

Anaconda Python Distribution

Containers. Pablo F. Ordóñez. October 18, 2018

Intel Distribution for Python* 2019 Update 1

Introduction to High Performance Computing Using Sapelo2 at GACRC

NVIDIA Update and Directions on GPU Acceleration for Earth System Models

Installation of Apache OpenMeetings on Arch Linux. This tutorial is based on a fresh installations of. arch-anywhere x86_64.

Compiling applications for the Cray XC

ALIBUILD / PANDADIST. The New Build System for Panda

Chris Calloway for Triangle Python Users Group at Caktus Group December 14, 2017

Shifter and Singularity on Blue Waters

Vaango Installation Guide

Yocto Project components

Open SpeedShop Build and Installation Guide Version November 14, 2016

Installation Guide for Python

Using jupyter notebooks on Blue Waters. Roland Haas (NCSA / University of Illinois)

Mingw-w64 and Win-builds.org - Building for Windows

8/19/13. Blue Waters User Monthly Teleconference

Installation of Apache OpenMeetings on Mageia 6. Mageia-6-x86_64-DVD.iso

Full Stack on Wine. Create a Win-Win between Wine and thousands of Win32 open source projects. Qian Hong

User Training Cray XC40 IITM, Pune

Modern Scientific Software Management using EasyBuild & co

Intel Distribution for Python* 2018

Continuous integration & continuous delivery. COSC345 Software Engineering

OpenHPC: Project Overview and Updates

ARM BOF. Jay Kruemcke Sr. Product Manager, HPC, ARM,

Scientific computing platforms at PGI / JCNS

DePloying an embedded ERLANG System

A regression framework for checking the health of large HPC systems

Evaluating Shifter for HPC Applications Don Bahls Cray Inc.

django-jenkins Documentation

Analytics Platform for ATLAS Computing Services

FREE and Open Source Software Tools for WATer Resource Management

NEMO-RELOC Documentation

ruby-on-rails-4 #ruby-onrails-4

EDNA Tutorial. Nov 2010 Jérôme Kieffer Data analysis Unit 1

Python on GACRC Computing Resources

Sunday, February 19, 12

5.3 Install grib_api for OpenIFS

Installation of Apache OpenMeetings on PCLinuxOS pclinuxos64-mate iso

This tutorial provides a basic understanding of the infrastructure and fundamental concepts of managing an infrastructure using Chef.

Continuous Integration and Release Management with Nix

Intermediate/Advanced Python. Michael Weinstein (Day 2)

Connecting ArcGIS with R and Conda. Shaun Walbridge

Intel Distribution for Python* 2018

Installation of Apache OpenMeetings on Slackware slackware install-dvd.iso

Department of Computing Science Illinois Institute of Technology. HORNET: A Highly Configurable, Cycle-Level Multicore Simulator

Supercomputing environment TMA4280 Introduction to Supercomputing

End-to-end optimization potentials in HPC applications for NWP and Climate Research

CONCOCT Documentation. Release 1.0.0

Installation of Apache OpenMeetings on PCLinuxOS pclinuxos64-mate iso

AMath 483/583 Lecture 2. Notes: Notes: Homework #1. Class Virtual Machine. Notes: Outline:

AMath 483/583 Lecture 2

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit

Transcription:

Site presentation: EasyBuild @ CSCS 1 st EasyBuild User Meeting Ghent, Belgium Guilherme Peretti-Pezzi Head of Scientific Computing Support (CSCS) January 29 th, 2016

Outline Overview of systems @ CSCS Motivation to adopt EB EasyBuild setup / workflow @ CSCS Use cases Python Cray CS-Storm (MeteoSwiss) Uniform software stack at our systems Using Jenkins to deploy and validate builds Final thoughts EasyBuild @ CSCS: Current status and roadmap 2

Piz Daint (XC30) & Piz Dora (XC40) #7 & #92 on Top500 EasyBuild @ CSCS: Current status and roadmap 3

Piz Escha & Piz Kesch (CS-Storm) #353 on Top500 2 cabinets 12 nodes 8 x K80 EasyBuild @ CSCS: Current status and roadmap 4

Overview of systems (with EB) at CSCS System Type # of nodes # gpus/node Cray? Piz Daint Production 5272 1 Y Piz Dora Production 1256 0 Y Castor Production 32 1 N Pilatus Production 38 0 N Monch Production 376 0 N Escha Production 12 16 Y (no PrgEnv) Kesch Production 12 16 Y (no PrgEnv) Leone Production 11 1 / 0 N Santis TDS 16 1 Y Brisi TDS 16 0 Y Greina R&D --- --- N EasyBuild @ CSCS: Current status and roadmap 5

Main groups doing HPC support @ CSCS Scientific Computing Support (7 ppl) User Engagement & Support Scientific Community Engagement Responsible for application support Now mostly using EB HPC Operations Responsible for building Programming environment Gcc, intel, PGI, MPI Manual builds EasyBuild @ CSCS: Current status and roadmap 6

Motivation to adopt EasyBuild @ CSCS Lack of standard way to describe build recipes Shell scripts, readme files, web/wiki pages, invisible docs Software available is very heterogeneous across systems Moving users to a different machine requires a lot of work Systems upgrades are a huge overhead Lots of manual work to re-deploy existing software Little collaboration with other sites doing the very same thing EasyBuild @ CSCS: Current status and roadmap 7

EasyBuild setup @ CSCS Shared EasyBuild Installation /apps/common/easybuild/ Shared EasyConfig Repository easybuild/cscs_easyconfigs Individual software Installation $APPS/easybuild/software Shared EasyBlocks Repository easybuild/easyblocks Individual module files $APPS/easybuild/modules/all Shared source file Repository easybuild/sources Successful production builds go to GitHub EasyBuild @ CSCS: Current status and roadmap 8

EasyBuild setup @ CSCS Currently available toolchains CrayGNU CrayIntel CrayCCE foss gmvolf Intel Daint Pilatus Escha Greina Dora Castor Kesch Santis Monch Leone Brisi Greina EasyBuild @ CSCS: Current status and roadmap 9

Python use case Suported modules for Python 2 and 3 Setuptools 17.1.1, Pip 7.0.3, Nose 1.3.7, Numpy 1.9.2, Scipy 0.15.1, mpi4py 1.3.1, Cython 0.22, Six 1.9.0, Virtualenv 13.0.3, pandas 0.16.2, h5py 2.5.0 (serial/parallel), Matplotlib 1.4.3, pycuda 2015.1, netcdf4 1.1.8 Example Easyconfig files (for Python 2.7.10 on Cray) Python-2.7.10-CrayGNU-5.2.40.eb matplotlib-1.4.3-craygnu-5.2.40-python-2.7.10.eb netcdf4-python-1.1.8-craygnu-5.2.40-python-2.7.10.eb h5py-2.5.0-craygnu-5.2.40-python-2.7.10-parallel.eb h5py-2.5.0-craygnu-5.2.40-python-2.7.10-serial.eb pycuda-2015.1-craygnu-5.2.40-python-2.7.10.eb Easyblocks h5py.py, netcdf_python.py, pycuda.py Now available on: Daint, Dora, Santis, Brisi (CrayGNU) Pilatus, Castor, Leone (foss) Escha, Kesch, Mönch (gmvolf) EasyBuild @ CSCS: Current status and roadmap 10

MCH CS-Storm use case (gmvolf/2015a) Autoconf/2.69 gmvapich2/2015a ncview/2.1.5 Automake/1.15 gmvolf/2015a netcdf/4.3.3.1 Autotools/20150215 GSL/1.16 netcdf-fortran/4.4.2 binutils/2.25 HDF/4.2.8 netcdf-python/1.1.8 Bison/3.0.3 HDF5/1.8.15 OPARI2/1.1.4 Boost/1.49.0 JasPer/1.900.1 OpenBLAS/0.2.13 bzip2/1.0.6 Java/1.7.0_80 OTF2/1.5.1 CDO/1.6.9 libffi/3.0.13 Python/2.7.10 CMake/3.2.2 libjpeg-turbo/1.4.0 R/3.1.3 Cube/4.3.2 libpng/1.6.16 Ruby/2.2.2 curl/7.40.0 libreadline/6.3 ScaLAPACK/2.0.2 ddt/5.0(default) libtool/2.4.6 Scalasca/2.2.2 Doxygen/1.8.9.1 FFTW/3.3.4 flex/2.5.39 freetype/2.5.5 libxml2/2.9.1 M4/1.4.17 matplotlib/1.4.3 MVAPICH2/2.0.1_gnu48 Score-P/1.4.2 SQLite/3.8.8.1 Szip/2.1 Tcl/8.6.3 Red By OPS/Cray GCC/4.8.2 NASM/2.11.06 UDUNITS/2.1.24 gettext/0.18.2 NCO/4.5.1 zlib/1.2.8 GLib/2.34.3 ncurses/5.9 EasyBuild @ CSCS: Current status and roadmap 11

MCH CS-Storm use case - fixing Cray s unsupported PrgEnv: gcc/4.8.2 lacks Haswell support (-march=native) Proposed temporary workaround: use assembler from cce! export PATH=/opt/cray/cce/8.3.10/cray-binutils/x86_64-unknown-linux-gnu/bin:$PATH (before module load gcc') EasyBuild @ CSCS: Current status and roadmap 12

Jenkins Jenkins is a tool designed for continuous integration/validation But it is much more powerful than that Thousands of plugins are available Can be easily configured to run tasks by ssh anywhere You get logs for all of your executions for free Info about running / past jobs and logs are always accessible through the web interface Some usage examples: Development/Integration: Checkout svn/git repositories to automatically build on different platforms Validation Periodically run unit tests Monitoring Periodically run sanity and performance tests (*regression*) Run your favorite script or app Use your creativity (example at CSCS: driving the acceptance of MCH machine) EasyBuild @ CSCS: Current status and roadmap 13

Jenkins example: Monitoring scratch performance for apps (netcdf5) By lucamar

Jenkins example: Rebuilding all software stack for Escha/Kesch EasyBuild @ CSCS: Current status and roadmap 15

Jenkins + EB integration: workflow example for testing.eb files Testing new easyconfig files on all machines where the toolchain is available Workflow setup 1. Create a folder accessible by jenscscs to store the.eb files /path/to/eb-files/ 2. Create a jenkins project adding the target test systems CrayGNU = daint, dora, santis, brisi foss/2015b = castor, pilatus 3. Add the following commands to the Execute shell Usage source /apps/common/easybuild/setup.sh find /path/to/eb-files/ -name '*CrayGNU-5.2.40*.eb' -exec eb {} "-r -f" \; (foss/2015a: replace *CrayGNU-5.2.40* by *foss-2015a* ) 1. Copy.eb files to /path/to/eb-files/ 2. Go to Jenkins and click on Build now EasyBuild @ CSCS: Current status and roadmap 16

Jenkins: Example for testing.eb files /apps/common/tools/easybuild/jenkins/ CrayGNU/5.2.40 CDO-1.6.9-CrayGNU-5.2.40.eb NFFT-3.3.0-CrayGNU-5.2.40.eb foss/2015a Ghostscript-9.10-foss-2015a.eb HDF5-1.8.15-foss-2015a.eb EasyBuild @ CSCS: Current status and roadmap 17

Final thoughts Current EB installation is ready for application level Validation with Python use case: Daint, Dora, Santis, Brisi, Pilatus, Castor and Escha/Kesch Escha/Kesch: complete software stack built with gmvolf toolchain Continuous validation techniques can be easily applied Testing builds across all systems with Jenkins weekly builds for every machine Changes/errors on the PrgEnv can be detected early In order to get the most out of EasyBuild We need to have consistent PrgEnv across OK on Cray systems Not currently true on non-cray Achievable with EasyBuild EasyBuild @ CSCS: Current status and roadmap 18

Wish list Stable Cray support Almost there: https://github.com/hpcugent/easybuild-framework/issues/1390 +1 for Rpath support Backup of custom easyblocks for reproducibility Compatible build description with similar projects (Spack) Add more flexibility to toolchain definition Possibility of creating custom (or modifying existing) without touching the framework Possibility of (easily) using existing compilers Lower the bar for new users To understand one build, user needs to look into easyconfig + easyblock + framework Extended-dry-run is currently the best approach EasyBuild @ CSCS: Current status and roadmap 19

Links CSCS Internal doc on EB https://github.com/eth-cscs/tools/wiki/easybuild-at-cscs Additional easyconfig files repositories Successful production builds at CSCS https://github.com/eth-cscs/tools/tree/master/easybuild/ebfiles_repo Shared custom easyconfig files https://github.com/eth-cscs/tools/tree/master/easybuild/easyconfigs Shared custom easyblocks https://github.com/eth-cscs/tools/tree/master/easybuild/easyblocks EasyBuild @ CSCS: Current status and roadmap 20

Thank you for your attention.