Transient Compute ARC as Cloud Front-End
Transcription
1 Digital Infrastructures for Research, 11:30, Cracow, 30-minute slot. AEC Albert Einstein Center for Fundamental Physics. Transient Compute ARC as Cloud Front-End. Sigve Haug, AEC-LHEP, University of Bern
2 University of Bern compute resources (overview diagram): Cloud Cluster, Commodity Clusters, CSCS HPC Cluster
3 Big Data Science Challenge - For example the ATLAS experiment at CERN: CPU needs grow drastically - But the budget is close to flat - What to do? (Illustration only)
4 Consolidate wherever possible - A new type of service provider for HPC may be part of the answer: NREN and commercial cloud providers - An NREN, as a non-profit business dedicated to science, is particularly interesting. Is it an alternative to traditional HPC providers and private clusters, i.e. more cost effective? - The Swiss example: the Swiss NREN offers IaaS on OpenStack
5 New player - usable for large science? (Overview diagram: Clouds, Commodity Clusters, CSCS Super HPC Clusters)
6 Important incentives for this exercise - Federal infrastructure funding for free trials on the NREN IaaS (SWITCHengines), so let's put big science onto it - Some nice tools make it easy: ARC and ElastiCluster - May be an alternative to buying own hardware, fighting for HPC allocations, or dealing with rigid central batch clusters and policies - Is it cheaper for science?
7 The test case - the ATLAS experiment: a particle physics experiment at the Large Hadron Collider (LHC) at CERN in Geneva. It investigates the smallest particles in the universe, dark matter and the big bang, and has a lifetime of about four decades. (Image: the underground LHC tunnel at CERN in Geneva.) x PB per year, hundreds of thousands of CPU cores running permanently, using hundreds of distributed sites on several continents.
8 Compute workflow (input flow, processing step, MB/event, t per event in s): Theory, Generation, Detector Simulation, Digitization (3.6 MB/event), Reconstruction, Analysis (0.4 MB/event); the detector saves about 1000 events per second. So far only the Generation and Detector Simulation steps are run on SWITCHengines - moderate I/O
9 How? - SWITCHengines - Got an account on the IaaS SWITCHengines with some quota - Made an instance for ElastiCluster (Ubuntu) - Made an instance for ATLAS with CentOS, mounted /cvmfs, installed some stuff to make ATLAS run, and made a snapshot (image); see the preparation sketch below - Fired up a SLURM cluster with 304 cores from that image (30 min). UZH S3IT: Service and Support for ScienceIT - Riccardo Murri, Sergio Maffioletti
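The slide does not spell out how the ATLAS worker image was prepared; the following is a minimal sketch of what mounting /cvmfs on a CentOS instance typically involves. The package installation step, repository list and proxy setting are assumptions, not taken from the slide:

# Sketch only - assumed preparation of the ATLAS worker image before snapshotting
yum install -y cvmfs                   # CernVM-FS client (assumes the CERN yum repository is configured)
cat > /etc/cvmfs/default.local <<'EOF'
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
CVMFS_HTTP_PROXY=DIRECT
EOF
cvmfs_config setup
cvmfs_config probe atlas.cern.ch       # verify that /cvmfs/atlas.cern.ch mounts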
10 How: ElastiCluster in action - Get a cluster in 30 minutes

(elasticluster)ubuntu@elasticluster-nei:~$ tail .elasticluster/config
security_group=mpi_test
image_id=92cf2dc2-547c-4ab6-8d4f-9a383a4cf6e6
flavor=nei-ch-8cpu-16gb_ram
frontend_nodes=1
compute_nodes=38
image_userdata=
ssh_to=frontend
network_ids=c9e33fb0-5adf-4c81-97a6-a6eba639d0b1

(elasticluster)ubuntu@elasticluster-nei:~$ elasticluster start slurm -n ATLAS
(elasticluster)ubuntu@elasticluster-nei:~$ elasticluster list
The following clusters have been started. Please note that there's no guarantee that they are fully configured:
ATLAS
name: ATLAS
template: slurm
- frontend nodes: 1
- compute nodes: 38
(elasticluster)ubuntu@elasticluster-nei:~$
(elasticluster)ubuntu@elasticluster-nei:~$ elasticluster resize -t slurm -a 5:compute ATLAS
(elasticluster)ubuntu@elasticluster-nei:~$ elasticluster stop ATLAS
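For context, the tail shown above is only the end of the cluster section; a complete ElastiCluster configuration also needs cloud, login and setup sections. Below is a minimal sketch under assumed section names and placeholder credentials - only the values already visible on the slide are real, and key names may differ between ElastiCluster versions:

# placeholder credentials - replace with real SWITCHengines/OpenStack values
[cloud/switchengines]
provider=openstack
auth_url=https://keystone.example.ch:5000/v2.0
username=OS_USERNAME
password=OS_PASSWORD
project_name=OS_TENANT_NAME

[login/ubuntu]
image_user=ubuntu
user_key_name=elasticluster
user_key_private=~/.ssh/id_rsa
user_key_public=~/.ssh/id_rsa.pub

[setup/slurm]
provider=ansible
frontend_groups=slurm_master
compute_groups=slurm_worker

# values below are the ones shown on the slide
[cluster/slurm]
cloud=switchengines
login=ubuntu
setup=slurm
security_group=mpi_test
image_id=92cf2dc2-547c-4ab6-8d4f-9a383a4cf6e6
flavor=nei-ch-8cpu-16gb_ram
frontend_nodes=1
compute_nodes=38
ssh_to=frontend
network_ids=c9e33fb0-5adf-4c81-97a6-a6eba639d0b1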
11 How (cont.) - ARC - Cloned our ARC HPC VM front-end for the Cray (at the home institute, not in the IaaS) - sshfs-mounted /home/atlas from SWITCHengines and activated our ssh back-end (a small wrapper around the standard ARC SLURM back-end; a configuration sketch follows below) - Registered the front-end in the ATLAS production system - Started running. (Diagram labels: OpenStack, Volunteer Computing, Test, Cray)
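The slide only names the pieces; the following is a hedged sketch of how the front-end might be wired up. The arc.conf fragment assumes ARC 5-style option names (they differ in later ARC releases), <cloud-frontend-IP> is a placeholder for the address elided on the slides, and the wrapper path is taken from the additional material:

# sshfs-mount the cloud cluster's /home/atlas on the ARC front-end (full command in the additional material)
sshfs atlas@<cloud-frontend-IP>:/home/atlas/ /home/atlas/ -o reconnect -o allow_other ...

# arc.conf fragment: point the ARC SLURM back-end at the ssh wrapper directory
# instead of local Slurm binaries (assumed option names)
[common]
lrms="SLURM"
slurm_bin_path="/opt/sshslurm"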
12 Sketch of the solution (OpenStack) - The compute cluster has become transient (a 30-minute affair); the ARC front-end is persistent
13 Performance - The CPU return has become good (around 90%); compared to a fixed-quota model, fully elastic pay-per-usage looks interesting
14 Swiss NREN prices - 20% discount for heavy usage like in this case - A 1000-core cluster with 2 GB RAM per core costs about 70 kCHF per year - Similar to dedicated cluster operation by CSCS (the Swiss Supercomputing Centre) - Not competitive with subsidised (power, manpower) in-house solutions
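As a rough sanity check on that figure (assuming all 1000 cores are used around the clock for a full year): 70 000 CHF / (1000 cores × 8760 h) ≈ 0.008 CHF per core-hour, i.e. under one centime per core-hour before any heavy-usage discount.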
15 Wrap up - connecting cloud to grid - Firing up application-dedicated O(1000)-core clusters with ElastiCluster on an OpenStack IaaS within an hour is possible - Hooking such a cluster to a remote ARC front-end works well for the tested LHC tasks - Performance is sufficient and very stable - The cluster back-end becomes transient, i.e. it can be reinstalled on the time scale of changing a disk drive - In the Swiss example, pricing has become competitive with other outsourcing alternatives - Compute becomes transient
16 University of Bern compute resources (overview diagram): Cloud Cluster, Commodity Clusters, CSCS HPC Cluster
17 Additional Material
18 ARC Bern ssh back-end

[root@ce04 ~]# ll /opt/sshslurm/
total 8
drwxr-xr-x. 2 root root 4096 Dec 18 17:50 config
lrwxrwxrwx. 1 root root    8 Apr  sacct -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sacctmgr -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  salloc -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sattach -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sbatch -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sbcast -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  scancel -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  scontrol -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sdiag -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sinfo -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sprio -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  squeue -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sreport -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  srun -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sshare -> sshslurm
-rwxr-xr-x. 1 root root  604 Nov  sshslurm
lrwxrwxrwx. 1 root root    8 Apr  sstat -> sshslurm
lrwxrwxrwx. 1 root root    8 Apr  strigger -> sshslurm
[root@ce04 ~]#

[root@ce04 ~]# cat /opt/sshslurm/config/sshslurm-config
SSHSLURM_HOST="atlas@ "
SSH_CMDLINE="/opt/openssh-6.6/bin/ssh -o "ControlPath=~/.ssh/controlmaster-%r@%h:%p" -o "ControlMaster=auto" -o "ControlPersist=2h" -o "ServerAliveInterval=120" -i /opt/sshslurm/config/id_rsa.$(whoami)"
SCP_CMDLINE="/opt/openssh-6.6/bin/scp -o "ControlPath=~/.ssh/controlmaster-%r@%h:%p" -o "ControlMaster=auto" -o "ControlPersist=2h" -o "ServerAliveInterval=120" -i /opt/sshslurm/config/id_rsa.$(whoami)"
REMOTE_SLURM_PATH="/usr/bin"
REMOTE_TEMP_PATH="/tmp"
[root@ce04 ~]#
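The symlink farm above can be (re)created with a one-off loop like the following (illustrative only; the slide shows just the resulting listing):

cd /opt/sshslurm
for cmd in sacct sacctmgr salloc sattach sbatch sbcast scancel scontrol \
           sdiag sinfo sprio squeue sreport srun sshare sstat strigger; do
    ln -sf sshslurm "$cmd"    # every Slurm client command resolves to the ssh wrapper
done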
19 ARC Bern ssh back-end

[root@ce04 ~]# cat /opt/sshslurm/sshslurm
#!/bin/bash
# config
source /opt/sshslurm/config/sshslurm-config
SBINARY=$(basename "$0")        # which Slurm command was invoked, via the symlink name
SARGS=""
for token in "$@"; do
    SARGS="$SARGS '$token'"
    # echo $SARGS
done
echo $(date) - $SBINARY "$SARGS" >> /tmp/sshslurm.log
# sbatch: copy the batch script to the remote cluster first, then submit it there
if [[ "$SBINARY" == "sbatch" && "$1" != "" ]]; then
    SARGS=$REMOTE_TEMP_PATH/$(basename "$1")
    $SCP_CMDLINE -q "$1" "$SSHSLURM_HOST:$SARGS"
    $SSH_CMDLINE $SSHSLURM_HOST -- [ -d "$PWD" ] \&\& cd "$PWD"\; $REMOTE_SLURM_PATH/$SBINARY "$SARGS" \&\& rm -f "$SARGS"
    exit $?
fi
# all other Slurm commands are simply executed on the remote cluster over ssh
$SSH_CMDLINE $SSHSLURM_HOST -- [ -d "$PWD" ] \&\& cd "$PWD"\; $REMOTE_SLURM_PATH/$SBINARY "$SARGS"
exit $?

[root@ce04 ~]# sshfs atlas@ :/home/atlas/ /home/atlas/ -o reconnect -o allow_other -o workaround=rename -o idmap=file -o uidfile=/opt/sshslurm/config/sshfs-cloud.uidmap -o gidfile=/opt/sshslurm/config/sshfs-cloud.gidmap -o nomap=ignore -o ServerAliveInterval=30 -o ServerAliveCountMax=2 -o IdentityFile=/opt/sshslurm/config/id_rsa.root -s -o nonempty
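For illustration (an invocation is not shown on the slide): with this directory configured as the Slurm binary path on the ARC front-end, an ordinary query such as

[root@ce04 ~]# /opt/sshslurm/squeue -h -o "%i %T"

is logged to /tmp/sshslurm.log and executed via ssh as /usr/bin/squeue on the cloud cluster's front-end, so the standard ARC SLURM back-end needs no further changes.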
21 Running in true pilot mode - APF: ATLAS Pilot Factory - aCT: ATLAS Control Tower - PanDA: Workload Manager. There are no I/O restrictions on SWITCH, so it makes sense to let the worker nodes (WN) do the I/O.