Global Software Distribution with CernVM-FS

Global Software Distribution with CernVM-FS
Jakob Blomer, CERN
2016 CCL Workshop on Scalable Computing, October 19th, 2016
jblomer@cern.ch

The Anatomy of a Scientific Software Stack (In High Energy Physics)

My Analysis Code: < 10 Python classes
CMS Software Framework: O(1000) C++ classes
Simulation and I/O Libraries: ROOT, Geant4, MC-XYZ
CentOS 6 and Utilities: O(10) libraries
The lower layers are stable; the upper layers change frequently.

How to install (again and again) on...
my laptop: compile into /opt (about 1 week)
my local cluster: ask the sys-admin to install in /nfs/software (more than 1 week)
someone else's cluster: ?

Beyond the Local Cluster: WLCG Sites
Worldwide LHC Computing Grid: about 200 sites, ranging from 100 to 100 000 cores
Different countries, institutions, batch schedulers, operating systems, ...
Augmented by clouds, supercomputers, and LHC@Home

What about Docker?
Example:
  $ docker pull r-base        (1 GB image)
  $ docker run -it r-base
  $ ...                       (fitting tutorial: only 30 MB of the image actually used)
[Diagram: a container bundles the application together with its libraries and a Linux userland]

It is hard to scale Docker:
  iPhone app: 20 MB, changes every month, phones update staggered
  Docker image: 1 GB, changes twice a week, servers update synchronized
Your preferred cluster or supercomputer might not run Docker.

A File System for Software Distribution
[Diagram: the software file system sits next to the basic system utilities on top of the OS kernel and is backed by a global HTTP cache hierarchy:
  worker node's memory buffer (megabytes)
  worker node's disk cache (gigabytes)
  central web server holding the entire software stack (terabytes)]
Pioneered by CCL's GROW-FS for CDF at the Tevatron
Refined in CernVM-FS, in production for CERN's LHC and other experiments
1. Single point of publishing
2. HTTP transport, access and caching on demand
3. Important for scaling: bulk meta-data download (not shown)

One More Ingredient: Content-Addressable Storage
[Diagram: the read/write file system at the software publisher (master source) is transformed into content-addressed objects and a read-only file system, which reach the worker nodes via HTTP transport with caching and replication]
Two independent issues:
1. How to mount a file system (on someone else's computer)?
2. How to distribute immutable, independent objects?

Content-Addressable Storage: Data Structures
Example: /cvmfs/icecube.opensciencegrid.org/amd64-gcc6.0/4.2.0/ChangeLog
  compression + SHA-1 yields object 806fbb67373e9...
The repository consists of an object store and file catalogs.
Object store:
  Compressed files and chunks
  De-duplicated
File catalog:
  Directory structure, symlinks
  Content hashes of regular files
  Digitally signed: integrity, authenticity
  Time to live
  Partitioned / Merkle hashes (possibility of sub-catalogs)
Immutable files: trivial to check for corruption, versioning
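Following the order shown on the slide (compression, then SHA-1), the derivation of an object name can be sketched in a few lines of C. This is an illustration only, not the CernVM-FS implementation: buffer handling is simplified and the data/XX/... store layout is taken from the hash shown above.

/* Illustrative sketch: compress a file with zlib, SHA-1 the compressed data,
 * and print the resulting object path. Not the actual CernVM-FS code.
 * Build: cc sha1_object.c -lz -lcrypto */
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>
#include <openssl/sha.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    /* Read the whole file into memory (fine for an illustration). */
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    unsigned char *data = malloc(size);
    if (fread(data, 1, size, f) != (size_t)size) { perror("fread"); return 1; }
    fclose(f);

    /* Compress with zlib; the object store keeps compressed data. */
    uLongf zsize = compressBound(size);
    unsigned char *zdata = malloc(zsize);
    if (compress(zdata, &zsize, data, size) != Z_OK) {
        fprintf(stderr, "compression failed\n");
        return 1;
    }

    /* SHA-1 over the compressed data serves as the content address. */
    unsigned char digest[SHA_DIGEST_LENGTH];
    SHA1(zdata, zsize, digest);

    /* Print an object path in the data/<2 hex chars>/<remaining hex> style. */
    printf("object: data/%02x/", digest[0]);
    for (int i = 1; i < SHA_DIGEST_LENGTH; i++) printf("%02x", digest[i]);
    printf("\n");

    free(data);
    free(zdata);
    return 0;
}

Because identical content always maps to the same object name, files shared between software versions are stored only once, which is why de-duplication comes for free in this scheme.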

Transactional Publish Interface
[Diagram: a read/write interface on top of CernVM-FS; a union file system (AUFS or OverlayFS) combines the read-only CernVM-FS (or S3) volume with a read/write scratch area]
Reproducible: as in git, you can always come back to this state.

Publishing new content:
[ ~ ]# cvmfs_server transaction icecube.opensciencegrid.org
[ ~ ]# make DESTDIR=/cvmfs/icecube.opensciencegrid.org/amd64-gcc6.0/4.2.0 install
[ ~ ]# cvmfs_server publish icecube.opensciencegrid.org

Uses the cvmfs-server tools and an Apache web server.

Content Distribution over the Web
Server side: stateless services.
[Diagram: worker nodes talk via HTTP to caching proxies inside the data center (one proxy serves O(100) nodes); the proxies talk via HTTP to web/mirror servers (one server serves O(10) data centers)]
  Load balancing across proxies and mirror servers
  Failover between proxies and between mirror servers
  Clients are directed to a nearby mirror server via Geo-IP
  Worker nodes keep a prefetched local cache
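On the client side, this topology is expressed with the standard CernVM-FS parameters CVMFS_HTTP_PROXY and CVMFS_SERVER_URL. A minimal sketch follows; the host names are placeholders and the values are site-specific, so this is not a recommended production configuration.

# /etc/cvmfs/default.local (illustrative values only)
# Two site-local caching proxies, load-balanced ("|"), with DIRECT as a last-resort failover (";")
CVMFS_HTTP_PROXY="http://proxy-a.example.org:3128|http://proxy-b.example.org:3128;DIRECT"
# Ordered list of mirror servers; failover goes left to right, and the order
# can be adapted per client region via Geo-IP
CVMFS_SERVER_URL="http://mirror-eu.example.org/cvmfs/@fqrn@;http://mirror-us.example.org/cvmfs/@fqrn@"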

Mounting the File System, Client: Fuse
Available for RHEL, Ubuntu, OS X; Intel, ARM, Power
Works on most grids and virtual machines (cloud)
[Diagram: an open("/ChangeLog") call travels from the application through glibc and the kernel VFS (with its inode and dentry caches, shared with NFS, ext3, ...) into the Fuse module via /dev/fuse; the user-space CernVM-FS client fetches missing data with HTTP GET, inflates and verifies it against its SHA-1, and returns a file descriptor]
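Once a repository is mounted, applications need no CernVM-FS specific code; plain POSIX I/O is enough. A minimal sketch (the repository path simply reuses the example from the earlier slides):

/* Plain POSIX access to a mounted CernVM-FS repository; the path is only an example. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    const char *path = "/cvmfs/icecube.opensciencegrid.org/amd64-gcc6.0/4.2.0/ChangeLog";
    char buf[256];

    int fd = open(path, O_RDONLY);   /* first access triggers an on-demand fetch */
    if (fd < 0) { perror("open"); return 1; }

    ssize_t n = read(fd, buf, sizeof(buf) - 1);  /* subsequent reads come from the local cache */
    if (n < 0) { perror("read"); return 1; }
    buf[n] = '\0';
    printf("%s", buf);

    close(fd);
    return 0;
}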

Mounting the File System, Client: Parrot
Available for Linux / Intel
Works on supercomputers, opportunistic clusters, and in containers
[Diagram: as above, but the open() call is intercepted in user space by the Parrot sandbox (libparrot), which uses libcvmfs directly, so no Fuse kernel module is required]

Scale of Deployment
More than 350 million files under management
More than 50 repositories
Installation service provided by OSG and EGI

Docker Integration (under construction, funded project)
[Diagram: today the Docker daemon pulls and pushes whole containers from and to a Docker registry; an improved Docker daemon would instead fetch container contents file by file from the CernVM File System]

Client Cache Manager Plugins (under construction)
[Diagram: the cvmfs/fuse and libcvmfs/parrot clients talk over a transport channel (TCP, socket, ...) through a C library to a cache manager, a key-value store; third-party plugins can provide the store, e.g. memory, Ceph, RAMCloud, ...]

Draft C interface:
cvmfs_add_refcount(struct hash object_id, int change_by);
cvmfs_pread(struct hash object_id, int offset, int size, void *buffer);
// Transactional writing in fixed-sized chunks
cvmfs_start_txn(struct hash object_id, int txn_id, struct info object_info);
cvmfs_write_txn(int txn_id, void *buffer, int size);
cvmfs_abort_txn(int txn_id);
cvmfs_commit_txn(int txn_id);
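To illustrate how the draft interface is meant to be used, here is a hypothetical call sequence for inserting one object into a plugin-provided cache. The header name, the struct definitions, the return-code convention, and the chunk size are all assumptions; the draft above fixes only the function names and parameters.

/* Hypothetical usage of the draft cache-plugin interface above.
 * Assumed: a plugin header defines struct hash, struct info, and the
 * cvmfs_* prototypes, and the functions return 0 on success. */
#include "cvmfs_cache_plugin.h"   /* hypothetical header name */

#define CHUNK_SIZE (64 * 1024)    /* illustrative fixed chunk size */

int store_object(struct hash object_id, struct info object_info,
                 const char *data, int size) {
    int txn_id = 1;  /* assumed: chosen by the caller to name the transaction */

    if (cvmfs_start_txn(object_id, txn_id, object_info) != 0)
        return -1;

    /* Write the object in fixed-sized chunks, as the draft prescribes. */
    for (int offset = 0; offset < size; offset += CHUNK_SIZE) {
        int n = (size - offset < CHUNK_SIZE) ? (size - offset) : CHUNK_SIZE;
        if (cvmfs_write_txn(txn_id, (void *)(data + offset), n) != 0) {
            cvmfs_abort_txn(txn_id);   /* roll back on any error */
            return -1;
        }
    }
    /* Commit makes the object visible; cvmfs_add_refcount() can pin it afterwards. */
    return cvmfs_commit_txn(txn_id);
}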

Summary
CernVM-FS: a global, HTTP-based file system for software distribution
  Works great with Parrot
  Optimized for small files and a heavy meta-data workload
  Open source (BSD), used beyond high-energy physics
Use cases:
  Scientific software
  Distribution of static data, e.g. conditions and calibration data
  VM / container distribution (cf. CernVM)
  Building block for long-term data preservation
Source code: https://github.com/cvmfs/cvmfs
Downloads: https://cernvm.cern.ch/portal/filesystem/downloads
Documentation: https://cvmfs.readthedocs.org
Mailing list: cvmfs-talk@cern.ch

Backup Slides

CernVM-FS Client Tools
Fuse module:
  Normal namespace /cvmfs/<repository>, e.g. /cvmfs/atlas.cern.ch
  Private mount as a user possible
  One process per Fuse module, plus a watchdog process
  Cache on local disk, LRU-managed
  NFS export mode
  Hotpatch functionality: cvmfs_config reload
Parrot:
  Built in by default
Mount helpers:
  Set up the environment (number of file descriptors, access rights, ...)
  Used by autofs on /cvmfs
  Used by /etc/fstab or mount as root:
    mount -t cvmfs atlas.cern.ch /cvmfs/atlas.cern.ch
Diagnostics:
  Nagios check available
  cvmfs_config probe
  cvmfs_config chksetup
  cvmfs_fsck
  cvmfs_talk (connect to a running instance)

Experiment Software from a File System Viewpoint
[Plot: number of file system entries (in millions, up to about 15) in the atlas.cern.ch software repository, statistics over 2 years, broken down into files, duplicates, file kernel, directories, and symlinks; directory tree example: atlas.cern.ch/repo/software/x86_64-gcc43/17.1.0, 17.2.0, ...]
Fine-grained software structure (Conway's law)
Between consecutive software versions: only 15 % new files

Directory Organization
[Plot: fraction of files (%) as a function of directory depth (0 to 20) for Athena 17.0.1, CMSSW 4.2.4, and LCG Externals R60]
Typical (non-LHC) software: the majority of files sit at directory level 5

Cumulative File Size Distribution
[Plot: file size in bytes (2^4 to 2^18, log scale) versus percentile for ATLAS, LHCb, ALICE, CMS, a UNIX system, a web server, and the files actually requested; cf. Tanenbaum et al. 2006 for the Unix and web server data]
Good compression rates (factor 2 to 3)

Runtime Behavior
Working set: about 10 % of all available files are requested at runtime
Median file size: < 4 kB
Flash crowd effect:
  Up to 500 kHz meta-data request rate
  Up to 1 kHz file-open request rate
  [Diagram: many worker nodes hitting a shared software area (/share) at once effectively DDoS the shared file server]

Software vs. Data (based on ATLAS figures, 2012)

  Software                          Data
  POSIX interface                   put, get, seek, streaming
  File dependencies                 Independent files
  10^7 objects                      10^8 objects
  10^12 B volume                    10^16 B volume
  Whole files                       File chunks
  Absolute paths                    Any mount point
  Open source                       Confidential
  WORM (write-once-read-many)       Versioned