Using CernVM-FS to deploy Euclid processing S/W on Science Data Centres M. Poncet (CNES) Q. Le Boulc h (IN2P3) M. Holliman (ROE) On behalf of Euclid EC SGS System Team ADASS 2016 1
Outline Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 2
Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 3
Euclid project M2 mission in the framework of the ESA Cosmic Vision Programme Euclid mission objective is to map the geometry and understand the nature of the dark Universe (dark energy and dark matter) Actors in the mission: ESA and the Euclid Consortium (institutes from 15 European countries and USA, funded by their own national Space Agencies) Euclid Consortium: 15 countries 100+ labs 1200+ members One of the biggest collaboration! For more information see : O1.2, O8.5, P3.5, P8.13, P1.16, P2.13, P8.22, P2.21, P2.31, P4.18 http://sci.esa.int/science-e/www/area/index.cfm?fareaid=102 http://www.euclid-ec.org 4
Euclid Science Ground Segment (SGS) 5
Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 6
Euclid pipeline 7
From dedicated SDCs to Federated SDCs Dedicated SDCs Whole pipeline in any SDC on quantum of data soc SIM VIS SIR NIR MER EXT SPE PHZ SHE LE3 EAS Unbalanced load Pipeline in a Box 100+ PB in grand total More Data transfer than processing? Move the code Not the data paradigm 8
Move the code not the Data SDCs are both storage and processing nodes Data storage and processing allocated by sky area 9
Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 10
From source code to production 11
From source code to production EDEN referential 12
in collaboration with an international team How to federate? 13
in collaboration with an international team EDEN referential 14
in collaboration with an international team How to put code together? 15
in collaboration with an international team Configuration management Euclid SVN / Git 16
in collaboration with an international team How to build? Euclid SVN / Git 17
continuously integrating its code Continuous integration Euclid SVN / Git Nightly Builds Feedback on code EDEN 18
continuously integrating its code Euclid SVN / Git EDEN 19
continuously integrating its code Build Euclid SVN / Git EDEN Elements 20
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Euclid SVN / Git EDEN Elements 21
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Generate Documentation (Doxygen) Euclid SVN / Git EDEN Elements 22
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Generate Documentation (Doxygen) Quality check (Custom) Euclid SVN / Git EDEN Elements 23
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Generate Documentation (Doxygen) Quality check (Custom) Euclid SVN / Git Dashboards (SonarQube, ) Feedback on code EDEN Elements 24
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Generate Documentation (Doxygen) Quality check (Custom) Packaging (Custom) Euclid SVN / Git Dashboards (SonarQube, ) Feedback on code EDEN Elements 25
continuously integrating its code Build Run Tests (Unit, smoke, pre-int. tests) Generate Documentation (Doxygen) Quality check (Custom) Packaging (Custom) Storage in repository Euclid SVN / Git Dashboards (SonarQube, ) Local deployment (yum install) YUM Repository (Nexus) Feedback on code EDEN Elements 26
able to quickly deploy it on production Euclid SVN / Git Feedback on code Nightly Builds Feedback on code Yum install FR PRODUCTION ENVIRONMENT UK CH ES DE NL FI IT US Science Data Centers SCIENTIFIC PRODUCTS EDEN 27
able to quickly deploy it on production Euclid SVN / Git Feedback on code Nightly Builds Feedback on code Continuous Deployment FR PRODUCTION ENVIRONMENT UK CH ES DE NL FI IT US Science Data Centers SCIENTIFIC PRODUCTS EDEN 28
«Dream» Solution not (too) intrusive Automatic deployment No (less) admin intervention required Security friendly No exotic protocol (easy to filter) Outgoing connexion prefered (no incoming) Mutiple versions in // EDEN versions Pipeline versions Efficient 29
Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 30
CernVM File System (CernVM-FS) Developed by CERN (European Organization for Nuclear Research) For High Energy Physics (HEP) collaborations To deploy software on the worldwide-distributed computing infrastructure: HTTP based Pull mode: get locally only on access Cache hierarchy User space FUSE local read-only mounting point /cvmfs 31
Cvmfs principles 32
Cvmfs distribution network Central repository Ring of mirrors: scalability, reliability 33
Cvmfs distribution network Local Squid proxy: cache, security Linked to stratum 1: failover, load balancing Proxy cache (Horizontal scaling) 34
Cvmfs distribution network Cvmfs client on workers Linked to Squid proxies HTTP Local file system cache CernVM-FS client Operating System & Applications Read-only access Fuse OS Kernel Proxy cache (Horizontal scaling) 35
Cvmfs distribution network Compatible with Physical nodes (µ)vms Docker container Read-only access HTTP Local file system cache CernVM-FS client Fuse Operating System & Applications OS Kernel Proxy cache Physical Node VM (Horizontal scaling) Container 36
Cvmfs distribution network Multi-sites Local file system cache Read-only access HTTP CernVM-FS client Fuse Operating System & Applications OS Kernel Proxy cache Physical Node VM (Horizontal scaling) Container 37
Euclid Project & SGS From Euclid pipeline to SGS Architecture From source code to processing nodes CernVM-FS principles Using CernVM-FS on SDCs 38
Euclid cvmfs at a glance 39
Euclid cvmfs repository structure Multiple versions in // Any dependency along (yum install installroot) Switch through Env. Variables 40
Euclid cvmfs & continuous deployment Automated scripts GIT CODEEN Nexus CernVM-FS SDC SDC SDC X DEVELOP CODEEN chain (Build, Tests, Doc., Quality) TEST repo TEST Branch Read TEST Environment MASTER CODEEN chain (Build, Tests, Doc., Quality) Packaging (RPM) PROD repo EDEN x.y Deployment PROD Branch EDEN x.y Read PROD Environment Both Test and Production branches 15~mn latency between Installation on stratum 0 Availability on any SDC 41
Conclusions Tested through a technical SGS challenge Adopted on the whole Euclid SGS Used for ongoing Euclid Scientific SGS challenge #3 Automation scripts almost ready 42
Thank you for your attention Maurice.Poncet@cnes.fr Acknowledgments: authors are indebted to all the indididuals participating in the Euclid SGS developement inside ESA and EC, too numerous to be listed here 43
Questions 44
BACKUP SLIDES 45
Euclid at a glance 46
Euclid SGS Components 47
Euclid tools & software stack 48
Euclid Software stack 49