
LHC COMPUTING GRID
INSTALLING THE RELEASE

Document identifier:
EDMS id: n/a
Version: v2.4.0
Date: April 6, 2005
Section: gis
Document status: final
Author(s): GRID Deployment Group (<support-lcg-deployment@cern.ch>)
File: LCG2-LCFG-Install

Abstract: These notes will assist you in installing the new release using LCFGng.

CONTENTS

1. INSTALLING A NEW RELEASE USING LCFGNG
   1.1. DOWNLOADING THE LCFGNG CONFIGURATION FILES
   1.2. DOWNLOAD THE RPMS
   1.3. UPDATING THE LCFGNG SERVER
   1.4. SETTING THE ROOT PASSWORD
   1.5. SITE CONFIGURATION
   1.6. NODE PROFILE CREATION
2. MANUAL STEPS
   2.1. COMMON STEPS
   2.2. USERINTERFACE
   2.3. RESOURCEBROKER
   2.4. COMPUTINGELEMENT
   2.5. MONNODE
   2.6. WORKERNODE
   2.7. STORAGEELEMENT
   2.8. PLAINGRIS
   2.9. BDII NODE
   2.10. MYPROXY NODE
3. APPENDIX A
   3.1. HOW TO CONFIGURE THE PBS SERVER ON A COMPUTINGELEMENT

1. INSTALLING A NEW RELEASE USING LCFGNG

If you do not already have an LCFGng server installed, please use the LCFGng Server Installation Guide to install one. The files used by the LCFGng server can be located in different places; three locations are important. The rpmlist directory contains the files related to rpm lists. The source directory contains files related to configuration. The profile directory contains the site-cfg.h file and the node profiles. If you do not know where these directories are or how they are used, you should read the LCFGng Server Installation Guide.

The example tag LCG-x_y_z is used throughout this guide; replace it with the current tag.

1.1. DOWNLOADING THE LCFGNG CONFIGURATION FILES

All the LCFGng configuration files needed for the release can be found in the LCG CVS repository. These files should be checked out from the CVS repository onto your LCFGng server. To do this, first set the CVS environment variables:

> export CVS_RSH=ssh
> export CVSROOT=:pserver:anonymous@lcgdeploy.cvs.cern.ch:/cvs/lcgdeploy

Then check out the tag from CVS:

> cvs checkout -r LCG-x_y_z -d LCG-x_y_z lcg2

This will create a directory LCG-x_y_z that contains the new configuration files. The files in the LCG-x_y_z/rpmlist and LCG-x_y_z/source directories should be copied to the locations that you use for rpmlist and source (an example copy sequence is sketched at the end of section 1.2).

1.2. DOWNLOAD THE RPMS

The rpms needed for the release have to be copied onto the LCFGng server. To do this, copy the file LCG-x_y_z/tools/updaterep.conf to /etc/updaterep.conf and run updaterep to download the rpms:

> cp LCG-x_y_z/tools/updaterep.conf /etc/updaterep.conf
> updaterep

By default all rpms are copied to the /opt/local/linux/7.3/rpms area, which is visible from the client nodes as /export/local/linux/7.3/rpms. You can change this location by editing /etc/updaterep.conf and modifying the REPOSITORY_BASE variable.

IMPORTANT NOTICE: As the list and structure of Certification Authorities (CAs) accepted by the LCG project can change independently of the middleware releases, the rpm list related to the CA certificates and URLs has been decoupled from the standard LCG release procedure. This means that the version of the security-rpm.h file contained in the rpmlist directory associated with the current tag might be incomplete or obsolete. Please go to the URL http://markusw.home.cern.ch/markusw/lcg2calist.html and follow the instructions there to update the CAs. Changes and updates of the CAs will be announced on the LCG-Rollout mailing list.
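A minimal sketch of the copy step from section 1.1. The destination variables are placeholders, not values prescribed by the release: set them to the rpmlist and source locations actually configured on your LCFGng server (see the LCFGng Server Installation Guide).

> RPMLIST_DIR=/path/to/your/rpmlist    # assumption: replace with your rpmlist directory
> SOURCE_DIR=/path/to/your/source      # assumption: replace with your source directory
> cp LCG-x_y_z/rpmlist/* $RPMLIST_DIR/
> cp LCG-x_y_z/source/* $SOURCE_DIR/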

1.3. UPDATING THE LCFGNG SERVER

To ensure that all the LCFGng server-side object rpms are installed on your LCFGng server, run the command:

> LCG-x_y_z/tools/lcfgng_server_update.pl LCG-x_y_z/rpmlist/lcfgng-server-rpm.h

This script reports which rpms are missing or have the wrong version and creates a script, /tmp/lcfgng_server_update_script.sh, that will update all the rpms needed on the LCFGng server. Please check that all the commands in it look reasonable before running it.

1.4. SETTING THE ROOT PASSWORD

You must replace the default root password in the file private-cfg.h with the one you want to use for your site:

+auth.rootpwd <CRYPTED_PWD>

To obtain the password encrypted with MD5 (stronger than the standard crypt method) you can use the following command:

> openssl passwd -1

This command prompts you for the clear-text version of the password and then prints the encrypted version.

1.5. SITE CONFIGURATION

The directory LCG-x_y_z/examples contains an example of the site-cfg.h file. This file contains all the information needed about your site. You should copy this file to the profile directory and edit it. The values are well documented within the file.

Additional VOs on the LCFGng Server

In the directory LCG-x_y_z/examples there is an example VO configuration file, new-vo.conf. For every additional VO that you want to support, copy this file to the sources directory and edit it so that it contains all the required information about the VO. In the directory LCG-x_y_z/sources there is a script called addvo.py; copy this script to the sources directory as well. Create the VO configuration file by running:

> ./addvo.py -i new-vo.conf > new-vo-cfg.h

To include this file, add the following line to the vos-cfg.h file:

#include "new-vo-cfg.h"
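As an illustration, the complete sequence for adding one extra VO might look as follows; the directory paths are placeholders and the editor step is only indicative:

> cd /path/to/your/source/directory        # the sources directory of your LCFGng server
> cp /path/to/LCG-x_y_z/examples/new-vo.conf .
> cp /path/to/LCG-x_y_z/sources/addvo.py .
> vi new-vo.conf                           # fill in the information for the VO you support
> ./addvo.py -i new-vo.conf > new-vo-cfg.h
> echo '#include "new-vo-cfg.h"' >> vos-cfg.h   # or add the line with your editor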

1.6. NODE PROFILE CREATION

The directory LCG-x_y_z/examples contains example profiles that can be used to create the initial node profiles needed for your site. In LCG-x_y_z/tools there is a script, do_mkxprof.sh; a detailed description of how this script works is contained in the script itself. To create the LCFG configuration for one or more nodes run (a concrete example is given at the end of this section):

> do_mkxprof.sh node1 [node2 node3 ...]

If you get an error status for one or more of the configurations, you can get a detailed report on the nature of the error by looking at the URL:

http://<lcfgng_server>/status

Once all node configurations are correctly published, you can proceed and install your nodes following any one of the installation procedures described in the LCFGng Server Installation Guide.
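As a concrete illustration of the profile-creation step, with placeholder host names that are not part of the release:

> do_mkxprof.sh ce01.example.org se01.example.org wn01.example.org wn02.example.org

and then check http://<lcfgng_server>/status for the result of each profile.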

2. MANUAL STEPS

Each node type requires a few manual steps to be completely configured.

2.1. COMMON STEPS

On the ResourceBroker, MyProxy, StorageElement, and ComputingElement nodes you must install the host certificate/key files in /etc/grid-security with the names hostcert.pem and hostkey.pem. Also make sure that hostkey.pem is readable only by root:

> chmod 400 /etc/grid-security/hostkey.pem

All Globus services grant access to LCG users according to the certificates listed in the /etc/grid-security/grid-mapfile file. The list of VOs included in grid-mapfile is defined in /opt/edg/etc/edg-mkgridmap.conf. This file is now handled automatically by the mkgridmap LCFG object, which takes care of enabling only the VOs accepted at each site according to the SE_VO_<VO> definitions in site-cfg.h. If you need to modify the default configuration for your site, e.g. by adding users to grid-mapfile-local, you can do this from your local-cfg.h file by following the examples in <TAG_DIRECTORY>/source/mkgridmap-cfg.h.

After installing a ResourceBroker, StorageElement, or ComputingElement node you should force a first creation of the grid-mapfile by running:

> /opt/edg/sbin/edg-mkgridmap --output=/etc/grid-security/grid-mapfile --safe

Every 6 hours a cron job repeats this procedure and updates grid-mapfile.

2.2. USERINTERFACE

No additional configuration steps are currently needed on a UserInterface node.

2.3. RESOURCEBROKER

Log in as root on your RB node, represented by <rb_node> in the examples, and make sure that the MySQL server is up and running:

> /etc/rc.d/init.d/mysql start

If it was already running you will just be notified of the fact. Now choose a DB management <password> (write it down somewhere!) and configure the server with the following commands:

> mysqladmin password <password>
> mysql --password=<password> \
    --exec "set password for root@<rb_node>=password('<password>')" mysql
> mysqladmin --password=<password> create lbserver20
> mysql --password=<password> lbserver20 < /opt/edg/etc/server.sql
> mysql --password=<password> \
    --exec "grant all on lbserver20.* to lbserver@localhost" lbserver20

Note that the database name lbserver20 is hardwired in the LB server code and cannot be changed, so use it exactly as shown in the commands.

Make sure that /var/lib/mysql has the right permissions set (755).

Reboot the node.

2.4. COMPUTINGELEMENT

After upgrading the CE, do not forget to make sure that the experiment-specific runtime environment tags can still be published. To do this, move the /opt/edg/var/info/<VO-NAME>/<VO-NAME>.ldif files to <VO-NAME>.list.

Configure the PBS server. A detailed recipe is given in Appendix A at the end of this document.

Create the first version of the /etc/ssh/ssh_known_hosts file by running:

> /opt/edg/sbin/edg-pbs-knownhosts

A cron job will update this file every 6 hours.

If your CE is NOT sharing the /home directory with your WNs (this is the LCG-2 default configuration; if you have modified site-cfg.h to run PBS in traditional mode as described in a previous chapter, just ignore the following instructions), then you have to configure sshd to allow WNs to copy job output back to the CE using scp. This requires the following two steps:

1. Modify the sshd configuration. Edit the /etc/ssh/sshd_config file and add these lines at the end:

HostbasedAuthentication yes
IgnoreUserKnownHosts yes
IgnoreRhosts yes

and then restart the server with

> /etc/rc.d/init.d/sshd restart

2. Configure the script enabling WNs to copy output back to the CE.

In /opt/edg/etc, copy edg-pbs-shostsequiv.conf.template to edg-pbs-shostsequiv.conf, then edit this file and change the parameters to your needs. Most sites will only have to set NODES to an empty string.

Create the first version of the /etc/ssh/shosts.equiv file by running

> /opt/edg/sbin/edg-pbs-shostsequiv

A cron job will update this file every 6 hours.

Note: every time you add or remove WNs, do not forget to run

> /opt/edg/sbin/edg-pbs-shostsequiv    (only if you do not share /home)
> /opt/edg/sbin/edg-pbs-knownhosts

on the CE, or the new WNs will not work correctly until the next time cron runs them for you.

The CE is supposed to export information about the hardware configuration (i.e. CPU power, memory, disk space) of the WNs. The procedure to collect this information and publish it is described in Appendix D of this document.

Reboot the node.

If your CE exports the /home area to all WNs, then after rebooting make sure that all WNs can still see this area. If this is not the case, execute this command on all WNs:

> /etc/obj/nfsmount restart

2.5. MONNODE

The MySQL database used by the servlets needs to be configured:

> mysql -u root < /opt/edg/var/edg-rgma/rgma-db-setup.sql

Restart the R-GMA servlets:

> /etc/rc.d/init.d/edg-tomcat4 restart

For the accounting package, apel, you will need to set up the database:

> mysql -u root --exec "create database accounting"
> mysql -u root accounting < /opt/edg/var/edg-rgma/apel-schema.sql
> mysql -u root --exec "grant all on accounting.* to accounting@$mon_host identified by APELDB_
> mysql -u root --exec "grant all on accounting.* to accounting@localhost identified by APELDB_
> mysql -u root --exec "grant all on accounting.* to accounting@$ce_host identified by APELDB_P
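As a quick sanity check (not part of the official procedure), the standard mysqlshow client can be used to confirm that the databases were created:

> mysqlshow -u root                  # lists all databases on the MON node
> mysqlshow -u root accounting       # lists the tables of the apel accounting database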

2.6. WORKERNODE

The default maximum number of open files on a RedHat node is only 26213. This number might be too small if users submit file-hungry jobs (we already had one case), so you may want to increase it on your WNs. At CERN we currently use 256000. To set this parameter you can use this command:

> echo 256000 > /proc/sys/fs/file-max

You can make this setting reboot-proof by adding the following code at the end of your /etc/rc.d/rc.local file:

# Increase max number of open files
if [ -f /proc/sys/fs/file-max ]; then
    echo 256000 > /proc/sys/fs/file-max
fi

Every 6 hours each WN needs to connect to the web sites of all known CAs to check if a new CRL (Certificate Revocation List) is available. As the script which handles this functionality uses wget to retrieve the new CRL, you can direct your WNs to use a web proxy. This is mandatory if your WNs sit on a hidden network with no direct external connectivity. To redirect your WNs to use a web proxy, edit the /etc/wgetrc file and add a line like:

http_proxy = http://web_proxy.cern.ch:8080/

where you should replace the node name and the port to match those of your web proxy (a slightly fuller example is sketched at the end of this section).

Note: I could not test this recipe directly as I am not aware of a web proxy at CERN. If you try it and find problems, please post a message on the LCG-Rollout list.

If your WNs are NOT sharing the /home directory with your CE (this is the default configuration), then you have to configure ssh to enable them to copy job output back to the CE using scp. To this end, modify the ssh client configuration file /etc/ssh/ssh_config, adding these lines at the end:

Host *
    HostbasedAuthentication yes

Note: the "Host *" line might already exist. In this case, just add the second line after it.

Create the first version of the /etc/ssh/ssh_known_hosts file by running

> /opt/edg/sbin/edg-pbs-knownhosts

A cron job will update this file every 6 hours.
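A slightly fuller /etc/wgetrc sketch for WNs behind a web proxy; the proxy host and port are site-specific assumptions, not values prescribed by the release:

# /etc/wgetrc additions on a WN with no direct external connectivity
use_proxy = on
http_proxy = http://your-web-proxy.example.org:8080/
ftp_proxy = http://your-web-proxy.example.org:8080/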

2.7. STORAGEELEMENT

Make sure that the storage area defined with CE_CLOSE_SE_MOUNTPOINT exists and contains the VO-specific sub-directories with the correct access privileges (group=<vo> and r/w access for the group); an example command sequence is sketched after section 2.10.

Reboot the node.

2.8. PLAINGRIS

No additional configuration steps are currently needed on a PlainGRIS node.

2.9. BDII NODE

This is the current version of the BDII service, which does not rely on Regional MDSes. If you want to install the new service, use the LCG-BDII node example file from the examples directory. After installation the new LCG-BDII service does not need any further configuration: the list of available sites is automatically downloaded from the default web location defined by SITE_BDII_URL in site-cfg.h, and the initial population of the database is started. Expect a delay of a couple of minutes between when the machine is up and when the database is fully populated.

If for some reason you want to use a static list of sites, copy the static configuration file to /opt/lcg/var/bdii/lcg-bdii-update.conf and add this line at the end of your LCG-BDII node configuration file:

+lcgbdii.auto no

If you need a group of BDIIs to be centrally managed and to see a different set of sites than those defined by the URL above, you can set up a web server and publish a web page containing the sites. The URL of this file has to be used to configure SITE_BDII_URL in site-cfg.h; leave lcgbdii.auto set to yes. This file has the same structure as the default one:

http://grid-deployment.web.cern.ch/grid-deployment/gis/lcg2-bdii/dteam/lcg2-all-sites.conf

2.10. MYPROXY NODE

Reboot the node after installing the host certificates (see Common Steps above). Make sure that in the site-cfg.h file you have included all Resource Brokers that your users want to use. This is done in the following line:

#define GRID_TRUSTED_BROKERS "/C=CH/O=CERN/OU=GRID/CN=host/BROKER1.Domain.ch \
/C=CH/O=CERN/OU=GRID/CN=host/Broker2.Domain.ch"
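Referring back to section 2.7, a minimal sketch of preparing the VO sub-directories under the storage area; the mount point /storage and the VO list are assumptions and must be adapted to your CE_CLOSE_SE_MOUNTPOINT value and to the VOs supported at your site:

> for vo in dteam atlas alice cms lhcb; do   # assumed VO list -- adapt to your site
>   mkdir -p /storage/$vo                    # /storage is an assumed mount point
>   chgrp $vo /storage/$vo
>   chmod g+rwx /storage/$vo
> done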

3. APPENDIX A

3.1. HOW TO CONFIGURE THE PBS SERVER ON A COMPUTINGELEMENT

Note that the queues short, long, and infinite are those defined in the site-cfg.h file and the time limits are those in use at CERN. Feel free to add/remove/modify them to your liking, but do not forget to modify site-cfg.h accordingly. The values given in this example are only reference values. Make sure that the requirements of the experiments as stated here:

http://ibird.home.cern.ch/ibird/lcgminresources.doc

are satisfied by your configuration.

1. Load the server configuration with this command (replace <CEhostname> with the hostname of the CE you are installing):

@---------------------------------------------------------------------
/usr/bin/qmgr <<EOF
set server scheduling = True
set server acl_host_enable = False
set server managers = root@<CEhostname>
set server operators = root@<CEhostname>
set server default_queue = short
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server default_node = lcgpro
set server node_pack = False
create queue short
set queue short queue_type = Execution
set queue short resources_max.cput = 00:15:00
set queue short resources_max.walltime = 02:00:00
set queue short enabled = True
set queue short started = True
create queue long
set queue long queue_type = Execution
set queue long resources_max.cput = 12:00:00
set queue long resources_max.walltime = 24:00:00
set queue long enabled = True
set queue long started = True
create queue infinite

set queue infinite queue_type = Execution
set queue infinite resources_max.cput = 80:00:00
set queue infinite resources_max.walltime = 100:00:00
set queue infinite enabled = True
set queue infinite started = True
EOF
@---------------------------------------------------------------------

2. Edit the file /var/spool/pbs/server_priv/nodes to add the list of WorkerNodes you plan to use. An example setup for CERN could be:

@---------------------------------------------------------------------
lxshare0223.cern.ch np=2 lcgpro
lxshare0224.cern.ch np=2 lcgpro
lxshare0225.cern.ch np=2 lcgpro
lxshare0226.cern.ch np=2 lcgpro
@---------------------------------------------------------------------

where np=2 gives the number of job slots (usually equal to #CPUs) available on the node, and lcgpro is the group name as defined in the default_node parameter of the server configuration.

3. Restart the PBS server:

> /etc/rc.d/init.d/pbs_server restart
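Once the PBS server has been restarted, it may be worth verifying the configuration; the following standard PBS/Torque commands are a suggested check, not part of the official recipe:

> qmgr -c "print server"    # dump the server and queue configuration just loaded
> pbsnodes -a               # list the WorkerNodes from server_priv/nodes and their state
> qstat -q                  # show the short, long and infinite queues and their limits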