LCG Short Demo. Markus Schulz, CERN LCG. FZK, 30 September 2003

LCG-1 Demo Outline
Monitoring tools and where to get documentation
Getting started
Running simple jobs
Using the information system
More on JDL
Data management

LCG-1 Deployment Status
Up-to-date status can be seen here: http://www.grid-support.ac.uk/goc/monitoring/dashboard/dashboard.html
Has links to maps with the sites that are in operation
Links to the GridICE-based monitoring tool (history of VOs' jobs, etc.), using information provided by the information system
Tables with the deployment status
Sites currently in LCG-1 (expect 18-20 by end of 2003): PIC-Barcelona (RB), Budapest (RB), CERN (RB), CNAF (RB), FermiLab (FNAL), FZK, Krakow, Moscow (RB), RAL (RB), Taipei (RB), Tokyo
Sites to enter soon: BNL, Prague, (Lyon), several Tier-2 centres in Italy and Spain
Sites preparing to join: Pakistan, Sofia, Switzerland
Total number of CPUs: ~120 worker nodes; the number of sites matters more


The Basics
Get the LCG-1 Users Guide: http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=eis/homepage
Get a certificate: go to the CA that is responsible for you and request a user certificate. The list of CAs can be found here: http://lcg-registrar.cern.ch/pki_certificates.html
Follow the instructions on how to load the certificate into a web browser, and do this.
Register with LCG and a VO of your choice: http://lcg-registrar.cern.ch/
In case your certificate is not in PEM format, convert it using openssl (ask your CA how to do this).
Find a user interface machine: http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=lcg1status
We use adc0014 at CERN.
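
A common case is a certificate exported from the browser as a PKCS#12 file; a minimal conversion sketch (the file name mycert.p12 is only an example, and your CA may distribute certificates differently), placing the result where the next slide expects it:
$ openssl pkcs12 -in mycert.p12 -clcerts -nokeys -out ~/.globus/usercert.pem
$ openssl pkcs12 -in mycert.p12 -nocerts -out ~/.globus/userkey.pem
$ chmod 444 ~/.globus/usercert.pem
$ chmod 400 ~/.globus/userkey.pem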

Get ready
Check your certificate in ~/.globus:
$ grid-cert-info
Cert valid? Should return with an O.K.:
$ openssl verify -CApath /etc/grid-security/certificates ~/.globus/usercert.pem
Generate a proxy (valid for 12h):
$ grid-proxy-init (will ask for your pass phrase)
$ grid-proxy-info (to see details, like how many hours the proxy has left)
$ grid-proxy-destroy
For long jobs, register a long-term credential with the proxy server:
$ myproxy-init -s adc0024 -d -n (creates a proxy with one week duration)
$ myproxy-info -s adc0024 -d
$ myproxy-destroy -s adc0024 -d
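
A small sketch that renews the proxy only when needed (assuming the -exists/-valid flags of grid-proxy-info behave as in standard Globus installations; the two-hour threshold is just an example):
$ grid-proxy-info -exists -valid 2:00 || grid-proxy-init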

Job Submission
Basic command: edg-job-submit test.jdl
Many, many options, see the WLMS manual for details (Docs for WLMS: http://www.infn.it/workload-grid)
Try the -help option (and the very useful -o option to get the job id in a file)
Tiny JDL file:
Executable = "testjob.sh";
StdOutput = "testjob.out";
StdError = "testjob.err";
InputSandbox = {"./testjob.sh"};
OutputSandbox = {"testjob.out","testjob.err"};
Sample output:
Connecting to host lxshare0380.cern.ch, port 7772
Logging to host lxshare0380.cern.ch, port 9002
================================ edg-job-submit Success =====================================
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your job identifier (edg_jobid) is:
https://lxshare0380.cern.ch:9000/1gmdxnfzed1o0b9bjfc3lw
The edg_jobid has been saved in the following file:
/afs/cern.ch/user/m/markusw/test/demo/out
=============================================================================================
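
The slides do not show the payload itself; a minimal testjob.sh that fits the JDL above could look like this (hypothetical contents, just a sketch):
#!/bin/sh
# print where and as whom the job ran
echo "Running on host: `hostname -f`"
echo "Running as user: `id -un`"
date
Submit it and keep the job id in a file for later commands (jobid.txt is only an example name): edg-job-submit -o jobid.txt test.jdl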

Work Load Management System
The Input Sandbox is what you take with you to the node; the Output Sandbox is what you get back.
Components on and around the RB node: UI, Network Server, Workload Manager, Match-Maker/Broker, Job Adapter, Job Controller (CondorG), RB storage, Logging & Bookkeeping, Log Monitor, Information Service (CE and SE characteristics and status), Replica Catalog.
Job status flow:
submitted: arrived on the RB (the UI has sent the sandbox to the Network Server)
waiting: waiting for matchmaking
ready: matching done, job handed to the Job Adapter
scheduled: queued on a CE
running: on the CE
done: processed
cleared: output back to the user
Failed jobs are resubmitted.

Work Load Management System
The services that bring the resources and the jobs together
Live most of the time on a node called the RB (Resource Broker)
Keeps track of the status of jobs (LB, the Logging and Bookkeeping Service)
Talks to the Globus gatekeepers and local resource managers (LRMS) on the remote sites (CE)
Matches jobs with sites where data and resources are available
Re-submits jobs if they fail
Uses almost all services: IS, RLS, GSI, ...
Walking through a job might be instructive (see the job status flow above)
Docs for the WLMS: http://www.infn.it/workload-grid
The user describes the job and its requirements using JDL (Job Description Language):
[
JobType = "Normal";
Executable = "gridtest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"home/joda/test/gridtest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = {"lfn:green", "guid:red"};
DataAccessProtocol = "gridftp";
Requirements = other.GlueHostOperatingSystemName == "LINUX" && other.GlueCEStateFreeCPUs >= 4;
Rank = other.GlueCEPolicyMaxCPUTime;
]

Where to Run?
Before submitting a job you might want to see where you can run it:
edg-job-list-match <jdl>
Switching RBs: use the --config-vo <vo conf file> and --config <conf file> options (see the User Guide)
Find out which RBs you could use
Sample output:
Connecting to host lxshare0380.cern.ch, port 7772
***************************************************************************
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have been found:
*CEId*
adc0015.cern.ch:2119/jobmanager-lcgpbs-infinite
adc0015.cern.ch:2119/jobmanager-lcgpbs-long
adc0015.cern.ch:2119/jobmanager-lcgpbs-short
adc0018.cern.ch:2119/jobmanager-pbs-infinite
adc0018.cern.ch:2119/jobmanager-pbs-long
adc0018.cern.ch:2119/jobmanager-pbs-short
dgce0.icepp.s.u-tokyo.ac.jp:2119/jobmanager-lcgpbs-infinite
dgce0.icepp.s.u-tokyo.ac.jp:2119/jobmanager-lcgpbs-long
dgce0.icepp.s.u-tokyo.ac.jp:2119/jobmanager-lcgpbs-short
grid-w1.ifae.es:2119/jobmanager-lcgpbs-infinite
grid-w1.ifae.es:2119/jobmanager-lcgpbs-long
grid-w1.ifae.es:2119/jobmanager-lcgpbs-short
hik-lcg-ce.fzk.de:2119/jobmanager-lcgpbs-infinite
hik-lcg-ce.fzk.de:2119/jobmanager-lcgpbs-long
hik-lcg-ce.fzk.de:2119/jobmanager-lcgpbs-short
hotdog46.fnal.gov:2119/jobmanager-pbs-infinite
hotdog46.fnal.gov:2119/jobmanager-pbs-long
hotdog46.fnal.gov:2119/jobmanager-pbs-short
lcg00105.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-infinite
lcg00105.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-long
lcg00105.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-short
lcgce01.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-infinite
lcgce01.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-long
lcgce01.gridpp.rl.ac.uk:2119/jobmanager-lcgpbs-short
lhc01.sinp.msu.ru:2119/jobmanager-lcgpbs-infinite
lhc01.sinp.msu.ru:2119/jobmanager-lcgpbs-long
lhc01.sinp.msu.ru:2119/jobmanager-lcgpbs-short
wn-02-29-a.cr.cnaf.infn.it:2119/jobmanager-lcgpbs-infinite
wn-02-29-a.cr.cnaf.infn.it:2119/jobmanager-lcgpbs-long
wn-02-29-a.cr.cnaf.infn.it:2119/jobmanager-lcgpbs-short
zeus02.cyf-kr.edu.pl:2119/jobmanager-lcgpbs-infinite
zeus02.cyf-kr.edu.pl:2119/jobmanager-lcgpbs-long
zeus02.cyf-kr.edu.pl:2119/jobmanager-lcgpbs-short
***************************************************************************
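
As a quick check of how requirements narrow the list, you can tighten the JDL and match again; a sketch using an attribute that appears later in these slides (the threshold is just an example):
Requirements = other.GlueCEStateFreeCPUs >= 4;
$ edg-job-list-match test.jdl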

And then?
Check the status: edg-job-status -v <0|1|2> -o <file with id>
Many options, play with it; do a -help
--noint for working with scripts
In case of problems: edg-job-get-logging-info (shows a lot of information, controlled by the -v option)
Get the output sandbox: edg-job-get-output (options work on collections of jobs)
Output in /tmp/joboutput/1gmdxnfzed1o0b9bjfc3lw
Remove the job: edg-job-cancel
Getting the output cancels the job; canceling a canceled job is an error
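
For scripting, the job id file written at submission time can be fed back to the other commands; a sketch, assuming the -i and --dir options behave as described in the User Guide (file and directory names are only examples):
$ edg-job-submit --noint -o myjobs.txt test.jdl
$ edg-job-status --noint -i myjobs.txt
$ edg-job-get-output --noint -i myjobs.txt --dir $HOME/joboutput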

Information System
Have a look at the status page to find a BDII
Query the BDII (use an LDAP browser, or the ldapsearch command)
Sample: BDII at lxshare0222.cern.ch
Have a look at the man pages and explore the BDII, regional GIIS, CE and SE
BDII:
ldapsearch -LLL -x -H ldap://lxshare0222.cern.ch:2170 -b "mds-vo-name=local,o=grid" "(objectclass=gluece)" dn
Regional GIIS:
ldapsearch -LLL -x -H ldap://adc0026.cern.ch:2135 -b "mds-vo-name=lcgeast,o=grid" "(objectclass=gluece)" dn
CE:
ldapsearch -LLL -x -H ldap://adc0018.cern.ch:2135 -b "mds-vo-name=local,o=grid"
SE:
ldapsearch -LLL -x -H ldap://adc0021.cern.ch:2135 -b "mds-vo-name=local,o=grid"
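
To pull out specific values rather than full entries, attributes can be listed after the filter; a sketch against the same BDII (attribute names taken from the GLUE schema slides below):
ldapsearch -LLL -x -H ldap://lxshare0222.cern.ch:2170 -b "mds-vo-name=local,o=grid" "(objectclass=gluece)" GlueCEUniqueID GlueCEStateFreeCPUs GlueCEStateWaitingJobs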

GLUE SCHEMA
Appendix B in the LCG-1 User Guide
Many categories; some attributes that are defined might still not be filled
Describes CE, cluster, hosts, SE, batch system, etc.
Too many for this presentation -> see the User Guide

GLUE SCHEMA: Attributes for the Computing Element
CE (objectclass GlueCE)
GlueCEUniqueID: unique identifier for the CE
GlueCEName: human-readable name of the service
Info (objectclass GlueCEInfo)
GlueCEInfoLRMSType: name of the local batch system
GlueCEInfoLRMSVersion: version of the local batch system
GlueCEInfoGRAMVersion: version of GRAM
GlueCEInfoHostName: fully qualified name of the host where the gatekeeper runs
GlueCEInfoGateKeeperPort: port number for the gatekeeper
GlueCEInfoTotalCPUs: number of CPUs in the cluster associated to the CE
Policy (objectclass GlueCEPolicy)
GlueCEPolicyMaxWallClockTime: maximum wall clock time available to jobs submitted to the CE
GlueCEPolicyMaxCPUTime: maximum CPU time available to jobs submitted to the CE
GlueCEPolicyMaxTotalJobs: maximum allowed total number of jobs in the queue
GlueCEPolicyMaxRunningJobs: maximum allowed number of running jobs in the queue
GlueCEPolicyPriority: information about the service priority
State (objectclass GlueCEState)
GlueCEStateRunningJobs: number of running jobs
GlueCEStateWaitingJobs: number of jobs not running
GlueCEStateTotalJobs: total number of jobs (running + waiting)
GlueCEStateStatus: queue status: queueing (jobs are accepted but not run), production (jobs are accepted and run), closed (jobs are neither accepted nor run), draining (jobs are not accepted but those in the queue are run)
GlueCEStateWorstResponseTime: worst possible time between the submission of a job and the start of its execution
GlueCEStateEstimatedResponseTime: estimated time between the submission of a job and the start of its execution
GlueCEStateFreeCPUs: number of CPUs available to the scheduler
Job (currently not filled; the Logging and Bookkeeping service can provide this information) (objectclass GlueCEJob)
GlueCEJobLocalOwner: local user name of the job owner
GlueCEJobGlobalOwner: GSI subject of the real job owner
GlueCEJobLocalID: local job identifier
GlueCEJobGlobalId: global job identifier
GlueCEJobStatus: job status: SUBMITTED, WAITING, READY, SCHEDULED, RUNNING, ABORTED, DONE, CLEARED, CHECKPOINTED
GlueCEJobSchedulerSpecific: any scheduler-specific information
Access control (objectclass GlueCEAccessControlBase)
GlueCEAccessControlBaseRule: a rule defining any access restrictions to the CE. Current semantics: VO = a VO name, DENY = an X.509 user subject
Cluster (objectclass GlueCluster)
GlueClusterUniqueID: unique identifier for the cluster
GlueClusterName: human-readable name of the cluster
Subcluster (objectclass GlueSubCluster)
GlueSubClusterUniqueID: unique identifier for the subcluster
GlueSubClusterName: human-readable name of the subcluster

GLUE SCHEMA
Host (objectclass GlueHost)
GlueHostUniqueId: unique identifier for the host
GlueHostName: human-readable name of the host
Architecture (objectclass GlueHostArchitecture)
GlueHostArchitecturePlatformType: platform description
GlueHostArchitectureSMPSize: number of CPUs
Operating system (objectclass GlueHostOperatingSystem)
GlueHostOperatingSystemOSName: OS name
GlueHostOperatingSystemOSRelease: OS release
GlueHostOperatingSystemOSVersion: OS or kernel version
Benchmark (objectclass GlueHostBenchmark)
GlueHostBenchmarkSI00: SpecInt2000 benchmark
Application software (objectclass GlueHostApplicationSoftware)
GlueHostApplicationSoftwareRunTimeEnvironment: list of software installed on this host
Processor (objectclass GlueHostProcessor)
GlueHostProcessorVendor: name of the CPU vendor
GlueHostProcessorModel: name of the CPU model
GlueHostProcessorVersion: version of the CPU
GlueHostProcessorOtherProcessorDescription: other description for the CPU
GlueHostProcessorClockSpeed: clock speed of the CPU
GlueHostProcessorInstructionSet: name of the instruction set architecture of the CPU
GlueHostProcessorFeatures: list of optional features of the CPU
GlueHostProcessorCacheL1: size of the unified L1 cache
GlueHostProcessorCacheL1I: size of the instruction L1 cache
GlueHostProcessorCacheL1D: size of the data L1 cache
GlueHostProcessorCacheL2: size of the unified L2 cache
Main memory (objectclass GlueHostMainMemory)
GlueHostMainMemoryRAMSize: physical RAM
GlueHostMainMemoryRAMAvailable: unallocated RAM
GlueHostMainMemoryVirtualSize: size of the configured virtual memory
GlueHostMainMemoryVirtualAvailable: available virtual memory
Network adapter (objectclass GlueHostNetworkAdapter)
GlueHostNetworkAdapterName: name of the network card
GlueHostNetworkAdapterIPAddress: IP address of the network card
GlueHostNetworkAdapterMTU: the MTU size for the LAN to which the network card is attached
GlueHostNetworkAdapterOutboundIP: permission for outbound connectivity
GlueHostNetworkAdapterInboundIP: permission for inbound connectivity
Processor load (objectclass GlueHostProcessorLoad)
GlueHostProcessorLoadLast1Min: one-minute average processor availability for a single node
GlueHostProcessorLoadLast5Min: 5-minute average processor availability for a single node
GlueHostProcessorLoadLast15Min: 15-minute average processor availability for a single node

GLUE SCHEMA
SMP load (objectclass GlueHostSMPLoad)
GlueHostSMPLoadLast1Min: one-minute average processor availability for a single node
GlueHostSMPLoadLast5Min: 5-minute average processor availability for a single node
GlueHostSMPLoadLast15Min: 15-minute average processor availability for a single node
Storage device (objectclass GlueHostStorageDevice)
GlueHostStorageDeviceName: name of the storage device
GlueHostStorageDeviceType: storage device type
GlueHostStorageDeviceTransferRate: maximum transfer rate for the device
GlueHostStorageDeviceSize: size of the device
GlueHostStorageDeviceAvailableSpace: amount of free space
Local file system (objectclass GlueHostLocalFileSystem)
GlueHostLocalFileSystemRoot: path name or other information defining the root of the file system
GlueHostLocalFileSystemSize: size of the file system in bytes
GlueHostLocalFileSystemAvailableSpace: amount of free space in bytes
GlueHostLocalFileSystemReadOnly: true if the file system is read-only
GlueHostLocalFileSystemType: file system type
GlueHostLocalFileSystemName: the name for the file system
GlueHostLocalFileSystemClient: host unique id of clients allowed to remotely access this file system
Remote file system (objectclass GlueHostRemoteFileSystem)
GlueHostRemoteFileSystemRoot: path name or other information defining the root of the file system
GlueHostRemoteFileSystemSize: size of the file system in bytes
GlueHostRemoteFileSystemAvailableSpace: amount of free space in bytes
GlueHostRemoteFileSystemReadOnly: true if the file system is read-only
GlueHostRemoteFileSystemType: file system type
GlueHostRemoteFileSystemName: the name for the file system
GlueHostRemoteFileSystemServer: host unique id of the server which provides access to the file system
File (objectclass GlueHostFile)
GlueHostFileName: name for the file
GlueHostFileSize: file size in bytes
GlueHostFileCreationDate: file creation date and time
GlueHostFileLastModified: date and time of the last modification of the file
GlueHostFileLastAccessed: date and time of the last access to the file
GlueHostFileLatency: time taken to access the file in seconds
GlueHostFileLifeTime: time for which the file will stay on the storage device
GlueHostFileOwner: name of the owner of the file

GLUE SCHEMA: Attributes for the Storage Element
Storage Service (objectclass GlueSE)
GlueSEUniqueId: unique identifier of the storage service (URI)
GlueSEName: human-readable name for the service
GlueSEPort: port number that the service listens on
GlueSEHostingSL: unique identifier of the storage library hosting the service
Storage Service State (objectclass GlueSEState)
GlueSEStateCurrentIOLoad: system load (for example, number of files in the queue)
Storage Service Access Protocol (objectclass GlueSEAccessProtocol)
GlueSEAccessProtocolType: protocol type to access or transfer files
GlueSEAccessProtocolPort: port number for the protocol
GlueSEAccessProtocolVersion: protocol version
GlueSEAccessProtocolAccessTime: time to access a file using this protocol
GlueSEAccessProtocolSupportedSecurity: security features supported by the protocol
Storage Library (objectclass GlueSL)
GlueSLName: human-readable name of the storage library
GlueSLUniqueId: unique identifier of the machine providing the storage service
GlueSLService: unique identifier for the provided storage service
Local File system (objectclass GlueSLLocalFileSystem)
GlueSLLocalFileSystemRoot: path name (or other information) defining the root of the file system
GlueSLLocalFileSystemName: name of the file system
GlueSLLocalFileSystemType: file system type (e.g. NFS, AFS, etc.)
GlueSLLocalFileSystemReadOnly: true if the file system is read-only
GlueSLLocalFileSystemSize: total space assigned to this file system
GlueSLLocalFileSystemAvailableSpace: total free space in this file system
GlueSLLocalFileSystemClient: unique identifiers of clients allowed to access the file system remotely
GlueSLLocalFileSystemServer: unique identifier of the server exporting this file system (only for remote file systems)
Remote File system (objectclass GlueSLRemoteFileSystem)
GlueSLRemoteFileSystemRoot: path name (or other information) defining the root of the file system
GlueSLRemoteFileSystemName: name of the file system
GlueSLRemoteFileSystemType: file system type (e.g. NFS, AFS, etc.)
GlueSLRemoteFileSystemReadOnly: true if the file system is read-only
GlueSLRemoteFileSystemSize: total space assigned to this file system
GlueSLRemoteFileSystemAvailableSpace: total free space in this file system
GlueSLRemoteFileSystemServer: unique identifier of the server exporting this file system

GLUE SCHEMA
File Information (objectclass GlueSLFile)
GlueSLFileName: file name
GlueSLFileSize: file size
GlueSLFileCreationDate: file creation date and time
GlueSLFileLastModified: date and time of the last modification of the file
GlueSLFileLastAccessed: date and time of the last access to the file
GlueSLFileLatency: time needed to access the file
GlueSLFileLifeTime: file lifetime
GlueSLFilePath: file path
Directory Information (objectclass GlueSLDirectory)
GlueSLDirectoryName: directory name
GlueSLDirectorySize: directory size
GlueSLDirectoryCreationDate: directory creation date and time
GlueSLDirectoryLastModified: date and time of the last modification of the directory
GlueSLDirectoryLastAccessed: date and time of the last access to the directory
GlueSLDirectoryLatency: time needed to access the directory
GlueSLDirectoryLifeTime: directory lifetime
GlueSLDirectoryPath: directory path
Architecture (objectclass GlueSLArchitecture)
GlueSLArchitectureType: type of storage hardware (i.e. disk, RAID array, tape library, etc.)
Performance (objectclass GlueSLPerformance)
GlueSLPerformanceMaxIOCapacity: maximum bandwidth between the service and the network
Storage Space (objectclass GlueSA)
GlueSARoot: pathname of the directory containing the files of the storage space
Policy (objectclass GlueSAPolicy)
GlueSAPolicyMaxFileSize: maximum file size
GlueSAPolicyMinFileSize: minimum file size
GlueSAPolicyMaxData: maximum allowed amount of data that a single job can store
GlueSAPolicyMaxNumFiles: maximum allowed number of files that a single job can store
GlueSAPolicyMaxPinDuration: maximum allowed lifetime for non-permanent files
GlueSAPolicyQuota: total available space
GlueSAPolicyFileLifeTime: lifetime policy for the contained files
Access Control Base (objectclass GlueSAAccessControlBase)
GlueSAAccessControlBaseRule: list of the access control rules
State (objectclass GlueSAState)
GlueSAStateAvailableSpace: total space available in the storage space
GlueSAStateUsedSpace: used space in the storage space

More on JDL
Based on Condor ClassAds syntax (the parser is very sensitive)
Simple statements: attribute = value;
Arguments = "1 2 3 -wall"; (passes arguments to the executable)
The input sandbox can handle wildcards like * and ?
Environment = {"DTEAM_PATH=$HOME/dteam", "TEAM=dteam"};
OutputSE = "adc0021.cern.ch"; (selects the job to run close to this SE)
Example:
[
InputSandbox = {"home/joda/test/gridtest", "/tmp/test/*"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = {"lfn:green", "guid:red"};
DataAccessProtocol = {"file", "gridftp"};
Requirements = other.GlueHostOperatingSystemName == "LINUX" && other.GlueCEStateFreeCPUs >= 4 && Member("alice3-4", other.GlueHostApplicationSoftwareRunTimeEnvironment);
Rank = other.GlueCEStateFreeCPUs;
MyProxyServer = "wn-02-36-a.cr.cnaf.infn.it";
RetryCount = 7;
]

More on JDL
OutputData specifies where output files should go
If no LFN is specified, WP2 (the data management software) selects one
If no SE is specified, the close SE is chosen
At the end of the job the files are moved from the WN and registered
A file with the result of this operation is created and added to the sandbox: DSUpload_<unique jobstring>.out
OutputData = {
[
OutputFile = "toto.out";
StorageElement = "adc0021.cern.ch";
LogicalFileName = "thebesttotoever";
],
[
OutputFile = "toto2.out";
StorageElement = "adc0021.cern.ch";
LogicalFileName = "thebesttotoever2";
]
};

Data
Users should use the high-level tools (references and details: User Guide for LCG and WP2)
Avoid globus-url-copy and the edg-gridftp-X tools, except maybe for X = exists, ls, mkdir
The edg-replica-manager tool (edg-rm) allows you to:
move files around (UI->SE, WN->SE)
register files in the RLS
replicate them between SEs
Many options: -help + documentation
Move a file from the UI to an SE. Where to?
edg-rm --vo=dteam printInfo
edg-rm --vo=dteam copyAndRegisterFile file:`pwd`/load -d srm://adc0021.cern.ch/flatfiles/se00/dteam/markus/t1 -l lfn:markus1
guid:dc9760d7-f36a-11d7-864b-925f9e8966fe is returned
The hostname alone is sufficient for -d (without -d the RM decides where to go)

Data
Replicate a file to another SE (the guid is needed):
edg-rm --vo=dteam replicateFile guid:dc9760d7-f36a-11d7-864b-925f9e8966fe -d wn-02-30-a.cr.cnaf.infn.it
To list replicas:
edg-rm --vo=dteam listReplicas guid:dc9760d7-f36a-11d7-864b-925f9e8966fe
To delete replicas use: deleteFile guid:xxx -s se.cern.ch
To find all aliases of a file:
First: edg-rm -i --vo=dteam listGUID lfn:mm2 -> returns the guid
Then: edg-rmc -i aliasesForGuid -h rlsdteam.cern.ch -p 7777 --vo=dteam <guid>
Listing an SE directory:
edg-rm -i --vo=dteam list srm://adc0021.cern.ch/flatfiles/se00/dteam/markus/ (broken)
Use instead:
edg-gridftp-ls --verbose gsiftp://adc0021.cern.ch/flatfiles/se00/dteam/markus
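
Putting the last two slides together, a typical upload/replicate/inspect sequence might look like this sketch (host names are the examples used above; the file name and lfn are only placeholders, and the guid is whatever copyAndRegisterFile returns):
$ edg-rm --vo=dteam copyAndRegisterFile file:`pwd`/myfile -d adc0021.cern.ch -l lfn:myfile1
(note the returned guid, e.g. guid:dc9760d7-...)
$ edg-rm --vo=dteam replicateFile guid:<returned guid> -d wn-02-30-a.cr.cnaf.infn.it
$ edg-rm --vo=dteam listReplicas guid:<returned guid>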

File access from a job
The WLMS (RB) creates the .BrokerInfo file and moves it to the WN
This is used to answer questions about the site you are on
Get the first input file name (uses .BrokerInfo):
infile=`edg-brokerinfo getInputData | cut -d ' ' -f 1`
Get the first close SE:
closese=`edg-brokerinfo getCloseSEs | cut -d ' ' -f 1`
Get the TURL:
TURL=`edg-rm --vo=dteam gbf $infile -d $closese -t file`
Get the local file name:
localfile=`echo $TURL | cut -d : -f 2`
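
A sketch of how these pieces could sit together in a job wrapper script (assuming the same edg-brokerinfo and edg-rm behaviour as above; the analysis program and its output file are hypothetical):
#!/bin/sh
# locate the first input file registered for this job
infile=`edg-brokerinfo getInputData | cut -d ' ' -f 1`
# pick the first close SE and ask the replica manager for a local (file:) TURL
closese=`edg-brokerinfo getCloseSEs | cut -d ' ' -f 1`
turl=`edg-rm --vo=dteam gbf $infile -d $closese -t file`
localfile=`echo $turl | cut -d : -f 2`
# run the (hypothetical) analysis program on the local copy
./myanalysis $localfile > analysis.out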