EUFORIA GRID INFRASTRUCTURE STATUS REPORT DSA1.1

Document Filename: EUFORIA-DSA1.1-v1.0-CSIC
Activity: SA1
Partner(s): CSIC, FZK, PSNC, CHALMERS
Lead Partner: CSIC
Document classification: PUBLIC

Abstract: The EUFORIA project is deploying and operating a production grid-empowered infrastructure providing significant computing resources to fusion researchers in Europe. The added value of the infrastructure lies in its capabilities, which are focused on supporting fusion applications. This document describes the status of the infrastructure at month 6 and the plans for further improvements.

EUFORIA Contract 211804

Document Log
Version  Date        Summary of changes               Author
0.1      15/06/2008  Initial draft version            Isabel Campos
0.2      18/06/2008  Savannah and Autobuild included  Marcus Hardt
1.0      18/06/2008  Executive summary added          Isabel Campos

Contents
1. EXECUTIVE SUMMARY
2. INFRASTRUCTURE CONTEXT AND OBJECTIVES
   2.1. CORE SERVICES ARCHITECTURE
   2.2. SITE DESCRIPTION AND ARCHITECTURE
   2.3. AUTHENTICATION, AUTHORIZATION AND VIRTUAL ORGANIZATION
   2.4. MONITORING
   2.5. COMPATIBILITY AND INTEROPERABILITY
3. DEVELOPMENT AND SUPPORT TOOLS
   3.1. WIKI PAGES
   3.2. SAVANNAH SERVER
4. INFRASTRUCTURE CAPABILITIES
   4.1. SUBMISSION OF PARALLEL JOBS
   4.2. SUBMISSION OF INTERACTIVE JOBS
5. APPENDIX: SUBMISSION OF JOBS USING MIGRATING DESKTOP
   5.1. STARTING MIGRATING DESKTOP
   5.2. JOB SUBMISSION
   5.3. EXAMPLE: SUBMISSION OF AN OPENMPI JOB

1. EXECUTIVE SUMMARY

The project Euforia is one of the FP7 projects funded with the general objective of facilitating the integration and induction of new user communities into the e-Infrastructures newly developed in the course of FP5 and FP6. This is certainly the case of computational facilities based on distributed computing on grids. In Euforia the work package SA1 is responsible for this integration process at the level of the grid infrastructure.

The first step towards the integration is to provide researchers with an infrastructure and a set of associated services that fit the requirements of that particular user community, in our case the fusion modelling community, especially in what concerns the modelling activities towards ITER.

The first release of the Euforia grid infrastructure is based on a model of interoperation between grids. For this purpose we have created a Virtual Organization (VO) dedicated to the project, which has been deployed on the sites that contribute to the grid infrastructure. In effect, the Euforia VO acts as a link between the sites supporting the project activities in a VO-oriented model. At the level of resources we are currently using resources from the project Interactive European Grid. The set of services that we have installed are those contained in glite (which are basically oriented towards running serial batch jobs and data management) plus the middleware developments of Interactive European Grid, which allow users to run MPI parallel jobs and start interactive sessions on their local desktop on remote grid-enabled clusters. We have also deployed several Roaming Access Servers dedicated to the users of Migrating Desktop, which will be used in Euforia to make the connection between the workflow manager Kepler and the grid infrastructure. In order to gather resources for basic serial batch jobs we have also installed support for the Fusion VO of EGEE on our user interfaces.

Our services are a superset of glite, enhanced with MPI parallel support and interactivity, both added at the request of the code developers. So far the grid resources supporting the Euforia VO are of the order of 1000 CPUs with more than 2 TB of online storage. We expect the size of the resources to increase during the next year with the incorporation of resources coming from Euforia partners and related projects.

On another level we have also started to deploy web tools to enhance communication channels and work towards community building activities. This includes the installation of a wiki service and a software repository, based on Savannah, to host the Euforia codes.

Further details about the infrastructure and middleware referred to in the text can be found at these links:

Project Interactive European Grid (i2g): http://www.interactive-grid.eu
  o i2g middleware: https://wiki.fzk.de/i2g/index.php/repositories
Project Enabling Grids for E-sciencE: http://www.eu-egee.org
  o glite middleware: https://www.cern.ch/glite
Migrating Desktop: http://desktop.poznan.pl
Savannah server in Euforia: https://wiki.fzk.de/euforia/index.php/savannah_howto

2. INFRASTRUCTURE CONTEXT AND OBJECTIVES

The objective of this activity is to provide an advanced grid-empowered infrastructure for scientific computing targeted at supporting the fusion modelling activities of the Euforia project. Currently, at M6, the Grid Infrastructure Operations activity has deployed the initial testbed and utilities dedicated to the project. The initial testbed is made from resources from CSIC and FZK, which have also deployed the central services necessary to ensure the integration of the computing resources into a production grid infrastructure. The resources mainly come from the project Interactive European Grid, as Euforia provides an ideal exploitation area in what concerns outreach to new user communities.

To achieve these goals the EGEE/gLite middleware was adopted to provide the basic grid services required. On top of a regular glite deployment we have installed the middleware developed by the project Interactive European Grid (i2g). The developments of i2g make it possible to support MPI parallel and interactive applications in a grid environment, support which is currently lacking in infrastructures based on plain glite such as EGEE. Our infrastructure middleware is thus a superset of the EGEE/gLite middleware, i.e. complemented with the i2g middleware and services, meaning that the Euforia resources and infrastructure are compatible with EGEE-like infrastructures.

2.1. CORE SERVICES ARCHITECTURE

Two sites, CSIC and FZK, provide the required core services, ensuring increased reliability through redundancy. These services include resource brokers, file catalogues, virtual organization management services, certificate repositories, the top information index and roaming access server support. The current set of core services is:

- Resource Broker (RB)
- Top Berkeley Database Information Index (BDII)
- Logical File Catalogue (LFC)
- MyProxy server (MyProxy)
- Virtual Organizations Management System (VOMS)
- Roaming Access Server (RAS)

The resource broker is responsible for receiving job requests described in the JDL language, optionally accompanied by sets of small input files that are transferred with the job using a method known as the input sandbox. The RB performs the matchmaking between the job requirements described in the JDL specification and the available computing elements (CE), taking into account their capabilities and availability. If the matchmaking is successful, the RB is then responsible for submitting the job to the selected CE. Once the job is submitted, the RB follows the job status until the job finishes or fails. If needed, the RB can also resubmit the job upon a CE job submission failure. The RB is complemented by the logging and bookkeeping service (LB), which gathers information about the status of the job across the several steps of its lifecycle. Although the LBs can be decoupled from the RBs, they are installed on the same physical systems hosting the RBs. Jobs are submitted to an RB through a simple client known as the user interface (UI). The UI code can be installed on a wide range of platforms.

The top BDII is an LDAP server with a Berkeley database backend that is populated with information collected from the site information systems. The top BDII caches the whole infrastructure information system and provides fast, reliable access to infrastructure information that is used to discover sites, services and their status. This information is then used by services such as the RB for matchmaking and other purposes.
Two top BDII systems have been established for fault tolerance purposes.
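For reference, the information published by the top BDII can be inspected directly with a standard LDAP query. The following is only an illustrative sketch: it assumes the CSIC information index listed in Table I (i2gii01.ifca.es), the conventional top-BDII port 2170, and a lowercase VO name "euforia".

# Query the top BDII for computing elements advertising support for the euforia VO
ldapsearch -x -H ldap://i2gii01.ifca.es:2170 -b o=grid \
  '(&(objectClass=GlueCE)(GlueCEAccessControlBaseRule=VO:euforia))' \
  GlueCEUniqueID GlueCEStateFreeCPUs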

The LFC is a file catalogue developed for the CERN LHC project that stores the locations of files and their replicas across multiple distributed storage elements (SE) located at remote sites. At each grid site we have installed one storage element providing local storage capacity and supporting the needs of the applications running at the local cluster.

MyProxy is a proxy certificate repository that enables users to safely store long-lived proxy certificates in a server from which shorter-lifetime proxy certificates can then be obtained for use in the grid. The service is used to enable the RBs to renew the user authentication credentials on the user's behalf when the initial proxy submitted with the job expires.

VOMS is an authentication service that enables the concept of virtual organizations through the management of VO memberships, groups and roles. In this sense a VO is composed of a set of users and services that accept being accessed by the members of the set. The VOMS system implements this concept by providing a mechanism that allows encoding the VO name inside the proxy certificate, which is then used by the VO member to access the grid resources. The VOMS server also contains a membership management service that enables users to apply for VO membership. A special user called the VO manager can then list the membership requests and approve or deny them. Once a user is registered as a VO member, he or she can use a command to contact the VOMS server and obtain a certificate proxy containing the VOMS extensions that assert his or her membership and rights. This proxy can then be used to access the VO grid resources. Most grid services in glite are VOMS-enabled, meaning that they can use the VOMS information inside the proxies to enforce authorization and resource access policies.

Site name   Core service type       Node                       Status and comments
LIP         Top Information Index   i2g-ii01.lip.pt            Operational
CSIC        CrossBroker             i2grb01.ifca.es            Operational
CSIC        RAS server              i2gras01.ifca.es           Operational
LIP         VOMS server             i2g-voms.lip.pt            Operational
FZK         RAS server              iwrras2.fzk.de             Operational
FZK         CrossBroker             iwrrb.fzk.de               Operational
CSIC        Information Index       i2gii01.ifca.es            Operational
CSIC        MyProxy server          i2gpx01.ifca.es            Operational
CSIC        VOMS server             i2gvoms01.ifca.es          Operational
CSIC        LFC file catalogue      i2glfc01.ifca.es           Operational
CESGA       R-GMA server            rgma-server.i2g.cesga.es   Operational, Monitoring

Table I. Core services of the Euforia project. Note that some are maintained as part of the exploitation of the Interactive European Grid project.

The RAS server is a development of the PSNC group aimed at providing a web-services-based layer between the LCG grid services and grid access tools such as web-based portals or graphical user interfaces. Its main purpose is to support the Migrating Desktop (MD), a Java-based environment that provides user-friendly access to grid resources through a graphical interface. The RAS services provide the interface between the MD and the grid infrastructure, enabling users to access the grid infrastructure from desktop computers or laptops running diverse operating systems (MS Windows, Linux, MacOS, Solaris, etc.). The RAS also provides key services to support visualization and interactivity. The RAS architecture includes components responsible for job submission, job monitoring, user profile management, data management, authorisation, and application information management. The maintenance of a few RAS servers in the project is crucial in order to provide the interface with the Kepler workflow manager.
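As an illustration of how these core services are used from a user interface, the commands below store a long-lived credential in the MyProxy server and browse the LFC namespace. This is only a sketch: the host names are the CSIC ones from Table I, and the /grid/euforia catalogue path is an assumption based on the usual /grid/<vo> naming convention.

# Store a long-lived credential that the RB can later use for proxy renewal
myproxy-init -d -n -s i2gpx01.ifca.es

# Point the data management tools at the Euforia LFC and list the VO namespace
export LFC_HOST=i2glfc01.ifca.es
lfc-ls -l /grid/euforia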

2.2. SITE DESCRIPTION AND ARCHITECTURE

Currently there are two sites in production (CSIC and FZK) and one in the installation phase (Chalmers University). Moreover, we have received expressions of interest from ENEA to install support for the Euforia VO on the Linux cluster dedicated to the ITM group, the Gateway cluster.

The architecture of the sites is that of any glite-based infrastructure. Each site provides a computing element (CE) that interfaces with a local cluster composed of systems dedicated to providing computing capacity, designated worker nodes (WN); the interface with the local storage systems is ensured by at least one storage element (SE). Each site also needs to provide a system to collect and publish monitoring and accounting information, designated the mon-box. The typical site architecture can be seen in Figure 1. Only the CE, SE and mon-box run services that need inbound connectivity from the Internet. The worker nodes have private IP addresses and usually sit behind NAT firewalls; they do, however, require outbound connectivity. The interface between the computing element and the worker nodes is handled through an LRMS (local resource management system) such as PBS (Portable Batch System) or SGE (Sun Grid Engine).

The CE publishes the hardware architecture of the WNs behind it. Since each CE publishes only a single architecture, this has an important impact on sites serving machines with multiple architectures, for which a separate CE has to be established. Among other information, the CE publishes the installed software run-time environment, network connectivity and the list of supported VOs. It is very important that this static information is kept up to date by the site manager; in particular, it provides the information regarding the intra-cluster network available at every site, which is particularly important when it comes to running MPI applications requiring low-latency networks.

Figure 1. Typical site architecture: the CE, SE and mon-box are reachable from the Internet, while the worker nodes (WN 1-3) sit on the local network.

Site name   Cores   Architecture     Online Storage   Intranet                Bandwidth to NREN   Status
CSIC        400     Xeon             1 TB             Infiniband / GbitEth.   1 Gbit/s            Operational
FZK         768     Opteron / Xeon   320 GB           GigabitEth.             1 Gbit/s            Operational
CHALMERS    34      Xeon             200 GB           GigabitEth.             1 Gbit/s            Expected M9

Table II. EUFORIA Grid Infrastructure description at M6.
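A quick way to check which sites currently publish resources for the project is the lcg-infosites tool shipped with glite. A minimal sketch, again assuming the VO is registered under the lowercase name "euforia", is:

# List the computing and storage elements that advertise support for the euforia VO
lcg-infosites --vo euforia ce
lcg-infosites --vo euforia se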

2.3. AUTHENTICATION, AUTHORIZATION AND VIRTUAL ORGANIZATION

The infrastructure authorization system is based on VOMS (Virtual Organizations Management System), initially developed by DataGrid and incorporated into glite and LCG. This is a powerful mechanism that enables the central management of virtual organization membership, effectively controlling access to the virtual organizations' resources. This system is the key component of the infrastructure in terms of security and also provides a mechanism to enable the management and deployment of VO-specific software.

We are using the central VOMS server of the project Interactive European Grid, which has the advantage of being replicated at two different sites, thereby improving the reliability of the whole infrastructure; we recall that the VOMS server is a single point of failure in a glite-based infrastructure. On this server we have created the EUFORIA VO.

Figure 2. Users currently enrolled in the Euforia VO come from FZK, IPP, CSIC and CIEMAT.
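Once a membership request has been approved, obtaining VOMS-extended credentials is a single command on the user interface. The sketch below uses the standard glite client commands and assumes the lowercase VO name "euforia":

# Create a VOMS proxy asserting membership of the Euforia VO
voms-proxy-init --voms euforia

# Inspect the proxy, including the VOMS attributes it carries
voms-proxy-info --all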

2.4. MONITORING

The monitoring of the resources is currently done using the GridICE services of the i2g infrastructure located at CESGA. GridICE uses the Nagios monitoring platform, and therefore a Nagios daemon has to be present on the GridICE server. The nagios user executes the Perl script that actually checks every site, so that the values can be updated throughout the entire database.

Figure 3. Image of the GridICE server for the Euforia VO.

We also check that the sites are well configured and that any operation executed at them will work correctly. Service Availability Monitoring (SAM) is the mechanism currently used in the i2g infrastructure to check the different services at each site. The site administrators can be sure, by checking the web pages at https://sam.cyfronet.pl/i2g-sam, that their sites are suitable to run grid applications, to replicate data, etc. This page shows the different parameters for querying the services. It is possible to query a service in different regions for a selected VO, in particular for Euforia, and to sort the result by Region, Site or Node name. For each service there are different tests, and each site service will show one of several states: NA (the service for the selected site and VO is not being tested), MAIN (the site is in scheduled downtime), and then from OK (everything works fine) to CRIT (the site has a critical error in one of the tests).

2.5. COMPATIBILITY AND INTEROPERABILITY

The EUFORIA infrastructure aims to be interoperable with other European grid infrastructures. This is a fundamental objective to ensure compatibility with the widest possible range of resource providers in order to improve usability. The compatibility requirements are closely related to interoperability with glite and other middleware and software components, as well as with the most commonly used hardware platforms. Interoperability with the EGEE infrastructure must be ensured, and therefore the developed software must be compatible with the glite middleware framework. However, one has to keep in mind that the architectural developments in the project must be done in a way that is not affected by the future path and development directions of glite. The key point is therefore to keep the work within standards.

Middleware releases must be available for the hardware and operating system combinations necessary to support the initial set of sites. Additionally, the middleware must be portable to the widest possible range of platforms.

Because of our interoperability developments in the context of i2g, we can configure our sites to support virtual organizations belonging to different infrastructures, in particular EGEE (the details are explained in https://wiki.fzk.de/i2g/index.php/inter-operability_betweem_i2g_and_egee). The user interfaces of the project Euforia are configured to support users belonging to the Fusion VO of EGEE (http://grid.bifi.unizar.es/egee/fusion-vo): users of the Fusion VO can run on the Euforia clusters. Conversely, if a user wants to submit batch jobs to the EGEE clusters supporting the Fusion VO, the only thing needed is to start a proxy for the Fusion VO instead of the Euforia VO. We find this approach to be the most reasonable one for optimal resource sharing.

The use of the Euforia VO is mandatory in two situations:

- running MPI or interactive jobs;
- usage of middleware developments of the Euforia project.

The submission of bulk batch jobs, on the other hand, can be handled either by the Euforia VO or by the Fusion VO.
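Since plain batch jobs are accepted under either VO, a minimal sequential job description is enough to use both resource pools. The JDL below is an illustrative sketch (the executable and file names are hypothetical); after creating a proxy for the chosen VO, it can be submitted with the plain glite commands, e.g. edg-job-submit serial-job.jdl.

JobType = "normal";
Executable = "run_transport.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"run_transport.sh"};
OutputSandbox = {"std.out","std.err"};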

3. DEVELOPMENT AND SUPPORT TOOLS

One of the main tasks in the first six months of the project has been to provide support for the EUFORIA Virtual Organization and a support channel for end users to achieve their goals and contribute to a successful usage of the infrastructure. In this section we describe the main steps taken in this direction.

3.1. WIKI PAGES

The SA1 activity has deployed a wiki server containing information for newcomers. This includes basic conceptual explanations and step-by-step descriptions of how to become a user of the Euforia infrastructure by joining the Euforia VO. The entry point to the wiki is https://wiki.fzk.de/euforia

There is a user induction page at https://wiki.fzk.de/euforia/index.php/user_induction where new users are tutored through the process, from obtaining a digital certificate to submitting jobs to the infrastructure. A user guide is also available online, with many practical examples and a tutorial for beginners: https://wiki.fzk.de/i2g/index.php/i2g_user_guide

The wiki pages are open to everybody; the subset of protected pages can be read by everybody, but only subscribed users have write permission. The smallest subset, the hidden pages, can only be seen and changed by subscribed users. All members of the project are entitled to an account on the wiki and can use this tool to exchange experiences and information with the other users. The first example has been the development and porting of the code Eirene, which is fully documented at https://wiki.fzk.de/euforia/index.php/eirene

3.2. SAVANNAH SERVER

The SA1 activity has provided a Savannah server for application development, located at https://savannah.fzk.de/projects/euforia. This type of service has, over the past five years, grown to be a well-known tool within open source projects. The service runs on a standard LAMP (Linux, Apache, MySQL, PHP, Perl) system. The software package we use is Savane version 2.0. It is licensed under the GPL and has the same roots as the commercial SourceForge variant. The use of third-party open source software is important with regard to sustainability. Below we detail the most important features that are intended to be used during the project.

All the codes to be implemented on the Grid in the Euforia project are handled using the software repository tools of Savannah. In this way the codes have a dedicated project space on our Savannah server. The list of codes currently maintained is the following (follow the symlink to reach the project page):

- BIT1
- CENTORI
- EIRENE
- elmfire
- ERO
- Esel
- Euforia general page and Euforia Visualization Integration (JRA4)
- GEM
- GENE
- ISDEP
- SOLPS
- Tyr

The Software Repository is the central point of code exchange from the developer's point of view. The basic function of a software repository is to keep track of changes made to the source code by the developers. One key feature is the integrated conflict resolution, which enables several developers to work on the same files at the same time. SVN is the state-of-the-art method that the project agreed to use. To improve its usability a web frontend has been installed; it provides easy access to the sources and easily accessible information about who made which changes. For everyday work a wealth of tools is available for most operating systems, including even Windows.

The user management at savannah.fzk.de holds the username, the user password and the user's public SSH keys. This is used to grant access to savannah.fzk.de and wiki.fzk.de, and to log in to the grid nodes at FZK.

Discussion fora are available. They are a hybrid between mailing lists and online fora: the idea is to provide easy-to-use mailing lists with the additional feature of using them in a forum-like way. This combines the advantages of a web GUI (such as easy subscription to lists and seeing who else is online) with the ubiquitous mailing list tool. A spam filter and virus scanner are included, as well as authentication against registered email addresses and against the Savannah membership.

The bug trackers are essential for distributed software development. Typically the developers of the software are not located in the same place and cannot meet often. Furthermore, the people responsible for software installation and the users of the software are again located elsewhere across Europe. The bug trackers provide an online ticketing system that tracks problems in the software. This is typically the best way for users to report problems, and it helps remind developers about a problem until it is closed.

Autobuild: currently only for Eirene, we have set up an autobuild service at https://savannah.fzk.de/autobuild/euforia. The aim is to provide a standardised build environment as a reference system, so that future builds can be reproduced easily. Furthermore, interoperation with the package repository, from which all connected grid hosts retrieve their i2g-related middleware packages and updates, is guaranteed.
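For day-to-day work a developer typically checks out a working copy of a hosted code over SVN. The commands below are purely illustrative: the repository URL and layout (project path and trunk directory) are assumptions, not the actual structure of the Euforia Savannah server.

# Check out a working copy of a hosted code (repository path is hypothetical)
svn checkout https://savannah.fzk.de/svn/eirene/trunk eirene
cd eirene
# ... edit sources ...
svn status                           # review local changes
svn commit -m "describe the change"  # requires a registered Savannah account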

4. INFRASTRUCTURE CAPABILITIES

Euforia applications are typically already running on clusters and supercomputers, and the migration to the grid is motivated by the need to gather more resources; this is the case, for example, of the code Eirene. Here the support is related to porting code that is already running on another platform, and to translating some cluster-like expectations into what the user will encounter on the Grid. This is particularly so when it comes to dealing with input/output handling.

The infrastructure provides support for the submission of batch jobs in the same way as EGEE does with plain glite middleware in other areas of science, in particular fusion in the context of the Fusion VO. Beyond plain glite lies the support for parallelism and interactivity.

MPI parallel applications on grid infrastructures are especially interesting. Here the user encounters a number of extra issues due to the lack of homogeneity of the clusters that make up the distributed infrastructure. We have installed the MPI support offered by the i2g infrastructure middleware, based on OpenMPI, which has the advantage that each individual cluster belonging to the grid infrastructure can execute parallel jobs in a way completely equivalent to what the user expects in a classical cluster environment.

Interactivity is a crucial point, since the user cannot wait for hours in front of a workstation, but only minutes or even seconds, depending on the problem size. We find interactivity to be a very useful tool for researchers' everyday activities, for example at the code development stage. Evidently, the possibility of immediately allocating a large amount of resources to a particular job is an advantage that can be exploited on grid infrastructures, since it can be offered as a solution beyond the classical batch job processing of clusters and supercomputers.

4.1. SUBMISSION OF PARALLEL JOBS

The first step to be taken by the user is the compilation of the source code. There are essentially two options: compiling on a User Interface belonging to the int.eu.grid infrastructure, or compiling on the local machine.

Compiling on the grid infrastructure. In principle it would be possible to submit a grid job that compiles the application. Nevertheless, one has to be aware that most worker nodes do not have a compiler and/or all the necessary development files of the different libraries installed. Therefore this approach should be used with care.

Compiling on the grid User Interface. As part of our effort to support new users, we provide accounts on User Interface machines which have the required compilers and libraries installed in the versions used on the worker nodes. Since these are currently being updated from SL3 to SL4, some sites offer both versions of the User Interface. The autobuild service essentially runs on a User Interface.

Compiling on your local machine. You can compile the application on your local machine as long as you pay attention to possible library version conflicts. To avoid such problems in advance, one possibility is to install the same Scientific Linux distribution as used in the grid infrastructure (e.g. on a virtual machine) and install the MPI RPMs of int.eu.grid. In any case, the deployment of MPI RPMs has not been observed to be a source of problems at this level.

The available compilers for OpenMPI are described in the following table:

Language   Wrapper compiler
C          mpicc
C++        mpic++ or mpicxx
F77        mpif77
F90        Intel Fortran Compiler, when available

Table III. Compilers available for OpenMPI.

We decided to go beyond the Fortran compiler support available in glite (F77) due to the large number of Fortran 90 codes among the project applications.

After compilation, and before the application can be successfully executed, it is sometimes necessary to perform some additional actions such as downloading input and configuration files. The user is able to provide a shell script function that is executed before the MPI job is started. To use a user-specified pre-run hook, the user must perform the following actions (a minimal sketch is given at the end of this section):

1. Create a shell script that contains at least a shell script function called "pre_run_hook".
2. Add the shell script with the callback function to the InputSandbox of the JDL file.
3. Tell the job script to use the callback function by setting the "I2G_MPI_PRE_RUN_HOOK" variable in the Environment of the JDL file.

More information and practical examples can be found on our user wiki pages at https://wiki.fzk.de/i2g/index.php/hook_cookbook

After compilation, the application can be run using the same approach as a sequential application, with the following exceptions:

- The JobType must be set to "Parallel".
- The JobSubType must be set to "openmpi".
- The NodeNumber tag must be set to the number of processes you want (the right name for this tag would be ProcessNumber, but for historical reasons it is NodeNumber).

Here is an example file for an MPI parallel application. The application requests 24 processes for an executable which has been previously compiled according to the instructions above:

JobType = "Parallel";
JobSubType = "openmpi";
Executable = "IMB-MPI1";
Arguments = "pingpong";
NodeNumber = 24;
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
InputSandbox = {"IMB-MPI1"};

The job is then submitted and handled with the suite of commands i2g-job-*, an enhanced version of the edg-job-* commands from glite. As in glite, where the size permits, the job input/output data will be transmitted by the Resource Brokers in sandboxes. In the more general case, however, input/output will be handled using the pre- and post-run hook mechanism.
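As mentioned above, a minimal pre-run hook might look like the following sketch. It is illustrative only: the file names and the lcg-cp source are hypothetical; only the function name pre_run_hook and the I2G_MPI_PRE_RUN_HOOK variable come from the instructions above.

# hooks.sh -- shipped in the InputSandbox together with the executable
pre_run_hook () {
  echo "pre_run_hook: fetching input data before the MPI run"
  # Example: copy a (hypothetical) input file registered in the catalogue
  # into the working directory of the job
  lcg-cp --vo euforia lfn:/grid/euforia/inputs/pingpong.cfg file://$PWD/pingpong.cfg
  return 0
}

The corresponding JDL would then carry the hook script and point the job to it:

InputSandbox = {"IMB-MPI1", "hooks.sh"};
Environment = {"I2G_MPI_PRE_RUN_HOOK=hooks.sh"};

Assuming the i2g-job-* suite mirrors the edg-job-* command names, the job would then be submitted and followed with, for example:

i2g-job-submit job.jdl
i2g-job-status <jobID>
i2g-job-get-output <jobID>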

4.2. SUBMISSION OF INTERACTIVE JOBS

In the batch approach the output analysis takes place after job completion, which on many occasions leads to a waste of computing time, because the researcher cannot modify the job parameters as the simulation goes on without restarting the job. In order to add interactivity support to the Euforia testbed we have installed the i2glogin middleware. Let us briefly explain how to work in an interactive environment in Euforia.

In an interactive application there is a flow of data between the user and the executing program. This allows the user to control and steer the program by sending new data using the mouse and keyboard as input devices. The way interactivity is achieved is described in the following example. In order to start an interactive session the user should start an instance of i2glogin locally. This will allocate a listening TCP port which will be used as the communication channel:

[user@i2gui01 InteractiveJobs]$ i2glogin -p 21015:193.136.90.35

The output will be used as the argument of the InteractiveAgentArguments field inside the JDL describing the job. This is an example JDL descriptor:

JobType = "normal";
Interactive = True;
Executable = "/bin/sh";
Arguments = "";
InteractiveAgent = "i2glogin";
InteractiveAgentArguments = "-r -p 21015:193.136.90.35 -t -c";

The submission of this JDL produces a shell on a worker node with a private IP address by routing the traffic over a node with a public IP. The traffic routing is transparent to the user and to the application. The forwarding of X11 traffic is also supported, which is important for interactive grid applications using a graphical interface. By forwarding the X11 traffic of the application to the user interface, the user can steer it interactively without having to change the application itself. The case of interactive MPI is similar, but the interactive connection is then established with the master process of the MPI job (a sketch is given below, after Figure 4). The functionality of i2glogin can also be used with Migrating Desktop; in particular, the server part of i2glogin runs on the RAS server.

Figure 4. Interactive channel connection using Migrating Desktop.
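For completeness, the sketch below shows how the parallel and interactive attributes used above might be combined for an interactive OpenMPI job. It is an illustrative assumption, not taken verbatim from the i2g documentation; the port/IP pair would be the one returned by the local i2glogin instance.

JobType = "Parallel";
JobSubType = "openmpi";
NodeNumber = 8;
Interactive = True;
Executable = "IMB-MPI1";
Arguments = "pingpong";
InteractiveAgent = "i2glogin";
InteractiveAgentArguments = "-r -p 21015:193.136.90.35 -t -c";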

5. APPENDIX: SUBMISSION OF JOBS USING MIGRATING DESKTOP

Migrating Desktop (MD) will be used in the context of Euforia to make the connection between the workflow manager Kepler and the grid infrastructure. Therefore, as part of the SA1 activity, we are maintaining two RAS servers, plus one for development which is installed at the developers' site at PSNC. In this section we describe the main steps to be taken by a user to start the MD plug-in and submit jobs to the Euforia grid infrastructure.

The software requirements to run Migrating Desktop are very minor, since any computing system able to run the Java Runtime Environment can start MD. Hardware-wise, the minimal requirements are 256 MB of RAM and any processor equivalent in performance to an Intel P4 at 2 GHz. Network analysis shows that MD can be used with a bandwidth of about 2 Mb/s. The use of Migrating Desktop requires access to some ports in order to perform certain actions:

Functionality                   Protocol    Ports/Port range
Access to RAS server            SSL, HTTP   8443, 8080
File transfer to/from SE        N/A         13000-17000, 2811
Interactivity                   N/A         5800-5802, 5901
Interactive stream connection   N/A         21000-25000

Table A.I. Ports and protocols used by Migrating Desktop.

The latest version of Migrating Desktop for installation on a local workstation is always available at the project main web page http://desktop.psnc.pl/. As can be seen on the main web page, there are currently two ways of launching MD:

- A Java-enabled web browser (Netscape Navigator, Internet Explorer, Firefox, Opera, etc. with the Java plug-in version 1.4.2 or newer).
- Java Web Start (JWS 1.4.2 or newer).

Using JWS has the advantage of caching the application on the local workstation, so it often runs much faster than when called from a web browser.

Migrating Desktop is the user side of a more involved mechanism that provides user-friendly access to the Grid. The commands that the user issues in Migrating Desktop are translated into grid-specific commands by a web service called the Roaming Access Server (RAS), which is the service that actually hides the complexity of the Grid from the user. Currently there are a number of RAS servers available in the int.eu.grid infrastructure:

RAS hostname       Service port   Run by
i2gras01.ifca.es   8443           CSIC
iwrras2.fzk.de     8443           FZK

Table A.II. List of available RAS resources.

When starting MD the user has to pick one of them, typically the geographically closest one.

5.1. STARTING MIGRATING DESKTOP

The first step to be taken by the user is to choose the closest Roaming Access Server. Afterwards MD can be started either as an Applet or as a Java Web Start application:

1. To start Migrating Desktop as an Applet, use the following default address: http://<ras-host>/md
2. To start Migrating Desktop as a Java Web Start application, use the following default address: http://<ras-host>/md/jws/migratingdesktop.jnlp

After a few seconds a User Login Dialog is loaded (see Figure A.1). This dialog collects the data required for proper operation: the location of the user's certificate, the user VO and the proxy details. In the Certificate list box one must give the path to the digital certificate and private key. The "create credential" option will automatically generate a proxy for the user with the specified options.

Figure A.1. User Login Dialog (authentication method).

If the user has already been working with Migrating Desktop, his or her personal settings will be restored. Otherwise a default user profile will be created by the application and the main, empty Desktop will appear.

5.2. JOB SUBMISSION

The job submission types available from Migrating Desktop (see Figure A.2) correspond to the types already described in the section on working from a User Interface; however, the functionality is rather different with respect to input/output file handling and interactivity features. First, file handling is done in a more simplified way using the graphical tools of Migrating Desktop: a user submitting MPI jobs does not have to write the hook scripts, because all the input/output is handled via the RAS server. In addition, the inclusion of i2glogin in the Migrating Desktop/RAS architecture allows plug-ins to be specified to visualize or steer an application at runtime using i2glogin.

Figure A.2. Job submission type in the selection dialog.

5.3. EXAMPLE: SUBMISSION OF AN OPENMPI JOB

As an example we will follow the steps to submit an OpenMPI job to the int.eu.grid infrastructure. Open Migrating Desktop and invoke the Job Wizard for a command-line application: choose Tools->Job Wizard from the main menu and click on the Script dialog. A window pops up where the name of the executable script has to be specified (see Figure A.3 for a visualization of this step).

Figure A.3. Job wizard dialog for the Script job.

Next you should specify the input and output files of your job execution. To do so, go to the Files tab. In order to specify the input files, open the Grid Explorer dialog from the main menu; there you can select the input files by mouse-clicking (Figure A.4). Make sure to use in the Name field the name that the script uses for them at execution time.

Figure A.4. Job submission wizard - setting up input files.

As for the output, in this example the file output.tar.gz was used in the script to tar and gzip the output of the simulation. The standard output and standard error will also be included automatically.

Figure A.5. Defining the output files.

Next, one has to go to the Resources tab and define the job properties. In our example (Figure A.6) we are requesting 8 processes for an OpenMPI job at the IISAS site. Notice that the decision about the particular site where the job ends up could be left to the Resource Broker. However, it is also convenient to have the option of being more specific in cases like parallel applications that need low-latency intra-cluster networks, as with the special Infiniband queue.

Figure A.6. Defining the job type and specifying the resources needed.

Once you have chosen all the parameters you can click on the Submit button. In order to monitor the job execution, choose Tools->Job Monitor in the main menu. Pressing "Refresh all" you can observe the status of your job. Furthermore, you can click on a particular job ID (if you have more than one job) and see the detailed status of the job (Figure A.7).

Figure A.7. Job monitor window and detailed job status.

The job is finished when you see the status Done in the monitor dialog. At this point you can retrieve the output: go to the Files tab, choose the StdOutput item, and press the Visualize button. The proper visualisation tool will be launched automatically. If your output is a file or a set of files stored in a Storage Element, you can open the Grid Commander to bring the output to your local machine using ordinary file operations. In the example below we show how to use Copy [F5] to copy a file from a grid Storage Element to the local directory of the user.

Figure A.8. Copying files to the local desktop from a SE using Grid Commander.

For a full description of all the features of Migrating Desktop and all details about its functionality we refer to the central web page located at http://desktop.poznan.pl