January 10, 2012 Getting Started with XSEDE Andrew Grimshaw and Karolina Sarnowska-Upton
Audience
- End users and developers who want to
  - Access and use NSF-funded XSEDE compute, data, and storage resources
  - Create a secure shared resource environment with collaborators around the world
  - Access and use compute, data, and storage resources available via XSEDE located at other institutions
- Advanced user support personnel who work with end users and developers
Goals
At the end of this tutorial you will
- Understand the underlying system and resource model of XSEDE
- Be able to install and configure client-side tools
- Understand the grid command shell and the basics of the GUI
- Be able to define and run jobs on XSEDE
- Be able to use the Global Federated File System
- Be able to share data resources into XSEDE
Agenda
- XSEDE Architecture Overview and Context
- XSEDE Genesis II Client Installation
- Using the client interfaces
- Running a job with XSEDE
- Adding data resources
XSEDE Architecture Overview & Context
Initial XSEDE architecture: High-order bits
- Don't disrupt the user community!
  - Maintain existing TeraGrid services
- Focus on the user-facing access layer
  - For power users: first, do no harm
  - For other users: expand use via interfaces, new hosted XSEDE User Access Services (XUAS), and the Global Federated File System (GFFS)
- Promote standards and best practices to enhance interoperability, portability, and implementation choice
XSEDE provides capabilities
- Access and share data between campuses and centers
  - Access data on center resources from the campus, campus resources from a center, or campus A resources from campus B
- Access and share compute resources from home, campus, or center to
  - run a job directly on a particular resource
  - submit to one or more global queues
  - execute a workflow
XSEDE Architecture (layer diagram)
- Access Layer
  - Applications, GUIs, Portals and Gateways, XUAS
  - Transparent access via the file system
  - APIs and CLIs
- Services & Web Services Infrastructure
  - XSEDE Enterprise Services: JSDL/BES, HPC-BP, GridFTP, WSI-BSP, RNS/ByteIO
  - Community Provided Services: GRAM5, REST/RMI, Amazon EC2, Application Deployment
- Resources
  - Core Enterprise Resources, e.g., RP resources
  - Other Resources, e.g., campus centers, Amazon, research group data
Implementations and Architecture
- The architecture defines the interfaces, communication, and interactions between software components
- The architecture defines how quality attributes are realized
  - Security, reliability, availability, performance, ...
- Architecture components (that implement interfaces) may have more than one implementation
- Thus, we distinguish between the architecture and the implementation
Implementation Choices
- We have made initial choices of the implementations we will use
  - There is a process to evolve the architecture and implementations
- Three major configuration items (software systems) provide implementations. They are (in alphabetical order)
  - Genesis II: CLIs, APIs, GUI, GFFS, XES services
  - Globus: XUAS (XD-Data), GridFTP
  - UNICORE 6: GUI, XES (BES at the SPs)
- XES services run on Grid Interface Units
XSEDE is a System of Systems
- Different organizations may be running different standards-compliant software stacks.
A Typical Service Provider Setup
[Diagram labels: supercomputers and local storage, login nodes, local scheduler (e.g., PBS), Grid Interface Unit(s), site-wide file system and archival storage, data, site backbone, connection to internet]
A Typical Campus Setup
[Diagram labels: campus cluster, researcher cluster, researcher data set, department file system, Grid Interface Unit(s), campus backbone, connection to internet]
Simple Grid Interface Unit
[Diagram labels: Grid Interface Unit, Web Service Container, local queuing systems, local distributed file systems, local disk]
XSEDE Genesis II Client Installation
Agenda
- Install Genesis II grid client
Acquire Installer
- Installers are delivered with Increment 1 TRR materials.
- Select the installer for the appropriate operating system platform.
- Run the installer.
The Installation Process: Questions
- OK to install?
  - License follows the Apache license agreement
The Installation Process: Questions
- Installation directory path?
  - Where code and configuration files will be placed
  - Note: the directory to store container state will be created at ~/.genesisii-2.0
Grid Choice Question
- Shows supported grids; pick XSEDE for the Increment 1 deliverable.
Installer Progress...
Voila: Client Installation Complete
XSEDE Genesis II Client Usage
Agenda
- Prerequisites
  - Client installed
- Access grid via:
  - Cmd-line grid shell
  - GUI client
  - FUSE file system mount
- Learn access control basics
Using the Grid Client
- Multiple access methods
  - Cmd-line grid shell
  - GUI client-ui
  - FUSE file system mount
- You will learn to:
  - Log in
  - Navigate the namespace
  - Use the GUI
  - Manage access control
  - Set up a FUSE mount
Login via the CLI
- Note: everything we will talk about can also be done from the grid shell without using the GUI; it is just not as convenient
- Log in using your grid credentials
    login
- Check grid credentials
    whoami
Fire up the GUI
- Type
    grid
- At the command line type
    client-ui
- You should see something like this
- Let's look around
  - /queues
  - /users
  - /home
  - /groups
/users versus /home
- /users is a directory of end user identities
  - Used to log in and to add people to access control lists, e.g.,
      chmod myfile +r /users/karolina
- /home shows the home directories of users in GFFS
  - You can put files and directories there, e.g., /home/grimshaw/data.txt
GUI Grid Client: Start-Up Basics
- Browse to /home
  - Click on your directory icon
- Open GUI sub-shell
  - Select Tools, then grid shell
  - Shell has tab completion, history, help, etc.
GUI Grid Client: Tearing off a Browser
- Create an additional GUI browser of the grid global namespace by clicking the Tear icon and dragging to tear off a browser
GUI Grid Client: View Access Control
- To view access control information: browse to and highlight the resource, then select the Security tab
- Exercise: Give read access to your neighbor
GUI Grid Client: Edit Access Control
- Select the credential to be added
  - Add a specific user by browsing to the user identity under /users
  - Add everyone by selecting the Everyone icon
  - Add a specific username/password token by filling in the dialog box and selecting the icon
- Drag and drop the credential to add the desired rwx permission
That's it for the GUI for now
- Let's look at mapping the directory structure into the local file system using FUSE
FUSE Mounting the Grid: Overview
- Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like operating systems that lets non-privileged users create their own file systems without editing kernel code
- We use FUSE to provide access to grid resources directly from your Linux file system via a directory mount point
FUSE Mounting the Grid: Setup Basics
- Ensure you are logged into the grid
    GenesisII/grid whoami
- Create an empty Unix directory to use as the mount point
    mkdir XSEDE
- Mount the grid at the mount point (the name must match the directory just created, including case)
    nohup GenesisII/grid fuse --mount local:XSEDE &
- Now you can access XSEDE via your file system
- You can add the mount command to your Unix login dotfile to set up the FUSE mount automatically on login
Result
- XSEDE resources, regardless of location, can be accessed via the file system
- Files and directories can be accessed by programs and shell scripts as if they were local files
- Jobs can be started by copying job descriptions into directories
- One can see the jobs running or queued by doing an ls
- One can cd into a running job and directly access the working directory where the job is running
- More on this later
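
Because the mounted grid behaves like a local directory tree, an ordinary program can browse it with standard file calls. The following is a minimal sketch, not part of the Genesis II tooling: the helper names are hypothetical, and it assumes the mount point is ~/XSEDE and that the grid root appears directly under it (so /home shows up as ~/XSEDE/home).

```python
import os


def grid_to_local(mount_point, grid_path):
    """Map a GFFS path such as /home/grimshaw to its location under the mount.

    Assumption: the grid root is mounted directly at mount_point.
    """
    return os.path.join(mount_point, grid_path.lstrip("/"))


def list_grid_dir(mount_point, grid_path):
    """List a grid directory through the FUSE mount using plain file calls."""
    return sorted(os.listdir(grid_to_local(mount_point, grid_path)))


# Hypothetical mount point; adjust to wherever the grid was mounted.
mount_point = os.path.expanduser("~/XSEDE")
if os.path.isdir(grid_to_local(mount_point, "/home")):
    print(list_grid_dir(mount_point, "/home"))
```

Nothing grid-specific is needed on the Python side; access control is still enforced by the grid, so listing a directory you cannot read fails just as it would locally.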
GUI Grid Client: Editing Files
- Edit files in the default editor (from the client-ui sub-shell or grid shell)
    edit <filename>
- In Linux, the EDITOR environment variable needs to be set before running the grid client, e.g.:
    export EDITOR=/usr/bin/vim
GUI Grid Client: Configuring Preferences
- Select Preferences under the File menu to configure:
  - Credential verbosity
  - Shell fonts
  - Default job history level
  - XML display mode
Running a Job with XSEDE
Audience & Goals
- Audience
  - End users and developers who want to
    - Access and use NSF-funded XSEDE compute resources
    - Create a secure shared compute environment with collaborators around the world
    - Access and use compute resources available via XSEDE located at other institutions
  - Advanced user support personnel who work with end users and developers
- Goals: at the end of this tutorial you will
  - Be able to define and run jobs on XSEDE
Prerequisites
- Installed Genesis II client software
- Grid account with permission to run jobs
- Basic grid shell and client GUI understanding
XSEDE Activities (a.k.a. jobs)
- What are jobs in XSEDE?
- How are jobs executed?
- How are jobs specified?
- How to interact with jobs while they are running?
- Compute Grid Use module
  - JSDL tool
  - Grid queue
  - Interacting with jobs
  - Job state change notification
What are Jobs in XSEDE?
- A job is a unit of work that executes a program
  - Really pretty generic: much like a PBS or LSF job
- The program may be sequential, threaded, a hybrid GPGPU program, or traditional parallel using MPI or OpenMP
- Programs can be command line programs or shell scripts that take zero or more parameters
- Jobs MAY specify files to be staged in before execution and out after execution
  - This MAY include executables and libraries
- Jobs MAY specify file systems to mount, e.g., SCRATCH or GFFS (Global Federated File System)
- Jobs MAY specify resource requirements such as operating system, amount of memory, number of CPUs, or other matching criteria
- Jobs MAY be parameter sweep jobs with an arbitrary number of dimensions
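
To make "parameter sweep with an arbitrary number of dimensions" concrete: each dimension is a list of values, and the sweep yields one job per combination. The sketch below only illustrates that expansion in plain Python; the parameter names are hypothetical, and it does not show how the grid itself encodes sweeps in JSDL.

```python
from itertools import product


def expand_sweep(dimensions):
    """Return one dict of parameter values per job in the sweep.

    dimensions maps a parameter name to the list of values it sweeps over.
    """
    names = list(dimensions)
    return [
        dict(zip(names, values))
        for values in product(*(dimensions[name] for name in names))
    ]


# Hypothetical two-dimensional sweep: 3 temperatures x 2 seeds = 6 jobs.
sweep = expand_sweep({"temperature": [280, 300, 320], "seed": [1, 2]})
for params in sweep:
    print(params)
```

Adding a third dimension simply multiplies the job count again, which is why sweeps are a natural fit for submission to a queue rather than to a single resource.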
How are Jobs Executed?
- Jobs are executed by grid resources that implement the OGSA Basic Execution Services (BES) interface
  - These are referred to as BESes
- Users submit jobs directly to a BES or to a grid queue
BESes: Basic Execution Services
- BESes run jobs on particular compute resources
  - Manage data staging for jobs
  - Monitor job progress/completion
  - Maintain job state
- Compute resources may be workstations, clusters, or supercomputers
- Each BES has a set of resource properties (operating system, memory, number of cores, etc.) that can be used to match jobs to BESes for execution
Grid Queues
- Work much like any other queuing system
- Grid users submit jobs to a grid queue
- Queues maintain:
  - List of (BES) compute resources available for scheduling
  - Description of the capabilities of each compute resource
  - List of jobs and statuses
- Queues match jobs to available compute resources
  - Ask matching resources to run jobs
  - Monitor job progress/completion
- Cmd-line and GUI tools to manage jobs in a queue
  - qsub, qstat, qkill, qcomplete, queue manager
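
The matching step can be pictured as a filter over the resource properties each BES advertises. This is a conceptual sketch only: the property names, BES names, and matching rules are assumptions for illustration and are not the Genesis II queue's actual data model or scheduling policy.

```python
def matches(job_requirements, bes_properties):
    """True if the BES satisfies every requirement the job states.

    A requirement the job leaves out is treated as "don't care".
    """
    os_ok = job_requirements.get("os") in (None, bes_properties["os"])
    mem_ok = job_requirements.get("memory_mb", 0) <= bes_properties["memory_mb"]
    cpu_ok = job_requirements.get("cpus", 1) <= bes_properties["cpus"]
    return os_ok and mem_ok and cpu_ok


def candidate_beses(job_requirements, beses):
    """Return the names of all BESes able to run the job."""
    return [name for name, props in beses.items() if matches(job_requirements, props)]


# Hypothetical resources known to a queue.
beses = {
    "bes-workstation": {"os": "LINUX", "memory_mb": 4096, "cpus": 4},
    "bes-cluster": {"os": "LINUX", "memory_mb": 65536, "cpus": 128},
    "bes-windows": {"os": "WINDOWS", "memory_mb": 8192, "cpus": 8},
}
job = {"os": "LINUX", "memory_mb": 16384, "cpus": 32}
print(candidate_beses(job, beses))
```

A real queue would additionally track free slots per resource and pick among the candidates; the filter above covers only the capability check.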
Grid Queues: Cmd-line View
- Check queue/job status with:
    qstat <queue-path>
Grid Queues: GUI Queue Manager
- Queue Manager presents information about the jobs and resources currently managed by the queue
- To set a resource's Max Slots value, click in the Max Slots column in the row for the desired resource, type in a number, and save.
Grid Queues: Job Execution
[Diagram labels: Grid-Queue holding job1, job2, job3, job4; BES1, BES2, BES3; "BES executes job"]
Job Execution: The Working Directory
- User submits the job, or the queue schedules it, on a BES
- BES creates a unique working directory for each job
- BES stages data to/from the job working directory as specified in the JSDL
[Diagram labels: BES1, activities, my_job_data, job1, job2, job3, working-dir, runa, runb]
How are Jobs Specified?
- Jobs are specified using the Open Grid Forum standard Job Submission Description Language (JSDL) 1.0
  - XML-based language
  - Widely adopted
  - Not intended for human consumption
- Job information that is specified
  - Identity
  - Application description
  - Resource requirements
  - Data staging
JSDL Fragment
[Figure: JSDL fragment with callouts for Job Name, Resource Requirement, and Application Description]
Creating JSDL Files using the Grid Job Tool
- Manual creation: use an editor to create the XML file
  - Difficult and error-prone due to XML's eccentricities
  - Easiest method: start with an existing JSDL file and modify it (carefully)
- Using the Grid Job Tool: GUI builder for JSDL files
  - User describes the job in the GUI
  - Description can be saved as a GridJobTool project file
    - Edit/re-use the project to create new JSDL files
  - Automatically generates XML from the user-provided description
  - Started with the grid command job-tool
How to Launch Job Tool from GUI Browser
- Select the directory where you want the JSDL project file located
- OR
- Select the execution container (BES or queue) where you want to execute the job
How are Jobs Submitted for Execution?
- Recall: jobs are submitted to a BES or a grid queue
- Jobs can be submitted via
  - Grid shell run (to BES) or qsub (to queue) commands
  - JSDL tool menu option from the GUI grid shell
  - Copying a JSDL file to the BES's submission-point pseudo-directory
Using run to Execute Jobs
- Check command syntax
    help run
- Example run command for gnomad
    run --jsdl=<jsdl-file> <path-to-bes>
Using a Grid Queue to Execute Jobs
- General purpose XSEDE grid queue location
    /queues/grid-queue
- Submission syntax
    qsub <queue-path> <JSDL-file>
  OR copy the JSDL file into the queue's submission-point
    cp <JSDL-file> <queue-path>/submission-point
- Example submission
    qsub /queues/grid-queue local:gnomad.xml
Job Submission Exercises
- GOAL: Run some simple jobs
  - Create and execute hostname.jsdl
    - Single job and parameter sweep
- Example files are located at /examples
Interact with Jobs via Queue Manager
- You can stop, check status, examine job history, or reschedule a job
- You can interact with a job's working directory if the job is in a running state on a (Genesis II) BES
View Job Information in Queue Manager
- Status
  - QUEUED: job waiting to be scheduled on BES resources
  - REQUEUED: job failed execution at least once and has been automatically re-queued
  - ERROR: job failed the maximum allowable execution attempts and will not be re-queued
  - On <BES name>: job passed to <BES name> for execution
    - Note: does not connote status within the BES (job may be running, queued, staging data, etc.)
  - FINISHED: job executed successfully
- Attempts
  - Number of times the queue has tried to schedule the job for execution
  - Some failures do not increment attempts
    - grid software failures
    - job preempted due to local BES policies
- Ticket
  - Unique ID assigned by the grid queue to the job on submission
- Queue keeps the status of active and completed jobs
  - Jobs in a final status (ERROR and FINISHED) need to be cleaned up by the user
      qcomplete <queue name> { --all | <job ticket>+ }
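
The status and attempts rules above amount to a small state machine. The sketch below is an illustration of those rules as stated on the slide, not the queue's real implementation; the function names are hypothetical and the maximum of 3 attempts is an assumed value, since the slides do not give the actual limit.

```python
FINAL_STATUSES = {"ERROR", "FINISHED"}


def after_failure(attempts, max_attempts, counts_as_attempt=True):
    """Return (attempts, status) after one failed execution attempt.

    Failures such as grid software faults or preemption by local BES
    policy are passed with counts_as_attempt=False and leave the
    attempt counter unchanged.
    """
    if counts_as_attempt:
        attempts += 1
    status = "ERROR" if attempts >= max_attempts else "REQUEUED"
    return attempts, status


def needs_qcomplete(status):
    """Jobs in a final status stay in the queue until the user cleans them up."""
    return status in FINAL_STATUSES


MAX_ATTEMPTS = 3  # assumed value for illustration only
print(after_failure(0, MAX_ATTEMPTS))
print(after_failure(2, MAX_ATTEMPTS))
print(after_failure(2, MAX_ATTEMPTS, counts_as_attempt=False))
print(needs_qcomplete("FINISHED"), needs_qcomplete("QUEUED"))
```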
Examine Job History in Queue Manager
- Right-clicking on a job provides information about the job's history at different levels of detail
Scratch file system
- Persists on the BES between runs
- Good for caching large or frequently used files
Interact with Job Working Directory
- When using Genesis II BES resources, the job working directory is accessible via GFFS
- The working directory is located in the queue where the job was submitted, at <queue-path>/jobs/mine/running
- For each running job, there is a directory named with the job ticket number, with two entries:
  - status: file containing the state of the job (e.g., queued, running)
  - working-dir: session execution directory of the running job
    - read/write/create/delete files here to interact with the running job
- If the job was submitted directly to a BES, the job directory is located at <bes-path>/activities
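
Given the layout above, the grid paths for a running job can be derived from the queue path and the ticket. A minimal sketch follows; the helper name and the ticket string are hypothetical (real tickets are assigned by the queue), and the paths are GFFS paths, so they are joined with POSIX separators regardless of the local platform.

```python
import posixpath


def job_paths(queue_path, ticket):
    """Build the GFFS paths of a running job's status file and working directory."""
    job_dir = posixpath.join(queue_path, "jobs", "mine", "running", ticket)
    return {
        "status": posixpath.join(job_dir, "status"),
        "working_dir": posixpath.join(job_dir, "working-dir"),
    }


# Hypothetical ticket value for illustration.
paths = job_paths("/queues/grid-queue", "TICKET-0001")
print(paths["status"])
print(paths["working_dir"])
```

With a FUSE mount in place, these paths can be read with ordinary file tools once prefixed with the mount point.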
JSDL File Contents Explained
- Identifier Info
  - Descriptive information about the job, e.g., job name
- Resource Requirements
  - Describe the resources the job requires
    - Memory
    - OS
    - Architecture
    - Number of processors
    - Run time

    <JobIdentification>
      <JobName>Adder</JobName>
    </JobIdentification>
    <Resources>
      <OperatingSystem>
        <OperatingSystemType>
          <OperatingSystemName>LINUX</OperatingSystemName>
        </OperatingSystemType>
      </OperatingSystem>
    </Resources>

JSDL File Contents Explained
- Application Description
  - Describe execution
    - Executable name
    - Arguments
    - Routing for stdout and stderr

    <Application>
      <POSIXApplication>
        <Executable>adder.sh</Executable>
        <Output>stdout</Output>
        <Argument>seven.dat</Argument>
        <Argument>fourty-two.dat</Argument>
        <Argument>sum.dat</Argument>
        <Argument>10</Argument>
      </POSIXApplication>
    </Application>

JSDL File Contents Explained
- DataStaging
  - Describe data to copy in/out
  - Several transport options:
    - http
    - scp (secure copy)
    - RNS (grid directory structure)
    - Email (out only)
  - Copy in (data staging source):
    - Source is the URL of the remote file to be copied in
    - FileName is the name within the job working directory where the file will be copied to
  - Copy out (data staging target):
    - Target is the URL of the remote file to be copied to
    - FileName is the name within the job working directory of the file to be copied out
  - Other file handling info

    <DataStaging>
      <FileName>adder.sh</FileName>
      <CreationFlag>overwrite</CreationFlag>
      <DeleteOnTermination>true</DeleteOnTermination>
      <Source>
        <URI>http://www.cs.virginia.edu/adder.sh</URI>
      </Source>
    </DataStaging>
    <DataStaging>
      <FileName>sum.dat</FileName>
      <CreationFlag>overwrite</CreationFlag>
      <DeleteOnTermination>true</DeleteOnTermination>
      <Target>
        <URI>rns:sum.dat</URI>
      </Target>
    </DataStaging>
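
For readers who prefer to generate JSDL programmatically rather than edit XML by hand, the fragments above can be assembled into one document with Python's standard library. This is a sketch under stated assumptions: the helper names are hypothetical, the namespace URIs are the ones defined by the JSDL 1.0 specification (they do not appear on the slides), and whether a given BES accepts exactly this document has not been verified here. Elements are emitted in the order the JSDL schema lists them (identification, application, resources, data staging).

```python
import xml.etree.ElementTree as ET

# Namespaces from the JSDL 1.0 specification (assumption: not shown on the slides).
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def el(parent, ns, tag, text=None):
    """Append a namespaced child element, optionally with text content."""
    child = ET.SubElement(parent, "{%s}%s" % (ns, tag))
    if text is not None:
        child.text = text
    return child


def stage(desc, filename, direction, uri):
    """Add a DataStaging entry; direction is 'Source' (in) or 'Target' (out)."""
    ds = el(desc, JSDL, "DataStaging")
    el(ds, JSDL, "FileName", filename)
    el(ds, JSDL, "CreationFlag", "overwrite")
    el(ds, JSDL, "DeleteOnTermination", "true")
    el(el(ds, JSDL, direction), JSDL, "URI", uri)


def build_adder_jsdl():
    """Assemble the Adder example from the slides into one JobDefinition."""
    root = ET.Element("{%s}JobDefinition" % JSDL)
    desc = el(root, JSDL, "JobDescription")

    el(el(desc, JSDL, "JobIdentification"), JSDL, "JobName", "Adder")

    posix = el(el(desc, JSDL, "Application"), POSIX, "POSIXApplication")
    el(posix, POSIX, "Executable", "adder.sh")
    for arg in ("seven.dat", "fourty-two.dat", "sum.dat", "10"):
        el(posix, POSIX, "Argument", arg)
    el(posix, POSIX, "Output", "stdout")

    os_type = el(el(el(desc, JSDL, "Resources"), JSDL, "OperatingSystem"),
                 JSDL, "OperatingSystemType")
    el(os_type, JSDL, "OperatingSystemName", "LINUX")

    stage(desc, "adder.sh", "Source", "http://www.cs.virginia.edu/adder.sh")
    stage(desc, "sum.dat", "Target", "rns:sum.dat")
    return root


jsdl_root = build_adder_jsdl()
print(ET.tostring(jsdl_root, encoding="unicode"))
```

The printed XML could be saved to a file and submitted with run or qsub; the Grid Job Tool remains the less error-prone route for most users.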
Example JSDL
[Figure: complete example JSDL document with callouts for Job Name, Resource Requirement, Application Description, and Data Staging Requests]
January 10, 2012 Adding DATA Resources into the Grid Andrew Grimshaw
Ways to Add Data into the Grid
- Create files and directories
- Export a file system directory
Creating Files in the Grid
- Creating a file (or directory) places its state on the same grid container as its containing directory
- For example, all of the following commands place files and directories in the container where /home/bob resides
    echo hello > /home/bob/newfile
    mkdir /home/bob/testdir
    cp local:testfile grid:/home/bob/testfile
Creating Directories on Specific Containers
- Files can be created on other containers by first creating a containing directory on the target container
- Directory placement can be changed by explicitly specifying the grid container to be used
  - The path to a service on the target container is given to the directory creation command (the service is EnhancedRNSServicePortType)
      mkdir --rns-service=<rns-service-path> <new-dir-path>
Exports: Mapping Data into the Grid
- Basic idea: create a grid resource that securely proxies access to local files and directories via RNS and ByteIO web services
- We use an export service to proxy a local file system directory tree into the grid
- To create an export, create an instance of LightWeightExportPortType
  - Via the command line
  - Via the GUI (for local hosts)
Exporting: Mapping a local directory structure into the global namespace
- User runs the export command
- Export service mounts the local directory into the global namespace
- Export service redirects calls from the grid export to the local file system
[Diagram labels: user, Export Service, myfiles, /home, myexport]
Export Creation Example: Cmdline
- Creating an export maps the specified directory on the container host into the specified GFFS path
- To run the export command, you need to know
  - Location of the files you want to export
  - On which container you will create the export resource (the service is LightWeightExportPortType)
  - Location in the global namespace where you want to mount the export
      export --create <path-to-service> <local-path-to-files> <GFFS-path-for-export>
- Quitting an export turns off the export service (the underlying files in the local file system are left intact)
    export --quit <GFFS-path-for-export>
Export Creation Example: GUI
- Provide:
  - Location of the files you want to export
  - Location in the global namespace where you want to mount the files
Export Security Settings Recommendation
- Give users extended access control to enable export creation
- Allow only admin users to create exports