Using a Cluster Effectively: Scheduling and Job Management

1 Using a cluster effectively: Scheduling and Job Management
Log into jasper.westgrid.ca: ssh -X yourusername@jasper.westgrid.ca (use PuTTY if you are working in Windows).
Log into the workshop cluster from jasper: ssh -X cl2n230
Copy the working directory to your own and go into it:
cp -r /global/software/workshop/scheduling-wg
cd scheduling-wg-2015
You can find a copy of the slides and materials for this workshop at the following link: https://goo.gl/m1qmcq

2 Scheduling and Job Management: Using a cluster effectively

3 Presentation contents: Job submission, part 2; Understanding jobs

4 PBS jobs and memory
It is very important to specify memory correctly:
If you don't ask for enough and your job uses more, your job will be killed.
If you ask for too much, it will take much longer to schedule the job, and you will be wasting resources.
If you ask for more memory than is available on the cluster, your job will never run. The scheduling system will not stop you from submitting such a job or even warn you.
If you don't know how much memory your jobs will need, ask for a large amount in your first job and run checkjob -v -v <jobid> or qstat -f <jobid>. Among other information, you should see how much memory your job used.
If you don't specify any memory, your job will get a very small default maximum memory (256MB on Jasper).
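
A minimal sketch of checking how much memory a job actually used, assuming Torque's qstat and Moab's checkjob are on your path; the job ID 12345 and script name myjob.pbs are placeholders for your own:
qsub -l mem=8000mb myjob.pbs           # over-ask in the first run, then measure
qstat -f 12345 | grep resources_used    # resources_used.mem / .vmem show consumption
checkjob -v -v 12345 | grep -i mem      # Moab's view of requested vs. dedicated memory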

5 PBS jobs and memory
Always ask for slightly less than the total memory on a node, since some memory is used by the OS, and your job will not start until enough memory is available.
You may specify the maximum memory available to your job in one of two ways:
Ask for the total memory used by your job: #PBS -l mem=24000mb
Ask for the memory used per process/core in your job: #PBS -l pmem=2000mb
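
A minimal example script contrasting the two forms, assuming a 12-core single-node job; the values and program name are illustrative only:
#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=03:00:00
## Either request total memory for the whole job...
#PBS -l mem=24000mb
## ...or instead request memory per process/core (12 x 2000mb = 24000mb total):
##PBS -l pmem=2000mb
cd $PBS_O_WORKDIR
./my_program        # placeholder executable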

6 Features and partitions
Sometimes nodes have certain properties: a faster processor, a bigger disk, an SSD, a faster interconnect, or they belong to a certain research group. Such nodes are given a feature name by the sysadmin, so you can ask for nodes by feature name in your PBS job script.
To specify that your job should run on nodes with the feature ssd: #PBS -l feature=ssd
Partitions are sets of nodes; your job must run with all of its components inside one partition. To run in the newer node partition on Orcinus: #PBS -l partition=qdr
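
A short sketch combining the two directives in one script; the feature name ssd and partition name qdr come from the slide and may differ on your cluster, and the executable is a placeholder:
#!/bin/bash
#PBS -l nodes=1:ppn=4
#PBS -l walltime=01:00:00
#PBS -l feature=ssd          # only nodes tagged with the "ssd" feature
#PBS -l partition=qdr        # keep all job components inside the qdr partition
cd $PBS_O_WORKDIR
./my_io_heavy_program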

7 PBS jobs and GPUs
To request GPUs, use the nodes notation and add :gpus=X, for example: #PBS -l nodes=2:gpus=3:ppn=4
Modern Torque scheduling programs recognize GPUs as well as the state of each GPU.
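
A minimal GPU request sketch, assuming the cluster exposes GPUs through the Torque nodes syntax shown above; the program name is a placeholder:
#!/bin/bash
#PBS -l nodes=2:gpus=3:ppn=4     # 2 nodes, 3 GPUs and 4 cores on each
#PBS -l walltime=02:00:00
cd $PBS_O_WORKDIR
cat $PBS_GPUFILE                 # allocated GPUs, one <host>-gpu<number> per line
./my_gpu_program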

8 Software licenses and generic resources
Sometimes not only cluster hardware needs to be scheduled for a job but other resources as well, such as software licenses or telescope or other instrument time.
To request generic resources or licenses: #PBS -W x=gres:matlab=2 or #PBS -l other=matlab=2
You can see the list of software licenses and generic resources available on the cluster with the jobinfo -n command.
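
A sketch of a job that reserves two MATLAB licenses alongside its cores, using the two equivalent directives from the slide (only one is normally needed; check jobinfo -n for the resource names defined on your cluster). The MATLAB invocation and script name are placeholders:
#!/bin/bash
#PBS -l nodes=1:ppn=2
#PBS -l walltime=04:00:00
#PBS -W x=gres:matlab=2          # Moab-style generic resource request
##PBS -l other=matlab=2          # alternative form from the slide (commented out)
cd $PBS_O_WORKDIR
matlab -nodisplay -r "run('myscript.m'); exit"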

9 PBS script commands
PBS script command : Description
#PBS -l mem=4gb : Requests 4 GB of memory in total
#PBS -l pmem=4gb : Requests 4 GB of memory per process
#PBS -l feature=ssd : Requests 1 processor on a node with the ssd feature
#PBS -l partition=qdr : Requests to run in the QDR partition
#PBS -l nodes=2:blue:ppn=2 : Requests 2 cores on each of 2 nodes with the blue feature
#PBS -l nodes=2:gpus=3:ppn=4 : Requests 4 cores and 3 GPUs on each of 2 nodes
#PBS -l nodes=cl2n002+cl2n003 : Requests the 2 nodes cl2n002 and cl2n003
#PBS -l host=cl2n002 : Requests the host (node) cl2n002
#PBS -I : Requests an interactive job

10 Memory, features, software licenses: BREAK FOR PRACTICE

11 Job submission requiring full nodes
Sometimes there is a need for exclusive access, to guarantee that no other job will be running on the same nodes as your job.
To guarantee that the job will only run on nodes with other jobs you own: #PBS -l naccesspolicy=singleuser
To guarantee that the job will only run on nodes with no other job: #PBS -n or #PBS -l naccesspolicy=singlejob
To guarantee that each part of the job will run on a separate node with nothing else running on that node: #PBS -l naccesspolicy=singletask
Your group may get charged for using the whole node and not just the resources requested, and it may take a long time to gather the resources needed for these special jobs.
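
A sketch of an exclusive-node job, assuming the Moab naccesspolicy settings listed above are enabled on your cluster; the executable is a placeholder:
#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=06:00:00
#PBS -l naccesspolicy=singlejob   # no other job may share these nodes
##PBS -n                          # equivalent shorthand (commented out)
cd $PBS_O_WORKDIR
./my_exclusive_program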

12 Job submission: multiple projects
If you are part of two different WestGrid projects and are running jobs for both, you need to specify the accounting group for each project so that the correct priority of the job can be determined and the usage is charged to the correct group.
To specify an accounting group for a job: #PBS -A <accounting group>
You can find more information about your accounting groups (RAPIs) on the WestGrid accounts portal: https://portal.westgrid.ca/user/my_account.php
You can see your accounting group information with the jobinfo -a command.
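
A sketch showing where the accounting-group directive goes; the group name abc-123-aa is a made-up placeholder, so substitute your own RAPI from the WestGrid portal:
#!/bin/bash
#PBS -A abc-123-aa               # charge this job to the abc-123-aa allocation (placeholder)
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
./my_program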

13 Job dependencies
If you want one job to start only after another finishes, use: qsub -W depend=afterok:<jobid1> job2.pbs
If you can break a long job into several shorter jobs, the shorter jobs will often be able to run sooner. This is also the technique to use if the required job runtime is longer than the maximum walltime allowed on the cluster.
jobn1=$(qsub job1.pbs)
qsub -W depend=afterok:$jobn1 job2.pbs
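
A sketch of chaining several short jobs so each starts only after the previous one completes successfully, assuming job1.pbs, job2.pbs and job3.pbs exist in the current directory:
#!/bin/bash
# Submit the first job and capture its job ID.
prev=$(qsub job1.pbs)
# Each subsequent job depends on successful completion (afterok) of the previous one.
for script in job2.pbs job3.pbs; do
    prev=$(qsub -W depend=afterok:$prev $script)
done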

14 Prologue, epilogue and data staging
A prologue script runs before your job starts, for a maximum of 5 minutes: #PBS -l prologue=/home/fujinaga/prologue.script
An epilogue script runs after your job is finished, for a maximum of 5 minutes: #PBS -l epilogue=/home/fujinaga/epilogue.script
These scripts are useful if you need to record some extra information about the state of your job in the scheduling system. On some systems, jobs can resubmit themselves with an appropriate script in the epilogue.

15 Sample epilogue.script
#!/bin/sh
export MYOUTPUT="$HOME/$1-epilogue.out"
echo "Epilogue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo "Job Name: $4"
echo "Session ID: $5"
echo "Resource List: $6"
echo "Resources Used: $7"
echo "Queue Name: $8"
echo "Account String: $9"
echo
exit 0

16 Temporary available local storage
Some software, like Gaussian, needs to write and read many small files on disk. The cluster (Lustre) file system cannot do this well, and it becomes a performance problem for the job and for the cluster it is running on.
Each node has a local disk that is shared by all jobs running on the node. You request the local storage via #PBS -l file=1000mb.
A directory is created for each job when it runs; when the job finishes, this directory is automatically erased. The directory name is stored in $TMPDIR.
An example of using the temporary local storage:
#PBS -l file=1000mb
cd $TMPDIR
<run my job>
mkdir $HOME/$PBS_JOBID/
cp <file I wish to save> $HOME/$PBS_JOBID/
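
A more complete sketch of the same pattern as a runnable job script; the program and file names are placeholders:
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l file=1000mb              # request ~1 GB of node-local scratch
#PBS -l walltime=02:00:00
cd $TMPDIR                       # job-private local directory, erased when the job ends
cp $PBS_O_WORKDIR/input.dat .    # stage input in (placeholder file name)
./my_program input.dat > output.dat
mkdir -p $HOME/$PBS_JOBID        # stage results back before $TMPDIR disappears
cp output.dat $HOME/$PBS_JOBID/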

17 PBS script commands
PBS script command : Description
#PBS -l naccesspolicy=singleuser : Requests to run only on nodes with other jobs of the same user
#PBS -l naccesspolicy=singlejob or #PBS -n : Requests to run only on nodes with no other jobs
#PBS -l naccesspolicy=singletask : Requests that each part of the job run on a separate node with nothing else running on that node
#PBS -A <accounting group> : Requests that a specific accounting group be used for this job
#PBS -W x=gres:matlab=2 or #PBS -l other=matlab=2 : Requests 2 units of the generic resource or software license matlab
qsub -W depend=afterok:<job1id> j2.pbs : Job 2 depends on job 1 and will not start until job 1 completes successfully
#PBS -l epilogue=/home/fujinaga/epilogue.script : Runs the epilogue script for a maximum of 5 minutes after the job completes
#PBS -l prologue=/home/fujinaga/prologue.script : Runs the prologue script for a maximum of 5 minutes before the job starts
#PBS -l nodes=5:ppn=12+nodes=1:ppn=1 : Requests 5 nodes with 12 processors each plus a single node with 1 core

18 PBS environment variables
Environment variable : Description
PBS_JOBNAME : User-specified job name
PBS_ARRAYID : Job array index for this job
PBS_GPUFILE : File listing the GPUs allocated to the job, 1 per line: <host>-gpu<number>
PBS_O_WORKDIR : Job's submission directory
PBS_TASKNUM : Number of tasks requested
PBS_O_HOME : Home directory of the submitting user
PBS_JOBID : Unique PBS job id
PBS_NUM_NODES : Number of nodes allocated to the job
PBS_NUM_PPN : Number of procs per node allocated to the job
PBS_O_HOST : Host from which the job was submitted (where qsub ran)
PBS_QUEUE : Job queue
PBS_NODEFILE : File containing a line-delimited list of the nodes allocated to the job
PBS_O_PATH : Path variable used to locate executables within the job script
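
A tiny sketch that prints a few of these variables from inside a job, handy for checking what the scheduler actually gave you; resource values are illustrative:
#!/bin/bash
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:05:00
echo "Job ID:         $PBS_JOBID"
echo "Job name:       $PBS_JOBNAME"
echo "Submitted from: $PBS_O_WORKDIR on $PBS_O_HOST"
echo "Nodes x cores:  $PBS_NUM_NODES x $PBS_NUM_PPN"
echo "Node list:"
cat $PBS_NODEFILE                # one line per allocated core, listing the node name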

19 Job submission practice: BREAK FOR PRACTICE

20 Getting information on your job
Command : What it's used for
jobinfo -j : List all your jobs and their state
qstat -t -u $USER : List all your array jobs with their subcomponents and their state
qstat -a : List all jobs on the system and their state
qstat -r : List all running jobs on the system
showq : List all jobs on the system and their state
showq -i : List all jobs being considered for scheduling and their priority
showq -b : List all blocked (unable to run) jobs
qstat -f <jobid> : List detailed information on a job
checkjob <jobid> : List detailed information on a job
checkjob -v -v <jobid> : List detailed information on a job, including history and why it is not running now on each node

21 jobinfo -j
~]$ jobinfo -j
JobID  State    Proc  WCLimit     User   Opsys  Class  Features
       Running  1     3:00:00:00  kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch
       Idle     12    8:00:00     kamil  -      batch  -

22 qstat -t -u $USER
[kamil@jasper ~]$ qstat -t -u kamil
jasper-usradm.westgrid.ca:
                                                          Req'd   Req'd     Elap
Job ID           Username  Queue  Jobname    SessID NDS TSK Memory  Time   S  Time
[349].jasper-us  kamil     batch  "sumrate"                        :59:00  Q
[350].jasper-us  kamil     batch  "sumrate"                        :59:00  Q
[351].jasper-us  kamil     batch  "sumrate"                        :59:00  Q
[352].jasper-us  kamil     batch  "sumrate"                        :59:00  R  09:40:
[353].jasper-us  kamil     batch  "sumrate"                        :59:00  R  09:40:
[354].jasper-us  kamil     batch  "sumrate"                        :59:00  R  09:40:
[355].jasper-us  kamil     batch  "sumrate"                        :59:00  R  09:40:
[356].jasper-us  kamil     batch  "sumrate"                        :59:00  R  09:40:12

23 qstat -a
hungabee:~ # qstat -a
hungabee:
                                                         Req'd   Req'd   Elap
Job ID    Username  Queue  Jobname           SessID NDS TSK Memory Time  S  Time
hungabee  fujinaga  hall   Alliaria.RunAllP                       :00   R  32:
hungabee  fujinaga  hall   Lythrum.RunAllPa                       :00   Q
hungabee  tmah      hall   Nektar_job_3D                          :00   R  31:
hungabee  tmah      hall   Nektar_job_3D                          :00   Q
hungabee  tmah      hiru   cakile.abyssalt                        :00   R  17:
hungabee  jyang     hiru   runscript.hungab                       :00   R  14:
hungabee  jyang     hiru   runscript.hungab                       :00   R  11:
hungabee  kamil     hiru   f_rpx10_c64_f                          :00   R  06:
hungabee  kamil     hiru   f_rpx10_c128_f                         :00   R  06:
hungabee  kamil     hiru   f_rpx10_c256_f                         :00   Q
hungabee  tmcguire  hiru   E1e4eta70N                             :00   R  01:04

24 qstat -r
hungabee:~ # qstat -r
hungabee:
                                                         Req'd   Req'd   Elap
Job ID    Username  Queue  Jobname           SessID NDS TSK Memory Time  S  Time
hungabee  fujinaga  hall   Alliaria.RunAllP                       :00   R  32:
hungabee  tmah      hall   Nektar_job_3D                          :00   R  31:
hungabee  tmah      hiru   cakile.abyssalt                        :00   R  17:
hungabee  jyang     hiru   runscript.hungab                       :00   R  14:
hungabee  jyang     hiru   runscript.hungab                       :00   R  11:
hungabee  kamil     hiru   f_rpx10_c64_f                          :00   R  06:
hungabee  kamil     hiru   f_rpx10_c128_f                         :00   R  06:
hungabee  tmcguire  hiru   E1e4eta70N                             :00   R  01:04

25 showq
hungabee:~ # showq
active jobs
JOBID  USERNAME  STATE    PROCS  REMAINING   STARTTIME
       fujinaga  Running  64     5:11:32     Thu Apr 10 03:51:
       kamil     Running  64     6:29:27     Thu Apr 10 09:09:
       tmcguire  Running  512    1:15:03:42  Wed Apr  9 01:43:26
4 active jobs
640 of 2048 processors in use by local jobs (31.25%)
80 of 256 nodes active (31.25%)
eligible jobs
JOBID  USERNAME  STATE  PROCS  WCLIMIT  QUEUETIME
       fujinaga  Idle          :00:00   Thu Apr 10 03:51:27
1 eligible jobs
blocked jobs
JOBID  USERNAME  STATE     PROCS  WCLIMIT     QUEUETIME
       jyang     Deferred  1      3:00:00:00  Thu Apr 10 10:35:37
1 blocked jobs
Total jobs: 5

26 showq -b
etc]# showq -b
blocked jobs
JOBID  USERNAME  STATE      PROCS  WCLIMIT     QUEUETIME
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
       fujinaga  BatchHold  1      1:12:00:00  Fri Apr  4 07:15:
[74]   tmcguire  Idle       5      3:00:00:00  Sat Apr  5 12:27:
       jyang     Deferred   12     00:01:00    Mon Apr  7 11:52:
       jyang     Deferred   12     00:01:00    Mon Apr  7 11:58:
       tmah      Deferred   4      3:00:00:00  Mon Apr  7 15:07:
blocked jobs
Total jobs: 3426

27 jobinfo -i or showq -i
etc]# showq -i
eligible jobs
JOBID   PRIORITY  XFACTOR  Q  USERNAME  GROUP     PROCS  WCLIMIT     CLASS  SYSTEMQUEUETIME
*                             fujinaga  fujinaga  16     10:30:00    batch  Thu Apr 10 10:11:
*                             fujinaga  fujinaga  16     10:30:00    batch  Thu Apr 10 10:11:
[482]*                        kamil     kamil     1      1:00:59:00  batch  Thu Apr 10 00:25:
[404]*                        kamil     kamil     1      1:00:59:00  batch  Thu Apr 10 00:25:
[405]*                        kamil     kamil     1      1:00:59:00  batch  Thu Apr 10 00:25:
*                             jyang     jyang     12     3:00:00:00  batch  Wed Apr  9 15:31:
*                             tmcguire  tmcguire  8      2:00:00     batch  Thu Apr 10 10:27:
*                             jyang     jyang     12     3:00:00:00  batch  Wed Apr  9 15:31:
[539]*                        tmah      tmah      5      3:00:00:00  batch  Wed Apr  9 15:36:01
9 eligible jobs
Total jobs: 9

28 qstat -f
[kamil@cl2n234 testwrapper]$ qstat -f 508.cl2n234
Job Id: 508.cl2n234
    Job_Name = partest-lq.pbs
    Job_Owner = kamil@cl2n234
    job_state = Q
    queue = parallel
    server = cl2n234
    Checkpoint = u
    ctime = Thu Apr 10 13:15:
    Error_Path = cl2n234:/lustre/home/kamil/test/pbs/jasper/testwrapper/partest-lq.pbs.e508
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = abe
    Mail_Users = kamil@ualberta.ca
    mtime = Thu Apr 10 13:15:
    Output_Path = cl2n234:/lustre/home/kamil/test/pbs/jasper/testwrapper/partest-lq.pbs.o508
    Priority = 0
    qtime = Thu Apr 10 13:15:

29 qstat -f (continued)
    Rerunable = True
    Resource_List.nodect = 1
    Resource_List.nodes = 1:ppn=12
    Resource_List.pmem = 256mb
    Resource_List.walltime = 03:00:00
    Shell_Path_List = /bin/sh
    Variable_List = PBS_O_QUEUE=parallel,PBS_O_HOME=/home/kamil,
        PBS_O_LOGNAME=kamil,
        PBS_O_PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/lustre/jasper/software/jobinfo/jobinfo//bin:/opt/sgi/sgimc/bin:/opt/moab/moab-version/bin:/opt/moab/moab-version/sbin:/var/spool/torque/torque-version/bin,
        PBS_O_MAIL=/var/spool/mail/kamil,PBS_O_SHELL=/bin/bash,
        PBS_O_LANG=en_US.UTF-8,
        PBS_O_WORKDIR=/lustre/home/kamil/test/pbs/jasper/testwrapper,
        PBS_O_HOST=cl2n234,PBS_O_SERVER=cl2n234
    etime = Thu Apr 10 13:15:
    submit_args = partest-lq.pbs
    fault_tolerant = False
    job_radix = 0
    submit_host = cl2n234

30 checkjob <jobid>
torque-setup]# checkjob 508
job 508
AName: partest-lq.pbs
State: Idle
Creds:  user:kamil  group:kamil  account:ndz-983-aa  class:parallel
WallTime:   00:00:00 of 3:00:00
BecameEligible: Thu Apr 10 13:18:14
SubmitTime: Thu Apr 10 13:15:43
  (Time Queued  Total: 00:20:16  Eligible: 00:20:08)
TemplateSets:  DEFAULT
NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 12

31 checkjob <jobid> (continued)
Req[0]  TaskCount: 12  Partition: ALL
Memory >= 256M  Disk >= 0  Swap >= 0
Dedicated Resources Per Task: PROCS: 1  MEM: 256M
SystemID:   Moab
SystemJID:  508
Notification Events: JobStart,JobEnd,JobFail
Notification Address: kamil@ualberta.ca
Flags:       RESTARTABLE
Attr:        checkpoint
StartPriority:
cl2n236 available: 12 tasks supported
cl2n235 available: 12 tasks supported
NOTE: job can run in partition torque (24 procs available, 12 procs required)

32 checkjob -v -v <jobid>
~]$ checkjob -v -v
job  (RM job 'jasper-usradm.westgrid.ca')
AName: jrc.egsinp_w193
State: Idle
Creds:  user:kamil  group:kamil  account:cas-124-aa  class:batch
WallTime:   00:00:00 of 1:06:00:00
SubmitTime: Thu Apr 10 13:31:02
  (Time Queued  Total: 00:08:04  Eligible: 00:00:10)
TemplateSets:  DEFAULT
NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 1
Total Requested Nodes: 1
Req[0]  TaskCount: 1  Partition: ALL
Memory >= 2048M  Disk >= 0  Swap >= 0
Available Memory >= 0  Available Swap >= 0
Dedicated Resources Per Task: PROCS: 1  MEM: 2048M  SWAP: 2048M
NodeSet=ONEOF:FEATURE:X5675-QDR:X5675-DDR:L5420-DDR
NodeCount: 1
SystemID:   Moab
SystemJID:
Notification Events: JobFail

33 checkjob -v -v <jobid> (continued)
UMask: 0000
OutputFile: jasper.westgrid.ca:/home/kamil/egsnrc/codes/dosxyznrc_nob/ProfilePhantom02IC10_10x10_Emean5_9MeVMonoAltPrimColModFFMT50Ang08KMNRC.egsinp_w193.eo
ErrorFile: jasper.westgrid.ca:/home/kamil/egsnrc/codes/dosxyznrc_nob/ProfilePhantom02IC10_10x10_Emean5_9MeVMonoAltPrimColModFFMT50Ang08KMNRC.egsinp_w193.eo
EnvVariables: PBS_O_QUEUE=batch,PBS_O_HOME=/home/kamil,PBS_O_LOGNAME=kamil,PBS_O_PATH=/home/kamil/egsnrc/codes/bin/x86_64-unknown-linux-gnu-f95:/home/kamil/egsnrc/bin/x86_64-unknown-linux-gnu-f95:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/bin:/lustre/jasper/software/jobinfo/jobinfo//bin:/opt/sgi/sgimc/bin:/opt/moab/moab-version/bin:/opt/moab/moab-version/sbin:/var/spool/torque/torque-version/bin,PBS_O_MAIL=/var/spool/mail/kamil,PBS_O_SHELL=/bin/bash,PBS_O_LANG=en_US.UTF-8,PBS_O_WORKDIR=/lustre/home/kamil/egsnrc/codes/dosxyznrc_nob,PBS_O_HOST=jasper.westgrid.ca,PBS_O_SERVER=jasper-usradm.westgrid.ca
Partition List: [ALL]
SrcRM: jasper-usradm  DstRM: jasper-usradm  DstRMJID: jasper-usradm.westgrid.ca
Submit Args: -j eo -l pmem=2gb,vmem=2gb -e ProfilePhantom02IC10_10x10_Emean5_9MeVMonoAltPrimColModFFMT50Ang08KMNRC.egsinp_w193.eo -N jrc.egsinp_w193 -l walltime=30:00:00
Flags:       RESTARTABLE
Attr:        checkpoint
StartPriority:
Priority Analysis:
Job      PRIORITY*  Cred(Class)  FS(Accnt)      Serv(QTime)
Weights             ( 1)         1000( 1)       1( 1)
                    ( 0.0)       100.0(-11.1)   0.0( 0.0)
PE: 1.00
Node Availability for Partition jasper-usradm

34 checkjob -v -v <jobid> (continued)
Node Availability for Partition jasper-usradm
cl1n001  rejected: Reserved (wlcg_ops)  allocationpriority=0.00
cl1n002  rejected: Reserved (wlcg_ops)  allocationpriority=0.00
cl2n002  rejected: Memory               allocationpriority=0.00
cl2n003  rejected: Memory               allocationpriority=0.00
cl2n028  rejected: State (Busy)         allocationpriority=0.00
cl2n029  rejected: State (Busy)         allocationpriority=0.00
cl2n030  rejected: Memory               allocationpriority=0.00
cl2n031  rejected: Memory               allocationpriority=0.00
NOTE: job req cannot run in partition jasper-usradm (available procs do not meet requirements: 0 of 1 procs found)
idle procs: 354  feasible procs: 0
Node Rejection Summary: [Memory: 128][State: 284][Reserved: 4]
BLOCK MSG: job violates idle HARD MAXIJOB limit of 5 for user kamil partition ALL (Req: 1 InUse: 5) (recorded at last scheduling iteration)

35 Demonstration on cluster
SSH to the cluster and show all of the following commands and how to interpret them:
jobinfo -j
qstat -t -u $USER
qstat -a
qstat -r
showq
showq -i
showq -b
qstat -f <jobid>
checkjob <jobid>
checkjob -v -v <jobid>

36 Job information practice: BREAK FOR PRACTICE

37 QUESTIONS?

38 The End
