Requesting Resources on an HPC Facility


1 Requesting Resources on an HPC Facility (Using the Sun Grid Engine Job Scheduler) Deniz Savas dsavas.staff.shef.ac.uk/teaching June 2017

2 Outline 1. Using the Job Scheduler 2. Interactive Jobs 3. Batch Jobs 4. Task arrays 5. Running Parallel Jobs 6. GPUs and remote Visualisation 7. Beyond Iceberg Accessing the N8 tier 2 facility

3 Running Jobs A note on interactive jobs Software that requires intensive computing should be run on the worker nodes, not the head node. Run compute-intensive interactive jobs on the worker nodes by using the qsh or qrsh command. The maximum (and also default) time limit for interactive jobs is 8 hours.

4 Sun Grid Engine The two iceberg or sharc headnodes are gateways to the cluster of worker nodes. The headnodes' main purpose is to give logged-in users access to the worker nodes. All CPU-intensive computations must be performed on the worker nodes. This is achieved by using one of the following two commands on the headnode: qsh or qrsh, to start an interactive session on a worker node, or qsub, to submit a batch job to the cluster. Once you log into iceberg, you are recommended to start working interactively on a worker node by simply typing qsh and working in the new shell window that opens. The next set of slides assume that you are already working on one of the worker nodes (qsh session).

5 Practice Session 1 Running Applications on Iceberg (Problem 1) Case Studies Analysis of Patient Inflammation Data Running an R application how to submit jobs and run R interactively List available and loaded modules load the module for the R package Start the R Application and plot the inflammation data

6 Other Methods of submitting Batch Jobs on the Sheffield HPC clusters Iceberg has a number of home-grown commands for submitting jobs for some of the most popular applications and packages to the batch system. These commands create suitable scripts and submit the user's job to the cluster automatically. They are: runfluent, runansys, runmatlab, runabaqus. To get information on how to use these commands, simply issue the command name on a worker node without any parameters.

7 Exercise 1: Submit a job via qsub Create a script file (named example.sh) by using a text editor such as gedit, vi or emacs and entering the following lines:

#!/bin/sh
#
echo This code is running on
/bin/hostname
/bin/date

Now submit this script to SGE using the qsub command: qsub example.sh

8 Tutorials On iceberg copy the contents of the tutorial directory to your user area, into a directory named sge:

cp -r /usr/local/courses/sge sge
cd sge

In this directory the file readme.txt contains all the instructions necessary to perform the exercises.

9 Managing Your Jobs Sun Grid Engine Overview SGE is the resource management, job scheduling and batch control system. (Others are available, such as PBS, Torque/Maui and Platform LSF.) It starts up interactive jobs on available workers, schedules all batch-oriented (i.e. non-interactive) jobs, attempts to create a fair-share environment, and optimizes resource utilization.

10 [Diagram] The SGE MASTER node receives qsub batch jobs (JOB X, U, Y, N, Z, O) and schedules them onto slots on the SGE worker nodes via Queue-A, Queue-B and Queue-C, according to queues, policies, priorities, share/tickets, resources and users/projects.

11 Working with SGE as a user Although the SGE system contains many commands and utilities, most of them are for the administration of the scheduling system only. The following list of SGE commands will be sufficient for most users. qsub : Submits a batch job. qsh or qrsh : Starts an interactive session. qstat : Queries the progress of jobs. qdel : Removes unwanted jobs.

12 Running interactive jobs on the cluster 1. The user asks to run an interactive job (qsh, qrsh). 2. SGE checks whether there are resources available to start the job immediately (i.e. a free worker). If so, the interactive session is started under the control/monitoring of SGE on that worker. If resources are not available, the request is simply rejected and the user notified; by its very nature, an interactive session cannot sit and wait to start. 3. The user terminates the job by typing exit or logout, or the job is terminated when the queue limits are reached (currently after 8 hours of wall-clock time usage).

13 Demonstration 1 Running Jobs batch job example Using the R package to analyse patient data. qsub example:

qsub -l h_rt=10:00:00 -o myoutputfile -j y myjob

OR, alternatively, the first few lines of the submit script myjob contain:

#!/bin/bash
#$ -l h_rt=10:00:00
#$ -o myoutputfile
#$ -j y

and you simply type: qsub myjob

14 Summary table of useful SGE commands

Command(s) | Description | User/System
qsub, qresub, qmon | Submit batch jobs | USER
qsh, qrsh | Submit interactive jobs | USER
qstat, qhost, qdel, qmon | Status of queues and jobs in queues, list of execute nodes, remove jobs from queues | USER
qacct, qmon, qalter, qdel, qmod | Monitor/manage accounts, queues, jobs etc. | SYSTEM ADMIN

15 Using the qsub command to submit batch jobs In its simplest form any script file can be submitted to the SGE batch queue by simply typing qsub scriptfile. In this way the script file is queued to be executed by SGE under default conditions and using a default amount of resources. Such use is not always desirable, as the default conditions provided may not be appropriate for that job. Also, providing a good estimate of the amount of resources needed helps SGE to schedule the tasks more efficiently. There are two alternative mechanisms for specifying the environment and resources: 1) via parameters to the qsub command; 2) via special SGE comments (#$) in the script file that is submitted. The meaning of the parameters is the same for both methods, and they control such things as the CPU time required, the number of processors needed (for multi-processor jobs), output file names, and notification of job activity.

16 Method 1 Using qsub command-line parameters Format: qsub [qsub_params] script_file [-- script_arguments] Examples:

qsub myjob
qsub -cwd $HOME/myfolder1
qsub -l h_rt=00:05:00 myjob -- test1 -large

Note that the last example passes parameters to the script file following the -- token.

17 Method 2 special comments in script files A script file is a file containing a set of Unix commands written in a scripting language, usually Bourne/Bash or C-Shell. When the job runs, these script files are executed as if their contents were typed at the keyboard. In a script file any line beginning with # will normally be treated as a comment line and ignored. However, SGE treats comment lines in the submitted script which start with the special sequence #$ in a special way: SGE expects to find declarations of the qsub options in these comment lines, and at the time of job submission SGE determines the job resources from them. If there are any conflicts between the actual qsub command-line parameters and the special comment (#$) options, the command-line parameters always override the #$ options specified in the script.

18 An example script containing SGE options

#!/bin/sh
# A simple job script for Sun Grid Engine.
#
#$ -l h_rt=01:00:00
#$ -m be
#$ -M username@shef.ac.uk
benchtest < inputfile > myresults

19 More examples of #$ options in a scriptfile

#!/bin/csh
# Force the shell to be the C-shell
# (on iceberg the default shell is the bash-shell)
#$ -S /bin/csh
# Request 8 GBytes of virtual memory
#$ -l mem=8g
# Specify myresults as the output file
#$ -o myresults
# Compile the program
pgf90 test.for -o mytestprog
# Run the program, reading the data that the program
# would have read from the keyboard from file mydata
mytestprog < mydata

20 Running Jobs qsub and qsh options

-l h_rt=hh:mm:ss  The wall clock time. This parameter must be specified; failure to include it will result in the error message: Error: no suitable queues. Current default is 8 hours.
-l arch=intel* or -l arch=amd*  Force SGE to select either Intel or AMD architecture nodes. No need to use this parameter unless the code has a processor dependency.
-l mem=memory  Sets the virtual-memory limit, e.g. -l mem=10g (for parallel jobs this is per processor and not total). Current default if not specified is 6 GB.
-l rmem=memory  Sets the limit of real memory required. Current default is 2 GB. Note: the rmem parameter must always be less than mem.
-help  Prints a list of options.
-pe ompigige np, -pe openmpi-ib np, -pe openmp np  Specifies the parallel environment to be used; np is the number of processors required for the parallel job.

21 Running Jobs qsub and qsh options (continued)

-N jobname  By default a job's name is constructed from the job-script filename and the job-id allocated to the job by SGE. This option defines the jobname. Make sure it is unique, because the job output files are constructed from the jobname.
-o output_file  Output is directed to a named file. Make sure not to overwrite important files by accident.
-j y  Join the standard output and standard error output streams. Recommended.
-m [bea] -M email-address  Sends e-mails about the progress of the job to the specified address. If used, both -m and -M must be specified. Select any or all of b, e and a to imply e-mailing when the job begins, ends or aborts.
-P project_name  Runs a job using the specified project's allocation of resources.
-S shell  Use the specified shell to interpret the script rather than the default bash shell. Use with care; a better option is to specify the shell in the first line of the job script, e.g. #!/bin/bash
-V  Export all environment variables currently in effect to the job.

22 Qsub options (notifications and testing related)

-M email_address  E-mail address for job notifications. Example: -M Joe.Bloggs@gmail.com
-m [b|e|a|s]  Send e-mail(s) when the job begins, ends, is aborted or is suspended. Example: -m be
-now  Start running the job now, or exit with an error code if it can't be run.
-verify  Do not submit the job, but check and report on the submission.

23 Qsub options (output files and job names related) When a job is submitted it is given a unique job_number. Also, by default the name of the script file is used as the jobname. When the job starts running, the standard output and error outputs are sent to files named jobname.ojob_number and jobname.ejob_number respectively. For example: myscript.o45612 and myscript.e45612. The -o, -e, -j y and -N name parameters modify this behaviour.
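The default naming convention above can be sketched in plain shell. The jobname and job number here are the hypothetical example values from the slide, not output from a real submission:

```shell
# Hypothetical stand-in values (a real job gets these from SGE):
jobname="myscript"     # defaults to the script file name; override with -N
job_number=45612       # unique id assigned by SGE at submission time

# SGE's default output-file naming, as described above:
stdout_file="${jobname}.o${job_number}"
stderr_file="${jobname}.e${job_number}"
echo "${stdout_file} ${stderr_file}"
```

With -j y the two streams would be merged into the single .o file instead.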

24 Example: relating to job output files Passed to qsub as arguments during submission:

qsub -N walldesign -j y walljob

OR insert in the submit script file walljob:

#!/bin/bash
#$ -N walldesign
#$ -j y
/home/data/my_app

and submit the job: qsub walljob

Using either of these methods, when the job runs both the normal and error output will be contained in a file named walldesign.onnnnn, where nnnnn is the unique job number SGE assigned to your job at the time of submission.

25 More on starting interactive jobs qsh and qrsh commands qsh starts an Xterm session on a worker node; use this command if you have X Windows capability. qrsh starts a remote command shell and optionally executes a shell script on a worker node. If you do not have X Windows capability (i.e. you are not using Exceed, Xming, Cygwin and so on), this is the way to start interactive jobs on iceberg; it is suitable when you log in using putty or ssh in line mode. In Sheffield all interactive jobs are put into the short queues, which limit the job to 8 hours of wall-clock time. BEWARE: as soon as the time limit is exceeded the job will terminate without any warning.

26 More on the qrsh command qrsh [parameters] If no parameters are given it behaves exactly like qsh; this is the normal method of using qrsh. If there are parameters, a remote shell is started up on one of the workers and the parameters are passed to the shell for execution. For example, if a script file name is presented as a parameter, the commands in the script file are executed and the job terminates when the end of the script file is reached. Example: qrsh myscript

27 More on the qsh command qsh -display display_specifier qsh starts up an X-terminal within which the interactive job is started. It is possible to pass any Xterm parameters via the -- construct. Example: qsh -- -title myjob1. Type man xterm for a list of parameters. Qsh: this is a home-produced variation of the qsh command; it passes suitable display parameters to qsh to produce a more pleasant-looking command window.

28 Monitoring your jobs A submitted job will either be 1. still waiting in the queue, 2. executing, or 3. finished execution and gone from the SGE scheduling system. To monitor the progress of your job while in states (1) and (2), use the qstat or Qstat commands, which will tell you whether the job is still waiting or has started executing. The command qstat gives info about all jobs, but Qstat gives info about your jobs alone. While executing (state 2), use qstat -j job_number to monitor the job's status, including time and memory consumption. This contains too much information! Better still, use qstat -j job_number | grep mem, which will give the time and memory consumed. Also use tail -f job_output_filename to see the latest output from the job. Finished executing (state 3): qacct is the only command that may be able to tell you about past jobs, by referring to a database of past usage. Output file names will contain the job number, so qacct -j job_number should give some information.

29 Monitoring your job If you are interested in only your jobs, use Qstat; if you want to see all the jobs, use qstat. Both print one line per job with the columns: job-id, prior, name, user, state, submit/start at, queue, slots, ja-task-id. [The slide shows two sample listings: a Qstat listing with two INTERACTIVE jobs of user cs1ds running (state r) on interactive.q@node94.iceberg.s..., and a qstat listing of many users' jobs in the long.q, parallel.q and eenano.q queues, most running (r) and some queued-waiting (qw).]

30 qstat command The qstat command will list all the jobs in the system that are either waiting to be run or running. This can be a very long list! qstat -f gives a full listing (even longer); qstat -u username, or Qstat (recommended!), lists your own jobs; qstat -f -u username gives detailed information. The status of a job is indicated by letters in qstat listings: qw waiting, t transferring, r running, s/S suspended, R restarted, T threshold.

31 Deleting Jobs with the qdel command The qdel command will remove from the queue the specified jobs that are waiting to be run, or abort jobs that are already running. Individual job: qdel job_id. List of jobs: qdel job_id1 job_id2 ... All jobs running or queueing under a given username: qdel -u <username>

32 Reasons for Job Failures SGE cannot find the binary file specified in the job script. One of the Linux environment resource limits is exceeded (see the command ulimit -a). Required input files are missing from the startup directory. You have exceeded your quota and the job fails when trying to write to a file (use the quota command to check usage). An environment variable is not set (LM_LICENSE_FILE etc.). Hardware failure.

33 Job Arrays By using a single qsub command, it is possible to submit a series of jobs that use the same job template. These jobs are described as array jobs. For example, qsub myanalysis -t 1-10 will submit the script named myanalysis as 10 separate jobs. Of course it would be pointless to run the same job 10 times; the only justification for doing so is that all these jobs are doing different tasks. This is where a special environment variable named SGE_TASK_ID becomes essential. In the above example, in each job the variable SGE_TASK_ID will contain a unique number between 1 and 10 to differentiate these jobs from each other. Thus we can use this variable's value to control each job in a different manner. Please note that there is no guarantee about the order of execution of these tasks, i.e. there is no guarantee that task number m will start before task number n, where m<n.
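The per-task behaviour can be sketched locally without SGE. The loop below simply emulates what each array task of the slide's "qsub myanalysis -t 1-10" example would see; under a real submission SGE sets SGE_TASK_ID itself, one value per task:

```shell
# Local sketch (no SGE needed): emulate the per-task view of an array job.
# Filenames follow the mydata.N / results.N convention used in the slides.
for SGE_TASK_ID in 1 2 3; do
  input="mydata.${SGE_TASK_ID}"
  output="results.${SGE_TASK_ID}"
  echo "task ${SGE_TASK_ID}: reads ${input}, writes ${output}"
done
```

In a real array job there is no loop: each task is a separate job that reads only its own SGE_TASK_ID.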

34 Example Array jobs and the $SGE_TASK_ID variable

#$ -S /bin/tcsh
#$ -l h_cpu=01:00:00
#$ -t 2-16:2
#$ -cwd
myprog > results.$SGE_TASK_ID < mydata.$SGE_TASK_ID

This will run 8 jobs. The jobs are considered independent of each other and hence may run in parallel, depending on the availability of resources. Note that the tasks will be numbered 2, 4, 6, ..., 16 (steps of 2). For example, task 8 will read its data from file mydata.8 and write its output to file results.8. It is possible to make these jobs dependent on each other, so as to impose an order of execution, by means of the -hold_jid parameter.

35 Practice Session: Submitting A Task Array To Iceberg (Problem 4) Case Study Fish population simulation Submitting jobs to Sun Grid Engine Instructions are in the readme file in the sge folder of the course examples From an interactive session Run the SGE task array example Run test4, test5

36 An example OpenMP job script OpenMP programming takes advantage of the multiple CPUs that reside in a single computer to distribute work amongst CPUs that share the same memory. Currently we have a maximum of 8 CPUs per computer, and therefore only up to 8 processors can be requested for an iceberg OpenMP job. After the next upgrade this figure will increase to a minimum of 24.

#$ -pe openmp 4
#$ -l h_rt=01:30:00
OMP_NUM_THREADS=4 ./myprog

37 An example MPI job script MPI programs are harder to code but can take advantage of interconnected multiple computers by passing messages between them (MPI = Message Passing Interface). 23 workers in the iceberg pool are connected together with fast Infiniband communications cabling, providing up to 10 Gbits/sec data transfer rate between them. The rest of the workers communicate with each other via the normal 1 Gbits/sec ethernet cables.

#$ -pe mvapich2-ib 4
# limit run to 1 hour actual clock time
#$ -l h_rt=1:00:00
mpirun_rsh -rsh -np $NSLOTS -hostfile $TMPDIR/machines ./executable

38 Managing Jobs : Running cpu-parallel jobs (shared memory and distributed memory) The parallel environment needed for a job can be specified by the -pe <env> nn parameter of the qsub command, where <env> is one of: openmp - shared-memory OpenMP jobs, which must therefore run on a single node using its multiple processors; openmpi-ib - OpenMPI library over Infiniband, i.e. MPI jobs running on multiple hosts using the Infiniband connection (32 GBits/sec); mvapich2-ib - as above but using the MVAPICH MPI library. Compilers that support MPI: PGI, Intel, GNU.

39 Running GPU parallel jobs GPU parallel processing is supported on 8 Nvidia Tesla Fermi M2070s GPU units attached to iceberg. In order to use the GPU hardware you will need to join the GPU project by e-mailing research-it@sheffield.ac.uk. You can then submit jobs that use the GPU facilities by using the following three parameters to the qsub command: -P gpu, -l arch=intel* and -l gpu=nn, where 1 <= nn <= 8 is the number of GPU modules to be used by the job. -P stands for the project that you belong to; see the next slide.

40 Demonstration 3 Running a parallel job Test 6 provides an opportunity to practice submitting parallel jobs to the scheduler. To run testmpi6, compile the MPI example: load the OpenMPI compiler module (module load mpi/intel/openmpi/1.8.3), compile the diffuse program (mpicc diffuse.c -o diffuse), then qsub testmpi6. Use qstat to monitor the job and examine the output.

41 The progress of your batch job The user submits a batch job as described above, e.g. qsub myscript_file. The job is placed in the queue and given a unique job number <nnnn>, of which the user is informed immediately. The user can check the progress of the job by using the qstat command; the status of the job is shown as qw (waiting), t (transferring) or r (running). The user can abort a job by using the qdel command at this stage. When the job runs, the standard output and error messages are placed in files named <my_scriptfile>.o<nnnn> and <my_scriptfile>.e<nnnn> respectively.

42 Hints Once you have prepared your job script you can test it by simply running it on its own for a very small or trivial problem. For example, if your script is called analyse.sh you simply type ./analyse.sh. This will immediately expose any errors in your script. Note that the qsub parameters which are defined using the #$ sequence will be treated as comments during this run. Q: Should I define the qsub parameters in the script file or as parameters at the time of issuing qsub? A: The choice is yours; I prefer to define any parameter which is not likely to alter between runs within the script file, to save myself having to remember it at each submission.
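This dry-run trick works because #$ lines are ordinary comments to the shell. A minimal demonstration, using the analyse.sh name from the hint above (the script body here is an assumed stand-in, not the course's real analysis):

```shell
# Create a toy job script; the #$ lines are SGE directives,
# but to the shell they are plain comments and are ignored.
cat > analyse.sh <<'EOF'
#!/bin/sh
#$ -l h_rt=00:05:00
#$ -j y
echo "analysis ran"
EOF
chmod +x analyse.sh

# Dry-run it directly, exactly as the hint suggests:
result=$(./analyse.sh)
echo "$result"
```

If the script has a syntax error or a missing input file, this run exposes it immediately, before any queue time is spent.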

43 SGE-related Environment Variables Apart from the specific environment variables passed via the -v or -V options, during the execution of a batch job the following environment variables are also available, to help build unique or customized filenames, messages etc. $HOME : your own login directory. $USER : your iceberg username. $JOB_NAME : the name of the job. $HOSTNAME : the name of the cluster node being used. $SGE_TASK_ID : the task number (important for task arrays). $NSLOTS : the number of processors used (important for parallel OpenMP or MPI jobs).
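A sketch of building a unique per-task filename from these variables. The values below are mock assignments for illustration; inside a real batch job SGE sets JOB_NAME, SGE_TASK_ID and NSLOTS automatically, and the filename scheme is just one assumed convention:

```shell
# Mock values (SGE provides these automatically inside a batch job):
JOB_NAME="myanalysis"
SGE_TASK_ID=7
NSLOTS=4

# Build a unique, self-describing output filename from SGE's variables:
outfile="${JOB_NAME}.${SGE_TASK_ID}.out"
echo "job ${JOB_NAME} task ${SGE_TASK_ID} on ${NSLOTS} slots -> ${outfile}"
```

Because SGE_TASK_ID differs per array task, every task writes to its own file and no two tasks clobber each other's output.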

44 Submitting Batch Jobs via the qmon command If you are using an X terminal (such as provided by Exceed), a GUI interface named qmon can also be used to make job submission easier. This command also provides an easier way of setting the job parameters.

45 Job submission panel of QMON Click on Job Submission Icon Click to browse for the job script test2

46 Job queues Unlike traditional batch queue systems, users do not need to select the queue they are submitting to. Instead SGE uses the resource needs specified by the user to determine the best queue for the job. In Sheffield and Leeds the underlying queues are set up according to memory size and CPU time requirements, and also the number of CPUs needed (for MPI and OpenMP jobs). qstat -F displays full queue information; also qmon (Task: Queue_Control) will allow information to be distilled about the queue limits.

47 Job queue configuration Normally you will not need to know the details of each queue, as the Grid Engine will make the decisions for you in selecting a suitable queue for your job. If you feel the need to find out how the job queues are configured, perhaps to aid you in specifying the appropriate resources, you may do so by using the qconf system administrator command: qconf -sql will give a list of all the queues; qconf -sq queue_name will list the details of a specific queue's configuration.

48 Monitoring the progress of your jobs The commands qstat and the X Windows based qmon can be used to check on the progress of your jobs through the system. We recommend that you use the qmon command if your terminal has X capability, as this makes it easier to view your job's progress and also to cancel or abort it, if that becomes necessary.

49 Checking the progress of jobs with QMON Click on Job Control Icon Click on Running Jobs tab

50 Managing Jobs monitoring and controlling your jobs There are a number of commands for querying and modifying the status of a job, running or waiting to run. These are: qstat or Qstat (query job status), qstat -u username, qdel (delete a job, e.g. qdel jobid), and qmon (a GUI interface for SGE).

51 Practice Session: Submitting Jobs To Iceberg (Problem 2 & 3) Patient Inflammation Study run the R example as a batch job Case Study Fish population simulation Submitting jobs to Sun Grid Engine Instructions are in the readme file in the sge folder of the course examples From an interactive session Load the compiler module Compile the fish program Run test1, test2 and test3

52 Managing Jobs: Reasons for job failures SGE cannot find the binary file specified in the job script. You ran out of file storage: it is possible to exceed your filestore allocation limits during a job that is producing large output files (use the quota command to check this). Required input files are missing from the startup directory. An environment variable is not set correctly (LM_LICENSE_FILE etc.). Hardware failure (e.g. communication equipment failure for MPI jobs).

53 Finding out the memory requirements of a job Virtual memory limits: the default virtual memory limit for each job is 6 GBytes. Jobs will be killed if the virtual memory used by the job exceeds the amount requested via the -l mem= parameter. Real memory limits: the default real memory allocation is 2 GBytes. Real memory can be requested by using -l rmem=. Jobs exceeding the real memory allocation will not be deleted but will run with reduced efficiency, and the user will be e-mailed about the memory deficiency. When you get warnings of that kind, increase the real memory allocation for your job by using the -l rmem= parameter. rmem must always be less than mem. Determining the virtual memory requirements of a job: qstat -f -j jobid | grep mem. The reported figures will indicate the currently used memory (vmem), the maximum memory needed since startup (maxvmem), and the cumulative memory_usage*seconds (mem). When you next run the job, use the reported value of vmem to specify the memory requirement.

54 Remote Visualisation See Specialist High Speed Visualization Access to iceberg Undertake visualisation using thin clients accessing remote high quality visualisation hardware Remote visualisation removes the need to transfer data and allows researchers to visualise data sets on remote visualisation servers attached to the high performance computer and its storage facility

55 VirtualGL VirtualGL is an open source package which gives any UNIX or Linux remote display software the ability to run 3D applications with full hardware acceleration. VirtualGL can also be used in conjunction with remote display software such as VNC to provide 3D hardware-accelerated rendering for OpenGL applications. VirtualGL is very useful in providing remote display to thin clients which lack 3D hardware acceleration.

56 [Diagram] Client Access to the Visualisation Cluster: a VirtualGL client connects to a VirtualGL server (NVIDIA GPU) on Iceberg / the Campus Compute Cloud.

57 Remote Visualisation Using SGD Start a browser and log in to Sun Global Desktop. Under Iceberg Applications start the Remote Visualisation session. This opens a shell with instructions to either open a browser and enter the address, or start TigerVNC Viewer on your desktop and use the address iceberg.shef.ac.uk:XXXX, where XXXX is a port address provided on the iceberg terminal. When requested, use your usual iceberg user credentials.

58 [image slide]

59 Remote Desktop Through VNC

60 Remote Visualisation Using TigerVNC and the Putty SSH Client Log in to iceberg using putty. At the prompt type qsh-vis. This opens a shell with instructions to either open a browser and enter the address, or start TigerVNC Viewer on your desktop and use the address iceberg.shef.ac.uk:XXXX, where XXXX is a port address provided on the iceberg terminal. When requested, use your usual iceberg user credentials.

61 Beyond Iceberg Iceberg is OK for many compute problems. Options beyond it: purchasing dedicated resource; the N8 tier 2 facility for more demanding compute problems; Hector/Archer, larger facilities for grand challenge problems (peer review process to access).

62 High Performance Computing Tiers Tier 1 computing: Hector, Archer. Tier 2 computing: Polaris. Tier 3 computing: Iceberg.

63 Purchasing Resource Buying nodes using the framework: research groups purchase HPC equipment against their research grant, and this hardware is integrated with the Iceberg cluster. Buying a slice of time: research groups can purchase servers for a length of time specified by the research group (cost is 1.7p per core per hour). Servers are reserved for dedicated usage by the research group using a provided project name. When reserved nodes are idle they become available to the general short queues; they are quickly released for use by the research group when required. For information e-mail research-it@sheffield.ac.uk

64 The N8 Tier 2 Facility: Polaris Note: N8 is for users whose research problems require greater resource than that available through Iceberg. Registration is through projects: authorisation by a supervisor or project leader to register the project with the N8; users obtain a project code from the supervisor or project leader, then complete the online form, providing an outline of the work explaining why N8 resources are required.

65 Polaris: Specifications 5312 Intel Sandy Bridge cores. Co-located with the 4500-core Leeds HPC. Purchased through the Esteem framework agreement: SGI hardware. Ranked #291 in the June 2012 Top500 list.

66 National HPC Services Archer UK National Supercomputing Service. Hardware: Cray XC standard nodes; each node contains two Intel E v2 12-core processors, therefore 2632*2* cores; 64 GB of memory per node; 376 high-memory nodes with 128 GB of memory; nodes connected to each other via the ARIES low-latency interconnect; Research Data File System with 7.8 PB of disk. EPCC HPCC facilities: training and expertise in parallel computing.

67 Sheffield University Web References Interactive Jobs Batch Jobs

68 Links for Software Downloads Putty, WinSCP, TigerVNC


Using the computational resources at the GACRC An introduction to zcluster Georgia Advanced Computing Resource Center (GACRC) University of Georgia Dr. Landau s PHYS4601/6601 course - Spring 2017 What is GACRC? Georgia Advanced Computing Resource Center

More information

High Performance Computing (HPC) Using zcluster at GACRC

High Performance Computing (HPC) Using zcluster at GACRC High Performance Computing (HPC) Using zcluster at GACRC On-class STAT8060 Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC?

More information

UoW HPC Quick Start. Information Technology Services University of Wollongong. ( Last updated on October 10, 2011)

UoW HPC Quick Start. Information Technology Services University of Wollongong. ( Last updated on October 10, 2011) UoW HPC Quick Start Information Technology Services University of Wollongong ( Last updated on October 10, 2011) 1 Contents 1 Logging into the HPC Cluster 3 1.1 From within the UoW campus.......................

More information

Why You Should Consider Grid Computing

Why You Should Consider Grid Computing Why You Should Consider Grid Computing Kenny Daily BIT Presentation 8 January 2007 Outline Motivational Story Electric Fish Grid Computing Overview N1 Sun Grid Engine Software Use of UCI's cluster My Research

More information

Introduction to HPC Using zcluster at GACRC

Introduction to HPC Using zcluster at GACRC Introduction to HPC Using zcluster at GACRC Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC? What is HPC Concept? What is

More information

Programming Environment on Ranger Cluster

Programming Environment on Ranger Cluster Programming Environment on Ranger Cluster Cornell Center for Advanced Computing December 8, 2010 12/8/2010 www.cac.cornell.edu 1 User Guides TACC Ranger (http://services.tacc.utexas.edu/index.php/ranger-user-guide)

More information

Introduction to GALILEO

Introduction to GALILEO November 27, 2016 Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it SuperComputing Applications and Innovation Department

More information

User Guide of High Performance Computing Cluster in School of Physics

User Guide of High Performance Computing Cluster in School of Physics User Guide of High Performance Computing Cluster in School of Physics Prepared by Sue Yang (xue.yang@sydney.edu.au) This document aims at helping users to quickly log into the cluster, set up the software

More information

SGE Roll: Users Guide. Version 5.3 Edition

SGE Roll: Users Guide. Version 5.3 Edition SGE Roll: Users Guide Version 5.3 Edition SGE Roll: Users Guide : Version 5.3 Edition Published Dec 2009 Copyright 2009 University of California and Scalable Systems This document is subject to the Rocks

More information

Sharpen Exercise: Using HPC resources and running parallel applications

Sharpen Exercise: Using HPC resources and running parallel applications Sharpen Exercise: Using HPC resources and running parallel applications Contents 1 Aims 2 2 Introduction 2 3 Instructions 3 3.1 Log into ARCHER frontend nodes and run commands.... 3 3.2 Download and extract

More information

Cluster User Training

Cluster User Training Cluster User Training From Bash to parallel jobs under SGE in one terrifying hour Christopher Dwan, Bioteam First delivered at IICB, Kolkata, India December 14, 2009 UNIX ESSENTIALS Unix command line essentials

More information

Image Sharpening. Practical Introduction to HPC Exercise. Instructions for Cirrus Tier-2 System

Image Sharpening. Practical Introduction to HPC Exercise. Instructions for Cirrus Tier-2 System Image Sharpening Practical Introduction to HPC Exercise Instructions for Cirrus Tier-2 System 2 1. Aims The aim of this exercise is to get you used to logging into an HPC resource, using the command line

More information

Introduction to HPC Using zcluster at GACRC

Introduction to HPC Using zcluster at GACRC Introduction to HPC Using zcluster at GACRC On-class PBIO/BINF8350 Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC? What

More information

Shark Cluster Overview

Shark Cluster Overview Shark Cluster Overview 51 Execution Nodes 1 Head Node (shark) 1 Graphical login node (rivershark) 800 Cores = slots 714 TB Storage RAW Slide 1/14 Introduction What is a cluster? A cluster is a group of

More information

Introduction to HPC Using zcluster at GACRC

Introduction to HPC Using zcluster at GACRC Introduction to HPC Using zcluster at GACRC Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu 1 Outline What is GACRC? What is HPC Concept? What

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to Job Submission and Scheduling Andrew Gustafson Interacting with MSI Systems Connecting to MSI SSH is the most reliable connection method Linux and Mac

More information

NBIC TechTrack PBS Tutorial

NBIC TechTrack PBS Tutorial NBIC TechTrack PBS Tutorial by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen Visit our webpage at: http://www.nbic.nl/support/brs 1 NBIC PBS Tutorial

More information

Effective Use of CCV Resources

Effective Use of CCV Resources Effective Use of CCV Resources Mark Howison User Services & Support This talk... Assumes you have some familiarity with a Unix shell Provides examples and best practices for typical usage of CCV systems

More information

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop Before We Start Sign in hpcxx account slips Windows Users: Download PuTTY Google PuTTY First result Save putty.exe to Desktop Research Computing at Virginia Tech Advanced Research Computing Compute Resources

More information

Sharpen Exercise: Using HPC resources and running parallel applications

Sharpen Exercise: Using HPC resources and running parallel applications Sharpen Exercise: Using HPC resources and running parallel applications Andrew Turner, Dominic Sloan-Murphy, David Henty, Adrian Jackson Contents 1 Aims 2 2 Introduction 2 3 Instructions 3 3.1 Log into

More information

OBTAINING AN ACCOUNT:

OBTAINING AN ACCOUNT: HPC Usage Policies The IIA High Performance Computing (HPC) System is managed by the Computer Management Committee. The User Policies here were developed by the Committee. The user policies below aim to

More information

Grid Engine Users Guide. 7.0 Edition

Grid Engine Users Guide. 7.0 Edition Grid Engine Users Guide 7.0 Edition Grid Engine Users Guide : 7.0 Edition Published Dec 01 2017 Copyright 2017 University of California and Scalable Systems This document is subject to the Rocks License

More information

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced Sarvani Chadalapaka HPC Administrator University of California

More information

Batch system usage arm euthen F azo he Z J. B T

Batch system usage arm euthen F azo he Z J. B T Batch system usage 10.11.2010 General stuff Computing wikipage: http://dvinfo.ifh.de Central email address for questions & requests: uco-zn@desy.de Data storage: AFS ( /afs/ifh.de/group/amanda/scratch/

More information

Introduction to GALILEO

Introduction to GALILEO Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Alessandro Grottesi a.grottesi@cineca.it SuperComputing Applications and

More information

Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide

Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide Introduction What are the intended uses of the MTL? The MTL is prioritized for supporting the Intel Academic Community for the testing, validation

More information

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012 Slurm and Abel job scripts Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012 Abel in numbers Nodes - 600+ Cores - 10000+ (1 node->2 processors->16 cores) Total memory

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to MSI Systems Andrew Gustafson The Machines at MSI Machine Type: Cluster Source: http://en.wikipedia.org/wiki/cluster_%28computing%29 Machine Type: Cluster

More information

NBIC TechTrack PBS Tutorial. by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen

NBIC TechTrack PBS Tutorial. by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen NBIC TechTrack PBS Tutorial by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen 1 NBIC PBS Tutorial This part is an introduction to clusters and the PBS

More information

Installing and running COMSOL 4.3a on a Linux cluster COMSOL. All rights reserved.

Installing and running COMSOL 4.3a on a Linux cluster COMSOL. All rights reserved. Installing and running COMSOL 4.3a on a Linux cluster 2012 COMSOL. All rights reserved. Introduction This quick guide explains how to install and operate COMSOL Multiphysics 4.3a on a Linux cluster. It

More information

New User Tutorial. OSU High Performance Computing Center

New User Tutorial. OSU High Performance Computing Center New User Tutorial OSU High Performance Computing Center TABLE OF CONTENTS Logging In... 3-5 Windows... 3-4 Linux... 4 Mac... 4-5 Changing Password... 5 Using Linux Commands... 6 File Systems... 7 File

More information

HPCC New User Training

HPCC New User Training High Performance Computing Center HPCC New User Training Getting Started on HPCC Resources Eric Rees, Ph.D. High Performance Computing Center Fall 2018 HPCC User Training Agenda HPCC User Training Agenda

More information

Introduction to HPC Using zcluster at GACRC On-Class GENE 4220

Introduction to HPC Using zcluster at GACRC On-Class GENE 4220 Introduction to HPC Using zcluster at GACRC On-Class GENE 4220 Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala pakala@uga.edu Slides courtesy: Zhoufei Hou 1 OVERVIEW GACRC

More information

XSEDE New User Tutorial

XSEDE New User Tutorial April 2, 2014 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Make sure you sign the sign in sheet! At the end of the module, I will ask you to

More information

To connect to the cluster, simply use a SSH or SFTP client to connect to:

To connect to the cluster, simply use a SSH or SFTP client to connect to: RIT Computer Engineering Cluster The RIT Computer Engineering cluster contains 12 computers for parallel programming using MPI. One computer, phoenix.ce.rit.edu, serves as the master controller or head

More information

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Partners Funding Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike

More information

ACEnet for CS6702 Ross Dickson, Computational Research Consultant 29 Sep 2009

ACEnet for CS6702 Ross Dickson, Computational Research Consultant 29 Sep 2009 ACEnet for CS6702 Ross Dickson, Computational Research Consultant 29 Sep 2009 What is ACEnet? Shared resource......for research computing... physics, chemistry, oceanography, biology, math, engineering,

More information

Introduction to the NCAR HPC Systems. 25 May 2018 Consulting Services Group Brian Vanderwende

Introduction to the NCAR HPC Systems. 25 May 2018 Consulting Services Group Brian Vanderwende Introduction to the NCAR HPC Systems 25 May 2018 Consulting Services Group Brian Vanderwende Topics to cover Overview of the NCAR cluster resources Basic tasks in the HPC environment Accessing pre-built

More information

PBS Pro Documentation

PBS Pro Documentation Introduction Most jobs will require greater resources than are available on individual nodes. All jobs must be scheduled via the batch job system. The batch job system in use is PBS Pro. Jobs are submitted

More information

Using ISMLL Cluster. Tutorial Lec 5. Mohsan Jameel, Information Systems and Machine Learning Lab, University of Hildesheim

Using ISMLL Cluster. Tutorial Lec 5. Mohsan Jameel, Information Systems and Machine Learning Lab, University of Hildesheim Using ISMLL Cluster Tutorial Lec 5 1 Agenda Hardware Useful command Submitting job 2 Computing Cluster http://www.admin-magazine.com/hpc/articles/building-an-hpc-cluster Any problem or query regarding

More information

Working on the NewRiver Cluster

Working on the NewRiver Cluster Working on the NewRiver Cluster CMDA3634: Computer Science Foundations for Computational Modeling and Data Analytics 22 February 2018 NewRiver is a computing cluster provided by Virginia Tech s Advanced

More information

HPC DOCUMENTATION. 3. Node Names and IP addresses:- Node details with respect to their individual IP addresses are given below:-

HPC DOCUMENTATION. 3. Node Names and IP addresses:- Node details with respect to their individual IP addresses are given below:- HPC DOCUMENTATION 1. Hardware Resource :- Our HPC consists of Blade chassis with 5 blade servers and one GPU rack server. a.total available cores for computing: - 96 cores. b.cores reserved and dedicated

More information

Batch Systems. Running your jobs on an HPC machine

Batch Systems. Running your jobs on an HPC machine Batch Systems Running your jobs on an HPC machine Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

Name Department/Research Area Have you used the Linux command line?

Name Department/Research Area Have you used the Linux command line? Please log in with HawkID (IOWA domain) Macs are available at stations as marked To switch between the Windows and the Mac systems, press scroll lock twice 9/27/2018 1 Ben Rogers ITS-Research Services

More information

June 26, Explanatory meeting for users of supercomputer system -- Overview of UGE --

June 26, Explanatory meeting for users of supercomputer system -- Overview of UGE -- June 26, 2012 Explanatory meeting for users of supercomputer system -- Overview of UGE -- What is Univa Grid Engine (UGE)? It is software that is used to construct a grid computing system. It functions

More information

Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS

Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS Introduction to High Performance Computing High Performance Computing at UEA http://rscs.uea.ac.uk/hpc/

More information

Sun Grid Engine - A Batch System for DESY

Sun Grid Engine - A Batch System for DESY Sun Grid Engine - A Batch System for DESY Wolfgang Friebel, Peter Wegner 28.8.2001 DESY Zeuthen Introduction Motivations for using a batch system more effective usage of available computers (e.g. more

More information

XSEDE New User Tutorial

XSEDE New User Tutorial June 12, 2015 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Please remember to sign in for today s event: http://bit.ly/1fashvo Also, please

More information

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions How to run applications on Aziz supercomputer Mohammad Rafi System Administrator Fujitsu Technology Solutions Agenda Overview Compute Nodes Storage Infrastructure Servers Cluster Stack Environment Modules

More information

Center for Mathematical Modeling University of Chile HPC 101. HPC systems basics and concepts. Juan Carlos Maureira B.

Center for Mathematical Modeling University of Chile HPC 101. HPC systems basics and concepts. Juan Carlos Maureira B. Center for Mathematical Modeling University of Chile HPC 101 HPC systems basics and concepts By Juan Carlos Maureira B. BioMedicina II Calculo Masivo en Biomedicina CMM - FCFM - University

More information

An Introduction to Cluster Computing Using Newton

An Introduction to Cluster Computing Using Newton An Introduction to Cluster Computing Using Newton Jason Harris and Dylan Storey March 25th, 2014 Jason Harris and Dylan Storey Introduction to Cluster Computing March 25th, 2014 1 / 26 Workshop design.

More information

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing Quick Start Guide by Burak Himmetoglu Supercomputing Consultant Enterprise Technology Services & Center for Scientific Computing E-mail: bhimmetoglu@ucsb.edu Contents User access, logging in Linux/Unix

More information

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization 2 Glenn Bresnahan Director, SCV MGHPCC Buy-in Program Kadin Tseng HPC Programmer/Consultant

More information

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT November 13, 2013

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT November 13, 2013 Slurm and Abel job scripts Katerina Michalickova The Research Computing Services Group SUF/USIT November 13, 2013 Abel in numbers Nodes - 600+ Cores - 10000+ (1 node->2 processors->16 cores) Total memory

More information

PACE. Instructional Cluster Environment (ICE) Orientation. Research Scientist, PACE

PACE. Instructional Cluster Environment (ICE) Orientation. Research Scientist, PACE PACE Instructional Cluster Environment (ICE) Orientation Mehmet (Memo) Belgin, PhD Research Scientist, PACE www.pace.gatech.edu What is PACE A Partnership for an Advanced Computing Environment Provides

More information

Kohinoor queuing document

Kohinoor queuing document List of SGE Commands: qsub : Submit a job to SGE Kohinoor queuing document qstat : Determine the status of a job qdel : Delete a job qhost : Display Node information Some useful commands $qstat f -- Specifies

More information

SINGAPORE-MIT ALLIANCE GETTING STARTED ON PARALLEL PROGRAMMING USING MPI AND ESTIMATING PARALLEL PERFORMANCE METRICS

SINGAPORE-MIT ALLIANCE GETTING STARTED ON PARALLEL PROGRAMMING USING MPI AND ESTIMATING PARALLEL PERFORMANCE METRICS SINGAPORE-MIT ALLIANCE Computational Engineering CME5232: Cluster and Grid Computing Technologies for Science and Computing COMPUTATIONAL LAB NO.2 10 th July 2009 GETTING STARTED ON PARALLEL PROGRAMMING

More information

Introduction to HPC Using the New Cluster at GACRC

Introduction to HPC Using the New Cluster at GACRC Introduction to HPC Using the New Cluster at GACRC Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC? What is the new cluster

More information

Introduction to Discovery.

Introduction to Discovery. Introduction to Discovery http://discovery.dartmouth.edu The Discovery Cluster 2 Agenda What is a cluster and why use it Overview of computer hardware in cluster Help Available to Discovery Users Logging

More information

Introduction to CINECA Computer Environment

Introduction to CINECA Computer Environment Introduction to CINECA Computer Environment Today you will learn... Basic commands for UNIX environment @ CINECA How to submitt your job to the PBS queueing system on Eurora Tutorial #1: Example: launch

More information

Cluster Clonetroop: HowTo 2014

Cluster Clonetroop: HowTo 2014 2014/02/25 16:53 1/13 Cluster Clonetroop: HowTo 2014 Cluster Clonetroop: HowTo 2014 This section contains information about how to access, compile and execute jobs on Clonetroop, Laboratori de Càlcul Numeric's

More information

UF Research Computing: Overview and Running STATA

UF Research Computing: Overview and Running STATA UF : Overview and Running STATA www.rc.ufl.edu Mission Improve opportunities for research and scholarship Improve competitiveness in securing external funding Matt Gitzendanner magitz@ufl.edu Provide high-performance

More information

Introduction to HPC Using zcluster at GACRC

Introduction to HPC Using zcluster at GACRC Introduction to HPC Using zcluster at GACRC Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala pakala@uga.edu Slides courtesy: Zhoufei Hou OVERVIEW GACRC High Performance

More information

Our Workshop Environment

Our Workshop Environment Our Workshop Environment John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2017 Our Environment This Week Your laptops or workstations: only used for portal access Bridges

More information

XSEDE New User Tutorial

XSEDE New User Tutorial May 13, 2016 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Please complete a short on-line survey about this module at http://bit.ly/hamptonxsede.

More information

Migrating from Zcluster to Sapelo

Migrating from Zcluster to Sapelo GACRC User Quick Guide: Migrating from Zcluster to Sapelo The GACRC Staff Version 1.0 8/4/17 1 Discussion Points I. Request Sapelo User Account II. III. IV. Systems Transfer Files Configure Software Environment

More information

Introduction to Sheffield University High Performance Computing Facilities

Introduction to Sheffield University High Performance Computing Facilities Introduction to Sheffield University High Performance Computing Facilities Deniz Savas dsavas.staff.shef.ac.uk/teaching June 2017 Objectives 1. Understand what High Performance Computing is 2. Be able

More information

Submit a Job. Want to run a batch script: #!/bin/sh echo Starting job date /usr/bin/time./hello date echo Ending job. qsub A HPC job.

Submit a Job. Want to run a batch script: #!/bin/sh echo Starting job date /usr/bin/time./hello date echo Ending job. qsub A HPC job. Submit a Job Want to run a batch script: #!/bin/sh echo Starting job date /usr/bin/time./hello date echo Ending job Have to ask scheduler to do it. qsub A 20090528HPC job.sge #!/bin/sh #$ -N ht3d-hyb #$

More information

Introduction to Discovery.

Introduction to Discovery. Introduction to Discovery http://discovery.dartmouth.edu The Discovery Cluster 2 Agenda What is a cluster and why use it Overview of computer hardware in cluster Help Available to Discovery Users Logging

More information

Computing with the Moore Cluster

Computing with the Moore Cluster Computing with the Moore Cluster Edward Walter An overview of data management and job processing in the Moore compute cluster. Overview Getting access to the cluster Data management Submitting jobs (MPI

More information

Using the MaRC2 HPC Cluster

Using the MaRC2 HPC Cluster Using the MaRC2 HPC Cluster Manuel Haim, 06/2013 Using MaRC2??? 2 Using MaRC2 Overview Get access rights and permissions Starting a terminal session (Linux, Windows, Mac) Intro to the BASH Shell (and available

More information

HPC Course Session 3 Running Applications

HPC Course Session 3 Running Applications HPC Course Session 3 Running Applications Checkpointing long jobs on Iceberg 1.1 Checkpointing long jobs to safeguard intermediate results For long running jobs we recommend using checkpointing this allows

More information

A Brief Introduction to The Center for Advanced Computing

A Brief Introduction to The Center for Advanced Computing A Brief Introduction to The Center for Advanced Computing May 1, 2006 Hardware 324 Opteron nodes, over 700 cores 105 Athlon nodes, 210 cores 64 Apple nodes, 128 cores Gigabit networking, Myrinet networking,

More information

Introduction to HPC Resources and Linux

Introduction to HPC Resources and Linux Introduction to HPC Resources and Linux Burak Himmetoglu Enterprise Technology Services & Center for Scientific Computing e-mail: bhimmetoglu@ucsb.edu Paul Weakliem California Nanosystems Institute & Center

More information

Introduction to Discovery.

Introduction to Discovery. Introduction to Discovery http://discovery.dartmouth.edu March 2014 The Discovery Cluster 2 Agenda Resource overview Logging on to the cluster with ssh Transferring files to and from the cluster The Environment

More information

Logging in to the CRAY

Logging in to the CRAY Logging in to the CRAY 1. Open Terminal Cray Hostname: cray2.colostate.edu Cray IP address: 129.82.103.183 On a Mac 2. type ssh username@cray2.colostate.edu where username is your account name 3. enter

More information

Answers to Federal Reserve Questions. Training for University of Richmond

Answers to Federal Reserve Questions. Training for University of Richmond Answers to Federal Reserve Questions Training for University of Richmond 2 Agenda Cluster Overview Software Modules PBS/Torque Ganglia ACT Utils 3 Cluster overview Systems switch ipmi switch 1x head node

More information

PACE. Instructional Cluster Environment (ICE) Orientation. Mehmet (Memo) Belgin, PhD Research Scientist, PACE

PACE. Instructional Cluster Environment (ICE) Orientation. Mehmet (Memo) Belgin, PhD  Research Scientist, PACE PACE Instructional Cluster Environment (ICE) Orientation Mehmet (Memo) Belgin, PhD www.pace.gatech.edu Research Scientist, PACE What is PACE A Partnership for an Advanced Computing Environment Provides

More information

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende Introduction to NCAR HPC 25 May 2017 Consulting Services Group Brian Vanderwende Topics we will cover Technical overview of our HPC systems The NCAR computing environment Accessing software on Cheyenne

More information

For Dr Landau s PHYS8602 course

For Dr Landau s PHYS8602 course For Dr Landau s PHYS8602 course Shan-Ho Tsai (shtsai@uga.edu) Georgia Advanced Computing Resource Center - GACRC January 7, 2019 You will be given a student account on the GACRC s Teaching cluster. Your

More information

Our Workshop Environment

Our Workshop Environment Our Workshop Environment John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2017 Our Environment This Week Your laptops or workstations: only used for portal access Bridges

More information

Working with Shell Scripting. Daniel Balagué

Working with Shell Scripting. Daniel Balagué Working with Shell Scripting Daniel Balagué Editing Text Files We offer many text editors in the HPC cluster. Command-Line Interface (CLI) editors: vi / vim nano (very intuitive and easy to use if you

More information

OpenPBS Users Manual

OpenPBS Users Manual How to Write a PBS Batch Script OpenPBS Users Manual PBS scripts are rather simple. An MPI example for user your-user-name: Example: MPI Code PBS -N a_name_for_my_parallel_job PBS -l nodes=7,walltime=1:00:00

More information

Introduction to running C based MPI jobs on COGNAC. Paul Bourke November 2006

Introduction to running C based MPI jobs on COGNAC. Paul Bourke November 2006 Introduction to running C based MPI jobs on COGNAC. Paul Bourke November 2006 The following is a practical introduction to running parallel MPI jobs on COGNAC, the SGI Altix machine (160 Itanium2 cpus)

More information

Using a Linux System 6

Using a Linux System 6 Canaan User Guide Connecting to the Cluster 1 SSH (Secure Shell) 1 Starting an ssh session from a Mac or Linux system 1 Starting an ssh session from a Windows PC 1 Once you're connected... 1 Ending an

More information

Our new HPC-Cluster An overview

Our new HPC-Cluster An overview Our new HPC-Cluster An overview Christian Hagen Universität Regensburg Regensburg, 15.05.2009 Outline 1 Layout 2 Hardware 3 Software 4 Getting an account 5 Compiling 6 Queueing system 7 Parallelization

More information

XSEDE New User Tutorial

XSEDE New User Tutorial October 20, 2017 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Please complete a short on line survey about this module at http://bit.ly/xsedesurvey.

More information

Introduction to Cuda Visualization. Graphical Application Tunnelling on Palmetto

Introduction to Cuda Visualization. Graphical Application Tunnelling on Palmetto Introduction to Cuda Visualization The CUDA programming paradigm is NVidia's development tool which is used to enable advanced computer processing on their GPGPU (General Purpose graphics Processing Units)

More information