Task farming on Blue Gene

Fiona J. L. Reid

July 3, 2006

Abstract

In this paper we investigate how to implement a trivial task farm on the EPCC eserver Blue Gene/L system, BlueSky. This is achieved by adding a small number of MPI calls to an existing serial code. We illustrate the method using example codes and demonstrate it to be successful by application to a real user code.

Contents

1 Introduction
2 IBM eserver Blue Gene
3 Implementing a trivial task farm on Blue Gene
  3.1 Encapsulate the serial code with MPI calls
4 Test cases - ClockModel code
5 Conclusions
6 Appendix
  6.1 Fortran 90 version of the serial test code
  6.2 Fortran 90 version of the serial test code with MPI calls added
  6.3 C version of the serial test code
  6.4 C version of the serial test code with MPI calls added

1 Introduction

Many serial codes are limited by the total CPU time that they require to run. Often the individual tasks are actually independent of one another and can therefore potentially be run simultaneously (in parallel) on different processors. This approach can greatly reduce the actual time required to obtain a scientific result. For example, consider a code which takes 1 hour to execute and requires 1000 runs to obtain a reliable solution. On a single processor this would require 1000 hours (42 days) of continuous runs. The same result could be obtained in just 1 hour if all the runs can be performed simultaneously using 1000 processors. Distributing the separate runs across many processors in such a way is known as task farming.

Trivial task farming (or job farming) is one of the most common forms of parallelism available. It relies on being able to decompose your problem into a number of identical but independent serial tasks. Essentially, each processor (or node) runs its own copy of the serial code with its own input file(s) and output file(s). There is no communication required between the processes. The trivial task farming method is particularly suited to examining large independent parameter spaces or large independent datasets. Provided all tasks complete at the same time there will be no load imbalance and linear scaling will be obtained. Trivial task farming can be very efficient and on many systems is relatively easy to implement. For example, a Monte Carlo simulation would be a good candidate for the trivial task farm approach. In a Monte Carlo simulation the same model is typically run many times (with slightly different starting points). This allows statistically significant summaries of the overall model behaviour to be built up. As each model takes approximately the same length of time to run, linear scaling will be attainable.

The main advantages and disadvantages of the trivial task farm approach are given below:

Advantages

- Generally easy to implement (on some systems it can be carried out via the batch system directly, e.g. lomond, or via a task farm harness, e.g. HPCx)
- Can be very efficient, provided tasks take the same length of time
- Linear scaling can be achieved
- Existing serial code can be used with minimal modification; in some situations no modifications to the serial code are required whatsoever
- No communication overheads
- User may not require detailed knowledge of MPI techniques

Disadvantages

- If tasks take different amounts of time then execution time will be governed by the slowest process
- Data/parameter space must be truly independent
- Not ideal for problems requiring communication between processes
- May restrict future code development, e.g. problem size will be limited to that which can fit on a single processor

2 IBM eserver Blue Gene

BlueSky is an IBM eserver Blue Gene/L system consisting of a single cabinet containing 1024 compute chips (nodes). Each compute node consists of a dual-core 700 MHz PowerPC 440 processor and 512 MB of RAM. A compute node can operate in two modes: Coprocessor (CO) mode or Virtual-Node (VN) mode. In Coprocessor mode one core handles communication whilst the other handles computation, with 512 MB main memory available to the compute core. The idea behind this is that it is possible for the programmer to overlap communications and computations and thus obtain optimal performance. In Virtual-Node mode both cores are used simultaneously for computation, with 256 MB main memory available to each core.

In addition to the compute nodes there are also dedicated I/O nodes. The BlueSky service is a relatively I/O rich system and is configured with one I/O node for every eight compute nodes. The compute nodes run a lightweight Linux-derived compute node kernel (CNK). The kernel offers only very limited functionality. The I/O nodes run a full Linux kernel. The rationale is to keep the compute nodes as uninterrupted by the operating system as possible by outsourcing the usual operating system tasks to dedicated additional hardware. For example, on BlueSky the compute nodes (in CO mode) can access 508 MB of the total 512 MB main memory, i.e. the CNK requires 4 MB. By comparison, a single 16-processor node of the HPCx [1, 2] system has 32 GB main memory, however only 26.9 GB can be accessed by user code, with the rest being required by the operating system.

Finally, there are four front-end nodes which provide the user interface to BlueSky. The front-end nodes consist of an IBM eserver BladeCenter JS20 with 4 blades. The front-end nodes run SUSE Linux and can be used for editing, compilation and job submission. Further details of the BlueSky system can be found in [3].

For the purposes of performing a trivial task farm, users can think of the system as either up to 1024 processors each with 512 MB main memory (CO mode) or up to 2048 processors each with 256 MB main memory (VN mode).

3 Implementing a trivial task farm on Blue Gene

Ideally we would like to run multiple copies (one copy per processor) of a serial code simultaneously, with each copy capable of accessing its own input/output file(s). On many high performance computing (HPC) systems (e.g. lomond [4, 5], various Linux clusters) the batch system can be used to execute multiple serial executables simultaneously, each running on a different processor. Unfortunately, this is not possible on either the HPCx or Blue Gene systems. Both HPCx and Blue Gene use the IBM scheduling software, LoadLeveler, which does not allow more than one executable to be run simultaneously [6]. On the HPCx system this problem was overcome by using a task farm harness code which allows users to run multiple copies of a serial code with different input/output files without any modification to the serial code. Further details regarding the task farm harness code are available at:

Essentially, the task farm harness code consists of an MPI wrapper code which invokes the serial executable by using the system() function/subroutine. For example, in Fortran:

    call system("./serialexename")

or in C:

    int retcode;
    retcode = system("./serialexename");

would run the serial executable serialexename. Due to the reduced operating system installed on the compute nodes, the Blue Gene system does not allow calls to system() on the compute nodes (backend). This means that the task farm harness code cannot be used and therefore another method of invoking the serial code must be found. As a result, all of the methods considered in this paper will require some modifications to the serial code.
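To make the idea concrete, a minimal sketch of such an MPI wrapper in C is given below. This is an illustration of the general approach rather than the actual HPCx harness code; the executable name serialexename is a placeholder, and the system() call shown here is exactly what is unavailable on the Blue Gene compute nodes.

    /* Sketch of a task farm "harness": an MPI wrapper which starts one
     * copy of a serial executable per MPI process using system().
     * This style works on HPCx but not on the Blue Gene compute nodes,
     * where system() is not provided by the CNK. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* A real harness would arrange rank-specific input/output here,
           e.g. by changing into a per-rank working directory first. */
        status = system("./serialexename");
        if (status != 0)
            fprintf(stderr, "Rank %d: serial task returned %d\n", rank, status);

        MPI_Finalize();
        return 0;
    }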

In testing the different methods of implementing a trivial task farm on Blue Gene we make the following assumptions:

1. The user has an existing serial code which runs on a single Blue Gene node.
2. The memory requirements of the serial code do not exceed 512 MB (CO mode) or 256 MB (VN mode).
3. The serial code can have both input and output file(s) or parameter sets.
4. The file unit numbers (Fortran) are declared as variables within the serial code. If the file unit numbers are hard-wired then the serial code should be amended and tested prior to the addition of any MPI calls.

To simulate such a problem a simple test code has been written. The test code performs some simple statistical computations on an input dataset. The input dataset consists of a vector of data of length nmax. The output file contains the statistics (mean and standard deviation) computed from the input data. The full source for the test code is given in the Appendix.

Several different approaches to implementing a task farm on Blue Gene are investigated:

1. Encapsulate the serial code with MPI calls
2. Place the serial code inside a function and call this function from an MPI template code (a brief sketch is given below)

Both these approaches require very careful consideration of how file I/O is handled.
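The second approach is not worked through in detail in this paper. Purely as an illustration, a minimal sketch of the MPI template is given below; the function name serial_main() and the per-rank directory argument are assumptions introduced here, not part of the test code.

    /* Sketch of the "MPI template" approach: the body of the serial code
     * is moved into a function (here called serial_main(), a hypothetical
     * name) which builds all of its file names relative to the directory
     * it is given.  The stub below stands in for the repackaged serial code. */
    #include <stdio.h>
    #include <mpi.h>

    int serial_main(const char *dir)
    {
        printf("Serial task running with input/output in %s\n", dir);
        return 0;
    }

    int main(int argc, char *argv[])
    {
        int rank;
        char dir[16];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        sprintf(dir, "dir%4.4d", rank);   /* e.g. dir0000, dir0001, ... */
        serial_main(dir);

        MPI_Finalize();
        return 0;
    }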

3.1 Encapsulate the serial code with MPI calls

For this approach an existing serial code is encapsulated with MPI calls. The encapsulated code will then be able to run on any number of processors. Input/output files require careful consideration. Essentially the procedure for a Fortran code is as follows:

- Add include "mpif.h" directly after the implicit none statement
- Add the following block of code directly after all type declarations:

      ! MPI related declarations
      integer :: errcode, rank

      ! New declarations to handle input/output files
      character (len=4) :: dir_id

      ! Initialise MPI
      call MPI_INIT(errcode)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, errcode)

- Add call MPI_FINALIZE(errcode) directly before the end program statement

These changes will enable a copy of the serial code to run on each processor simultaneously. However, the input and output file names still require further consideration. Without further modification each processor will attempt to open the file taskfarm_data.dat and will attempt to write output to the file taskfarm_results.output. Clearly, this would result in the same input file being read in by all processors when in fact the user may wish a different file to be read in on each processor. Additionally, as all processors will attempt to write to the same output file, output generated on one processor could potentially be over-written by another processor. Therefore, some method of distinguishing which files are read from/written to by each processor is required.

Probably the simplest way to achieve this is to place the input/output files in directories which are labelled in accordance with their MPI rank (or to label the files themselves in accordance with their MPI rank). The procedure for doing this for a Fortran code is as follows. We use the MPI rank to define a character variable, dir_id, which will be used to determine the directory containing the input/output files for a particular process, e.g.

    write(dir_id,'(i4.4)') rank

This statement should be executed after the call to MPI_INIT and before any file open statements. As each MPI process has its own copy of the input/output file unit numbers, iounit_in and iounit_out, we do not need to change the file unit numbers (or file pointers for C/C++ codes); we only need to change the file names. We then modify all references to the input/output file names within the code so that they are preceded by "dir"//dir_id// and add an additional / directly before the original file name, e.g.

    open(unit=iounit_in, file = "dir"//dir_id//"/taskfarm_data.dat")
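For a C code the same effect can be obtained by building the file name with sprintf() before the fopen() call. The sketch below is an illustration only (variable names follow the appendix example); it zero-pads the rank to four digits to match the Fortran (i4.4) edit descriptor, whereas the appendix example in Section 6.4 uses "%i" and therefore produces unpadded directory names such as dir0, dir1, and so on.

    /* Sketch of per-rank file naming for a C code: the MPI rank is
     * formatted into a zero-padded directory label which is then used
     * to build the input file name.  Variable names are illustrative. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank;
        char dir_id[5];            /* four digits plus terminating '\0' */
        char fnamein[100];
        FILE *fpin;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        sprintf(dir_id, "%4.4i", rank);                      /* "0000", "0001", ... */
        sprintf(fnamein, "dir%s/taskfarm_data.dat", dir_id); /* dir0000/taskfarm_data.dat */

        fpin = fopen(fnamein, "r");
        if (fpin == NULL) {
            fprintf(stderr, "Cannot open <%s>\n", fnamein);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* ... read data, compute and write results as in the serial code ... */
        fclose(fpin);

        MPI_Finalize();
        return 0;
    }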

Finally, before running the code you will need to ensure that the correct number of dir???? directories have been created and that the relevant input files are placed inside these directories. For example, if 4 processors are used, then the following directories need to be created prior to running the code: dir0000, dir0001, dir0002 and dir0003. The relevant input file(s) also need to be copied/moved into the relevant directory. A Unix shell script could be used to achieve this. It may also be possible to achieve this via the LoadLeveler batch script.

A full version of the modified code is contained in the Appendix. It should be noted that the main body of the serial code remains completely unchanged. With the exception of the file-name-specific modifications, all modifications occur at the beginning and end of the code.

If the user doesn't want to place the input/output files into separate directories then the input and output file names can simply be appended with the rank as follows:

    open(unit=iounit_in, file = "taskfarm_data"//dir_id//".dat")

C or C++ codes can also be treated in a similar manner. A simple C example (testcode_serial.c) and corresponding modified code containing the required MPI calls (testcode_serial_MPIwrapper.c), which allow the code to be run on several processors, are also included in the Appendix for reference.

4 Test cases - ClockModel code

To investigate the ease of applying this approach we have tested the method described in this paper on a real user-supplied code. The code is a serial C code which models the biological clock of plants and was supplied by Professor Andrew Millar, University of Edinburgh. The serial code exists as a number of source files (*.c) and a single header file (*.h) to which all the source files refer. The serial code reads in a number of input files (up to 4) and writes to 3 output files. One of these output files is used for both input and output as the code executes (e.g. a solution is written out and subsequently re-read). The input and output files are opened from a number of different source files and therefore careful consideration of variable scope is required.

The procedure used to implement a trivial task farm on the ClockModel code was very similar to that described in Section 3.1. The only additional complication arose from the fact that the input/output files are opened from both the main program and other functions outwith the scope of the main program. This means that the character variable (char dir_id[4] in the sample code) used to control the output directory needs to be in global scope. This can be achieved by specifying this variable as an external variable within the header file (e.g. extern char dir_id[]) and then declaring the variable prior to the main() statement, e.g.

    #include "headerfile.h"
    char dir_id[4];
    int main(int argc, char* argv[])
    ...

We have successfully tested the trivial task farm on BlueSky by verifying that the same (or similar) results are obtained on all processors when using identical input files on each processor. Due to the nature of the code identical results cannot be obtained as a random number generator is used.
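As an illustration of this arrangement, a minimal sketch spanning three files is shown below; the file names and the write_solution() function are hypothetical and are not taken from the ClockModel code.

    /* headerfile.h -- shared declarations (hypothetical names) */
    #ifndef HEADERFILE_H
    #define HEADERFILE_H
    extern char dir_id[];          /* per-rank directory label, defined in main.c */
    void write_solution(void);
    #endif

    /* main.c -- defines dir_id and sets it from the MPI rank */
    #include <stdio.h>
    #include <mpi.h>
    #include "headerfile.h"

    char dir_id[4];                /* the single definition of dir_id */

    int main(int argc, char *argv[])
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        sprintf(dir_id, "%i", rank);
        write_solution();
        MPI_Finalize();
        return 0;
    }

    /* output.c -- a function outside main() that uses the same dir_id */
    #include <stdio.h>
    #include "headerfile.h"

    void write_solution(void)
    {
        char fname[100];
        FILE *fp;
        sprintf(fname, "dir%s/solution.dat", dir_id);
        fp = fopen(fname, "w");
        if (fp != NULL) {
            fprintf(fp, "solution goes here\n");
            fclose(fp);
        }
    }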

5 Conclusions

Trivial task farming of a serial code can be performed relatively easily on the BlueSky machine, allowing users to utilise a large number of processors simultaneously. Minimal modification to an existing serial code is required and no detailed MPI knowledge is needed. The method has been tested successfully on a real user application.

Acknowledgements

We would like to acknowledge the following for their support and assistance: Mark Bull and Joachim Hein.

References

[1] User's Guide to the HPCx Service (Version 2.02)
[2] HPCx web page
[3] User Guide to EPCC's BlueGene/L Service (Version 1.0), bgapps/userguide/bguser/bguser.html
[4] Introduction to the University of Edinburgh HPC Service (Version 3.00)
[5] Lomond web page
[6] S. Kannan, P. Mayes, M. Roberts, D. Brelsford and J. F. Skovira (2001). Workload Management with LoadLeveler, IBM Corp, SG

6 Appendix

6.1 Fortran 90 version of the serial test code

! Serial code which is used to test various ways of performing a trivial
! task farm on Blue Gene. The code reads in a vector of data from the input
! file with unit iounit_in and writes out the mean and standard deviation to
! the output file with unit iounit_out.
! This test code attempts to simulate a typical user code that may be
! appropriate for trivial task farming.

program testcode_serial

  implicit none

  integer, parameter :: nmax = 10
  real, dimension(nmax) :: adata
  real :: stddev = 0.0, mean = 0.0
  integer :: i
  integer :: iounit_in, iounit_out

  iounit_in  = 10     ! Input data
  iounit_out = 11     ! Output data

  ! Open input and output data files
  open(unit=iounit_in,  file = "taskfarm_data.dat")
  open(unit=iounit_out, file = "taskfarm_results.output")

  ! Read in input data from file with unit number iounit_in
  do i = 1, nmax
     read(iounit_in,*,err=100) adata(i)
  end do
100 continue
  write(*,*) "total number of points read from file = ", i-1

  ! Close input file
  close(iounit_in)

  ! Compute mean and standard deviation
  mean = sum(adata)/nmax
  do i = 1, nmax
     stddev = stddev + (adata(i) - mean)**2.0d0
  end do
  stddev = sqrt(stddev/nmax)

  ! Write results to output file with unit number iounit_out
  write(iounit_out,101) "standard deviation = ", stddev, " mean = ", mean
101 format(a21,1x,f10.3,a8,1x,f10.3)

  ! Close output file
  close(iounit_out)

end program testcode_serial

6.2 Fortran 90 version of the serial test code with MPI calls added

! Serial code with MPI calls inserted which will be used to perform a trivial
! task farm on Blue Gene. The code reads in a vector of data from the input
! file with unit iounit_in and writes out the mean and standard deviation to
! the output file with unit iounit_out.
! This test code attempts to simulate a typical user code that may be
! appropriate for trivial task farming and includes the necessary MPI calls
! in order to run the code on multiple processors.

program testcode_serial_mpiwrapper

  implicit none
  include "mpif.h"

  integer, parameter :: nmax = 10
  real, dimension(nmax) :: adata
  real :: stddev = 0.0, mean = 0.0
  integer :: i
  integer :: iounit_in, iounit_out

  ! MPI related declarations
  integer :: errcode, rank

  ! New declarations to handle input/output files
  character (len=4) :: dir_id

  ! Initialise MPI
  call MPI_INIT(errcode)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, errcode)

  ! Use the rank to define the directory name
  write(dir_id,'(i4.4)') rank

  iounit_in  = 10     ! Input data
  iounit_out = 11     ! Output data

  ! Open input and output data files in directory dir****. The value
  ! of "****" is determined by the rank of the process
  open(unit=iounit_in,  file = "dir"//dir_id//"/taskfarm_data.dat")
  open(unit=iounit_out, file = "dir"//dir_id//"/taskfarm_results.output")

  ! Read in input data from file with unit number iounit_in
  do i = 1, nmax
     read(iounit_in,*,err=100) adata(i)
  end do
100 continue
  write(*,*) "total number of points read from file = ", i-1

  ! Close input file
  close(iounit_in)

  ! Compute mean and standard deviation
  mean = sum(adata)/nmax
  do i = 1, nmax
     stddev = stddev + (adata(i) - mean)**2.0d0
  end do
  stddev = sqrt(stddev/nmax)

  ! Write results to output file with unit number iounit_out
  write(iounit_out,101) "standard deviation = ", stddev, " mean = ", mean
101 format(a21,1x,f10.3,a8,1x,f10.3)

  ! Close output file
  close(iounit_out)

  ! Finalise MPI
  call MPI_FINALIZE(errcode)

end program testcode_serial_mpiwrapper

6.3 C version of the serial test code

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

#define nmax 10

int main()
{
  float adata[nmax];
  float stddev = 0.0, mean = 0.0;
  int i;
  int count = 0;
  char fnamein[100], fnameout[100];
  FILE *fpin, *fpout;

  /* Open input and output files */
  sprintf(fnamein, "taskfarm_data.dat");
  if (NULL == (fpin = fopen(fnamein, "r"))) {
    fprintf(stderr, "Cannot open <%s>\n", fnamein);
    exit(-1);
  }

  sprintf(fnameout, "taskfarm_results.output");
  if (NULL == (fpout = fopen(fnameout, "w"))) {
    fprintf(stderr, "Cannot open <%s>\n", fnameout);
    exit(-1);
  }

  /* Read in input data from file with file pointer fpin */
  for (i = 0; i < nmax; i++) {
    fscanf(fpin, "%f", &adata[i]);
    count = count + 1;
  }
  printf("total number of points read from file = %d \n", count);

  /* Close the input file */
  fclose(fpin);

  /* Compute mean and standard deviation */
  for (i = 0; i < nmax; i++) {
    mean = mean + adata[i];
  }
  mean = mean/nmax;

  for (i = 0; i < nmax; i++) {
    stddev = stddev + (adata[i] - mean)*(adata[i] - mean);
  }
  stddev = sqrt(stddev/nmax);

  /* Write results to output file */
  fprintf(fpout, "standard deviation =%8.3f, mean = %8.3f \n", stddev, mean);

  /* Close output file */
  fclose(fpout);

  return 0;
}

6.4 C version of the serial test code with MPI calls added

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <mpi.h>

#define nmax 10

int main(int argc, char *argv[])
{
  float adata[nmax];
  float stddev = 0.0, mean = 0.0;
  int i;
  int count = 0;
  char fnamein[100], fnameout[100];
  FILE *fpin, *fpout;

  /* MPI related declarations */
  int rank;

  /* New declarations to handle input/output files */
  char dir_id[4];

  /* Initialise MPI */
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Create character variable dir_id to control input/output directory */
  sprintf(dir_id, "%i", rank);

  /* Open input and output files */
  sprintf(fnamein, "dir%s/taskfarm_data.dat", dir_id);
  if (NULL == (fpin = fopen(fnamein, "r"))) {
    fprintf(stderr, "Cannot open <%s>\n", fnamein);
    exit(-1);
  }

  sprintf(fnameout, "dir%s/taskfarm_results.output", dir_id);
  if (NULL == (fpout = fopen(fnameout, "w"))) {
    fprintf(stderr, "Cannot open <%s>\n", fnameout);
    exit(-1);
  }

  /* Read in input data from file with file pointer fpin */
  for (i = 0; i < nmax; i++) {
    fscanf(fpin, "%f", &adata[i]);
    count = count + 1;
  }
  printf("total number of points read from file = %d \n", count);

  /* Close the input file */
  fclose(fpin);

  /* Compute mean and standard deviation */
  for (i = 0; i < nmax; i++) {
    mean = mean + adata[i];
  }
  mean = mean/nmax;

  for (i = 0; i < nmax; i++) {
    stddev = stddev + (adata[i] - mean)*(adata[i] - mean);
  }
  stddev = sqrt(stddev/nmax);

  /* Write results to output file */
  fprintf(fpout, "standard deviation =%8.3f, mean = %8.3f \n", stddev, mean);

  /* Close output file */
  fclose(fpout);

  /* Finalise MPI */
  MPI_Finalize();

  return 0;
}
