COSC 6374 Parallel Computation. Edgar Gabriel, Fall.

1st homework assignment: Rules

Each student should deliver
  - Source code (.c file)
  - Documentation (.pdf, .doc, .tex or .txt file)
      - explanations to the code
      - answers to questions
Deliver electronically to gabriel@cs.uh.edu
Expected by Monday, October 5, 11:59 pm
In case of questions: ask, ask, ask!

You are given the sequential code for an image processing algorithm that performs a k-means clustering operation on the pixels of an image.

1. Parallelize the algorithm with MPI using a 1-D block-row-wise data distribution.
   a. Develop the necessary code, assuming each process holds the same number of rows of the image.
   b. Measure the execution time of the k-means clustering on 1, 2, 4 and 8 processors for the 1024x1024 and 2048x2048 pixel images provided (1024_featurevec.out and 2048_featurevec.out).
   c. Determine the parallel speedup and the parallel efficiency of the code on the whale cluster for the 2, 4 and 8 processor cases.
2. 1-D block-column-wise data distribution.
   a. Sketch in your document how the code would have to look for this data distribution.
   b. Compare the communication requirements (number of messages, data transferred, number of collective operations) of the 1-D block-column-wise and the block-row-wise data decomposition.
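Task 1c asks for speedup and efficiency. The slides do not spell out the formulas, so the following is a minimal sketch assuming the usual definitions: speedup is the execution time on one process divided by the time on p processes, and efficiency is the speedup divided by p. The function names are hypothetical and not part of the provided code.

```c
#include <stdio.h>

/* Parallel speedup: time on 1 process divided by time on p processes. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of processes p. */
double efficiency(double t1, double tp, int p)
{
    return speedup(t1, tp) / (double)p;
}

/* Print one line of a results table for a run on p processes. */
void print_result(int p, double t1, double tp)
{
    printf("%2d processes: speedup %.2f, efficiency %.2f\n",
           p, speedup(t1, tp), efficiency(t1, tp, p));
}
```

With made-up timings of 8.0 s on one process and 2.0 s on eight processes, speedup(8.0, 2.0) gives 4.0 and efficiency(8.0, 2.0, 8) gives 0.5. Use your own measured times in the report.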

Code consists of three parts:
  - reading the input image (suggestion for parallelization provided later)
  - performing the k-means clustering (your work)
  - writing the output image (suggestion for parallelization provided later)
All input files (512_featurevec.out, 1024_featurevec.out and 2048_featurevec.out) are available in your home directory on the whale cluster.

K-means clustering: sequential version

Input:
    I = {i_1, ..., i_k}    instances to be clustered
    n                      number of clusters
Output:
    C = {c_1, ..., c_n}    cluster centroids
    m: I -> C              cluster membership

Algorithm:
    Set C to initial value
    while m has changed
        ctemp_k = 0, k = 1, ..., n
        count_k = 0, k = 1, ..., n
        for each i_j in I
            m(i_j) = the c_k with minimum distance(i_j, c_k), k = 1, ..., n
            ctemp_k += i_j    (for the chosen k)
            count_k++         (for the chosen k)
        end
        recompute C based on ctemp and count
    end
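The pseudocode above can be sketched in C roughly as follows. This is not the provided clustering.c. It is a small illustration under stated assumptions: squared Euclidean distance (the slide only says "distance"), instances stored contiguously as npts vectors of length dim, an empty cluster keeps its previous centroid, and a max_iter safety limit. All function and parameter names are hypothetical.

```c
#include <float.h>
#include <stdlib.h>

/* Squared Euclidean distance between two vectors of length dim
   (assumption: the slides do not name the distance measure). */
static double dist2(const double *a, const double *b, int dim)
{
    double s = 0.0;
    for (int d = 0; d < dim; d++) {
        double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

/* One pass of the while-loop body: assign every instance to its nearest
   centroid, accumulate ctemp/count, then recompute the centroids.
   Returns the number of instances whose membership changed. */
static int kmeans_step(const double *pts, int npts, int dim,
                       double *cent, int ncl, int *member,
                       double *ctemp, int *count)
{
    int changed = 0;

    for (int k = 0; k < ncl; k++) {
        count[k] = 0;
        for (int d = 0; d < dim; d++)
            ctemp[k * dim + d] = 0.0;
    }
    for (int i = 0; i < npts; i++) {
        int best = 0;
        double bestd = DBL_MAX;
        for (int k = 0; k < ncl; k++) {
            double dk = dist2(&pts[i * dim], &cent[k * dim], dim);
            if (dk < bestd) { bestd = dk; best = k; }
        }
        if (member[i] != best) { member[i] = best; changed++; }
        for (int d = 0; d < dim; d++)
            ctemp[best * dim + d] += pts[i * dim + d];
        count[best]++;
    }
    for (int k = 0; k < ncl; k++) {
        if (count[k] == 0)
            continue;            /* assumption: keep the old centroid */
        for (int d = 0; d < dim; d++)
            cent[k * dim + d] = ctemp[k * dim + d] / count[k];
    }
    return changed;
}

/* Iterate until no membership changes or max_iter passes were made.
   Returns the number of passes, or -1 if memory allocation failed. */
int kmeans(const double *pts, int npts, int dim,
           double *cent, int ncl, int *member, int max_iter)
{
    double *ctemp = malloc((size_t)ncl * dim * sizeof *ctemp);
    int *count = malloc((size_t)ncl * sizeof *count);
    int iter = 0;

    if (ctemp == NULL || count == NULL) {
        free(ctemp);
        free(count);
        return -1;
    }
    for (int i = 0; i < npts; i++)
        member[i] = -1;          /* no membership assigned yet */
    while (iter < max_iter) {
        iter++;
        if (kmeans_step(pts, npts, dim, cent, ncl, member, ctemp, count) == 0)
            break;
    }
    free(ctemp);
    free(count);
    return iter;
}
```

In a block-row parallel version, each process would run the assignment loop on its own rows only. The per-cluster sums in ctemp and count are the quantities that then have to be combined across processes before the centroids are recomputed.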

How to compile the code for measurements:
    gcc -o clustering clustering.c -O3
How to compile the code for debugging:
    gcc -o clustering clustering.c -g -O0
How to run the sequential code:
    ./clustering 1024_featurevec.out outlabel_1024.out

Multispectral images
  - Image data contains multiple spectral channels
  - Number of spectral channels in the file: 24
  - Storage order: Pixel(i,j) has first the data for channel 1, then channel 2, channel 3, etc.
  - Each data item is a double precision floating point number
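The storage order described above can be illustrated with a small indexing helper. This sketch assumes the pixels themselves are stored row by row (row-major), which the slide does not state explicitly, and that the whole image or row block is held in one flat array of doubles. The function names are hypothetical.

```c
#include <stddef.h>

/* Position of channel c of pixel (i, j) in a flat array in which each
   pixel's nchannels values are stored consecutively.
   Assumption: pixels are laid out row by row, width pixels per row. */
size_t feature_index(size_t i, size_t j, size_t c,
                     size_t width, size_t nchannels)
{
    return (i * width + j) * nchannels + c;
}

/* Pointer to the first channel of pixel (i, j), i.e. to its feature
   vector of length nchannels. */
const double *pixel_features(const double *data, size_t i, size_t j,
                             size_t width, size_t nchannels)
{
    return &data[feature_index(i, j, 0, width, nchannels)];
}
```

For a 1024-pixel-wide image with 24 channels, pixel (0,1) starts at index 24 and channel 2 of pixel (1,0) sits at index 24578.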

How to compile an MPI code for measurements:
    mpicc -o myapp myapp.c -O3
How to compile an MPI code for debugging:
    mpicc -o myapp myapp.c -g -O0
How to run the code:
    mpirun -np 4 ./smoothing Matrix-1024.mat 1024

How to read a portion of a matrix
  - Determine the number of rows a process has to read
  - Allocate buffers accordingly

    #include <stdio.h>
    #include <mpi.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int fh = open(infile1, O_RDONLY);
    if (fh == -1) {
        printf("Could not open file\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    /* Skip the rows owned by lower ranks, then read this process's block.
       labels must be allocated as one contiguous block of memory. */
    lseek(fh, (off_t)localheight * width * sizeof(int) * rank, SEEK_SET);
    read(fh, &(labels[0][0]), sizeof(int) * localheight * width);
    close(fh);
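The two preparation steps (number of rows per process, and where a process's block starts in the file) can be written as small helpers. The following is a sketch only, assuming the image height is divisible by the number of processes (as in task 1a) and a file of int elements as in the fragment above. The names local_rows, row_block_offset and read_row_block are hypothetical. For the feature vector files, the element size and the number of values per pixel would have to be adapted.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Number of rows each process owns. Assumes an even split. */
int local_rows(int height, int nprocs)
{
    assert(nprocs > 0 && height % nprocs == 0);
    return height / nprocs;
}

/* Byte offset in the file at which the block of process 'rank' starts. */
off_t row_block_offset(int rank, int localheight, int width, size_t elem_size)
{
    return (off_t)rank * localheight * width * (off_t)elem_size;
}

/* Read the row block of 'rank' into buf, which must hold
   local_rows(height, nprocs) * width ints.
   Returns 0 on success, -1 on any error or short read. */
int read_row_block(const char *path, int rank, int nprocs,
                   int height, int width, int *buf)
{
    int localheight = local_rows(height, nprocs);
    size_t nbytes = (size_t)localheight * width * sizeof(int);
    off_t start = row_block_offset(rank, localheight, width, sizeof(int));
    ssize_t got;
    int fh = open(path, O_RDONLY);

    if (fh == -1)
        return -1;
    if (lseek(fh, start, SEEK_SET) == (off_t)-1) {
        close(fh);
        return -1;
    }
    got = read(fh, buf, nbytes);   /* a short read is treated as an error */
    close(fh);
    return (got == (ssize_t)nbytes) ? 0 : -1;
}
```

In an MPI program, rank and nprocs would come from MPI_Comm_rank and MPI_Comm_size, and a return value of -1 could be followed by MPI_Abort as in the fragment above.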

Documentation

The documentation should contain
  - (Brief) problem description
  - Solution strategy
  - Results section
      - Description of resources used
      - Description of measurements performed
      - Results (graphs + findings)

The document should not contain
  - Replication of the entire source code (that's why you have to deliver the sources)
  - Screen shots of every single measurement you made. Actually, no screen shots at all.
  - The slurm output files

How to use a cluster

A cluster usually consists of a front-end node and compute nodes.
You can log in to the front-end node using ssh (from Windows or Linux machines) with the login name and the password assigned to you.
The front-end node is supposed to be there for editing and compiling - not for running jobs!

To allocate a node for interactive development:
    staccxy@whale:~>salloc -n 4 -p cosc6374
    staccxy@whale:~>squeue
    JOBID PARTITION USER    ST TIME NODES NODELIST(REASON)
    48    cosc6374  staccxy R  0:02 1     whale-008
    staccxy@whale:~> mpirun -np 4 ./mytest

How to use a cluster (II)

Note: the mpirun command will know where to execute the job only if you type it in the very same terminal/window in which you typed the salloc command.
Maximum time an allocation will be available to you: 45 minutes. After that time, your job gets automatically killed by the system (fairness rule).
Note that when you request e.g. 8 processors, you might receive different ones with every allocation.

Notes

If you need hints on how to use a UNIX/Linux machine through ssh:
How to use a cluster such as shark and whale
