ITCS 4/5145 Parallel Computing
Test 1
5:00 pm - 6:15 pm, Wednesday February 17, 2016
Solutions

Name: ...

Answer questions in the space provided below the questions. Use additional paper if necessary, but make sure your name is on the additional sheets. Clearly show how you obtained your answers. (No points for simply writing a numeric answer without showing how you got the answer, even if correct.) This is a closed-book test. Do not refer to any materials except those supplied for the test.

Supplied: Summary of OpenMP 3.0 C/C++ Syntax and Basic MPI routines.

Total /40    7 pages

Qu. 1 Answer each of the following briefly:

(a) According to Amdahl's law, what is the maximum speed-up of a parallel computation given that 90% of the computation can be divided into parallel parts? What assumption are you making?

Max speedup = 1/f = 1/0.10 = 10, with an infinite number of processes.

(b) How can one make five threads do exactly the same code sequence using OpenMP?

#pragma omp parallel num_threads(5)
{
    // code
}

(c) In the following OpenMP code sequence, which variable or variables must be declared as private variables (in a private clause)?

int x, tid, n, a[100];
#pragma omp parallel
{
    tid = omp_get_thread_num();
    n = omp_get_num_threads();
    a[tid] = 10*n;
}

tid and n.
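For part (a), a short worked form of Amdahl's law with serial fraction f = 0.10 and p processes (assuming the parallel fraction scales perfectly and there is no parallel overhead):

$$ S(p) = \frac{1}{f + (1-f)/p} \;\longrightarrow\; \frac{1}{f} = \frac{1}{0.10} = 10 \quad \text{as } p \to \infty $$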

(d) What does the nowait clause do in the OpenMP sections directive?

Threads do not wait after finishing their section.

(e) What is the value of sum after the following OpenMP code sequence, i.e. what number does the printf statement print out?

int i, sum;
omp_set_num_threads(2);
sum = 0;
#pragma omp parallel for reduction(+:sum)
for (i = 0; i < 5; i++) {
    sum++;
}
printf("sum = %d\n", sum);

5

(f) If x is a shared variable initialized to zero and three concurrent threads execute the statement x = x + 1; what are the possible values of x afterwards?

All threads execute separately without any overlap: x = 1 + 1 + 1 = 3.
All threads read x before any have altered it, and the last one writes back: x = 1.
Two threads read x after one writes back a 1, and these threads write back 2: x = 2.
So 1, 2, or 3.
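A minimal, compilable sketch of the fragment in part (e), for anyone who wants to run it; the surrounding main() mirrors the question, and the reduction gives sum = 5 for any thread count:

#include <stdio.h>
#include <omp.h>

int main(void) {
    int i, sum;
    omp_set_num_threads(2);
    sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < 5; i++) {
        sum++;                    // each iteration adds 1 to the thread's private copy of sum
    }
    printf("sum = %d\n", sum);    // private copies are combined into sum: prints sum = 5
    return 0;
}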

(g) Identify all the dependencies in the following sequence:

1. a = b + c;
2. y = a;
3. a = 3;
4. x = y + z;

Clearly show how you got your answer. The statements are numbered so that you can refer to them.

1 and 2: read after write (true) dependency
1 and 3: write after write (output) dependency
2 and 4: read after write (true) dependency
2 and 3: write after read dependency (antidependency)

(h) In the MPI statement:

MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD);

what can be inferred about how x has been declared?

It is a single integer, declared as int x;

(i) In MPI, how does the programmer specify how many processes the program will use?

When you execute the program with mpiexec, using the -n option to specify the number of processes, e.g.:

mpiexec -n 4 prog1
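For part (h), the send shown would typically be matched on the destination process (rank 1) by a receive such as the following sketch; the source rank 0 and the status variable are illustrative assumptions, not part of the question:

int x;                    // a single integer, matching the sender's declaration
MPI_Status status;
MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);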

(j) When does the MPI routine MPI_Send() return?

When its local actions have completed and it is safe to alter the arguments, but the message may not have been received.

(k) What is a Jacobi iteration?

Values are computed using only values from the previous iteration.

(l) What does the -l option specify when used with the Linux C compiler (gcc or cc)? Give an example. (The -l option was used in Assignment 2, but the question is asking what -l specifies in general.)

Specifies a library to link, e.g. -lm, -lX11.
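For part (l), typical compile lines (the program names are hypothetical):

cc -o prog prog.c -lm          # link the math library, libm
gcc -o xwin xwin.c -lX11       # link the X11 library, libX11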

The following include statements can be assumed:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>
#include "mpi.h"

in Qu. 2 a, b and c. However, any define statements, variables, and arrays that you use must be declared.

Qu. 2 (a) Write a sequential C program to perform matrix addition, adding two matrices A[N][N] and B[N][N] to produce a matrix C[N][N]. The arrays hold doubles. N is a constant defined with a define statement and set to 256. In matrix addition, the corresponding elements of each matrix are added together to form the elements of the result matrix [figure: matrices A, B and C, with row index i and column index j], i.e. given elements of A as a_{i,j} and elements of B as b_{i,j}, each element of C is computed as c_{i,j} = a_{i,j} + b_{i,j}. You can assume that the arrays are initialized with values but show where that would be in the program with comments. (4 marks)

#define N 256

int main(int argc, char *argv[]) {
    int i, j;
    double A[N][N], B[N][N], C[N][N];

    // initialize arrays

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
    return 0;
}
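The test only asks for a comment marking where initialization would go; one possible initialization (the values chosen are purely illustrative assumptions) is:

// possible initialization, placed where the comment appears in the solution
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        A[i][j] = (double) (i + j);
        B[i][j] = (double) (i - j);
    }
}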

(b) Modify the program in (a) to become an OpenMP program performing matrix addition using T threads, where T is a constant defined with a define statement and set to 16. (4 marks)

#define N 256
#define T 16

int main(int argc, char *argv[]) {
    int i, j;
    double A[N][N], B[N][N], C[N][N];

    // initialize arrays

    omp_set_num_threads(T);                  // set number of threads here
    #pragma omp parallel for private(j)
    for (i = 0; i < N; i++) {                // parallel matrix addition
        for (j = 0; j < N; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
    return 0;
}
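A typical way to build and run the OpenMP version (file and executable names are assumptions):

gcc -fopenmp -o matadd_omp matadd_omp.c
./matadd_omp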

(c) Modify the program in (a) to become an MPI program performing matrix addition using P processes, where P is determined when the code is executed. Partition the matrices into blocks of blksz rows, where blksz = N/P. (Clue: remember how matrix multiplication was done.) It can be assumed that N/P is an integer. (8 marks)

#define N 256

int main(int argc, char *argv[]) {
    int i, j;
    double A[N][N], B[N][N], C[N][N];
    MPI_Status status;
    int P, blksz;                            // MPI variables

    MPI_Init(&argc, &argv);                  // Start MPI
    MPI_Comm_size(MPI_COMM_WORLD, &P);
    blksz = N/P;

    // Scatter input matrix A
    MPI_Scatter(A, blksz*N, MPI_DOUBLE, A, blksz*N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    // Scatter input matrix B
    MPI_Scatter(B, blksz*N, MPI_DOUBLE, B, blksz*N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (i = 0; i < blksz; i++) {
        for (j = 0; j < N; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    // Gather results
    MPI_Gather(C, blksz*N, MPI_DOUBLE, C, blksz*N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
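A typical way to build and run the MPI version, here with P = 4 processes (file and executable names are assumptions):

mpicc -o matadd_mpi matadd_mpi.c
mpiexec -n 4 matadd_mpi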