1 CPS 303 High Performance Computing Wensheng Shen Department of Computational Science SUNY Brockport

2 Chapter 4 An application: numerical integration. Use MPI to solve a problem of numerical integration with the trapezoidal rule.

3 4.1 The trapezoidal rule. Definite integral of a nonnegative function.

4 Trapezoids approximating the definite integral

5 The ith trapezoid. With base width $h = (b-a)/n$ and $x_i = a + ih$, the area of the $i$th trapezoid is $\frac{h}{2}\left[f(x_{i-1}) + f(x_i)\right]$, so summing over all trapezoids gives $$\int_a^b f(x)\,dx \approx h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \frac{f(x_n)}{2}\right].$$
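
As a quick worked example (not on the slides), take $f(x) = x^2$ on $[0,1]$ with $n = 2$, so $h = 0.5$: the estimate is $0.5\left[f(0)/2 + f(0.5) + f(1)/2\right] = 0.5\left[0 + 0.25 + 0.5\right] = 0.375$, compared with the exact value $1/3 \approx 0.333$.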

6
#include <stdio.h>

int main(void) {
    float integral;     /* store result in integral */
    float a, b;         /* left and right endpoints */
    int   n;            /* number of trapezoids     */
    float h;            /* trapezoid base width     */
    float x;
    int   i;
    float f(float x);   /* function we're integrating */

    printf("Enter a, b, and n\n");
    scanf("%f %f %d", &a, &b, &n);
    h = (b - a) / n;
    integral = (f(a) + f(b)) / 2.0;
    x = a;
    for (i = 1; i <= n - 1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral * h;
    printf("With n = %d trapezoids, our estimate\n", n);
    printf("of the integral from %f to %f = %f\n", a, b, integral);
    return 0;
}

7
float f(float x) {
    float return_val;
    /* Calculate f(x). Store calculation in return_val. */
    return return_val;
}
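
As a concrete illustration (the integrand is not specified on the slides), the body of f could be filled in for, say, f(x) = x*x:

float f(float x) {
    float return_val;

    /* Hypothetical integrand chosen for illustration: f(x) = x^2 */
    return_val = x * x;
    return return_val;
}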

8 4.2 Parallelizing the trapezoidal rule. In this case, the data are just the interval [a, b] and the number of trapezoids n. We can parallelize the trapezoidal rule program by assigning a subinterval of [a, b] to each process and having that process estimate the integral of f over its subinterval.

9 Suppose there are p processes and n trapezoids, and, to simplify the discussion, suppose that n is evenly divisible by p. Then it is natural for the first process to calculate the area of the first n/p trapezoids, the second process to calculate the area of the next n/p, and so on. MPI identifies each process by a nonnegative integer: if there are p processes, the first is process 0, the second is process 1, ..., and the last is process p-1.

10 Assignment of subintervals to processes:

    Process    Interval
    0          [a, a + nh/p]
    1          [a + nh/p, a + 2nh/p]
    ...        ...
    i          [a + i*nh/p, a + (i+1)*nh/p]
    ...        ...
    p-1        [a + (p-1)*nh/p, b]

11 Each process needs the following information: the number of processes p, its rank, the entire interval of integration [a, b], and the number of subintervals n. Each process sends its result to process 0, and process 0 does the final addition.

12 Algorithm:
1. Each process calculates its own interval of integration.
2. Each process estimates the integral of f over its interval using the trapezoidal rule.
3. Each process with rank != 0 sends its integral to process 0.
4. Process 0 sums the estimates received from the individual processes and prints the result.
A sketch of a program with this structure is shown below.
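
The later slides refer to a parallel version of the program with variables local_a, local_b, and local_n. The slides do not list it in full, so the following is an approximate reconstruction (a minimal sketch, with a, b, and n hard-wired as noted on slide 14, and Trap playing the role of the serial loop above):

#include <stdio.h>
#include <mpi.h>

float f(float x);                       /* the integrand                          */
float Trap(float local_a, float local_b, int local_n, float h);

int main(int argc, char* argv[]) {
    int    my_rank, p, source, dest = 0, tag = 0;
    float  a = 0.0, b = 1.0;            /* hard-wired endpoints (assumed values)  */
    int    n = 1024;                    /* hard-wired number of trapezoids        */
    float  h, local_a, local_b;
    int    local_n;
    float  integral, total;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    h = (b - a) / n;                    /* h is the same for all processes        */
    local_n = n / p;                    /* so is the number of local trapezoids   */

    /* Step 1: each process computes its own subinterval. */
    local_a = a + my_rank * local_n * h;
    local_b = local_a + local_n * h;

    /* Step 2: each process applies the trapezoidal rule locally. */
    integral = Trap(local_a, local_b, local_n, h);

    /* Steps 3-4: ranks != 0 send their estimates; process 0 sums and prints. */
    if (my_rank == 0) {
        total = integral;
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n", a, b, total);
    } else {
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

float Trap(float local_a, float local_b, int local_n, float h) {
    float integral, x;
    int   i;

    integral = (f(local_a) + f(local_b)) / 2.0;
    x = local_a;
    for (i = 1; i <= local_n - 1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    return integral * h;
}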

13 Global and local variables. Global variables: variables whose contents are significant on all processes, such as a, b, and n in the previous code. Local variables: variables whose contents are significant only on individual processes, such as local_a, local_b, and local_n in the previous code. Use different names for global and local variables.

14 4.3 I/O on parallel systems. In the previous code, the input data a, b, and n are hard-wired; if we want to change any of them, we must edit and recompile the program. We assume that process 0 can do both input and output. Let us further assume that all processes can do input and output, add the following piece of code, and run it with two processes:

    scanf("%f %f %d", &a, &b, &n);   /* user types in: 0 1 1024 */

Who gets the data? Process 0, process 1, or both? Or does process 0 get the 0 and 1, while process 1 gets 1024? If several processes attempt to write data to the terminal screen simultaneously, is the data printed in order? It is out of our control.

15 Solution: let process 0 be the master process that handles I/O. Only process 0 does I/O, and we need process 0 to send the user input to the other processes. We write another function, Get_data, to deal with input and output. Other ways: modify the existing code to use collective communication (next chapter).

16
void Get_data(float* a_ptr, float* b_ptr, int* n_ptr, int my_rank, int p) {
    int source = 0;
    int dest;
    int tag;
    MPI_Status status;

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);
        for (dest = 1; dest < p; dest++) {
            tag = 0;
            MPI_Send(a_ptr, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);
            tag = 1;
            MPI_Send(b_ptr, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);
            tag = 2;
            MPI_Send(n_ptr, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
        }
    } else {
        tag = 0;
        MPI_Recv(a_ptr, 1, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &status);
        tag = 1;
        MPI_Recv(b_ptr, 1, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &status);
        tag = 2;
        MPI_Recv(n_ptr, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
    }
}
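
In the main program, a call to Get_data would replace the hard-wired values; a minimal sketch of the call site (using the variable names from the earlier program):

    /* After MPI_Init, MPI_Comm_rank, and MPI_Comm_size: */
    Get_data(&a, &b, &n, my_rank, p);
    h = (b - a) / n;       /* every process now has a, b, and n */
    local_n = n / p;

Each process with rank != 0 receives three messages from process 0; the tags 0, 1, and 2 distinguish the values of a, b, and n.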

17 What if p does not evenly divide n? Suppose we have 8 trapezoids and 3 processes; how can we distribute the eight trapezoids among the 3 processes? We take the floor of n/p: n/p = 8/3 ≈ 2.67, so floor(n/p) = 2. Processes 0 and 1 get 2 trapezoids each, and the last process gets the remaining 4 trapezoids. In general, we let the first p-1 processes have floor(n/p) trapezoids each, and the last process have n - floor(n/p)*(p-1) trapezoids. Here floor(n/p) denotes the largest integer not exceeding n/p. A sketch of this distribution in code follows.
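
A short sketch (not from the slides) of how each process could compute local_n, local_a, and local_b under this scheme:

    /* First p-1 processes get floor(n/p) trapezoids; the last gets the rest. */
    h = (b - a) / n;                   /* h is still based on all n trapezoids */
    local_n = n / p;                   /* integer division gives floor(n/p)    */
    local_a = a + my_rank * local_n * h;
    if (my_rank == p - 1)
        local_n = n - local_n * (p - 1);   /* last process takes the remainder */
    local_b = local_a + local_n * h;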

18 Simpson's rule: three-point integration. The number of intervals must be an even number. If the total number of panels is n, then the number of Simpson panels is n/2.

19 The ith Simpson panel consists of the two adjacent subintervals $[a+(i-1)h,\; a+ih]$ and $[a+ih,\; a+(i+1)h]$, where $h = \frac{b-a}{n}$, and its area is $$\mathrm{Area}_i = \frac{h}{3}\left[f_{i-1} + 4f_i + f_{i+1}\right].$$

20 For a single Simpson panel, $$\int_{a}^{b} f(x)\,dx \approx \frac{h}{3}\left[f_0 + 4f_1 + f_2\right].$$ In general, for the composite rule, $$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f_0 + 4f_1 + 2f_2 + 4f_3 + \cdots + 2f_{n-2} + 4f_{n-1} + f_n\right].$$
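
A serial composite-Simpson sketch (not on the slides; it mirrors the serial trapezoid program above, assumes n is even, and uses the same integrand f):

/* Composite Simpson's rule on [a, b] with n subintervals (n even). */
float simpson(float a, float b, int n) {
    float h = (b - a) / n;
    float sum = f(a) + f(b);
    float x = a;
    int   i;

    for (i = 1; i <= n - 1; i++) {
        x = x + h;
        /* Odd-indexed points get weight 4; even-indexed interior points get 2. */
        sum = sum + ((i % 2 == 1) ? 4.0 : 2.0) * f(x);
    }
    return sum * h / 3.0;
}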

21 The derivation of Simpson's rule: undetermined coefficients. Using three-point integration, write $$\int_a^b f(x)\,dx \approx \alpha\, f(a) + \beta\, f\!\left(\frac{a+b}{2}\right) + \gamma\, f(b),$$ and determine $\alpha$, $\beta$, $\gamma$ by requiring the formula to be exact for every quadratic polynomial $f(x) = c_1 + c_2 x + c_3 x^2$.
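
Sketch of the calculation (not worked out on the slides): exactness for $f(x) = 1$, $x$, and $x^2$ gives the system
$$\alpha + \beta + \gamma = b - a, \qquad \alpha a + \beta\,\frac{a+b}{2} + \gamma b = \frac{b^2 - a^2}{2}, \qquad \alpha a^2 + \beta\left(\frac{a+b}{2}\right)^{\!2} + \gamma b^2 = \frac{b^3 - a^3}{3},$$
whose solution is $\alpha = \gamma = \frac{b-a}{6}$ and $\beta = \frac{2(b-a)}{3}$, i.e., Simpson's rule $\int_a^b f(x)\,dx \approx \frac{b-a}{6}\left[f(a) + 4 f\!\left(\frac{a+b}{2}\right) + f(b)\right]$.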

22 The relationship between Simpson's rule and the trapezoid rule. Let $$A = (b-a)\,f\!\left(\frac{a+b}{2}\right) \quad \text{(midpoint rule)}, \qquad B = \frac{b-a}{2}\left[f(a) + f(b)\right] \quad \text{(trapezoid rule)}.$$ Then Simpson's rule is the weighted average $$S = \frac{2A + B}{3}.$$
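
As a quick check (not shown on the slides), substituting the definitions of A and B:
$$\frac{2A + B}{3} = \frac{1}{3}\left[2(b-a)\,f\!\left(\frac{a+b}{2}\right) + \frac{b-a}{2}\bigl(f(a) + f(b)\bigr)\right] = \frac{b-a}{6}\left[f(a) + 4 f\!\left(\frac{a+b}{2}\right) + f(b)\right],$$
which is exactly Simpson's rule on $[a, b]$.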
