Why Structured Parallel Programming Matters. Murray Cole


Why Structured Parallel Programming Matters
Murray Cole
Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh
September 3rd, 2004

Edinburgh
Scotland's sunshine capital! (photo slides; images not transcribed)

What is Unstructured Parallel Programming?

Simple parallel programming frameworks (POSIX threads, core MPI) are universal. They can be used to describe arbitrarily complex and dynamically determined interactions between activities. Programming is by careful selection and combination of operations drawn from a small, simple set.

It is difficult for programmers, examining such a program statically, to understand the overall pattern involved (if one exists), and to revise it radically. It is very difficult for implementation mechanisms (working statically and/or dynamically) to attempt optimisations which work beyond single instances of the primitives.

Patterns in Parallel Computing

Many (most?) parallel applications don't actually involve arbitrary, dynamic interaction patterns. Sometimes the pattern is entirely pre-determined.

A Pipeline

(Animated sequence of figures showing data items flowing through a sequence of stages; images not transcribed.)

Patterns in Parallel Computing

Sometimes non-determinism is constrained within a wider pattern.

A Task Farm

(Animated sequence of figures showing a farmer distributing tasks to worker processes; images not transcribed.)

Patterns in Parallel Computing

The use of an unstructured parallel programming mechanism prevents the programmer from expressing information about the pattern: it remains implicit in the collected uses of the simple primitives.

What is Structured Parallel Programming?

The structured approach to parallelism proposes that commonly used patterns of computation and interaction should be abstracted as parameterisable library functions, control constructs or similar, so that application programmers can explicitly declare that the application follows one or more such patterns.

Keywords: skeleton, template, pattern, archetype, higher-order function.

This matters because it gives a tractable handle on the issues which make correct, efficient parallel programming hard.

Why Parallel Programming is Hard

Devising correct, efficient sequential algorithms is hard already. Introducing efficient parallelism adds an extra conceptual dimension.

We must maintain high efficiency in practice (not just big O).

Expressing and optimising interaction is confusing.

Does Structured Parallel Programming Help?

Devising correct, efficient sequential algorithms is hard already. Introducing efficient parallelism adds an extra conceptual dimension.
No. Skeletons help us express algorithms, but we still have to devise them.

We must maintain high efficiency in practice (not just big O).
Yes. The skeleton implementation knows the interaction pattern in advance, and exploits this knowledge at a low level.

Expressing and optimising interaction is confusing.
Yes. The pattern abstracted by the skeleton hides all the interaction.

How Performance Optimisations Work

Well known, tried and trusted performance optimisations typically exploit information about the spatial and temporal context in which individual operations are performed.

Cache Optimisations

for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    for (k = 0; k < n; k++)
      c[i][j] += a[i][k] * b[k][j];

Cache Optimisations (blocked version)

for (jj = 0; jj < n; jj = jj + B)      /* B is the block size */
  for (kk = 0; kk < n; kk = kk + B)
    for (i = 0; i < n; i++)
      for (j = jj; j < jj + B; j++) {
        pa = &a[i][kk];
        pb = &b[kk][j];
        temp = (*pa++) * (*pb);
        for (k = kk + 1; k < kk + B; k++) {
          pb = pb + n;
          temp += (*pa++) * (*pb);
        }
        c[i][j] += temp;
      }

Branch Prediction

while (a[i] < b[a[i]]) {
  c[j+i] += b[j];
  if (c[k] < 1) {
    k++;
    b[j] = 0;
  }
  i++;
}

Performance improved by knowing whether branches are usually taken.

How Performance Optimisations Work

The use of structured parallel programming techniques provides information about future interactions, sometimes for the entire execution, in context. Structured parallelism lets us tell the system what will happen next.

Parallel Performance Optimisations

Consider a simple iterated all-pairs structure: careful shared-memory scheduling is essential. In a simple pipeline, agglomeration of messages can be crucial.

It will be difficult (often impossible?) to recognise that what we have is an all-pairs or pipeline computation from the equivalent unstructured source. Structured parallelism matters because it allows us to give the system this information (and, as a side-effect, simplifies our programming task).

Higher Level Performance Optimisation

Performance programming is also about choosing the right algorithm, and being sure that it is correct. Designing algorithms with structured concepts in mind allows us to benefit from high-level algorithm restructuring techniques. The coarse-grain, collective nature of skeletons allows substantial transformations to be made in a small number of steps (in contrast to the same effect achieved with the corresponding unstructured primitives).

(Animated sequence of figures illustrating a coarse-grained algorithm transformation; images not transcribed.)

Structured parallelism matters because it allows us to manipulate parallel algorithms at a coarse structural level.

Who cares?

Skeletal programming remains a fringe activity.

A Pragmatic Manifesto

Minimise Conceptual Disruption. There are many ways of presenting these ideas. Parallel programmers are happy with C/Fortran and MPI. MPI's collectives are simple skeletons, so build on this.

Integrate Ad-Hoc Parallelism. Sometimes parallelism seems inherently unstructured. Allow contained integration within a structured container. Don't overconstrain.

Accommodate Diversity. Well-known concepts are quite slippery. Pipeline stage as function, or stage as process? Pipeline stages one-for-one or arbitrary? Implicit or explicit farmer? Don't overconstrain.

A Vanilla Pipeline: Image Processing

Edges -> Faces -> Objects -> Scene (data size decreasing, information content increasing along the pipeline).

An Exotic Pipeline: Gaussian Elimination

(Figures not transcribed.)

Elimination Phase

for (each row in sequence) {
  normalise this pivot row;
  broadcast result to subsequent rows;
  for (each subsequent row in parallel) {
    eliminate one column using broadcast row;
  }
}

(Animated figures stepping through two rounds of the elimination phase: Pivot Row, Normalise, Broadcast, Eliminate; images not transcribed.)

Pipelined Version

A textbook improvement interleaves the broadcast and elimination phases. Processors participate in the broadcast, then begin elimination immediately (before the broadcast completes elsewhere), so iterations become pipelined. Each iteration would be slower independently, but pipelining across iterations produces an overall gain.

(Animated figures stepping through the pipelined version: Normalise, Send, Eliminate; images not transcribed.)

Observations

Stages have internal state. There are no external buffers of input or output (both are in the stage state). The sequence of interactions is state dependent (activity is different "the final time").

More Pipelining

A further observation is that the back-substitution phase can be pipelined too, but in the other direction. (Figures not transcribed.)

Algorithm Summary

Scatter_data();                                     // standard MPI
Pipeline (top-to-bottom, elimination, ...);         // skeleton call
Pipeline (bottom-to-top, back_substitution, ...);   // skeleton call
Gather_results();                                   // standard MPI

eskel

The edinburgh Skeleton library.

An experimental attempt to address these issues.

An extension of MPI's collective operation suite.

MPI Key Concepts

A process (not processor) based model.

Processes identified by ranks within groups (known as "communicators").

Default communicator (all processes) is MPI_COMM_WORLD.

Every communication specifies its communicator.

Allows the programmer to reflect logical groupings in an algorithm and to insulate communications within these from outside interference.

MPI Communicators

[Figure: the default communicator MPI_COMM_WORLD containing all ranked processes, with sub-communicators (C2, C3, ...) grouping subsets of them.]

MPI: Collective Operations

int MPI_Reduce (void *sndbuf, void *rcvbuf, int count, MPI_Datatype dt,
                MPI_Op op, int root, MPI_Comm comm)

sndbuf: input data buffer (each process contributes)
rcvbuf: output buffer (note restriction on type)
op: operation to be used
comm, root: group and special roles within it (in this case, root receives the result)

eskel

MPI collective operations like MPI_Reduce are already simple skeletons.

We design eskel from this basis to provide

minimal conceptual disruption (principle 1)
ad-hoc parallelism (principle 2)

eskel Skeletons

The current draft of eskel defines five skeletons:

Pipeline
Farm
Deal
HaloSwap
Butterfly

Deal

Similar to a farm, but distributes tasks in cyclic order to workers (no farmer).

Useful nested in pipelines, to internally replicate a stage.

[Figure: tasks dealt to replicated workers in cyclic order.]

HaloSwap

Representative of iterative relaxation algorithms.

Loop over local update and check for termination.

Interactions have two components (one from each neighbour).

Optional wraparound.

[Figure: neighbouring processes exchanging halo values before each local update.]

Butterfly

Captures a class of divide-and-conquer algorithms (those based on traversing hypercube dimensions).

A sequence of activities, in groups of different sizes.

Constrained to work with 2^d processes.

[Figures, three slides: the butterfly communication pattern across successive hypercube dimensions.]

The Gory Details

Function prototype for the pipeline skeleton:

void Pipeline (int ns, Amode_t amode[],
               eskel_molecule_t *(*stages[])(eskel_molecule_t *),
               int col, Dmode_t dmode, spread_t spr[], MPI_Datatype ty[],
               void *in, int inlen, int inmul,
               void *out, int outlen, int outmul, int outbuffsz,
               MPI_Comm comm)

Why do we need fifteen parameters? Because of the MPI basis and for flexibility.

The Gory Details

The fifteen parameters cover, in groups:

Pipeline inputs
Pipeline output buffer
Pipeline stage activities
Stage interfaces and modes

Summary

Parallel programming is important, but hard.

Structured parallel programming can help by allowing the programmer to express meta-knowledge about interaction structure.

This information allows the implementation to make macro optimisations, and supports coarse grain algorithm development methodologies.

To enter the mainstream we have to be pragmatic.

Future Work

These concepts and arguments are generic, and may be applied wherever parallelism appears:

Mainstream parallel computing
Grid computing?
ASIC design?
FPGA programming?

Future Work

I would like someone to give me a large amount of money and substantial resources in order to be able to pursue this programme swiftly and comprehensively.

No reasonable offer refused.

Thank you


More information

MESHLESS METHOD FOR SIMULATION OF COMPRESSIBLE REACTING FLOW

MESHLESS METHOD FOR SIMULATION OF COMPRESSIBLE REACTING FLOW MESHLESS METHOD FOR SIMULATION OF COMPRESSIBLE REACTING FLOW Jin Young Huh*, Kyu Hong Kim**, Suk Young Jung*** *Department of Mechanical & Aerospace Engineering, Seoul National University, **Department

More information

Teacher Assignment and Transfer Program (TATP) On-line Teacher Application Quick Sheets

Teacher Assignment and Transfer Program (TATP) On-line Teacher Application Quick Sheets Teacher Assignment and Transfer Program (TATP) On-line Teacher Application Quick Sheets February 2018 On-line Teacher Application Process Teachers interested in applying for any advertised vacancies in

More information

Spectrum Collaboration Challenge

Spectrum Collaboration Challenge Spectrum Collaboration Challenge Preliminary Event 1 DRAFT Scoring Procedures October 31, 2017 Defense Advanced Research Projects Agency 675 North Randolph Street Arlington, VA 22203 1 Revision Summary

More information

Rigid Body Simulation. Jeremy Ulrich Advised by David Mount Fall 2013

Rigid Body Simulation. Jeremy Ulrich Advised by David Mount Fall 2013 Rigid Body Simulation Jeremy Ulrich Advised by David Mount Fall 2013 Overview The project presented here is a real-time 3D rigid body physics engine, and is the result of an Independent Study on collision

More information

Cursive Handwriting Chart

Cursive Handwriting Chart Chart Free PDF ebook Download: Chart Download or Read Online ebook cursive handwriting chart in PDF Format From The Best User Guide Database. 2 Copy. Sample from 3rd Grade Student Workbook Without Tears.

More information

F28HS Hardware-Software Interface: Systems Programming

F28HS Hardware-Software Interface: Systems Programming F28HS Hardware-Software Interface: Systems Programming Hans-Wolfgang Loidl School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh Semester 2 2016/17 0 No proprietary software has

More information

DECstation 5000 Miss Rates. Cache Performance Measures. Example. Cache Performance Improvements. Types of Cache Misses. Cache Performance Equations

DECstation 5000 Miss Rates. Cache Performance Measures. Example. Cache Performance Improvements. Types of Cache Misses. Cache Performance Equations DECstation 5 Miss Rates Cache Performance Measures % 3 5 5 5 KB KB KB 8 KB 6 KB 3 KB KB 8 KB Cache size Direct-mapped cache with 3-byte blocks Percentage of instruction references is 75% Instr. Cache Data

More information

NMI Component Testing Guidelines Pertaining to: NMI Release 1 (released May 7, 2002)

NMI Component Testing Guidelines Pertaining to: NMI Release 1 (released May 7, 2002) NSF Middleware Initiative Integration Testbed Page 1 of 40 NMI Component Testing Guidelines Pertaining to: NMI Release 1 (released May 7, 2002) July 8, 2002 This packet contains NMI Component Testing Guidelines

More information

Object-Oriented Principles and Practice / C++

Object-Oriented Principles and Practice / C++ Object-Oriented Principles and Practice / C++ Alice E. Fischer September 26, 2016 OOPP / C++ Lecture 4... 1/33 Global vs. Class Static Parameters Move Semantics OOPP / C++ Lecture 4... 2/33 Global Functions

More information

Programmation Concurrente (SE205)

Programmation Concurrente (SE205) Programmation Concurrente (SE205) CM1 - Introduction to Parallelism Florian Brandner & Laurent Pautet LTCI, Télécom ParisTech, Université Paris-Saclay x Outline Course Outline CM1: Introduction Forms of

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2016 Lecture 4a Andrew Tolmach Portland State University 1994-2016 Pragmatics of Large Values Real machines are very efficient at handling word-size chunks of data (e.g.

More information

Plot SIZE. How will execution time grow with SIZE? Actual Data. int array[size]; int A = 0;

Plot SIZE. How will execution time grow with SIZE? Actual Data. int array[size]; int A = 0; How will execution time grow with SIZE? int array[size]; int A = ; for (int i = ; i < ; i++) { for (int j = ; j < SIZE ; j++) { A += array[j]; } TIME } Plot SIZE Actual Data 45 4 5 5 Series 5 5 4 6 8 Memory

More information

Program Optimization

Program Optimization Program Optimization Professor Jennifer Rexford http://www.cs.princeton.edu/~jrex 1 Goals of Today s Class Improving program performance o When and what to optimize o Better algorithms & data structures

More information

Control flow graphs and loop optimizations. Thursday, October 24, 13

Control flow graphs and loop optimizations. Thursday, October 24, 13 Control flow graphs and loop optimizations Agenda Building control flow graphs Low level loop optimizations Code motion Strength reduction Unrolling High level loop optimizations Loop fusion Loop interchange

More information

Scalable Farms. Michael Poldner a, Herbert Kuchen a. D Münster, Germany

Scalable Farms. Michael Poldner a, Herbert Kuchen a. D Münster, Germany 1 Scalable Farms Michael Poldner a, Herbert Kuchen a a University of Münster, Department of Information Systems, Leonardo Campus 3, D-4819 Münster, Germany Algorithmic skeletons intend to simplify parallel

More information

Linear Programming with Bounds

Linear Programming with Bounds Chapter 481 Linear Programming with Bounds Introduction Linear programming maximizes (or minimizes) a linear objective function subject to one or more constraints. The technique finds broad use in operations

More information

Branding Guidelines. Because Freedom Can t Protect Itself

Branding Guidelines. Because Freedom Can t Protect Itself AMERICan CIVIL liberties union Branding Guidelines Because Freedom Can t Protect Itself CONTENTS American Civil Liberties Union Branding Guidelines April 2013 American Civil Liberties Union 125 Broad Street,

More information

Introduction to Parallel Programming Models

Introduction to Parallel Programming Models Introduction to Parallel Programming Models Tim Foley Stanford University Beyond Programmable Shading 1 Overview Introduce three kinds of parallelism Used in visual computing Targeting throughput architectures

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

Microsoft Word Cursive Handwriting Template

Microsoft Word Cursive Handwriting Template Microsoft Word Template Free PDF ebook Download: Microsoft Word Template Download or Read Online ebook microsoft word cursive handwriting template in PDF Format From The Best User Guide Database. 2 Copy.

More information

Algorithms PART I: Embarrassingly Parallel. HPC Fall 2012 Prof. Robert van Engelen

Algorithms PART I: Embarrassingly Parallel. HPC Fall 2012 Prof. Robert van Engelen Algorithms PART I: Embarrassingly Parallel HPC Fall 2012 Prof. Robert van Engelen Overview Ideal parallelism Master-worker paradigm Processor farms Examples Geometrical transformations of images Mandelbrot

More information

CISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance. Why More on Memory Hierarchy?

CISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance. Why More on Memory Hierarchy? CISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance Michela Taufer Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture, 4th edition ---- Additional

More information

Victorian Modern Cursive Handwriting

Victorian Modern Cursive Handwriting Victorian Modern Free PDF ebook Download: Victorian Modern Download or Read Online ebook victorian modern cursive handwriting in PDF Format From The Best User Guide Database I Practise letter formation

More information

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11 Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed

More information

HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA

HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES Cliff Woolley, NVIDIA PREFACE This talk presents a case study of extracting parallelism in the UMT2013 benchmark for 3D unstructured-mesh

More information

A Message Passing Standard for MPP and Workstations

A Message Passing Standard for MPP and Workstations A Message Passing Standard for MPP and Workstations Communications of the ACM, July 1996 J.J. Dongarra, S.W. Otto, M. Snir, and D.W. Walker Message Passing Interface (MPI) Message passing library Can be

More information

Section 4: Introduction to Polygons Part 1

Section 4: Introduction to Polygons Part 1 Section 4: Introduction to Polygons Part 1 Topic 1: Introduction to Polygons Part 1... 85 Topic 2: Introduction to Polygons Part 2... 88 Topic 3: ngles of Polygons... 90 Topic 4: Translation of Polygons...

More information

Cache Memories /18-213/15-513: Introduction to Computer Systems 12 th Lecture, October 5, Today s Instructor: Phil Gibbons

Cache Memories /18-213/15-513: Introduction to Computer Systems 12 th Lecture, October 5, Today s Instructor: Phil Gibbons Cache Memories 15-213/18-213/15-513: Introduction to Computer Systems 12 th Lecture, October 5, 2017 Today s Instructor: Phil Gibbons 1 Today Cache memory organization and operation Performance impact

More information

Multiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448. The Greed for Speed

Multiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448. The Greed for Speed Multiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448 1 The Greed for Speed Two general approaches to making computers faster Faster uniprocessor All the techniques we ve been looking

More information

Peter Pacheco. Chapter 3. Distributed Memory Programming with MPI. Copyright 2010, Elsevier Inc. All rights Reserved

Peter Pacheco. Chapter 3. Distributed Memory Programming with MPI. Copyright 2010, Elsevier Inc. All rights Reserved An Introduction to Parallel Programming Peter Pacheco Chapter 3 Distributed Memory Programming with MPI 1 Roadmap Writing your first MPI program. Using the common MPI functions. The Trapezoidal Rule in

More information

MULTIPLE OPERAND ADDITION. Multioperand Addition

MULTIPLE OPERAND ADDITION. Multioperand Addition MULTIPLE OPERAND ADDITION Chapter 3 Multioperand Addition Add up a bunch of numbers Used in several algorithms Multiplication, recurrences, transforms, and filters Signed (two s comp) and unsigned Don

More information

Cache Memories. Cache Memories Oct. 10, Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory

Cache Memories. Cache Memories Oct. 10, Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory 5-23 The course that gies CMU its Zip! Topics Cache Memories Oct., 22! Generic cache memory organization! Direct mapped caches! Set associatie caches! Impact of caches on performance Cache Memories Cache

More information

Introduction to the Message Passing Interface (MPI)

Introduction to the Message Passing Interface (MPI) Introduction to the Message Passing Interface (MPI) CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction to the Message Passing Interface (MPI) Spring 2018

More information

COSC 6385 Computer Architecture - Memory Hierarchy Design (III)

COSC 6385 Computer Architecture - Memory Hierarchy Design (III) COSC 6385 Computer Architecture - Memory Hierarchy Design (III) Fall 2006 Reducing cache miss penalty Five techniques Multilevel caches Critical word first and early restart Giving priority to read misses

More information

A Cache Hierarchy in a Computer System

A Cache Hierarchy in a Computer System A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the

More information

Solution of P versus NP problem

Solution of P versus NP problem Algorithms Research 2015, 4(1): 1-7 DOI: 105923/jalgorithms2015040101 Solution of P versus NP problem Mustapha Hamidi Meknes, Morocco Abstract This paper, taking Travelling Salesman Problem as our object,

More information

A Comparison of Unified Parallel C, Titanium and Co-Array Fortran. The purpose of this paper is to compare Unified Parallel C, Titanium and Co-

A Comparison of Unified Parallel C, Titanium and Co-Array Fortran. The purpose of this paper is to compare Unified Parallel C, Titanium and Co- Shaun Lindsay CS425 A Comparison of Unified Parallel C, Titanium and Co-Array Fortran The purpose of this paper is to compare Unified Parallel C, Titanium and Co- Array Fortran s methods of parallelism

More information

Eureka Math. Grade 7, Module 6. Teacher Edition

Eureka Math. Grade 7, Module 6. Teacher Edition A Story of Units Eureka Math Grade 7, Module 6 Teacher Edition Published by the non-profit Great Minds. Copyright 2015 Great Minds. No part of this work may be reproduced, sold, or commercialized, in whole

More information

HPF commands specify which processor gets which part of the data. Concurrency is defined by HPF commands based on Fortran90

HPF commands specify which processor gets which part of the data. Concurrency is defined by HPF commands based on Fortran90 149 Fortran and HPF 6.2 Concept High Performance Fortran 6.2 Concept Fortran90 extension SPMD (Single Program Multiple Data) model each process operates with its own part of data HPF commands specify which

More information

Dynamic Mode Decomposition analysis of flow fields from CFD Simulations

Dynamic Mode Decomposition analysis of flow fields from CFD Simulations Dynamic Mode Decomposition analysis of flow fields from CFD Simulations Technische Universität München Thomas Indinger Lukas Haag, Daiki Matsumoto, Christoph Niedermeier in collaboration with Agenda Motivation

More information

Cache performance Outline

Cache performance Outline Cache performance 1 Outline Metrics Performance characterization Cache optimization techniques 2 Page 1 Cache Performance metrics (1) Miss rate: Neglects cycle time implications Average memory access time

More information

Intermediate MPI features

Intermediate MPI features Intermediate MPI features Advanced message passing Collective communication Topologies Group communication Forms of message passing (1) Communication modes: Standard: system decides whether message is

More information

Last class. Caches. Direct mapped

Last class. Caches. Direct mapped Memory Hierarchy II Last class Caches Direct mapped E=1 (One cache line per set) Each main memory address can be placed in exactly one place in the cache Conflict misses if two addresses map to same place

More information

Mechanism Synthesis. Introduction: Design of a slider-crank mechanism

Mechanism Synthesis. Introduction: Design of a slider-crank mechanism Mechanism Synthesis Introduction: Mechanism synthesis is the procedure by which upon identification of the desired motion a specific mechanism (synthesis type), and appropriate dimensions of the linkages

More information

Performance Analysis of the Different Null Steering Techniques in the Field of Adaptive Beamforming

Performance Analysis of the Different Null Steering Techniques in the Field of Adaptive Beamforming Research Journal of Applied Sciences, Engineering and Technology 5(15): 4612, 213 ISSN: 24-7459; e-issn: 24-7467 Maxwell Scientific Organization, 213 Submitted: November 27,212 Accepted: January 14, 213

More information