Bulk-Synchronous Communication Mechanisms in Diderot

John Reppy, Lamont Samuels
University of Chicago
October 26th, 2015

Diderot overview

Diderot is a parallel domain-specific language designed for biomedical image-analysis and visualization algorithms.

Its design models the algorithmic structure of its application domain [1]: independent strands computing over continuous tensor fields, which are reconstructed from discrete data $V$ using a separable convolution kernel $h$:

$F = V \circledast h$

[1] G. Kindlmann, C. Chiw, N. Seltzer, L. Samuels, and J. Reppy. Diderot: a Domain-Specific Language for Portable Parallel Scientific Visualization and Image Analysis. To appear, VIS 2015.

October 26th, 2015, AGERE '15 - Diderot

[Figure: discrete image data $V$ is convolved with the kernel $h$ to give the continuous field $F$.]
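Written out, the reconstruction $F = V \circledast h$ is a convolution sum. The sketch below assumes positions are expressed in the image's index space; the symbols $i$, $d$, and $x_k$ are introduced here for illustration and do not appear in the slides.

```latex
% 1-D: the field value at x is a kernel-weighted sum of the samples.
F(x) = (V \circledast h)(x) = \sum_{i \in \mathbb{Z}} V[i]\, h(x - i)

% d dimensions with a separable kernel: one 1-D kernel factor per axis.
F(\mathbf{x}) = \sum_{\mathbf{i} \in \mathbb{Z}^d} V[\mathbf{i}] \prod_{k=1}^{d} h(x_k - i_k)
```

Because a practical kernel $h$ has finite support, only the samples near $\mathbf{x}$ contribute to each sum.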

Diderot overview: Program Structure

Square roots of integers using Heron's method.

    // global definitions
    input int N = 1000;
    input real eps = ;

    // strand definition
    strand SqRoot (real val) {
      output real root = val;
      update {
        root = (root + val/root) / 2.0;
        if (|root^2 - val| / val < eps)
          stabilize;
      }
    }

    // initialization
    initially [ SqRoot(real(i)) | i in 1..N ];

Diderot overview: Parallelism Model

Bulk-synchronous parallel with deterministic semantics.

[Figure: strands and their state across execution steps; labels: update, stabilize, die, idle.]
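The bulk-synchronous model can be imitated in a few lines of Python. This is a minimal sketch, not the Diderot runtime: the class `SqRoot`, the function `run`, the tolerance `1e-6`, and the choice of 10 strands are all assumptions made for illustration. Each pass of the loop stands for one execution step in which every active strand updates once.

```python
from dataclasses import dataclass


@dataclass
class SqRoot:
    """Python stand-in for a strand that computes a square root."""
    val: float
    root: float = 0.0
    stable: bool = False

    def __post_init__(self):
        self.root = self.val  # start the iteration at the input value

    def update(self, eps):
        # One Heron step, then a relative-error test for stabilizing.
        self.root = (self.root + self.val / self.root) / 2.0
        if abs(self.root ** 2 - self.val) / self.val < eps:
            self.stable = True


def run(strands, eps, max_steps=100):
    """Run execution steps until every strand is stable."""
    steps = 0
    while steps < max_steps and any(not s.stable for s in strands):
        # One execution step: each active strand updates exactly once.
        for s in strands:
            if not s.stable:
                s.update(eps)
        steps += 1  # end of step: the implicit barrier
    return steps


# ASSUMPTION: eps and the strand count are illustrative values.
strands = [SqRoot(float(i)) for i in range(1, 11)]
steps = run(strands, eps=1e-6)
roots = [s.root for s in strands]
```

Strands never read each other's state here, which is why the order of the inner loop does not affect the result.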

Diderot overview: Parallelism Model (cont'd)

This model supports only applications in which strands act independently of one another, such as direct volume rendering or fiber tractography.
Diderot is missing features needed for other algorithms of interest, such as particle systems.

Communication Design: Spatial Design

Strands interact with each other based on their world coordinates.
A strand uses special query functions to identify neighboring strands.
The queried sequence of strands is processed with a new mechanism, the foreach statement.
Neighbor strand state is accessed using the selection operator (for example, neighbor.pos).

[Figure: circle query of radius r around strand Q, among strands A, B, C, and D.]

    strand Particle (vec2 initpos) {
      vec2 pos = initpos;
      update {
        foreach (Particle neighbor in sphere(5.0)) {
          real dist = |pos - neighbor.pos|;
          ...

Communication Design: Global Design

Strands can share state information on a larger scale within the system.
Global communication is performed using common reduction operations.
The reduction operations reside in a new definition block called global.
A reduction is performed across a set of strands: Particle.all, Particle.active, or Particle.stable.

Reductions: mean, max, min, all, exists, variance, sum, product

[Figure: strand sets, showing all strands, active strands, and stable strands.]

    real gmean = 0;

    strand Particle (real initenergy) {
      real energy = initenergy;
      update {
        if (energy < gmean) {
          // e.g., increase energy
        }
      }
    }

    global {
      gmean = mean{ P.energy | P in Particle.all };
    }
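The interplay between strand updates and a global reduction can be modeled as follows. This is a hedged sketch rather than Diderot's implementation: it assumes the global block runs once per step after all strand updates, so strands read the value computed in the previous step. The names `strand_sets`, `step`, and `boost` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Particle:
    energy: float
    stable: bool = False


def strand_sets(strands):
    """Return the three strand sets: all, active, and stable."""
    return {
        "all": list(strands),
        "active": [s for s in strands if not s.stable],
        "stable": [s for s in strands if s.stable],
    }


def step(strands, gmean, boost=1.0):
    # Strand phase: each active strand reads the global from the last step.
    for s in strands:
        if not s.stable and s.energy < gmean:
            s.energy += boost  # stands in for "increase energy"
    # Global phase (ASSUMPTION: runs after all strand updates).
    return mean(p.energy for p in strand_sets(strands)["all"])


particles = [Particle(1.0), Particle(2.0), Particle(6.0, stable=True)]
gmean = 0.0                     # initial value of the global
gmean = step(particles, gmean)  # no energy is below 0, so nothing changes
gmean = step(particles, gmean)  # the two active strands are now boosted
```

Swapping `"all"` for `"active"` or `"stable"` restricts the reduction to that subset of strands.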

Communication Implementation: Spatial Implementation

Spatial communication is implemented using a tree-based spatial partitioning scheme (k-d tree).
The tree is built using each strand's pos variable and is rebuilt before each iteration.

[Figure: strand positions (12,4), (2,22), (3,15), (17,20), (10,19), (17,18), (9,17), (14,13), and (16,7) plotted in the x-y plane, with the corresponding k-d tree whose levels are labeled X and Y.]
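A k-d tree with a radius query can be sketched as below. This is a generic textbook construction (median split, alternating axes), not necessarily the layout the Diderot runtime uses; `build` and `query` are hypothetical names, and the query center `(10, 18)` is chosen only for illustration.

```python
import math


def build(points, depth=0):
    """Build a 2-D k-d tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % 2                      # alternate x (0) and y (1)
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2                   # the median becomes the split node
    return (pts[mid],
            build(pts[:mid], depth + 1),
            build(pts[mid + 1:], depth + 1))


def query(node, center, r, depth=0, out=None):
    """Collect every point within distance r of center."""
    if out is None:
        out = []
    if node is None:
        return out
    point, left, right = node
    axis = depth % 2
    if math.dist(point, center) <= r:
        out.append(point)
    diff = center[axis] - point[axis]
    # Descend into a subtree only if the query ball can reach its side.
    if diff - r <= 0:
        query(left, center, r, depth + 1, out)
    if diff + r >= 0:
        query(right, center, r, depth + 1, out)
    return out


# Strand positions taken from the figure on this slide.
positions = [(12, 4), (2, 22), (3, 15), (17, 20), (10, 19),
             (17, 18), (9, 17), (14, 13), (16, 7)]
tree = build(positions)                   # would be rebuilt every iteration
near = sorted(query(tree, (10, 18), 5.0))
```

The pruning test is what makes the tree worthwhile: subtrees that lie entirely outside the query ball are never visited, whereas a brute-force query checks every strand.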

Communication Implementation: Global Implementation

The global computation phase groups reductions into execution phases.
This is done in two steps: lifting and phase insertion.

Global block:

    a = mean(...);
    b = sum(...) + a;
    c = sum(...) * 10;
    d = mean(...) / b;
    e = product(...) + 3;

Variable phases: a = 0, b = 1, c = 0, d = 2, e = 0

Reduction block (after lifting):

    phase 0:  r1 = mean(...);  r3 = sum(...);  r5 = product(...);
    phase 1:  r2 = sum(...);
    phase 2:  r4 = mean(...);

Global block (after transformation):

    phase(0);
    a = r1;
    phase(1);
    b = r2 + a;
    c = r3 * 10;
    phase(2);
    d = r4 / b;
    e = r5 + 3;
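One way to arrive at the phase numbers in this example is a single pass over the global block. The sketch below is an assumption about how such an analysis could work, not the compiler's actual algorithm: since the reduction arguments are elided on the slide, each statement's dependencies are taken from the globals visible on its right-hand side (`a` for `b`, and `b` for `d`). The names `assign_phases` and `lift` are hypothetical.

```python
def assign_phases(stmts):
    """stmts: ordered (var, deps) pairs, where deps lists the earlier
    globals the statement reads. Returns {var: phase}."""
    phase = {}
    for var, deps in stmts:
        # A statement's reduction runs one phase after its latest dependency;
        # with no dependencies it lands in phase 0.
        phase[var] = 1 + max((phase[d] for d in deps), default=-1)
    return phase


def lift(stmts, phase):
    """Name the reductions r1, r2, ... in source order, grouped by phase."""
    groups = {}
    for i, (var, _) in enumerate(stmts, start=1):
        groups.setdefault(phase[var], []).append(f"r{i}")
    return dict(sorted(groups.items()))


# ASSUMPTION: dependencies read off the right-hand sides shown on the slide.
stmts = [("a", []), ("b", ["a"]), ("c", []), ("d", ["b"]), ("e", [])]
phase = assign_phases(stmts)
groups = lift(stmts, phase)
```

Under these assumptions the result matches the slide: `a`, `c`, and `e` fall in phase 0, `b` in phase 1, and `d` in phase 2, so three of the five reductions can run in the same phase.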

Dynamic allocation: Strand Allocation

Strands are dynamically allocated by the new statement.
New strands begin executing in the next iteration.

    new Particle (pos + d);

[Figure: a strand created with new during iteration i becomes a new strand in iteration i + 1.]
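The rule that new strands start in the next iteration can be modeled by buffering spawned strands until the end of the step. This is an illustrative sketch only: positions are scalars rather than `vec2` for brevity, and `update`, `run_step`, and the `age` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    pos: float
    age: int = 0  # number of updates this strand has executed


def update(p, d=1.0):
    """Hypothetical update: spawn one neighbor on the first update only."""
    p.age += 1
    return Particle(p.pos + d) if p.age == 1 else None


def run_step(strands):
    """Run one iteration and return the strand list for the next one."""
    spawned = []
    for p in strands:
        child = update(p)
        if child is not None:
            spawned.append(child)  # buffered: does not run in this step
    return strands + spawned       # new strands join at the barrier


strands = [Particle(0.0)]
strands = run_step(strands)  # iteration i: the original spawns a child
after_i = [(p.pos, p.age) for p in strands]
strands = run_step(strands)  # iteration i + 1: the child runs its first update
after_i1 = [(p.pos, p.age) for p in strands]
```

Deferring the spawn keeps each step deterministic: the set of strands that execute in a step is fixed before the step begins.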

Evaluation: Live Demo (Particle-Based Isocontour Sampler)

Sampling Implicit Surfaces
Sampling Valley Lines

Summary

Conclusion
We added new communication mechanisms to the Diderot programming model.
These mechanisms allow more algorithms to be implemented in Diderot.

Future Work
Additional query functions
Implementation of the communication mechanisms on GPUs

[Figure: the extended execution model; labels: execution step, strand state, update, stabilize, die, new, idle, spawn strands, global computation, read.]

Conclusion: Acknowledgements

The Diderot Project
Charisee Chiw, University of Chicago
Gordon Kindlmann, University of Chicago
John Reppy, University of Chicago
Nick Seltzer, University of Chicago

Conclusion: Questions?
