Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs

Size: px

Start display at page:

Download "Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs"

Caitlin Henderson
5 years ago
Views:

1 Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Anan Bouakaz Pascal Fradet Alain Girault Real-Time and Embedded Technology and Applications Symposium, Vienna April 14th, 2016

2 Outline Introduction 1 Introduction Application model Scheduling policy Symbolic analysis 2 Preliminary results 3 Graph A p q B 4 Generalization to acyclic graphs 5 Experiments 6 Conclusion 1 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

3 Introduction Data-flow models of computation Application model Stream-processing applications are found in many embedded systems video codecs, software defined radio,... computationally intensive strict quality-of-service requirements low energy consumption more and more these applications run on many-core platforms Data-flow models of computation are good at: Expressing task-level parallelism Achieving efficient implementation Guaranteeing performances at compile time: throughput: stream oriented applications latency: automatic control oriented applications buffer sizes: all embedded applications 2 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

4 Introduction Application model Acyclic Synchronous Data-FLow (SDF) graphs [Lee and Messerschmitt, Proc. 1987] rate actor edge A B C t A =15 t B =8 t C =17 execution time 3 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

5 Introduction Application model Acyclic Synchronous Data-FLow (SDF) graphs [Lee and Messerschmitt, Proc. 1987] A B C z A 3 = z B 2 z B 1 = z C 3 System of Balance Equations Consistent SDF graph G: this system has a non-null solution Repetition vector of G: z = [A 4, B 6, C 2 ] Iteration = firing sequence that returns G to its initial state [ ] [ ] [ ] [ ] 0 A 4 12 B 6 0 C Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

6 Scheduling policy Introduction Scheduling policy As Soon As Possible (ASAP) [Sriram and Bhattacharyya 2000] No auto-concurrency auto-concurrency Modeling Techniques buffer size A B z = [2, 3] A B t A =15 t B = Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

7 Scheduling policy Introduction Scheduling policy As Soon As Possible (ASAP) [Sriram and Bhattacharyya 2000] No auto-concurrency auto-concurrency Modeling Techniques buffer size A B z = [2, 3] A B t A =15 t B =8 3 2 transient phase steady state A B Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

8 Scheduling policy Introduction Scheduling policy Definition: Multi-iteration latency of graph G: L G (n) = the finish time of the n th iteration. A B L G (1) L G (2) 5 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

9 Scheduling policy Introduction Scheduling policy Definition: Input-output latency of graph G: l G (n) = the duration between the start and ending of the n th iteration. A B l G (1) l G (2) 5 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

10 Scheduling policy Introduction Scheduling policy Definition: Period of graph G: L G (n) P G = the average length of an iteration = lim n n Definition: Throughput of graph G: T G = 1 P G A B P G 5 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs P G

11 Introduction Maximal Cycle Mean (MCM) Scheduling policy Definition: Load of actor A: The load of an actor A is equal to z A t A. Definition: Maximum Cycle Mean (MCM): The CM of a cycle C is the sum of execution times of the actors in C divided by the number of initial tokens in the channels. MCM(G) = max C G CM(C) Property: Maximum throughput of an HSDF graph G: T G = 1 MCM(G) 6 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

12 Symbolic analysis Introduction Symbolic analysis parametric dataflow graph partially specified SDF graph SDF graph instantiation numerical SDF graph numerical analysis NP-complete for HSDF results 7 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

13 Symbolic analysis Introduction Symbolic analysis parametric dataflow graph partially specified SDF graph SDF graph symbolic numerical symbolic analysis symbolic formulas instantiation SDF graph symbolic evaluation numerical evaluation numerical analysis NP-complete for HSDF results 7 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

14 Outline Preliminary results 1 Introduction 2 Preliminary results Maximal throughput Duality theorem 3 Graph A p q B 4 Generalization to acyclic graphs 5 Experiments 6 Conclusion 8 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

15 Maximal throughput Preliminary results Maximal throughput Property The (minimal) period of an acyclic SDF graph G = (V, E) is equal to P G = max A V {z At A } 9 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

16 Maximal throughput Preliminary results Maximal throughput Property The (minimal) period of an acyclic SDF graph G = (V, E) is equal to P G = max A V {z At A } Proof. Using SDF-to-HSDF transformation and the MCM. A 1 B A B t A =10 t B =12 z = [3, 2] SDF-to-HSDF A 2 A 3 B 2 9 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

17 Duality theorem Preliminary results Duality theorem Definition: The dual of an SDF graph G: G 1 is obtained by reversing all edges of G. Duality theorem: Let G be any (cyclic or not) live graph and G 1 be its dual, then T G = T G 1 i. L G (i) = L G 1(i). and A B t A = t B =12 G A B L G (n)=l G 1(n) G 1 A B Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

18 Duality theorem Preliminary results Duality theorem Definition: The dual of an SDF graph G: G 1 is obtained by reversing all edges of G. Duality theorem: Let G be any (cyclic or not) live graph and G 1 be its dual, then T G = T G 1 i. L G (i) = L G 1(i). and Proof: Using SDF-to-HSDF transformation + unfolding: A 1 A 1 B 1 B 1 HSDF(G) A 2 A 2 HSDF(G 1 ) B 2 B 2 A 3 A 3 10 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

19 Outline Graph A p q B 1 Introduction 2 Preliminary results 3 Graph A p q B Enabling patterns Minimum buffer size 4 Generalization to acyclic graphs 5 Experiments 6 Conclusion 11 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

20 Graph A p q B Preliminaries about graph A B p q Four parameters: p, q N + and t A, t B R +. Repetition vector: [ ] q z A = gcd(p, q), zb = p gcd(p, q) ASAP period: P G = max(z A t A, z B t B ). Problem statement What is θ A,B the min. size of channel A B s.t. the ASAP execution achieves the max. throughput? Solution p + q gcd(p, q) < θ A,B 2(p + q gcd(p, q)) Proof: 18 cases in total: p, q 6 cases; t A, t B 3 cases 12 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

21 Enabling patterns Graph A p q B Enabling patterns A time-independent analytic and parametric characterization of the data-dependency A B that covers one iteration. Example: Graph A 8 5 B with t A = 20 and t B = 7 enabling point A 1 A 2 A 3 A 4 A B 1 B 2 B 3 B 4 B 5 B 6 B 7 B 8 A B A B 2 A B A B 2 A B 2 A i B j i firings of A enables j firings of B. Unfolded pattern: A B; A B 2 ; A B; A B 2 ; A B 2 13 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

22 Enabling patterns Graph A p q B Enabling patterns Unfolded pattern: A B; A B 2 ; A B; A B 2 ; A B 2 }{{}}{{} block block 14 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

23 Enabling patterns Graph A p q B Enabling patterns Unfolded pattern: Factorized pattern: A B; A B 2 ; A B; A B 2 ; A B 2 }{{}}{{} block block [ A B; [A B 2 ] f i] i=1 2 with f1 = 1, f 2 = 2 14 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

24 Enabling patterns Graph A p q B Enabling patterns Unfolded pattern: Factorized pattern: General case: A B; A B 2 ; A B; A B 2 ; A B 2 }{{}}{{} block block [ A B; [A B 2 ] f i] i=1 2 with f1 = 1, f 2 = 2 with p = kq + r and α j = [ A B k ; [A B k+1] α j ] j=1 q r gcd(p,q) jr q r (j 1)r q r 14 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

25 Enabling patterns Graph A p q B Enabling patterns Case A. p q Let p = kq + r with 0 r < q Case A.1. r = 0 Case A.2. q 2r A B k [ A B k ; [ A B k+1] α j ] j=1 q r gcd(p,q) Case A.3. q > 2r [ [A B k ] β j ; A B k+1 ] j=1 r gcd(p,q) jr (j 1)r α j = q r q r β j = jq r (j 1)q r 1 Case B. p < q Let q = kp + r with 0 r < p Case B.1. r = 0 A k B Case B.2. p 2r [ A k+1 B; [ A k B ] γ j ] j=1 r gcd(p,q) Case B.3. p < 2r γ j = [ [A k+1 B ] λ j ; A k B] j=1 p r gcd(p,q) jp r (j 1)p r 1 λ j = jr p r (j 1)r p r 15 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

26 Graph A p q B Minimum buffer size Min. buffer size θ A,B : Case 1: load of A load of B: z A t A z B t B Recall that P G = max(z A t A, z B t B ) To reach maximal throughput, A must fire consecutively δ j = minimum number of tokens in the backward edge s.t. A j occurs immediately after A j 1 θ A,B = max δ j j θ A,B 8 5 t A =20 A B t B =7 8 5 δ j : p 2p 3p q 4p 3q 5p 4q 6p 6q A B A B A B 2 A B A B 2 A B 2 16 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

27 Graph A p q B Minimum buffer size Min. buffer size θ A,B : Case 1: load of A load of B: z A t A z B t B Case 1.I. All newly enabled firings of B complete their execution before the next enabling point. Graph A 8 5 B, t A = 20, t B = 7 A B A B A B 2 A B A B 2 A B 2 Case A. p q θ A,B = 2p + q gcd(p, q) Case B. p < q θ A,B = p + q gcd(p, q) + tb p t A 17 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

28 Graph A p q B Minimum buffer size Min. buffer size θ A,B : Case 1: load of A load of B: z A t A z B t B Case 1.II. All enabled firings of B during one block complete their execution before the next block. Graph A 8 5 B, t A = 13, t B = 7 block block A A B A B 2 A B A B 2 A B 2 A B A B 2 B Case A. p q Case B. p < q Let r = t A kt B tb r θ A,B = 2p + q gcd(p, q) + r r Let r = t B kt A r θ A,B = p+2q gcd(p, q)+ (p r) t A r 18 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

29 Graph A p q B Minimum buffer size Min. buffer size θ A,B : Case 1: load of A load of B: z A t A z B t B Case 1.III. General case. Graph A 8 5 B, t A = 23, t B = 14 catch-up sequence catch-up sequence (2 blocks) A B A :B A :B 2 A :B A :B 2 A :B 2 Key elements of the solution: The maximum of δ j occurs inside the longest catch-up sequence. The finish-before relation inside a catch-up sequence follows the enabling pattern of graph A t A t B. B 19 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

30 Graph A p q B Minimum buffer size Min. buffer size θ A,B : Case 2: load of A < load of B: z A t A < z B t B To reach maximal throughput, B must fire consecutively in the steady state Graph A 8 5 B, t A = 14, t B = 10, θ A,B = 22 A B Using the duality theorem, the minimum buffer size of A B p q is equal to that of B A. q p θ A,B = θ B,A 20 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

31 Outline Generalization to acyclic graphs 1 Introduction 2 Preliminary results 3 Graph A p q B 4 Generalization to acyclic graphs Safe upper bounds Improvements 5 Experiments 6 Conclusion 21 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

32 Generalization to acyclic graphs Safe upper bounds for Θ A,B (1) Safe upper bounds Property: Let G be a graph with no directed cycle: If A B p q G, Θ A,B 2(p + q gcd(p, q)), then the ASAP execution of G achieves the max. throughput. Proof for chains: 1. For chain G = {A 1 p 1 q 1 A2 A n }, construct chain G s.t. t Ai is considered to be P G /z Ai (i.e., all actors impose the same load). 2. Graphs G and G have the same maximal throughput. 3. sizes 2(p i + q i gcd(p i, q i ) allow all actors to fire consecutively in G = maximal throughput. 4. Reducing the execution times of actors in G to their original values will never decrease the throughput. 22 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

33 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds The previous property does not hold for general acyclic graphs. Counterexample: t A =4 A 4 1 t B =3 3 3 B C 3 3 t C = D t D =8 z = [6, 8, 2, 3] buffer sizes: A B : 12 B D : 20 A C : 6 C D : 8 A B C D 23 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

34 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds The previous property does not hold for general acyclic graphs. Counterexample: t B = B 8 t A =4 A D t D =8 z = [6, 8, 2, 3] 1 C t C =12 Problem: The chain A C D influences A B D Solution: Increase the size of some channels to cut the influence. 23 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

35 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds Step 1. linearization of A B D A B 24 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

36 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Step 1. linearization of A B D A B Safe upper bounds B u i. f B u(i) = qt A p i + ( t A + t B gcd(p,q) p t A ) 24 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

37 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds Step 1. linearization of A B D A B D x = 16 B u D u 24 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

38 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds Step 1. linearization of A B D A B B u D x = 16 D u Step 2. linearization of A C D A C D y = 28 D u 24 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

39 Generalization to acyclic graphs Safe upper bounds for Θ A,B (2) Safe upper bounds Step 1. linearization of A B D A B B u D x = 16 D u Step 2. linearization of A C D A C D y = 28 D u Step 3. Combination of the two linear chains: Increase the size of B D p q by y x t B p = 12 so that actor B can fire consecutively during the extra-delay y x. 24 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

40 Improvements (1) Generalization to acyclic graphs Improvements Definition: Load monotone chain: The chain A 1 A n is load-monotone if and only if ( i. z Ai t Ai z Ai+1 t Ai+1 ) ( i. z Ai t Ai z Ai+1 t Ai+1 ) Property A monotone chain A 1 A n where the size of each buffer A i A i+1 is at least θ Ai,A i+1 achieves its maximal throughput. The property holds for non monotone chains made of an ascending sub-chain followed by a descending one (i.e., form ). The property does not hold for an arbitrary order. Solution = next slide. 25 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

41 Improvements (2) Generalization to acyclic graphs Improvements Solution. transform any chain into the form by using retiming: 1. Increase the execution times of some actors (without exceeding the maximum load P G ). 2. Compute the buffer sizes θ Ai,A i Restore the original execution times. 26 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

42 Improvements (2) Generalization to acyclic graphs Improvements Solution. transform any chain into the form by using retiming: 1. Increase the execution times of some actors (without exceeding the maximum load P G ). 2. Compute the buffer sizes θ Ai,A i Restore the original execution times. A 4 5 B 4 2 C 3 4 D 1 3 E 4 2 F 5 5 G 1 2 H 3 2 I 2 2 J 5 2 K t Ai : Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

43 Improvements (2) Generalization to acyclic graphs Improvements Solution. transform any chain into the form by using retiming: 1. Increase the execution times of some actors (without exceeding the maximum load P G ). 2. Compute the buffer sizes θ Ai,A i Restore the original execution times. load A 4 5 B 4 2 C 3 4 D 1 3 E 4 2 F 5 5 G 1 2 H 3 2 I 2 2 J 5 2 K t Ai : Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

44 Improvements (2) Generalization to acyclic graphs Improvements Solution. transform any chain into the form by using retiming: 1. Increase the execution times of some actors (without exceeding the maximum load P G ). 2. Compute the buffer sizes θ Ai,A i Restore the original execution times. load A 4 5 B 4 2 C 3 4 D 1 3 E 4 2 F 5 5 G 1 2 H 3 2 I 2 2 J 5 2 K t Ai : Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

45 Outline Experiments 1 Introduction 2 Preliminary results 3 Graph A p q B 4 Generalization to acyclic graphs 5 Experiments 6 Conclusion 27 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

46 Experiments Heuristic vs. upper bounds 2 million randomly generated chains: 10 actors, p, q [1, 20] and t X [1, 200] Compute the ratio of the total buffers sizes (heuristic) to the sum of the lower bounds, p + q gcd(p, q) Improvement of 20% 28 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

47 Experiments Heuristic vs. exact (1) 10 thousand randomly generated chains: 4 actors, rates in [1, 10] and times in [1, 100]. Compute the ratios of the total buffers sizes (heuristic, exact) to the sum of the lower bounds. Overhead of only 25% 29 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

48 Experiments Heuristic vs. exact (1) 5 real applications from the SDF 3 benchmarks the StreamIt benchmarks. graph # actors A z A Upper bound Optimal size Heuristic (a) modem (b) sample con (c) H.263 dec (d) FFT (e) TDE Loads: (a) (b) (c) (d) (e) 30 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

49 Outline Conclusion 1 Introduction 2 Preliminary results 3 Graph A p q B 4 Generalization to acyclic graphs 5 Experiments 6 Conclusion 31 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

50 Related work Conclusion [Geilen 2011] and [Skelin et al. 2014]: (max, +) algebra to compute the token timestamp vector with the eigenvalue of the transition matrix = Requires the ceiling operator to be simplified [Ghamarian et al. 2008]: parametric throughput analysis for SDF graphs with bounded parametric execution times of actors but constant rates = Parameter space divided into a set of convex polyhedra (throughput regions), each with a throughput expression [Damavandpeyma et al. 2012]: Extension to scenario-aware dataflow (SADF) [Bodin et al. 2013]: lower bounds of the maximum throughput to compute strictly periodic schedules instead of ASAP schedules = Can handle some cyclic graphs, but usually our linearization methods provide better results 32 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

51 Conclusion Conclusion We presented: An exact analytic solution of the (p, q, t A, t B ) problem using enabling patterns. A safe generalization to acyclic graphs using linearization and retiming. We didn t present: Still to solve: Symbolic latency computation of acyclic dataflow graphs Symbolic analysis of cyclic dataflow graphs 33 Bouakaz, Fradet and Girault Symbolic Buffer Sizing of Dataflow Graphs

Reliable Embedded Multimedia Systems?

2 Overview Reliable Embedded Multimedia Systems? Twan Basten Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others Embedded Multi-media Analysis of