Ohua: Implicit Dataflow Programming for Concurrent Systems

Size: px

Start display at page:

Download "Ohua: Implicit Dataflow Programming for Concurrent Systems"

Elaine Harrington
6 years ago
Views:

1 Ohua: Implicit Dataflow Programming for Concurrent Systems Sebastian Ertel Compiler Construction Group TU Dresden, Germany Christof Fetzer Systems Engineering Group TU Dresden, Germany Pascal Felber Institut d informatique Université de Neuchâtel, Switzerland PPPJ

The Multi-Core Future Jons-Tobias Wamhoff. 2014.

2 The Multi-Core Future Jons-Tobias Wamhoff Exploiting Speculative and Asymmetric Execution on Multicore Architectures (Dissertation) 2

3 Parallelism Lesson learned (for general purpose programming): Work granularity must compensate scheduling overheads: Coarse-grained parallelism Concurrent programming Tim Harris and Satnam Singh. Feedback directed implicit parallelism. ICFP '07 3

4 Concurrent Programming (compiler knowledge) runtime optimizations manual automatic threads locks explicit (control structures, abstractions) tasks fork/join actors messagepassing dataflow operators dataflowgraph concurrent programming model stateful functions functional program implicit (functions, variables) All models depart from sequential programming style. Scalability in programming effort suffers. Limited compiler support for optimizations. 4

Async threads locks explicit (control structures, abstractions) tasks fork/join

model stateful functions functional program implicit (functions, variables) All

5 Concurrent Programming (compiler knowledge) runtime optimizations manual automatic IO Programming Models Sync vs. Async threads locks explicit (control structures, abstractions) tasks fork/join actors messagepassing dataflow operators dataflowgraph concurrent programming model stateful functions functional program implicit (functions, variables) All models depart from sequential programming style. Scalability in programming effort suffers. Limited compiler support for optimizations. 5

6 Concurrent Programming (compiler knowledge) runtime optimizations manual automatic threads locks explicit (control structures, abstractions) tasks fork/join actors messagepassing dataflow operators dataflowgraph concurrent programming model stateful functions functional program implicit (functions, variables) Key observation: Algorithms are composed out of functionality. 6

7 From dataflow programming to stateful functional programming (SFP) Example: simple web server a r p l c r Jesper H. Spring, Jean Privat, Rachid Guerraoui, and Jan Vitek Streamflex: high-throughput stream programming in java. OOPSLA '07 7

From dataflow programming to stateful functional programming (SFP) Example: simple web server a r p l c r port cnn cnn req resource length header a r p

8 From dataflow programming to stateful functional programming (SFP) Example: simple web server a r p l c r port cnn cnn req resource length header a r p l c r content Jesper H. Spring, Jean Privat, Rachid Guerraoui, and Jan Vitek Streamflex: high-throughput stream programming in java. OOPSLA '07 8

9 From dataflow programming to stateful functional programming (SFP) Example: simple web server Jesper H. Spring, Jean Privat, Rachid Guerraoui, and Jan Vitek Streamflex: high-throughput stream programming in java. OOPSLA '07 9

10 From dataflow programming to stateful functional programming (SFP) Example: simple web server Functions instead of channels! Stateful functions state encapsulation Implemented in imperative language/style. Jesper H. Spring, Jean Privat, Rachid Guerraoui, and Jan Vitek Streamflex: high-throughput stream programming in java. OOPSLA '07 10

11 From dataflow programming to stateful functional programming (SFP) Example: simple web server Functions instead of channels! Stateful functions state encapsulation Implemented in imperative language/style. Jesper H. Spring, Jean Privat, Rachid Guerraoui, and Jan Vitek Streamflex: high-throughput stream programming in java. OOPSLA '07 11

12 Ohua (Java/Clojure) Function (Java/Clojure) Algorithm (Clojure) Function Library Linker Dataflow Graph Extraction Dataflow Compilation Runtime Code Dataflow Graph Compiletime Data Section-Mapping Algorithm Execution Engine Hierarchical Scheduler Runtime Algorithm Code Transformation Compile-time 12

13 Ohua (Java/Clojure) Function (Java/Clojure) Algorithm (Clojure) Function Library Linker Dataflow Graph Extraction Dataflow Compilation Runtime Code Dataflow Graph Compiletime Data Section-Mapping Algorithm Execution Engine Hierarchical Scheduler Runtime Algorithm Code Transformation Compile-time 13

14 From Control Flow to Dataflow Construction of a control flow dependence graph. read load composeget-resp merge reply parseheader cond composepost-resp parsepost store Most special forms of Clojure supported. Micah Beck, Richard Johnson, and Keshav Pingali From control flow to dataflow. J. Parallel Distrib. Comput. 14

15 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: 15

16 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: 16

17 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: 17

18 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: balancedata read read read parse parse parse load load load compose compose compose reply reply reply 18

19 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: balancedata read read read parse parse parse load load load compose compose compose reply reply reply Beware: Each operator/function invocation has its own state! 19

20 Closures and Destructuring Closures in combination with destructuring create opportunities for enhanced parallelism. macro construction for parallel request handling: balancedata read read read parse parse parse load load load compose compose compose reply reply reply 20

21 Ohua (Java/Clojure) Function (Java/Clojure) Algorithm (Clojure) Function Library Linker Dataflow Graph Extraction Dataflow Compilation Runtime Code Dataflow Graph Compiletime Data Section-Mapping Algorithm Execution Engine Hierarchical Scheduler Runtime Algorithm Code Transformation Compile-time 21

22 Sections Dataflow execution model: pipeline parallelism. Section: set of one or more operators. section request handling section section network I/O read section disk I/O section network I/O write section read parse load compose reply read parse load compose reply Construct section mapping from execution restrictions. Hierarchical scheduling: Sections executed across thread pool parallel. Operators scheduled cooperatively sequential. 22

23 Synchronization Sections provide lock-free synchronization! Consider parallel request handling by type: GET cond load parseheader composeget-resp reply read composepost-resp parsepost store reply PUT Shared external resource (file) inconsistency. 23

24 Synchronization Sections provide lock-free synchronization! Consider parallel request handling by type: GET cond load parseheader composeget-resp reply read composepost-resp parsepost store reply PUT Shared external resource (file) inconsistency. sync section cond load parseheader composeget-resp reply read composepost-resp parsepost store reply 24

25 Experimental Setup server (8-core) request delay: 20 ms clients spread across 19 - (8-core) machines 25

26 Section Strategies Goal: web server fine-tuning without code changes. Best deployment strategy: call/sgl-rd-rp/non-blocking/normal read parse load compose reply 26

27 Section Strategies Goal: web server fine-tuning without code changes. Best deployment strategy: call 27

28 Section Strategies Goal: web server fine-tuning without code changes. Best deployment strategy: call/sgl-rd-rp read parse load compose reply 28

29 Section Strategies Goal: web server fine-tuning without code changes. Best deployment strategy: call/sgl-rd-rp/non-blocking read parse load compose reply 29

30 Section Strategies Goal: web server fine-tuning without code changes. Best deployment strategy: call/sgl-rd-rp/non-blocking/normal read parse load compose reply 30

31 Jetty vs. Ohua 31

32 Jetty vs. Ohua 32

33 Jetty vs. Ohua Ohua is comparable to Jetty (NIO) in terms of latency, outperforms Jetty (NIO) in terms of throughput, but without pre-selecting the GC and without mixing programming models. 33

34 Stateful Functional Programming (SFP) SFP programming model: program = declarative algorithms + stateful functions Benefits: Scalable system construction via algorithm composition. Implicit parallelism for algorithms. Future work: (tip of the ice berg) No deadlocks or data races. Leverage dataflow experience for runtime optimizations. 34

35 Thanks for your attention! Questions?

36 Loops Example: non-blocking read nb-readconcat nbread merge cond parse load compose reply Beware of state inside loops! 36

37 Sections section request handling section read parse load compose reply section network I/O read section disk I/O section network I/O write section read parse load compose reply Manual section configuration via regular expressions: Restrictions: read parse load compose reply Ultimate goal: automatic section configuration. 37

38 Type sensitive request handling Scenario: small feeds in RAM vs. articles on disk Goal: unblock feeds from articles 38

39 Type sensitive request handling Scenario: small feeds in RAM vs. articles on disk Goal: unblock feeds from articles Latency (ms) Disk RAM # Clients Throughput (kreq/s) # Clients RAM Disk 39

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based? Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based