Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing

Size: px

Start display at page:

Download "Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing"

Gavin Henry
5 years ago
Views:

1 Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing Ivan Walulya, Yiannis Nikolakopoulos, Vincenzo Gulisano Marina Papatriantafilou and Philippas Tsigas Auto-DaSP 2017 Chalmers University of Technology Göteborg, Sweden 1

2 Stream Processing Applications o Process infinite streams of data High throughput Low latency o High resource requirements (multiple cores/nodes) Input event Sink 2

3 Stream Processing Applications o Process infinite streams of data High throughput Low latency o High resource requirements (multiple cores/nodes) o Abstraction: Data-flow graphs of operators and streams Expose pipeline and task parallelism Input event Sink 2

4 Introduction Input event Sink Processing Element 3

5 Introduction o Parallel processing: Pipeline parallelism Task parallelism Input event Sink sink Processing Element 3

6 o Parallel processing: Pipeline parallelism Task parallelism Introduction Compilers Run-time Schedulers Limited by data graph Input event Sink sink Processing Element 3

7 Introduction Input event..o 4 O 3 O 2 O 1 4

8 Introduction o Data or operator parallelism: Bottlenecks at operator level Split the data and replicate the operator Input event O 4 O 1 O 3 O 2 4

9 Introduction o Data or operator parallelism: Bottlenecks at operator level Split the data and replicate the operator Not limited by data-flow graph Trade latency for high throughput Preserve sequential semantics Input event O 4 O 1 O 3 O 2 4

10 Motivation o Determinism: preserve sequential semantics (safety) Input event O 4 O 1 O 3 O 2 5

11 Motivation o Determinism: preserve sequential semantics (safety) Merge operators Enforce ordering amongst the output tuples. Input event O 4 O 1 Merger O 3 O 2 5

12 Motivation o Determinism: preserve sequential semantics (safety) Merge operators Enforce ordering amongst the output tuples. Compiler generated [Schneider 2013]. Left to Application or Library developer [ Apache Storm, Flink]. Input event O 4 O 1 Merger O 3 O 2 5

13 Problem Statement layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 6

14 Problem Statement o o Overheads of Merge operators: Introduce computation overhead Higher latency due to increase in operators Become processing bottleneck Considerable burden on the developers Challenge: How to apply data-parallelism transparently and safely layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 6

15 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 7

16 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

17 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization Extends the ScaleGate [Gulisano et. al. 2016]. Modularly take the logic of deterministic away from developer (Viper ). layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

18 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization Extends the ScaleGate [Gulisano et. al. 2016]. Modularly take the logic of deterministic away from developer (Viper ). Evaluate our implementation on Apache Storm. layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

19 Viper Module o Overall Approach Replace merge operators and channels with a Viper module layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

20 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

21 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

22 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object Sorting overhead shared by threads assigned to the same instance layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

23 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object Sorting overhead shared by threads assigned to the same instance Watermarking mechanism to handle back-pressure layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

24 ScaleGate Data Structure [Gulisano et. al. 2016]. o API: addtuple(tuple,sourceid) allows a tuple from sourceid to be merged by ScaleGate in the resulting sorted stream of ready tuples. getnextreadytuple(readerid) provides to readerid the next ready tuple that has not been yet consumed by the former. 9

25 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. 10

26 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. addtuple(channel, sourceid, tuple) Add tuple from a given source sourceid to the specified channel 10

27 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. addtuple(channel, sourceid, tuple) Add tuple from a given source sourceid to the specified channel getreadytuple(channel, readerid) Retrieve the next ready tuple (if any) for the specified readerid from the channel 10

28 Evaluation o Integrated Viper module in Apache Storm Execution Thread Task Send thread s based on LMAX disruptor A Dedicated send thread per Executor process Inbound Outbound 11

29 Evaluation o Integrated Viper module in Apache Storm Execution Thread Task Send thread s based on LMAX disruptor A Dedicated send thread per Executor process Inbound Outbound Execution Thread Remove send threads Concurrent queues or ScaleGate Inbound Task 11

30 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> 12

31 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators 12

32 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators o Metrics: Throughput, Latency and Energy 12

33 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators o Metrics: Throughput, Latency and Energy o Evaluation Platform Intel Xeon E5-2687W v2 3.4 GHz server (32 threads over 2 sockets), 64 GB of RAM Likwid library to read RAPL energy counters 12

34 Evaluation Results Viper No-Viper o o o o Stateless Forward position reports Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed Injection rate varied from 10,000 t/s to 1,200,000 t/s 1 Spout 2 Sink Spout N 13

35 Evaluation Results Viper No-Viper o o o o Stateless Forward position reports Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed Injection rate varied from 10,000 t/s to 1,200,000 t/s 1 Spout 2 Sink Spout N 13

36 Evaluation Results Viper No-Viper o o o Stateful Vehicle entering new Segment Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed 1 Spout 2 Sink Spout N 14

37 Conclusions and Future work o Discussed limitations of operator layer determinism and proposed a solution to overcome these at the communication layer. o Developed a Viper module that can be integrated in Ss o Evaluated the performance of the proposed module on Apache Storm o Scale the per tuple workload in the evaluation 15

38 Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing Thank You! 16

39 References o S. Schneider, M. Hirzel, B. Gedik and K. L. Wu Schneider 2015, Safe Data Parallelism for General Streaming in IEEE Transactions on Computers, o V. Gulisano, Y. Nikolakopoulos, M. Papatriantafilou and P. Tsigas Gulisano 2016, ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join in IEEE Transactions on Big Data,

Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects

Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects Vincenzo Gulisano, Yiannis Nikolakopoulos, Ivan Walulya, Marina Papatriantafilou and Philippas Tsigas Chalmers University