Streaming Analytics with Apache Flink. Stephan

Derrick Stevens
6 years ago

1 Streaming Analytics with Apache Flink Stephan

2 Apache Flink Stack
    Libraries
    DataStream API (Stream Processing) | DataSet API (Batch Processing)
    Runtime: Distributed Streaming Data Flow
    Streaming and batch as first class citizens.

3 Today
    Same stack as slide 2 (Libraries, DataStream API, DataSet API, Runtime). Streaming and batch as first class citizens.

4 Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams. 4

5 Streaming technology is enabling the obvious: continuous processing on data that is continuously produced 5

6 A brief History of Flink
    January '10: Project Stratosphere (Flink precursor)
    April '14: Flink Project Incubation
    December '14: Top Level Project
    March '16: Release 1.0
    Releases along the way: v0.5, v0.6, v0.7, v0.8, v0.9, v0.10

7 A brief History of Flink (same timeline, annotated)
    The academia gap: reading/writing papers, teaching, worrying about thesis
    Realizing this might be interesting to people beyond academia (even more so, actually)

8 A Stream Processing Pipeline collect log analyze serve 8

9 Programs and Dataflows
    val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09(...))
    val events: DataStream[Event] = lines.map((line) => parse(line))
    val stats: DataStream[Statistic] = events
      .keyBy("sensor")
      .timeWindow(Time.seconds(5))
      .apply(new MyAggregationFunction())
    stats.addSink(new RollingSink(path))

    Streaming Dataflow (parallel instances in brackets):
      Source:          Source [1], Source [2]
      Transformation:  map() [1], map() [2]
      Transformation:  keyBy()/window()/apply() [1], keyBy()/window()/apply() [2]
      Sink:            Sink [1]

10 Why does Flink stream flink?
    True Streaming: low latency, high throughput, well-behaved flow control (back pressure)
    Event Time: make more sense of data, works on real-time and historic data
    Stateful Streaming: windows & user-defined state, exactly-once semantics for fault tolerance, globally consistent savepoints
    APIs & Libraries: flexible windows (time, count, session, roll-your own), Complex Event Processing

11 Counting 11

12 Continuous counting
    - A seemingly simple application, but generally an unsolved problem
    - E.g., count visitors, impressions, interactions, clicks, etc.
    - Aggregations and OLAP cube operations are generalizations of counting

13 Counting in batch architecture
    - Continuous ingestion
    - Periodic (e.g., hourly) files
    - Periodic batch jobs

14 Problems with batch architecture
    - High latency
    - Too many moving parts
    - Implicit treatment of time
    - Out-of-order event handling
    - Implicit batch boundaries

15 Counting in λ architecture
    - "Batch layer": what we had before
    - "Stream layer": approximate early results

16 Problems with batch and λ
    - Way too many moving parts (and code duplication)
    - Implicit treatment of time
    - Out-of-order event handling
    - Implicit batch boundaries

17 Counting in streaming architecture
    - Message queue ensures stream durability and replay
    - Stream processor ensures consistent counting

18 Counting in Flink DataStream API
    Number of visitors in last hour by country:

    DataStream<LogEvent> stream = env
        .addSource(new FlinkKafkaConsumer(...))   // create stream from Kafka
        .keyBy("country")                         // group by country
        .timeWindow(Time.minutes(60))             // window of size 1 hour
        .apply(new CountPerWindowFunction());     // do operations per window
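
What the keyed one-hour window computes can be sketched without Flink. The following plain-Java example is only an illustration under simplifying assumptions: events arrive as parallel arrays, windows are aligned to multiples of the window size, and the class and method names (TumblingCount, windowStart, countPerWindow) are hypothetical, not Flink API.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Flink dependency) of counting events per key and
// tumbling window. All names are illustrative assumptions.
class TumblingCount {
    // Start of the tumbling window that contains the given timestamp.
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    // Counts events per "country@windowStart"; the two arrays are parallel.
    static Map<String, Long> countPerWindow(String[] countries, long[] timestampsMs, long windowSizeMs) {
        Map<String, Long> counts = new TreeMap<>();
        for (int i = 0; i < countries.length; i++) {
            String key = countries[i] + "@" + windowStart(timestampsMs[i], windowSizeMs);
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long hour = 60 * 60 * 1000L;
        System.out.println(countPerWindow(
            new String[] {"DE", "DE", "US", "DE"},
            new long[] {1_000L, 2_000L, 3_000L, hour + 5L},
            hour));
    }
}
```

A stream processor performs the same grouping incrementally and keeps the per-window counts as managed state rather than in a local map.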

19 Counting hierarchy of needs (based on Maslow's hierarchy of needs), from top to base:
    ... queryable
    ... accurate and repeatable,
    ... fault tolerant (exactly once),
    ... efficiently on high volume streams,
    ... with low latency,
    Continuous counting

26 Rest of this talk, from top to base of the hierarchy:
    ... queryable
    ... accurate and repeatable,
    ... fault tolerant (exactly once),
    ... efficiently on high volume streams,
    ... with low latency,
    Continuous counting

27 Streaming Analytics by Example 27

28 Time-Windowed Aggregations (tumbling window of 5 seconds)
    case class Event(sensor: String, measure: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("sensor")
      .timeWindow(Time.seconds(5))
      .sum("measure")

29 Time-Windowed Aggregations (sliding window of 60 seconds, evaluated every 5 seconds)
    case class Event(sensor: String, measure: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("sensor")
      .timeWindow(Time.seconds(60), Time.seconds(5))
      .sum("measure")
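
With a sliding window, one event contributes to several overlapping windows. The sketch below, in plain Java with no Flink dependency, lists the start times of all windows that contain a given timestamp. It assumes windows are aligned to multiples of the slide; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch: which sliding windows [start, start + size) contain an event?
// Assumes window starts are multiples of the slide. Names are illustrative.
class SlidingWindows {
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start that is not after the timestamp.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Walk backwards while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 60 s windows sliding by 5 s, as on the slide.
        System.out.println(windowStarts(62_000L, 60_000L, 5_000L));
    }
}
```

With a 60-second size and a 5-second slide, every event falls into 12 windows, which is why sliding windows cost more state and computation than tumbling ones.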

30 Session-Windowed Aggregations (Flink 1.1 syntax)
    case class Event(sensor: String, measure: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("sensor")
      .window(EventTimeSessionWindows.withGap(Time.seconds(60)))
      .max("measure")
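
The idea behind a session window is that a period of inactivity closes the session. The plain-Java sketch below groups the sorted timestamps of one key into sessions. It is an illustration only: treating a distance of exactly the gap as part of the same session is a choice of this sketch, and the names are hypothetical rather than Flink API.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of session windowing for a single key.
// Input timestamps must be sorted in ascending order. Names are illustrative.
class SessionWindows {
    // Each session is returned as {firstTimestamp, lastTimestamp, eventCount}.
    static List<long[]> sessions(long[] sortedTimestampsMs, long gapMs) {
        List<long[]> result = new ArrayList<>();
        long[] current = null;
        for (long ts : sortedTimestampsMs) {
            if (current != null && ts - current[1] <= gapMs) {
                current[1] = ts;   // extend the running session
                current[2]++;
            } else {
                current = new long[] {ts, ts, 1};   // inactivity exceeded the gap: new session
                result.add(current);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] s : sessions(new long[] {0L, 10_000L, 30_000L, 200_000L, 210_000L}, 60_000L)) {
            System.out.println(s[0] + " .. " + s[1] + " (" + s[2] + " events)");
        }
    }
}
```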

32 Pattern Detection
    case class Event(producer: String, evtType: Int, msg: String)
    case class Alert(msg: String, state: Int)   // second field added so that Alert(event.msg, x) compiles

    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("producer")
      .flatMap(new RichFlatMapFunction[Event, Alert]() {

        // per-key state, kept in the embedded key/value state store
        lazy val state: ValueState[Int] = getRuntimeContext.getState(...)

        def flatMap(event: Event, out: Collector[Alert]) = {
          val newState = state.value() match {
            case 0 if (event.evtType == 0) => 1
            case 1 if (event.evtType == 1) => 0
            case x => out.collect(Alert(event.msg, x)); 0
          }
          state.update(newState)
        }
      })

33 Pattern Detection (same code, with the ValueState highlighted)
    Embedded key/value state store
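
The per-key state machine on these slides can be traced without a cluster. The plain-Java sketch below mirrors the same transitions, with a HashMap standing in for the keyed state. The class name, the map, and the alert text format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of the slide's state machine: events of type 0 and 1 are
// expected to alternate per producer; anything else raises an alert.
class PatternDetector {
    // Stand-in for keyed state: one integer per producer, defaulting to 0.
    private final Map<String, Integer> stateByProducer = new HashMap<>();
    final List<String> alerts = new ArrayList<>();

    void onEvent(String producer, int evtType, String msg) {
        int state = stateByProducer.getOrDefault(producer, 0);
        int newState;
        if (state == 0 && evtType == 0) {
            newState = 1;
        } else if (state == 1 && evtType == 1) {
            newState = 0;
        } else {
            alerts.add(msg + " (state " + state + ")");   // unexpected event: alert and reset
            newState = 0;
        }
        stateByProducer.put(producer, newState);
    }

    public static void main(String[] args) {
        PatternDetector d = new PatternDetector();
        d.onEvent("a", 0, "m1");
        d.onEvent("a", 0, "m2");
        System.out.println(d.alerts);
    }
}
```

In Flink the state lives in the configured state backend and is included in checkpoints, which a local map does not provide.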

34 Many more
    - Joining streams (e.g., combine readings from sensors)
    - Detecting patterns (CEP)
    - Applying (changing) rules or models to events
    - Training and applying online machine learning models

35 (It's) About Time 35

36 The biggest change in moving from batch to streaming is handling time explicitly 36

37 Example: Windowing by Time
    case class Event(id: String, measure: Double, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

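39 Different Notions of Time
    [Figure: Event Producer -> Message Queue (partition 1, partition 2) -> Flink Data Source -> Flink Window Operator. Event Time is set at the event producer, Ingestion Time at the Flink data source, and Window Processing Time at the Flink window operator.]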

40 Event Time vs. Processing Time
    Event time (order in the story): Episode I, II, III, IV, V, VI, VII
    Processing time (order of release): Episode IV, V, VI, I, II, III, VII

41 Processing Time
    case class Event(id: String, measure: Double, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(ProcessingTime)

    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

    Window by operator's processing time

42 Ingestion Time
    case class Event(id: String, measure: Double, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(IngestionTime)

    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

43 Event Time
    case class Event(id: String, measure: Double, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(EventTime)

    val stream: DataStream[Event] = env.addSource(...)

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

44 Event Time (timestamps taken from the events, assumed ascending)
    Same program as slide 43, with timestamps assigned before windowing:

    val tsStream = stream.assignAscendingTimestamps(_.timestamp)

    tsStream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

45 Event Time (user-defined timestamps and watermarks)
    Same program as slide 43, with a custom timestamp and watermark generator:

    val tsStream = stream.assignTimestampsAndWatermarks(
      new MyTimestampsAndWatermarkGenerator())

    tsStream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

46 Watermarks
    [Figure: two event streams annotated with event timestamps and watermarks. Stream (in order): watermarks W(11) and W(20). Stream (out of order): watermarks W(11) and W(17).]
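
A watermark W(t) states that no more events with a timestamp of t or earlier are expected. One common way to generate watermarks for out-of-order streams is to trail the highest timestamp seen by a fixed bound. The plain-Java sketch below illustrates that strategy; it is not the generator used in the talk, and the class name, method names, and the bound are assumptions.

```java
// Plain-Java sketch of a bounded-out-of-orderness watermark strategy.
// Names and the fixed bound are illustrative assumptions.
class BoundedLatenessWatermarks {
    private final long maxOutOfOrderness;
    private long maxTimestampSeen = Long.MIN_VALUE;

    BoundedLatenessWatermarks(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Registers an event; returns true if it is late, i.e. its timestamp is
    // not after the watermark that was current when it arrived.
    boolean onEvent(long timestamp) {
        boolean late = timestamp <= currentWatermark();
        maxTimestampSeen = Math.max(maxTimestampSeen, timestamp);
        return late;
    }

    // Watermark = highest timestamp seen so far minus the allowed disorder.
    long currentWatermark() {
        return maxTimestampSeen == Long.MIN_VALUE
            ? Long.MIN_VALUE
            : maxTimestampSeen - maxOutOfOrderness;
    }

    public static void main(String[] args) {
        BoundedLatenessWatermarks w = new BoundedLatenessWatermarks(4);
        for (long ts : new long[] {7, 11, 9, 15, 10}) {
            boolean late = w.onEvent(ts);
            System.out.println("event " + ts + " late=" + late + " watermark=" + w.currentWatermark());
        }
    }
}
```

A larger bound tolerates more disorder but delays window results, since a window can only fire once the watermark passes its end.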

47 Watermarks in Parallel
    [Figure: two parallel pipelines, Source (1) -> map (1) -> window (1) and Source (2) -> map (2) -> window (2). Events are labeled "id | timestamp". Watermarks such as W(33) and W(17) are generated at the sources and flow with the events. Each operator tracks the event time at its input streams and derives the event time at the operator from them.]
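
An operator with several input streams can only advance its own event time as far as its slowest input allows. The plain-Java sketch below tracks the latest watermark per input channel and takes the minimum. The names are hypothetical and the sketch ignores details such as idle inputs.

```java
import java.util.Arrays;

// Plain-Java sketch: event time at an operator is the minimum of the latest
// watermarks received on all of its input channels. Names are illustrative.
class OperatorEventTime {
    private final long[] channelWatermarks;
    private long current = Long.MIN_VALUE;

    OperatorEventTime(int numChannels) {
        channelWatermarks = new long[numChannels];
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    // Records a watermark from one channel and returns the operator's event time.
    long onWatermark(int channel, long watermark) {
        // Watermarks never move backwards on a channel.
        channelWatermarks[channel] = Math.max(channelWatermarks[channel], watermark);
        long min = Arrays.stream(channelWatermarks).min().getAsLong();
        current = Math.max(current, min);
        return current;
    }

    public static void main(String[] args) {
        OperatorEventTime op = new OperatorEventTime(2);
        op.onWatermark(0, 33);
        System.out.println(op.onWatermark(1, 17));
    }
}
```

With the watermarks W(33) and W(17) from the figure, the operator's event time is 17 until the second input advances.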

48 Per Kafka Partition Watermarks
    [Figure: the two parallel pipelines of slide 47 (Source -> map -> window), with events from different Kafka partitions interleaved at each source and watermarks W(33) and W(17) generated per partition.]

49 Per Kafka Partition Watermarks
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(EventTime)

    val kafka = new FlinkKafkaConsumer09(topic, schema, props)
    kafka.assignTimestampsAndWatermarks(
      new MyTimestampsAndWatermarkGenerator())

    val stream: DataStream[Event] = env.addSource(kafka)

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

50 Matters of State (Fault Tolerance, Reinstatements, etc) 50

51 Back to the Aggregation Example
    case class Event(id: String, measure: Double, timestamp: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val stream: DataStream[Event] = env.addSource(
      new FlinkKafkaConsumer09(topic, schema, properties))

    stream
      .keyBy("id")
      .timeWindow(Time.seconds(15), Time.seconds(5))
      .sum("measure")

    Stateful

52 Fault Tolerance
    - Prevent data loss (reprocess lost in-flight events)
    - Recover state consistency (exactly-once semantics)
        - Pending windows & user-defined (key/value) state
    - Checkpoint-based fault tolerance
        - Periodically create checkpoints
        - Recovery: resume from last completed checkpoint
        - Asynchronous Barrier Snapshots (ABS) algorithm

53 Checkpoints
    [Figure: a data stream of events running from newer records to older records, with the state of the dataflow captured at point Y and at point X.]

54 Checkpoint Barriers Markers, injected into the streams 54

55 Checkpoint Procedure 55

56 Checkpoint Procedure 56
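
For an operator with more than one input, a barrier-based checkpoint requires aligning the inputs: once a barrier has arrived on one input, records from that input belong to the next checkpoint and are held back until the barrier has arrived on every input. The plain-Java sketch below illustrates this for a two-input summing operator. It is a simplified model under stated assumptions (single thread, in-memory buffer, the snapshot is just the current sum), and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Plain-Java sketch of barrier alignment for one checkpoint at a time.
class AligningSumOperator {
    private final boolean[] barrierSeen;
    private final Deque<Long> buffered = new ArrayDeque<>();
    private long sum = 0;
    final List<Long> snapshots = new ArrayList<>();

    AligningSumOperator(int numChannels) {
        barrierSeen = new boolean[numChannels];
    }

    void onRecord(int channel, long value) {
        if (barrierSeen[channel]) {
            buffered.add(value);   // arrived after the barrier: not part of this checkpoint
        } else {
            sum += value;
        }
    }

    void onBarrier(int channel) {
        barrierSeen[channel] = true;
        for (boolean seen : barrierSeen) {
            if (!seen) {
                return;            // still waiting for barriers on other channels
            }
        }
        snapshots.add(sum);        // all barriers in: snapshot the state
        Arrays.fill(barrierSeen, false);
        while (!buffered.isEmpty()) {
            sum += buffered.poll();   // now process the held-back records
        }
    }

    long currentSum() {
        return sum;
    }

    public static void main(String[] args) {
        AligningSumOperator op = new AligningSumOperator(2);
        op.onRecord(0, 1);
        op.onRecord(1, 2);
        op.onBarrier(0);
        op.onRecord(0, 10);   // held back until channel 1 delivers its barrier
        op.onRecord(1, 3);
        op.onBarrier(1);
        System.out.println("snapshot=" + op.snapshots + " sum=" + op.currentSum());
    }
}
```

The snapshot contains exactly the effect of the records that preceded the barrier on each input, which is what makes the checkpoint consistent across the dataflow.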

57 Savepoints
    - A "Checkpoint" is a globally consistent point-in-time snapshot of the streaming program (point in stream, state)
    - A "Savepoint" is a user-triggered retained checkpoint
    - Streaming programs can start from a savepoint
    [Figure: a stream with Savepoint A and Savepoint B]

58 (Re)processing data (in batch)
    - Re-processing data (what-if exploration, to correct bugs, etc.)
    - Usually by running a batch job with a set of old files
    - Tools that map files to times
    [Figure: collection of files labeled by hour, by ingestion time, fed to the batch processor]

59 Unclear Batch Boundaries
    - What about sessions across batches?
    [Figure: the same files fed to the batch processor, with question marks at a file boundary]

60 (Re)processing data (streaming)
    - Draw savepoints at times that you will want to start new jobs from (daily, hourly, ...)
    - Reprocess by starting a new job from a savepoint
        - Defines start position in stream (for example Kafka offsets)
        - Initializes pending state (like partial sessions)
    [Figure: run new streaming program from savepoint]

61 Continuous Data Sources
    - Stream of Kafka partitions. Savepoint: Kafka offsets + operator state
    - Stream view over sequence of files, WIP (Flink 1.1?). Savepoint: file mod timestamp + file position + operator state

62 Upgrading Programs
    - A program starting from a savepoint can differ from the program that created the savepoint
        - Unique operator names match state and operator
    - The mechanism can be used to fix bugs in programs, to evolve programs, parameters, libraries, ...
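
Matching state to operators by unique name can be pictured as a lookup. The plain-Java sketch below restores state for the operators of a changed program from a savepoint keyed by operator name. It is a toy model: state is a single number per operator, operators without a match start from zero, and what a real system does with unmatched savepoint state is not modeled. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch: restore operator state from a savepoint by operator name.
class SavepointRestore {
    static Map<String, Long> restore(Map<String, Long> savepointState, List<String> operatorNames) {
        Map<String, Long> restored = new LinkedHashMap<>();
        for (String name : operatorNames) {
            // Operators whose name is in the savepoint resume from it; new ones start empty.
            restored.put(name, savepointState.getOrDefault(name, 0L));
        }
        return restored;
    }

    public static void main(String[] args) {
        Map<String, Long> savepoint = new HashMap<>();
        savepoint.put("window-sum", 42L);
        savepoint.put("old-filter", 7L);
        // The upgraded program drops "old-filter" and adds "parse".
        System.out.println(restore(savepoint, Arrays.asList("parse", "window-sum")));
    }
}
```

Because the match is by name rather than by position in the dataflow, operators can be added, removed, or reordered between the program that wrote the savepoint and the one that reads it.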

63 State Backends
    - Large state is a collection of key/value pairs
    - State backend defines what data structure holds the state, plus how it is snapshotted
    - Most common choices
        - Main memory, snapshots to master
        - Main memory, snapshots to dist. filesystem
        - RocksDB, snapshots to dist. filesystem

64 Complex Event Processing Primer 64

65 Event Types 65

66 Defining Patterns 66

67 Generating Alerts 67

68 Latency and Throughput 68

69 Low Latency and High Throughput
    - Frequently thought to be mutually exclusive
        - Event-at-a-time: low latency, low throughput
        - Mini batch: high latency, high throughput
    - The above is not true!
    - Very little latency has to be sacrificed for very high throughput

70 Latency and Throughput 70

71 Latency and Throughput 71

72 The Effect of Buffering
    - Network stack does not always operate in event-at-a-time mode
    - Optional buffering adds some milliseconds of latency but increases throughput
    - No effect on application logic

73 An Outlook on Things to Come 73

74 Roadmap
    - Dynamic Scaling, Resource Elasticity
    - Stream SQL
    - CEP enhancements
    - Incremental & asynchronous state snapshotting
    - Mesos support
    - More connectors, end-to-end exactly once
    - API enhancements (e.g., joins, slowly changing inputs)
    - Security (data encryption, Kerberos with Kafka)

75 I stream*, do you? * beyond Netflix movies 75

76 Why does Flink stream flink?
    True Streaming: low latency, high throughput, well-behaved flow control (back pressure)
    Event Time: make more sense of data, works on real-time and historic data
    Stateful Streaming: windows & user-defined state, exactly-once semantics for fault tolerance, globally consistent savepoints
    APIs & Libraries: flexible windows (time, count, session, roll-your own), Complex Event Processing

77 Addendum 77

78 On a technical level
    Decouple all things
    - Clocks
        - Wall clock time (processing time)
        - Event time (watermarks & punctuations)
        - Consistency clock (logical checkpoint timestamps)
    - Buffering
        - Windows (application logic)
        - Network (throughput tuning)

79 Decoupling clocks 79

80 Stream Alignment 80

85 High Availability Checkpoints
    Components: Client, JobManager, TaskManagers, Apache ZooKeeper
    1. Take snapshots
    2. Persist snapshots
    3. Send handles to JM
    4. Create global checkpoint
    5. Persist global checkpoint
    6. Write handle to ZooKeeper
