Research challenges in data-intensive computing The Stratosphere Project Apache Flink

1 Research challenges in data-intensive computing: The Stratosphere Project (Apache Flink). Seif Haridi, KTH/SICS, e2e-clouds.org. Presented by: Seif Haridi, May 2014

2 Research Areas: Data-intensive computing, Multi-Clouds, Big Data. Ericsson Internal Page 2

3 Talk Outline: Overview of Big Data (big data is here to stay and its importance is increasing). The Stratosphere data-analytics platform (Apache Flink).

4 What is Big Data? Small Data Big Data

5 What is Big Data? Big Data refers to datasets and data flows so large that they have outpaced our capability to store, process, analyze, and understand them.

6 Why is Big Data Important in Science? In a wide array of academic fields, the ability to effectively process data is superseding other, more classical modes of research. "More data trumps better algorithms"*: the more data your models have from which to learn, the more accurate they become, even if they weren't cutting-edge to begin with. "In speech recognition research, increasing the model size by two orders of magnitude reduces the [word error rate] by 10% relative."
* The Unreasonable Effectiveness of Data [Halevy et al. 09]

7 Big Data means Parallelization Read genome on 100 machines: ~10 seconds

8 Big Data Processing with No Data Locality. Job("/genomes/jim.bam") is submitted to a Workflow Manager, which runs the job on a compute grid node. This doesn't scale: bandwidth is the bottleneck.

9 MapReduce Data Locality. [Diagram: Job("/genomes/jim.bam") is submitted to the Job Tracker, which hands tasks to six Task Trackers; each Task Tracker runs its part of the job next to a DataNode (DN). R = result file(s).]

10 Hadoop 2.x
Hadoop 1.x, single processing framework (batch apps): MapReduce (resource mgmt, job scheduler, data processing) on top of HDFS (distributed storage).
Hadoop 2.x, multiple processing frameworks (batch, interactive, streaming): MapReduce (data processing) and others (Spark, MPI, Giraph, etc.) on top of YARN/Mesos (resource mgmt, job scheduler) on top of HDFS (distributed storage).

11 Open Source Communities

12 New Data Processing Frameworks
    val input = TextFile(textInput)
    val words = input.flatMap { line => line.split(" ") }
    val counts = words.groupBy { word => word }.count()
    val output = counts.write(wordsOutput, RecordDataSinkFormat())
    val plan = new ScalaPlan(Seq(output))

13 Stratosphere. [Stack diagram: SQL, Streaming, Graphs, ML, and high-level languages on top; MapReduce, Stratosphere, and Spark as processing engines; Mesos / YARN below them; HDFS at the bottom.]

14 What is Stratosphere? An efficient, distributed, general-purpose data analysis platform. Built on top of HDFS and YARN. Focusing on ease of programming.

15 Project status. Research project started in 2009 by TU Berlin and HU Berlin, joined by SICS. Now a growing open source project with first industrial installations. Apache Incubator. v0.4: stable and documented; v0.5: beta status.

16 Introducing Stratosphere: General Purpose Data Analytics Platform.
Database technology: declarativity for SQL, optimizer, efficient runtime.
MapReduce-style technology: scalability, user-defined functions (UDFs), complex data types, schema on read.
Stratosphere: iterations, advanced dataflows, declarativity.

17 Stratosphere Stack
APIs and front ends: Java API, Scala API, Spargel (graphs), Meteor (scripting), SQL, Python, Hive, Hadoop MapReduce, ...
Core: Stratosphere Optimizer, Stratosphere Runtime.
Cluster manager: Direct, YARN, EC2.
Storage: Local Files, HDFS, S3, JDBC, ...

18 Key Features
Easy to use developer APIs: Java, Scala, Graphs, Nested Data (Python and SQL under development); flexible composition of large programs.
High-performance runtime: complex DAGs of operators; in-memory and out-of-core; data streamed between operations.
Automatic optimization: join algorithms; operator chaining; reusing partitioning/sorting.
Native iterations: embedded in the APIs; data streaming / in-memory; delta iterations speed up many programs by orders of magnitude.

19 Programming Model. A program is expressed as an arbitrary data flow consisting of transformations, sources, and sinks. [Diagram: two Source → Map branches feeding Reduce, Iterate, Join, and Reduce operators, ending in a Sink.]

20 Transformations Higher-order functions that execute user-defined functions in parallel on the input data.

21 Concise & rich APIs
Basic operators: Map, Reduce, Join, CoGroup, Union, Cross, Iterate, IterateDelta.
Derived operators: Filter, FlatMap, Project; Aggregate, Distinct; Outer-Join, Inner-Join; vertex-centric graph computation (Pregel style); ...

22 Basic data operators: Map, Reduce, Cross, Match, CoGroup

23 Transformations: Map. All pairs are processed independently.
    val input: DataSet[(Int, String)] = ...
    val mapped = input.flatMap { case (value, words) => words.split(" ") }

25 Concise & rich APIs: Word Count in the Stratosphere Scala API
    // Data source
    val input = TextFile(textInput)
    // Transformations
    val words = input.flatMap { line => line.split(" ").map { word => (word, 1) } }
    val counts = words.groupBy { case (word, _) => word }
      .reduce { (w1, w2) => (w1._1, w1._2 + w2._2) }
    // Data sink
    val output = counts.write(wordsOutput, CsvOutputFormat())
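The same source → flatMap → groupBy → reduce pipeline can be tried without a cluster. The block below is a minimal sketch on plain Scala collections, not the Stratosphere API; the object name `WordCountSketch` and the method `wordCount` are made up for illustration.

```scala
// Plain-Scala sketch of the word-count dataflow (no Stratosphere dependency).
// WordCountSketch and wordCount are hypothetical names used only for this example.
object WordCountSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] = {
    // flatMap: split each line into (word, 1) pairs, skipping empty tokens.
    val words: Seq[(String, Int)] =
      lines.flatMap { line => line.split(" ").filter(_.nonEmpty).map { word => (word, 1) } }
    // groupBy + reduce: sum the counts of all pairs that share a word.
    words
      .groupBy { case (word, _) => word }
      .map { case (_, pairs) => pairs.reduce { (w1, w2) => (w1._1, w1._2 + w2._2) } }
  }
}
```

Calling `WordCountSketch.wordCount(Seq("to be or", "not to be"))` groups the six tokens into four words and sums their counts locally; in the distributed API the grouping step is where data is repartitioned by key.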

26 Job graphs to execution graphs

27 Joins in Stratosphere
    val large = env.readCsv(...)
    val medium = env.readCsv(...)
    val small = env.readCsv(...)
    joined1 = large.join(medium).where(_._3).isEqualTo(_._1).map { (left, right) => ... }
    joined2 = small.join(joined1).where(0).equals(2).map { (left, right) => ... }
    result = joined2.groupBy { _._3 }.reduceGroup { el => el.maxBy { _._2 } }
[Plan diagram: an aggregation (γ) over the join of small with the join of large and medium.]
Built-in strategies include partitioned join and replicated join with local sort-merge or hybrid-hash algorithms.

28 Automatic Optimization
    DataSet<Tuple...> large = env.readCsv(...);
    DataSet<Tuple...> medium = env.readCsv(...);
    DataSet<Tuple...> small = env.readCsv(...);
    DataSet<Tuple...> joined1 = large.join(medium).where(3).equals(1).with(new JoinFunction() {... });
    DataSet<Tuple...> joined2 = small.join(joined1).where(0).equals(2).with(new JoinFunction() {... });
    DataSet<Tuple...> result = joined2.groupBy(3).aggregate(max, 2);
Possible execution:
1) Partitioned hash-join (partitioned, reduce-side)
2) Broadcast hash-join (broadcast, map-side)
3) Grouping/aggregation reuses the partitioning from step (1): no shuffle.
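To make the difference between the two join strategies concrete, here is a small sketch on plain Scala collections. It is not the optimizer's implementation: `JoinStrategiesSketch`, `Row`, and all method names are hypothetical, and "partitions" are just in-memory sequences standing in for workers.

```scala
// Sketch of the two strategies on in-memory data; all names are illustrative assumptions.
object JoinStrategiesSketch {
  type Row = (Int, String) // (join key, payload)

  // Local hash join: build a hash table on one side, probe it with the other.
  def localHashJoin(probe: Seq[Row], build: Seq[Row]): Seq[(Int, String, String)] = {
    val table: Map[Int, Seq[String]] =
      build.groupBy(_._1).map { case (k, rows) => k -> rows.map(_._2) }
    probe.flatMap { case (k, p) => table.getOrElse(k, Seq.empty).map(b => (k, p, b)) }
  }

  // Broadcast hash join: the large input stays where it is (already split into
  // partitions); the whole small input is replicated to every partition.
  def broadcastHashJoin(largePartitions: Seq[Seq[Row]], small: Seq[Row]): Seq[(Int, String, String)] =
    largePartitions.flatMap(partition => localHashJoin(partition, small))

  // Partitioned hash join: both inputs are hash-partitioned on the key so that
  // matching keys meet in the same partition, then each partition joins locally.
  def partitionedHashJoin(left: Seq[Row], right: Seq[Row], partitions: Int): Seq[(Int, String, String)] = {
    def shuffle(rows: Seq[Row]): Map[Int, Seq[Row]] =
      rows.groupBy { case (k, _) => Math.floorMod(k.hashCode, partitions) }
    val (l, r) = (shuffle(left), shuffle(right))
    (0 until partitions).flatMap { p =>
      localHashJoin(l.getOrElse(p, Seq.empty), r.getOrElse(p, Seq.empty))
    }
  }
}
```

Both functions return the same rows; they differ in what moves. The broadcast variant ships only the small side, while the partitioned variant moves both sides but leaves the output partitioned by key, which is what lets a later grouping on that key skip another shuffle.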

29 Distributed Runtime. The master (Job Manager) handles job submission, scheduling, and metadata. Workers (Task Managers) execute operations. Data can be streamed between nodes. All operators start in-memory and gradually go out-of-core.

30 Fault Tolerance. Similar to Spark: tracks execution history to rebuild on failure by recomputation.
    file.map(rec => (rec.type, 1)).reduce(_ + _).filter((type, count) => count > 10)
[Diagram: input file → map → reduce → filter]
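Recovery by recomputation can be illustrated with a toy model in which a dataset is described by the operations that produce it rather than by its contents. This is only a sketch of the idea, not how Stratosphere or Spark implement it; `LineageSketch`, `Cached`, and `simulateFailure` are hypothetical names, and only map and filter are modelled.

```scala
// Toy lineage model: lost results are rebuilt by replaying the recorded operations.
// Every name here is an assumption made for illustration.
object LineageSketch {
  sealed trait Lineage[A] { def compute(): Seq[A] }

  final case class Source[A](read: () => Seq[A]) extends Lineage[A] {
    def compute(): Seq[A] = read()
  }
  final case class Mapped[A, B](parent: Lineage[A], f: A => B) extends Lineage[B] {
    def compute(): Seq[B] = parent.compute().map(f)
  }
  final case class Filtered[A](parent: Lineage[A], p: A => Boolean) extends Lineage[A] {
    def compute(): Seq[A] = parent.compute().filter(p)
  }

  // Holds a materialized result that may be lost; get() recomputes it on demand.
  final class Cached[A](lineage: Lineage[A]) {
    private var data: Option[Seq[A]] = None
    var recomputations = 0

    def get(): Seq[A] = data.getOrElse {
      recomputations += 1
      val fresh = lineage.compute()
      data = Some(fresh)
      fresh
    }

    // Drops the materialized result, as if the node holding it had failed.
    def simulateFailure(): Unit = data = None
  }
}
```

After `simulateFailure()`, the next `get()` walks the lineage back to the source and replays map and filter, so the result is identical as long as the source can be re-read and the functions are deterministic.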

32 Runtime Architecture Comparison
Example object: public class WC { public String word; public int count; }
Pool of Memory Pages: works on pages of bytes; maps objects transparently to these pages; full control over memory, out-of-core enabled; algorithms work on the binary representation; individual fields can be addressed (no need to deserialize the whole object).
Distributed Collection (List[WC]): collections of objects; general-purpose serializer (Java / Kryo); limited control over memory and less efficient spilling; deserialization is all or nothing.
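The "address individual fields" point can be shown with a `java.nio.ByteBuffer` standing in for a memory page. The record layout below (count, word length, UTF-8 bytes) is an assumption chosen for this sketch, not Stratosphere's actual binary format, and `BinaryRecordSketch` is a hypothetical name.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Sketch: WC records stored as bytes in a page.
// Assumed layout per record: [count: 4 bytes][wordLength: 4 bytes][word: UTF-8 bytes]
object BinaryRecordSketch {
  // Appends a record at the page's current position and returns its start offset.
  def write(page: ByteBuffer, word: String, count: Int): Int = {
    val offset = page.position()
    val bytes = word.getBytes(StandardCharsets.UTF_8)
    page.putInt(count).putInt(bytes.length).put(bytes)
    offset
  }

  // Reads only the count field with an absolute get; the word is never decoded.
  def readCount(page: ByteBuffer, offset: Int): Int = page.getInt(offset)

  // Decodes the word, for comparison with the field-only access above.
  def readWord(page: ByteBuffer, offset: Int): String = {
    val length = page.getInt(offset + 4)
    val bytes = new Array[Byte](length)
    val view = page.duplicate() // independent position, shared content
    view.position(offset + 8)
    view.get(bytes)
    new String(bytes, StandardCharsets.UTF_8)
  }
}
```

An aggregation over `count` can then run on `readCount` alone, without allocating a `WC` object or a `String` per record, which is the contrast the slide draws with collections of deserialized objects.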

33 Iterative Programs. [Stack diagram: SQL, Streaming, Graphs, ML, and high-level languages on top; MapReduce, Stratosphere, and Spark as processing engines; Mesos / YARN below them; HDFS at the bottom.]

34 Why Iterative Algorithms
Algorithms that need iterations: clustering (K-Means, Canopy, ...); gradient descent (e.g., Logistic Regression, Matrix Factorization); graph algorithms (e.g., PageRank, Line-Rank, components, paths, reachability, centrality, ...); graph communities / dense sub-components; inference (belief propagation).
The loop makes multiple passes over the data.

35 Iterations in other systems: the loop runs in the client, outside the system. [Two diagrams, each showing a client driving a sequence of five steps.]

36 Iterations in Stratosphere: streaming dataflow with feedback. The system is iteration-aware and performs automatic optimization. [Diagram: map, join, join, and reduce operators with a feedback edge.]

37 Iteration. Two types of iteration in Stratosphere: bulk iteration and delta iteration. Both operators repeatedly invoke the step function on the current iteration state until a termination condition is reached.

38 Iteration: Bulk Iteration. In each iteration, the step function consumes the entire input (the result of the previous iteration, or the initial data set) and computes the next version of the partial solution. A new version of the entire model is produced in each iteration.
    val input: DataSet[Int] = ...
    def step(partial: DataSet[Int]) = {
      val nextPartial = partial.map { a => a + 1 }
      nextPartial
    }
    val numIter = 10
    val iter = input.iterate(numIter, step)
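The semantics of a bulk iteration amount to applying the step function to the whole partial solution a fixed number of times. A minimal sketch on plain Scala collections, with the hypothetical names `BulkIterationSketch` and `iterate`:

```scala
// Bulk iteration on an in-memory Seq: each pass consumes the entire previous result.
// Names are illustrative; this is not the Stratosphere iterate operator.
object BulkIterationSketch {
  def iterate[A](initial: Seq[A], numIter: Int)(step: Seq[A] => Seq[A]): Seq[A] =
    (1 to numIter).foldLeft(initial) { (partial, _) => step(partial) }
}
```

With the slide's step function, `BulkIterationSketch.iterate(Seq(1, 2, 3), 10)(_.map(_ + 1))` adds 1 to every element ten times. Every element is touched in every pass, even those that have stopped changing, which is the cost the delta iteration avoids.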

39 Iteration: Delta Iteration. Only parts of the model change in each iteration.
    val input: DataSet[(Int, Int)] = ...
    val initWSet: DataSet[(Int, Int)] = ...
    val initSSet: DataSet[(Int, Int)] = ...
    def step(ss: DataSet[Int], ws: DataSet[Int]) = {
      val delta = ...
      val nextWorkset = ...
    }
    val numIter = 10
    val iter = input.iterateWithWSet( )
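The solution-set / workset split is easiest to see on connected components by minimum-label propagation. The sketch below runs on plain Scala collections and assumes undirected edges whose endpoints all appear in the vertex set; `DeltaIterationSketch` and `connectedComponents` are hypothetical names, not part of the Stratosphere API.

```scala
// Delta iteration sketch: only vertices whose label changed are processed in the next pass.
// All names are assumptions made for illustration.
object DeltaIterationSketch {
  def connectedComponents(vertices: Set[Int], edges: Set[(Int, Int)], maxIter: Int): Map[Int, Int] = {
    // Undirected adjacency lists (loop-invariant data, built once).
    val neighbors: Map[Int, Set[Int]] =
      (edges ++ edges.map(_.swap)).groupBy(_._1).map { case (v, es) => v -> es.map(_._2) }

    // Solution set: current component id per vertex, initially the vertex's own id.
    var solution: Map[Int, Int] = vertices.map(v => v -> v).toMap
    // Workset: vertices whose label changed in the previous pass (initially all).
    var workset: Map[Int, Int] = solution
    var iter = 0

    while (workset.nonEmpty && iter < maxIter) {
      // Changed vertices propose their label to each neighbor; keep the minimum proposal.
      val candidates: Map[Int, Int] = workset.toSeq
        .flatMap { case (v, label) => neighbors.getOrElse(v, Set.empty[Int]).toSeq.map(n => n -> label) }
        .groupBy(_._1)
        .map { case (n, proposals) => n -> proposals.map(_._2).min }
      // Delta: only proposals that improve on the current solution.
      val delta = candidates.filter { case (n, label) => solution.get(n).exists(label < _) }
      solution = solution ++ delta
      workset = delta
      iter += 1
    }
    solution
  }
}
```

On the graph with edges (1,2), (2,3), (4,5), the workset shrinks from five vertices to three, then one, then none, so later passes touch only the part of the graph that is still changing.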

40 Iteration: Delta Iteration. Connected Components

42 Automatic Optimization for Iterative Programs: pushing work out of the loop; caching loop-invariant data; maintaining state as an index.

43 Delta iterations speed up certain problems by a lot. They cover typical use cases of Pregel-like systems with comparable performance in a generic platform and developer API.
[Chart: computations performed in each iteration for connected communities of a social graph; y-axis: # vertices (thousands); series: Bulk, Delta.]
[Chart: runtime (secs) on the Twitter and Webbase (20) datasets.]

44 Thank you! Multi-Clouds, Big Data
