Streaming Auto-Scaling in Google Cloud Dataflow

Size: px

Start display at page:

Download "Streaming Auto-Scaling in Google Cloud Dataflow"

Stephany Morton
6 years ago
Views:

1 Streaming Auto-Scaling in Google Cloud Dataflow Manuel Fahndrich Software Engineer Google

2 Addictive Mobile Game

3 Individual Ranking Team Ranking Sarah Joe Milo 151, ,903 98,736 1,251,965 1,019, ,673 Hourly Ranking Daily Ranking

4 An Unbounded Stream of Game Events 8:00 1:00 9:00 2:00 10:00 3:00 11:00 4:00 12:00 5:00 13:00 6:00 14:00 7:00

5 with unknown delays. 8:00 8:00 8:00 8:00 9:00 10:00 11:00 12:00 13:00 14:00

6 The Resource Allocation Problem resources over-provisioned resources workload time resources workload under-provisioned resources time

7 Matching Resources to Workload resources auto-tuned resources workload time

8 Resources = Parallelism parallelism auto-tuned parallelism workload time More generally: VMs (including CPU, RAM, network, IO).

9 Assumptions Big Data Problem Embarrassingly Parallel Scaling VMs ==> Scales Throughput Horizontal Scaling

10 Agenda 1 Streaming Dataflow Pipelines 2 Pipeline Execution 3 Adjusting Parallelism Automatically 4 Summary + Future Work

11 1 Streaming Dataflow

12 Google s Data-Related Systems Dataflow MapReduce Dremel FlumeJava GFS Big Table Pregel Spanner Colossus MillWheel

13 Google Dataflow SDK Open Source SDK used to construct a Dataflow pipeline. (Now Incubating as Apache Beam)

14 Computing Team Scores // Collection of raw log lines PCollection<String> raw =...; // Element-wise transformation into team/score // pairs PCollection<KV<String, Integer>> input = raw.apply(pardo.of(new ParseFn())) // Composite transformation containing an // aggregation PCollection<KV<String, Integer>> output = input.apply(window.into(fixedwindows.of(minutes(60)))).apply(sum.integersperkey());

15 Google Cloud Dataflow Given code in Dataflow (incubating as Apache Beam) SDK... Pipelines can run On your development machine On the Dataflow Service on Google Cloud Platform On third party environments like Spark or Flink.

16 Google Cloud Dataflow Cloud Dataflow A fully-managed cloud service and programming model for batch and streaming big data processing.

17 Google Cloud Dataflow Optimize Schedule GCS GCS

18 Back to the Problem at Hand parallelism auto-tuned parallelism workload time

19 Auto-Tuning Ingredients Signals measuring Workload Policy making Decisions Mechanism actuating Change

20 2 Pipeline Execution

21 Optimized Pipeline = DAG of Stages raw input Individual points S1 S0 team points S2

22 Stage Throughput Measure throughput raw input throughput S0 Individual points S1 team points S2 throughput

23 Picture by Alexandre Duret-Lutz, Creative Commons 2.0 Generic

24 Queues of Data Ready for Processing S1 S0 S2 Queue Size = Backlog

25 Backlog Size vs. Backlog Growth

26 Backlog Growth = Processing Deficit

27 Derived Signal: Stage Input Rate throughput backlog growth S1 Input Rate = Throughput + Backlog Growth

28 Constant Backlog......could be bad

29 Backlog Time = Backlog Size Throughput

30 Backlog Time = Time to get through backlog

31 Bad Backlog = Long Backlog Time

32 Backlog Growth and Backlog Time Inform Upscaling. What Signals indicate Downscaling?

33 Low CPU Utilization

34 Signals Summary Throughput Backlog growth Backlog time CPU utilization

35 Policy: making Decisions Goals: 1. No backlog growth 2. Short backlog time 3. Reasonable CPU utilization

36 Upscaling Policy: Keeping Up Given M machines For a stage, given: average stage throughput T average positive backlog growth G of stage Machines needed for stage to keep up: M = M (T + G) T

37 Upscaling Policy: Catching Up Given M machines Given R (time to reduce backlog) For a stage, given: average backlog time B Extra machines to remove backlog: Extra = M B R

38 Upscaling Policy: All Stages Want all stages to: 1. keep up 2. have log backlog time Pick Maximum over all stages of M + Extra

39 MB/s seconds Example (signals) input rate throughput backlog growth backlog time

40 MB/s seconds Example (signals) input rate throughput backlog growth backlog time

41 MB/s seconds Example (signals) input rate throughput backlog growth backlog time

42 MB/s seconds Example (signals) input rate throughput backlog growth backlog time

43 machines Example (policy) M M Extra R=60s

44 machines Example (policy) M M Extra R=60s

45 machines Example (policy) M M Extra R=60s

46 machines Example (policy) M M Extra R=60s

47 Preconditions for Downscaling Low backlog time No backlog growth Low CPU utilization

48 How far can we downscale? Stay tuned...

49 Mechanism: actuating Change 3 Adjusting Parallelism of a Running Streaming Pipeline

50 Optimized Pipeline = DAG of Stages S1 S0 S2

51 Optimized Pipeline = DAG of Stages S1 S0 S2

52 Optimized Pipeline = DAG of Stages Machine 0 S1 S0 S2

53 Adding Parallelism S1 Machine 0 S0 S0 S1 S2 S1 S2 S0 S2

54 Adding Parallelism S1 Machine 0 S0 S2 S1 Machine 1 S0 S2

55 Adding Parallelism = Splitting Key Ranges S1 Machine 0 S0 S2 S1 Machine 1 S0 S2

56 Migrating a Computation

57 Adding Parallelism = Migrating Computation Ranges S1 Machine 0 S0 S2 S1 Machine 1 S0 S2

58 Checkpoint and Recovery ~ Computation Migration

59 Key Ranges and Persistence Machine 0 S1 Machine 2 S1 S0 S0 S2 S2 Machine 1 S1 Machine 3 S1 S0 S0 S2 S2

60 Downscaling from 4 to 2 Machines Machine 0 S1 Machine 2 S1 S0 S0 S2 S2 Machine 1 S1 Machine 3 S1 S0 S0 S2 S2

61 Downscaling from 4 to 2 Machines Machine 0 S1 Machine 2 S1 S0 S0 S2 S2 Machine 1 S1 Machine 3 S1 S0 S0 S2 S2

62 Downscaling from 4 to 2 Machines Machine 0 S1 S0 S2 Machine 1 S1 S0 S2

63 Downscaling from 4 to 2 Machines Machine 1 S1 S0 S2 Upsizing = Steps in Reverse Machine 2 S1 S0 S2

64 Granularity of Parallelism As of March 2016, Google Cloud Dataflow: Splits Key Ranges initially Based on Max Machines At Max: 1 Logical Persistent Disk per Machine Each disk has slice of key ranges from all stages Only (relatively) even Disk Distributions Results in Scaling Quanta

65 Example Scaling Quanta: Max = 60 Machines Parallelism Disk per Machine 3 N/A , 9 8 7, 8 9 6,

66 Policy: making Decisions Goals: 1. No backlog growth 2. Short backlog time 3. Reasonable CPU utilization

67 Preconditions for Downscaling Low backlog time No backlog growth Low CPU utilization

68 Downscaling Policy Next lower scaling quanta => M machines Estimate future CPU M per machine: CPU M = M M CPU M If new CPU M < threshold (say 90%), downscale to M

69 4 Summary + Future Work

70 Artificial Experiment

71 Auto-Scaling Summary Signals: throughput, backlog time, backlog growth, CPU utilization Policy: keep up, reduce backlog, use CPUs Mechanism: split key ranges, migrate computations

72 Future Work Experiment with non-uniform disk distributions to address hot ranges Dynamically splitting ranges finer than initially done. Approximate model of #VM - throughput relation

73 Questions? Further reading on streaming model: The world beyond batch: Streaming 101 The world beyond batch: Streaming 102

Google Cloud Dataflow

Google Cloud Dataflow A Unified Model for Batch and Streaming Data Processing Jelena Pjesivac-Grbovic STREAM 2015 Agenda 1 Data Shapes 2 Data Processing Tradeoffs 3 Google s Data Processing Story 4 Google