Sparrow. Distributed Low-Latency Spark Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica


Outline: the Spark scheduling bottleneck; Sparrow's fully distributed, fault-tolerant technique; Sparrow's near-optimal performance

Spark Today: User 1, User 2, and User 3 share one Spark Context, which handles query compilation, storage, and scheduling

Job latencies rapidly decreasing [timeline figure; latency axis marked 10 min., 10 sec., 100 ms, 1 ms]: 2004: MapReduce batch job; 2009: Hive query; 2010: Dremel query; 2010: in-memory Spark query; 2012: Impala query; 2013: Spark Streaming

Job latencies rapidly decreasing + Spark deployments growing in size = scheduling bottleneck!

Spark scheduler throughput: 1500 tasks / second

Task duration    Cluster size (# 16-core machines)
10 seconds       1000
1 second         100
100 ms           10
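The cluster sizes in the table follow from simple arithmetic on the 1500 tasks/second figure. The sketch below reproduces that arithmetic, assuming one task slot per core and that every core is kept busy; the function name max_machines is made up for this illustration and is not part of Spark.

```python
SCHEDULER_THROUGHPUT = 1500  # tasks launched per second (from the slide)
CORES_PER_MACHINE = 16       # 16-core machines (from the slide)


def max_machines(task_duration_s,
                 throughput=SCHEDULER_THROUGHPUT,
                 cores_per_machine=CORES_PER_MACHINE):
    """Largest cluster one scheduler can keep fully busy.

    Assumption: one task per core. A core running tasks of length
    task_duration_s needs a new task every task_duration_s seconds, so a
    scheduler launching `throughput` tasks per second can feed
    throughput * task_duration_s cores.
    """
    busy_cores = throughput * task_duration_s
    return busy_cores / cores_per_machine


for duration in (10, 1, 0.1):
    print(f"{duration} s tasks: about {max_machines(duration):.1f} machines")
```

The results are roughly 940, 94, and 9 machines, which the slide rounds to 1000, 100, and 10.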

Optimizing the Spark scheduler: in 0.8, monitoring code moved off the critical path; in 0.8.1, result deserialization moved off the critical path. Future improvements may yield 2-3x higher throughput.

Is the scheduler the bottleneck in my cluster? tinyurl.com/sparkdemo

[Demo diagram labels: task launch, task launch delay, cluster, task completion; tinyurl.com/sparkdemo]

Spark Today: User 1, User 2, and User 3 share one Spark Context, which handles query compilation, storage, and scheduling

Future Spark: User 1, User 2, and User 3 each get their own query compilation. Benefits: high throughput, fault tolerance.

Future Spark: User 1, User 2, and User 3 each get their own query compilation. Storage: Tachyon.

Scheduling with Sparrow [diagram: a stage]

Batch Sampling: the stage sends 4 probes (d = 2). Place m tasks on the least loaded of 2m workers (d·m with d = 2).
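Batch sampling as stated on the slide can be sketched in a few lines. This is a minimal illustration, not Sparrow's code: the function name batch_sample and the list of queue lengths are assumptions, and real probes are RPCs to workers rather than list lookups.

```python
import random


def batch_sample(queue_lengths, m, d=2, rng=random):
    """Place m tasks on the least loaded of d*m randomly probed workers.

    queue_lengths: list where queue_lengths[w] is worker w's queue length
                   (hypothetical stand-in for a probe reply).
    Returns (chosen, probed): the m selected workers and all probed workers.
    """
    num_probes = min(d * m, len(queue_lengths))
    # Probe d*m distinct workers chosen uniformly at random.
    probed = rng.sample(range(len(queue_lengths)), num_probes)
    # Sort the probed workers by reported load, shortest queue first.
    probed.sort(key=lambda w: queue_lengths[w])
    return probed[:m], probed


# A stage with m = 2 tasks sends 4 probes (d = 2), as on the slide.
loads = [3, 0, 5, 1, 2, 4, 0, 6]
chosen, probed = batch_sample(loads, m=2, d=2, rng=random.Random(0))
print("probed:", probed, "chosen:", chosen)
```

Sharing the d·m probes across the whole stage, rather than probing d workers per task, lets one task use a lightly loaded worker that another task's probes happened to find.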

Queue length is a poor predictor of wait time (wait times shown in the figure: 80 ms, 155 ms, 530 ms), which gives poor performance on heterogeneous workloads.

Late Binding: the stage sends 4 probes (d = 2), and a worker requests a task only when it is ready to run one. Place m tasks on the least loaded of d·m workers.
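A small simulation can make late binding concrete. The sketch below is an illustration under stated assumptions, not Sparrow's implementation: the function name late_binding, the worker names, and the 300 ms value are made up, while 80, 155, and 530 ms are the wait times from the earlier slide. Each probed worker asks for a task when its reservation reaches the front of its queue, and the first m requests receive the stage's m tasks.

```python
import heapq
from collections import deque


def late_binding(ready_times, tasks):
    """Assign tasks to whichever probed workers become ready first.

    ready_times: dict worker -> time (ms) at which that worker reaches the
                 reservation and requests a task. The scheduler does not know
                 these times in advance; it only reacts to requests.
    tasks:       the stage's m tasks.
    Returns dict worker -> task. Workers that ask after all tasks are gone
    get nothing (a "no task" reply in a real system).
    """
    requests = [(t, w) for w, t in ready_times.items()]
    heapq.heapify(requests)          # requests arrive in time order
    pending = deque(tasks)
    assignment = {}
    while requests and pending:
        _, worker = heapq.heappop(requests)
        assignment[worker] = pending.popleft()
    return assignment


# m = 2 tasks, 4 reservations (d = 2).
ready = {"w1": 80, "w2": 530, "w3": 155, "w4": 300}
assignment = late_binding(ready, ["task-a", "task-b"])
print(assignment)
```

The tasks land on w1 and w3, the workers with the shortest actual waits, without the scheduler ever having to estimate wait time from queue length.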

What about constraints?

Per-Task Constraints: probe separately for each task
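Probing separately for each task might look like the sketch below. It is a simplified illustration: the function name probe_with_constraints and the data layout are assumptions, and for brevity it picks the shortest reported queue among a task's probes instead of combining the probes with late binding.

```python
import random


def probe_with_constraints(task_constraints, queue_lengths, d=2, rng=random):
    """Probe d workers per task, drawn only from that task's allowed workers.

    task_constraints: dict task -> set of workers the task may run on
                      (for example, the machines holding its input data).
    queue_lengths:    dict worker -> current queue length (hypothetical
                      stand-in for a probe reply).
    Returns dict task -> chosen worker.
    """
    placement = {}
    for task, allowed in task_constraints.items():
        candidates = sorted(allowed)
        probed = rng.sample(candidates, min(d, len(candidates)))
        # Choose the least loaded worker among this task's own probes.
        placement[task] = min(probed, key=lambda w: queue_lengths[w])
    return placement


queues = {"a": 3, "b": 1, "c": 0, "d": 2}
constraints = {"t1": {"a", "b"}, "t2": {"c", "d"}, "t3": {"a"}}
placement = probe_with_constraints(constraints, queues, rng=random.Random(0))
print(placement)
```

Because each task's probes come from its own candidate set, probes cannot be shared across the stage the way batch sampling shares them for unconstrained tasks.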

Technique Recap: batch sampling + late binding + constraints

How well does Sparrow perform?

How does Sparrow compare to Spark's native scheduler? [Figure: response time (ms) versus task duration (ms), both axes 0 to 6000, for the Spark native scheduler, Sparrow, and Ideal.] Setup: 100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load.

TPC-H Queries: Background. TPC-H: common benchmark for analytics workloads. Shark: SQL execution engine. [Diagram layers: Spark, Sparrow]

TPC-H Queries [Figure: response time (ms), 0 to 4000, for queries q3, q4, q6, and q12 under Random, Sparrow, and Ideal placement; bars show the 5th, 25th, 50th, 75th, and 95th percentiles, with off-scale bars labeled by their medians.] Setup: 100 16-core EC2 nodes, 10 schedulers, 80% load. Sparrow is within 12% of ideal, with a median queuing delay of 9 ms.

Policy Enforcement. Priorities: serve queues based on strict priorities (diagram: High Priority and Low Priority queues). Fair Shares: serve queues using weighted fair queuing (diagram: User A (75%), User B (25%)).
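Both policies come down to how a worker picks the next queue to serve. The sketch below is a simplified illustration, not Sparrow's code: the function names are made up, and the fair-share picker tracks served-task counts per weight rather than implementing full weighted fair queuing with virtual finish times.

```python
from collections import deque


def pick_strict_priority(queues):
    """queues: dict priority -> deque of tasks, where 0 is the highest priority.
    Always serve the highest-priority queue that has work."""
    for priority in sorted(queues):
        if queues[priority]:
            return queues[priority].popleft()
    return None


def pick_weighted_fair(queues, weights, served):
    """Serve the backlogged user that is furthest below its weighted share.

    queues:  dict user -> deque of tasks
    weights: dict user -> share (for example 75 and 25)
    served:  dict user -> number of tasks served so far (updated in place)
    """
    backlogged = [u for u in queues if queues[u]]
    if not backlogged:
        return None
    user = min(backlogged, key=lambda u: served[u] / weights[u])
    served[user] += 1
    return queues[user].popleft()


# Strict priorities: low-priority work runs only once high-priority work is gone.
prio = {0: deque(["high-1", "high-2"]), 1: deque(["low-1"])}
order = [pick_strict_priority(prio) for _ in range(3)]

# Fair shares: User A (75%) and User B (25%), both with plenty of queued tasks.
fair = {"A": deque(f"a{i}" for i in range(100)),
        "B": deque(f"b{i}" for i in range(100))}
served = {"A": 0, "B": 0}
for _ in range(100):
    pick_weighted_fair(fair, {"A": 75, "B": 25}, served)
print(order, served)
```

With both users backlogged, 100 scheduling decisions split 75 to 25, matching the shares on the slide.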

Weighted Fair Sharing [Figure: running tasks, 0 to 400, versus time (s), 0 to 50, for User 0 and User 1.]

Fault Tolerance [diagram: Spark Client 1, Spark Client 2, nodes 1 and 2]. Timeout: 100 ms. Failover: 5 ms. Re-launch queries: 15 ms. [Figure: query response time (ms), 0 to 4000, versus time (s), 0 to 60, in one panel each for Spark Client 1 and Spark Client 2, with the failure marked.]

Making Sparrow feature-complete: interfacing with UI, delay scheduling, speculation

(1) Diagnosing a Spark scheduling bottleneck (2) Distributed, fault-tolerant scheduling with Sparrow. www.github.com/radlab/sparrow