Packing Tasks with Dependencies. Robert Grandl, Srikanth Kandula, Sriram Rao, Aditya Akella, Janardhan Kulkarni
1 Packing Tasks with Dependencies Robert Grandl, Srikanth Kandula, Sriram Rao, Aditya Akella, Janardhan Kulkarni
2-7 The Cluster Scheduling Problem
Jobs, Tasks. Goal: match tasks to resources to achieve: high cluster utilization; fast job completion; guarantees (deadlines, fair shares)
Constraints: scale ("fast twitch"); large and high-value deployments, e.g., Spark, Yarn*, Mesos*, Cosmos
Today, schedulers are simple and (as we show) performance can improve a lot
8-11 Jobs have heterogeneous DAGs
User queries -> query optimizer -> job DAG (Dryad, Spark-SQL, Hive, ...)
[Figure: example DAG; legend shows #tasks per stage, task duration (0s to 200s), and edge types (need shuffle vs. can be local)]
DAGs have deep and complex structures; task durations range from <1s to >100s; tasks use different amounts of resources
12-18 Challenges in scheduling heterogeneous DAGs
[Figure: example DAG with six stages t0-t5; per-stage duration/demand labels garbled in transcription]
CPSched (critical-path scheduling) cannot overlap tasks with complementary demands
Packers [1] do not handle dependencies (n tasks, d resources)
[1] Tetris: Multi-resource Packing for Cluster Schedulers, SIGCOMM'14
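The complementary-demands point above can be made concrete with a toy check (a sketch, not the paper's code; the task dictionaries and `can_overlap` helper are illustrative assumptions): a CPU-heavy task and a network-heavy task fit on one machine together, so a packer overlaps them, while a critical-path order may serialize them.

```python
# Sketch: two tasks can overlap on a machine if their combined
# per-resource demand fits within capacity.
def can_overlap(t1, t2, capacity):
    return all(t1.get(r, 0) + t2.get(r, 0) <= cap
               for r, cap in capacity.items())

# Complementary demands: together they use ~100% of each resource.
cpu_task = {"cpu": 0.9, "net": 0.1}
net_task = {"cpu": 0.1, "net": 0.9}
```

Two CPU-heavy tasks would fail the same check, which is why overlapping the right pairs matters for utilization.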
19-24 Challenges in scheduling heterogeneous DAGs
1. Simple heuristics lead to poor schedules
2. Production DAGs are roughly 50% slower than lower bounds
3. Even simple variants of packing dependent tasks are NP-hard
4. Prior analytical solutions miss some practical concerns: multiple resources; complex dependencies; machine-level fragmentation; scale; online arrivals
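A common way to get a completion-time lower bound for a DAG (a sketch under assumptions; the paper's "new lower bound" is more refined than this) is the maximum of the critical-path duration and the per-resource total work divided by cluster capacity:

```python
# Sketch: lower bound on DAG completion time =
# max(critical path, max over resources of total work / capacity).
def dag_lower_bound(tasks, edges, capacity):
    """tasks: {name: (duration, {resource: demand})};
    edges: list of (parent, child); capacity: {resource: total}."""
    children = {t: [] for t in tasks}
    for p, c in edges:
        children[p].append(c)
    memo = {}
    def longest(t):
        # Longest duration-weighted path starting at t (memoized DFS).
        if t not in memo:
            dur, _ = tasks[t]
            memo[t] = dur + max((longest(c) for c in children[t]), default=0.0)
        return memo[t]
    critical_path = max(longest(t) for t in tasks)
    # Even a perfect packer needs this long on the bottleneck resource.
    resource_bound = max(
        sum(dur * dem.get(r, 0.0) for dur, dem in tasks.values()) / cap
        for r, cap in capacity.items())
    return max(critical_path, resource_bound)
```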
25 Given an annotated DAG and available resources, compute a good schedule + practical model
26-30 Main ideas for one DAG
[Figure: resources-vs-time schedule]
Existing schedulers: a task is schedulable after all its parents have finished
Graphene: identifies troublesome tasks and places them first, then places the other tasks around the trouble
31-34 Does placing troublesome tasks first help? Revisit the example
[Figure: the example DAG and the resulting resources-vs-time schedule; labels garbled in transcription]
If the troublesome tasks are the long-running tasks, Graphene ≈ OPT
35-40 How to choose troublesome tasks T?
Optimal choice is intractable (recall: NP-hard)
Graphene: pick T by task duration ≥ l and/or stage fragmentation score ≥ f, then BuildSchedule(T)
Vary l and f, and pick the most compact schedule
Extensions: 1) explore different choices of T in parallel, 2) recurse, 3) memoize
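The threshold sweep above can be sketched as follows (an illustrative sketch, not the paper's implementation; `build_schedule` stands in for the placement routine, and the tuple encoding of tasks is an assumption):

```python
# Sketch: sweep duration threshold l and fragmentation threshold f,
# build a schedule for each resulting troublesome set T, and keep
# the most compact one.
import itertools

def pick_troublesome(tasks, build_schedule, l_choices, f_choices):
    """tasks: {name: (duration, frag_score)};
    build_schedule(T) -> makespan of the schedule that places T first."""
    best = (float("inf"), frozenset())
    for l, f in itertools.product(l_choices, f_choices):
        T = frozenset(t for t, (dur, frag) in tasks.items()
                      if dur >= l or frag >= f)
        best = min(best, (build_schedule(T), T))
    return best  # (makespan of most compact schedule, its T)
```

The extensions on the slide map naturally onto this loop: run the iterations in parallel, recurse on sub-DAGs, and memoize `build_schedule` results.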
41-52 Schedule dead-ends
[Figure: troublesome tasks T placed first in the resources-vs-time plane, then parents P, children C, and the remaining tasks S placed around them]
1) Since some parents and children of the remaining tasks are already placed with T, those tasks may not be placeable; S = tasks outside TransitiveClosure(T)
2) When placing tasks in P, have to go backwards in time (place a task only after all of its children are placed)
Which of these orders are legit? T_fb P_b S_f C_f; T_fb P_b C_f S_f; T_fb S_fb P_b C_f (f = placed forward in time, b = backward)
Graphene explores all orders and avoids dead-ends
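The order exploration can be sketched as a simple search (a minimal sketch, assuming a `place` routine that returns placements or None on a dead-end; both the routine and the order encoding are illustrative, not the paper's code):

```python
# Sketch: try candidate subset orders (each subset tagged with a
# forward/backward placement direction); return the first order
# that completes without hitting a dead-end.
def explore_orders(orders, place):
    """orders: list of orders, each a list of (subset, direction);
    place(subset, direction, schedule_so_far) -> placements or None."""
    for order in orders:
        schedule = []
        ok = True
        for subset, direction in order:
            placed = place(subset, direction, schedule)
            if placed is None:      # dead-end: abandon this order
                ok = False
                break
            schedule.extend(placed)
        if ok:
            return order, schedule
    return None, None
```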
53 Main ideas for one DAG
[Figure: the resulting schedule with T, P, C, and S placed in the resources-vs-time plane]
1. Identify troublesome tasks and place them first
2. Systematically place tasks to avoid dead-ends
54 We have computed an offline schedule for one DAG; production clusters have multiple DAGs
55-66 Convert offline schedule to priority order on tasks
[Figure: example DAG with stages t0-t5; the offline schedule yields the priority order t1, t3, t4, then t0, t2, t5]
[Figure: runtime timelines with a competing Job2; enforcing the priority order finishes in about 2T, whereas CPSched takes about 4T]
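The conversion step can be sketched in a few lines (an illustrative sketch, not Graphene's code; `planned_starts` and the helper names are assumptions): sort tasks by their planned start times in the offline schedule, and at runtime dequeue ready tasks in that order.

```python
# Sketch: derive a total priority order from the offline schedule's
# planned start times, then use it to break ties among ready tasks.
def schedule_to_priority(planned_starts):
    """planned_starts: {task: start time in the offline schedule}."""
    order = sorted(planned_starts, key=lambda t: planned_starts[t])
    return {task: rank for rank, task in enumerate(order)}

def next_task(ready, priority):
    """Among currently runnable tasks, pick the highest-priority one."""
    return min(ready, key=lambda t: priority[t])
```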
67-71 Main ideas for multiple DAGs
1) Convert offline schedule to priority order on tasks
2) Online, enforce schedule priority along with heuristics for (a) multi-resource packing, (b) SRPT (shortest remaining processing time) to lower average job completion time, (c) a bounded amount of unfairness, (d) overbooking
Schedule priority is best for one DAG; is it best overall? Trade-offs: packing may lose efficiency, SRPT may delay some jobs' completion, and fairness may counteract all the others
We show that {best performance with bounded unfairness} ≈ best performance
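One way to combine these signals online (a hedged sketch; the weights, score shapes, and task encoding are assumptions, not the paper's actual combiner) is to score each runnable task with a weighted mix of a Tetris-style packing fit, the schedule priority, and remaining job work:

```python
# Sketch: score runnable tasks by packing fit (dot product of demand
# and free resources), schedule priority, and SRPT; pick the best.
def pick(ready, machine_free, weights):
    """ready: [{'prio': rank, 'demand': {r: amt},
                'remaining_job_work': w, ...}];
    machine_free: {resource: free amount}."""
    def score(t):
        fits = all(t['demand'].get(r, 0) <= free
                   for r, free in machine_free.items())
        if not fits:
            return float('-inf')   # cannot run here now
        pack = sum(t['demand'].get(r, 0) * free
                   for r, free in machine_free.items())
        # Lower priority rank and less remaining work are better.
        return (weights['pack'] * pack
                - weights['prio'] * t['prio']
                - weights['srpt'] * t['remaining_job_work'])
    return max(ready, key=score)
```

Bounded unfairness and overbooking would act as additional filters on `ready` and `machine_free` in the same loop.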
72-75 Graphene summary & implementation
1) Offline, schedule each DAG by placing troublesome tasks first
2) Online, enforce priority over tasks along with other heuristics
[Architecture: per-job AMs run a Schedule Constructor over their DAGs; the RM combines the priority order + packing + bounded unfairness + overbooking; NMs report via node heartbeats and receive task assignments]
76 Implementation details: DAG annotations; bundling (improves schedule quality without hurting scheduling latency); co-existence with (many) other scheduler features
77 Evaluation. Prototype: 200-server multi-core cluster; TPC-DS, TPC-H, and GridMix to replay traces; jobs arrive online. Simulations: traces from production Microsoft Cosmos and Yarn clusters; compare with many alternatives
78-79 Results 1 [20K DAGs from Cosmos]
[Figure: Packing vs. Packing + Deps. vs. the lower bound; annotations read 1.5X and 2X]
80-81 Results 2 [200 jobs from TPC-DS, 200-server cluster]
[Figure: Tez + Packing vs. Tez + Packing + Deps.]
82-85 Conclusions
Scheduling heterogeneous DAGs well requires an online solution that handles multiple resources and dependencies
Graphene: offline, construct a per-DAG schedule by placing troublesome tasks first; online, enforce the schedule priority along with other heuristics
A new lower bound shows Graphene is nearly optimal for half of the DAGs
Experiments show gains in job completion time, makespan, ...
Graphene generalizes to DAGs in other settings
86-89 When the scheduler works with erroneous task profiles
[Figure: fractional change in JCT vs. amount of error; data values garbled in transcription]
90 DAG annotations. Graphene uses per-stage average duration and demands of {cpu, mem, net, disk}. 1) Almost all frameworks have users annotate cpu and mem. 2) Recurring jobs [1] have predictable profiles (correcting for input size). 3) Ongoing work on building profiles for ad-hoc jobs: sample and project [2]; program analysis [3]. [1] RoPE, NSDI'12; [2] Perforator, SOCC'16; [3] SPEED, POPL'09
91 Using Graphene to schedule other DAGs
92 Characterizing DAGs in Cosmos clusters
94 Runtime of production DAGs
95 Job completion times on different workloads
96 Makespan Fairness
97 Comparison with other alternatives
98 Online Pseudocode
Graphene: Packing and Dependency-Aware Scheduling for Data-Parallel Clusters. Robert Grandl, Srikanth Kandula, Sriram Rao, Aditya Akella, Janardhan Kulkarni. Microsoft and University of Wisconsin-Madison
Tatsuhiro Chiba, Takeshi Yoshimura, Michihiro Horie and Hiroshi Horii IBM Research IBM Research 2 IEEE CLOUD 2018 / Towards Selecting Best Combination of SQL-on-Hadoop Systems and JVMs à à Application
More informationDynamic Query Re-Planning using QOOP
ynamic Query Re-Planning using QOOP Kshiteej Mahajan, UW-Madison; Mosharaf Chowdhury, U. Michigan; ditya kella and Shuchi Chawla, UW-Madison https://www.usenix.org/conference/osdi18/presentation/mahajan
More informationLecture 11 Hadoop & Spark
Lecture 11 Hadoop & Spark Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Outline Distributed File Systems Hadoop Ecosystem
More informationTitan: Fair Packet Scheduling for Commodity Multiqueue NICs. Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017
Titan: Fair Packet Scheduling for Commodity Multiqueue NICs Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017 Ethernet line-rates are increasing! 2 Servers need: To drive increasing
More informationCSL373: Lecture 6 CPU Scheduling
CSL373: Lecture 6 CPU Scheduling First come first served (FCFS or FIFO) Simplest scheduling algorithm cpu cpu 0 0 Run jobs in order that they arrive Disadvantage: wait time depends on arrival order. Unfair
More informationMagpie: Distributed request tracking for realistic performance modelling
Magpie: Distributed request tracking for realistic performance modelling Rebecca Isaacs Paul Barham Richard Mortier Dushyanth Narayanan Microsoft Research Cambridge James Bulpin University of Cambridge
More informationPaaS and Hadoop. Dr. Laiping Zhao ( 赵来平 ) School of Computer Software, Tianjin University
PaaS and Hadoop Dr. Laiping Zhao ( 赵来平 ) School of Computer Software, Tianjin University laiping@tju.edu.cn 1 Outline PaaS Hadoop: HDFS and Mapreduce YARN Single-Processor Scheduling Hadoop Scheduling
More informationCSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)
CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) Past & Present Have looked at two constraints: Mutual exclusion constraint between two events is a requirement that
More informationShark: SQL and Rich Analytics at Scale. Michael Xueyuan Han Ronny Hajoon Ko
Shark: SQL and Rich Analytics at Scale Michael Xueyuan Han Ronny Hajoon Ko What Are The Problems? Data volumes are expanding dramatically Why Is It Hard? Needs to scale out Managing hundreds of machines
More informationOracle Enterprise Manager 12c IBM DB2 Database Plug-in
Oracle Enterprise Manager 12c IBM DB2 Database Plug-in May 2015 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and
More informationCSE 444: Database Internals. Lecture 23 Spark
CSE 444: Database Internals Lecture 23 Spark References Spark is an open source system from Berkeley Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei
More information1 Introduction. 1 Only tasks on a failed machine needed to be re-executed. 2 E.g., it is impossible to achieve data-local reduce task placement.
F 2 : Reenvisioning Data Analytics Systems Robert Grandl, Arjun Singhvi, Raajay Viswanathan, Aditya Akella University of Wisconsin - Madison, Google, Uber under review Abstract Today s data analytics frameworks
More informationThe Legion Mapping Interface
The Legion Mapping Interface Mike Bauer 1 Philosophy Decouple specification from mapping Performance portability Expose all mapping (perf) decisions to Legion user Guessing is bad! Don t want to fight
More informationBenchmarks Prove the Value of an Analytical Database for Big Data
White Paper Vertica Benchmarks Prove the Value of an Analytical Database for Big Data Table of Contents page The Test... 1 Stage One: Performing Complex Analytics... 3 Stage Two: Achieving Top Speed...
More informationEA 2 S 2 : An Efficient Application-Aware Storage System for Big Data Processing in Heterogeneous Clusters
EA 2 S 2 : An Efficient Application-Aware Storage System for Big Data Processing in Heterogeneous Clusters Teng Wang, Jiayin Wang, Son Nam Nguyen, Zhengyu Yang, Ningfang Mi, and Bo Sheng Department of
More informationJob sample: SCOPE (VLDBJ, 2012)
Apollo High level SQL-Like language The job query plan is represented as a DAG Tasks are the basic unit of computation Tasks are grouped in Stages Execution is driven by a scheduler Job sample: SCOPE (VLDBJ,
More informationMain Memory and the CPU Cache
Main Memory and the CPU Cache CPU cache Unrolled linked lists B Trees Our model of main memory and the cost of CPU operations has been intentionally simplistic The major focus has been on determining
More informationBatch Processing Basic architecture
Batch Processing Basic architecture in big data systems COS 518: Distributed Systems Lecture 10 Andrew Or, Mike Freedman 2 1 2 64GB RAM 32 cores 64GB RAM 32 cores 64GB RAM 32 cores 64GB RAM 32 cores 3
More informationScheduling. Today. Next Time Process interaction & communication
Scheduling Today Introduction to scheduling Classical algorithms Thread scheduling Evaluating scheduling OS example Next Time Process interaction & communication Scheduling Problem Several ready processes
More informationAccelerating Spark Workloads using GPUs
Accelerating Spark Workloads using GPUs Rajesh Bordawekar, Minsik Cho, Wei Tan, Benjamin Herta, Vladimir Zolotov, Alexei Lvov, Liana Fong, and David Kung IBM T. J. Watson Research Center 1 Outline Spark
More informationEECS 571 Principles of Real-Time Embedded Systems. Lecture Note #8: Task Assignment and Scheduling on Multiprocessor Systems
EECS 571 Principles of Real-Time Embedded Systems Lecture Note #8: Task Assignment and Scheduling on Multiprocessor Systems Kang G. Shin EECS Department University of Michigan What Have We Done So Far?
More informationCS 406/534 Compiler Construction Putting It All Together
CS 406/534 Compiler Construction Putting It All Together Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy
More informationUnifying Big Data Workloads in Apache Spark
Unifying Big Data Workloads in Apache Spark Hossein Falaki @mhfalaki Outline What s Apache Spark Why Unification Evolution of Unification Apache Spark + Databricks Q & A What s Apache Spark What is Apache
More informationSql Server 2008 Query Table Schema Management Studio Create
Sql Server 2008 Query Table Schema Management Studio Create using SQL Server Management Studio or Transact-SQL by creating a new table and in Microsoft SQL Server 2016 Community Technology Preview 2 (CTP2).
More informationRobinHood: Tail Latency-Aware Caching Dynamically Reallocating from Cache-Rich to Cache-Poor
RobinHood: Tail Latency-Aware Caching Dynamically Reallocating from -Rich to -Poor Daniel S. Berger (CMU) Joint work with: Benjamin Berg (CMU), Timothy Zhu (PennState), Siddhartha Sen (Microsoft Research),
More informationOperating Systems. Scheduling
Operating Systems Scheduling Process States Blocking operation Running Exit Terminated (initiate I/O, down on semaphore, etc.) Waiting Preempted Picked by scheduler Event arrived (I/O complete, semaphore
More informationWorkload Characterization and Optimization of TPC-H Queries on Apache Spark
Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research
More informationWarehouse- Scale Computing and the BDAS Stack
Warehouse- Scale Computing and the BDAS Stack Ion Stoica UC Berkeley UC BERKELEY Overview Workloads Hardware trends and implications in modern datacenters BDAS stack What is Big Data used For? Reports,
More informationCIS : Scalable Data Analysis
CIS 602-01: Scalable Data Analysis Cloud Workloads Dr. David Koop Scaling Up PC [Haeberlen and Ives, 2015] 2 Scaling Up PC Server [Haeberlen and Ives, 2015] 2 Scaling Up PC Server Cluster [Haeberlen and
More informationPouya Kousha Fall 2018 CSE 5194 Prof. DK Panda
Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Motivation And Intro Programming Model Spark Data Transformation Model Construction Model Training Model Inference Execution Model Data Parallel Training
More informationOVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI
CMPE 655- MULTIPLE PROCESSOR SYSTEMS OVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI What is MULTI PROCESSING?? Multiprocessing is the coordinated processing
More informationCS 267 Applications of Parallel Computers. Lecture 23: Load Balancing and Scheduling. James Demmel
CS 267 Applications of Parallel Computers Lecture 23: Load Balancing and Scheduling James Demmel http://www.cs.berkeley.edu/~demmel/cs267_spr99 CS267 L23 Load Balancing and Scheduling.1 Demmel Sp 1999
More informationBIG DATA COURSE CONTENT
BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data
More informationScheduling Tasks Sharing Files from Distributed Repositories
from Distributed Repositories Arnaud Giersch 1, Yves Robert 2 and Frédéric Vivien 2 1 ICPS/LSIIT, University Louis Pasteur, Strasbourg, France 2 École normale supérieure de Lyon, France September 1, 2004
More informationShark: SQL and Rich Analytics at Scale. Yash Thakkar ( ) Deeksha Singh ( )
Shark: SQL and Rich Analytics at Scale Yash Thakkar (2642764) Deeksha Singh (2641679) RDDs as foundation for relational processing in Shark: Resilient Distributed Datasets (RDDs): RDDs can be written at
More informationCPU Scheduling. CSE 2431: Introduction to Operating Systems Reading: Chapter 6, [OSC] (except Sections )
CPU Scheduling CSE 2431: Introduction to Operating Systems Reading: Chapter 6, [OSC] (except Sections 6.7.2 6.8) 1 Contents Why Scheduling? Basic Concepts of Scheduling Scheduling Criteria A Basic Scheduling
More informationImproving the MapReduce Big Data Processing Framework
Improving the MapReduce Big Data Processing Framework Gistau, Reza Akbarinia, Patrick Valduriez INRIA & LIRMM, Montpellier, France In collaboration with Divyakant Agrawal, UCSB Esther Pacitti, UM2, LIRMM
More informationOracle Big Data Fundamentals Ed 2
Oracle University Contact Us: 1.800.529.0165 Oracle Big Data Fundamentals Ed 2 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, you learn about big data, the technologies
More informationV Conclusions. V.1 Related work
V Conclusions V.1 Related work Even though MapReduce appears to be constructed specifically for performing group-by aggregations, there are also many interesting research work being done on studying critical
More informationCentralized versus distributed schedulers for multiple bag-of-task applications
Centralized versus distributed schedulers for multiple bag-of-task applications O. Beaumont, L. Carter, J. Ferrante, A. Legrand, L. Marchal and Y. Robert Laboratoire LaBRI, CNRS Bordeaux, France Dept.
More informationMaking the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor
Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationToday s Papers. Architectural Differences. EECS 262a Advanced Topics in Computer Systems Lecture 17
EECS 262a Advanced Topics in Computer Systems Lecture 17 Comparison of Parallel DB, CS, MR and Jockey October 30 th, 2013 John Kubiatowicz and Anthony D. Joseph Electrical Engineering and Computer Sciences
More informationStream Processing on IoT Devices using Calvin Framework
Stream Processing on IoT Devices using Calvin Framework by Ameya Nayak A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science Supervised
More informationThe Right Read Optimization is Actually Write Optimization. Leif Walsh
The Right Read Optimization is Actually Write Optimization Leif Walsh leif@tokutek.com The Right Read Optimization is Write Optimization Situation: I have some data. I want to learn things about the world,
More information