HotCloud 17. Lube: Mitigating Bottlenecks in Wide Area Data Analytics. Hao Wang* Baochun Li
Transcription
Slide 1: Lube: Mitigating Bottlenecks in Wide Area Data Analytics
Hao Wang*, Baochun Li
HotCloud '17 (iQua)
Slide 2: Wide Area Data Analytics
[Figure: a datacenter (DC) with a Master (Namenode) and Workers (Datanodes).]
Slide 3: Wide Area Data Analytics
Why wide area data analytics?
- Data volume
- User distribution
- Regulation policy
Problems
- Widely shared resources
- Fluctuating available provision
- Distributed runtime environment
- Heterogeneous utilizations
[Figure: datacenters DC #1, DC #2, ..., DC #n; one hosts the Master (Namenode), and each hosts Workers (Datanodes).]
Slide 4: Fluctuating WAN Bandwidths
[Figure: bandwidth (Mbps) over time, from 0:00 on Jan 1 to 12:00 on Jan 2, with series labelled (VC), (CT), (TR), and (WT).]
Measured by iperf on the SAVI testbed.
Slide 5: Heterogeneous Memory Utilization
Nodes in different DCs may have different resource utilizations.
[Figure: memory utilization over time (s), one series per node (node_1, node_2, node_3, ...).]
Running the Berkeley Big Data Benchmark on AWS EC2, 4 nodes across 4 regions. Collected by jvmtop.
Slide 6: Runtime Bottlenecks
Fluctuation and heterogeneity: bottlenecks emerge at runtime
- Any time
- Any nodes
- Any resources
Effect of bottlenecks on data analytics performance
- Long completion times
- Low resource utilization
- Invalid optimization
Slide 7: Optimization of Data Analytics
Existing optimization methods do not consider runtime bottlenecks.
- Clarinet [OSDI 16] considers the heterogeneity of available WAN bandwidth
- Iridium [SIGCOMM 15] trades off between time and WAN bandwidth usage
- Geode [NSDI 15] saves WAN usage via data placement and query plan selection
- SWAG [SoCC 15] reorders jobs across datacenters
"Much of this performance work has been motivated by three widely-accepted mantras about the performance of data analytics: network, disk and straggler." (Making Sense of Performance in Data Analytics Frameworks, NSDI 15, Kay Ousterhout)
Slide 8: Mitigating Bottlenecks at Runtime
Mitigating bottlenecks
- How to detect bottlenecks?
- How to overcome the scheduling delay?
- How to enforce the bottleneck mitigation?
[Figure: a resource queue and a task queue in bottleneck.]
Slide 9: Architecture of Lube
Three major components
- Performance monitors
- Bottleneck detecting module
- Bottleneck-aware scheduler
[Figure: architecture diagram.
 Lube Client: Lightweight Performance Monitors (Network I/O, Disk I/O, JVM, more metrics); Online Bottleneck Detector (Training Pool, Bottleneck Detector, Model Update).
 Lube Master: Bottleneck Info. Cache; Available Worker Pool of (worker, intensity) pairs; Lube Scheduler; Submitted Task Queue; Bottleneck-aware Scheduling.]
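Slide 10: Detecting Bottlenecks: ARIMA
$$y_t = \theta_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}$$
- $\varepsilon$: random error
- $y_t$: current state
- $\theta$, $\phi$: coefficients
[Figure: the historical states (time_1, mem_util), (time_2, mem_util), ..., (time_t-1, mem_util) are the input to ARIMA(p, d, q), which combines autoregressive (AR) and moving average (MA) terms and outputs the current state (time_t, mem_util).]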
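The ARIMA equation above can be turned into a one-step prediction by dropping the unknown current error term, whose expected value is zero. The following is a minimal sketch under stated assumptions, not the implementation used in Lube: the series is taken as already stationary (d = 0), the coefficients are given rather than fitted, and the names `arma_forecast`, `is_bottleneck`, and the 0.9 threshold are hypothetical.

```python
from typing import Sequence


def arma_forecast(history: Sequence[float], errors: Sequence[float],
                  theta0: float, phi: Sequence[float],
                  theta: Sequence[float]) -> float:
    """One-step ARMA(p, q) prediction of y_t.

    history holds past states (oldest first), errors holds past residuals
    (oldest first). The current error term is omitted because its expected
    value is zero.
    """
    p, q = len(phi), len(theta)
    if len(history) < p or len(errors) < q:
        raise ValueError("need at least p past values and q past errors")
    # Autoregressive part: phi_1 * y_{t-1} + ... + phi_p * y_{t-p}
    ar = sum(phi[i] * history[-1 - i] for i in range(p))
    # Moving-average part: theta_1 * eps_{t-1} + ... + theta_q * eps_{t-q}
    ma = sum(theta[j] * errors[-1 - j] for j in range(q))
    return theta0 + ar - ma


def is_bottleneck(forecast: float, threshold: float = 0.9) -> bool:
    """Hypothetical rule: flag a bottleneck when predicted utilization is high."""
    return forecast >= threshold


# Illustrative values only: two past memory-utilization samples, one past residual.
mem_util = [0.5, 0.6]
past_errors = [0.1]
predicted = arma_forecast(mem_util, past_errors,
                          theta0=0.1, phi=[0.5, 0.2], theta=[0.3])
```

In practice the coefficients would come from fitting a model to the monitored series; the sketch only shows how the fitted terms combine.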
Slide 11: Detecting Bottlenecks: HMM
Hidden Markov Model
- Hidden states: Q = {q_1, q_2, ..., q_i, ..., q_j, ...}
- Observation states: O = {O_1, O_2, ..., O_d, ..., O_k}
- Transition probability: A = (a_ij)
- Emission probability: B = (b_j(k))
[Figure: a timeline (past, t, future) with transitions between hidden states q_i and q_j and emissions to observations.]
To make HMM online, with input of the form {time_stamp: mem, net, cpu, disk}:
Sliding Hidden Markov Model
- A sliding window for new observations
- A moving average approximation for outdated observations
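To make the roles of A and B concrete, here is a small sketch of HMM forward filtering over a bounded window of observations. It is not the Sliding HMM training procedure from the slides: the parameters are fixed rather than re-estimated, and evicted observations are folded into a prior belief exactly instead of through a moving average approximation. The two-state model (0 = normal, 1 = bottlenecked) and all probabilities are made-up illustrative values.

```python
from collections import deque


def forward_step(belief, obs, A, B):
    """One forward-filter step: predict with A, weight by B[state][obs], normalize."""
    n = len(belief)
    predicted = [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]
    unnorm = [predicted[j] * B[j][obs] for j in range(n)]
    total = sum(unnorm)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnorm]


class SlidingFilter:
    """Keeps the newest observations in a window and summarizes older ones."""

    def __init__(self, A, B, prior, window):
        self.A, self.B = A, B
        self.prior = list(prior)   # belief summarizing everything older than the window
        self.window = deque()
        self.size = window

    def observe(self, obs):
        self.window.append(obs)
        if len(self.window) > self.size:
            # Fold the evicted observation into the prior so it is not lost.
            oldest = self.window.popleft()
            self.prior = forward_step(self.prior, oldest, self.A, self.B)
        return self.belief()

    def belief(self):
        b = self.prior
        for o in self.window:
            b = forward_step(b, o, self.A, self.B)
        return b


# Hypothetical model: observation 0 = low utilization, 1 = high utilization.
A = [[0.9, 0.1], [0.2, 0.8]]   # transition probabilities a_ij
B = [[0.8, 0.2], [0.3, 0.7]]   # emission probabilities b_j(k)
flt = SlidingFilter(A, B, prior=[0.5, 0.5], window=2)
beliefs = [flt.observe(o) for o in [0, 1, 1, 0]]
```

Each entry of `beliefs` is the probability of each hidden state after the corresponding observation; the second component can be read as a bottleneck likelihood.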
Slide 12: Bottleneck-Aware Scheduling
- Built-in task schedulers: data-locality
- Bottleneck-aware scheduler: data-locality and bottlenecks at runtime
[Figure: four panels over Time (s): memory utilization of executor processes; network utilization of datanode processes; CPU utilization of executor processes; disk (SSD) utilization of datanode processes.]
A single worker node is bottlenecked continuously, while all nodes are rarely bottlenecked at the same time.
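One way to read "data-locality plus bottlenecks at runtime" is as a greedy choice: prefer a worker that holds the task's data unless it is currently bottlenecked, otherwise take the least-loaded available worker. The sketch below is an illustration of that idea only, not Lube's scheduler; the function name `pick_worker`, the intensity scale in [0, 1], and the 0.8 threshold are assumptions.

```python
def pick_worker(preferred, intensity, threshold=0.8):
    """Greedy, bottleneck-aware worker choice.

    preferred: workers that hold the task's input data (data-locality).
    intensity: mapping worker -> bottleneck intensity (hypothetical 0..1 scale).
    """
    if not intensity:
        raise ValueError("no workers available")
    # Drop workers currently reported as bottlenecked.
    available = {w: s for w, s in intensity.items() if s < threshold}
    if not available:
        # Every worker is bottlenecked: fall back to the full pool.
        available = dict(intensity)
    # Keep data-locality when a local worker is still usable.
    local = [w for w in preferred if w in available]
    candidates = local or list(available)
    # Lowest intensity wins; the worker name breaks ties deterministically.
    return min(candidates, key=lambda w: (available[w], w))


pool = {"w1": 0.95, "w2": 0.3, "w3": 0.5}
choice_remote = pick_worker(["w1"], pool)   # local worker is bottlenecked
choice_local = pick_worker(["w3"], pool)    # local worker is usable
```

Because a single node tends to stay bottlenecked while others are free, even this simple rule moves tasks away from the hot node at the cost of a remote read.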
Slide 13: Implementation & Deployment
Implementation
- Spark (scheduler)
- redis database (cache)
- Python scikit-learn, Keras (ML)
Deployment
- 37 EC2 m4.2xlarge instances
- 9 regions
- Berkeley Big Data Benchmark
- A 1.1 TB dataset
[Figure: the Master Node runs the Lube Scheduler and the Master Redis Server; Worker Nodes run the Bottleneck Detection Module, the Worker Redis Server, and the monitors nethogs, jvmtop, and iotop.]
APIs
- HGET worker_id time
- HSET worker_id {time: {metric: val_ob, val_inf}}
- SUBSCRIBE metric_1 metric_2
- PUBLISH + HSET metric {time: val} (e.g., iotop {time: I/O})
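The hash layout and publish/subscribe flow listed under APIs can be illustrated without a running Redis server. The sketch below uses a small in-memory stand-in (`FakeStore`, a hypothetical class, not a Redis client) that mimics only the four commands named on the slide. The JSON encoding, the key and field names, and the sample values are assumptions made for illustration.

```python
import json
from collections import defaultdict


class FakeStore:
    """In-memory stand-in for the few Redis commands named on the slide."""

    def __init__(self):
        self._hashes = defaultdict(dict)
        self._subscribers = defaultdict(list)

    def hset(self, key, field, value):
        self._hashes[key][field] = value

    def hget(self, key, field):
        return self._hashes[key].get(field)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in self._subscribers[channel]:
            callback(message)
        return len(self._subscribers[channel])


store = FakeStore()
received = []
# Detector side: SUBSCRIBE to a metric channel.
store.subscribe("iotop", received.append)

# Monitor side: PUBLISH + HSET metric {time: val}
store.hset("iotop", "100", json.dumps(42.5))
store.publish("iotop", json.dumps({"100": 42.5}))

# Detector side: HSET worker_id {time: {metric: val_ob, val_inf}}
record = {"iotop": {"val_ob": 42.5, "val_inf": 40.0}}
store.hset("worker_1", "100", json.dumps(record))

# Scheduler side: HGET worker_id time
latest = json.loads(store.hget("worker_1", "100"))
```

Storing the observed value next to the inferred one under the same timestamp is what later allows detections to be compared against observations.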
Slide 14: Evaluation: Accuracy
Calculation
$$\mathrm{hitrate} = \frac{\#\big((\mathit{time}, \mathit{detection}) \cap (\mathit{time}, \mathit{observation})\big)}{\#(\mathit{time}, \mathit{detection})}$$
[Figure: hit rate (%) of ARIMA and SlidHMM, one panel each for Query-1, Query-2, Query-3, and Query-4, with x-axis categories a, b, c.]
ARIMA ignores nonlinear patterns.
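The hit rate calculation amounts to a set intersection: of all (time, detection) pairs the detector emitted, count those that also appear among the observed pairs. A minimal sketch, assuming both sides are recorded as (time, resource) tuples and that the result is reported as a percentage as on the plot axes; the tuple format and the sample data are illustrative.

```python
def hit_rate(detections, observations):
    """Percentage of detected (time, resource) pairs that were actually observed."""
    detected = set(detections)
    if not detected:
        return 0.0
    hits = detected & set(observations)
    return 100.0 * len(hits) / len(detected)


# Illustrative data: four detections, two of which match an observation.
detections = [(1, "mem"), (2, "mem"), (3, "net"), (4, "cpu")]
observations = [(1, "mem"), (3, "net"), (5, "disk")]
rate = hit_rate(detections, observations)
```

Note that the denominator counts detections, so this measures how often a raised alarm was correct, not how many real bottlenecks were caught.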
Slide 15: Evaluation: Completion Times
Task completion times
[Figure: task completion time (ms) for Pure Spark, Lube-ARIMA, and Lube-SlidHMM, one panel per query (Query-1, Query-3, Query-4, ...).]
[Table: average and 75th values in seconds for Lube-ARIMA and Lube-SlidHMM; the numbers are not recoverable.]
Slide 16: Evaluation: Completion Times
Query completion times
[Figure: query completion time (s) for Pure Spark, Lube-ARIMA, Lube-SlidHMM, ARIMA + Spark, and SlidHMM + Spark, across Query-1, Query-2, Query-4, ...]
- Lube-ARIMA, Lube-SlidHMM: reduce median query response time by up to 33%
- Control groups for overhead (ARIMA + Spark, SlidHMM + Spark): negligible overhead
Slide 17: Conclusion
- Runtime performance bottleneck detection: ARIMA, HMM
- A simple greedy bottleneck-aware task scheduler that jointly considers data-locality and bottlenecks
- Lube, a closed-loop framework mitigating bottlenecks at runtime
Slide 18: The End. Thank You.
Slide 19: Discussion
Bottleneck detection models
- More performance metrics could be explored
- More efficient models for time series prediction, e.g., Reinforcement Learning, LSTM
Bottleneck-aware scheduling
- Fine-grained scheduling with specific resource awareness
WAN conditions
- We measure pair-wise WAN bandwidths by a cron job running iperf locally
- Try to exploit support from SDN interfaces