HiTune. Dataflow-Based Performance Analysis for Big Data Cloud

HiTune: Dataflow-Based Performance Analysis for Big Data Cloud
Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu
Intel Asia-Pacific Research and Development Ltd, Shanghai, China, 200241
USENIX ATC '11

Big Data
- Industrial Revolution of Data
  - The heartbeat of mobile, cloud and social computing
  - Expanding faster than Moore's law (e.g., Internet of Things)
- What is Big Data?
  - Too large to work with using traditional tools (e.g., RDBMS)
  - Requires a new architecture: massively parallel software running on 100s to 1000s of servers

Dataflow Model for Big Data Analytics
- User
  - Applications are modeled as dataflow graphs
  - Writes subroutines that run on the vertices
  - Is abstracted away from the messy details of distributed computing
- System runtime
  - Dynamically maps dataflow graphs to the cluster
  - Handles all the low-level details: data partitioning, task distribution, load balancing, node communications, fault tolerance
- Examples: MapReduce (Hadoop), Dryad
[Figure: Hadoop MapReduce dataflow. Partitioned input feeds the map tasks (map, spill); the reduce tasks run shuffle (copier, merge), sort and reduce, and write the aggregated output. Legend: streaming dataflow, sequential dataflow.]
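The division of labor on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Hadoop code: `map_task` and `reduce_task` stand in for the user-written subroutines, while `shuffle` and `run_job` (hypothetical names) play the part of the runtime that partitions data and moves it between vertices.

```python
from collections import defaultdict

def map_task(partition):
    """User subroutine: emit (word, 1) for every word in one input partition."""
    return [(word, 1) for line in partition for word in line.split()]

def shuffle(map_outputs, num_reducers):
    """Runtime step: route each key to one reducer and group its values."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for key, value in output:
            # Sum of code points: a deterministic stand-in for a hash partitioner.
            index = sum(map(ord, key)) % num_reducers
            buckets[index][key].append(value)
    return buckets

def reduce_task(bucket):
    """User subroutine: visit keys in sorted order and aggregate their values."""
    return {key: sum(bucket[key]) for key in sorted(bucket)}

def run_job(partitions, num_reducers=2):
    """Runtime: run one map task per partition, shuffle, then reduce."""
    map_outputs = [map_task(p) for p in partitions]
    result = {}
    for bucket in shuffle(map_outputs, num_reducers):
        result.update(reduce_task(bucket))
    return result

partitions = [["big data big cloud"], ["data flow"]]
counts = run_job(partitions)
print(counts)
```

The user code never mentions machines, threads or network transfers; in a real cluster those concerns live entirely in the runtime, which is what later makes performance behavior hard to see from the application side.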

What Worked
- Parallel programming is hard; distributed programming is harder
- The dataflow model makes it a lot easier
  - An appropriately high level of abstraction
  - The user is required to consider the data parallelism exposed by the dataflow
  - The runtime distributes executions of subroutines by exploiting data dependencies encoded in the dataflow
- "Nontrivial software written with threads, semaphores, and mutexes are incomprehensible to humans." (Edward A. Lee)
- CGO 2007, March 2007: Auto-Partitioning Compiler for Intel Network Processor (IXP)

What Didn't Work
- The dataflow abstraction makes a Big Data system appear as a black box
  - Very difficult for the user to understand runtime behaviors
  - Performance analysis & tuning remain a big challenge
- Key challenges of performance analysis for Big Data
  - Massively distributed system: how to correlate concurrent performance activities (across 10000s of programs and machines)?
  - High-level dataflow abstraction: how to relate low-level performance activities to the high-level dataflow model?

HiTune: VTune for Hadoop
- Distributed instrumentation
  - Lightweight sampling using binary instrumentation
  - No source code modifications
  - Implemented using Java programming language agents
  - Generic sampling information collected
- Dataflow-driven analysis
  - Reconstructs the dataflow execution process from low-level sampling information
  - Based on a dataflow specification
  - Implemented as several Hadoop jobs
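The idea of dataflow-driven analysis can be sketched as follows. This is a minimal illustration, assuming that each sample carries a task identifier and a sampled call stack and that the dataflow specification maps each stage to a marker found in the stack. The specification format, the `demo.*` markers and the function names are all hypothetical; they are not HiTune's actual format or API.

```python
from collections import Counter

# Hypothetical dataflow specification: stage name -> marker that identifies
# the stage in a sampled stack frame. The markers are made up for this demo.
DATAFLOW_SPEC = {
    "map": "demo.MapStage.run",
    "spill": "demo.SpillStage.run",
    "copier": "demo.CopierStage.run",
    "merge": "demo.MergeStage.run",
}

def stage_of(stack, spec=DATAFLOW_SPEC):
    """Return the first stage whose marker appears in the stack (innermost frame first)."""
    for frame in stack:
        for stage, marker in spec.items():
            if marker in frame:
                return stage
    return "unknown"

def reconstruct(samples, spec=DATAFLOW_SPEC):
    """Group generic samples by task and count how many fall into each stage."""
    per_task = {}
    for sample in samples:
        stage = stage_of(sample["stack"], spec)
        per_task.setdefault(sample["task"], Counter())[stage] += 1
    return per_task

# Illustrative samples, as if collected on two machines.
samples = [
    {"host": "node1", "task": "m_0", "time": 1, "stack": ["java.io.Read", "demo.MapStage.run"]},
    {"host": "node1", "task": "m_0", "time": 2, "stack": ["demo.SpillStage.run"]},
    {"host": "node2", "task": "r_0", "time": 2, "stack": ["demo.CopierStage.run"]},
    {"host": "node2", "task": "r_0", "time": 3, "stack": ["demo.CopierStage.run"]},
    {"host": "node2", "task": "r_0", "time": 4, "stack": ["java.lang.Thread.sleep"]},
]
timeline = reconstruct(samples)
print(timeline)
```

Because the samples themselves are generic, supporting a different dataflow in this sketch only requires passing a different `spec`; the sampling side stays untouched.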

HiTune 0.9 Status
- Used intensively both inside Intel and by several external customers
- Open sourced under Apache License 2.0
- Available at https://github.com/hitune/hitune
[Figure: HiTune architecture. Instrumentation: local HiTune data on each node is read by adaptors in Chukwa agents. Aggregation: Chukwa collectors write the data to HDFS. PostProcess: Chukwa Demux (a Hadoop job) with HiTune parsers. Analysis: HiTune analyzer (a Hadoop job) produces the HiTune report (.csv). Outputs: Query QL, Excel spreadsheet, visual report samples (.xlsm).]

Overhead
- Ratio of instrumented vs. uninstrumented clusters
- Less than 2% runtime overhead due to instrumentation
[Figure: four bar charts (Ratio of Completion Time, Ratio of Throughput, Ratio of CPU Utilization, Ratio of Memory Utilization) for the Sort, WordCount and Nutch indexing workloads on 5-, 10- and 20-slave clusters. All plotted ratios lie between 98% and 102%.]
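The ratios on this slide compare a metric measured with instrumentation against the same metric measured without it. A minimal sketch of that arithmetic follows; the completion times are made-up illustrative numbers, not measurements from the talk, and `overhead_ratio` is a hypothetical helper.

```python
def overhead_ratio(instrumented, uninstrumented):
    """Instrumented metric as a percentage of the uninstrumented one."""
    if uninstrumented <= 0:
        raise ValueError("uninstrumented metric must be positive")
    return 100.0 * instrumented / uninstrumented

# Illustrative values only: job completion time in seconds.
baseline_seconds = 600.0
instrumented_seconds = 606.0

ratio = overhead_ratio(instrumented_seconds, baseline_seconds)
print(f"completion time ratio: {ratio:.0f}%")
```

For completion time a ratio above 100% means the instrumented run was slower; for throughput the direction is reversed, so a ratio slightly below 100% indicates the same kind of small overhead.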

The Hadoop Dataflow Model
[Figure: partitioned input feeds the map tasks (map, spill); the reduce tasks run shuffle (copier, merge), sort and reduce, and write the aggregated output. Legend: streaming dataflow, sequential dataflow.]

Case Study: Limitation of Traditional Tools
- Sorting many small files (3200 500KB-sized files) using Hadoop 0.20.1
- Cluster very lightly utilized (extremely low CPU, disk I/O and network utilization)
- No obvious bottlenecks or hotspots in the cluster
- Traditional tools (e.g., system monitors and program profilers) fail to reveal the root cause

Case Study: Limitation of Traditional Tools (continued)
- HiTune results (dataflow execution) reveal the root cause
- Upgrading to Fair Scheduler 2.0 fixes the issue
[Figure: two dataflow execution charts, labeled "0.19.1: The Low Utilization Issue" and "0.20.1: The Fix". Axes: Map/Reduce tasks against the time line. Legend: bootstrap, shuffle, sort, reduce, idle.]

Case Study: Limitation of Hadoop Logs
- TeraSort
  - Large gap between the end of map and the end of shuffle
  - None of CPU, disk I/O and network bandwidth is bottlenecked during the gap
  - The "Shuffle Fetchers Busy Percent" metric reported by Hadoop is always 100%
  - Increasing the number of copier threads brings no improvement
- Traditional tools or Hadoop logs fail to reveal the root cause
[Figure: dataflow execution chart with the gap marked.]

Case Study: Limitation of Hadoop Logs (continued)
- HiTune results (dataflow-based hotspot breakdown) reveal the root cause
  - Copier threads are idle 80% of the time, waiting for the memory merge thread
  - The memory merge thread is busy mostly due to compression
- Hotspot breakdown
  - Copier threads: Idle 80% (reserve 79%, others 1%), Busy 20%
  - Memory merge threads: Busy 87% (compress 81%, others 6%), Idle 13%
- Changing the compression codec to LZO fixes this issue
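A hotspot breakdown of this kind amounts to counting samples per label and converting the counts to percentages. The sketch below assumes each sample has already been labeled with a (state, reason) pair. The sample counts are illustrative, chosen only so that the result matches the copier-thread percentages on the slide, and `breakdown` is a hypothetical helper rather than part of HiTune.

```python
from collections import Counter

def breakdown(samples):
    """Return (percent per state, percent per (state, reason)) for labeled samples."""
    detail = Counter(samples)
    total = sum(detail.values())
    by_state = Counter()
    for (state, _reason), count in detail.items():
        by_state[state] += count

    def to_percent(counts):
        return {label: 100.0 * n / total for label, n in counts.items()}

    return to_percent(by_state), to_percent(detail)

# Illustrative copier-thread samples (100 in total).
copier_samples = (
    [("idle", "reserve")] * 79
    + [("idle", "others")] * 1
    + [("busy", "copy")] * 20
)
by_state, detail = breakdown(copier_samples)
print(by_state)
print(detail)
```

Read this way, the slide's numbers say that almost all of the copier threads' idle time (79 of 80 points) falls under the "reserve" label, which points at the stage they are waiting on rather than at the copiers themselves.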

Case Study: Extensibility
- Easily extended to support Hive by simply changing the dataflow specification
- Aggregation query in Hive performance benchmarks
  - 68% of time spent on data input/output and Hadoop/Hive initialization & cleanup
  - Critical to reduce intermediate results, improve data input/output, and reduce Hadoop/Hive overheads
[Figure: data flow stage timelines showing the Init, Active, Close and Stage Close periods.]
[Figure: time breakdown charts labeled Map Tasks and Reduce Tasks. Values as listed: Init 2.1%, Active 77.6% (Input 42.4%, Operations 32.0%, Output 3.2%), Close 0.2%, Stage Close 20.2%.]
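The 68% figure can be cross-checked against the breakdown values listed on this slide. The grouping below is one reading that reproduces the number, assuming that everything except the Operations share counts as input/output, initialization or cleanup; the slide does not spell the grouping out.

```python
# Percentages as listed in the slide's breakdown chart.
breakdown = {
    "Init": 2.1,
    "Input": 42.4,
    "Operations": 32.0,
    "Output": 3.2,
    "Close": 0.2,
    "Stage Close": 20.2,
}

# Active time is listed as 77.6% and is made up of Input, Operations and Output.
active = breakdown["Input"] + breakdown["Operations"] + breakdown["Output"]

# Assumed grouping: every share except Operations is I/O, init or cleanup.
non_operation = sum(v for name, v in breakdown.items() if name != "Operations")

print(f"active: {active:.1f}%")                 # compare with the listed 77.6%
print(f"I/O + init + cleanup: {non_operation:.1f}%")
```

Under this reading the non-operation share comes to about 68%, and the listed shares add up to roughly 100% (small differences are rounding in the chart labels).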

Summary
- HiTune: VTune for Hadoop
  - Better insights on Hadoop runtime behaviors
  - Dataflow-based analysis
  - Extremely low runtime overheads
  - Very good scalability & extensibility
- v0.9 open sourced under Apache License 2.0
  - See https://github.com/hitune/hitune
