MapReduce for Data Intensive Scientific Analyses


MapReduce for Data Intensive Scientific Analyses
Jaliya Ekanayake, Shrideep Pallickara, Geoffrey Fox
Department of Computer Science, Indiana University, Bloomington, IN 47405
5/11/2009

Presentation Outline
- Introduction
- MapReduce and the Current Implementations
- Current Limitations
- Our Solution
- Evaluation and Results
- Future Work and Conclusion

Data/Compute Intensive Applications
- Computation and data intensive applications are increasingly prevalent; data volumes are already at the petabyte scale
- High Energy Physics (HEP): Large Hadron Collider (LHC) - tens of petabytes of data annually
- Astronomy: Large Synoptic Survey Telescope - nightly rate of 20 terabytes
- Information Retrieval: Google, MSN, Yahoo, Wal-Mart, etc.
- Many compute intensive applications and domains: HEP, astronomy, chemistry, biology, seismology, etc.
- Clustering: Kmeans, Deterministic Annealing, pairwise clustering, etc.
- Multi-Dimensional Scaling (MDS) for visualizing high-dimensional data

Composable Applications
- How do we support these large-scale applications? Efficient parallel/concurrent algorithms and implementation techniques
- Some key observations - most of these applications:
  - Are a Single Program Multiple Data (SPMD) program, or a collection of SPMDs
  - Exhibit the composable property: processing can be split into small sub-computations, and the partial results of these computations are merged after some post-processing
  - Are loosely synchronized (can withstand communication latencies typically experienced over wide area networks)
- Distinct from closely coupled parallel applications and totally decoupled applications
- With large volumes of data and higher computation requirements, can even closely coupled parallel applications withstand higher communication latencies?

The Composable Class of Applications
[Figure: a spectrum of application classes - tightly coupled applications, tightly synchronized (microseconds), e.g. Cannon's Algorithm for matrix multiplication; composable applications, loosely synchronized (milliseconds), e.g. input SPMDs processing a set of TIF files into PDF files; and totally decoupled applications]
The composable class can be implemented in high-level programming models such as MapReduce and Dryad

MapReduce
"MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key."
- "MapReduce: Simplified Data Processing on Large Clusters", Jeffrey Dean and Sanjay Ghemawat
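
To make the model concrete, the following is a minimal Java sketch of the two user-supplied functions; the interface and type names are illustrative only and do not belong to any of the runtimes discussed here.

import java.util.List;
import java.util.Map;

// An illustrative rendering of the MapReduce contract quoted above:
// map: (k1, v1) -> list of (k2, v2);  reduce: (k2, list of v2) -> list of v3.
interface Mapper<K1, V1, K2, V2> {
    // Called once per input record; emits zero or more intermediate key/value pairs.
    List<Map.Entry<K2, V2>> map(K1 key, V1 value);
}

interface Reducer<K2, V2, V3> {
    // Called once per distinct intermediate key, with all values sharing that key.
    List<V3> reduce(K2 key, List<V2> values);
}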

MapReduce
1. Data is split into m parts (D1, D2, ..., Dm)
2. The map function is performed on each of these data parts concurrently
3. A hash function maps the results of the map tasks to r reduce tasks
4. Once all the results for a particular reduce task are available, the framework executes the reduce task, producing the outputs (O1, O2, ...)
5. A combine task may be necessary to combine all the outputs of the reduce functions together
The framework supports: splitting of data, passing the output of the map functions to the reduce functions, sorting the inputs to the reduce function based on the intermediate keys, and quality of services
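
The hash function in step 3 is typically just the intermediate key's hash taken modulo the number of reduce tasks; a hypothetical one-method Java sketch (the class and method names are assumptions, not any framework's API):

class Partitioner {
    // Maps an intermediate key to one of r reduce tasks (step 3 above).
    static int partitionFor(Object intermediateKey, int r) {
        // Mask off the sign bit so the result stays within [0, r).
        return (intermediateKey.hashCode() & Integer.MAX_VALUE) % r;
    }
}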

Hadoop Example: Word Count
map(String key, String value):
  // key: document name
  // value: document contents
reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
Task Trackers:
- Execute map tasks; output of the map tasks is written to local files
- Retrieve map results via HTTP
- Sort the outputs
- Execute reduce tasks
[Figure: a Hadoop cluster of data/compute nodes (DN), each running a TaskTracker (TT), with map and reduce (R) tasks scheduled on the TaskTrackers]
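
The pseudocode above shows only the signatures; a self-contained Java sketch that completes the same word-count logic (illustrative only, not the actual Hadoop API) could look like this:

import java.util.*;

public class WordCountSketch {
    // map: for each word in the document, emit the pair (word, 1).
    static List<Map.Entry<String, Integer>> map(String docName, String contents) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : contents.split("\\s+")) {
            if (!w.isEmpty()) out.add(Map.entry(w, 1));
        }
        return out;
    }

    // reduce: sum all the counts emitted for one word.
    static int reduce(String word, Iterator<Integer> counts) {
        int sum = 0;
        while (counts.hasNext()) sum += counts.next();
        return sum;
    }
}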

Current Limitations
- The MapReduce programming model could be applied to most composable applications, but:
- The current MapReduce model and runtimes focus on single-step MapReduce computations only
- Intermediate data is stored and accessed via file systems, which is inefficient for the iterative computations to which the MapReduce technique could otherwise be applied
- There is no performance model to compare with other high-level or low-level parallel runtimes

CGL-MapReduce
[Figure: architecture of CGL-MapReduce - worker nodes run map workers (M), reduce workers (R), and MRDaemons (D); the MRDriver and the User Program communicate with the workers through a content dissemination network, while data splits and data read/write go through the file system]
- A streaming-based MapReduce runtime implemented in Java
- All communications (control messages and intermediate results) are routed via a content dissemination network
- Intermediate results are transferred directly from the map tasks to the reduce tasks, eliminating local files
- The MRDriver maintains the state of the system and controls the execution of map/reduce tasks
- The User Program is the composer of MapReduce computations
- Supports both single-step and iterative MapReduce computations
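
To illustrate the streaming design choice, the sketch below shows a map worker publishing each intermediate result straight to the topic of the responsible reduce task instead of writing it to a local file; the Broker interface and topic naming are assumptions for illustration, not the actual CGL-MapReduce implementation.

import java.util.function.Consumer;

// Hypothetical content-dissemination broker interface (illustrative only).
interface Broker {
    void publish(String topic, byte[] payload);
    void subscribe(String topic, Consumer<byte[]> handler);
}

class StreamingMapWorker {
    private final Broker broker;
    private final int numReduceTasks;

    StreamingMapWorker(Broker broker, int numReduceTasks) {
        this.broker = broker;
        this.numReduceTasks = numReduceTasks;
    }

    // Instead of writing (key, value) to a local file, push the value directly
    // to the reduce task responsible for this key.
    void emit(String key, byte[] value) {
        int r = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        broker.publish("reduce-task-" + r, value);
    }
}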

CGL-MapReduce: The Flow of Execution
1. Initialization: start the map/reduce workers; configure both map/reduce tasks (for configurations/fixed data)
2. Map: execute map tasks, passing <key, value> pairs
3. Reduce: execute reduce tasks, passing <key, List<value>> pairs
4. Combine: combine the outputs of all the reduce tasks
5. Termination: terminate the map/reduce workers
[Figure: CGL-MapReduce flow of execution - fixed data is configured once; variable data flows through map, reduce, and combine on each pass of an iterative MapReduce computation, coordinated by the MRDriver and the User Program over the content dissemination network, with data splits read from the file system]
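
A sketch of how a user program might drive an iterative computation under this flow: the fixed data is configured once in step 1 and only the variable data is passed on each iteration. The MRDriver interface and its method names are assumptions made for illustration, not the published CGL-MapReduce API.

// Hypothetical driver interface and loop for an iterative MapReduce computation.
interface MRDriver<T> {
    void configure(byte[] fixedData);   // send configuration / fixed data once
    void map(T variableData);           // run map tasks over <key, value> pairs
    void reduce();                      // run reduce tasks over <key, List<value>>
    T combine();                        // merge all reduce outputs into one result
    void terminate();                   // shut down the map/reduce workers
}

class IterativeJobSketch {
    static <T> T run(MRDriver<T> driver, byte[] fixedData, T initial, int maxIterations) {
        driver.configure(fixedData);             // 1. Initialization (fixed data sent only once)
        T current = initial;
        for (int i = 0; i < maxIterations; i++) {
            driver.map(current);                 // 2. Map
            driver.reduce();                     // 3. Reduce
            T next = driver.combine();           // 4. Combine
            boolean done = next.equals(current); // crude convergence test for the sketch
            current = next;                      // feed the result back as the next variable data
            if (done) break;
        }
        driver.terminate();                      // 5. Termination
        return current;
    }
}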

HEP Data Analysis
- Data: up to 1 terabyte of data, placed in the IU Data Capacitor
- Processing: 12 dedicated computing nodes from Quarry (total of 96 processing cores)
- MapReduce for HEP data analysis: execution time vs. the volume of data (fixed compute resources)
- Hadoop and CGL-MapReduce both show similar performance
- The amount of data accessed in each analysis is extremely large, so performance is limited by the I/O bandwidth
- The overhead induced by the MapReduce implementations has a negligible effect on the overall computation

HEP Data Analysis: Scalability and Speedup
- Execution time vs. the number of compute nodes (fixed data)
- Speedup for 100 GB of HEP data
- One core of each node is used (performance is limited by the I/O bandwidth)
- Speedup = Sequential Time / MapReduce Time
- Speed gains diminish after a certain number of parallel processing units (around 10 units)

Kmeans Clustering
- MapReduce for Kmeans clustering: execution time vs. the number of 2D data points (both axes in log scale)
- All three implementations perform the same Kmeans clustering algorithm
- Each test is performed using 5 compute nodes (total of 40 processor cores)
- CGL-MapReduce shows performance close to the MPI implementation
- Hadoop's high execution time is due to: lack of support for iterative MapReduce computations, and the overhead associated with file-system-based communication
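
For reference, a minimal Java sketch of one Kmeans iteration expressed as map and reduce steps (illustrative only, not the code used in these experiments): the map step assigns each 2D point to its nearest current centroid, and the reduce step averages the points assigned to each centroid to form the new centroid. Because the new centroids must be fed back in for the next iteration, the computation is naturally an iterative MapReduce.

import java.util.*;

public class KmeansStepSketch {
    // map: emit (nearestCentroidIndex, point) for one 2D point.
    static Map.Entry<Integer, double[]> map(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dx = point[0] - centroids[c][0], dy = point[1] - centroids[c][1];
            double d = dx * dx + dy * dy;
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return Map.entry(best, point);
    }

    // reduce: average all points assigned to one centroid to obtain the new centroid.
    static double[] reduce(int centroidIndex, List<double[]> points) {
        double sx = 0, sy = 0;
        for (double[] p : points) { sx += p[0]; sy += p[1]; }
        return new double[] { sx / points.size(), sy / points.size() };
    }
}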

Overheads of Different Runtimes
- Overhead: f(P) = [P * T(P) - T(1)] / T(1), where P is the number of hardware processing units, T(P) is the running time as a function of P, and T(1) is the time of the sequential program (P = 1). For example, if T(1) = 100 s and T(8) = 15 s, then f(8) = (8 * 15 - 100) / 100 = 0.2.
- The overhead diminishes with the amount of computation
- Loosely synchronous MapReduce (CGL-MapReduce) also shows overheads close to MPI for sufficiently large problems
- Hadoop's higher overheads may limit its use for these types of (iterative MapReduce) computations

More Applications
- Matrix Multiply: MapReduce for matrix multiplication - an iterative algorithm
- Histogramming Words - a simple single-step MapReduce application
- The streaming approach provides better performance in both applications

Multicore and the Runtimes
- The papers [1] and [2] evaluate the performance of MapReduce on multicore computers
- Our results show converging performance for the different runtimes; the right-hand graph could be a snapshot of this convergence path
- Ease of programming could be a consideration
- Still, threads are faster in shared-memory systems
[1] "Evaluating MapReduce for Multi-core and Multiprocessor Systems", C. Ranger et al.
[2] "Map-Reduce for Machine Learning on Multicore", C. Chu et al.

Conclusions
[Figure: a spectrum from parallel algorithms with fine-grained sub-computations and tight synchronization constraints to MapReduce/Cloud-style parallel algorithms with coarse-grained sub-computations and loose synchronization constraints]
- Given sufficiently large problems, all runtimes converge in performance
- Streaming-based MapReduce implementations provide the faster performance necessary for most composable applications
- Support for iterative MapReduce computations expands the usability of MapReduce runtimes

Future Work
- Research different fault tolerance strategies for CGL-MapReduce and arrive at a set of architectural recommendations
- Integration of a distributed file system such as HDFS
- Applicability to cloud computing environments

Questions? Thank You!

Links
- Hadoop vs. CGL-MapReduce
- Is it fair to compare Hadoop with CGL-MapReduce?
- DRYAD
- Fault Tolerance
- Rootlet Architecture
- Nimbus vs. Eucalyptus

Hadoop vs. CGL-MapReduce

Feature | Hadoop | CGL-MapReduce
Implementation Language | Java | Java
Other Language Support | Uses Hadoop Streaming (text data only) | Requires Java wrapper classes
Distributed File System | HDFS | Currently assumes a shared file system between nodes
Accessing binary data from other languages | Currently only a Java interface is available | The shared file system enables this functionality
Fault Tolerance | Supports node failures | Currently does not support fault tolerance
Iterative Computations | Not supported | Supports iterative MapReduce
Daemon Initialization | Requires ssh public-key access | Requires ssh public-key access

Is it fair to compare Hadoop with CGL-MapReduce?
- Hadoop accesses data via a distributed file system
- Hadoop stores all the intermediate results in this file system to ensure fault tolerance
- Is this the optimal strategy? Can we use Hadoop only for single-pass MapReduce computations?
- Would writing the intermediate results at the reduce task be a better strategy?
  - Considerable reduction in data from map to reduce
  - Possibility of using duplicate reduce tasks

Fault Tolerance
- Hadoop/Google fault tolerance:
  - Data (input, output, and intermediate) are stored in HDFS, which uses replication
  - The task tracker is a single point of failure (checkpointing)
  - Failed map/reduce tasks are re-executed
- CGL-MapReduce:
  - Integration of HDFS or a similar parallel file system for input/output data
  - The MRDriver is a single point of failure (checkpointing)
  - Re-execution of failed map tasks (considerable reduction in data from map to reduce)
  - Redundant reduce tasks
- Amazon model: S3, EC2, and SQS; reliable queues

DRYAD
- The computation is structured as a directed graph
- A Dryad job is a graph generator which can synthesize any directed acyclic graph
- These graphs can even change during execution, in response to important events in the computation
- Dryad handles job creation and management, resource management, job monitoring and visualization, fault tolerance, re-execution, scheduling, and accounting
- How to support iterative computations?