Programming Models MapReduce

Similar documents
Cloud Computing CS

Database Applications (15-415)

Introduction to Map Reduce

ECE5610/CSC6220 Introduction to Parallel and Distribution Computing. Lecture 6: MapReduce in Parallel Computing

Lecture 11 Hadoop & Spark

Hadoop Map Reduce 10/17/2018 1

2/26/2017. For instance, consider running Word Count across 20 splits

Clustering Lecture 8: MapReduce

BigData and Map Reduce VITMAC03

MapReduce. U of Toronto, 2014

CCA-410. Cloudera. Cloudera Certified Administrator for Apache Hadoop (CCAH)

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros

Where We Are. Review: Parallel DBMS. Parallel DBMS. Introduction to Data Management CSE 344

FLAT DATACENTER STORAGE. Paper-3 Presenter-Pratik Bhatt fx6568

Introduction to MapReduce

CS-2510 COMPUTER OPERATING SYSTEMS

MapReduce Algorithm Design

MapReduce: Simplified Data Processing on Large Clusters 유연일민철기

A brief history on Hadoop

Announcements. Parallel Data Processing in the 20 th Century. Parallel Join Illustration. Introduction to Database Systems CSE 414

Hadoop MapReduce Framework

Hadoop An Overview. - Socrates CCDH

Parallel Programming Principle and Practice. Lecture 10 Big Data Processing with MapReduce

Introduction to MapReduce (cont.)

HADOOP FRAMEWORK FOR BIG DATA

CS6030 Cloud Computing. Acknowledgements. Today s Topics. Intro to Cloud Computing 10/20/15. Ajay Gupta, WMU-CS. WiSe Lab

Database Systems CSE 414

Introduction to Data Management CSE 344

HDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung

MapReduce: Recap. Juliana Freire & Cláudio Silva. Some slides borrowed from Jimmy Lin, Jeff Ullman, Jerome Simeon, and Jure Leskovec

TP1-2: Analyzing Hadoop Logs

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

Announcements. Optional Reading. Distributed File System (DFS) MapReduce Process. MapReduce. Database Systems CSE 414. HW5 is due tomorrow 11pm

CS 61C: Great Ideas in Computer Architecture. MapReduce

Introduction to Data Management CSE 344

Survey Paper on Traditional Hadoop and Pipelined Map Reduce

Dept. Of Computer Science, Colorado State University

A SURVEY ON SCHEDULING IN HADOOP FOR BIGDATA PROCESSING

Developing MapReduce Programs

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2013/14

Improving the MapReduce Big Data Processing Framework

Batch Processing Basic architecture

Principles of Data Management. Lecture #16 (MapReduce & DFS for Big Data)

Hadoop. copyright 2011 Trainologic LTD

Introduction to MapReduce

The MapReduce Abstraction

Hadoop and HDFS Overview. Madhu Ankam

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 10 Parallel Programming Models: Map Reduce and Spark

Voldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Big Data Analytics. Izabela Moise, Evangelos Pournaras, Dirk Helbing

Hadoop/MapReduce Computing Paradigm

CLOUD-SCALE FILE SYSTEMS

Camdoop Exploiting In-network Aggregation for Big Data Applications Paolo Costa

Parallel Programming Concepts

Big Data for Engineers Spring Resource Management

Certified Big Data and Hadoop Course Curriculum

UNIT V PROCESSING YOUR DATA WITH MAPREDUCE Syllabus

A BigData Tour HDFS, Ceph and MapReduce

2/4/2019 Week 3- A Sangmi Lee Pallickara

Programming Systems for Big Data

Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Yahoo! Sunnyvale, California USA {Shv, Hairong, SRadia,

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS

Big Data 7. Resource Management

This material is covered in the textbook in Chapter 21.

Parallel Computing: MapReduce Jin, Hai

Mitigating Data Skew Using Map Reduce Application

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

Introduction to MapReduce

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

MI-PDB, MIE-PDB: Advanced Database Systems

What Is Datacenter (Warehouse) Computing. Distributed and Parallel Technology. Datacenter Computing Architecture

50 Must Read Hadoop Interview Questions & Answers

Map Reduce. Yerevan.

The Google File System

Survey on MapReduce Scheduling Algorithms

CISC 7610 Lecture 2b The beginnings of NoSQL

CompSci 516: Database Systems

Introduction to MapReduce. Instructor: Dr. Weikuan Yu Computer Sci. & Software Eng.

L5-6:Runtime Platforms Hadoop and HDFS

MapReduce. Stony Brook University CSE545, Fall 2016

Introduction to Map/Reduce. Kostas Solomos Computer Science Department University of Crete, Greece

MapReduce-style data processing

Announcements. Reading Material. Map Reduce. The Map-Reduce Framework 10/3/17. Big Data. CompSci 516: Database Systems

CS427 Multicore Architecture and Parallel Computing

CS 138: Google. CS 138 XVI 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

MapReduce. Kiril Valev LMU Kiril Valev (LMU) MapReduce / 35

Chapter 5. The MapReduce Programming Model and Implementation

The amount of data increases every day Some numbers ( 2012):

2/26/2017. The amount of data increases every day Some numbers ( 2012):

Chisel++: Handling Partitioning Skew in MapReduce Framework Using Efficient Range Partitioning Technique

Distributed Computation Models

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692)

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

HDFS Architecture. Gregory Kesden, CSE-291 (Storage Systems) Fall 2017

Introduction to Distributed Data Systems

Tutorial Outline. Map/Reduce. Data Center as a computer [Patterson, cacm 2008] Acknowledgements

MapReduce & HyperDex. Kathleen Durant PhD Lecture 21 CS 3200 Northeastern University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Locality Aware Fair Scheduling for Hammr

Transcription:

Programming Models MapReduce Majd Sakr, Garth Gibson, Greg Ganger, Raja Sambasivan 15-719/18-847b Advanced Cloud Computing Fall 2013 Sep 23, 2013 1

MapReduce In a Nutshell MapReduce incorporates two phases Map Phase Reduce phase HDFS Split BLK 0 HDFS Split BLK 1 Dataset HDFS Split BLK 2 HDFS HDFS Split BLK 3 Map Task Map Task Map Task Map Task Map Phase Shuffle Stage Merge & Sort Stage Reduce Phase Reduce Task Reduce Task Reduce Task Reduce Stage To HDFS

Data Distribution In a MapReduce cluster, data is distributed to all the nodes of the cluster as it is being loaded An underlying distributed systems (e.g., GFS, HDFS) splits large data s into chunks which are managed by different nodes in the cluster Input data: A large Node 1 Chunk of input data Node 2 Chunk of input data Node 3 Chunk of input data Even though the chunks are distributed across several machines, they form a single namespace 7

MapReduce In MapReduce, splits are processed in isolation by tasks called Mappers The output from the mappers is denoted as intermediate output and brought into a second set of tasks called Reducers Map Phase splits mappers C0 C1 C2 C3 M0 M1 M2 M3 IO IO IO IO The process of reading intermediate output into a set of Reducers is known as shuffling The Reducers produce the final output Reduce Phase Shuffling Data Merge & Sort Reducers R0 R1 Overall, MapReduce breaks the data flow into two phases, map phase and reduce phase FO FO

Keys and Values The programmer in MapReduce has to specify two functions, the map function and the reduce function that implement the Mapper and the Reducer in a MapReduce program In MapReduce data elements are always structured as key-value (i.e., (K, V)) pairs The map and reduce functions receive and emit (K, V) pairs Input Splits Intermediate Output Final Output (K, V) Pairs Map Function (K, V ) Pairs Reduce Function (K, V ) Pairs

Splits A split can contain reference to one or more HDFS blocks Configurable parameter Each map task processes one split # of splits dictates the # of map tasks By default one split contains reference to one HDFS block Map tasks are scheduled in the vicinity of HDFS blocks to reduce network traffic

s Map tasks store intermediate output on local disk (not HDFS) A subset of intermediate key space is assigned to each Reducer hash(key) mod R These subsets are known as partitions Different colors represent different keys (potentially) from different Mappers s are the input to Reducers

Network Topology In MapReduce MapReduce assumes a tree style network topology Nodes are spread over different racks embraced in one or many data centers The bandwidth between two nodes is dependent on their relative locations in the network topology The assumption is that nodes that are on the same rack will have higher bandwidth between them as opposed to nodes that are off-rack

Hadoop MapReduce: A Closer Look Node 1 Node 2 Files loaded from local HDFS store Files loaded from local HDFS store InputFormat InputFormat Split Split Split Split Split Split RecordReaders RR RR RR RR RR RR RecordReaders Input (K, V) pairs Input (K, V) pairs Map Map Map Map Map Map Intermediate (K, V) pairs er Shuffling Process er Intermediate (K, V) pairs Sort Reduce Intermediate (K,V) pairs exchanged by all nodes Sort Reduce Final (K, V) pairs Final (K, V) pairs Writeback to local HDFS store OutputFormat OutputFormat Writeback to local HDFS store

Input Files Input s are where the data for a MapReduce task is initially stored The input s typically reside in a distributed system (e.g. HDFS) The format of input s is arbitrary Line-based log s Binary s Multi-line input records Or something else entirely 14

Input Splits An input split describes a unit of data that a single map task in a MapReduce program will process By dividing the into splits, we allow several map tasks to operate on a single in parallel Files loaded from local HDFS store If the is very large, this can improve performance significantly through parallelism Each map task corresponds to a single input split InputFormat Split Split Split

RecordReader The input split defines a slice of data but does not describe how to access it The RecordReader class actually loads data from its source and converts it into (K, V) pairs suitable for reading by Mappers The RecordReader is invoked repeatedly on the input until the entire split is consumed Files loaded from local HDFS store InputFormat Each invocation of the RecordReader leads to another call of the map function defined by the programmer Split Split Split RR RR RR

Mapper and Reducer The Mapper performs the user-defined work of the first phase of the MapReduce program Files loaded from local HDFS store A new instance of Mapper is created for each split InputFormat The Reducer performs the user-defined work of the second phase of the MapReduce program Split Split Split A new instance of Reducer is created for each partition For each key in the partition assigned to a Reducer, the Reducer is called once RR RR RR Map Map Map er Sort Reduce

er Each mapper may emit (K, V) pairs to any partition Files loaded from local HDFS store Therefore, the map nodes must all agree on where to send different pieces of intermediate data The partitioner class determines which partition a given (K,V) pair will go to The default partitioner computes a hash value for a given key and assigns it to a partition based on this result InputFormat Split Split Split RR RR RR Map Map Map er Sort Reduce

Sort Each Reducer is responsible for reducing the values associated with (several) intermediate keys Files loaded from local HDFS store InputFormat The set of intermediate keys on a single node is automatically sorted by MapReduce before they are presented to the Reducer Split Split Split RR RR RR Map Map Map er Sort Reduce

Combiner Functions MapReduce applications are limited by the bandwidth available on the cluster It pays to minimize the data shuffled between map and reduce tasks Hadoop allows the user to specify a combiner function (just like the reduce function) to be run on a map output (Y, T) R R N N N N MT MT MT MT MT MT RT MT Map output (1950, 0) (1950, 20) (1950, 10) LEGEND: Combiner output R = Rack N = Node MT = Map Task RT = Reduce Task Y = Year T = Temperature (1950, 20)

Job Scheduling in MapReduce In MapReduce, an application is represented as a job A job encompasses multiple map and reduce tasks Job schedulers in MapReduce are pluggable Hadoop MapReduce by default FIFO scheduler for jobs Schedules jobs in order of submission Starvation with long-running jobs No job preemption No evaluation of job priority or size 24

Multi-user Job Scheduling in MapReduce Fair scheduler (Facebook) Pools of jobs, each pool is assigned a set of shares Jobs get (on ave) equal share of slots over time Across pools, use Fair scheduler, within pool, FIFO or Fair scheduler Capacity scheduler (Yahoo!) Creates job queues Each queue is configured with # (capacity) of slots Within queue, scheduling is priority based

Task Scheduling in MapReduce MapReduce adopts a master-slave architecture The master node in MapReduce is referred to as Job Tracker (JT) Each slave node in MapReduce is referred to as Task Tracker (TT) MapReduce adopts a pull scheduling strategy rather than a push one JT Tasks Queue T0 T1 T2 TT Task Slots TT Task Slots I.e., JT does not push map and reduce tasks to TTs but rather TTs pull them by making pertaining requests 26

Map and Reduce Task Scheduling Every TT sends a heartbeat message periodically to JT encompassing a request for a map or a reduce task to run I. Map Task Scheduling: JT satisfies requests for map tasks via attempting to schedule mappers in the vicinity of their input splits (i.e., it considers locality) II. Reduce Task Scheduling: However, JT simply assigns the next yet-to-run reduce task to a requesting TT regardless of TT s network location and its implied effect on the reducer s shuffle time (i.e., it does not consider locality) 27

Task Scheduling

Fault Tolerance in Hadoop Data redundancy Achieved at the storage layer through replicas (default is 3) Stored a physically separate machines Can tolerate Corrupted s Faulty nodes HDFS: Computes checksums for all data written to it Verifies when reading Task Resiliency (task slowdown or failure) Monitor tasks to detect faulty or slow Replicate 31

Task Resiliency MapReduce can guide jobs toward a successful completion even when jobs are run on a large cluster where probability of failures increases The primary way that MapReduce achieves fault tolerance is through restarting tasks If a TT fails to communicate with JT for a period of time (by default, 1 minute in Hadoop), JT will assume that TT in question has crashed If the job is still in the map phase, JT asks another TT to re-execute all Mappers that previously ran at the failed TT If the job is in the reduce phase, JT asks another TT to re-execute all Reducers that were in progress on the failed TT 32

Speculative Execution A MapReduce job is dominated by the slowest task MapReduce attempts to locate slow tasks (stragglers) and run redundant (speculative) tasks that will optimistically commit before the corresponding stragglers This process is known as speculative execution Only one copy of a straggler is allowed to be speculated Whichever copy of a task commits first, it becomes the definitive copy, and the other copy is killed by JT

What Makes MapReduce Unique? MapReduce is characterized by: 1. Its simplified programming model which allows the user to quickly write and test distributed systems 2. Its efficient and automatic distribution of data and workload across machines 3. Its flat scalability curve. Specifically, after a Mapreduce program is written and functioning on 10 nodes, very little-if any- work is required for making that same program run on 1000 nodes 4. Its fault tolerance approach 36