MapReduce 1

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 2

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 3

Cluster Architecture [diagram: racks of 16-64 commodity nodes, each with CPU, memory, and disk; 1 Gbps between any pair of nodes within a rack via the rack switch; a 2-10 Gbps backbone switch between racks] 4

Stable storage First order problem: if nodes can fail, how can we store data persistently? Answer: Distributed File System Provides global file namespace Google GFS; Hadoop HDFS; Kosmix KFS Typical usage pattern Huge files (100s of GB to TB) Data is rarely updated in place Reads and appends are common 5

Distributed File System Chunk servers: the file is split into contiguous chunks, typically 16-64 MB each; each chunk is replicated (usually 2x or 3x), with replicas kept in different racks when possible. Master node (Name Node in HDFS): stores metadata; might be replicated. Client library for file access: talks to the master to find chunk servers, then connects directly to chunk servers to access data. 6
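To make the chunking and replication concrete, here is a minimal Python sketch of how a file might be split into chunks and its replicas spread across racks. The chunk size, replication factor, rack layout, and placement policy below are illustrative assumptions, not the actual GFS/HDFS algorithm.

    import random

    CHUNK_SIZE = 64 * 2**20          # 64 MB, a typical chunk size from the slide
    REPLICATION = 3                  # "usually 2x or 3x"

    def place_chunks(file_size, racks):
        """Split a file into fixed-size chunks and pick replica locations,
        preferring different racks (an illustrative policy only)."""
        num_chunks = -(-file_size // CHUNK_SIZE)          # ceiling division
        placement = []
        for chunk_id in range(num_chunks):
            chosen_racks = random.sample(sorted(racks), REPLICATION)
            replicas = [(rack, random.choice(racks[rack])) for rack in chosen_racks]
            placement.append((chunk_id, replicas))
        return placement

    # Example: a 200 MB file over three racks of two nodes each
    racks = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"], "rack3": ["n5", "n6"]}
    for chunk_id, replicas in place_chunks(200 * 2**20, racks):
        print(chunk_id, replicas)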

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 7

MapReduce: The Map Step [diagram: input key-value pairs (k, v) are fed to map tasks, which emit intermediate key-value pairs] 8

MapReduce: The Reduce Step [diagram: intermediate key-value pairs are grouped by key into key-value groups (k, list of v), and each group is fed to reduce, producing output key-value pairs] 9

MapReduce Input: a set of key/value pairs. User supplies two functions: map(k,v) → list(k1,v1) and reduce(k1, list(v1)) → v2. (k1,v1) is an intermediate key/value pair. Output is the set of (k1,v2) pairs. 10

Word Count using MapReduce
map(key, value):
  // key: document name; value: text of the document
  for each word w in value:
    emit(w, 1)
reduce(key, values):
  // key: a word; values: an iterator over counts
  result = 0
  for each count v in values:
    result += v
  emit(key, result) 11
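The pseudocode above can be run locally with a small in-memory simulation of the map / group-by-key / reduce data flow. This is only a sketch of the programming model; the function names and the driver are mine, not part of Hadoop or the slides.

    from collections import defaultdict

    def map_fn(doc_name, text):
        # key: document name; value: text of the document
        for word in text.split():
            yield (word, 1)

    def reduce_fn(word, counts):
        # key: a word; values: an iterable of counts
        yield (word, sum(counts))

    def run_mapreduce(inputs, map_fn, reduce_fn):
        """Tiny in-memory simulation of the MapReduce data flow:
        map, group intermediate pairs by key, then reduce each group."""
        groups = defaultdict(list)
        for key, value in inputs:
            for k, v in map_fn(key, value):
                groups[k].append(v)
        output = []
        for k, values in sorted(groups.items()):
            output.extend(reduce_fn(k, values))
        return output

    docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog")]
    print(run_mapreduce(docs, map_fn, reduce_fn))
    # [('brown', 1), ('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]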

Distributed Execution Overview [diagram: the user program forks a Master and Worker processes; the Master assigns map and reduce tasks; map Workers read the input splits (Split 0, Split 1, Split 2) and write intermediate data to local disk; reduce Workers do remote reads and a sort, then write Output File 0 and Output File 1] 12

Data flow Input, final output are stored on a distributed file system Scheduler tries to schedule map tasks close to physical storage location of input data Intermediate results are stored on local FS of map and reduce workers Output is often input to another map reduce task 13

Coordination Master data structures Task status: (idle, in-progress, completed) Idle tasks get scheduled as workers become available When a map task completes, it sends the master the location and sizes of its R intermediate files, one for each reducer Master pushes this info to reducers Master pings workers periodically to detect failures 14

Failures Map worker failure Map tasks completed or in-progress at worker are reset to idle Reduce workers are notified when task is rescheduled on another worker Reduce worker failure Only in-progress tasks are reset to idle Master failure MapReduce task is aborted and client is notified 15

How many Map and Reduce jobs? M map tasks, R reduce tasks Rule of thumb: Make M and R much larger than the number of nodes in cluster One DFS chunk per map is common Improves dynamic load balancing and speeds recovery from worker failure Usually R is smaller than M, because output is spread across R files 16

Combiners Often a map task will produce many pairs of the form (k,v1), (k,v2), … for the same key k E.g., popular words in Word Count Can save network time by pre-aggregating at the mapper: combine(k1, list(v1)) → v2 Usually the same as the reduce function Works only if the reduce function is commutative and associative 17
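A sketch of the combiner idea for Word Count, assuming an in-memory simulation as before: each map task pre-sums its own (word, 1) pairs before they would be shuffled, and the reduce function is unchanged. Names are illustrative.

    from collections import Counter, defaultdict

    def map_with_combiner(doc_name, text):
        """Map for word count, with a combiner applied to this task's
        output before anything is sent over the network."""
        local_counts = Counter(text.split())          # combine: pre-sum per key
        return list(local_counts.items())             # far fewer (word, n) pairs

    def reduce_fn(word, partial_counts):
        return (word, sum(partial_counts))            # the same function works as reduce

    # Simulated shuffle over two map tasks
    intermediate = defaultdict(list)
    for doc in [("d1", "a a a b"), ("d2", "a b b")]:
        for word, n in map_with_combiner(*doc):
            intermediate[word].append(n)
    print([reduce_fn(w, ns) for w, ns in sorted(intermediate.items())])
    # [('a', 4), ('b', 3)]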

Exercise 1: Host size Suppose we have a large web corpus Let's look at the metadata file Lines of the form (URL, size, date, …) For each host, find the total number of bytes, i.e., the sum of the page sizes for all URLs from that host 18
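One possible solution sketch for Exercise 1 (not from the slides): map emits (host, size) for each metadata line, and reduce sums the sizes per host. The tab-separated line format and the small driver below are assumptions made for illustration.

    from urllib.parse import urlparse

    def map_fn(_, line):
        # assumed line format: "URL<TAB>size<TAB>date<TAB>..."
        url, size = line.split("\t")[:2]
        host = urlparse(url).netloc
        yield (host, int(size))

    def reduce_fn(host, sizes):
        yield (host, sum(sizes))     # total bytes for all URLs from this host

    lines = [("", "http://a.com/x\t100\t2012-01-01"),
             ("", "http://a.com/y\t50\t2012-01-02"),
             ("", "http://b.org/z\t70\t2012-01-03")]
    groups = {}
    for key, line in lines:
        for host, size in map_fn(key, line):
            groups.setdefault(host, []).append(size)
    print([next(reduce_fn(h, s)) for h, s in sorted(groups.items())])
    # [('a.com', 150), ('b.org', 70)]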

Exercise 2: Graph reversal Given a directed graph as an adjacency list: src1: dest11, dest12, … src2: dest21, dest22, … Construct the graph in which all the links are reversed 19
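One possible solution sketch for Exercise 2 (not from the slides): map emits one (destination, source) pair per edge, and reduce collects the sources of each destination into the reversed adjacency list. Names and the tiny example graph are mine.

    def map_fn(src, dests):
        # one input pair per adjacency-list line: (source, list of destinations)
        for dest in dests:
            yield (dest, src)          # reverse each edge

    def reduce_fn(dest, srcs):
        yield (dest, sorted(srcs))     # adjacency list of the reversed graph

    graph = [("a", ["b", "c"]), ("b", ["c"])]
    groups = {}
    for src, dests in graph:
        for k, v in map_fn(src, dests):
            groups.setdefault(k, []).append(v)
    print([next(reduce_fn(k, v)) for k, v in sorted(groups.items())])
    # [('b', ['a']), ('c', ['a', 'b'])]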

Implementations Google Not available outside Google Hadoop An open-source implementation in Java Uses HDFS for stable storage Aster Data Cluster-optimized SQL Database that also implements MapReduce 20

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 21

Overview There is a new computing environment available: Massive files, many computing nodes. Map-reduce allows us to exploit this environment easily. But not everything is map-reduce. What else can we do in the same environment? 22

Files Stored in dedicated file system. Treated like relations. Order of elements does not matter. Massive chunks (e.g., 64MB). Chunks are replicated. Parallel read/write of chunks is possible. 23

Processes Each process operates at one node. Infinite supply of nodes. Communication among processes can be via the file system or special communication channels. Example: Master controller assembling output of Map processes and passing them to Reduce processes. 24

Algorithms An algorithm is described by an acyclic graph. 1. A collection of processes (nodes). 2. Arcs from node a to node b, indicating that (part of) the output of a goes to the input of b. 25

Example: A Map-Reduce Graph [diagram: a set of map processes with arcs to a set of reduce processes] 26

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 27

Algorithm Design Goal: Algorithms should exploit as much parallelism as possible. To encourage parallelism, we put a limit s on the amount of input or output that any one process can have. s could be: What fits in main memory. What fits on local disk. No more than a process can handle before cosmic rays are likely to cause an error. 28

Cost Measures for Algorithms 1. Communication cost = total I/O of all processes. 2. Elapsed communication cost = max of I/O along any path. 3. (Elapsed) computation costs are analogous, but count only the running time of processes. 29

Example: Cost Measures For a map-reduce algorithm: Communication cost = input file size + 2 × (sum of the sizes of all files passed from Map processes to Reduce processes) + the sum of the output sizes of the Reduce processes. Elapsed communication cost is the sum of the largest input + output for any Map process, plus the same for any Reduce process. 30
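A hypothetical worked instance of the first formula; all sizes are made-up illustrative numbers, not from the slides.

    # Illustrative sizes in GB (assumptions, not from the slides)
    input_size   = 100   # read by the Map processes from the DFS
    intermediate = 80    # passed from Map to Reduce; written once and read once, hence the factor of 2
    output_size  = 20    # written by the Reduce processes
    communication_cost = input_size + 2 * intermediate + output_size
    print(communication_cost)    # 280 GB of total I/O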

What Cost Measures Mean Either the I/O (communication) or processing (computation) cost dominates. Ignore one or the other. Total costs tell what you pay in rent from your friendly neighborhood cloud. Elapsed costs are wall-clock time using parallelism. 31

Outline Distributed File System Map-Reduce The Computational Model Map-Reduce Algorithm Evaluation Computing Joins 32

Join By Map-Reduce Compute the natural join R(A,B) ⋈ S(B,C). R and S each are stored in files. Tuples are pairs (a,b) or (b,c). 33

Map-Reduce Join (2) Use a hash function h from B-values to 1..k. A Map process turns input tuple R(a,b) into key-value pair (b,(a,r)) and each input tuple S(b,c) into (b,(c,s)). 34

Map-Reduce Join (3) Map processes send each key-value pair with key b to Reduce process h(b). Hadoop does this automatically; just tell it what k is. Each Reduce process matches all the pairs (b,(a,r)) with all (b,(c,s)) and outputs (a,b,c). 35
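A minimal sketch of this join, simulated in memory: the map functions tag tuples with their relation of origin, a dictionary keyed by h(b) stands in for the Reduce processes, and each Reduce process pairs up the R- and S-tuples that share a B-value. All names and the tiny example relations are mine.

    from collections import defaultdict

    K = 4                                  # number of Reduce processes
    h = lambda b: hash(b) % K              # hash from B-values to 0..K-1

    def map_R(a, b):
        yield (b, ("R", a))                # R(a,b) -> key b, value (a, tagged R)

    def map_S(b, c):
        yield (b, ("S", c))                # S(b,c) -> key b, value (c, tagged S)

    def reduce_join(b, values):
        """Match every R-value with every S-value that shares this B-value."""
        r_side = [a for tag, a in values if tag == "R"]
        s_side = [c for tag, c in values if tag == "S"]
        for a in r_side:
            for c in s_side:
                yield (a, b, c)

    # Simulated shuffle: every pair with key b goes to "reducer" h(b)
    R = [(1, "x"), (2, "y")]
    S = [("x", 10), ("x", 11), ("z", 12)]
    reducers = defaultdict(list)
    for a, b in R:
        for k, v in map_R(a, b):
            reducers[h(k), k].append(v)
    for b, c in S:
        for k, v in map_S(b, c):
            reducers[h(k), k].append(v)
    result = [t for (_, b), vals in reducers.items() for t in reduce_join(b, vals)]
    print(sorted(result))                  # [(1, 'x', 10), (1, 'x', 11)]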

Cost of Map-Reduce Join Total communication cost = O(|R| + |S|) 36

Three-Way Join Consider a simple join of three relations, the natural join R(A,B) ⋈ S(B,C) ⋈ T(C,D). One way: cascade of two 2-way joins, each implemented by map-reduce. Fine, unless the 2-way joins produce large intermediate relations. 37

Large Intermediate Relations A = "good" pages; B, C = all pages; D = "spam" pages. R, S, and T each represent links. 3-way join = paths of length 3 from a good page to a spam page; R ⋈ S = paths of length 2 from a good page to any page; S ⋈ T = paths of length 2 from any page to a spam page. 38

Another 3-Way Join Reduce processes use hash values of entire S(B,C) tuples as key. Choose a hash function h that maps B- and C-values to k buckets. There are k² Reduce processes, one for each (B-bucket, C-bucket) pair. 39

Mapping for 3-Way Join We map each tuple S(b,c) to the key-value pair ((h(b), h(c)), (S, b, c)). We map each R(a,b) tuple to ((h(b), y), (R, a, b)) for all y = 1, 2, …, k. We map each T(c,d) tuple to ((x, h(c)), (T, c, d)) for all x = 1, 2, …, k. 40

Job of the Reducers Each reducer gets, for certain B-values b and C-values c: 1. All tuples from R with B = b, 2. All tuples from T with C = c, and 3. The tuple S(b,c) if it exists. Thus it can create every tuple of the form (a, b, c, d) in the join. 41
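A minimal in-memory sketch of this replicated one-shot 3-way join: S(b,c) goes to exactly one of the k² reducers, while each R- and T-tuple is replicated to k of them, as described on the previous slides. The bucket count, example relations, and names are illustrative.

    from itertools import product

    K = 3                                   # buckets per attribute; K*K reducers
    h = lambda v: hash(v) % K

    def map_S(b, c):
        yield ((h(b), h(c)), ("S", b, c))          # exactly one reducer

    def map_R(a, b):
        for y in range(K):                          # replicate across all C-buckets
            yield ((h(b), y), ("R", a, b))

    def map_T(c, d):
        for x in range(K):                          # replicate across all B-buckets
            yield ((x, h(c)), ("T", c, d))

    def reduce_join(_, values):
        """Join the R, S, and T tuples that landed at this (B-bucket, C-bucket)."""
        Rs = [(a, b) for t, a, b in values if t == "R"]
        Ss = [(b, c) for t, b, c in values if t == "S"]
        Ts = [(c, d) for t, c, d in values if t == "T"]
        for (a, b), (b2, c), (c2, d) in product(Rs, Ss, Ts):
            if b == b2 and c == c2:
                yield (a, b, c, d)

    # Simulated shuffle to the K*K reducers
    R, S, T = [(1, "b1")], [("b1", "c1")], [("c1", 9)]
    reducers = {}
    for rel, mapper in ((R, map_R), (S, map_S), (T, map_T)):
        for tup in rel:
            for key, val in mapper(*tup):
                reducers.setdefault(key, []).append(val)
    result = [t for key, vals in reducers.items() for t in reduce_join(key, vals)]
    print(result)                           # [(1, 'b1', 'c1', 9)]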

Comments Map-reduce is not well suited to iterative computation; it does not support random reads; it has performance problems. See: Spark, Storm. 42