CS6030 Cloud Computing: Intro to Cloud Computing (10/20/15). Ajay Gupta, WMU-CS, WiSe Lab

CS6030 Cloud Computing
Ajay Gupta
B239 CEAS, Computer Science Department, Western Michigan University
ajay.gupta@wmich.edu, 276-3104

Acknowledgements
I have liberally borrowed these slides and material from a number of sources, including the Web, MIT, Harvard, UMD, UCSD, UW, Clarkson, ... and Amazon, Google, IBM, Apache, ManjraSoft, CloudBook, ...
Thanks to the original authors, including Dyer, Lin, Dean, Buyya, Ghemawat, Fanelli, Bisciglia, Kimball, and Michels-Slettvet. If I have missed any, it's purely unintentional. My sincere appreciation to those authors and their creative minds.

Today's Topics
- Functional programming
- MapReduce
- Distributed file system

Functional Programming
- MapReduce = functional programming meets distributed processing on steroids
- Not a new idea: it dates back to the 1950s (or even the 1930s)
- What is functional programming?
  - Computation as the application of functions
  - Theoretical foundation provided by the lambda calculus
- How is it different?
  - Traditional notions of data and instructions do not apply
  - Data flows are implicit in the program
  - Different orders of execution are possible
- Exemplified by LISP and ML

Overview of Lisp
- Lisp = "Lost In Silly Parentheses"
- One may focus on a particular dialect: Scheme
- Lists are primitive data types:
    '(1 2 3 4 5)
    '((a 1) (b 2) (c 3))
- Functions are written in prefix notation:
    (+ 1 2) → 3
    (* 3 4) → 12
    (sqrt (+ (* 3 3) (* 4 4))) → 5
    (define x 3) → x
    (* x 5) → 15

Functions
- Functions = lambda expressions bound to variables:
    (define foo
      (lambda (x y)
        (sqrt (+ (* x x) (* y y)))))
- Syntactic sugar for defining functions; the above expression is equivalent to:
    (define (foo x y)
      (sqrt (+ (* x x) (* y y))))
- Once defined, a function can be applied:
    (foo 3 4) → 5

Other Features
- In Scheme, everything is an s-expression
  - No distinction between data and code
  - Easy to write self-modifying code
- Higher-order functions: functions that take other functions as arguments
    (define (bar f x) (f (f x)))
  It doesn't matter what f is; bar just applies it twice:
    (define (baz x) (* x x))
    (bar baz 2) → 16

Recursion is your friend
- Simple factorial example:
    (define (factorial n)
      (if (= n 1)
          1
          (* n (factorial (- n 1)))))
    (factorial 6) → 720
- Even iteration is written with recursive calls:
    (define (factorial-iter n)
      (define (aux n top product)
        (if (= n top)
            (* n product)
            (aux (+ n 1) top (* n product))))
      (aux 1 n 1))
    (factorial-iter 6) → 720

Lisp → MapReduce?
- What does this have to do with MapReduce? After all, Lisp is about processing lists.
- Two important concepts in functional programming:
  - Map: do something to everything in a list
  - Fold: combine the results of a list in some way

Map
- Map is a higher-order function
- How map works:
  - The function is applied to every element in a list
  - The result is a new list
  [diagram: f applied independently to each element of the input list]

Fold
- Fold is also a higher-order function
- How fold works:
  - The accumulator is set to an initial value
  - The function is applied to a list element and the accumulator
  - The result is stored in the accumulator
  - This repeats for every item in the list
  - The result is the final value in the accumulator
  [diagram: f applied left to right, threading the accumulator from the initial value to the final value]

Map/Fold in Action
- Simple map example:
    (map (lambda (x) (* x x)) '(1 2 3 4 5)) → '(1 4 9 16 25)
- Fold examples:
    (fold + 0 '(1 2 3 4 5)) → 15
    (fold * 1 '(1 2 3 4 5)) → 120
- Sum of squares:
    (define (sum-of-squares v)
      (fold + 0 (map (lambda (x) (* x x)) v)))
    (sum-of-squares '(1 2 3 4 5)) → 55
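Aside (not from the slides): the same idea in a minimal Python sketch, where built-in map plays the role of map and functools.reduce plays the role of fold.

    # Rough Python equivalents of the Scheme examples above (illustration only).
    from functools import reduce

    squares = list(map(lambda x: x * x, [1, 2, 3, 4, 5]))          # [1, 4, 9, 16, 25]
    total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5], 0)     # 15
    product = reduce(lambda acc, x: acc * x, [1, 2, 3, 4, 5], 1)   # 120

    def sum_of_squares(v):
        # fold (+) over the mapped list, exactly like the Scheme version
        return reduce(lambda acc, x: acc + x, map(lambda x: x * x, v), 0)

    print(sum_of_squares([1, 2, 3, 4, 5]))   # 55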

Lisp → MapReduce
- Let's assume a long list of records. Imagine if...
  - we can parallelize the map operations, and
  - we have a mechanism for bringing the map results back together in the fold operation.
- That's MapReduce! (and Hadoop)
- Observations:
  - There is no limit to map parallelization, since maps are independent
  - We can reorder the folding if the fold function is commutative and associative

Typical Problem
- Iterate over a large number of records
- Extract something of interest from each (Map)
- Shuffle and sort intermediate results
- Aggregate intermediate results (Reduce)
- Generate final output
- Key idea: provide an abstraction at the point of these two operations

MapReduce
- Programmers specify two functions:
    map (k, v) → <k', v'>*
    reduce (k', v') → <k', v'>*
  All values with the same key k' are reduced together.
- Usually, programmers also specify:
    partition (k', number of partitions) → partition for k'
  Often a simple hash of the key, e.g. hash(k') mod n; this allows reduce operations for different keys to run in parallel.
- Implementations:
  - Google has a proprietary implementation in C++
  - Hadoop is an open-source implementation in Java (led by Yahoo)
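Aside: a minimal Python sketch of what these three user-supplied functions look like for word count. The names map_fn, reduce_fn, and partition_fn are assumptions for illustration, not from the slides; a real job would run them under Hadoop or Google's C++ runtime.

    def map_fn(doc_id, text):
        # map(k, v) -> list of (k', v'): k is a document name, v its contents
        return [(word, 1) for word in text.split()]

    def reduce_fn(word, counts):
        # reduce(k', list of v') -> (k', v''): sum all counts for one word
        return (word, sum(counts))

    def partition_fn(word, num_partitions):
        # partition(k', n) -> which of the n reducers receives key k'
        # (Python's hash() varies across runs; real systems use a stable hash.)
        return hash(word) % num_partitions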

It's just divide and conquer!
[diagram: initial key-value pairs read from the data store are processed by parallel map tasks; a barrier aggregates the intermediate values by key (k1, k2, k3); parallel reduce tasks then produce the final values for each key]

Recall these problems?
- How do we assign work units to workers?
- What if we have more work units than workers?
- What if workers need to share partial results?
- How do we aggregate partial results?
- How do we know all the workers have finished?
- What if workers die?

MapReduce Runtime
- Handles scheduling: assigns workers to map and reduce tasks
- Handles data distribution: moves the process to the data
- Handles synchronization: gathers, sorts, and shuffles intermediate data
- Handles faults: detects worker failures and restarts
- Everything happens on top of a distributed file system (discussed later)
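Aside: to make the "barrier" concrete, here is a toy single-machine driver (my own sketch, reusing the map_fn and reduce_fn assumed above) that runs the map phase, groups intermediate values by key, and then runs the reduce phase. The real runtime does exactly this, but spread across many machines.

    from collections import defaultdict

    def run_mapreduce(inputs, map_fn, reduce_fn):
        # Map phase: apply map_fn to every input record
        intermediate = []
        for key, value in inputs.items():
            intermediate.extend(map_fn(key, value))
        # Barrier (shuffle and sort): group intermediate values by key
        groups = defaultdict(list)
        for k, v in intermediate:
            groups[k].append(v)
        # Reduce phase: one reduce_fn call per distinct key
        return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))

    # Using the word-count map_fn and reduce_fn sketched earlier:
    docs = {"d1": "the quick brown fox", "d2": "the lazy dog"}
    print(run_mapreduce(docs, map_fn, reduce_fn))   # {'brown': 1, ..., 'the': 2}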

"Hello World": Word Count

    Map(String input_key, String input_values):
      // input_key: document name
      // input_values: document contents
      for each word w in input_values:
        EmitIntermediate(w, "1");

    Reduce(String key, Iterator intermediate_values):
      // key: a word, same for input and output
      // intermediate_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
        result += ParseInt(v);
      Emit(AsString(result));

[figure omitted] Source: Dean and Ghemawat (OSDI 2004)

Bandwidth Optimization
- Issue: a large number of key-value pairs
- Solution: use combiner functions
  - Executed on the same machine as the mapper
  - Results in a "mini-reduce" right after the map phase
  - Reduces the number of key-value pairs to save bandwidth
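Aside: for word count, the combiner is essentially the reducer applied locally on each mapper's machine before anything crosses the network. A rough single-machine sketch (illustrative names, not Hadoop's API):

    from collections import Counter

    def map_fn(doc_id, text):
        return [(w, 1) for w in text.split()]

    def combine(pairs):
        # "Mini-reduce" run on the mapper's machine: collapse many (word, 1)
        # pairs into a single (word, local_count) pair before the shuffle.
        counts = Counter()
        for word, n in pairs:
            counts[word] += n
        return list(counts.items())

    pairs = map_fn("d1", "to be or not to be")
    print(len(pairs), len(combine(pairs)))   # 6 intermediate pairs shrink to 4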

Skew Problem
- Issue: the reduce phase is only as fast as the slowest map task
- Solution: redundantly execute map operations and use the results of the first to finish
  - This addresses hardware problems...
  - ...but not issues related to the inherent distribution of the data

How do we get data to the workers?
[diagram: compute nodes pulling data from shared NAS/SAN storage over the network. What's the problem here?]

Distributed File System
- Don't move the data to the workers; move the workers to the data!
  - Store the data on the local disks of the nodes in the cluster
  - Start up the workers on the node that has the data locally
- Why?
  - There is not enough RAM to hold all the data in memory
  - Disk access is slow, but disk throughput is good
- A distributed file system is the answer:
  - GFS (Google File System)
  - HDFS for Hadoop
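Aside: a back-of-the-envelope calculation (assumed numbers, not from the slides) showing why local, parallel disk reads matter for large inputs.

    # Why data locality and parallel local reads matter for large inputs.
    data_mb = 1 * 1024 * 1024                # 1 TB of input, in MB
    disk_throughput_mb_s = 100               # one commodity disk, sequential read

    one_node_seconds = data_mb / disk_throughput_mb_s
    print(one_node_seconds / 3600)           # ~2.9 hours to scan 1 TB on one disk

    nodes = 100                              # data spread over 100 nodes' local disks
    print(one_node_seconds / nodes / 60)     # ~1.7 minutes when reads are local and parallel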

GFS: Assumptions
- Commodity hardware over exotic hardware
  - High component failure rates: inexpensive commodity components fail all the time
- A modest number of HUGE files
- Files are write-once and mostly appended to, perhaps concurrently
- Large streaming reads over random access
- High sustained throughput over low latency
(GFS slides adapted from material by Dean et al.)

GFS: Design Decisions
- Files are stored as chunks of fixed size (64 MB)
- Reliability through replication: each chunk is replicated across 3+ chunkservers
- A single master coordinates access and keeps metadata: simple, centralized management
- No data caching: little benefit due to large data sets and streaming reads
- Simplify the API: push some of the issues onto the client

[figure omitted] Source: Ghemawat et al. (SOSP 2003)
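Aside: a simplified illustration (real GFS uses 64-bit chunk handles and master-driven placement) of what fixed 64 MB chunks and 3-way replication imply for a single file.

    CHUNK_SIZE = 64 * 1024 * 1024            # 64 MB per chunk
    REPLICATION = 3                          # each chunk lives on 3+ chunkservers

    def chunk_index(offset_bytes):
        # Which chunk of the file a given byte offset falls into
        return offset_bytes // CHUNK_SIZE

    file_size = 10 * 1024**3                 # a 10 GB file
    num_chunks = -(-file_size // CHUNK_SIZE) # ceiling division
    print(num_chunks, num_chunks * REPLICATION)   # 160 chunks, 480 replicas overall
    print(chunk_index(200 * 1024 * 1024))    # byte offset 200 MB falls in chunk 3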

GFS: Single Master
- We know this is a single point of failure and a scalability bottleneck
- GFS solutions:
  - Shadow masters
  - Minimize master involvement
    - Never move data through the master; use it only for metadata (and cache metadata at clients)
    - Large chunk size
    - The master delegates authority to primary replicas for data mutations (chunk leases)
- Simple, and good enough!

GFS Master's Responsibilities (1/2)
- Metadata storage
- Namespace management/locking
- Periodic communication with chunkservers: give instructions, collect state, track cluster health
- Chunk creation, re-replication, rebalancing
  - Balance space utilization and access speed
  - Spread replicas across racks to reduce correlated failures
  - Re-replicate data if redundancy falls below a threshold
  - Rebalance data to smooth out storage and request load

GFS Master's Responsibilities (2/2)
- Garbage collection
  - Simpler and more reliable than a traditional file delete
  - The master logs the deletion and renames the file to a hidden name
  - Hidden files are lazily garbage collected
- Stale replica deletion
  - Detect stale replicas using chunk version numbers
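Aside: a toy sketch of stale-replica detection (hypothetical chunk and chunkserver names). The master increments a chunk's version number whenever it grants a new lease, so a replica still reporting an older version must have missed a mutation and can be scheduled for deletion.

    # Illustration only; real GFS tracks this per chunk in the master's metadata.
    master_version = {"chunk-42": 7}                     # hypothetical chunk id
    replica_versions = {"cs1": 7, "cs2": 7, "cs3": 6}    # versions reported by chunkservers

    stale = [cs for cs, v in replica_versions.items()
             if v < master_version["chunk-42"]]
    print(stale)    # ['cs3'] -> flagged as stale, garbage collected later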

GFS: Metadata
- Global metadata is stored on the master:
  - File and chunk namespaces
  - Mapping from files to chunks
  - Locations of each chunk's replicas
- All in memory (64 bytes per chunk): fast and easily accessible
- The master keeps an operation log for persistent logging of critical metadata updates
  - Persistent on local disk
  - Replicated
  - Checkpoints for faster recovery

GFS: Mutations
- Mutation = write or append; must be done at all replicas
- Goal: minimize master involvement
- Lease mechanism:
  - The master picks one replica as primary and gives it a lease for mutations
  - The primary defines a serial order of mutations
  - All replicas follow this order
- Data flow is decoupled from control flow

Parallelization Problems (revisited: how is MapReduce different?)
- How do we assign work units to workers?
- What if we have more work units than workers?
- What if workers need to share partial results?
- How do we aggregate partial results?
- How do we know all the workers have finished?
- What if workers die?
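Aside: a quick sanity check (my arithmetic, using the 64 MB chunk size and roughly 64 bytes of metadata per chunk from the slides) of why all chunk metadata fits comfortably in the master's memory.

    CHUNK_SIZE = 64 * 1024**2                # 64 MB per chunk
    METADATA_BYTES_PER_CHUNK = 64            # ~64 bytes of metadata per chunk

    data_bytes = 1024**5                     # 1 PB of file data
    chunks = data_bytes // CHUNK_SIZE        # 16,777,216 chunks
    metadata_bytes = chunks * METADATA_BYTES_PER_CHUNK
    print(metadata_bytes / 1024**3)          # ~1 GiB of metadata -> fits in RAM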

From Theory to Practice
[diagram: you, on your local machine, working against a Hadoop cluster]
1. scp data to the cluster
2. Move the data into HDFS
3. Develop code locally
4. Submit the MapReduce job (4a. go back to step 3)
5. Move the data out of HDFS
6. scp the data from the cluster

On Amazon: With EC2
[diagram: you, working against your Hadoop cluster running on EC2]
0. Allocate a Hadoop cluster on EC2
1. scp data to the cluster
2. Move the data into HDFS
3. Develop code locally
4. Submit the MapReduce job (4a. go back to step 3)
5. Move the data out of HDFS
6. scp the data from the cluster
7. Clean up!
Uh oh. Where did the data go?

On Amazon: EC2 and S3
[diagram: EC2 (the cloud) hosts your Hadoop cluster; S3 is the persistent store. Copy from S3 to HDFS before the job, and from HDFS back to S3 afterwards.]

Next time
- More on MapReduce
  - Synchronization and coordinating partial results
  - Use of complex keys and values
  - Examples

Questions?