Scalable Big Graph Processing in MapReduce


Lu Qin, Jeffrey Xu Yu, Lijun Chang, Hong Cheng, Chengqi Zhang, Xuemin Lin

Centre for Quantum Computation and Intelligent Systems, University of Technology, Sydney, Australia; The Chinese University of Hong Kong, China; The University of New South Wales, Australia; East China Normal University, China

{lu.qin,chengqi.zhang}@uts.edu.au, {yu,hcheng}@se.cuhk.edu.hk, {ljchang,lxue}@cse.unsw.edu.au

ABSTRACT

MapReduce has become one of the most popular parallel computing paradigms in the cloud, due to its high scalability, reliability, and fault-tolerance achieved for a large variety of applications in big data processing. In the literature, there are the MapReduce Class MRC and the Minimal MapReduce Class MMC, which define the memory consumption, communication cost, CPU cost, and number of MapReduce rounds for an algorithm to execute in MapReduce. However, neither of them is designed for big graph processing in MapReduce, since the constraints in MMC can hardly be achieved simultaneously on graphs, and the conditions in MRC may induce scalability problems when processing big graph data. In this paper, we study scalable big graph processing in MapReduce. We introduce a Scalable Graph processing Class SGC by relaxing some constraints in MMC to make it suitable for scalable graph processing. We define two graph join operators in SGC, namely, NE join and EN join, using which a wide range of graph algorithms can be designed, including PageRank, breadth first search, graph keyword search, Connected Component (CC) computation, and Minimum Spanning Forest (MSF) computation. Remarkably, to the best of our knowledge, for the two fundamental graph problems CC and MSF computation, this is the first work that can achieve O(log(n)) MapReduce rounds with O(n + m) total communication cost in each round and constant memory consumption on each machine, where n and m are the numbers of nodes and edges in the graph respectively. We conducted extensive performance studies using two web-scale graphs, Twitter-2010 and Friendster, with different graph characteristics.
The experimental results demonstrate that our algorithms can achieve high scalability in big graph processing.

Categories and Subject Descriptors: H.2.4 [Information Systems]: Database Management Systems

Keywords: Graph; MapReduce; Cloud Computing; Big Data

SIGMOD'14, June 22-27, 2014, Snowbird, UT, USA.

1. INTRODUCTION

As one of the most popular parallel computing paradigms for big data, MapReduce [0] has been widely used in many companies such as Google, Facebook, Yahoo!, and Amazon to process large amounts of data on the order of terabytes every day. The success of MapReduce is due to its high scalability, reliability, and fault-tolerance achieved for a large variety of applications, and its easy-to-use programming model that allows developers to develop parallel data-driven algorithms in a distributed shared-nothing environment. A MapReduce algorithm executes in rounds. Each round has three phases: map, shuffle, and reduce. The map phase generates a set of key-value pairs using a map function, the shuffle phase transfers the key-value pairs to different machines and ensures that key-value pairs with the same key arrive at the same machine, and the reduce phase processes all key-value pairs with the same key using a reduce function.

Motivation: In the literature, there are studies that define algorithm classes in MapReduce in terms of memory consumption, communication cost, CPU cost, and the number of rounds. Karloff et al.
[] give the first attempt, in which the MapReduce Class (MRC) is proposed. MRC defines the maximal requirements for an algorithm to execute in MapReduce, in the sense that if any condition in MRC is violated, running the algorithm in MapReduce is meaningless. Nevertheless, a better class is highly demanded to guide the development of more stable and scalable MapReduce algorithms. Thus, Tao et al. [] introduce the Minimal MapReduce Class (MMC), in which several aspects can achieve optimality simultaneously. Many important database problems, including sorting and sliding aggregation, can be solved in MMC. However, MMC is still incapable of solving a large range of problems, especially those involved in graph processing, which is an important branch of big data processing. The reasons are twofold. First, a graph usually has some inherent characteristics which make it hard to achieve high parallelism. For example, a graph is usually unstructured and highly irregular, making the locality of the graph very poor [0]. Second, the loosely synchronized shared-nothing computing structure in MapReduce makes it difficult to achieve high workload balancing and low communication cost simultaneously as defined in MMC when processing graphs (see Section 3 for more details). Motivated by this, in this paper, we relax some conditions in MMC and define a new class of MapReduce algorithms that is more suitable for scalable big graph processing.

Contributions: We make the following contributions in this paper.

(1) New class defined for scalable graph processing: We define a new class SGC for scalable graph processing in MapReduce. We aim at achieving three goals: scalability, stability, and robustness. Scalability requires an algorithm to achieve good speed-up w.r.t. the number of machines used. Stability requires an algorithm to terminate in a bounded number of rounds. Robustness requires that an algorithm never fails regardless of how much memory each machine has. SGC relaxes two constraints defined in MMC, namely, the communication cost on each machine and the total number of rounds. For the former, we define a new cost that balances the communication in a random manner, where the randomness is related to the degree distribution of the graph. For the latter, we relax the O(1) rounds defined in MMC to O(log(n)), where n is the number of graph nodes. In addition, we require the memory used in each machine to be only loosely related to the size of the input data, in order to achieve high robustness. Such a condition is even stronger than that defined in MMC. The robustness requirement is highly demanded by a commercial database system, with which a database administrator does not need to worry that the data grows too large to reside entirely in the total memory of the machines.

(2) Two elegant graph operators defined to solve a large range of graph problems: We define two graph join operators, namely, NE join and EN join. NE join propagates information from nodes to their adjacent edges, and EN join aggregates information from adjacent edges to nodes. Both NE join and EN join can be implemented in SGC. Using the two graph join operators, a large range of graph algorithms can be designed in SGC, including PageRank, breadth first search, graph keyword search, Connected Component (CC) computation, and Minimum Spanning Forest (MSF) computation. Especially for CC and MSF computation, it is non-trivial to solve them using graph operators in SGC. To the best of our knowledge, for these two fundamental graph problems, this is the first work that can achieve O(log(n)) MapReduce rounds with O(n + m) total communication cost in each round and constant memory consumption on each machine, where n and m are the numbers of nodes and edges in the graph respectively. We believe our findings on MapReduce can also guide the development of scalable graph processing algorithms in other systems in the cloud.
(3) Unified graph processing system: In all of our algorithms, we enforce the input and output of any graph operation to be either a node table or an edge table, with which a unified graph processing system can be designed. The benefits are twofold. First, the unified input and output make the system self-containable, on top of which more complex graph processing tasks can be designed by chaining several graph queries together. For example, one may want to find all connected components of the subgraph induced by nodes that are related to photography and hiking. This can be done by chaining three graph queries: a graph keyword search query, an induced subgraph query, and a connected component query, all of which are studied in this paper. Second, by chaining multiple queries with the unified input/output, more query optimization techniques can be developed and integrated into the system.

(4) Extensive performance studies: We conducted extensive performance studies using two real web-scale graphs, Twitter-2010 and Friendster, both of which have billions of edges. Twitter-2010 has a smaller diameter with a skewed degree distribution, and Friendster has a larger diameter with a more uniform degree distribution. All of our algorithms can achieve high scalability on the two datasets.

Outline: Section 2 presents the preliminary. Section 3 introduces the scalable graph processing class SGC and the two graph operators NE join and EN join in SGC. Section 4 presents three basic graph algorithms in SGC. Sections 5 and 6 show how to compute CC and MSF in SGC respectively. Section 7 evaluates the algorithms in SGC using extensive experiments. Section 8 reviews the related work and Section 9 concludes the paper.

2. PRELIMINARY

In this section, we introduce the MapReduce framework and review the two algorithm classes in MapReduce in the literature.

2.1 The MapReduce Framework

MapReduce, introduced by Google [0], is a programming model that allows developers to develop highly scalable and fault-tolerant parallel applications to process big data in a distributed shared-nothing environment. A MapReduce algorithm executes in rounds.
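As a concrete reference point, one round (map, shuffle by key, reduce) can be simulated in a few lines. This is a single-process sketch for intuition only; run_round, map_fn, and reduce_fn are illustrative names, not an API from the paper or from any MapReduce system:

```python
from collections import defaultdict

def run_round(pairs, map_fn, reduce_fn):
    """Simulate one MapReduce round: map, shuffle (group by key), reduce."""
    # Map: each input pair may emit any number of intermediate pairs.
    shuffled = defaultdict(list)
    for k, v in pairs:
        for k2, v2 in map_fn(k, v):
            shuffled[k2].append(v2)   # shuffle: same key -> same "machine"
    # Reduce: process all values that share a key.
    out = []
    for k2, vs in shuffled.items():
        out.extend(reduce_fn(k2, vs))
    return out

# Example: one round that counts words.
docs = [(1, "a b a"), (2, "b c")]
counts = run_round(
    docs,
    map_fn=lambda _, text: [(w, 1) for w in text.split()],
    reduce_fn=lambda w, ones: [(w, sum(ones))],
)
```

A real MapReduce job distributes the shuffle across machines; the dictionary above only emulates the "same key arrives at the same machine" guarantee.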
Each round involves three phases: map, shuffle, and reduce. Assuming that the input data is stored in a distributed file system as a set of key-value pairs, the three phases work as follows.

Map: In this phase, each machine reads a part of the key-value pairs {(k_i^m, v_j^m)} from the distributed file system and generates a new set of key-value pairs {(k_i^s, v_j^s)} to be transferred to other machines in the shuffle phase.

Shuffle: The key-value pairs {(k_i^s, v_j^s)} generated in the map phase are shuffled across all machines. At the end of the shuffle phase, all the key-value pairs {(k_i^s, v_1^s), (k_i^s, v_2^s), ...} with the same key k_i^s are guaranteed to arrive at the same machine.

Reduce: Each machine groups the key-value pairs with the same key k_i^s together as (k_i^s, {v_1^s, v_2^s, ...}), from which a new set of key-value pairs {(k_i^r, v_j^r)} is generated and stored in the distributed file system to be processed in the next round.

Two functions need to be implemented in each round: a map function and a reduce function. A map function determines how to generate {(k_i^s, v_j^s)} from {(k_i^m, v_j^m)}, whereas a reduce function determines how to generate {(k_i^r, v_j^r)} from (k_i^s, {v_1^s, v_2^s, ...}).

2.2 Algorithm Classes in MapReduce

In the literature, two algorithm classes have been introduced in MapReduce, in terms of disk usage, memory usage, communication cost, CPU cost, and number of MapReduce rounds. Let S be the set of objects in the problem and t be the number of machines in the system. The two classes are defined as follows.

MapReduce Class MRC: The MapReduce Class MRC is introduced by Karloff et al. []. Fix an ϵ > 0; a MapReduce algorithm in MRC should have the following properties:

Disk: Each machine uses O(|S|^(1-ϵ)) disk space. The total disk space used is O(|S|^(2-2ϵ)).

Memory: Each machine uses O(|S|^(1-ϵ)) memory. The total memory used is O(|S|^(2-2ϵ)).

Communication: In each round, each machine sends/receives O(|S|^(1-ϵ)) data. The total communication cost is O(|S|^(2-2ϵ)).

CPU: In each round, the CPU consumption on each machine is O(poly(|S|)), i.e., polynomial in |S|.
Number of rounds: The number of rounds is O(log^i |S|) for a constant i ≥ 0.

Minimal MapReduce Class MMC: The Minimal MapReduce Class MMC is introduced by Tao et al. [], and aims to achieve outstanding efficiency in multiple aspects simultaneously. A MapReduce algorithm in MMC should have the following properties:

Disk: Each machine uses O(|S|/t) disk space. The total disk space used is O(|S|).

Memory: Each machine uses O(|S|/t) memory. The total memory used is O(|S|).

Communication: In each round, each machine sends/receives O(|S|/t) data. The total communication cost is O(|S|).

CPU: In each round, the CPU consumption on each machine is O(T_seq/t), where T_seq is the time to solve the same problem on a single sequential machine.

Number of rounds: The number of rounds is O(1).

3. SCALABLE GRAPH PROCESSING

MRC defines the basic requirements for an algorithm to execute in MapReduce, whereas MMC requires several aspects to achieve optimality simultaneously in a MapReduce algorithm. In the following, we analyze the problems involved in MRC and MMC in graph processing and propose a new class SGC which is suitable for scalable graph processing in MapReduce.

We first analyze MMC. Consider a graph G(V, E) with n = |V| nodes and m = |E| edges. A common graph operation is to exchange data among all adjacent nodes (nodes that share a common edge) in the graph G. The memory constraint in MMC requires that all edges/nodes are distributed evenly among all machines in the system. Let E_{i,j} be the set of edges (u, v) in G such that u is in machine i and v is in machine j. The communication constraint in MMC can be formalized as follows:

    max_{1≤i≤t} (Σ_{j, j≠i} |E_{i,j}|) ≤ O((n + m)/t)    (1)

This requires minimizing max_i (Σ_{j, j≠i} |E_{i,j}|), which is an NP-hard problem []. Furthermore, even if the optimal solution is computed, it is not guaranteed that min(max_i (Σ_{j, j≠i} |E_{i,j}|)) ≤ O((n + m)/t). Thus, MMC is not suitable to define a graph algorithm in MapReduce.

Next, we discuss MRC. Since MRC defines the basic conditions that a MapReduce algorithm should satisfy, a graph algorithm in MapReduce is no exception. However, as with MMC, a better class is desirable for more stable and scalable graph processing in MapReduce. Given a graph G(V, E) with n nodes and m edges, and assuming that m ≥ n^(1+c), the authors in [] define a class based on MRC for graph processing in MapReduce, in which a MapReduce algorithm has the following properties:

Disk: Each machine uses O(n^(1+c)) disk space. The total disk space used is O(m^(1+c)).

Memory: Each machine uses O(n^(1+c)) memory. The total memory used is O(m^(1+c)).

Communication: In each round, each machine sends/receives O(n^(1+c)) data. The total communication cost is O(m^(1+c)).

CPU: In each round, the CPU consumption on each machine is O(poly(m)), i.e., polynomial in m.

Number of rounds: The number of rounds is O(1).
Such a class has a good property, namely that the algorithm runs in a constant number of rounds. However, it requires each machine to use O(n^(1+c)) memory, which can be large even for a dense graph. When the memory of each machine cannot hold O(n^(1+c)) data, the algorithm fails no matter how many machines are used in the system. Thus, the class is not scalable to handle a graph with large n.

3.1 The Scalable Graph Processing Class SGC

We now explore a better class that is suitable for graph processing in MapReduce. We aim at defining a MapReduce class in which a graph algorithm has the following three properties.

Scalability: The algorithm can always be sped up by adding more machines.

Stability: The algorithm stops in a bounded number of rounds.

Robustness: The algorithm never fails regardless of how much memory each machine has.

It is difficult to distribute the communication cost evenly among all machines for a graph algorithm in MapReduce. The main reason is the skewed degree distribution (e.g., power-law distribution) of a large range of real-life graphs, in which some nodes may have very high degrees. Hence, instead of using O((m+n)/t) as the upper bound of communication cost per machine, we define a weaker bound, denoted Õ(m/t, D(G, t)).
Table 1: Graph Algorithm Classes in MapReduce

                          MRC           MMC           SGC
  Disk/machine            O(n^(1+c))    O((n+m)/t)    O((n+m)/t)
  Disk/total              O(m^(1+c))    O(n+m)        O(n+m)
  Memory/machine          O(n^(1+c))    O((n+m)/t)    O(1)
  Memory/total            O(m^(1+c))    O(n+m)        O(t)
  Communication/machine   O(n^(1+c))    O((n+m)/t)    Õ(m/t, D(G,t))
  Communication/total     O(m^(1+c))    O(n+m)        O(n+m)
  CPU/machine             O(poly(m))    O(T_seq/t)    Õ(m/t, D(G,t))
  CPU/total               O(poly(m))    O(T_seq)      O(n+m)
  Number of rounds        O(1)          O(1)          O(log(n))

Suppose the nodes are uniformly distributed among all machines; denote by V_i the set of nodes stored in machine i for 1 ≤ i ≤ t, and let d_j be the degree of node v_j in the input graph. Õ(m/t, D(G, t)) is defined as:

    Õ(m/t, D(G, t)) = O(max_{1≤i≤t} (Σ_{v_j ∈ V_i} d_j))    (2)

    D(G, t) = (1/t)(1 − 1/t) Σ_{v_j ∈ V} d_j^2    (3)

Lemma: Let x_i (1 ≤ i ≤ t) be the communication cost upper bound for machine i, i.e., x_i = Σ_{v_j ∈ V_i} d_j. The expected value of x_i is E(x_i) = m/t, and the variance of x_i is Var(x_i) = D(G, t).

The proof of the lemma is omitted due to the space limitation. Note that the variance of the degree distribution of G, denoted Var(G), is (Σ_{v_j ∈ V} (d_j − m/n)^2)/n = (n·Σ_{v_j ∈ V} d_j^2 − m^2)/n^2. For fixed t, n, and m values, minimizing D(G, t) is equivalent to minimizing Var(G). In other words, the variance of the communication cost for each machine is minimized if all nodes in the graph have the same degree. We define the scalable graph processing class SGC below.

Scalable Graph Processing Class SGC: A graph MapReduce algorithm in SGC should have the following properties:

Disk: Each machine uses O((m+n)/t) disk space. The total disk space used is O(m + n). This is the minimal requirement, since we need at least O(m + n) disk space to store the data.

Memory: Each machine uses O(1) memory. The total memory used is O(t). This is a very strong constraint, to ensure the robustness of the algorithm. Note that the memory defined here is the memory used in the map and reduce phases. There is also memory used in the shuffle phase, which is usually predefined by the system and is independent of the algorithm.
Communication: In each round, each machine sends/receives Õ(m/t, D(G, t)) data, and the total communication cost is O(m + n), where G is the input graph in the round.

CPU: In each round, the CPU cost on each machine is Õ(m/t, D(G, t)), where G is the input graph in the round. The CPU cost defined here is the cost spent in the map and reduce phases.

Number of rounds: The number of rounds is O(log(n)).

Discussion: For the memory constraint, SGC only requires each machine to use constant memory; that is to say, even if the total memory of the system is smaller than the input data, the algorithm can still be processed successfully. This is an even stronger constraint than that defined in MMC. Nevertheless, we give the algorithm the flexibility to run other query optimization tasks using the free memory, which can be studied orthogonally to our work. Given the constraints on memory, communication, and CPU, it is nearly impossible for a wide range of graph algorithms to be processed in a constant number of rounds in MapReduce. Thus, we relax the O(1) rounds defined in MMC to O(log(n)) rounds, which is reasonable since O(log(n)) is the processing time lower bound for a large number of parallel graph algorithms on parallel random-access machines, and is practical for the MapReduce framework as evidenced by our experiments. The comparison of the three classes for graph processing in MapReduce is shown in Table 1.
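The randomized bound Õ(m/t, D(G, t)) and the lemma can be checked empirically. Below is a small sketch (illustrative names, not from the paper) that places nodes on t machines uniformly at random and measures each machine's degree load x_i:

```python
import random

def machine_loads(degrees, t, seed=0):
    """Place each node on one of t machines uniformly at random and
    return x_i = sum of the degrees of the nodes on machine i."""
    rng = random.Random(seed)
    loads = [0] * t
    for d in degrees:
        loads[rng.randrange(t)] += d
    return loads

# Skewed degree sequence: one hub plus many low-degree nodes.
degrees = [1000] + [2] * 500   # m = sum(degrees) = 2000 stored edges
loads = machine_loads(degrees, t=4)
print(sum(loads), loads)       # total is always m; the split varies
```

With a skewed degree sequence the realized loads spread widely around the mean m/t = 500, which is exactly the spread that the variance D(G, t) captures; a near-regular degree sequence makes the loads concentrate around m/t.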

3.2 Two Graph Operators in SGC

We assume that a graph G(V, E) is stored in a distributed file system as a node table V and an edge table E. Each node in the table has a unique id and some other information such as label and keywords. Each edge in the table has id1 and id2 defining the source and target node ids of the edge, and some other information such as weight and label. We use the node id to represent the node if it is obvious. G can be either directed or undirected. For an undirected graph, each edge is stored as two edges (id1, id2) and (id2, id1). In the following, we introduce the two graph operators in SGC, namely, NE join and EN join, using which a large range of graph problems can be designed.

3.2.1 NE Join

An NE join aims to propagate the information on nodes onto edges, i.e., for each edge (v_i, v_j) ∈ E, an NE join outputs an edge (v_i, v_j, F(v_i)) (or (v_i, v_j, F(v_j))), where F(v_i) (or F(v_j)) is a set of functions operated on v_i (or v_j) in the node table V. Given a node table V_i and an edge table E_j, an NE join of V_i and E_j can be formulated using the following algebra:

    Π_{id1, id2, f1(c1)→p1, f2(c2)→p2, ...} (σ_{cond(c)} ((V_i as V) NE⋈_{V.id=E.id1|2} (E_j as E))) C(cond'(c'))→cn    (4)

or, equivalently, in the following SQL form:

    select id1, id2, f1(c1) as p1, f2(c2) as p2, ...
    from V_i as V NE join E_j as E on V.id = E.id1|2
    where cond(c)
    count cond'(c') as cn

where each of c, c', c1, c2, ... is a subset of fields in the two tables V_i and E_j, f_k is a function operated on the fields c_k, and cond and cond' are two functions that return either true or false, defined on the fields in c and c' respectively. id1|2 can be either id1 or id2. The count part counts the number of trues returned by cond'(c'), and the number is assigned to a counter cn, which is useful in determining a termination condition for an iterative algorithm.

NE join in MapReduce: The NE join operation can be implemented in MapReduce as follows. Let the set of fields used in V be c_v, and the set of fields used in E be c_e. In the map phase, for each node v ∈ V, the values in c_v with key v.id are emitted as a key-value pair (v.id, v.c_v).
For each edge e ∈ E, the values in c_e with key e.id1 are emitted as a key-value pair (e.id1, e.c_e). In the reduce phase, for each node id, the set of key-value pairs {(id, v.c_v), (id, e1.c_e), (id, e2.c_e), ...} can be processed as a data stream without loading the whole set into memory. Assuming that (id, v.c_v) comes first, before all other key-value pairs (id, e_i.c_e) in the stream (this can be implemented as a secondary sort in MapReduce), the algorithm first loads (id, v.c_v) into memory and then processes each (id, e_i.c_e) one by one. For a certain (id, e_i.c_e), the algorithm checks cond(c). If cond(c) returns false, the algorithm skips (id, e_i.c_e) and continues to process the next (id, e_{i+1}.c_e). Otherwise, the algorithm calculates all f_j(c_j) and cond'(c') from (id, v.c_v) and (id, e_i.c_e), outputs the selected fields as a single tuple into the distributed file system, and increases cn if cond'(c') returns true. It is easy to see that NE join belongs to SGC.

Algorithm 1: PageRank(V(id), E(id1, id2), d)
1: V_r ← Π_{id, cnt(id2)→d, 1/|V|→r}(V EN⋈_{id=id1} E);
2: for i = 1 to d do
3:   E_r ← Π_{id1, id2, r/d→p}(V_r NE⋈_{id=id1} E);
4:   V_r ← Π_{id, d, α·(1/|V|)+(1−α)·sum(E_r.p)→r}(V_r EN⋈_{id=id2} E_r);
5: return Π_{id, r}(V_r);

3.2.2 EN Join

An EN join aims to aggregate the information on edges into nodes, i.e., for each node v_i ∈ V, an EN join outputs a node (v_i, G(adj(v_i))), where adj(v_i) = {(v_i, v_j) ∈ E}, and G is a set of decomposable aggregate functions on the edge set adj(v_i), where a decomposable aggregate function is defined as follows:

Definition (Decomposable Aggregate Function): An aggregate function g_k is decomposable if, for any dataset s and any two subsets s1 and s2 of s with s1 ∩ s2 = ∅ and s1 ∪ s2 = s, g_k(s) can be computed using g_k(s1) and g_k(s2).
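Decomposability is what lets a reducer fold a node's adjacent edges one at a time in constant memory instead of materializing adj(v). A tiny sketch (illustrative, not from the paper): sum and min are decomposable, and avg becomes decomposable if carried as a (sum, count) pair:

```python
def stream_aggregate(stream, init, combine):
    """Fold a stream of per-edge values with a decomposable aggregate:
    g({e1..ei}) is computed from g({e1..e(i-1)}) and g({ei})."""
    acc = init
    for x in stream:
        acc = combine(acc, x)   # constant state, never the whole edge set
    return acc

# Adjacent-edge values for one node, consumed as a stream.
total = stream_aggregate([5, 3, 9, 1], 0, lambda a, b: a + b)      # sum
lightest = stream_aggregate([5, 3, 9, 1], float("inf"), min)       # min
```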
Given a node table V_i and an edge table E_j, an EN join of V_i and E_j can be formulated using the following algebra:

    Π_{id, g1(c1)→p1, g2(c2)→p2, ...} (σ_{cond(c)} ((V_i as V) EN⋈_{V.id=E.id1|2} (E_j as E))) C(cond'(c'))→cn    (5)

or, equivalently, in the following SQL form:

    select id, g1(c1) as p1, g2(c2) as p2, ...
    from V_i as V EN join E_j as E on V.id = E.id1|2
    where cond(c)
    group by id
    count cond'(c') as cn

where each of c, c', c1, c2, ... is a subset of fields in the two tables V_i and E_j, and id can be either id1 or id2. The where part and count part are analogous to those defined for NE join. g_k is a decomposable aggregate function operated on the fields in c_k, grouping the results using node ids as denoted in the group by part. Since the group by field is always the node id, we omit the group by part in Eq. (5) for simplicity.

EN join in MapReduce: The EN join operation can be implemented in MapReduce as follows. Let the set of fields used in V be c_v, and the set of fields used in E be c_e. The map phase is similar to that of the NE join. That is, for each node v ∈ V, the values in c_v with key v.id are emitted as a key-value pair (v.id, v.c_v), and for each edge e ∈ E, the values in c_e with key e.id1 are emitted as a key-value pair (e.id1, e.c_e). In the reduce phase, for each node id, the set of key-value pairs {(id, v.c_v), (id, e1.c_e), (id, e2.c_e), ...} can be processed as a data stream without loading the whole set into memory. Assuming that (id, v.c_v) comes first, before all other key-value pairs (id, e_i.c_e) in the stream, the algorithm first loads (id, v.c_v) into memory and then processes each (id, e_i.c_e) one by one. For each function g_k, since g_k is decomposable, g_k({e1, e2, ..., e_i}) can be calculated using g_k({e1, e2, ..., e_{i−1}}) and g_k({e_i}). After processing all (id, e_i.c_e), all the g_k functions are computed. Finally, the algorithm checks cond(c). If cond(c) returns true, it outputs the id as well as all the g_k values as a single tuple into the distributed file system, and increases cn if cond'(c') returns true. It is easy to see that EN join belongs to SGC.
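Under the assumptions above (a secondary sort delivers the node record before its edge records), the EN-join reduce phase can be sketched in a single process as follows; en_join and its parameters are illustrative names, not the paper's implementation:

```python
from itertools import groupby

def en_join(nodes, edges, init, combine):
    """EN-join reduce sketch: per node id, fold the adjacent edge values
    with a decomposable aggregate (init/combine), never materializing
    the full adjacency list."""
    # Map + shuffle with secondary sort: the node record (tag 0) arrives
    # before the edge records (tag 1) that share its key.
    stream = sorted(
        [(v_id, 0, attrs) for v_id, attrs in nodes]
        + [(id1, 1, val) for id1, _id2, val in edges]
    )
    result = {}
    for node_id, records in groupby(stream, key=lambda r: r[0]):
        acc = init
        for _, tag, payload in records:
            if tag == 0:
                node_attrs = payload          # loaded first into memory
            else:
                acc = combine(acc, payload)   # stream each adjacent edge
        result[node_id] = acc
    return result

# Sum of outgoing edge weights per node, folded one edge at a time.
nodes = [(1, "a"), (2, "b")]
edges = [(1, 2, 5.0), (1, 2, 2.0), (2, 1, 1.0)]
print(en_join(nodes, edges, 0.0, lambda a, b: a + b))
```

The aggregate is supplied as an (init, combine) pair, so the fold over each node's edge stream needs only constant state per node, mirroring the constant-memory constraint of SGC.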
BASIC GRAPH ALGORITHMS The combinaion of NE join and EN join can solve a wide range of graph problems in SGC. In his secion, we inroduce some basic graph algorihms, including PageRank, breadh firs search, and graph keyword search, in which he number of rounds is deermined by a user given parameer or a graph facor which is small and can be considered as a consan. We will inroduce more complex algorihms ha need logarihmic rounds in he wors case in he nex secions, including conneced componen and minimum spanning fores compuaion. PageRank. PageRank is a key graph operaion which compues he rank of each node based on he links (direced edges) among hem. Given a direced graph G(V,E), PageRank is compued ieraively. Le he iniial rank of each node be, in ieraion i, he rank of a V 0

Algorithm BFS(V(id), E(id_1,id_2), s)
1: V_d ← Π_{id, (id=s ? 0 : φ)→d}(V);
2: for i = 1 to +∞ do
3:   E_d ← Π_{id_1, id_2, d→d'}(V_d ⋈^{NE}_{id,id_1} E);
4:   V_d ← Π_{id, ((d=φ ∧ min(d')≠φ) ? i : d)→d}(V_d ⋈^{EN}_{id,id_2} E_d) C(d = i) → n_new;
5:   if n_new = 0 then break;
6: return V_d;

Algorithm KWS(V(id,t), E(id_1,id_2), {k_1, …, k_l}, r_max)
1: V_r ← Π_{id, (k_1∈t ? (id,0) : (φ,φ))→(p_1,d_1), …, (k_l∈t ? (id,0) : (φ,φ))→(p_l,d_l)}(V);
2: for i = 1 to r_max do
3:   E_r ← Π_{id_1, id_2, (p_1,d_1)→(pe_1,de_1), …, (p_l,d_l)→(pe_l,de_l)}(V_r ⋈^{NE}_{id,id_1} E);
4:   V_r ← Π_{id, amin(p_1,d_1,pe_1,de_1+1)→(p_1,d_1), …, amin(p_l,d_l,pe_l,de_l+1)→(p_l,d_l)}(V_r ⋈^{EN}_{id,id_2} E_r);
5: return σ_{d_1≠φ ∧ … ∧ d_l≠φ}(V_r);

node v is computed as

r_i(v) = α/|V| + (1−α) · Σ_{u ∈ nbr_in(v)} r_{i−1}(u)/d(u),

where 0 < α < 1 is a parameter, nbr_in(v) is the set of in-neighbors of v in G, and d(u) is the number of out-neighbors of u in G. The PageRank algorithm in SGC works as follows. Given the node table V(id), the edge table E(id_1,id_2), and the number of iterations d, initially, the algorithm computes the out-degree of each node using V ⋈^{EN}_{id,id_1} E, assigns the initial rank 1/|V| to each node, and generates a new table V_r. Then, the algorithm updates the node ranks in d iterations. In each iteration, the ranks of nodes are updated using an NE join followed by an EN join. In the NE join, the partial rank p(v) = r(v)/d(v) of each node v is propagated to all its outgoing edges using V_r ⋈^{NE}_{id,id_1} E, and a new edge table E_r is generated. In the EN join, for each node v, the partial ranks p(u) from all its incoming edges (u,v) are aggregated: the new rank is computed as α/|V| + (1−α) · Σ_{u ∈ nbr_in(v)} p(u) using V_r ⋈^{EN}_{id,id_2} E_r, and V_r is updated with the new ranks.

Breadth First Search. Breadth First Search (BFS) is a fundamental graph operation. Given an undirected graph G(V,E) and a source node s, a BFS computes for every node v ∈ V the shortest distance (i.e., the minimum number of hops) from s to v in G. The BFS algorithm in SGC is shown in Algorithm BFS above. Given a node table V(id), an edge table E(id_1,id_2), and a source node s, the algorithm computes for each node v the shortest distance d(v) from s to v.
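The NE-join/EN-join round of PageRank described above can be illustrated with a small in-memory Python sketch; this is not the MapReduce implementation, and `pagerank_round` and the dictionary representation are assumptions for illustration only.

```python
def pagerank_round(ranks, out_adj, alpha=0.15):
    """One PageRank round, simulated in memory.

    ranks  : {node: current rank r(u)}
    out_adj: {node: list of out-neighbors}
    The NE-join step spreads the partial rank r(u)/d(u) of each node u
    onto its outgoing edges; the EN-join step sums, for each node v, the
    partial ranks arriving on its incoming edges.
    """
    n = len(ranks)
    partial = {v: 0.0 for v in ranks}
    for u, nbrs in out_adj.items():        # NE join: emit r(u)/d(u) per edge
        if nbrs:
            share = ranks[u] / len(nbrs)
            for v in nbrs:
                partial[v] += share        # EN join: aggregate per target id
    return {v: alpha / n + (1 - alpha) * partial[v] for v in ranks}

# On a 3-node directed cycle, the uniform ranks 1/3 are a fixed point.
out_adj = {0: [1], 1: [2], 2: [0]}
ranks = {v: 1 / 3 for v in out_adj}
ranks = pagerank_round(ranks, out_adj)
```

The sketch mirrors the two-join structure of one SGC iteration: propagation along edges, then a decomposable sum grouped by node id.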
Initially, a node table V_d is created with d(s) = 0 and d(v) = φ for v ≠ s (line 1). Next, the algorithm iteratively computes the nodes with d(v) = i from the nodes with d(v) = i−1. Each iteration i is processed using an NE join followed by an EN join. The NE join propagates d(u) onto each edge (u,v) using V_d ⋈^{NE}_{id,id_1} E and produces a new table E_d. The EN join updates all d(v) based on the following rule:

(Distance Update Rule): In the i-th iteration of BFS, a node v is assigned d(v) = i iff in the (i−1)-th iteration, d(v) = φ and there exists a neighbor u of v such that d(u) ≠ φ.

The rule can be easily implemented using V_d ⋈^{EN}_{id,id_2} E_d, which also computes a counter n_new, the number of nodes with d(v) = i. When n_new = 0, the algorithm terminates. It is easy to see that the number of iterations of Algorithm BFS is no larger than the diameter of the graph G. Thus the algorithm belongs to SGC if the diameter of the graph is small.

Graph Keyword Search. We now investigate a more complex algorithm, namely, keyword search in an undirected graph G(V,E).

Algorithm CC(V(id), E(id_1,id_2))
1: V_m ← Π_{id, min(id,id_2)→p}(V ⋈^{EN}_{id,id_1} E);
2: V_c ← Π_{V.id, V.p, cnt(V'.id)→c}((V_m → V) ⋈^{EN}_{id,p} (V_m → V'));
3: V_p ← Π_{id, ((c=0 ∧ id=p) ? min(id_2) : p)→p}(V_c ⋈^{EN}_{id,id_1} E);
4: while true do
5:   V_s ← star(V_p);
6:   E_h ← Π_{id_1, id_2, p→p'}(V_s ⋈^{NE}_{id,id_1} E);
7:   V_h ← Π_{id, p, min(p,p')→p_m}(σ_{s=1}(V_s ⋈^{EN}_{id,id_2} E_h));
8:   V_h ← Π_{V_s.id, (cnt(p_m)=0 ? V_s.p : min(p_m))→p}(V_s ⋈^{EN}_{id,p} V_h);
9:   V_s ← star(V_h);
10:  E_u ← Π_{id_1, id_2, p→p'}(V_s ⋈^{NE}_{id,id_1} E);
11:  V_u ← Π_{id, p, min({p' | p'≠p})→p_m}(σ_{s=1}(V_s ⋈^{EN}_{id,id_2} E_u));
12:  V_u ← Π_{V_s.id, (cnt(p_m)=0 ? V_s.p : min(p_m))→p}(V_s ⋈^{EN}_{id,p} V_u);
13:  V_p ← Π_{V.id, V'.p→p}((V_u → V) ⋈^{NE}_{id,p} (V_u → V')) C(V.p ≠ V'.p) → n_s;
14:  if n_s = 0 then break;
15: return V_p;
16: Procedure star(V_p)
17:   V_g ← Π_{V.id, V.p, V'.p→g, (V.p=V'.p ? 1 : 0)→s}((V_p → V) ⋈^{NE}_{id,p} (V_p → V'));
18:   V_s ← Π_{V.id, V.p, and(V.s,V'.s)→s}((V_g → V) ⋈^{EN}_{id,g} (V_g → V'));
19:   V_s ← Π_{V.id, V.p, (V'.s=0 ? 0 : V.s)→s}((V_s → V) ⋈^{NE}_{id,p} (V_s → V'));
20:   return V_s;

Suppose for each v ∈ V, t(v) is the text information included in v.
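Before moving on, the BFS distance update rule above can be simulated in memory with a short sketch; the function name `bfs_rounds` and the use of `None` for the φ value are illustrative assumptions.

```python
def bfs_rounds(n, edges, s):
    """Round-based BFS following the distance update rule: in round i, a
    node v gets d(v) = i iff d(v) was φ and, at the end of round i-1, some
    neighbor u had d(u) != φ. Terminates when a round assigns nothing
    (the n_new = 0 test)."""
    PHI = None                                   # stands for the φ value
    d = [PHI] * n
    d[s] = 0
    und = list(edges) + [(v, u) for (u, v) in edges]  # undirected edges
    i = 0
    while True:
        i += 1
        # NE join propagates d(u) along each edge; the EN join then applies
        # the rule per target node. `new` is computed from round i-1 values.
        new = {v for (u, v) in und if d[u] is not PHI and d[v] is PHI}
        if not new:                              # n_new = 0: stop
            return d
        for v in new:
            d[v] = i
```

As in the text, the number of rounds is bounded by the diameter of the graph: each round settles exactly one BFS level.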
Given a keyword query with a set of l keywords Q = {k_1, k_2, …, k_l}, a keyword search finds a set of rooted trees in the form of (r, {(p_1, d(r,p_1)), (p_2, d(r,p_2)), …, (p_l, d(r,p_l))}), where r is the root node, p_i is a node that contains keyword k_i in t(p_i), and d(r,p_i) is the shortest distance from r to p_i in G for 1 ≤ i ≤ l. Each answer is uniquely determined by its root node r. r_max is the maximum distance allowed from the root node to a keyword node in an answer, i.e., d(r,p_i) ≤ r_max for 1 ≤ i ≤ l. Graph keyword search can be solved in SGC. The algorithm is shown in Algorithm KWS. Given a node table V(id,t), an edge table E(id_1,id_2), a keyword query {k_1, k_2, …, k_l}, and r_max, the algorithm first initializes a table V_r, wherein for each node v and every k_i, a pair (p_i, d_i) is generated as (id(v), 0) if k_i is contained in v.t, and as (φ, φ) otherwise. Then the algorithm iteratively propagates the keyword information from each node to its neighbor nodes in r_max iterations. In each iteration, the keyword information of each node is first propagated onto its adjacent edges using an NE join, and then the information on the edges is grouped into nodes to update the keyword information of each node using an EN join. Specifically, the NE join generates a new edge table E_r, in which each edge (u,v) is embedded with the keyword information (p_1(u), d_1(u)), …, (p_l(u), d_l(u)) retrieved from node u, using V_r ⋈^{NE}_{id,id_1} E. In the EN join V_r ⋈^{EN}_{id,id_2} E_r, each node updates its nearest node p_i that contains keyword k_i using an amin function, which is defined as:

amin({(p_1,d_1), …, (p_k,d_k)}) = (p_i, d_i)  such that  d_i = min_{1≤j≤k} d_j

amin is decomposable, since for any two sets s_1 and s_2 with s_1 ∩ s_2 = ∅, the following equation holds:

amin(s_1 ∪ s_2) = amin({amin(s_1), amin(s_2)})

After r_max iterations, for each node v in V_r, its nearest node p_i that contains keyword k_i (1 ≤ i ≤ l) with distance d_i = d(v,p_i) ≤ r_max has been computed. The algorithm returns the nodes with d_i ≠ φ for all 1 ≤ i ≤ l as the final set of answers.
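The amin aggregate and the decomposability property it must satisfy can be sketched directly; representing φ by Python's `None` is an assumption of this sketch.

```python
PHI = None  # φ: keyword not reached yet

def amin(pairs):
    """amin over (p_i, d_i) pairs: keep the pair with minimum distance,
    treating (φ, φ) as +infinity. Decomposable, i.e.
    amin(S1 ∪ S2) = amin({amin(S1), amin(S2)})."""
    best = (PHI, PHI)
    for p, d in pairs:
        if d is not PHI and (best[1] is PHI or d < best[1]):
            best = (p, d)
    return best

s1 = [(4, 2), (7, 5)]            # node 4 at distance 2, node 7 at distance 5
s2 = [(9, 1), (PHI, PHI)]        # node 9 at distance 1, one unreached slot
# Decomposability: merging partial results equals the global result.
assert amin(s1 + s2) == amin([amin(s1), amin(s2)]) == (9, 1)
```

It is exactly this decomposability that lets the EN join evaluate amin over the edge stream in constant memory, as in the previous section.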
CONNECTED COMPONENT

Given an undirected graph G(V,E) with n nodes and m edges, a connected component (CC) is a maximal set of nodes that can

Figure: A Sample Graph G(V,E)
Figure: Singleton Elimination: Compute V_p
Figure: Initialize CC: Compute V_m

reach each other through paths in G. Computing all CCs of G is a fundamental graph problem and can be solved efficiently on a sequential machine in O(n+m) time. However, it is non-trivial to solve the problem in MapReduce. Below, we briefly introduce the state-of-the-art algorithms for CC computation, followed by our algorithm in SGC.

State-of-the-art

We present three algorithms for CC computation in MapReduce: HashToMin, HashGToMin, and PRAM-Simulation. HashToMin and HashGToMin are two MapReduce algorithms proposed in [ ], with the similar idea of using the smallest node in each CC as the representative of the CC, assuming that there is a total order among all nodes in G. PRAM-Simulation simulates an algorithm in the Parallel Random Access Machine (PRAM) model in MapReduce, using the simulation method proposed in [ ].

Algorithm HashToMin: Each node v ∈ V maintains a set C_v initialized as C_v = {v} ∪ {u | (u,v) ∈ E}. Let v_min = min{u | u ∈ C_v}; the algorithm updates C_v in iterations until it converges. Each iteration is processed using MapReduce as follows. In the map phase, for each v ∈ V, two types of key-value pairs are emitted: (1) (v_min, C_v), and (2) (u, {v_min}) for all u ∈ C_v. In the reduce phase, for each v ∈ V, a set of key-value pairs is received in the form {(v, C_v^1), …, (v, C_v^k)}, and the new C_v is computed as ∪_{1≤i≤k} C_v^i. The HashToMin algorithm finishes in O(log(n)) rounds, with O(log(n)·(m+n)) total communication cost in each round. The algorithm can be optimized to use O(1) memory on each machine using secondary sort in MapReduce.

Algorithm HashGToMin: Each node v ∈ V maintains a set C_v initialized as C_v = {v}. Let C_v^+ = {u | u ∈ C_v, u > v} and v_min = min{u | u ∈ C_v}; the algorithm updates C_v in iterations until it converges. Each iteration is processed using three MapReduce rounds. In the first two rounds, each round updates C_v as C_v ∪ {u_min | (u,v) ∈ E} in MapReduce. The third round is processed as follows.
In the map phase, for each v ∈ V, two types of key-value pairs are emitted: (1) (v_min, C_v^+), and (2) (u, {v_min}) for all u ∈ C_v^+. In the reduce phase, for each v ∈ V, a set of key-value pairs is received in the form {(v, C_v^1), …, (v, C_v^k)}, and the new C_v is computed as ∪_{1≤i≤k} C_v^i. The HashGToMin algorithm finishes in Õ(log(n)) (i.e., expected O(log(n))) rounds, with O(m+n) total communication cost in each round. However, it needs O(n) memory on a single machine to hold a whole CC in memory. Thus, as indicated in [ ], HashGToMin is not suitable for handling a graph with large n.

Algorithm PRAM-Simulation: The PRAM model allows multiple processors to compute in parallel using a shared memory. A PRAM is called CRCW if concurrent writes are allowed, and CREW if not. In [ ], a theoretical result shows that a CREW PRAM algorithm running in O(t) time can be simulated in MapReduce in O(t) rounds (the result is only proved on a path graph in [ ]).

Figure: Star Detection Step 1: Compute V_g

For the CC computation problem, the best result in CREW PRAM in the literature is presented in [ ], which computes CCs in O(log(n)) time. However, it needs to compute the 2-hop node pairs, which requires O(n^2) communication cost in the worst case in each round. Thus, the simulation algorithm is impractical.

Connected Component in SGC

We introduce our algorithm to compute CCs in SGC. Conceptually, the algorithm shares similar ideas with most deterministic O(log(n)) CRCW PRAM algorithms, such as [ ] and [ ], but it is a non-trivial adaptation since each operation should be carefully designed using the graph joins in SGC. Our algorithm maintains a forest using a parent pointer p(v) for each v ∈ V. Each rooted tree in the forest represents a partial CC. A singleton is a tree with one node, and a star is a tree of height 1. A tree is an isolated tree if there are no edges in E that connect the tree to another tree. The forest is iteratively updated using two operations: hooking and pointer jumping. Hooking merges several trees into a larger tree, and pointer jumping changes the parent of each node to its grandparent in each tree.
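The two forest operations can be sketched in memory on an array of parent pointers; `hook` and `pointer_jump` are hypothetical helper names, and this sketch omits star detection and the conditional/unconditional distinction for brevity.

```python
def pointer_jump(p):
    """One pointer-jumping step: every node's parent becomes its
    grandparent, roughly halving tree height (stars map to themselves)."""
    return [p[p[v]] for v in range(len(p))]

def hook(p, edges):
    """Simplified hooking sketch: a root v (p[v] == v) is re-parented to
    the smallest parent u < v seen across edges connecting v's tree to
    another tree. The star-detection check is omitted for brevity."""
    best = {}
    for x, y in edges:
        for a, b in ((x, y), (y, x)):
            u, v = p[a], p[b]
            if u < v and p[v] == v:        # only roots get new parents
                best[v] = min(best.get(v, u), u)
    return [best.get(v, p[v]) for v in range(len(p))]
```

For example, on the chain forest p = [0, 0, 1, 2], one jump yields [0, 0, 0, 1] and a second yields the star [0, 0, 0, 0], illustrating why O(log(n)) jumps suffice.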
When the algorithm ends, each tree has become an isolated star that represents a CC in the graph. Specifically, the algorithm first initializes a forest to make sure that no singletons exist except for isolated singletons. Then, the algorithm updates the forest in iterations. In each iteration, two hooking operations, namely, a conditional star hooking and an unconditional star hooking, followed by a pointer jumping operation, are performed. The two hooking operations eliminate all non-isolated stars in the forest, and the pointer jumping operation produces new stars to be eliminated in the next iteration. Our algorithm CC is shown in Algorithm CC, which includes five components: Forest Initialization, Star Detection, Conditional Star Hooking, Unconditional Star Hooking, and Pointer Jumping. We explain the algorithm using a sample graph G(V,E) shown in Fig. (A Sample Graph).

Forest Initialization: The forest is initialized in three steps. (1) In the first step, a table V_m is computed, in which each node v picks the smallest node among its neighbors in G, including itself, as the parent p(v) of v, i.e., p(v) = min({v} ∪ {u | (u,v) ∈ E}). Such an operation guarantees that no cycles are created except for self cycles (i.e., p(v) = v). The operation can be done using V ⋈^{EN}_{id,id_1} E. (2) In the second step, we create a table V_c by counting the number of subnodes of each node in the forest, i.e., for each node v, c(v) = |{u | p(u) = v}|. This can be done using a self EN join V_m ⋈^{EN}_{id,p} V_m, where the second V_m is considered as an edge table since it has two fields representing node ids. (3) In the third step, we create V_p by eliminating all non-isolated singletons. A node v is a singleton iff c(v) = 0 and p(v) = v. A non-isolated singleton v can be eliminated by assigning p(v) = min{u | (u,v) ∈ E}, which can be done using

Figure: Star Detection Step 2: Compute V_s
Figure: Conditional Star Hooking: Compute V_h
Figure: Star Detection Step 3: Compute V_s

V_c ⋈^{EN}_{id,id_2} E. Obviously, no cycles (except for self cycles) are created in V_p. For example, for the graph G shown in Fig. (A Sample Graph), the forest V_m is shown in Fig. (Initialize CC), where solid edges represent parent pointers and dashed edges represent graph edges: each node points to the smallest node among its neighbors and itself, and a node with no smaller neighbor points to itself. The two singletons in V_m are eliminated in V_p, as shown in Fig. (Singleton Elimination), by pointing each of them to its smallest neighbor in G.

Star Detection: We need to detect whether each node belongs to a star before hooking. We use s(v) = 1 to denote that v belongs to a star and s(v) = 0 otherwise. Star detection is done in three steps based on the following three filtering rules:

(Rule-1): A node v does not belong to a star if p(v) ≠ p(p(v)).
(Rule-2): A node v does not belong to a star if there exists a node u such that u does not belong to a star and p(p(u)) = v.
(Rule-3): A node v does not belong to a star if p(v) does not belong to a star.

It is guaranteed that after applying the three rules one by one in order, all non-stars are filtered. We now introduce how to apply the three rules using the graph join operators. (Rule-1) For each node v, we find its grandparent g(v) = p(p(v)), and assign 1 or 0 to s(v) depending on whether p(v) = g(v). This can be done using a self join V_p ⋈^{NE}_{id,p} V_p. (Rule-2) A node v belongs to a star after applying Rule-1 iff s(v) = 1 and s(u) = 1 for all u such that g(u) = v. Thus, we use an aggregate function and(s(v), s(u)), a boolean function that returns 1 iff s(v) = 1 and, for every u with g(u) = v, s(u) = 1. This can be done using a self join V_g ⋈^{EN}_{id,g} V_g on the V_g created in Rule-1. (Rule-3) For each node v, we compute s(p(v)) and assign s(v) = 0 if s(p(v)) = 0. This can be done using a self join V_s ⋈^{NE}_{id,p} V_s on the V_s created in Rule-2. For example, Fig.
(Star Detection Step 1) shows the V_g obtained by applying Rule-1 to the V_p shown in Fig. (Singleton Elimination). The grey nodes are those detected as non-star nodes, i.e., s(v) = 0: each such node v has g(v) ≠ p(v). Fig. (Star Detection Step 2) shows the V_s obtained by applying Rule-2 to V_g; two new nodes are filtered as non-star nodes, since for each such node v there exists a node u with g(u) = v and s(u) = 0. Fig. (Star Detection Step 3) shows the result of applying Rule-3 to V_s; three more nodes are filtered, since each of them has a parent with s(p(v)) = 0. In the final V_s, all non-star nodes are filtered, and the remaining stars are detected.

Conditional Star Hooking: In a conditional star hooking, for any node v which is the root of a star (i.e., p(v) = v and s(v) = 1), the parent of v is updated to min({p(v)} ∪ {u | ∃(x,y) ∈ E s.t. p(x) = u and p(y) = v}). In other words, v is hooked to a new parent u if u is not larger than v and the tree that u lies in is connected to the star that v lies in through an edge (x,y) with p(x) = u and p(y) = v. The operation ensures that p(v) is not larger than v, in order to make sure that no cycles (except for self cycles) are created. After the hooking, it is guaranteed that there are no edges that connect two stars. Conditional star hooking is done in three steps. (1) Create a new edge table E_h by embedding p(x) into each edge (x,y) ∈ E using V_s ⋈^{NE}_{id,id_1} E, where V_s is the forest with all stars detected. (2) Create a table V_h, in which for each node y such that y is in a star, p_m(y) = min({p(y)} ∪ {p(x) | (x,y) ∈ E}) is computed. This can be done using σ_{s=1}(V_s ⋈^{EN}_{id,id_2} E_h). (3) Update the parent of each node v to min{p_m(y) | p(y) = v} if such a p_m exists, using V_s ⋈^{EN}_{id,p} V_h. For example, for the V_s shown in Fig. (Star Detection Step 3) with all stars detected, there exists an edge (x,y) with u = p(x) and v = p(y) such that v is the root of a star and u < v; v is therefore hooked to the new parent u, as shown in Fig. (Conditional Star Hooking). Note that a star root cannot be hooked to a parent larger than itself, since such a hooking could create a cycle.

Figure: Unconditional Star Hooking: Compute V_u
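The three filtering rules can be applied in order on an in-memory parent array as a minimal sketch; `detect_stars` is a hypothetical name, and the per-rule self-joins of the SGC version are replaced by direct array scans.

```python
def detect_stars(p):
    """Star detection sketch: p[v] is the parent of v; returns s with
    s[v] = 1 iff v is judged to lie in a star after Rules 1-3."""
    n = len(p)
    g = [p[p[v]] for v in range(n)]                      # grandparents
    s1 = [1 if p[v] == g[v] else 0 for v in range(n)]    # Rule-1
    s2 = s1[:]
    for u in range(n):                                    # Rule-2
        if s1[u] == 0:
            s2[g[u]] = 0                # some non-star u has g(u) = v
    # Rule-3: a node is not in a star if its parent is not.
    return [s2[v] if s2[p[v]] == 1 else 0 for v in range(n)]
```

For example, the star p = [0, 0, 0] yields all ones, while appending a grandchild (height-2 tree) makes every node of that tree a non-star, exactly as the rules prescribe.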
Unconditional Star Hooking: Unconditional star hooking is similar to conditional star hooking, but drops the condition that a node v must be hooked to a parent u with u ≤ v. It is done using three similar steps, with the only difference in the second step, which calculates p_m(y) as min{p(x) | (x,y) ∈ E and p(x) ≠ p(y)} instead of min({p(y)} ∪ {p(x) | (x,y) ∈ E}). We add the condition p(x) ≠ p(y) to avoid hooking a star to itself, in order to make sure that all non-isolated stars are eliminated. Unconditional star hooking does not create cycles (except for self cycles), due to the fact that after conditional star hooking, there is no edge that connects two stars. For example, for the forest V_h shown in Fig. (Conditional Star Hooking), only one star remains; there exists an edge (x,y) with u = p(x) ≠ v = p(y), so the star root is hooked to the new parent u, as shown in Fig. (Unconditional Star Hooking), after which no stars remain.

Pointer Jumping: Pointer jumping changes the parent of each node to its grandparent in the forest V_u generated in unconditional star hooking, by assigning p(v) = p(p(v)) for each node v. This can be done using a self join V_u ⋈^{NE}_{id,p} V_u. In pointer jumping, we also create a counter n_s which counts the number of nodes with p(v) ≠ p(p(v)). When n_s = 0, all stars in V_u are isolated stars and the algorithm terminates, with each star representing a CC. For example, for the forest V_u computed in unconditional star hooking, the new forest V_p after pointer jumping is shown in Fig. (Pointer Jumping), with two new stars generated. The following theorem shows the efficiency of Algorithm CC. Due to lack of space, the proof is omitted.

Theorem: Algorithm CC stops in O(log(n)) iterations.

The comparison of the algorithms HashToMin, HashGToMin, and our algorithm CC is shown in the table below in terms of memory consumption per machine, total communication cost per round, and the number of rounds; our algorithm is the best in all factors.

Figure: Pointer Jumping: Compute V_p

Table: CC Computation Algorithms in MapReduce
                      HashToMin          HashGToMin    CC
Memory/machine        O(1)               O(n)          O(1)
Communication/round   O(log(n)·(n+m))    O(n+m)        O(n+m)
Number of rounds      O(log(n))          Õ(log(n))     O(log(n))

MINIMUM SPANNING FOREST

Given a weighted undirected graph G(V,E) of n nodes and m edges, with each edge (u,v) ∈ E assigned a weight w((u,v)), a Minimum Spanning Forest (MSF) is a spanning forest of G with the minimum total edge weight. We also use (u,v,w((u,v))) to denote an edge. Although an MSF can be efficiently computed on a sequential machine in O(m + n·log(n)) time, it is non-trivial to solve the problem in MapReduce.

State-of-the-art

We introduce two algorithms in MRC, namely, OneRoundMSF and MultiRoundMSF. OneRoundMSF is proposed in [ ] and MultiRoundMSF is proposed in [ ].

Algorithm OneRoundMSF: Fix a number k; the algorithm partitions V into k equally sized subsets randomly, i.e., V = V_1 ∪ V_2 ∪ … ∪ V_k with V_i ∩ V_j = ∅ for i ≠ j. Then k(k−1)/2 graphs G_{i,j} for 1 ≤ i < j ≤ k are created, where each G_{i,j} is the subgraph of G induced by the nodes V_i ∪ V_j. Next, the MSF M_{i,j} of each G_{i,j} is computed in parallel on k(k−1)/2 machines. Finally, all M_{i,j} are merged on a single machine as a new graph H, and the MSF of H is computed as the MSF of G. The algorithm can be processed using one round of MapReduce, and it requires H, with size O(n^{1+c/2}), to fit in the memory of a single machine, assuming that m ≤ n^{1+c}.

Algorithm MultiRoundMSF: OneRoundMSF does not work efficiently, since every node is duplicated k times. MultiRoundMSF, proposed in [ ], improves OneRoundMSF using multiple rounds of MapReduce. In each round, the edges E are partitioned into l equally sized subsets randomly, i.e., E = E_1 ∪ E_2 ∪ … ∪ E_l with E_i ∩ E_j = ∅ for i ≠ j. The MSF T_i of each E_i is computed in parallel on l machines, and the new E is assigned T_1 ∪ T_2 ∪ … ∪ T_l. The algorithm stops when |E| ≤ n^{1+ϵ} for a constant ϵ, and the MSF of E is then computed on a single machine as the MSF of G. The algorithm requires a single machine to have O(n^{1+ϵ}) memory.
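The edge-partitioning idea behind MultiRoundMSF can be sketched on a single machine; `kruskal_msf` and `multi_round_msf` are illustrative names, and a fixed number of rounds replaces the |E| ≤ n^{1+ϵ} stopping test for simplicity. Discarding each subset's non-MSF edges is safe by the cycle property: an edge dropped within a subset is the heaviest edge on some cycle of G.

```python
import random

def kruskal_msf(n, edges):
    """Minimum spanning forest of (u, v, w) edges: Kruskal + union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    msf = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            msf.append((u, v, w))
    return msf

def multi_round_msf(n, edges, l=2, rounds=2):
    """MultiRoundMSF sketch: each round partitions E into l random subsets
    and keeps only each subset's local MSF edges; the surviving edges are
    small enough for the final single-machine MSF computation."""
    for _ in range(rounds):
        random.shuffle(edges)
        parts = [edges[i::l] for i in range(l)]
        edges = [e for part in parts for e in kruskal_msf(n, part)]
    return kruskal_msf(n, edges)

# K4 with distinct weights: the MSF is unique, so filtering never loses it.
es = [(0, 1, 1), (0, 2, 2), (0, 3, 3), (1, 2, 4), (1, 3, 5), (2, 3, 6)]
msf = multi_round_msf(4, list(es))
```

With distinct weights the result is independent of the random partitioning, which is why the multi-round filtering converges to the true MSF.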
Minimum Spanning Forest in SGC

Suppose there is a total order among all edges, defined as follows. For any two edges e_1 = (u_1,v_1,w_1) and e_2 = (u_2,v_2,w_2), e_1 < e_2 iff one of the following conditions holds: (1) w_1 < w_2; (2) w_1 = w_2 and min(u_1,v_1) < min(u_2,v_2); or (3) w_1 = w_2, min(u_1,v_1) = min(u_2,v_2), and max(u_1,v_1) < max(u_2,v_2). Our algorithm is based on Sollin's algorithm [ ] for MSF computation, in which the following lemma plays a key role.

Lemma: For any V_s ⊆ V, the smallest edge in {(u,v) | u ∈ V_s, v ∉ V_s} is in the MSF.

Our algorithm MSF shares similar ideas with Algorithm CC for CC computation. We maintain a forest using parent pointers. Trees in the forest are merged to form larger trees in iterations, and the algorithm terminates when all trees in the forest become isolated stars. In each iteration, the forest is updated using two operations, namely, hooking and pointer jumping. Hooking eliminates all stars by merging them into other trees, and pointer jumping decreases the depth of the trees to generate new stars.

Algorithm MSF(V(id), E(id_1,id_2,w))
1: V_p ← Π_{id, min((id_1,id_2,w))→e_m, e_m.id_1→p}(V ⋈^{EN}_{id,id_1} E);
2: E' ← Π_{e_m}(σ_{e_m≠φ}(V_p));
3: while true do
4:   V_b ← Π_{V.id, ((V.id=V'.p ∧ V.id<V.p) ? V.id : V.p)→p}((V_p → V) ⋈^{NE}_{id,p} (V_p → V'));
5:   V_c ← Π_{V.id, V'.p→p}((V_b → V) ⋈^{NE}_{id,p} (V_b → V')) C(V.p ≠ V'.p) → n_s;
6:   if n_s = 0 then break;
7:   V_s ← star(V_c);
8:   E_m ← Π_{(id_1,id_2,w)→e, p→p'}(V_s ⋈^{NE}_{id,id_1} E);
9:   V_m ← Π_{id, p, amin({(e,p') | p'≠p})→(e_m,p_m)}(σ_{s=1}(V_s ⋈^{EN}_{id,id_2} E_m));
10:  V_p ← Π_{V_s.id, (cnt(p_m)=0 ? (φ,V_s.p) : amin(e_m,p_m))→(e_m,p)}(V_s ⋈^{EN}_{id,p} V_m);
11:  E' ← E' ∪ Π_{e_m}(σ_{e_m≠φ}(V_p));
12: return E';

Figure 10: A Sample Graph and Forest Initialization — (a) Graph G(V,E); (b) Forest Init: Compute V_p

The algorithm is different from Algorithm CC mainly in three aspects:

Hooking Strategy: Different from CC computation, in MSF a star cannot be arbitrarily hooked to a tree as long as there is an edge connecting them. Instead, a star can only be hooked to a tree using an edge that is minimum among all edges leaving the star, as indicated by the lemma above.

Cycle Breaking: The above hooking strategy may produce cycles among multiple nodes. We need a strategy to break all such cycles without breaking any tree apart.
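The edge total order above can be expressed as a comparison key in a short Python sketch; `edge_key` is an illustrative name.

```python
def edge_key(e):
    """Total order over edges (u, v, w): compare first by weight, then by
    the smaller endpoint, then by the larger endpoint. This makes the
    'smallest edge' in the lemma unique even when weights tie."""
    u, v, w = e
    return (w, min(u, v), max(u, v))

edges = [(2, 5, 1.0), (3, 4, 1.0), (1, 7, 0.5)]
smallest = min(edges, key=edge_key)
```

Because tuples compare lexicographically in Python, the three tie-breaking conditions of the total order fall out of a single key function.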
MSF Maintenance: In addition to maintaining the forest defined by the parent pointers, we also need to maintain the MSF itself, which is another forest, different from the one defined by the parent pointers.

The algorithm MSF is shown in Algorithm MSF. We introduce MSF in terms of Forest Initialization, Cycle Breaking, Pointer Jumping, and Edge Hooking, and explain the algorithm using the sample graph G(V,E) shown in Fig. 10(a).

Forest Initialization: Suppose we use p(v) to denote the parent pointer of each node v ∈ V, and use the edge table E' to maintain the edges of the MSF. In the initialization step, for each node v ∈ V, the algorithm finds its minimum adjacent edge (u,v) ∈ E, hooks v to u by assigning p(v) = u, and adds (u,v) to the MSF E' by the lemma above. The hooking can be done using V ⋈^{EN}_{id,id_1} E. Let V_p be the forest after the hooking; it is guaranteed that no singletons exist in V_p except for isolated singletons. It is possible that in V_p, cycles of multiple nodes are formed by the parent pointers; however, the following lemma shows a useful property of V_p, using which a cycle breaking method can be applied efficiently.

Lemma: Each cycle in V_p has length no larger than 2.

For example, for the graph shown in Fig. 10(a), the forest after initialization is shown in Fig. 10(b). Each node is hooked through its smallest adjacent edge in G. Note that two nodes can be hooked to each other by

Figure: Cycle Breaking: Compute V_b
Figure: Pointer Jumping: Compute V_c

the same edge; thus a cycle of size 2 is formed by the two nodes. All the edges in V_p are added to the MSF E'.

Cycle Breaking: Cycle breaking breaks the cycles to make sure that there are no cycles except for self cycles in the forest. According to the lemma above, a cycle of length 2 can be easily detected: it consists of a node v with p(v) ≠ v and p(p(v)) = v. We can eliminate such cycles using the following rule:

(Cycle Breaking Rule): For a node v with p(v) ≠ v and p(p(v)) = v, if p(v) > v, then assign v to p(v).

By applying the rule, we create a table V_b with no cycles (except for self cycles), using a self join V_p ⋈^{NE}_{id,p} V_p. For example, for the forest V_p in Fig. 10(b), the node on a 2-cycle whose parent is larger than itself has its parent updated to itself. The new forest V_b, with no cycles of length larger than 1, is shown in Fig. (Cycle Breaking).

Pointer Jumping: Pointer jumping is analogous to that in Algorithm CC: it creates a table V_c by changing the parent of each node v to its grandparent, assigning p(v) = p(p(v)) using a self join V_b ⋈^{NE}_{id,p} V_b. Again, in pointer jumping, we create a counter n_s to count the number of non-star nodes. When n_s = 0, the algorithm terminates and outputs E' as the MSF of G. For example, after pointer jumping, the forest V_b in Fig. (Cycle Breaking) is changed to the forest V_c in Fig. (Pointer Jumping): each node's parent changes to its grandparent, and two new stars are created.

Edge Hooking: Edge hooking aims to eliminate all stars (except for isolated stars) in the forest V_c. Suppose we create a table V_s with all stars detected, using the same procedure star as in Algorithm CC. In edge hooking, for any node v which is the root of a star (i.e., p(v) = v and s(v) = 1), let (x,y,w) = min{(x',y',w') | p(y') = v, p(x') ≠ p(y')}; then v is assigned the new parent u = p(x) after hooking. Edge hooking is done in three steps. (1) Create a new edge table E_m by embedding p(x) into each edge (x,y,w) ∈ E using V_s ⋈^{NE}_{id,id_1} E.
(2) Create a table V_m, in which for each node y such that y is in a star, p_m(y) = p(x) with e_m(y) = (x,y,w) = min{(x',y',w') | p(y') = v, p(x') ≠ p(y')} is computed. This can be done with an aggregate function amin over the pairs ((x,y,w), p(x)) with p(x) ≠ p(y), using the EN join σ_{s=1}(V_s ⋈^{EN}_{id,id_2} E_m). (3) Create a table V_p, in which the parent of each node v is updated to p_m(y) with (x,y,w) = min{e_m(y) | p(y) = v} if such a p_m exists, using V_s ⋈^{EN}_{id,p} V_m. The corresponding edge e_m = min{e_m(y) | p(y) = v}, if it exists, is added to the MSF table E' by the lemma above. It is guaranteed that the cycle-length lemma still holds on the V_p created in edge hooking. For example, for the V_c in Fig. (Pointer Jumping), each star root v has an edge (x,y,w) with u = p(x) and v = p(y) that is minimum among all edges leaving the star; each star root is hooked to the corresponding new parent, as shown in Fig. (Edge Hooking), and the corresponding edges are added to the MSF E'.

Figure: Edge Hooking: Compute V_p

Table: MSF Computation Algorithms in MapReduce
                      OneRoundMSF        MultiRoundMSF     MSF
Memory/machine        O(n^{1+c/2})       O(n^{1+ϵ})        O(1)
Communication/round   O(m + n^{1+c/2})   O(n + m)          O(n + m)
Number of rounds      O(1)               O(log_{n^ϵ}(m))   O(log(n))

The following theorem shows the efficiency of Algorithm MSF. Due to lack of space, the proof is omitted.

Theorem: Algorithm MSF stops in O(log(n)) iterations.

The comparison of the algorithms OneRoundMSF, MultiRoundMSF, and our algorithm MSF is shown in the table above in terms of memory consumption per machine, total communication cost per round, and the number of rounds. As we will show later in our experiments, the high memory requirement of OneRoundMSF and MultiRoundMSF becomes the bottleneck preventing these algorithms from achieving high scalability when handling graphs with large n.

PERFORMANCE STUDIES

In this section, we show our experimental results. We deploy a cluster of computing nodes, including one master node and a set of slave nodes, each of which has four Intel Xeon CPUs and runs Ubuntu Linux. We implement all algorithms using Hadoop with Java.
We allow each node to run three mappers and three reducers concurrently, each with a fixed JVM heap size. The HDFS block size, the HDFS data replication factor, and the I/O buffer size are all set to fixed values.

Datasets: We use two web-scale graphs, Twitter-2010 and Friendster, with different graph characteristics for testing. Both graphs contain tens of millions of nodes and more than a billion edges, with different average degrees, maximum degrees, and diameters.

Algorithms: Besides the five algorithms PageRank, BFS, KWS, CC, and MSF, we also implement the algorithms for PageRank, BFS, and graph keyword search using the join operations supported by Pig (http://pig.apache.org/) on Hadoop, denoted PageRank-Pig, BFS-Pig, and KWS-Pig respectively. Since the algorithms for PageRank, BFS, and graph keyword search are rather simple, i.e., for each algorithm only two MapReduce jobs are needed in each iteration for both Pig and our implementation, the main difference between Pig and our implementation is how the join operation is implemented. In Pig, the join operation is implemented in a load-and-join manner where each key-value pair is accessed more than once in the reducer, whereas in our implementation, the join operation is implemented as a streaming

http://law.di.unimi.it/webdata/twitter-2010/
http://snap.stanford.edu/data/com-friendster.html