PREGEL: A System for Large-Scale Graph Processing


1 PREGEL: A System for Large-Scale Graph Processing

2 The Problem Large graphs are often part of the computations required in modern systems (social networks, Web graphs, etc.). There are many graph computing problems, such as shortest paths, clustering, PageRank, minimum cut, and connected components, but there exists no scalable general-purpose system for implementing them. Pregel 2

3 Characteristics of the algorithms They often exhibit poor locality of memory access. Very little computation is required per vertex. The degree of parallelism changes over the course of execution. Refer to [1, 2]. Pregel 3

4 Possible solutions Crafting a custom distributed framework for every new algorithm. Existing distributed computing platforms like MapReduce: these are sometimes used to mine large graphs [3, 4], but often give suboptimal performance and have usability issues. Single-computer graph algorithm libraries (BGL, LEDA, NetworkX, JDSL, Stanford GraphBase, FGL): limiting the scale of the graph is necessary. Existing parallel graph systems that do not handle fault tolerance and other issues: the Parallel BGL [5] and CGMgraph [6]. Pregel 4

5 Pregel To overcome these challenges, Google came up with Pregel. It provides scalability, fault tolerance, and the flexibility to express arbitrary algorithms. The high-level organization of Pregel programs is inspired by Valiant's Bulk Synchronous Parallel model [7]. Pregel 5

6 Message passing model A pure message passing model has been used, omitting remote reads and ways to emulate shared memory, because: 1. the message passing model was found sufficient for all graph algorithms; 2. it performs better than reading remote values, since latency can be amortized by delivering large batches of messages asynchronously. Pregel 6

7 Message passing model Pregel 7

8 Example Find the largest value of a vertex in a strongly connected graph Pregel 8

9 Finding the largest value in a graph Blue Arrows are messages Blue vertices have voted to halt Pregel 9

10 Basic Organization Computations consist of a sequence of iterations called supersteps. During a superstep, the framework invokes a user-defined function for each vertex which specifies the behavior at a single vertex V and a single superstep S. The function can: read messages sent to V in superstep S-1; send messages to other vertices that will be received in superstep S+1; modify the state of V and of its outgoing edges; make topology changes (introduce/delete/modify edges and vertices). Pregel 10

11 Basic Organization - Superstep Pregel 11

12 Model Of Computation: Entities VERTEX Identified by a unique identifier. Has a modifiable, user defined value. EDGE Source vertex and Target vertex identifiers. Has a modifiable, user defined value. Pregel 12

13 Model Of Computation: Progress In superstep 0, all vertices are active. Only active vertices participate in a superstep. A vertex goes inactive by voting to halt, and is reactivated when it receives a message from another vertex. The algorithm terminates when all vertices have voted to halt and there are no messages in transit. Pregel 13

14 Model Of Computation: Vertex State machine for a vertex Pregel 14

15 Comparison with MapReduce Graph algorithms can be implemented as a series of MapReduce invocations, but this requires passing the entire state of the graph from one stage to the next, which is not the case with Pregel. The Pregel framework also reduces programming complexity by structuring computation into supersteps. Pregel 15

16 The C++ API Creating a Pregel program typically involves subclassing the predefined Vertex class. The user overrides the virtual Compute() method, which is executed for every active vertex in each superstep. Compute() can read the vertex's associated value with GetValue() and modify it with MutableValue(). Values of edges can be inspected and modified using the out-edge iterator. Pregel 16

17 The C++ API Message Passing Each message consists of a value and the name of the destination vertex. The type of the value is specified in the template parameter of the Vertex class. Any number of messages can be sent in a superstep. The framework guarantees delivery and non-duplication, but not in-order delivery. A message can be sent to any vertex whose identifier is known. Pregel 17

18 The C++ API Pregel Code Pregel code for finding the max value:

class MaxFindVertex : public Vertex<double, void, double> {
 public:
  virtual void Compute(MessageIterator* msgs) {
    double currMax = GetValue();
    SendMessageToAllNeighbors(currMax);
    // Take the maximum over the messages received from the previous superstep.
    for (; !msgs->Done(); msgs->Next()) {
      if (msgs->Value() > currMax)
        currMax = msgs->Value();
    }
    if (currMax > GetValue())
      *MutableValue() = currMax;  // improved: stay active and propagate next superstep
    else
      VoteToHalt();               // unchanged: go inactive until reactivated by a message
  }
};

Pregel 18

19 The C++ API Combiners Sending a message to a vertex on a different machine incurs some overhead. If the algorithm does not need each message individually but only a function of them (for example, their sum), combiners can be used. This is done by overriding the Combine() method. Combiners can be used only for associative and commutative operations. Pregel 19

20 The C++ API Combiners Example: say we want to count the number of incoming links to all the pages in a set of interconnected pages. In the first iteration, for each link from a vertex (page), we send a message to the destination page. Here, a count (sum) over the incoming messages can be used as a combiner to optimize performance. In the MaxValue example, a Max combiner would reduce the communication load, as sketched below. Pregel 20
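A minimal sketch of such a Max combiner. The slides only say that combiners are written by overriding Combine(); the base class and the signature used here are assumptions made for illustration, not the documented Pregel interface.

// Sketch only: Combiner<double> and this Combine() signature are assumed.
class MaxCombiner : public Combiner<double> {
 public:
  virtual void Combine(MessageIterator* msgs, double* combined) {
    double max_val = msgs->Value();            // at least one pending message is assumed
    for (msgs->Next(); !msgs->Done(); msgs->Next()) {
      if (msgs->Value() > max_val)
        max_val = msgs->Value();
    }
    *combined = max_val;  // a single message carrying the maximum is delivered instead of all of them
  }
};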

21 The C++ API Combiners Pregel 21

22 The C++ API Aggregators Aggregators are used for global communication, monitoring, and data. Each vertex can provide a value to an aggregator in superstep S; the aggregated value is available to all vertices in superstep S+1. Aggregators can be used for statistics and for global communication. They can be implemented by subclassing the Aggregator class; commutativity and associativity are required. Pregel 22

23 The C++ API Aggregators Example: a Sum aggregator applied to the out-edge count of each vertex generates the total number of edges in the graph and communicates it to all the vertices. More complex reduction operators can even generate histograms. In the MaxValue example, a Max aggregator would let the entire program finish in a single superstep; see the sketch below. Pregel 23
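A sketch of what such a Max aggregator could look like. The slides only state that aggregators are written by subclassing the Aggregator class; the method names below (Reset, Aggregate, GetValue) are assumptions chosen for illustration.

#include <limits>

// Sketch only: the Aggregator<double> base class and these method names are assumed.
class MaxAggregator : public Aggregator<double> {
 public:
  virtual void Reset() { current_ = -std::numeric_limits<double>::infinity(); }
  // Called for each value a vertex contributes in superstep S.
  virtual void Aggregate(double value) {
    if (value > current_) current_ = value;
  }
  // The aggregated value becomes visible to every vertex in superstep S+1.
  virtual double GetValue() const { return current_; }
 private:
  double current_;
};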

24 The C++ API Topology Mutations The Compute() function can also be used to modify the structure of the graph (example: hierarchical clustering). Mutations take effect in the superstep after the requests were issued. A fixed ordering of mutations (deletions before additions, deletion of edges before vertices, and addition of vertices before edges) resolves most conflicts; the rest are handled by user-defined handlers. Pregel 24

25 Implementation Pregel is designed for the Google cluster architecture, whose scheduler optimizes resource allocation and may kill instances or move them to different machines. Persistent data is stored as files on a distributed storage system such as GFS [8] or BigTable. Pregel 25

26 Basic Architecture The Pregel library divides a graph into partitions based on the vertex ID, each consisting of a set of vertices and all of those vertices' outgoing edges. The default partition function is hash(id) mod N, where N is the number of partitions. The next few slides describe the stages of execution of a Pregel program. Pregel 26
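The default assignment is simple enough to write down directly; std::hash below is only a stand-in, since the slides do not say which hash function Pregel actually uses.

#include <cstdint>
#include <functional>

// Default partitioning described above: a vertex is assigned purely by hashing its ID,
// so no explicit partition map has to be stored or shipped around.
uint64_t PartitionFor(uint64_t vertex_id, uint64_t num_partitions) {
  return std::hash<uint64_t>{}(vertex_id) % num_partitions;  // hash(id) mod N
}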

27 Pregel Execution 1. Many copies of the user program begin executing on a cluster of machines. One of these copies acts as the master. The master is not assigned any portion of the graph, but is responsible for coordinating worker activity. Pregel 27

28 Pregel Execution 2. The master determines how many partitions the graph will have and assigns one or more partitions to each worker machine. Each worker is responsible for maintaining the state of its section of the graph, executing the user s Compute() method on its vertices, and managing messages to and from other workers. Pregel 28

29 Pregel Execution Pregel 29

30 Pregel Execution 3. The master assigns a portion of the user's input to each worker. The input is treated as a set of records, each of which contains an arbitrary number of vertices and edges. After the input has finished loading, all vertices are marked as active. Pregel 30

31 Pregel Execution 4. The master instructs each worker to perform a superstep. The worker loops through its active vertices and calls Compute() for each active vertex, delivering the messages that were sent in the previous superstep. When the worker finishes, it responds to the master with the number of vertices that will be active in the next superstep. Pregel 31

32 Pregel Execution Pregel 32

33 Pregel Execution Pregel 33

34 Fault Tolerance Checkpointing is used to implement fault tolerance. At the start of every superstep the master may instruct the workers to save the state of their partitions in stable storage. This includes vertex values, edge values and incoming messages. Master uses ping messages to detect worker failures. Pregel 34

35 Fault Tolerance When one or more workers fail, the current state of their partitions is lost. The master reassigns these partitions to the available workers, which reload their partition state from the most recent available checkpoint. That checkpoint can be many supersteps old, so the entire system is restarted from that superstep. Confined recovery can be used to reduce this cost. Pregel 35

36 Applications PageRank Pregel 36

37 PageRank PageRank is a link analysis algorithm that determines the importance of a document based on the number of references to it and the importance of the source documents themselves. (It was named after Larry Page, not after the rank of a web page.) Pregel 37

38 PageRank A = a given page; T1, ..., Tn = pages that point to page A (citations); d = damping factor between 0 and 1 (usually set to 0.85); C(T) = number of links going out of T; PR(A) = the PageRank of page A.

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + PR(T2)/C(T2) + ... + PR(Tn)/C(Tn) )

Pregel 38

39 PageRank Courtesy: Wikipedia Pregel 39

40 PageRank PageRank can be solved in two ways: as a system of linear equations, or with an iterative loop run until convergence. Pseudocode of the iterative version:

Initial value of PageRank of all pages = 1.0;
while (sum of PageRank of all pages - numPages > epsilon) {
  for each page Pi in list {
    PageRank(Pi) = (1 - d);
    for each page Pj linking to page Pi {
      PageRank(Pi) += d * (PageRank(Pj) / numOutLinks(Pj));
    }
  }
}

Pregel 40
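The same loop, written out as a self-contained C++ sketch. The graph is assumed to be given as outgoing-link adjacency lists; the convergence test below uses the change between successive iterations (a common choice) instead of the slide's sum-based test, and dangling pages are ignored, as in the pseudocode. All names are illustrative and not part of any Pregel API.

#include <cmath>
#include <cstddef>
#include <vector>

std::vector<double> PageRank(const std::vector<std::vector<int>>& out_links,
                             double d = 0.85, double epsilon = 1e-6) {
  const std::size_t n = out_links.size();
  std::vector<double> pr(n, 1.0);              // initial PageRank of every page is 1.0
  double delta = epsilon + 1.0;
  while (delta > epsilon) {
    std::vector<double> next(n, 1.0 - d);      // the (1 - d) term of the formula
    for (std::size_t j = 0; j < n; ++j)        // page j passes d * PR(j)/C(j) to each page it links to
      for (int i : out_links[j])
        next[i] += d * pr[j] / out_links[j].size();
    delta = 0.0;                               // total change since the previous iteration
    for (std::size_t i = 0; i < n; ++i) delta += std::fabs(next[i] - pr[i]);
    pr.swap(next);
  }
  return pr;
}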

41 PageRank in MapReduce Phase 1: Parsing HTML The map task takes (URL, page content) pairs and maps them to (URL, (PR_init, list of URLs)). PR_init is the seed PageRank for URL, and the list of URLs contains all pages pointed to by URL. The reduce task is just the identity function. Pregel 41

42 PageRank in MapReduce Phase 2: PageRank Distribution The map task takes (URL, (cur_rank, url_list)). For each u in url_list it emits (u, cur_rank / |url_list|), and it emits (URL, url_list) to carry the points-to list along through iterations. The reduce task gets (URL, url_list) and many (URL, val) values; it sums the vals, fixes up with d, and emits (URL, (new_rank, url_list)). Pregel 42

43 PageRank in MapReduce: Finalize A non-parallelizable component determines whether convergence has been achieved. If so, write out the PageRank lists; done. Otherwise, feed the output of Phase 2 into another Phase 2 iteration. Pregel 43

44 PageRank in Pregel

class PageRankVertex : public Vertex<double, void, double> {
 public:
  virtual void Compute(MessageIterator* msgs) {
    if (superstep() >= 1) {
      double sum = 0;
      for (; !msgs->Done(); msgs->Next())
        sum += msgs->Value();                     // PageRank fractions received from in-neighbors
      *MutableValue() = 0.15 / NumVertices() + 0.85 * sum;
    }
    if (superstep() < 30) {
      const int64 n = GetOutEdgeIterator().size();
      SendMessageToAllNeighbors(GetValue() / n);  // distribute rank along outgoing edges
    } else {
      VoteToHalt();                               // stop after a fixed number of supersteps
    }
  }
};

Pregel 44

45 PageRank in Pregel The Pregel implementation contains the PageRankVertex class, which inherits from the Vertex class. The vertex value type is double, storing the tentative PageRank, and the message type is double, carrying PageRank fractions. The graph is initialized so that in superstep 0 the value of each vertex is 1.0. Pregel 45

46 PageRank in Pregel In each superstep, each vertex sends along each outgoing edge its tentative PageRank divided by the number of outgoing edges. Each vertex also sums the values arriving in messages into sum and recomputes its own tentative PageRank from that sum. For convergence, either the number of supersteps is limited or aggregators are used to detect convergence. Pregel 46

47 Apache Giraph Large-scale Graph Processing on Hadoop Claudio Hadoop Amsterdam - 3 April 2014

48 2

49 Graphs are simple 3

50 A computer network 4

51 A social network 5

52 A semantic network 6

53 A map 7

54 Graphs are huge Google's index contains 50B pages Facebook has around 1.1B users Google+ has around 570M users Twitter has around 530M users VERY rough estimates! 8

55 9

56 Graphs aren t easy 10

57 Graphs are nasty. 11

58 Each vertex depends on its neighbours, recursively. 12

59 Recursive problems are nicely solved iteratively. 13

60 PageRank in MapReduce Record: <v_i, pr, [v_j, ..., v_k]> Mapper: emits <v_j, pr / #neighbours> Reducer: sums the partial values 14
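To make the record/mapper/reducer scheme concrete, here is a self-contained, in-memory simulation of one iteration. It only computes the new ranks; the real reducer would also re-emit the adjacency lists so the graph structure is carried into the next iteration, and pages with no incoming links do not appear in the output here. Record, OnePageRankIteration, and the damping default are illustrative names and values, not Giraph or Hadoop API.

#include <map>
#include <vector>

// Record: vertex id, current PageRank, and list of out-neighbors.
struct Record { int id; double pr; std::vector<int> neighbors; };

std::map<int, double> OnePageRankIteration(const std::vector<Record>& records,
                                           double d = 0.85) {
  std::map<int, double> partial;            // what the reducer would receive, keyed by vertex
  for (const Record& r : records)           // "map": emit pr / #neighbours to each neighbor
    for (int v : r.neighbors)
      partial[v] += r.pr / r.neighbors.size();
  for (auto& kv : partial)                  // "reduce": sum the partials and fix up with d
    kv.second = (1.0 - d) + d * kv.second;
  return partial;
}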

61 MapReduce dataflow 15

62 Drawbacks Each job is executed N times Job bootstrap Mappers send PR values and structure Extensive IO at input, shuffle & sort, output 16

63 17

64 Timeline Inspired by Google Pregel (2010) Donated to ASF by Yahoo! in 2011 Top-level project in release in January release in days

65 Plays well with Hadoop 19

66 Vertex-centric API 20

67 BSP machine 21

68 BSP & Giraph 22

69 Advantages No locks: message-based communication No semaphores: global synchronization Iteration isolation: massively parallelizable 23

70 Architecture 24

71 Giraph job lifetime 25

72 Designed for iterations Stateful (in-memory) Only intermediate values (messages) sent Hits the disk at input, output, checkpoint Can go out-of-core 26

73 A bunch of other things Combiners (minimises messages) Aggregators (global aggregations) MasterCompute (executed on master) WorkerContext (executed per worker) PartitionContext (executed per partition) 27

74 Shortest Paths 28

75 Shortest Paths 29

76 Shortest Paths 30

77 Shortest Paths 31

78 Shortest Paths 32

79 Composable API 33

80 Checkpointing 34

81 No SPoFs 35

82 Giraph scales ref: dges/

83 Giraph is fast 100x over MR (Pr) jobs run within minutes given you have resources ;-) 37

84 Serialised objects 38

85 Primitive types Autoboxing is expensive Object overhead (JVM) Use primitive types on your own Use primitive-type-based libs (e.g. fastutil) 39

86 Sharded aggregators 40

87 Many stores with Gora 41

88 And graph databases 42

89 Current and next steps Out-of-core graph and messages Jython interface Remove Writable from < I V E M > Partitioned supernodes More documentation 43

90 GraphLab: A New Framework for Parallel Machine Learning Yucheng Low, Aapo Kyrola, Carlos Guestrin, Joseph Gonzalez, Danny Bickson, Joe Hellerstein Presented by Guozhang Wang DB Lunch, Nov.8, 2010

91 Overview Programming ML Algorithms in Parallel Common Parallelism and MapReduce Global Synchronization Barriers GraphLab Data Dependency as a Graph Synchronization as Fold/Reduce Implementation and Experiments From Multicore to Distributed Environment

92 Parallel Processing for ML Parallel ML is a Necessity 13 Million Wikipedia Pages 3.6 Billion photos on Flickr etc Parallel ML is Hard to Program Concurrency vs. Deadlock Load Balancing Debugging etc

93 MapReduce is the Solution? High-level abstraction: Statistical Query Model [Chu et al, 2006] Weighted Linear Regression needs only sufficient statistics: Θ = A⁻¹ b, with A = Σᵢ wᵢ (xᵢ xᵢᵀ) and b = Σᵢ wᵢ (xᵢ yᵢ)

94 MapReduce is the Solution? High-level abstraction: Statistical Query Model [Chu et al, 2006] K-Means is embarrassingly parallel: data assignments are independent computations with no communication needed; class mean = avg(xᵢ) over the xᵢ in that class

95 ML in MapReduce Single Reducer Multiple Mapper Iterative MapReduce needs global synchronization at the single reducer K-means EM for graphical models gradient descent algorithms, etc

96 Not always Embarrassingly Parallel Data Dependency: not MapReducable Gibbs Sampling Belief Propagation SVM etc Capture Dependency as a Graph!

97 Overview Programming ML Algorithms in Parallel Common Parallelism and MapReduce Global Synchronization Barriers GraphLab Data Dependency as a Graph Synchronization as Fold/Reduce Implementation and Experiments From Multicore to Distributed Environment

98 Key Idea of GraphLab Sparse data dependencies; local computations. [Figure: sparse dependency graph over variables X1 through X9]

99 GraphLab for ML High-level Abstract Express data dependencies Iterative Automatic Multicore Parallelism Data Synchronization Consistency Scheduling

100 Main Components of GraphLab Data Graph Shared Data Table GraphLab Model Scheduling Update Functions and Scopes

101 Data Graph A graph with data associated with every vertex and edge. [Figure: vertices X1 through X11; x3: sample value, C(X3): sample counts, Φ(X6, X9): binary potential]

102 Update Functions Operations applied on a vertex that transform data in the scope of the vertex Gibbs Update: - Read samples on adjacent vertices - Read edge potentials - Compute a new sample for the current vertex

103 Scope Rules Consistency v.s. Parallelism Belief Propagation: Only uses edge data Gibbs Sampling: Needs to read adjacent vertices

104 Scheduling Scheduler determines the order of Update Function evaluations Static Scheduling Round Robin, etc Dynamic Scheduling FIFO, Priority Queue, etc

105 Dynamic Scheduling [Figure: vertices queued for update and pulled by CPU 1 and CPU 2]

106 Global Information Shared Data Table in Shared Memory Model parameters (updatable) Sufficient statistics (updatable) Constants, etc (fixed) Sync Functions for Updatable Shared Data Accumulate performs an aggregation over vertices Apply makes a final modification to the accumulated data

107 Sync Functions Much like Fold/Reduce: execute Accumulate over every vertex in turn, then execute Apply once at the end. Can be called periodically while update functions are active (asynchronous), or by the update function or user code (synchronous). See the sketch below.
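A self-contained sketch of that Fold/Reduce pattern: an accumulate step is folded over every vertex value, then a single apply step makes the final modification to the accumulated data. The function and parameter names are illustrative, not the GraphLab API.

#include <vector>

template <typename VertexValue, typename Acc, typename AccumulateFn, typename ApplyFn>
Acc Sync(const std::vector<VertexValue>& vertices, Acc acc,
         AccumulateFn accumulate, ApplyFn apply) {
  for (const VertexValue& v : vertices)   // "Accumulate" is folded over every vertex in turn
    acc = accumulate(acc, v);
  return apply(acc);                      // "Apply" makes one final modification at the end
}

// Example: average vertex value:
//   Sync(values, 0.0,
//        [](double a, double v) { return a + v; },
//        [&](double s) { return s / values.size(); });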

108 GraphLab Data Graph Shared Data Table GraphLab Model Scheduling Update Functions and Scopes

109 Overview Programming ML Algorithms in Parallel Common Parallelism and MapReduce Global Synchronization Barriers GraphLab Data Dependency as a Graph Synchronization as Fold/Reduce Implementation and Experiments From Multicore to Distributed Environment

110 Implementation and Experiments Shared Memory Implemention in C++ using Pthreads Applications: Belief Propagation Gibbs Sampling CoEM Lasso etc (more on the project page)

111 Better Parallel Performance [Plot: speedup vs. number of CPUs for the optimal, colored-schedule, and round-robin schedules]

112 From Multicore to Distributed Environment MapReduce and GraphLab work well for multicores: simple high-level abstractions, local computation + global synchronization. When migrating to clusters: rethink scope synchronization; rethink the shared data table (single reducer); think about load balancing; maybe rethink the abstract model?

113 Hot Topics in Information Management PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs Igor Shevchenko Mentor: Sebastian Schelter Fachgebiet Datenbanksysteme und Informationsmanagement Technische Universität Berlin DIMA TU Berlin 1

114 Agenda 1. Natural Graphs: Properties and Problems; 2. PowerGraph: Vertex Cut and Vertex Programs; 3. GAS Decomposition; 4. Vertex Cut Partitioning; 5. Delta Caching; 6. Applications and Evaluation; Paper: Gonzalez et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs DIMA TU Berlin 2

115 Natural Graphs Natural graphs are graphs derived from real-world or natural phenomena; Graphs are big: billions of vertices and edges and rich metadata; Natural graphs have Power-Law Degree Distribution DIMA TU Berlin 3

116 Power-Law Degree Distribution (Andrei Broder et al. Graph structure in the web) DIMA TU Berlin 4

117 Goal We want to analyze natural graphs; Essential for Data Mining and Machine Learning; Identify influential people and information; Identify special nodes and communities; Model complex data dependencies; Target ads and products; Find communities; Flow scheduling; DIMA TU Berlin 5

118 Problem Existing distributed graph computation systems perform poorly on natural graphs (Gonzalez et al. OSDI 12); The reason is presence of high degree vertices; High Degree Vertices: Star like motif DIMA TU Berlin 6

119 Problem Continued Possible problems with high degree vertices: Limited single-machine resources; Work imbalance; Sequential computation; Communication costs; Graph partitioning; Applicable to: Hadoop; GraphLab; Pregel (Piccolo); DIMA TU Berlin 7

120 Problem: Limited Single-Machine Resources High degree vertices can exceed the memory capacity of a single machine; Store edge meta-data and adjacency information; DIMA TU Berlin 8

121 Problem: Work Imbalance The power-law degree distribution can lead to significant work imbalance and frequency barriers; For ex. with synchronous execution (Pregel): DIMA TU Berlin 9

122 Problem: Sequential Computation No parallelization of individual vertex-programs; Edges are processed sequentially; Locking does not scale well to high degree vertices (for ex. in GraphLab); Sequentially process edges Asynchronous execution requires heavy locking DIMA TU Berlin 10

123 Problem: Communication Costs Generate and send large amount of identical messages (for ex. in Pregel); This results in communication asymmetry; DIMA TU Berlin 11

124 Problem: Graph Partitioning Natural graphs are difficult to partition; Pregel and GraphLab use random (hashed) partitioning on natural graphs thus maximizing the network communication; DIMA TU Berlin 12

125 Problem: Graph Partitioning Continued Natural graphs are difficult to partition; Pregel and GraphLab use random (hashed) partitioning on natural graphs, thus maximizing the network communication; With random placement, an edge stays uncut only when both endpoints land on the same machine, so the expected fraction of edges cut is 1 - 1/(number of machines). Examples: 10 machines: 90% of edges cut; 100 machines: 99% of edges cut; DIMA TU Berlin 13

126 In Summary GraphLab and Pregel are not well suited for computations on natural graphs; Reasons: Challenges of high-degree vertices; Low quality partitioning; Solution: PowerGraph new abstraction; DIMA TU Berlin 14

127 PowerGraph DIMA TU Berlin 15

128 Partition Techniques Two approaches for partitioning the graph in a distributed environment: Edge Cut; Vertex Cut; DIMA TU Berlin 16

129 Edge Cut Used by Pregel and GraphLab abstractions; Evenly assign vertices to machines; DIMA TU Berlin 17

130 Vertex Cut The strong point of the paper Used by the PowerGraph abstraction; Evenly assign edges to machines; [Figure: 4 edges placed on each of two machines] DIMA TU Berlin 18

131 Vertex Programs Think like a Vertex [Malewicz et al. SIGMOD 10] User-defined Vertex-Program: 1. Runs on each vertex; 2. Interactions are constrained by graph structure; Pregel and GraphLab also use this concept, where parallelism is achieved by running multiple vertex programs simultaneously; DIMA TU Berlin 19

132 GAS Decomposition The strong point of the paper Vertex cut distributes a single vertex-program across several machines; Allows to parallelize high-degree vertices; DIMA TU Berlin 20

133 GAS Decomposition The strong point of the paper Generalize the vertex-program into three phases: 1. Gather: accumulate information about the neighborhood; 2. Apply: apply the accumulated value to the center vertex; 3. Scatter: update adjacent edges and vertices; Gather, Apply and Scatter are user-defined functions (see the sketch below); DIMA TU Berlin 21
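A sketch of a GAS-style PageRank vertex program, just to make the three phases concrete. The struct and function names, the VertexData/EdgeData types, and the 0.15/0.85 constants are illustrative assumptions; the real PowerGraph C++ interface differs.

// Minimal data attached to vertices and edges, defined here to keep the sketch
// self-contained (these are not PowerGraph types).
struct VertexData { double rank; int out_degree; };
struct EdgeData {};

struct PageRankProgram {
  // Gather: run on adjacent edges in parallel, producing a partial contribution.
  double Gather(const VertexData& neighbor, const EdgeData&) const {
    return neighbor.rank / neighbor.out_degree;
  }
  // How partial gather results are merged (must be commutative and associative).
  double Sum(double a, double b) const { return a + b; }

  // Apply: run once on the center vertex with the combined accumulator.
  void Apply(VertexData* center, double acc) const {
    center->rank = 0.15 + 0.85 * acc;
  }

  // Scatter: run on adjacent edges in parallel, e.g. to activate neighbors again.
  bool Scatter(const VertexData&, const EdgeData&) const {
    return true;  // signal the neighbor to run in the next round
  }
};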

134 Gather Phase Executed on the edges in parallel; Accumulate information about neighborhood; DIMA TU Berlin 22

135 Apply Phase Executed on the central vertex; Apply accumulated value to center vertex; DIMA TU Berlin 23

136 2D Partitioning Aydin Buluc and Kamesh Madduri 8 / 13

137 Graph Partitioning for Scalable Distributed Graph Computations Aydın Buluç Kamesh Madduri 10 th DIMACS Implementation Challenge, Graph Partitioning and Graph Clustering February 13 14, 2012 Atlanta, GA

138 Overview of our study We assess the impact of graph partitioning for computations on low diameter graphs Does minimizing edge cut lead to lower execution time? We choose parallel Breadth First Search as a representative distributed graph computation Performance analysis on DIMACS Challenge instances 2

139 Key Observations for Parallel BFS Well-balanced vertex and edge partitions do not guarantee load-balanced execution, particularly for real-world graphs Range of relative speedups (8.8-50X at 256-way parallel concurrency) for low-diameter DIMACS graph instances Graph partitioning methods reduce overall edge cut and communication volume, but lead to increased computational load imbalance Inter-node communication time is not the dominant cost in our tuned bulk-synchronous parallel BFS implementation 3

140 Talk Outline Level synchronous parallel BFS on distributedmemory systems Analysis of communication costs Machine independent counts for inter node communication cost Parallel BFS performance results for several large scale DIMACS graph instances 4

141 Parallel BFS strategies 1. Expand the current frontier (level-synchronous approach, suited for low-diameter graphs): starting from the source vertex, the adjacencies of all vertices in the current frontier are visited in parallel, taking O(D) parallel steps; see the sketch below. 2. Stitch multiple concurrent traversals (Ullman-Yannakakis, for high-diameter graphs): path-limited searches from super vertices, then APSP between super vertices. 5
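A sequential sketch of strategy 1, the level-synchronous expansion. In the distributed version the loop over the frontier is partitioned across processes, but the level structure is the same; names are illustrative.

#include <vector>

// Returns the BFS level of every vertex (-1 if unreachable); there are at most
// O(D) rounds for a graph of diameter D.
std::vector<int> LevelSyncBFS(const std::vector<std::vector<int>>& adj, int source) {
  std::vector<int> level(adj.size(), -1);
  std::vector<int> frontier{source};
  level[source] = 0;
  for (int d = 1; !frontier.empty(); ++d) {
    std::vector<int> next;
    for (int u : frontier)          // in the parallel version this loop is distributed
      for (int v : adj[u])
        if (level[v] == -1) {
          level[v] = d;
          next.push_back(v);
        }
    frontier.swap(next);            // the newly discovered vertices form the next frontier
  }
  return level;
}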

142 2D graph distribution Consider a logical 2D processor grid (p_r * p_c = p) and the dense matrix representation of the graph. Assign each processor a sub-matrix (i.e., the edges within that sub-matrix). [Figure: 9 vertices, 9 processors, 3x3 processor grid; the adjacency matrix is flattened into sparse per-processor local graph representations]

143 BFS with a 1D partitioned graph Consider an undirected graph with n vertices and m edges. Each processor owns n/p vertices and stores their adjacencies (~2m/p per processor, assuming balanced partitions). [Figure: example graph with its edge list distributed across processors] Steps: 1. Local discovery: explore adjacencies of vertices in the current frontier. 2. Fold: all-to-all exchange of adjacencies. 3. Local update: update distances/parents for unvisited vertices.

144 BFS with a 1D partitioned graph 0 1 Current frontier: vertices 1 (partition Blue) and 6 (partition Green) P0 1. Local discovery: [1,0] [1,4] [1,6] P1 No work 2 5 P2 No work P3 [6,1] [6,2] [6, 3] [6,4] Steps: 1. Local discovery: Explore adjacencies of vertices in current frontier. 2. Fold: All to all exchange of adjacencies. 3. Local update: Update distances/parents for unvisited vertices.

145 BFS with a 1D partitioned graph 0 1 Current frontier: vertices 1 (partition Blue) and 6 (partition Green) P0 2. All to all exchange: [1,0] [1,4] [1,6] P1 No work 2 5 P2 No work P3 [6,1] [6,2] [6, 3] [6,4] Steps: 1. Local discovery: Explore adjacencies of vertices in current frontier. 2. Fold: All to all exchange of adjacencies. 3. Local update: Update distances/parents for unvisited vertices.

146 BFS with a 1D partitioned graph 0 1 Current frontier: vertices 1 (partition Blue) and 6 (partition Green) P0 2. All to all exchange: [1,0] [6,1] 2 5 P1 P2 P3 [6,2] [6, 3] [6,4] [1,6] [1,4] Steps: 1. Local discovery: Explore adjacencies of vertices in current frontier. 2. Fold: All to all exchange of adjacencies. 3. Local update: Update distances/parents for unvisited vertices.

147 BFS with a 1D partitioned graph P0 Current frontier: vertices 1 (partition Blue) and 6 (partition Green) 3. Local update: [1,0] [6,1] Frontier for next iteration P1 P2 P3 [6,2] [6, 3] [6,4] [1,6] [1,4] Steps: 1. Local discovery: Explore adjacencies of vertices in current frontier. 2. Fold: All to all exchange of adjacencies. 3. Local update: Update distances/parents for unvisited vertices. 2, 3 4

148 Modeling parallel execution time Time is dominated by local memory references and inter-node communication. Assuming perfectly balanced computation and communication, the local memory reference cost scales with m/p and n/p (weighted by the inverse local RAM bandwidth and the local latency on a working set of size n/p), and the inter-node communication cost scales with edgecut/p (weighted by the all-to-all remote bandwidth with p participating processors). 12

149 BFS with a 2D partitioned graph Avoids the expensive p-way all-to-all communication step. Each process collectively owns n/p_r vertices, and an additional Allgather communication step is needed among the processes in a row. The local memory reference and inter-node communication costs are analogous to the 1D case, with the all-to-all exchange now over p_c processes and the allgather over p_r processes. 13

150 Temporal effects and communication-minimizing tuning prevent us from obtaining tighter bounds. The volume of communication can be further reduced by maintaining the state of non-local visited vertices and pruning locally prior to the all-to-all step. [Figure: example of duplicate adjacencies removed before the exchange] 14

151 Predictable BFS execution time for synthetic small-world graphs Randomly permuting vertex IDs ensures load balance on R-MAT graphs (used in the Graph 500 benchmark). Our tuned parallel implementation for the NERSC Hopper system (Cray XE6) is ranked #2 on the current Graph 500 list. Execution time is dominated by work performed in a few parallel phases. Buluc & Madduri, Parallel BFS on distributed memory systems, Proc. SC'11.

152 Modeling BFS execution time for real world graphs Can we further reduce communication time utilizing existing partitioning methods? Does the model predict execution time for arbitrary low diameter graphs? We try out various partitioning and graph distribution schemes on the DIMACS Challenge graph instances Natural ordering, Random, Metis, PaToH 16

153 Experimental Study The (weak) upper bound on the aggregate communication data volume can be statically computed (based on the partitioning of the graph) We determine runtime estimates of: total aggregate communication volume; sum of the maximum communication volume during each BFS iteration; intra-node computational work balance; communication volume reduction with 2D partitioning We obtain and analyze execution times (at several different parallel concurrencies) on a Cray XE6 system (Hopper, NERSC) 17

154 Orderings for the CoPapersCiteseer graph Natural Random Metis PaToH PaToH checkerboard 18

155 BFS all-to-all phase: total communication volume normalized to the number of edges (m) [Table: volume as % of m for Natural, Random, and PaToH orderings at several partition counts] 19

156 Ratio of maximum communication volume across iterations to total communication volume [Table: ratio over total volume for Natural, Random, and PaToH orderings at several partition counts] 20

157 Reduction in total all-to-all communication volume with 2D partitioning [Table: ratio compared to 1D for Natural, Random, and PaToH orderings at several partition counts] 21

158 Edge count balance with 2D partitioning [Table: max/avg ratio for Natural, Random, and PaToH orderings at several partition counts]

159 Parallel speedup on Hopper with 16 way partitioning 23

160 Execution time breakdown [Plots: BFS time and communication time in ms for eu-2005 and kron_simple_logn18, split into computation, fold, and expand phases, under Random 1D, Random 2D, Metis 1D, and PaToH 1D partitioning strategies] 24

161 Imbalance in parallel execution eu-2005, 16 processes (timeline of 4 processes shown in the figures): the PaToH-partitioned graph suffers from severe load imbalance in the computational phases compared to random partitioning. 25

162 Conclusions Randomly permuting vertex identifiers improves computational and communication load balance, particularly at higher process concurrencies Partitioning methods reduce overall communication volume, but introduce significant load imbalance Substantially lower parallel speedup with real-world graphs compared to synthetic graphs (8.8X vs. 50X at 256-way parallel concurrency) Points to the need for dynamic load balancing 26


More information

CS6200 Information Retreival. The WebGraph. July 13, 2015

CS6200 Information Retreival. The WebGraph. July 13, 2015 CS6200 Information Retreival The WebGraph The WebGraph July 13, 2015 1 Web Graph: pages and links The WebGraph describes the directed links between pages of the World Wide Web. A directed edge connects

More information

Introduction to MapReduce. Adapted from Jimmy Lin (U. Maryland, USA)

Introduction to MapReduce. Adapted from Jimmy Lin (U. Maryland, USA) Introduction to MapReduce Adapted from Jimmy Lin (U. Maryland, USA) Motivation Overview Need for handling big data New programming paradigm Review of functional programming mapreduce uses this abstraction

More information

Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora

Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora Team Renato Marroquín! PhD student: Interested in: Information retrieval. Distributed and scalable data management. Apache Gora:

More information

GraphChi: Large-Scale Graph Computation on Just a PC

GraphChi: Large-Scale Graph Computation on Just a PC OSDI 12 GraphChi: Large-Scale Graph Computation on Just a PC Aapo Kyrölä (CMU) Guy Blelloch (CMU) Carlos Guestrin (UW) In co- opera+on with the GraphLab team. BigData with Structure: BigGraph social graph

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 2: MapReduce Algorithm Design (2/2) January 12, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

Data-Intensive Computing with MapReduce

Data-Intensive Computing with MapReduce Data-Intensive Computing with MapReduce Session 5: Graph Processing Jimmy Lin University of Maryland Thursday, February 21, 2013 This work is licensed under a Creative Commons Attribution-Noncommercial-Share

More information

Why do we need graph processing?

Why do we need graph processing? Why do we need graph processing? Community detection: suggest followers? Determine what products people will like Count how many people are in different communities (polling?) Graphs are Everywhere Group

More information

MapReduce Spark. Some slides are adapted from those of Jeff Dean and Matei Zaharia

MapReduce Spark. Some slides are adapted from those of Jeff Dean and Matei Zaharia MapReduce Spark Some slides are adapted from those of Jeff Dean and Matei Zaharia What have we learnt so far? Distributed storage systems consistency semantics protocols for fault tolerance Paxos, Raft,

More information

Introduction to MapReduce (cont.)

Introduction to MapReduce (cont.) Introduction to MapReduce (cont.) Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com USC INF 553 Foundations and Applications of Data Mining (Fall 2018) 2 MapReduce: Summary USC INF 553 Foundations

More information

CS 6453: Parameter Server. Soumya Basu March 7, 2017

CS 6453: Parameter Server. Soumya Basu March 7, 2017 CS 6453: Parameter Server Soumya Basu March 7, 2017 What is a Parameter Server? Server for large scale machine learning problems Machine learning tasks in a nutshell: Feature Extraction (1, 1, 1) (2, -1,

More information

Distributed Systems. Fall 2017 Exam 3 Review. Paul Krzyzanowski. Rutgers University. Fall 2017

Distributed Systems. Fall 2017 Exam 3 Review. Paul Krzyzanowski. Rutgers University. Fall 2017 Distributed Systems Fall 2017 Exam 3 Review Paul Krzyzanowski Rutgers University Fall 2017 December 11, 2017 CS 417 2017 Paul Krzyzanowski 1 Question 1 The core task of the user s map function within a

More information

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah Jure Leskovec (@jure) Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah 2 My research group at Stanford: Mining and modeling large social and information networks

More information

Link Analysis in the Cloud

Link Analysis in the Cloud Cloud Computing Link Analysis in the Cloud Dell Zhang Birkbeck, University of London 2017/18 Graph Problems & Representations What is a Graph? G = (V,E), where V represents the set of vertices (nodes)

More information

Clustering Lecture 8: MapReduce

Clustering Lecture 8: MapReduce Clustering Lecture 8: MapReduce Jing Gao SUNY Buffalo 1 Divide and Conquer Work Partition w 1 w 2 w 3 worker worker worker r 1 r 2 r 3 Result Combine 4 Distributed Grep Very big data Split data Split data

More information

Resilient Distributed Datasets

Resilient Distributed Datasets Resilient Distributed Datasets A Fault- Tolerant Abstraction for In- Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin,

More information

Irregular Graph Algorithms on Parallel Processing Systems

Irregular Graph Algorithms on Parallel Processing Systems Irregular Graph Algorithms on Parallel Processing Systems George M. Slota 1,2 Kamesh Madduri 1 (advisor) Sivasankaran Rajamanickam 2 (Sandia mentor) 1 Penn State University, 2 Sandia National Laboratories

More information

Graph Algorithms. Revised based on the slides by Ruoming Kent State

Graph Algorithms. Revised based on the slides by Ruoming Kent State Graph Algorithms Adapted from UMD Jimmy Lin s slides, which is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States. See http://creativecommons.org/licenses/by-nc-sa/3.0/us/

More information

Efficient and Simplified Parallel Graph Processing over CPU and MIC

Efficient and Simplified Parallel Graph Processing over CPU and MIC Efficient and Simplified Parallel Graph Processing over CPU and MIC Linchuan Chen Xin Huo Bin Ren Surabhi Jain Gagan Agrawal Department of Computer Science and Engineering The Ohio State University Columbus,

More information

Shark. Hive on Spark. Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker

Shark. Hive on Spark. Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Shark Hive on Spark Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Agenda Intro to Spark Apache Hive Shark Shark s Improvements over Hive Demo Alpha

More information

Scalable Algorithmic Techniques Decompositions & Mapping. Alexandre David

Scalable Algorithmic Techniques Decompositions & Mapping. Alexandre David Scalable Algorithmic Techniques Decompositions & Mapping Alexandre David 1.2.05 adavid@cs.aau.dk Introduction Focus on data parallelism, scale with size. Task parallelism limited. Notion of scalability

More information

Introduction to MapReduce

Introduction to MapReduce 732A54 Big Data Analytics Introduction to MapReduce Christoph Kessler IDA, Linköping University Towards Parallel Processing of Big-Data Big Data too large to be read+processed in reasonable time by 1 server

More information