Memory-Optimized Distributed Graph Processing through Novel Compression Techniques

Katia Papakonstantinopoulou
Joint work with Panagiotis Liakos and Alex Delis
University of Athens

Athens Colloquium in Algorithms and Complexity, UoA, August 26th, 2016
Motivation

- Graph data sizes grow continuously.
- Distributed graph-processing systems (e.g., Pregel and Apache Giraph) have emerged to cope.
- The scale of real-world graphs makes graph processing hard even in distributed environments.
- We need efficient distributed representations of such graphs.

We address this problem by exploiting empirically observed properties of behavior graphs.
Distributed graph-processing approaches and systems

- Most of these approaches adopt the "think like a vertex" programming paradigm introduced with Pregel [5], which yields intuitive, parallelizable algorithms.
- The proposed frameworks [1, 6, 4, 7] fail to handle the huge scale of real-world graphs, as a result of ineffective memory usage [2].
- Partitioning makes the task of compression harder.

[Figure: a graph partitioned on a vertex basis across three workers, each worker holding its vertices together with their out-neighbor lists.]
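The vertex-centric paradigm can be illustrated with a minimal, single-machine sketch of one PageRank superstep: each vertex sends its rank share along its out-edges, then every vertex combines the messages it received. This is an illustrative simplification (the function name, dict-based graph, and damping default are ours, not Pregel's or Giraph's API):

```python
def pagerank_superstep(out_edges, ranks, damping=0.85):
    # One "think like a vertex" superstep: every vertex v sends
    # ranks[v] / out_degree(v) to each of its out-neighbors, then each
    # vertex folds its incoming messages into a new rank.
    n = len(ranks)
    messages = {v: 0.0 for v in ranks}
    for v, targets in out_edges.items():
        if targets:
            share = ranks[v] / len(targets)
            for t in targets:
                messages[t] += share
    return {v: (1 - damping) / n + damping * messages[v] for v in ranks}
```

In a real Pregel/Giraph run the message exchange crosses worker boundaries, which is why the per-worker representation of the out-edge lists dominates memory usage.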
Indicative memory-optimization approaches

Distributed graph-processing systems face significant memory-related issues. Some memory-optimization approaches:

- Apache Giraph [1] (a graph-processing system that follows Pregel, with contributions by Facebook): focused entirely on a more careful implementation of the representation of a vertex's out-edges, without exploiting the redundancy of real-world graphs.
- GBASE [3] (a number of alternative compression techniques to reduce storage and, hence, network traffic and query-execution time): does not follow the vertex-centric model and requires decompression.
Contribution

We follow the Pregel paradigm and partition the graph vertices among the nodes of a distributed computing environment. In this context, we present a number of novel techniques that:

1. offer space-efficient representations of the out-edges of vertices,
2. allow fast, in-situ mining of the graph's elements without the need for decompression,
3. enable the execution of graph algorithms in memory-constrained settings, and
4. ease the task of memory management, thus allowing faster execution.

Our work lies at the intersection of distributed graph-processing systems and compressed graph representations.
Distributed vs. non-distributed settings

In non-distributed settings, we can exploit the fact that vertices tend to exhibit similarities (the copy property). To achieve memory optimization here, we need representations that allow mining of the graph's elements without decompression.
Giraph structures for out-neighbors

Giraph's adjacency-list representation ByteArrayEdges serializes each out-edge as a (vertex id, weight) pair, stored back-to-back in a flat byte array. The bytes required per out-neighbor are determined by the data types used for the id and the weight; for integers, 4 + 4 = 8 bytes are required.
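A minimal sketch of this flat layout (our own helper, not Giraph's Java implementation) makes the 8-bytes-per-edge baseline concrete:

```python
import struct

def byte_array_edges(edges):
    # Serialize (target_id, weight) integer pairs back-to-back,
    # 4 + 4 bytes each, mirroring a flat adjacency byte array.
    return b"".join(struct.pack("<ii", t, w) for t, w in edges)
```

Every extra out-edge costs exactly 8 bytes here, regardless of any regularity in the neighbor ids; that untouched redundancy is what the compressed representations below exploit.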
Representations based on consecutive out-edges

Consider the following sequence of neighbors to be represented: (2, 9, 10, 11, 12, 14, 17, 18, 20, 127).

BVEdges:
- number of intervals: (1)_2 (4 bytes)
- interval: (9)_2 (4 bytes), length γ(0)
- residuals: ζ(13) ζ(11) ζ(2) ζ(0) ζ(1) ζ(106)

In the context of graph compression, Elias γ coding is preferred for representing rather small values of x, whereas ζ coding is more appropriate for potentially large values.
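As a small illustration of the instantaneous codes involved, here is an Elias γ encoder (our own sketch; following the WebGraph convention, a non-negative integer x is coded as the γ code of x + 1, so that 0 is representable):

```python
def elias_gamma(x):
    # Elias gamma code for a non-negative integer, shifted by 1:
    # write (len(bin(x + 1)) - 1) zeros, then the binary digits of x + 1.
    n = x + 1
    bits = bin(n)[2:]
    return "0" * (len(bits) - 1) + bits
```

Small values get very short codewords (γ(0) is a single bit), which is exactly why γ suits interval lengths, while the more slowly growing ζ codes suit the larger residual gaps.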
Representations based on consecutive out-edges

Consider again the sequence of neighbors: (2, 9, 10, 11, 12, 14, 17, 18, 20, 127).

IntervalResidualEdges:
- number of intervals: (2)_2 (4 bytes)
- 1st interval: (9)_2, (4)_2 (4 + 1 bytes); 2nd interval: (17)_2, (2)_2 (4 + 1 bytes)
- residuals: (2)_2, (14)_2, (20)_2, (127)_2 (4 bytes each)
Representation based on concentration of out-edges

Again using the sequence: (2, 9, 10, 11, 12, 14, 17, 18, 20, 127).

IndexedBitArrayEdges stores, for each non-empty block of 8 consecutive vertex ids, the block index — here (0)_2, (1)_2, (2)_2, (15)_2, 4 bytes each — followed by a 1-byte bit array marking which of the block's 8 ids are present.
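A sketch of this bucketing (our own helper; byte values shown are one consistent bit-ordering choice, least-significant bit for the lowest id in the block):

```python
def indexed_bit_array(neighbors):
    # Group neighbor ids into 8-id blocks: record each non-empty block's
    # index together with one byte whose set bits mark the members present.
    buckets = {}
    for n in neighbors:
        buckets[n // 8] = buckets.get(n // 8, 0) | (1 << (n % 8))
    return sorted(buckets.items())
```

Dense neighborhoods (many ids in the same block, a concentration frequently seen in behavior graphs) then cost one index plus a single byte, instead of 8 bytes per id.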
Experimental Evaluation

- How much more space-efficient is each of our three compressed out-edge representations compared to ByteArrayEdges?
- Are our techniques competitive speed-wise when memory is not a concern?
- How much more efficient are our compressed representations when the available memory is constrained?
- Can we execute algorithms for large graphs in settings where it was not possible before?
Space Efficiency Comparison

Memory requirements of Giraph's ByteArrayEdges and our three representations for the small and large-scale graphs of our dataset:

graph                  ByteArrayEdges   BVEdges       IntervalResidualEdges  IndexedBitArrayEdges
uk-2007-05@100000          22.61 MB       6.41 MB          7.92 MB               8.91 MB
uk-2007-05@1000000        279.16 MB      67.36 MB         82.70 MB              97.79 MB
indochina-2004          1,511.67 MB     442.34 MB        646.03 MB             554.23 MB
hollywood-2011          1,381.91 MB     287.53 MB        613.52 MB             676.88 MB
uk-2002                 2,733.60 MB   1,092.82 MB      1,224.07 MB           1,255.67 MB
arabic-2005             4,820.09 MB   1,428.97 MB      1,674.75 MB           1,849.83 MB
uk-2005                 7,401.88 MB   2,383.54 MB      2,728.74 MB           2,928.81 MB
sk-2005                14,829.64 MB   4,889.85 MB      5,657.79 MB           6,354.17 MB
Execution Time Comparison (small-scale graphs)

Execution time of the PageRank algorithm for graph indochina-2004 using setups of 2, 4, and 8 workers:

[Figure: execution time in minutes (0-45) for ByteArrayEdges, BVEdges, IntervalResidualEdges, and IndexedBitArrayEdges with 2, 4, and 8 workers.]
Execution Time Comparison (large-scale graphs)

Execution time for each superstep of the PageRank algorithm for graph uk-2005 using 5 workers:

[Figure: per-superstep execution time in minutes (0-10) over 30 PageRank supersteps for ByteArrayEdges, BVEdges, IntervalResidualEdges, and IndexedBitArrayEdges.]

ByteArrayEdges' performance fluctuates due to extensive garbage collection.
Execution Time Comparison (large-scale graphs)

Execution time of the PageRank algorithm for graph uk-2005:

[Figure: total execution time in minutes (0-200) for ByteArrayEdges, BVEdges, IntervalResidualEdges, and IndexedBitArrayEdges on uk-2005 with 5 workers and with 4 workers; one 4-worker run is marked FAILED.]

IntervalResidualEdges and IndexedBitArrayEdges outperform ByteArrayEdges.
Conclusion

- Our experimental results indicate significant improvements in space efficiency for all proposed techniques: we reduced memory requirements by up to 5x compared with currently applied techniques.
- In settings where earlier approaches were able to execute graph algorithms, we achieve significant performance improvements: we reduced execution time by up to 31% due to memory optimization.
- These findings establish our structures as the preferable option for web graphs, or any other type of behavior graph.
Future Directions

- Design representation methods that favor mutations of the graph.
- Examine the compression of edge weights.
References

[1] Apache Giraph. http://giraph.apache.org/.
[2] Minyang Han, Khuzaima Daudjee, Khaled Ammar, M. Tamer Özsu, Xingfang Wang, and Tianqi Jin. An Experimental Comparison of Pregel-like Graph Processing Systems. Proc. of the VLDB Endowment, 7(12):1047-1058, 2014.
[3] U. Kang, Hanghang Tong, Jimeng Sun, Ching-Yung Lin, and Christos Faloutsos. GBASE: An Efficient Analysis Platform for Large Graphs. The VLDB Journal, 21(5):637-650, 2012.
[4] Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning in the Cloud. Proc. of the VLDB Endowment, 5(8):716-727, 2012.
[5] Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: A System for Large-Scale Graph Processing. In ACM SIGMOD, 2010.
[6] Semih Salihoglu and Jennifer Widom. GPS: A Graph Processing System. In SSDBM, 2013.
[7] Da Yan, James Cheng, Yi Lu, and Wilfred Ng. Effective Techniques for Message Reduction and Load Balancing in Distributed Graph Computation. In WWW, 2015.
Thank you for your attention!

For further details visit http://hive.di.uoa.gr/network-analysis/ or email me at katia@di.uoa.gr.