Mosaic: Processing a Trillion-Edge Graph on a Single Machine
1 Mosaic: Processing a Trillion-Edge Graph on a Single Machine. Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, Taesoo Kim. Georgia Institute of Technology. Best Student Paper, EuroSys '17. Steffen Maass Mosaic: Trillion Edges on a Single Machine June 8, 07 / 8
4 Large-scale graph processing is ubiquitous: social networks, genome analysis, and graphs that enable machine learning.
8 Powerful, heterogeneous machines: terabytes of RAM on multiple sockets; powerful many-core coprocessors; fast, large-capacity non-volatile memory. Goal: take advantage of a heterogeneous machine to process tera-scale graphs.
9 Table of contents. Graph Processing: Sample Application. Design: Mosaic Architecture, Graph Encoding, API. Evaluation.
10 Graph Processing: Applications. Community detection; find common friends; find shortest paths; estimate the impact of vertices (webpages, users, ...); ...
11 Example: Pagerank. Calculate the impact of each vertex: $\mathrm{Pagerank}_v = \alpha \sum_{u \in \mathrm{Neighborhood}(v)} \frac{\mathrm{Pagerank}_u}{\mathrm{degree}_u} + (1 - \alpha)$. Simple algorithm: in each iteration, distribute the current impact along out-edges, weighted by degree; sum up all incoming impacts to obtain the new impact for the next iteration; weight the new impact with the regularization factor α = 0.8; repeat until no changes.
12 Pagerank: Graphical example. 1) Initialize.
13 Pagerank: Graphical example. 2) Propagate along outgoing edges.
14 Pagerank: Graphical example. 3) Sum up incoming contributions.
15 Pagerank: Graphical example. 4) Apply regularization.
16 Pagerank: Graphical example. 5) Update outgoing edges for the second iteration.
17 Pagerank: Graphical example. 6) Repeat until stabilized.
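The six steps of the graphical example can be sketched in a few lines of Python. This is only an illustration of the simple algorithm from the slides, not Mosaic's implementation: the function name `pagerank`, the dictionary-based graph representation, the initial value of 1.0, the tolerance and the iteration cap are all assumptions made for the example. The default `alpha` is the value printed on the slide.

```python
def pagerank(out_edges, alpha=0.8, tol=1e-9, max_iters=1000):
    """Iterate Pagerank on a graph given as {vertex: [targets of out-edges]}.

    Hypothetical helper for illustration; alpha is the regularization factor.
    """
    vertices = set(out_edges) | {t for ts in out_edges.values() for t in ts}
    rank = {v: 1.0 for v in vertices}              # 1) initialize (assumed start value)
    for _ in range(max_iters):
        incoming = {v: 0.0 for v in vertices}
        for src, targets in out_edges.items():
            if not targets:
                continue
            share = rank[src] / len(targets)       # 2) propagate, weighted by out-degree
            for tgt in targets:
                incoming[tgt] += share             # 3) sum up incoming contributions
        # 4) apply regularization: alpha * sum + (1 - alpha)
        new_rank = {v: alpha * incoming[v] + (1 - alpha) for v in vertices}
        delta = max(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank                            # 5) values used by the next iteration
        if delta < tol:                            # 6) repeat until stabilized
            break
    return rank
```

On a directed three-vertex cycle every vertex keeps the same value, because each vertex receives exactly what it distributes.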
20 Mosaic: Design space. Graph processing has many faces. Single machine, out-of-core: cheap, but potentially slow. Single machine, in memory: fast, but limited graph size. Cluster, out-of-core: large graphs, but expensive and slow. Cluster, in memory: large graphs and fast, but very expensive. Single machine, out-of-core is most cost-effective. Goal: good performance and large graphs!
21 Mosaic: Design goals. Goal: run algorithms on very large graphs on a single machine using coprocessors. Enabled by: a common, familiar API (vertex/edge-centric); encoding (lossless compression, cache locality); processing on isolated subgraphs.
22 Architecture of Mosaic. Usage of Xeon Phi & NVMe; involvement of host. [Figure: architecture diagram. Labels: global vertex state (<current state>, <next state>), host processors (Xeon), NVMe, meta transfer, tile transfer, Xeon Phi (fetch, edge processing, receive), PCIe.]
23 Graph encoding: Idea. Compression: split the graph into subgraphs and use local (short) identifiers. Cache locality: inside subgraphs, sort by access order; between subgraphs, overlap vertex sets.
24 Background: Column first. Locality for write, multiple sequential reads. [Figure: global adjacency matrix (source vertex by target vertex) divided into 16 partitions P, traversed column first.]
25 Background: Row first. Locality for read, multiple sequential writes. [Figure: the same partitioned adjacency matrix, traversed row first.]
26 Background: Hilbert order. Space-filling curve; provides locality between adjacent data points. [Figure: the same partitioned adjacency matrix, traversed in Hilbert order.]
27 From global to local: Tiles. Convert the graph to a set of tiles. 1) Start with the adjacency matrix. [Figure: example graph with edges ➊ to ➒ placed in the global adjacency matrix (source vertex by target vertex), divided into 16 partitions P.]
28 From global to local: Tiles. 2) Use the first edge (➊) in the first tile. [Figure: the tile stores the edge with local vertex ids; the tile meta data (I) maps local ids to global ids; ➊ marks the local edge store order.]
29 From global to local: Tiles. 3) Use more edges in the tile: ➊ ➋.
30 From global to local: Tiles. 4) Use more edges in the tile: ➊ ➋ ➍.
31 From global to local: Tiles. 5) Use more edges in the tile: ➊ ➋ ➍ ➌.
32 From global to local: Tiles. 6) The next edges do not fit in the first tile anymore, so construct a second tile. [Figure: first tile with edges ➊ ➋ ➍ ➌ and its meta data; second tile with edges ➎ ➐ ➑ ➒ ➏ and its meta data.]
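The tile construction steps can be sketched as a single pass over the edges. This is a simplified illustration under stated assumptions, not the Mosaic preprocessing code: the edges are assumed to arrive already in the desired (for example Hilbert) order, a tile is considered full once it would need more than `max_local_ids` distinct source or target vertices, and the names `build_tiles`, `src_global`, `tgt_global` and `edges` are made up for the example.

```python
def build_tiles(edges, max_local_ids):
    """Split (src, tgt) edges with global ids into tiles that use local ids.

    Each tile is a dict with:
      src_global / tgt_global: meta data, index = local id, value = global id
      edges: list of (local_src, local_tgt) in store order
    Assumes max_local_ids >= 1.
    """
    tiles = []
    src_ids, tgt_ids, local_edges = {}, {}, []

    def finish():
        # dicts keep insertion order, so position in the list == local id
        return {"src_global": list(src_ids),
                "tgt_global": list(tgt_ids),
                "edges": local_edges}

    for src, tgt in edges:
        needs_src = src not in src_ids
        needs_tgt = tgt not in tgt_ids
        full = (len(src_ids) + needs_src > max_local_ids
                or len(tgt_ids) + needs_tgt > max_local_ids)
        if full:
            # The edge does not fit anymore: close this tile, start the next.
            tiles.append(finish())
            src_ids, tgt_ids, local_edges = {}, {}, []
        local_src = src_ids.setdefault(src, len(src_ids))
        local_tgt = tgt_ids.setdefault(tgt, len(tgt_ids))
        local_edges.append((local_src, local_tgt))
    if local_edges:
        tiles.append(finish())
    return tiles
```

Because every tile carries its own local-to-global mapping, edges inside a tile only need identifiers as wide as `max_local_ids` requires, which is where the compression from short local identifiers comes from.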
33 Locality with Hilbert-ordered tiles. Overlapping sets of sources and targets; better locality than row-first or column-first. [Figure: the example edges in the partitioned global adjacency matrix, visited in Hilbert order.]
34 From global to local: Data structure. 1) Split the original graph into two subgraphs. [Figure: the example graph with edges ➊ to ➒ divided into two subgraphs.]
35 From global to local: Data structure. 2) Internal data structure of the tile with edges ➎ ➐ ➑ ➒ ➏: an index that maps local ids to global ids, and the edges stored as sorted (src, tgt) pairs of local ids.
36 From global to local: Data structure. 3) Compress edges: compressed sparse rows. The (src, tgt) pairs are replaced by src entries plus (tgt, #tgt) entries. Efficient, local encoding, sequentially accessed.
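A compressed-sparse-rows style encoding of a tile's edges can be sketched as follows. The slide only shows that edges become src entries plus (tgt, #tgt) entries; grouping the edges by target, the sort key and the function names below are assumptions made for this illustration.

```python
from itertools import groupby


def compress_edges(local_edges):
    """Encode (src, tgt) pairs as a flat source list plus (tgt, count) runs.

    Illustrative only: assumes edges are grouped by target vertex.
    """
    ordered = sorted(local_edges, key=lambda e: (e[1], e[0]))
    sources = [src for src, _ in ordered]
    targets = [(tgt, len(list(run)))
               for tgt, run in groupby(ordered, key=lambda e: e[1])]
    return sources, targets


def decompress_edges(sources, targets):
    """Rebuild the (src, tgt) pairs by walking both arrays sequentially."""
    edges = []
    pos = 0
    for tgt, count in targets:
        for src in sources[pos:pos + count]:
            edges.append((src, tgt))
        pos += count
    return edges
```

Each target id is stored once per run instead of once per edge, and decoding touches both arrays strictly front to back, which matches the "sequentially accessed" remark on the slide.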
37 From global to local: Summary. Better locality; efficient encoding of local graphs. Effect: up to 68% reduction in data size:
Graph           #vertices   #edges       Raw data      Mosaic size (red.)
rmat            6.8 M       0. B         .0 GB         . GB (.0%)
twitter         .6 M        . B          0.9 GB        7.7 GB ( 9.%)
rmat7           . M         . B          6.0 GB        . GB ( 0.6%)
uk              M           .7 B         7.9 GB        8.7 GB ( 68.8%)
hyperlink       ,7.6 M      6. B         80.0 GB       . GB ( 68.%)
rmat-trillion   ,9.9 M      ,000.0 B     8,000.0 GB    ,86.7 GB ( 9.8%)
38 API: Pagerank example. Pull: gather per-edge information. Reduce: combine results from multiple subgraphs. Apply: calculate the non-associative regularization. Edge-centric operation: local graph processing on a tile. Vertex-centric operation: global graph processing.
// On edge processor (co-processor)
// Edge e = (Vertex src, Vertex tgt)
def Pull(Vertex src, Vertex tgt): return src.val / src.out_degree
// On edge processor/global reducers (both)
def Reduce(Vertex v1, Vertex v2): return v1.val + v2.val
// On global reducers (host)
def Apply(Vertex v): v.val = (1 - α) + α * v.val
Formula: $\mathrm{Pagerank}_v = \alpha \left( \sum_{u \in \mathrm{Neighborhood}(v)} \frac{\mathrm{Pagerank}_u}{\mathrm{degree}_u} \right) + (1 - \alpha)$
39 API: Pagerank example. 1) Start with the first tile. [Figure: the host holds the current state and the next state; the tile and its meta data (I) are on the Xeon Phi.]
40 API: Pagerank example. 2) Execute Pull along all edges in the first tile.
41 API: Pagerank example. 3) Reduce all updates from the first tile onto the next state.
42 API: Pagerank example. 4) Switch to the second tile.
43 API: Pagerank example. 5) Pull all updates.
44 API: Pagerank example. 6) Reduce the updates from the second tile.
45 API: Pagerank example. 7) All tiles processed: Apply the processed updates.
46 API: Pagerank example. 8) Switch current and next state, and clear the next state for the second iteration.
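The walkthrough above can be mirrored by a small single-threaded Python sketch. It only illustrates the Pull / Reduce / Apply flow over tiles with a current and a next state; it does not model the Xeon Phi, NVMe or PCIe transfers. The tile layout (`src_global`, `tgt_global`, `edges`), the function names and the per-tile partial sums are assumptions for this example.

```python
def pull(src_val, src_out_degree):
    # Per-edge contribution (runs on the edge processor in the slides).
    return src_val / src_out_degree


def reduce_vals(a, b):
    # Associative combination of contributions.
    return a + b


def apply_update(total, alpha):
    # Non-associative regularization (runs on the host in the slides).
    return (1 - alpha) + alpha * total


def run_iteration(tiles, current, out_degree, alpha):
    """One Pagerank iteration over all tiles; returns the new current state."""
    next_state = {v: 0.0 for v in current}           # cleared next state
    for tile in tiles:                                # steps 1 and 4: pick a tile
        partial = {}                                  # per-tile sums, keyed by local target id
        for local_src, local_tgt in tile["edges"]:    # steps 2 and 5: Pull along all edges
            g_src = tile["src_global"][local_src]
            contribution = pull(current[g_src], out_degree[g_src])
            partial[local_tgt] = reduce_vals(partial.get(local_tgt, 0.0), contribution)
        for local_tgt, value in partial.items():      # steps 3 and 6: Reduce onto next state
            g_tgt = tile["tgt_global"][local_tgt]
            next_state[g_tgt] = reduce_vals(next_state[g_tgt], value)
    # step 7: Apply; step 8: the caller uses the result as the new current state
    return {v: apply_update(total, alpha) for v, total in next_state.items()}


# Three-vertex cycle 0 -> 1 -> 2 -> 0, split into two tiles.
tiles = [
    {"src_global": [0, 1], "tgt_global": [1, 2], "edges": [(0, 0), (1, 1)]},
    {"src_global": [2], "tgt_global": [0], "edges": [(0, 0)]},
]
state = {0: 1.0, 1: 1.0, 2: 1.0}
state = run_iteration(tiles, state, {0: 1, 1: 1, 2: 1}, alpha=0.8)
```

Keeping Reduce associative is what allows partial sums to be formed per tile and merged into the global state afterwards, while the non-associative Apply runs once per vertex after all tiles have been processed.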
47 Evaluation. Questions: preprocessing cost; performance (in comparison); impact of design decisions; scalability.
48 Evaluation - Setup. Single server: sockets, cores each; 768GB RAM; Xeon Phi (KNC, 6 cores); 6 NVMe (.TB each). 7 algorithms. 6 datasets ( real world, synthetic).
49 Preprocessing. Mosaic needs an explicit preprocessing step: - min for small datasets, minutes for webgraph, hours for trillion edges. But the cost can be amortized during execution. GridGraph: Mosaic is faster after twitter: 0 iterations, uk007: 8 iterations. X-Stream: Mosaic is faster after twitter: 8 iterations, uk007: iterations.
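The amortization argument is simple arithmetic: a one-time preprocessing cost pays off once the time saved per iteration has added up to more than that cost. The sketch below states this as a break-even calculation. The function name and all numbers in the usage line are hypothetical and are not measurements from the talk, and the sketch ignores any preprocessing the other system may need itself.

```python
import math


def break_even_iterations(preprocess_s, mosaic_iter_s, other_iter_s):
    """Smallest iteration count n with preprocess + n * mosaic < n * other.

    Returns None if the per-iteration time is not lower, since the
    preprocessing cost can then never be amortized.
    """
    saved_per_iteration = other_iter_s - mosaic_iter_s
    if saved_per_iteration <= 0:
        return None
    return math.floor(preprocess_s / saved_per_iteration) + 1


# Hypothetical numbers: 100 s preprocessing, 2 s vs. 12 s per iteration.
n = break_even_iterations(100, 2, 12)
```

With these made-up inputs the total times are equal after 10 iterations, so the preprocessed variant is strictly faster from the 11th iteration on.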
51 Performance comparison. Comparison to other single-machine engines with Pagerank. [Figure: runtime in seconds (log scale) of Mosaic, GridGraph, X-Stream and GraphChi on rmat, twitter, rmat7 and uk007-0.] Mosaic outperforms the other systems by .7 to 8.6
52 Hilbert-ordered tiles: Cache locality. Cache misses and execution times for three different strategies. [Figure: cache misses (%) and runtime (s) of Hilbert, row-first and column-first order for Pagerank, BFS and WCC.] Hilbert-ordered tiles have up to % better cache locality, up to % reduction in runtime.
53 Evaluation - Scalability. Dimensions: add Xeon Phis/NVMes; add threads on each Xeon Phi. [Figure: iterations per minute versus the number of Phi+NVMe pairs and versus the number of Xeon Phi threads.] Mosaic scales well when adding threads/Xeon Phis.
54 Conclusion. Mosaic is a graph processing engine for trillion-edge graphs on a single machine. Hilbert-ordered tiles enable localized processing on coprocessors, optimize cache locality and enable compression. Code is open-source.
55 Thank you!
56 Evaluation - Selective scheduling. Skip subgraphs without active vertices. Especially useful for traversal algorithms: BFS, Connected Components, ... [Figure: number of active tiles per iteration, and time per iteration with and without selective scheduling.] Selective scheduling gives a speedup for BFS on the twitter graph.
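Selective scheduling can be illustrated with a level-synchronous BFS that skips every tile whose source set contains no active vertex. This is a minimal sketch with assumed names and an assumed tile layout (global ids, a precomputed `sources` set per tile); how Mosaic actually tracks active tiles is not shown on the slide.

```python
def bfs_levels(tiles, root):
    """BFS over tiled edges, skipping tiles without active source vertices.

    Each tile is a dict with "sources" (set of global source ids) and
    "edges" (list of (src, tgt) global ids). Returns the BFS level of every
    reached vertex and the number of tiles processed in each iteration.
    """
    level = {root: 0}
    active = {root}
    depth = 0
    processed_per_iteration = []
    while active:
        depth += 1
        next_active = set()
        processed = 0
        for tile in tiles:
            if active.isdisjoint(tile["sources"]):
                continue                      # no active vertex: skip the whole tile
            processed += 1
            for src, tgt in tile["edges"]:
                if src in active and tgt not in level:
                    level[tgt] = depth
                    next_active.add(tgt)
        processed_per_iteration.append(processed)
        active = next_active
    return level, processed_per_iteration


tiles = [
    {"sources": {0, 1}, "edges": [(0, 1), (1, 2)]},
    {"sources": {2, 3}, "edges": [(2, 3), (3, 0)]},
]
levels, processed = bfs_levels(tiles, root=0)
```

In this toy graph only one of the two tiles is processed in every iteration, because the frontier of a traversal touches a small part of the graph at a time.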
57 Evaluation - Dynamic load balancing. Effect of choosing the wrong load balancing scheme; Mosaic has a light-weight balancing scheme. [Figure: runtime (seconds) of Pagerank, BFS, WCC, SpMV and TC under different load balancing settings.] Up to .8 improvement in running time by choosing the correct load balancing scheme.
58 Mosaic architecture. Host and Xeon Phi component; streaming-based design. [Figure: detailed architecture diagram. Labels: global vertex state (<current state>, <next state>) on the host processors (Xeon); global reducers (GR, memory locality) per socket; NVMe with tile prefetching of active tiles (concurrent I/O); per Xeon Phi: LF, LR and edge processors (EP) working on tiles (T) and meta data (I) in local memory (cache locality); meta transfer and tile transfer over PCIe; messaging via ring buffer.]
More informationPathGraph: A Path Centric Graph Processing System
1 PathGraph: A Path Centric Graph Processing System Pingpeng Yuan, Changfeng Xie, Ling Liu, Fellow, IEEE, and Hai Jin Senior Member, IEEE Abstract Large scale iterative graph computation presents an interesting
More informationVENUS: Vertex-Centric Streamlined Graph Computation on a Single PC
VENUS: Vertex-Centric Streamlined Graph Computation on a Single PC Jiefeng Cheng 1, Qin Liu 2, Zhenguo Li 1, Wei Fan 1, John C.S. Lui 2, Cheng He 1 1 Huawei Noah s Ark Lab, Hong Kong {cheng.jiefeng,li.zhenguo,david.fanwei,hecheng}@huawei.com
More informationBig Data Analytics Using Hardware-Accelerated Flash Storage
Big Data Analytics Using Hardware-Accelerated Flash Storage Sang-Woo Jun University of California, Irvine (Work done while at MIT) Flash Memory Summit, 2018 A Big Data Application: Personalized Genome
More informationGraphs. Edges may be directed (from u to v) or undirected. Undirected edge eqvt to pair of directed edges
(p 186) Graphs G = (V,E) Graphs set V of vertices, each with a unique name Note: book calls vertices as nodes set E of edges between vertices, each encoded as tuple of 2 vertices as in (u,v) Edges may
More informationParallel Combinatorial BLAS and Applications in Graph Computations
Parallel Combinatorial BLAS and Applications in Graph Computations Aydın Buluç John R. Gilbert University of California, Santa Barbara SIAM ANNUAL MEETING 2009 July 8, 2009 1 Primitives for Graph Computations
More informationGraphs / Networks CSE 6242/ CX Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech
CSE 6242/ CX 4242 Graphs / Networks Centrality measures, algorithms, interactive applications Duen Horng (Polo) Chau Georgia Tech Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John
More informationBig Data Management and NoSQL Databases
NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic
More informationVoldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Voldemort Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/29 Outline 1 2 3 Smruti R. Sarangi Leader Election 2/29 Data
More informationLocality-Aware Software Throttling for Sparse Matrix Operation on GPUs
Locality-Aware Software Throttling for Sparse Matrix Operation on GPUs Yanhao Chen 1, Ari B. Hayes 1, Chi Zhang 2, Timothy Salmon 1, Eddy Z. Zhang 1 1. Rutgers University 2. University of Pittsburgh Sparse
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 (Fall 2018) Part 4: Analyzing Graphs (1/2) October 4, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These slides are
More informationBasic Communication Operations (Chapter 4)
Basic Communication Operations (Chapter 4) Vivek Sarkar Department of Computer Science Rice University vsarkar@cs.rice.edu COMP 422 Lecture 17 13 March 2008 Review of Midterm Exam Outline MPI Example Program:
More informationOn Fast Parallel Detection of Strongly Connected Components (SCC) in Small-World Graphs
On Fast Parallel Detection of Strongly Connected Components (SCC) in Small-World Graphs Sungpack Hong 2, Nicole C. Rodia 1, and Kunle Olukotun 1 1 Pervasive Parallelism Laboratory, Stanford University
More informationOptimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink
Optimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink Rajesh Bordawekar IBM T. J. Watson Research Center bordaw@us.ibm.com Pidad D Souza IBM Systems pidsouza@in.ibm.com 1 Outline
More informationLecture 13: March 25
CISC 879 Software Support for Multicore Architectures Spring 2007 Lecture 13: March 25 Lecturer: John Cavazos Scribe: Ying Yu 13.1. Bryan Youse-Optimization of Sparse Matrix-Vector Multiplication on Emerging
More informationIndexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25
Indexing Jan Chomicki University at Buffalo Jan Chomicki () Indexing 1 / 25 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow (nanosec) (10 nanosec) (millisec) (sec) Very small Small
More informationDatenbanksysteme II: Modern Hardware. Stefan Sprenger November 23, 2016
Datenbanksysteme II: Modern Hardware Stefan Sprenger November 23, 2016 Content of this Lecture Introduction to Modern Hardware CPUs, Cache Hierarchy Branch Prediction SIMD NUMA Cache-Sensitive Skip List
More informationHeterogeneous Memory Subsystem for Natural Graph Analytics
Heterogeneous Memory Subsystem for Natural Graph Analytics Abraham Addisie, Hiwot Kassa, Opeoluwa Matthews, and Valeria Bertacco Computer Science and Engineering, University of Michigan Email: {abrahad,
More informationEvaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices
Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices Jonas Hahnfeld 1, Christian Terboven 1, James Price 2, Hans Joachim Pflug 1, Matthias S. Müller
More informationLFGraph: Simple and Fast Distributed Graph Analytics
LFGraph: Simple and Fast Distributed Graph Analytics Imranul Hoque VMware, Inc. ihoque@vmware.com Indranil Gupta University of Illinois, Urbana-Champaign indy@illinois.edu Abstract Distributed graph analytics
More informationLarge Scale Complex Network Analysis using the Hybrid Combination of a MapReduce Cluster and a Highly Multithreaded System
Large Scale Complex Network Analysis using the Hybrid Combination of a MapReduce Cluster and a Highly Multithreaded System Seunghwa Kang David A. Bader 1 A Challenge Problem Extracting a subgraph from
More informationGTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs
GTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs Min-Soo Kim, Kyuhyeon An, Himchan Park, Hyunseok Seo, and Jinwook Kim Department of Information and Communication Engineering
More informationDomain-specific programming on graphs
Lecture 25: Domain-specific programming on graphs Parallel Computer Architecture and Programming CMU 15-418/15-618, Fall 2016 1 Last time: Increasing acceptance of domain-specific programming systems Challenge
More informationContents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11
Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed
More informationCuSha: Vertex-Centric Graph Processing on GPUs
CuSha: Vertex-Centric Graph Processing on GPUs Farzad Khorasani, Keval Vora, Rajiv Gupta, Laxmi N. Bhuyan HPDC Vancouver, Canada June, Motivation Graph processing Real world graphs are large & sparse Power
More informationApache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context
1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes
More informationswsptrsv: a Fast Sparse Triangular Solve with Sparse Level Tile Layout on Sunway Architecture Xinliang Wang, Weifeng Liu, Wei Xue, Li Wu
swsptrsv: a Fast Sparse Triangular Solve with Sparse Level Tile Layout on Sunway Architecture 1,3 2 1,3 1,3 Xinliang Wang, Weifeng Liu, Wei Xue, Li Wu 1 2 3 Outline 1. Background 2. Sunway architecture
More informationarxiv: v1 [cs.db] 21 Oct 2017
: High-performance external graph analytics Sang-Woo Jun, Andy Wright, Sizhuo Zhang, Shuotao Xu and Arvind Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology
More informationWhen Graph Meets Big Data: Opportunities and Challenges
High Performance Graph Data Management and Processing (HPGDM 2016) When Graph Meets Big Data: Opportunities and Challenges Yinglong Xia Huawei Research America 11/13/2016 The International Conference for
More informationOptimization of Lattice QCD with CG and multi-shift CG on Intel Xeon Phi Coprocessor
Optimization of Lattice QCD with CG and multi-shift CG on Intel Xeon Phi Coprocessor Intel K. K. E-mail: hirokazu.kobayashi@intel.com Yoshifumi Nakamura RIKEN AICS E-mail: nakamura@riken.jp Shinji Takeda
More informationConflict-Free Vectorization of Associative Irregular Applications with Recent SIMD Architectural Advances
Conflict-Free Vectorization of Associative Irregular Applications with Recent SIMD Architectural Advances Peng Jiang Gagan Agrawal Department of Computer Science and Engineering The Ohio State University,
More informationLA3: A Scalable Link- and Locality-Aware Linear Algebra-Based Graph Analytics System
LA3: A Scalable Link- and Locality-Aware Linear Algebra-Based Graph Analytics System Yousuf Ahmad, Omar Khattab, Arsal Malik, Ahmad Musleh, Mohammad Hammoud Carnegie Mellon University in Qatar {myahmad,
More informationLeveraging Flash in HPC Systems
Leveraging Flash in HPC Systems IEEE MSST June 3, 2015 This work was performed under the auspices of the U.S. Department of Energy by under Contract DE-AC52-07NA27344. Lawrence Livermore National Security,
More informationTree-Based Density Clustering using Graphics Processors
Tree-Based Density Clustering using Graphics Processors A First Marriage of MRNet and GPUs Evan Samanas and Ben Welton Paradyn Project Paradyn / Dyninst Week College Park, Maryland March 26-28, 2012 The
More informationCGraph: A Correlations-aware Approach for Efficient Concurrent Iterative Graph Processing
: A Correlations-aware Approach for Efficient Concurrent Iterative Graph Processing Yu Zhang, Xiaofei Liao, Hai Jin, and Lin Gu, Huazhong University of Science and Technology; Ligang He, University of
More informationChaos: Scale-out Graph Processing from Secondary Storage
Chaos: Scale-out Graph Processing from Secondary Storage Amitabha Roy Laurent Bindschaedler Jasmina Malicevic Willy Zwaenepoel Intel EPFL, Switzerland firstname.lastname@{intel.com, epfl.ch } Abstract
More informationGraphQ: Graph Query Processing with Abstraction Refinement -- Scalable and Programmable Analytics over Very Large Graphs on a Single PC
GraphQ: Graph Query Processing with Abstraction Refinement -- Scalable and Programmable Analytics over Very Large Graphs on a Single PC Kai Wang, Guoqing Xu, University of California, Irvine Zhendong Su,
More informationOne Trillion Edges. Graph processing at Facebook scale
One Trillion Edges Graph processing at Facebook scale Introduction Platform improvements Compute model extensions Experimental results Operational experience How Facebook improved Apache Giraph Facebook's
More informationOut-Of-Core Sort-First Parallel Rendering for Cluster-Based Tiled Displays
Out-Of-Core Sort-First Parallel Rendering for Cluster-Based Tiled Displays Wagner T. Corrêa James T. Klosowski Cláudio T. Silva Princeton/AT&T IBM OHSU/AT&T EG PGV, Germany September 10, 2002 Goals Render
More information