Exploring the Hidden Dimension in Graph Processing
|
|
- Cordelia Lorena Thornton
- 5 years ago
- Views:
Transcription
1 Exploring the Hidden Dimension in Graph Processing Mingxing Zhang, Yongwei Wu, Kang Chen, *Xuehai Qian, Xue Li, and Weimin Zheng Tsinghua University *University of Shouthern California
2 Graph is Ubiquitous Google: > trillion indexed pages Facebook: > 800 million active users Web Graph Social Graph 3 billion RDF triples in 0 Information Network De Bruijn: 4 k nodes (k = 0,, 40) Biological Network Acknowledgement: Arijit Khan, Sameh Elnikety [VLDB '4]
3 Graph Computing is Even More Ubiquitous Traditional Graph Analysis Many graph applications aim at analyzing the property of graphs Examples Shortest Path Triangle Counting PageRank Connecting Component
4 Graph Computing is Even More Ubiquitous Traditional Graph Analysis Many graph applications aim at analyzing the property of graphs MLDM as Graph Computing Many MLDM applications can be modeled and computed as graph problems Examples Shortest Path Triangle Counting PageRank Connecting Component Examples Collaborative Filtering SpMM Neural Network Matrix Factorization
5 D Partitioning D Partitioning: each vertex is assigned to one worker with all its incoming/outcoming edges attached Original Graph Partition into 4 nodes with D partitioning Node 0 Node V 3 V Node Node 3 4 Sub- graphs & 6 Replicas (denoted by dashed circle) V 3 V
6 D Each Vertex is a Task Vertex Program Vertex Program Each vertex program can read and modify the neighborhood of a vertex Communication by messages (e.g. Pregel) by shared state (e.g. GraphLab) Granularity of task dispatching is vertex, which is the SAME as the granularity of data dispatching Parallelism by running multiple vertex programs simultaneously Problem SKEW!!!
7 D Partitioning D Partitioning: each edge is assigned to one worker certain heuristics can be used for the assigning Original Graph Partition into 4 nodes with D partitioning Node 0 Node V 3 V V Node Node 3 4 Sub- graphs & 6 Replicas (denoted by dashed circle) V
8 D à Each Edge is a Task PowerGraph Gather User Defined: Gather( Σ + Σ à Σ 3 Y ) à Σ Apply User Defined: Apply( Y, Σ) à Y Scatter User Defined: Scaber( Y ) à Y Σ Y Y Y Y Granularity of task dispatching is edge, which is also the SAME as the granularity of data dispatching
9 Is this the END of task partitioning? YES Vertex can be attached with multiple edges But an edge is connected to only two vertices Indivisible!
10 Is this the END of task partitioning? YES Vertex can be attached with multiple edges But an edge is connected to only two vertices Indivisible! ß NOT true for many problems NO Each vertex/edge can be further partitioned!
11 Example: Collaborative Filtering Matrix View N: # of users M: # of items R R[u,v] Graph View D P * P[u] P 0 P P P u Q 0 Q Q Q v Q T Q[v] D R[u,v] <P[u], Q[v]> R[u,v] Q M P N Collaborative Filtering estimating missing ratings based on a given incomplete set of (user, item) ratings In graph view Each vertex is attached with a feature vector with size D. A typical pattern of modelling MLDM problem as graph problem.
12 3D Partitioning 3D Partitioning: each vertex is split into sub- vertices and a D partitioner is used in each layer Original Graph Partition into 4 nodes with 3D partitioning Node 0,0 Node 0, V Sub- graphs & 3 Replicas in each layer. Node,0 Node, V
13 Motivation We found a NEW dimension, shouldn t we partition it? Observation Ø These vector properties are usually operated by element- wise operators Ø Easy to parallelize v With no communication!
14 Two Kinds of Communications Node 0,0 Node 0, Intra- layer Communication Less sub- graphs Less replicas V Reduced! Node,0 Node, V
15 Two Kinds of Communications Node 0,0 Node 0, Intra- layer Communication Less sub- graphs Less replicas V Reduced! Node,0 Node, Inter- layer Communication V Not exist before Increased!
16 Two Kinds of Communications Node 0,0 Node 0, Intra- layer Communication Less sub- graphs Less replicas V Reduced! Node,0 Node, Inter- layer Communication V Not exist before Increased!
17 New System: CUBE Existing models does not formalize the inter- layer communication CUBE Adopts 3D partitioning Implements a new programming model UPPS Use a matrix backend
18 Data Model of CUBE CUBE The data graph model is also used in CUBE, but each vertex/edge can be attached with two kinds of data. Two classes of data Ø DShare :- - an indivisible property v represented by a single variable. Ø DColle :- - a divisible collection of property vector v stored as a vector of variables v len(dcolle) == collection size SC
19 3D Partitioner CUBE 3D Partitioner (L, P): where L is the number of layers, and P is a D partitioner. Node 0,0 Node,0 S S S S DShare is replicated L S V S S = Node 0, Node, S S V S S S S S DColle is equally partitioned Partitioned with P
20 Programming Model CUBE Update :- - inter- layer communication v v UpdateVertex: read/write all the data of a vertex UpdateEdge: read/write all the data of an edge Push, Pull, Sink :- - intra- layer communication Ø Variations of GAS, for an edge edg that connects (src, dst) v Push: use src and edg to update dst. v Pull: use dst and edg to update src. v Sink: use src and dst to update edg.
21 Matrix Backend CUBE Vertex Frontend ü Easy to program ü Similar to existing works Less efficiency MAP Matrix Backend Hard to program ü High efficiency v Simple indexing v Hilbert order Mapping from vertex frontend to matrix backend can significantly speedup the system; a useful method pioneered by GraphMat.
22 Evaluation CUBE Baseline: PowerGraph and PowerLyra Ø the default partitioner oblivious is used for PowerGraph Ø every possible partitioner is tested for PowerLyra and the best is used Ø P of CUBE is the same as the partitioner used by PowerLyra Platform: 8- node system; connected with Gb Ethernet Dataset Name U V E Best D Partitioner Libimseti 35,359 68,79 7,359,346 Hybrid- cut Last.fm 359,349,067 7,559,530 Bi- cut Netflix 7, ,89 00,480,507 Bi- cut
23 Micro Benchmarks: SpMM SpMM P[u] P[w] D P * Q[v] M Q = R[u,v] R R[w,v] D N Q[v] = R[u,v]*P[u] + R[w,v]*P[w] Execution Time of SpMM Libimseti, Sc = 56 Lastfm, Sc = 56 Libimseti, Sc = # of layers proportional to Communication cost # of replicas # of layers # of subgraphs inversely proportional to negative relation
24 Micro Benchmarks: SumV & SumE Execution Time of SumV Libimseti, Sc = 56 Lastfm, Sc = 56 Libimseti, Sc = # of layers Execution Time of SumE Libimseti, Sc = 56 Lastfm, Sc = 56 Libimseti, Sc = # of layers SumV: summing up each vertex s vector SumE: summing up each edge s vector Communication cost proportional to!"#!, L is # of layers
25 Compare to PowerLyra & PowerGraph CUBE Dataset D Execution Time GD ALS PowerGraph PowerLyra CUBE PowerGraph PowerLyra CUBE Libimseti Lastfm Netflix (4) (64) (8) (64) (4) (64) (8) Failed Failed 30 (64) () (8) () Failed 39 8 (8) ü Up to 4.7x speedup (V.S. PowerLyra) ü Up to 7.3x speedup (V.S. PowerGraph)
26 Piecewise Breakdown CUBE Network Reduction on GD Network Reduction on ALS ü About half of the speedup is from 3D partitioning (up to x) for Lastfm and Libimseti. Not useful for Netflix. v Can achieve.5x speedup on Netflix if D is set to 048. ü Almost all the speedup is from network reduction. v Computation of ALS is dominated by DSYSV, an O(N 3 ) computation kernel.
27 Memory Consumption CUBE Total Memory Consumption # of layers Memory Consumption better than L=, i.e., D only best for time is not always best for memory consumption Total memory consumption for running ALS on 64 workers with D=3, which means that SC=D +D=056 Partitioning Time similar curve as memory consumption
28 Conclusion We found that, for certain graph problems, vertices and edges are not indivisible. We propose CUBE, which Ø adopts 3D partitioning; Ø uses matrix backend; Ø achieves up to 4.7x speedup.
29 Thank You!!
Exploring the Hidden Dimension in Graph Processing
Exploring the Hidden imension in Graph Processing Mingxing Zhang, Yongwei Wu, and Kang hen, Tsinghua University; Xuehai Qian, University of Southern alifornia; Xue Li and Weimin Zheng, Tsinghua University
More informationCommunication for PIM-based Graph Processing with Efficient Data Partition. Mingxing Zhang, Youwei Zhuo (equal contribution),
GraphP: Reducing Communication for PIM-based Graph Processing with Efficient Data Partition Mingxing Zhang, Youwei Zhuo (equal contribution), Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis,
More informationOn Smart Query Routing: For Distributed Graph Querying with Decoupled Storage
On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage Arijit Khan Nanyang Technological University (NTU), Singapore Gustavo Segovia ETH Zurich, Switzerland Donald Kossmann Microsoft
More informationReGraph: A Graph Processing Framework that Alternately Shrinks and Repartitions the Graph
Reraph: A raph Processing Framework that Alternately Shrinks and Repartitions the raph ABSTRACT Xue Li Tsinghua University Kang Chen Tsinghua University Think Like a Sub-raph (TLAS) is a philosophy proposed
More informationMosaic: Processing a Trillion-Edge Graph on a Single Machine
Mosaic: Processing a Trillion-Edge Graph on a Single Machine Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, Taesoo Kim Georgia Institute of Technology Best Student Paper @ EuroSys
More informationBig Graph Processing. Fenggang Wu Nov. 6, 2016
Big Graph Processing Fenggang Wu Nov. 6, 2016 Agenda Project Publication Organization Pregel SIGMOD 10 Google PowerGraph OSDI 12 CMU GraphX OSDI 14 UC Berkeley AMPLab PowerLyra EuroSys 15 Shanghai Jiao
More informationMatrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1
Matrix-Vector Multiplication by MapReduce From Rajaraman / Ullman- Ch.2 Part 1 Google implementation of MapReduce created to execute very large matrix-vector multiplications When ranking of Web pages that
More informationSqueezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O
Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O Zhiyuan Ai, Mingxing Zhang, and Yongwei Wu, Department of Computer Science and Technology, Tsinghua
More informationGraph Data Management
Graph Data Management Analysis and Optimization of Graph Data Frameworks presented by Fynn Leitow Overview 1) Introduction a) Motivation b) Application for big data 2) Choice of algorithms 3) Choice of
More informationA Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms
A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms Gurbinder Gill, Roshan Dathathri, Loc Hoang, Keshav Pingali Department of Computer Science, University of Texas
More informationGraphP: Reducing Communication for PIM-based Graph Processing with Efficient Data Partition
GraphP: Reducing Communication for PIM-based Graph Processing with Efficient Data Partition Mingxing Zhang Youwei Zhuo Chao Wang Mingyu Gao Yongwei Wu Kang Chen Christos Kozyrakis Xuehai Qian Tsinghua
More informationIrregular Graph Algorithms on Parallel Processing Systems
Irregular Graph Algorithms on Parallel Processing Systems George M. Slota 1,2 Kamesh Madduri 1 (advisor) Sivasankaran Rajamanickam 2 (Sandia mentor) 1 Penn State University, 2 Sandia National Laboratories
More informationSocial-Network Graphs
Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities
More informationOne Trillion Edges. Graph processing at Facebook scale
One Trillion Edges Graph processing at Facebook scale Introduction Platform improvements Compute model extensions Experimental results Operational experience How Facebook improved Apache Giraph Facebook's
More informationExtracting Information from Complex Networks
Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform
More informationLarge Scale Graph Processing Pregel, GraphLab and GraphX
Large Scale Graph Processing Pregel, GraphLab and GraphX Amir H. Payberah amir@sics.se KTH Royal Institute of Technology Amir H. Payberah (KTH) Large Scale Graph Processing 2016/10/03 1 / 76 Amir H. Payberah
More informationAN EXPERIMENTAL COMPARISON OF PARTITIONING STRATEGIES IN DISTRIBUTED GRAPH PROCESSING SHIV VERMA THESIS
c 2017 Shiv Verma AN EXPERIMENTAL COMPARISON OF PARTITIONING STRATEGIES IN DISTRIBUTED GRAPH PROCESSING BY SHIV VERMA THESIS Submitted in partial fulfillment of the requirements for the degree of Master
More informationComputation and Communication Efficient Graph Processing with Distributed Immutable View
Computation and Communication Efficient Graph Processing with Distributed Immutable View Rong Chen, Xin Ding, Peng Wang, Haibo Chen, Binyu Zang, Haibing Guan Shanghai Key Laboratory of Scalable Computing
More informationFrom Think Like a Vertex to Think Like a Graph. Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, John McPherson
From Think Like a Vertex to Think Like a Graph Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, John McPherson Large Scale Graph Processing Graph data is everywhere and growing
More informationAn Experimental Comparison of Partitioning Strategies in Distributed Graph Processing
An Experimental Comparison of Partitioning Strategies in Distributed Graph Processing ABSTRACT Shiv Verma 1, Luke M. Leslie 1, Yosub Shin, Indranil Gupta 1 1 University of Illinois at Urbana-Champaign,
More informationGRAM: Scaling Graph Computation to the Trillions
GRAM: Scaling Graph Computation to the Trillions Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, Lidong Zhou Microsoft Research, Peking University, Beihang
More informationPlanar: Parallel Lightweight Architecture-Aware Adaptive Graph Repartitioning
Planar: Parallel Lightweight Architecture-Aware Adaptive Graph Repartitioning Angen Zheng, Alexandros Labrinidis, and Panos K. Chrysanthis University of Pittsburgh 1 Graph Partitioning Applications of
More informationCounting Triangles & The Curse of the Last Reducer. Siddharth Suri Sergei Vassilvitskii Yahoo! Research
Counting Triangles & The Curse of the Last Reducer Siddharth Suri Yahoo! Research Why Count Triangles? 2 Why Count Triangles? Clustering Coefficient: Given an undirected graph G =(V,E) cc(v) = fraction
More informationPregel. Ali Shah
Pregel Ali Shah s9alshah@stud.uni-saarland.de 2 Outline Introduction Model of Computation Fundamentals of Pregel Program Implementation Applications Experiments Issues with Pregel 3 Outline Costs of Computation
More informationWedge A New Frontier for Pull-based Graph Processing. Samuel Grossman and Christos Kozyrakis Platform Lab Retreat June 8, 2018
Wedge A New Frontier for Pull-based Graph Processing Samuel Grossman and Christos Kozyrakis Platform Lab Retreat June 8, 2018 Graph Processing Problems modelled as objects (vertices) and connections between
More informationGridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. Xiaowei ZHU Tsinghua University
GridGraph: Large-Scale Graph Processing on a Single Machine Using -Level Hierarchical Partitioning Xiaowei ZHU Tsinghua University Widely-Used Graph Processing Shared memory Single-node & in-memory Ligra,
More informationGraph Partitioning for Scalable Distributed Graph Computations
Graph Partitioning for Scalable Distributed Graph Computations Aydın Buluç ABuluc@lbl.gov Kamesh Madduri madduri@cse.psu.edu 10 th DIMACS Implementation Challenge, Graph Partitioning and Graph Clustering
More informationAS an alternative to distributed graph processing, singlemachine
JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST 018 1 CLIP: A Disk I/O Focused Parallel Out-of-core Graph Processing System Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Senior Member, IEEE, Xuehai
More informationBipartite-oriented Distributed Graph Partitioning for Big Learning
Bipartite-oriented Distributed Graph Partitioning for Big Learning Rong Chen, Jiaxin Shi, Binyu Zang, Haibing Guan Shanghai Key Laboratory of Scalable Computing and Systems Institute of Parallel and Distributed
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationDeveloping MapReduce Programs
Cloud Computing Developing MapReduce Programs Dell Zhang Birkbeck, University of London 2017/18 MapReduce Algorithm Design MapReduce: Recap Programmers must specify two functions: map (k, v) * Takes
More informationKartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18
Accelerating PageRank using Partition-Centric Processing Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18 Outline Introduction Partition-centric Processing Methodology Analytical Evaluation
More informationArabesque. A system for distributed graph mining. Mohammed Zaki, RPI
rabesque system for distributed graph mining Mohammed Zaki, RPI Carlos Teixeira, lexandre Fonseca, Marco Serafini, Georgos Siganos, shraf boulnaga, Qatar Computing Research Institute (QCRI) 1 Big Data
More informationPractical Near-Data Processing for In-Memory Analytics Frameworks
Practical Near-Data Processing for In-Memory Analytics Frameworks Mingyu Gao, Grant Ayers, Christos Kozyrakis Stanford University http://mast.stanford.edu PACT Oct 19, 2015 Motivating Trends End of Dennard
More informationOrder or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations
Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations George M. Slota 1 Sivasankaran Rajamanickam 2 Kamesh Madduri 3 1 Rensselaer Polytechnic Institute, 2 Sandia National
More informationNVGRAPH,FIREHOSE,PAGERANK GPU ACCELERATED ANALYTICS NOV Joe Eaton Ph.D.
NVGRAPH,FIREHOSE,PAGERANK GPU ACCELERATED ANALYTICS NOV 2016 Joe Eaton Ph.D. Agenda Accelerated Computing nvgraph New Features Coming Soon Dynamic Graphs GraphBLAS 2 ACCELERATED COMPUTING 10x Performance
More informationMassively Parallel Graph Analytics
Massively Parallel Graph Analytics Manycore graph processing, distributed graph layout, and supercomputing for graph analytics George M. Slota 1,2,3 Kamesh Madduri 2 Sivasankaran Rajamanickam 1 1 Sandia
More informationMizan: A System for Dynamic Load Balancing in Large-scale Graph Processing
/34 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of
More informationPRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory
Scalable and Energy-Efficient Architecture Lab (SEAL) PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in -based Main Memory Ping Chi *, Shuangchen Li *, Tao Zhang, Cong
More informationScaling Distributed Machine Learning with the Parameter Server
Scaling Distributed Machine Learning with the Parameter Server Mu Li, David G. Andersen, Jun Woo Park, Alexander J. Smola, Amr Ahmed, Vanja Josifovski, James Long, Eugene J. Shekita, and Bor-Yiing Su Presented
More informationWhy do we need graph processing?
Why do we need graph processing? Community detection: suggest followers? Determine what products people will like Count how many people are in different communities (polling?) Graphs are Everywhere Group
More informationLocal higher-order graph clustering
Local higher-order graph clustering Hao Yin Stanford University yinh@stanford.edu Austin R. Benson Cornell University arb@cornell.edu Jure Leskovec Stanford University jure@cs.stanford.edu David F. Gleich
More informationAn Algorithmic Approach to Communication Reduction in Parallel Graph Algorithms
An Algorithmic Approach to Communication Reduction in Parallel Graph Algorithms Harshvardhan, Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger Parasol Laboratory Dept. of Computer Science and Engineering
More informationTowards Performance and Scalability Analysis of Distributed Memory Programs on Large-Scale Clusters
Towards Performance and Scalability Analysis of Distributed Memory Programs on Large-Scale Clusters 1 University of California, Santa Barbara, 2 Hewlett Packard Labs, and 3 Hewlett Packard Enterprise 1
More informationG(B)enchmark GraphBench: Towards a Universal Graph Benchmark. Khaled Ammar M. Tamer Özsu
G(B)enchmark GraphBench: Towards a Universal Graph Benchmark Khaled Ammar M. Tamer Özsu Bioinformatics Software Engineering Social Network Gene Co-expression Protein Structure Program Flow Big Graphs o
More informationBig Data Analytics Using Hardware-Accelerated Flash Storage
Big Data Analytics Using Hardware-Accelerated Flash Storage Sang-Woo Jun University of California, Irvine (Work done while at MIT) Flash Memory Summit, 2018 A Big Data Application: Personalized Genome
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #12 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Last class Outline
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationarxiv: v1 [cs.dc] 21 Oct 2013
GRE: A Graph Runtime Engine for Large-Scale Distributed Graph-Parallel Applications Jie Yan, Guangming Tan, Ninghui Sun Institute of Computing Technology, Chinese Academy of Sciences University of Chinese
More informationLocality-Aware Software Throttling for Sparse Matrix Operation on GPUs
Locality-Aware Software Throttling for Sparse Matrix Operation on GPUs Yanhao Chen 1, Ari B. Hayes 1, Chi Zhang 2, Timothy Salmon 1, Eddy Z. Zhang 1 1. Rutgers University 2. University of Pittsburgh Sparse
More informationBoosting the Performance of FPGA-based Graph Processor using Hybrid Memory Cube: A Case for Breadth First Search
Boosting the Performance of FPGA-based Graph Processor using Hybrid Memory Cube: A Case for Breadth First Search Jialiang Zhang, Soroosh Khoram and Jing Li 1 Outline Background Big graph analytics Hybrid
More informationGraph Analytics using Vertica Relational Database
Graph Analytics using ertica Relational Database Alekh Jindal* Samuel Madden Malú Castellanos Meichun Hsu Microsoft MIT ertica ertica * work done while at MIT Motivation for graphs on DB Data anyways in
More informationPULP: Fast and Simple Complex Network Partitioning
PULP: Fast and Simple Complex Network Partitioning George Slota #,* Kamesh Madduri # Siva Rajamanickam * # The Pennsylvania State University *Sandia National Laboratories Dagstuhl Seminar 14461 November
More informationHPCGraph: Benchmarking Massive Graph Analytics on Supercomputers
HPCGraph: Benchmarking Massive Graph Analytics on Supercomputers George M. Slota 1, Siva Rajamanickam 2, Kamesh Madduri 3 1 Rensselaer Polytechnic Institute 2 Sandia National Laboratories a 3 The Pennsylvania
More informationGraph-Processing Systems. (focusing on GraphChi)
Graph-Processing Systems (focusing on GraphChi) Recall: PageRank in MapReduce (Hadoop) Input: adjacency matrix H D F S (a,[c]) (b,[a]) (c,[a,b]) (c,pr(a) / out (a)), (a,[c]) (a,pr(b) / out (b)), (b,[a])
More informationGraphChi: Large-Scale Graph Computation on Just a PC
OSDI 12 GraphChi: Large-Scale Graph Computation on Just a PC Aapo Kyrölä (CMU) Guy Blelloch (CMU) Carlos Guestrin (UW) In co- opera+on with the GraphLab team. BigData with Structure: BigGraph social graph
More informationHarp-DAAL for High Performance Big Data Computing
Harp-DAAL for High Performance Big Data Computing Large-scale data analytics is revolutionizing many business and scientific domains. Easy-touse scalable parallel techniques are necessary to process big
More informationrefers to the preparation of the next to-be-processed Such a procedure can be arbitrary, because the sequence
Version Traveler: Fast and Memory-Efficient Version Switching in Graph Processing Systems Xiaoen Ju Dan Williams Hani Jamjoom Kang G. Shin University of Michigan IBM T.J. Watson Research Center Abstract
More informationPopularity Prediction of Facebook Videos for Higher Quality Streaming
Popularity Prediction of Facebook Videos for Higher Quality Streaming Linpeng Tang Qi Huang, Amit Puntambekar Ymir Vigfusson, Wyatt Lloyd, Kai Li 1 Videos are Central to Facebook 8 billion views per day
More informationCommunity Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal
Community Structure Detection Amar Chandole Ameya Kabre Atishay Aggarwal What is a network? Group or system of interconnected people or things Ways to represent a network: Matrices Sets Sequences Time
More informationGraFBoost: Using accelerated flash storage for external graph analytics
GraFBoost: Using accelerated flash storage for external graph analytics Sang-Woo Jun, Andy Wright, Sizhuo Zhang, Shuotao Xu and Arvind MIT CSAIL Funded by: 1 Large Graphs are Found Everywhere in Nature
More informationGraph Theoretic Models for Ad hoc Wireless Networks
Graph Theoretic Models for Ad hoc Wireless Networks Prof. Srikrishnan Divakaran DA-IICT 10/4/2009 DA-IICT 1 Talk Outline Overview of Ad hoc Networks Design Issues in Modeling Ad hoc Networks Graph Theoretic
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationX-Stream: A Case Study in Building a Graph Processing System. Amitabha Roy (LABOS)
X-Stream: A Case Study in Building a Graph Processing System Amitabha Roy (LABOS) 1 X-Stream Graph processing system Single Machine Works on graphs stored Entirely in RAM Entirely in SSD Entirely on Magnetic
More informationArchitectural Implications on the Performance and Cost of Graph Analytics Systems
Architectural Implications on the Performance and Cost of Graph Analytics Systems Qizhen Zhang,, Hongzhi Chen, Da Yan,, James Cheng, Boon Thau Loo, and Purushotham Bangalore University of Pennsylvania
More informationMining Social Network Graphs
Mining Social Network Graphs Analysis of Large Graphs: Community Detection Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com Note to other teachers and users of these slides: We would be
More informationExtending In-Memory Relational Database Engines with Native Graph Support
Extending In-Memory Relational Database Engines with Native Graph Support EDBT 18 Mohamed S. Hassan 1 Tatiana Kuznetsova 1 Hyun Chai Jeong 1 Walid G. Aref 1 Mohammad Sadoghi 2 1 Purdue University West
More informationGemini: A Computation-Centric Distributed Graph Processing System
Gemini: A Computation-Centric Distributed Graph Processing System Xiaowei Zhu 1, Wenguang Chen 1,2,, Weimin Zheng 1, and Xiaosong Ma 3 1 Department of Computer Science and Technology (TNLIST), Tsinghua
More informationDifferentially Private H-Tree
GeoPrivacy: 2 nd Workshop on Privacy in Geographic Information Collection and Analysis Differentially Private H-Tree Hien To, Liyue Fan, Cyrus Shahabi Integrated Media System Center University of Southern
More informationParallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach Tiziano De Matteis, Gabriele Mencagli University of Pisa Italy INTRODUCTION The recent years have
More informationA Framework for Processing Large Graphs in Shared Memory
A Framework for Processing Large Graphs in Shared Memory Julian Shun Based on joint work with Guy Blelloch and Laxman Dhulipala (Work done at Carnegie Mellon University) 2 What are graphs? Vertex Edge
More informationCGraph: A Correlations-aware Approach for Efficient Concurrent Iterative Graph Processing
: A Correlations-aware Approach for Efficient Concurrent Iterative Graph Processing Yu Zhang, Xiaofei Liao, Hai Jin, and Lin Gu, Huazhong University of Science and Technology; Ligang He, University of
More informationHigh-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock
High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock Yangzihao Wang University of California, Davis yzhwang@ucdavis.edu March 24, 2014 Yangzihao Wang (yzhwang@ucdavis.edu)
More informationTanuj Kr Aasawat, Tahsin Reza, Matei Ripeanu Networked Systems Laboratory (NetSysLab) University of British Columbia
How well do CPU, GPU and Hybrid Graph Processing Frameworks Perform? Tanuj Kr Aasawat, Tahsin Reza, Matei Ripeanu Networked Systems Laboratory (NetSysLab) University of British Columbia Networked Systems
More informationTuX 2 : Distributed Graph Computation for Machine Learning
TuX 2 : Distributed Graph Computation for Machine Learning Wencong Xiao, Beihang University and Microsoft Research; Jilong Xue, Peking University and Microsoft Research; Youshan Miao, Microsoft Research;
More informationSociaLite: A Datalog-based Language for
SociaLite: A Datalog-based Language for Large-Scale Graph Analysis Jiwon Seo M OBIS OCIAL RESEARCH GROUP Overview Overview! SociaLite: language for large-scale graph analysis! Extensions to Datalog! Compiler
More informationmodern database systems lecture 10 : large-scale graph processing
modern database systems lecture 1 : large-scale graph processing Aristides Gionis spring 18 timeline today : homework is due march 6 : homework out april 5, 9-1 : final exam april : homework due graphs
More informationEfficient Computation of Radial Distribution Function on GPUs
Efficient Computation of Radial Distribution Function on GPUs Yi-Cheng Tu * and Anand Kumar Department of Computer Science and Engineering University of South Florida, Tampa, Florida 2 Overview Introduction
More informationA Distributed Multi-GPU System for Fast Graph Processing
A Distributed Multi-GPU System for Fast Graph Processing Zhihao Jia Stanford University zhihao@cs.stanford.edu Pat M c Cormick LANL pat@lanl.gov Yongkee Kwon UT Austin yongkee.kwon@utexas.edu Mattan Erez
More informationLesson 3. Prof. Enza Messina
Lesson 3 Prof. Enza Messina Clustering techniques are generally classified into these classes: PARTITIONING ALGORITHMS Directly divides data points into some prespecified number of clusters without a hierarchical
More informationDomain-specific programming on graphs
Lecture 25: Domain-specific programming on graphs Parallel Computer Architecture and Programming CMU 15-418/15-618, Fall 2016 1 Last time: Increasing acceptance of domain-specific programming systems Challenge
More informationarxiv: v1 [cs.dc] 21 Aug 2017
GraphR: Accelerating Graph Processing Using ReRAM arxiv:78.648v [cs.dc] Aug 7 Linghao Song Duke University linghao.song@duke.edu ABSTRACT Youwei Zhuo University of Southern California youweizh@usc.edu
More informationWhen Graph Meets Big Data: Opportunities and Challenges
High Performance Graph Data Management and Processing (HPGDM 2016) When Graph Meets Big Data: Opportunities and Challenges Yinglong Xia Huawei Research America 11/13/2016 The International Conference for
More informationLecture 22 : Distributed Systems for ML
10-708: Probabilistic Graphical Models, Spring 2017 Lecture 22 : Distributed Systems for ML Lecturer: Qirong Ho Scribes: Zihang Dai, Fan Yang 1 Introduction Big data has been very popular in recent years.
More informationGraphZ: Improving the Performance of Large-Scale Graph Analytics on Small-Scale Machines
GraphZ: Improving the Performance of Large-Scale Graph Analytics on Small-Scale Machines Zhixuan Zhou Henry Hoffmann University of Chicago, Dept. of Computer Science Abstract Recent programming frameworks
More informationArchitecture-Aware Graph Repartitioning for Data-Intensive Scientific Computing
Architecture-Aware Graph Repartitioning for Data-Intensive Scientific Computing Angen Zheng, Alexandros Labrinidis, Panos K. Chrysanthis Advanced Data Management Technologies Laboratory Department of Computer
More informationFast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration
Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration JIAXIN SHI, YOUYANG YAO, RONG CHEN, HAIBO CHEN, FEIFEI LI PRESENTED BY ANDREW XIA APRIL 25, 2018 Wukong Overview of Wukong
More informationApplications. Foreground / background segmentation Finding skin-colored regions. Finding the moving objects. Intelligent scissors
Segmentation I Goal Separate image into coherent regions Berkeley segmentation database: http://www.eecs.berkeley.edu/research/projects/cs/vision/grouping/segbench/ Slide by L. Lazebnik Applications Intelligent
More informationCSE 701: LARGE-SCALE GRAPH MINING. A. Erdem Sariyuce
CSE 701: LARGE-SCALE GRAPH MINING A. Erdem Sariyuce WHO AM I? My name is Erdem Office: 323 Davis Hall Office hours: Wednesday 2-4 pm Research on graph (network) mining & management Practical algorithms
More informationMemory-Optimized Distributed Graph Processing. through Novel Compression Techniques
Memory-Optimized Distributed Graph Processing through Novel Compression Techniques Katia Papakonstantinopoulou Joint work with Panagiotis Liakos and Alex Delis University of Athens Athens Colloquium in
More informationNon Overlapping Communities
Non Overlapping Communities Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides
More informationOptimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa
Optimizing Network Performance in Distributed Machine Learning Luo Mai Chuntao Hong Paolo Costa Machine Learning Successful in many fields Online advertisement Spam filtering Fraud detection Image recognition
More informationDistributed computing: index building and use
Distributed computing: index building and use Distributed computing Goals Distributing computation across several machines to Do one computation faster - latency Do more computations in given time - throughput
More informationScalable Influence Maximization in Social Networks under the Linear Threshold Model
Scalable Influence Maximization in Social Networks under the Linear Threshold Model Wei Chen Microsoft Research Asia Yifei Yuan Li Zhang In collaboration with University of Pennsylvania Microsoft Research
More informationGraph Analytics in the Big Data Era
Graph Analytics in the Big Data Era Yongming Luo, dr. George H.L. Fletcher Web Engineering Group What is really hot? 19-11-2013 PAGE 1 An old/new data model graph data Model entities and relations between
More informationBounds on the signed domination number of a graph.
Bounds on the signed domination number of a graph. Ruth Haas and Thomas B. Wexler September 7, 00 Abstract Let G = (V, E) be a simple graph on vertex set V and define a function f : V {, }. The function
More informationLarge Scale Complex Network Analysis using the Hybrid Combination of a MapReduce Cluster and a Highly Multithreaded System
Large Scale Complex Network Analysis using the Hybrid Combination of a MapReduce Cluster and a Highly Multithreaded System Seunghwa Kang David A. Bader 1 A Challenge Problem Extracting a subgraph from
More informationDiffusion and Clustering on Large Graphs
Diffusion and Clustering on Large Graphs Alexander Tsiatas Thesis Proposal / Advancement Exam 8 December 2011 Introduction Graphs are omnipresent in the real world both natural and man-made Examples of
More informationClash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics
Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics Presented by: Dishant Mittal Authors: Juwei Shi, Yunjie Qiu, Umar Firooq Minhas, Lemei Jiao, Chen Wang, Berthold Reinwald and Fatma
More informationFPGP: Graph Processing Framework on FPGA
FPGP: Graph Processing Framework on FPGA Guohao DAI, Yuze CHI, Yu WANG, Huazhong YANG E.E. Dept., TNLIST, Tsinghua University dgh14@mails.tsinghua.edu.cn 1 Big graph is widely used Big graph is widely
More information