Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Access
1 Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Access
Yongzhe Zhang, National Institute of Informatics
3rd Spring Festival Workshop, March 21, 2017
2 Outline
- Background of vertex-centric graph processing
- Example and problems in the programming model
- My research work
  - A novel high-level model for describing vertex-centric graph computation based on remote access
  - Palgol: a concise and efficient domain-specific language for implementing vertex-centric graph algorithms
3 The Pregel Model: Overview
- Pregel: a distributed framework for processing big graphs, proposed by Google [SIGMOD '10]
- Pregel = BSP + vertex-centric approach
[Figure: the Bulk-Synchronous Parallel (BSP) model. Within one BSP superstep, the processors perform local computation, data communication, and barrier synchronization.]
[Figure: vertex-centric computation. A vertex v receives messages, runs compute(), and sends messages.]
4 Pregel Programming Model
Each processor runs the following code:
    do
      foreach active vertex v on the processor
        (halt, msgs) = v.compute(getmsg(v))
        // halt: whether v inactivates itself
        // msgs: messages sent to other vertices
        handle(halt, msgs)
      end
      exchange data with other processors
      activate vertices that receive messages
    until all vertices are inactive
The only user-defined function:
    Vertex::compute(Message[]) -> (Bool, Message[])
[Figure: messages arrive at v, v runs compute(), and v sends messages.]
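The loop above can be simulated on a single machine. The following is a minimal Python sketch, not Pregel's actual API: the names `run_pregel` and `max_value_compute`, the per-vertex dictionary layout, and the choice of maximum-value propagation as the example are all assumptions made for illustration.

```python
from collections import defaultdict


def run_pregel(vertices, compute, max_supersteps=100):
    """Simulate the superstep loop: compute, exchange messages, reactivate.

    vertices: dict of vertex id -> mutable state (hypothetical layout)
    compute:  function (vid, state, msgs) -> (halt, [(target, msg), ...])
    Returns the number of supersteps executed.
    """
    active = set(vertices)
    inbox = {}
    supersteps = 0
    while active and supersteps < max_supersteps:
        outbox = defaultdict(list)
        still_active = set()
        for vid in sorted(active):
            halt, outgoing = compute(vid, vertices[vid], inbox.get(vid, []))
            for target, msg in outgoing:
                outbox[target].append(msg)
            if not halt:
                still_active.add(vid)
        # Barrier: deliver messages and activate every vertex that received one.
        inbox = dict(outbox)
        active = still_active | set(inbox)
        supersteps += 1
    return supersteps


def max_value_compute(neighbors):
    """Build a compute function that spreads the largest value in the graph."""
    def compute(vid, state, msgs):
        new_value = max([state["value"]] + msgs)
        changed = new_value != state["value"] or not state["started"]
        state["value"] = new_value
        state["started"] = True
        outgoing = [(n, new_value) for n in neighbors[vid]] if changed else []
        return True, outgoing  # always vote to halt; messages reactivate
    return compute


neighbors = {1: [2], 2: [1, 3], 3: [2]}
vertices = {v: {"value": v * 10, "started": False} for v in neighbors}
steps = run_pregel(vertices, max_value_compute(neighbors))
```

Every vertex votes to halt in each superstep and is woken up only by incoming messages, so the loop ends once no value changes any more.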
5 Pregel Programming Example (1)
A typical Pregel program (the compute function):
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2:
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          z = msgs_to_parent(x, y, this);
          state = 3;
          return (false, encode(z));
        case 3: ..
        case 4: ..
      }
    }
It seems that some messages are emitted in state 1, and that state 3 needs to handle the messages sent in state 2.
6 Pregel Programming Example (2)
Vertices send the necessary messages in state 1:
    fn compute(message[] msgs) {
      switch (state) {
        case 1:
          x = msgs_to_children(this);
          y = msgs_to_siblings(this);
          h = halt_or_not(..);
          state = 2;
          return (h, encode(x, y));
        case 2: ..
        case 3: ..
        case 4: ..
      }
    }
Callout (case 2, which receives these messages):
    case 2:
      x = get_data_from_parent(msgs);
      y = get_data_from_siblings(msgs);
      z = msgs_to_parent(x, y, this);
      state = 3;
      return (false, encode(z));
7 Pregel Programming Example (3)
State 3 indeed handles the messages sent in state 2:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3:
          ..
          this.res += get_sum(msgs);
          b = make_a_decision(this, ..);
          state = (b ? 1 : 4);
          return (false, []);
        case 4: ..
      }
    }
Callout (case 2, which sends these messages):
    case 2:
      x = get_data_from_parent(msgs);
      y = get_data_from_siblings(msgs);
      z = msgs_to_parent(x, y, this);
      state = 3;
      return (false, encode(z));
8 Pregel Programming Example (4)
Pregel algorithms are usually iterative:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3:
          ..
          this.res += get_sum(msgs);
          b = make_a_decision(this, ..);
          state = (b ? 1 : 4);
          return (false, []);
        case 4: ..
      }
    }
Iteration is implemented by state transition (goto state 1).
9 Problems in Pregel Programming
- Too many low-level details and no compile-time checks
  - state transitions (handling data dependencies and iteration)
  - message passing (encoding/decoding)
  - termination control
  - ..
- Readability: the algorithm is hard to understand
- Maintainability
10 Reason?
The gap between the high-level algorithm description and Pregel's low-level interfaces.
A simple high-level description:
    do
      foreach vertex v
        x = some value of the parent
        y = some values of the siblings
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..
11 Reason? (continued)
The low-level implementation, shown beside the same high-level description. State 1 sends the requests, and state 2 starts to read the replies:
    fn compute(message[] msgs) {
      switch (state) {
        case 1:
          x = msgs_to_children(this);
          y = msgs_to_siblings(this);
          return (?, encode(x, y));
        case 2:
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          ..
        case 3: ..
        case 4: ..
      }
    }
12 Reason? (continued)
State 2 computes the update and sends it to the parent, and state 3 accumulates it:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2:
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          z = msgs_to_parent(x, y, this);
          return (false, encode(z));
        case 3:
          this.res += get_sum(msgs);
          ..
        case 4: ..
      }
    }
13 Reason? (continued)
State 3 decides whether to iterate again:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3:
          ..
          this.res += get_sum(msgs);
          b = make_a_decision(this, ..);
          state = (b ? 1 : 4);
          return (false, []);
        case 4: ..
      }
    }
14 Basic Idea
To describe a Pregel algorithm clearly, we need:
- a vertex-centric way of describing the algorithm
- remote reads and writes instead of message passing
- high-level control flow instead of state transitions
    do
      foreach vertex v
        x = some value of the parent
        y = some values of the siblings
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..
15 A Systematic Solution
- Design a domain-specific language that lets users describe Pregel algorithms concisely:
  - remote access (reading or writing attributes of other vertices through references)
  - high-level control flow (an iteration construct)
- Compile the DSL to a Pregel system:
  - analyze the data dependencies and produce the state transition machine as well as the message passing scheme
16 Challenges
- Remote reads (in our formalization) are in general complex and hard to translate to message passing efficiently or mechanically
  - all existing DSLs have severe restrictions that make some interesting algorithms hard to express
- Some of Pregel's low-level interfaces are hard to formalize properly
17 Summary of My Research Work
- Study the translation from common remote access patterns to message passing in Pregel
  - The translation is mechanical
  - The solution is efficient
- Design and implement a new DSL based on remote access
  - It is more expressive than all existing DSLs
  - Its performance is comparable to manually written code: between a 25% speedup and a 30% slowdown on popular algorithms (PageRank, etc.) over real-world graphs
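18 The High-Level Model
Vertex-centric computation based on remote access
[Figure: an algorithmic superstep turns an input graph into an output graph. In Pregel's message passing model, a vertex v receives messages, runs compute(), and sends messages. In the high-level model, a step at vertex v performs field reads and field writes, with direct field access across the entire graph.]
[Figure: sequence and iteration of steps. Labels: g, step 1, g, step 2, g, sequence, iteration.]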
19 Algorithmic Superstep
A two-phase model that prevents data conflicts
The first phase: remote read, computation, local write
[Figure: within a step, each vertex v reads fields from the input graph (remote read), runs compute, and writes its own fields (local write), which yields the intermediate graph before the output graph.]
20 Algorithmic Superstep
A two-phase model that prevents data conflicts
The second phase (optional): remote write (by sending values), then receive & reduce, which gives the final vertex state
[Figure: starting from the intermediate graph, vertices send values to other vertices (remote write); each receiver reduces the received values into its fields, which yields the output graph.]
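One way to read the two-phase model is as a function from an input graph to an output graph. The Python sketch below is an interpretation under stated assumptions, not Palgol's implementation: `algorithmic_superstep`, `local_step`, and `reduce_ops` are hypothetical names, and the example mimics the `parent.res += f(x, y, v)` pattern with a made-up `f` that simply adds two values.

```python
import copy


def algorithmic_superstep(graph, local_step, reduce_ops):
    """Apply one two-phase step and return the output graph.

    graph:      dict of vertex id -> dict of fields (left unmodified)
    local_step: function (vid, read) -> (local_writes, remote_writes), where
                read(vid, field) sees only the input graph, local_writes is a
                dict of field -> value for the vertex itself, and
                remote_writes is a list of (target, field, value)
    reduce_ops: dict of field -> binary function used to combine remote writes
    """
    def read(vid, field):
        return graph[vid][field]

    # Phase 1: remote reads against the input graph, then local writes.
    intermediate = copy.deepcopy(graph)
    pending = []
    for vid in graph:
        local_writes, remote_writes = local_step(vid, read)
        intermediate[vid].update(local_writes)
        pending.extend(remote_writes)

    # Phase 2 (optional): deliver remote writes and reduce them at the target.
    output = intermediate
    for target, field, value in pending:
        output[target][field] = reduce_ops[field](output[target][field], value)
    return output


def add_to_parent(vid, read):
    parent = read(vid, "parent")
    x = read(parent, "val")              # remote read of the parent's field
    return {}, [(parent, "res", x + read(vid, "val"))]


tree = {
    0: {"parent": 0, "val": 1, "res": 0},
    1: {"parent": 0, "val": 2, "res": 0},
    2: {"parent": 0, "val": 3, "res": 0},
}
result = algorithmic_superstep(tree, add_to_parent, {"res": lambda a, b: a + b})
```

Because all reads go to the input graph and all remote writes are combined by a reduce function, the result does not depend on the order in which vertices are processed, as long as the reduce function is commutative and associative.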
21 Example: The Pointer-Jumping Algorithm
Consider the implementation of pointer-jumping:
- Each vertex has a parent pointer
- Make all vertices point to the root
22 Palgol: Pregel Algorithmic Language
A Palgol program for pointer-jumping:
- Each vertex has a parent pointer (the D field)
- Make all vertices point to the root
    input[id, D]
    output[id, D]

    do
      for u in V
        if (D[D[u]] != D[u])
          local D[u] := D[D[u]]
      end
    until fix[D]
Field access syntax:
- in D[u], D is a field and u is a vertex id
- D[u] is a field access
- D[D[u]] is a nested field access (or chain access)
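The intended meaning of the program can be checked with a small sequential simulation. The Python sketch below is only an illustration of the semantics as described on the slide, assuming that the root points to itself and that every vertex reads the D values from the start of the iteration; `pointer_jumping` is a hypothetical name, and nothing here reflects how Palgol compiles the program.

```python
def pointer_jumping(D):
    """Repeat D[u] := D[D[u]] for all u until D stops changing.

    D: dict of vertex id -> parent id, where the root is its own parent.
    Returns the final pointers and the number of iterations that changed D.
    """
    D = dict(D)
    rounds = 0
    while True:
        new_D = dict(D)
        for u in D:
            # All reads use the old D, so updates within one step do not conflict.
            if D[D[u]] != D[u]:
                new_D[u] = D[D[u]]
        if new_D == D:  # corresponds to "until fix[D]"
            return D, rounds
        D = new_D
        rounds += 1


# A chain 4 -> 3 -> 2 -> 1 -> 0 with root 0.
chain = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}
final_D, rounds = pointer_jumping(chain)
```

On this chain the distance to the root halves in every iteration, so two changing iterations are enough for all five vertices to point to vertex 0.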
23 Key Technique: Translation of Chain Access
How can chain access be translated efficiently to Pregel's message passing? For example: D[D[D[D[u]]]]
- Convert it to a logic proposition: every u knows $D^4[u]$
- Search for a derivation in our logic system
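24 Axioms of Our Logic System
1. $\forall u.\ u \text{ knows } u$
2. $\forall u.\ u \text{ knows } D[u]$
3. $(\forall u.\ w(u) \text{ knows } e(u)) \land (\forall u.\ w(u) \text{ knows } v(u)) \implies \forall u.\ v(u) \text{ knows } e(u)$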
25 Derivation of "every u knows $D^4[u]$"
- step 1: u knows u; u knows D[u]
- step 2: D[u] knows u; D[u] knows D[D[u]]
- step 3: u knows D[D[u]]; D[D[u]] knows u; D[D[u]] knows $D^4[u]$
- step 4: u knows $D^4[u]$
[Figure legend: each arrow in the derivation is either message passing or logical inference.]
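The derivation can be mimicked with explicit mailboxes. The Python sketch below shows one possible schedule of four message rounds (two request/reply pairs); it is an assumption made for illustration and not necessarily the schedule that Palgol's compiler derives. The helper names `request_reply` and `chain_access_4` are hypothetical.

```python
from collections import defaultdict


def request_reply(knows):
    """Given knows[u] for every u, let every u learn knows[knows[u]].

    Uses two rounds of messages: a request carrying the sender's id,
    followed by a reply carrying the value the receiver knows.
    """
    # Round 1: every u sends its own id to the vertex it knows.
    inbox = defaultdict(list)
    for u, target in knows.items():
        inbox[target].append(u)
    # Round 2: every receiver w answers each requester with knows[w].
    replies = {}
    for w, requesters in inbox.items():
        for u in requesters:
            replies[u] = knows[w]
    return replies


def chain_access_4(D):
    """Make every u know D[D[D[D[u]]]] using four message rounds."""
    d2 = request_reply(D)    # after two rounds, u knows D[D[u]]
    d4 = request_reply(d2)   # after two more rounds, u knows D^4[u]
    return d4


parents = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
d4 = chain_access_4(parents)
```

The second call works because the result of the first call holds for every vertex, so D[D[u]] itself already knows its own two-step ancestor when the request from u arrives.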
26 Expressiveness of Palgol
Palgol can concisely implement many representative graph algorithms:
- Single-Source Shortest Path
- PageRank
- Randomized Bipartite Matching
- Strongly Connected Components
- Triangle Counting
- Shiloach-Vishkin Algorithm
- Minimum Spanning Forest
- Randomized Graph Coloring
- Approximate Maximum Weight Matching
27 Roadmap & Future Plan
[Figure: roadmap. Distributed graph processing; Pregel (BSP model, vertex-centric); Pregel algorithms, Pregel system, and domain-specific languages.]
- Extend Palgol with more features: general remote access, graph mutation, ..
- Compile Palgol to a customized Pregel system to improve performance
28 Thank You Questions & Answers
More informationThis course is intended for 3rd and/or 4th year undergraduate majors in Computer Science.
Lecture 9 Graphs This course is intended for 3rd and/or 4th year undergraduate majors in Computer Science. You need to be familiar with the design and use of basic data structures such as Lists, Stacks,
More informationGraph Processing. Connor Gramazio Spiros Boosalis
Graph Processing Connor Gramazio Spiros Boosalis Pregel why not MapReduce? semantics: awkward to write graph algorithms efficiency: mapreduces serializes state (e.g. all nodes and edges) while pregel keeps
More informationLecture 5: Graph algorithms 1
DD2458, Problem Solving and Programming Under Pressure Lecture 5: Graph algorithms 1 Date: 2008-10-01 Scribe(s): Mikael Auno and Magnus Andermo Lecturer: Douglas Wikström This lecture presents some common
More informationDistributed Systems. 21. Other parallel frameworks. Paul Krzyzanowski. Rutgers University. Fall 2018
Distributed Systems 21. Other parallel frameworks Paul Krzyzanowski Rutgers University Fall 2018 1 Can we make MapReduce easier? 2 Apache Pig Why? Make it easy to use MapReduce via scripting instead of
More informationMATH 363 Final Wednesday, April 28. Final exam. You may use lemmas and theorems that were proven in class and on assignments unless stated otherwise.
Final exam This is a closed book exam. No calculators are allowed. Unless stated otherwise, justify all your steps. You may use lemmas and theorems that were proven in class and on assignments unless stated
More informationCSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li
CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Main Goal of Algorithm Design Design fast
More informationAnswer: Early binding generally leads to greater efficiency (compilation approach) Late binding general leads to greater flexibility
Quiz Review Q1. What is the advantage of binding things as early as possible? Is there any advantage to delaying binding? Answer: Early binding generally leads to greater efficiency (compilation approach)
More informationOptimizing Cache Performance for Graph Analytics. Yunming Zhang Presentation
Optimizing Cache Performance for Graph Analytics Yunming Zhang 6.886 Presentation Goals How to optimize in-memory graph applications How to go about performance engineering just about anything In-memory
More informationCS November 2018
Distributed Systems 1. Other parallel frameworks Can we make MapReduce easier? Paul Krzyzanowski Rutgers University Fall 018 1 Apache Pig Apache Pig Why? Make it easy to use MapReduce via scripting instead
More informationCSE 100 Minimum Spanning Trees Prim s and Kruskal
CSE 100 Minimum Spanning Trees Prim s and Kruskal Your Turn The array of vertices, which include dist, prev, and done fields (initialize dist to INFINITY and done to false ): V0: dist= prev= done= adj:
More informationAn undirected graph is a tree if and only of there is a unique simple path between any 2 of its vertices.
Trees Trees form the most widely used subclasses of graphs. In CS, we make extensive use of trees. Trees are useful in organizing and relating data in databases, file systems and other applications. Formal
More informationHigh-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock
High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock Yangzihao Wang University of California, Davis yzhwang@ucdavis.edu March 24, 2014 Yangzihao Wang (yzhwang@ucdavis.edu)
More informationGraphs. Part I: Basic algorithms. Laura Toma Algorithms (csci2200), Bowdoin College
Laura Toma Algorithms (csci2200), Bowdoin College Undirected graphs Concepts: connectivity, connected components paths (undirected) cycles Basic problems, given undirected graph G: is G connected how many
More information