Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Access

1 Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Access
Yongzhe Zhang, National Institute of Informatics
3rd Spring Festival Workshop, March 21, 2017

2 Outline
- Background of vertex-centric graph processing
- Example and problems in the programming model
- My research work
  - A novel high-level model for describing vertex-centric graph computation based on remote access
  - Palgol: a concise and efficient domain-specific language for implementing vertex-centric graph algorithms

3 The Pregel Model: Overview
- Pregel: a distributed big-graph processing framework proposed by Google [SIGMOD '10]
- Pregel = BSP + vertex-centric approach
[Figure: the Bulk-Synchronous Parallel (BSP) model. In each BSP superstep, the processors perform local computation, then data communication, then barrier synchronization.]
[Figure: vertex-centric computation. A vertex v receives messages, runs compute(), and sends messages.]

4 Pregel Programming Model
Each processor runs the following code:
    do
      foreach active vertex v on the processor
        (halt, msgs) = v.compute(getmsg(v))
        // halt: whether v inactivates itself
        // msgs: messages sent to other vertices
        handle(halt, msgs)
      end
      exchange data with other processors
      activate vertices that receive messages
    until all vertices are inactive
The only user-defined function:
    Vertex::compute(Message[]) -> (Bool, Message[])
[Figure: a vertex v receives messages, runs compute(), and sends messages.]
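The loop above can be sketched as a single-process simulation. This is only an illustration under stated assumptions, not Pregel's actual API: `run_pregel`, `max_compute`, and the layout of the vertex state (a dict with `value`, `neighbors`, `started`) are hypothetical names chosen here, and the example algorithm (propagating the maximum value) is not taken from the slides.

```python
def run_pregel(states, compute):
    """Simulate the per-processor loop on one machine (hypothetical driver).

    states:  dict mapping vertex id -> mutable vertex state
    compute: function (vertex id, state, msgs) -> (halt, [(target, payload)])
    Returns the number of supersteps executed.
    """
    inbox = {v: [] for v in states}
    active = set(states)          # every vertex starts active
    supersteps = 0
    while active:                 # "until all vertices are inactive"
        outbox = {v: [] for v in states}
        still_active = set()
        for v in sorted(active):
            halt, msgs = compute(v, states[v], inbox[v])
            for target, payload in msgs:
                outbox[target].append(payload)
            if not halt:
                still_active.add(v)
        # Barrier: messages become visible only in the next superstep,
        # and any vertex that received a message is reactivated.
        inbox = outbox
        active = still_active | {v for v, m in inbox.items() if m}
        supersteps += 1
    return supersteps


def max_compute(v, state, msgs):
    """Example compute(): spread the largest value to every vertex."""
    best = max(msgs, default=state["value"])
    if not state["started"] or best > state["value"]:
        state["value"] = max(best, state["value"])
        state["started"] = True
        return True, [(n, state["value"]) for n in state["neighbors"]]
    return True, []               # nothing new: halt without sending


graph = {
    0: {"value": 3, "neighbors": [1], "started": False},
    1: {"value": 6, "neighbors": [0, 2], "started": False},
    2: {"value": 2, "neighbors": [1], "started": False},
}
run_pregel(graph, max_compute)
print({v: s["value"] for v, s in graph.items()})
```

Every vertex votes to halt in each superstep and is woken up again only by incoming messages, which is what makes the outer loop terminate.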

5 Pregel Programming Example (1)
A typical Pregel program (the compute function):
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: {
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          z = msgs_to_parent(x, y, this);
          state = 3;
          return (false, encode(z));
        }
        case 3: ..
        case 4: ..
      }
    }
It seems that some messages are emitted in state 1, and that state 3 needs to handle the messages sent in state 2.

6 Pregel Programming Example (2)
Vertices send the necessary messages in state 1:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: {
          x = msgs_to_children(this);
          y = msgs_to_siblings(this);
          h = halt_or_not(..);
          state = 2;
          return (h, encode(x, y));
        }
        case 2: ..
        case 3: ..
        case 4: ..
      }
    }
Callout (state 2, which consumes these messages):
    case 2:
      x = get_data_from_parent(msgs);
      y = get_data_from_siblings(msgs);
      z = msgs_to_parent(x, y, this);
      state = 3;
      return (false, encode(z))

7 Pregel Programming Example (3)
State 3 indeed handles the messages sent in state 2:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3: {
          ..
          this.res += get_sum(msgs)
          b = make_a_decision(this, ..)
          state = (b ? 1 : 4);
          return (false, []);
        }
        case 4: ..
      }
    }
Callout (state 2, which sends these messages):
    case 2:
      x = get_data_from_parent(msgs);
      y = get_data_from_siblings(msgs);
      z = msgs_to_parent(x, y, this);
      state = 3;
      return (false, encode(z))

8 Pregel Programming Example (4)
Pregel algorithms are usually iterative:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3: {
          ..
          this.res += get_sum(msgs)
          b = make_a_decision(this, ..)
          state = (b ? 1 : 4);
          return (false, []);
        }
        case 4: ..
      }
    }
Iteration is implemented by state transition (goto state 1).
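A minimal sketch of this state-machine style follows. It is a toy under stated assumptions: the `Vertex` class, its `rounds_left` counter (standing in for `make_a_decision`), and the single vertex that receives its own messages are all hypothetical simplifications, not code from the talk.

```python
class Vertex:
    """A vertex whose compute() is an explicit state machine (states 1 to 4)."""

    def __init__(self, rounds):
        self.state = 1
        self.res = 0
        self.rounds_left = rounds   # stand-in for make_a_decision(..)

    def compute(self, msgs):
        if self.state == 1:
            # State 1: emit the messages that state 2 will consume.
            self.state = 2
            return False, [1]
        if self.state == 2:
            # State 2: read state 1's messages, emit new ones for state 3.
            self.state = 3
            return False, [sum(msgs) + 1]
        if self.state == 3:
            # State 3: fold the messages from state 2 into the result,
            # then either loop back to state 1 or move to state 4.
            self.res += sum(msgs)
            self.rounds_left -= 1
            self.state = 1 if self.rounds_left > 0 else 4
            return False, []
        return True, []             # state 4: vote to halt


def run(vertex):
    """Drive one vertex that receives its own messages; count supersteps."""
    msgs, halt, supersteps = [], False, 0
    while not halt:
        halt, msgs = vertex.compute(msgs)
        supersteps += 1
    return supersteps


v = Vertex(rounds=2)
print(run(v), v.res)
```

Even in this toy, the data dependency between states and the loop back to state 1 are only implicit in the assignments to `self.state`, which is the readability problem the talk points at.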

9 Problems in Pregel Programming
- Too many low-level details and no compile-time checks
  - state transitions (handling data dependencies and iteration)
  - message passing (encoding/decoding)
  - termination control
  - ..
- Readability: the algorithm is hard to understand
- Maintainability

10 Reason?
The gap between the high-level algorithm description and Pregel's low-level interfaces.
A simple high-level description:
    do
      foreach vertex v
        x = parent's some value
        y = siblings' some value
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..

11 Reason?
The gap between the high-level algorithm description and Pregel's low-level interfaces.
Low-level implementation:
    fn compute(message[] msgs) {
      switch (state) {
        case 1:
          x = msgs_to_children(this);
          y = msgs_to_siblings(this);
          return (?, encode(x, y))
        case 2:
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          ..
        case 3: ..
        case 4: ..
      }
    }
High-level description:
    do
      foreach vertex v
        x = parent's some value
        y = siblings' some value
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..

12 Reason?
The gap between the high-level algorithm description and Pregel's low-level interfaces.
Low-level implementation:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2:
          x = get_data_from_parent(msgs);
          y = get_data_from_siblings(msgs);
          z = msgs_to_parent(x, y, this);
          return (false, encode(z))
        case 3:
          this.res += get_sum(msgs)
          ..
        case 4: ..
      }
    }
High-level description:
    do
      foreach vertex v
        x = parent's some value
        y = siblings' some value
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..

13 Reason?
The gap between the high-level algorithm description and Pregel's low-level interfaces.
Low-level implementation:
    fn compute(message[] msgs) {
      switch (state) {
        case 1: ..
        case 2: ..
        case 3:
          ..
          this.res += get_sum(msgs)
          b = make_a_decision(this, ..)
          state = (b ? 1 : 4);
          return (false, []);
        case 4: ..
      }
    }
High-level description:
    do
      foreach vertex v
        x = parent's some value
        y = siblings' some value
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..

14 Basic Idea
To describe a Pregel algorithm clearly, we need:
- a vertex-centric way of describing algorithms
- remote reads / writes instead of message passing
- high-level control flow instead of state transitions
    do
      foreach vertex v
        x = parent's some value
        y = siblings' some value
        parent.res += f(x, y, v)
      end
    until every vertex satisfies ..

15 A Systematic Solution
- Design a domain-specific language that allows users to describe Pregel algorithms concisely:
  - remote access (reading or writing attributes of other vertices through references)
  - high-level control flow (iteration constructs)
- Compile the DSL to a Pregel system
  - analyze the data dependencies and produce the state transition machine as well as the message passing scheme

16 Challenges
- Remote reads (using our formalization) are in general complex and hard to translate to message passing efficiently or mechanically
  - all existing DSLs have severe restrictions that make some interesting algorithms hard to express
- Some of Pregel's low-level interfaces are hard to formalize properly

17 Summary of My Research Work
- Study the translation from common remote access patterns to message passing in Pregel
  - The translation is mechanical
  - The solution is efficient
- Design and implement a new DSL based on remote access
  - It is more expressive than all existing DSLs
  - Its performance is comparable to manually written code: between a 25% speedup and a 30% slowdown on popular algorithms (PageRank, etc.) over real-world graphs

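18 The High-Level Model
Vertex-centric computation based on remote access.
[Figure: Pregel's message passing model (a vertex v receives messages, runs compute(), and sends messages) next to an algorithmic superstep (a step on vertex v takes the input graph to the output graph through field reads and field writes, with direct field access from the entire graph).]
[Figure: sequence & iteration. Steps are composed in sequence (step 1, then step 2, on a graph g) or repeated by iteration.]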

19 Algorithmic Superstep
A two-phase model that prevents data conflicts.
The first phase: remote read, computation, local write.
[Figure: every vertex v reads fields from the input graph (remote read), computes, and writes its own fields (local write), producing the intermediate graph; the output graph follows after the second phase.]

20 Algorithmic Superstep
A two-phase model that prevents data conflicts.
The second phase (optional): remote write (by sending a value), receive & reduce, final vertex state.
[Figure: starting from the intermediate graph, every vertex v sends remote writes as values; each target vertex receives and reduces them to obtain its final state (field write) in the output graph.]
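The two phases can be sketched in a few lines. This is a reading of the slides under stated assumptions, not Palgol's implementation: `algorithmic_superstep`, `local_step`, and `combine` are hypothetical names, the graph is a plain dict of field dicts, and remote writes are assumed to be combined with the target's current value by a reduction operator.

```python
import operator


def algorithmic_superstep(graph, local_step, combine):
    """Run one two-phase step and return the output graph.

    graph:      dict vertex id -> dict of fields (the input graph, read-only)
    local_step: (v, graph) -> (local field updates, [(target, field, value)])
    combine:    reduction applied to remote writes arriving at one field
    """
    # Phase 1: every vertex reads from the unchanged input graph and
    # writes only its own fields, so no read can observe a new value.
    intermediate = {}
    remote_writes = []
    for v in graph:
        updates, writes = local_step(v, graph)
        intermediate[v] = {**graph[v], **updates}
        remote_writes.extend(writes)

    # Phase 2 (optional): remote writes travel as values and are reduced
    # at the receiving vertex, so concurrent writes cannot conflict.
    output = {v: dict(fields) for v, fields in intermediate.items()}
    for target, field, value in remote_writes:
        output[target][field] = combine(output[target][field], value)
    return output


def add_to_parent(v, graph):
    """Mirror of 'parent.res += f(x, v)' with f = parent's val + own val."""
    parent = graph[v]["parent"]
    x = graph[parent]["val"]                       # remote read
    return {}, [(parent, "res", x + graph[v]["val"])]


tree = {
    0: {"parent": 0, "val": 1, "res": 0},
    1: {"parent": 0, "val": 2, "res": 0},
    2: {"parent": 0, "val": 3, "res": 0},
}
print(algorithmic_superstep(tree, add_to_parent, operator.add))
```

Because phase 1 never mutates the input graph and phase 2 only reduces, the result does not depend on the order in which vertices are visited, provided `combine` is associative and commutative.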

21 Example: The Pointer-Jumping Algorithm
Consider the implementation of pointer-jumping:
- Each vertex has a parent pointer
- Make all vertices point to the root

22 Palgol: Pregel Algorithmic Language
A Palgol program for pointer-jumping:
- Each vertex has a parent pointer (D field)
- Make all vertices point to the root
    input[id, D]
    output[id, D]

    do
      for u in V
        if (D[D[u]] != D[u])
          local D[u] := D[D[u]]
      end
    until fix[D]
Field access syntax:
- D is a field, u is a vertex id
- D[u]: field access
- D[D[u]]: nested field access (or chain access)
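For readers without a Palgol compiler at hand, the effect of the program can be sketched sequentially. This is only an illustration of the intended semantics, assuming the parent pointers are stored in a Python list indexed by vertex id and that the root points to itself; `pointer_jumping` is a hypothetical name.

```python
def pointer_jumping(parents):
    """Repeat D[u] := D[D[u]] for all u until D stops changing."""
    D = list(parents)
    while True:
        # All reads use the old D, as in one algorithmic superstep.
        new_D = [D[D[u]] if D[D[u]] != D[u] else D[u] for u in range(len(D))]
        if new_D == D:          # "until fix[D]": nothing changed
            return D
        D = new_D


# A chain 0 -> 1 -> 2 -> 3 -> 4 -> 5, where 5 is the root.
print(pointer_jumping([1, 2, 3, 4, 5, 5]))
```

Each round doubles the distance a pointer skips, so a chain of length n settles after about log2(n) rounds plus one final round that detects the fixed point.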

23 Key Technique: Translation of Chain Access
How can chain access, e.g. D[D[D[D[u]]]], be translated efficiently to Pregel's message passing?
- Convert it to a logical proposition: "every u knows D^4[u]"
- Search for a derivation in our logic system

24 Axioms of Our Logic System
1. ∀u. u knows u
2. ∀u. u knows D[u]
3. (∀u. w(u) knows e(u)) ∧ (∀u. w(u) knows v(u)) ⇒ ∀u. v(u) knows e(u)

25 Derivation of "every u knows D^4[u]"
(Arrows in the figure denote either message passing or logical inference.)
step 1: u knows u; u knows D[u]
step 2: D[u] knows u; D[u] knows D[D[u]]
step 3: u knows D[D[u]]; D[D[u]] knows u; D[D[u]] knows D^4[u]
step 4: u knows D^4[u]
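The derivation can be replayed as three rounds of message passing. The sketch below is one possible reading, assuming each use of axiom 3 costs one round in which w(u) sends e(u) to v(u); the function name and the list-based mailboxes are hypothetical, and this is not the code Palgol generates.

```python
def chain_access_d4(D):
    """Let every vertex u learn D[D[D[D[u]]]] in three message rounds."""
    n = len(D)

    # Round 1 (step 1 -> step 2): u sends its own id to D[u],
    # so D[u] knows u.
    children = [[] for _ in range(n)]
    for u in range(n):
        children[D[u]].append(u)

    # Round 2 (step 2 -> step 3): each p = D[u] knows u and D[p].
    # It replies D[p] to u (u knows D[D[u]]) and forwards u to D[p]
    # (D[D[u]] knows u).
    d2 = [None] * n
    grandchildren = [[] for _ in range(n)]
    for p in range(n):
        for u in children[p]:
            d2[u] = D[p]
            grandchildren[D[p]].append(u)

    # Round 3 (step 3 -> step 4): each g = D[D[u]] knows u and its own
    # d2[g], which equals D^4[u]; it sends that value to u.
    d4 = [None] * n
    for g in range(n):
        for u in grandchildren[g]:
            d4[u] = d2[g]
    return d4


D = [1, 2, 3, 4, 5, 5]
print(chain_access_d4(D))
```

A request-and-reply chain that walks one pointer at a time would need more rounds; reusing the fact "u knows D[D[u]]" at vertex D[D[u]] is what shortens the schedule.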

26 Expressiveness of Palgol
Palgol can concisely implement many representative graph algorithms:
- Single-Source Shortest Path
- PageRank
- Randomized Bipartite Matching
- Strongly Connected Components
- Triangle Counting
- Shiloach-Vishkin Algorithm
- Minimum Spanning Forest
- Randomized Graph Coloring
- Approximate Maximum Weight Matching

27 Roadmap & Future Plan
[Figure: roadmap of distributed graph processing, covering Pregel (BSP model, vertex-centric), Pregel algorithms, the Pregel system, and domain-specific languages.]
- Extend Palgol with more features: general remote access, graph mutation, ..
- Compile Palgol to a customized Pregel system to improve performance

28 Thank You Questions & Answers

Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Data Access

Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Data Access Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Data Access Yongzhe Zhang 1,2, Hsiang-Shang Ko 2, and Zhenjiang Hu 1,2 1 Department of Informatics, SOKENDAI Shonan Village, Hayama,

More information

Pregel: A System for Large- Scale Graph Processing. Written by G. Malewicz et al. at SIGMOD 2010 Presented by Chris Bunch Tuesday, October 12, 2010

Pregel: A System for Large- Scale Graph Processing. Written by G. Malewicz et al. at SIGMOD 2010 Presented by Chris Bunch Tuesday, October 12, 2010 Pregel: A System for Large- Scale Graph Processing Written by G. Malewicz et al. at SIGMOD 2010 Presented by Chris Bunch Tuesday, October 12, 2010 1 Graphs are hard Poor locality of memory access Very

More information

Authors: Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, L., Leiser, N., Czjkowski, G.

Authors: Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, L., Leiser, N., Czjkowski, G. Authors: Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, L., Leiser, N., Czjkowski, G. Speaker: Chong Li Department: Applied Health Science Program: Master of Health Informatics 1 Term

More information

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective ECE 60 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 3: Programming Models Pregel: A System for Large-Scale Graph Processing

More information

PREGEL: A SYSTEM FOR LARGE-SCALE GRAPH PROCESSING

PREGEL: A SYSTEM FOR LARGE-SCALE GRAPH PROCESSING PREGEL: A SYSTEM FOR LARGE-SCALE GRAPH PROCESSING Grzegorz Malewicz, Matthew Austern, Aart Bik, James Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski (Google, Inc.) SIGMOD 2010 Presented by : Xiu

More information

Pregel: A System for Large-Scale Graph Proces sing

Pregel: A System for Large-Scale Graph Proces sing Pregel: A System for Large-Scale Graph Proces sing Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkwoski Google, Inc. SIGMOD July 20 Taewhi

More information

PREGEL: A SYSTEM FOR LARGE- SCALE GRAPH PROCESSING

PREGEL: A SYSTEM FOR LARGE- SCALE GRAPH PROCESSING PREGEL: A SYSTEM FOR LARGE- SCALE GRAPH PROCESSING G. Malewicz, M. Austern, A. Bik, J. Dehnert, I. Horn, N. Leiser, G. Czajkowski Google, Inc. SIGMOD 2010 Presented by Ke Hong (some figures borrowed from

More information

Pregel. Ali Shah

Pregel. Ali Shah Pregel Ali Shah s9alshah@stud.uni-saarland.de 2 Outline Introduction Model of Computation Fundamentals of Pregel Program Implementation Applications Experiments Issues with Pregel 3 Outline Costs of Computation

More information

COSC 6339 Big Data Analytics. Graph Algorithms and Apache Giraph

COSC 6339 Big Data Analytics. Graph Algorithms and Apache Giraph COSC 6339 Big Data Analytics Graph Algorithms and Apache Giraph Parts of this lecture are adapted from UMD Jimmy Lin s slides, which is licensed under a Creative Commons Attribution-Noncommercial-Share

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 14: Distributed Graph Processing Motivation Many applications require graph processing E.g., PageRank Some graph data sets are very large

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 14: Distributed Graph Processing Motivation Many applications require graph processing E.g., PageRank Some graph data sets are very large

More information

[CoolName++]: A Graph Processing Framework for Charm++

[CoolName++]: A Graph Processing Framework for Charm++ [CoolName++]: A Graph Processing Framework for Charm++ Hassan Eslami, Erin Molloy, August Shi, Prakalp Srivastava Laxmikant V. Kale Charm++ Workshop University of Illinois at Urbana-Champaign {eslami2,emolloy2,awshi2,psrivas2,kale}@illinois.edu

More information

Distributed Systems. 21. Graph Computing Frameworks. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems. 21. Graph Computing Frameworks. Paul Krzyzanowski. Rutgers University. Fall 2016 Distributed Systems 21. Graph Computing Frameworks Paul Krzyzanowski Rutgers University Fall 2016 November 21, 2016 2014-2016 Paul Krzyzanowski 1 Can we make MapReduce easier? November 21, 2016 2014-2016

More information

modern database systems lecture 10 : large-scale graph processing

modern database systems lecture 10 : large-scale graph processing modern database systems lecture 1 : large-scale graph processing Aristides Gionis spring 18 timeline today : homework is due march 6 : homework out april 5, 9-1 : final exam april : homework due graphs

More information

Large-Scale Graph Processing 1: Pregel & Apache Hama Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC

Large-Scale Graph Processing 1: Pregel & Apache Hama Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC Large-Scale Graph Processing 1: Pregel & Apache Hama Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC Lecture material is mostly home-grown, partly taken with permission and courtesy from Professor Shih-Wei

More information

King Abdullah University of Science and Technology. CS348: Cloud Computing. Large-Scale Graph Processing

King Abdullah University of Science and Technology. CS348: Cloud Computing. Large-Scale Graph Processing King Abdullah University of Science and Technology CS348: Cloud Computing Large-Scale Graph Processing Zuhair Khayyat 10/March/2013 The Importance of Graphs A graph is a mathematical structure that represents

More information

Apache Giraph. for applications in Machine Learning & Recommendation Systems. Maria Novartis

Apache Giraph. for applications in Machine Learning & Recommendation Systems. Maria Novartis Apache Giraph for applications in Machine Learning & Recommendation Systems Maria Stylianou @marsty5 Novartis Züri Machine Learning Meetup #5 June 16, 2014 Apache Giraph for applications in Machine Learning

More information

DFA-G: A Unified Programming Model for Vertex-centric Parallel Graph Processing

DFA-G: A Unified Programming Model for Vertex-centric Parallel Graph Processing SCHOOL OF COMPUTER SCIENCE AND ENGINEERING DFA-G: A Unified Programming Model for Vertex-centric Parallel Graph Processing Bo Suo, Jing Su, Qun Chen, Zhanhuai Li, Wei Pan 2016-08-19 1 ABSTRACT Many systems

More information

1. Let n and m be positive integers with n m. a. Write the inclusion/exclusion formula for the number S(n, m) of surjections from {1, 2,...

1. Let n and m be positive integers with n m. a. Write the inclusion/exclusion formula for the number S(n, m) of surjections from {1, 2,... MATH 3012, Quiz 3, November 24, 2015, WTT Student Name and ID Number 1. Let n and m be positive integers with n m. a. Write the inclusion/exclusion formula for the number S(n, m) of surjections from {1,

More information

GraphHP: A Hybrid Platform for Iterative Graph Processing

GraphHP: A Hybrid Platform for Iterative Graph Processing GraphHP: A Hybrid Platform for Iterative Graph Processing Qun Chen, Song Bai, Zhanhuai Li, Zhiying Gou, Bo Suo and Wei Pan Northwestern Polytechnical University Xi an, China {chenbenben, baisong, lizhh,

More information

LECTURE 26 PRIM S ALGORITHM

LECTURE 26 PRIM S ALGORITHM DATA STRUCTURES AND ALGORITHMS LECTURE 26 IMRAN IHSAN ASSISTANT PROFESSOR AIR UNIVERSITY, ISLAMABAD STRATEGY Suppose we take a vertex Given a single vertex v 1, it forms a minimum spanning tree on one

More information

arxiv: v1 [cs.dc] 5 Nov 2018

arxiv: v1 [cs.dc] 5 Nov 2018 Composing Optimization Techniques for Vertex-Centric Graph Processing via Communication Channels Yongzhe Zhang, Zhenjiang Hu Department of Informatics, SOKENDAI, Japan National Institute of Informatics

More information

Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees

Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees Da Yan #1, James Cheng #2, Kai Xing 3, Yi Lu #4, Wilfred Ng 5, Yingyi Bu 6 # Department of Computer Science and Engineering,

More information

BSP, Pregel and the need for Graph Processing

BSP, Pregel and the need for Graph Processing BSP, Pregel and the need for Graph Processing Patrizio Dazzi, HPC Lab ISTI - CNR mail: patrizio.dazzi@isti.cnr.it web: http://hpc.isti.cnr.it/~dazzi/ National Research Council of Italy A need for Graph

More information

Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora

Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora Turning NoSQL data into Graph Playing with Apache Giraph and Apache Gora Team Renato Marroquín! PhD student: Interested in: Information retrieval. Distributed and scalable data management. Apache Gora:

More information

Parallel HITS Algorithm Implemented Using HADOOP GIRAPH Framework to resolve Big Data Problem

Parallel HITS Algorithm Implemented Using HADOOP GIRAPH Framework to resolve Big Data Problem I J C T A, 9(41) 2016, pp. 1235-1239 International Science Press Parallel HITS Algorithm Implemented Using HADOOP GIRAPH Framework to resolve Big Data Problem Hema Dubey *, Nilay Khare *, Alind Khare **

More information

Giraph: Large-scale graph processing infrastructure on Hadoop. Qu Zhi

Giraph: Large-scale graph processing infrastructure on Hadoop. Qu Zhi Giraph: Large-scale graph processing infrastructure on Hadoop Qu Zhi Why scalable graph processing? Web and social graphs are at immense scale and continuing to grow In 2008, Google estimated the number

More information

Big Graph Processing. Fenggang Wu Nov. 6, 2016

Big Graph Processing. Fenggang Wu Nov. 6, 2016 Big Graph Processing Fenggang Wu Nov. 6, 2016 Agenda Project Publication Organization Pregel SIGMOD 10 Google PowerGraph OSDI 12 CMU GraphX OSDI 14 UC Berkeley AMPLab PowerLyra EuroSys 15 Shanghai Jiao

More information

Large Scale Graph Processing Pregel, GraphLab and GraphX

Large Scale Graph Processing Pregel, GraphLab and GraphX Large Scale Graph Processing Pregel, GraphLab and GraphX Amir H. Payberah amir@sics.se KTH Royal Institute of Technology Amir H. Payberah (KTH) Large Scale Graph Processing 2016/10/03 1 / 76 Amir H. Payberah

More information

Green-Marl. A DSL for Easy and Efficient Graph Analysis. LSDPO (2017/2018) Paper Presentation Tudor Tiplea (tpt26)

Green-Marl. A DSL for Easy and Efficient Graph Analysis. LSDPO (2017/2018) Paper Presentation Tudor Tiplea (tpt26) Green-Marl A DSL for Easy and Efficient Graph Analysis S. Hong, H. Chafi, E. Sedlar, K. Olukotun [1] LSDPO (2017/2018) Paper Presentation Tudor Tiplea (tpt26) Problem Paper identifies three major challenges

More information

Frameworks for Graph-Based Problems

Frameworks for Graph-Based Problems Frameworks for Graph-Based Problems Dakshil Shah U.G. Student Computer Engineering Department Dwarkadas J. Sanghvi College of Engineering, Mumbai, India Chetashri Bhadane Assistant Professor Computer Engineering

More information

Problem Score Max Score 1 Syntax directed translation & type

Problem Score Max Score 1 Syntax directed translation & type CMSC430 Spring 2014 Midterm 2 Name Instructions You have 75 minutes for to take this exam. This exam has a total of 100 points. An average of 45 seconds per point. This is a closed book exam. No notes

More information

Highway Hierarchies (Dominik Schultes) Presented by: Andre Rodriguez

Highway Hierarchies (Dominik Schultes) Presented by: Andre Rodriguez Highway Hierarchies (Dominik Schultes) Presented by: Andre Rodriguez Central Idea To go from Tallahassee to Gainesville*: Get to the I-10 Drive on the I-10 Get to Gainesville (8.8 mi) (153 mi) (1.8 mi)

More information

TI2736-B Big Data Processing. Claudia Hauff

TI2736-B Big Data Processing. Claudia Hauff TI2736-B Big Data Processing Claudia Hauff ti2736b-ewi@tudelft.nl Intro Streams Streams Map Reduce HDFS Pig Ctd. Graphs Pig Design Patterns Hadoop Ctd. Giraph Zoo Keeper Spark Spark Ctd. Learning objectives

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

EE/CSCI 451 Spring 2018 Homework 8 Total Points: [10 points] Explain the following terms: EREW PRAM CRCW PRAM. Brent s Theorem.

EE/CSCI 451 Spring 2018 Homework 8 Total Points: [10 points] Explain the following terms: EREW PRAM CRCW PRAM. Brent s Theorem. EE/CSCI 451 Spring 2018 Homework 8 Total Points: 100 1 [10 points] Explain the following terms: EREW PRAM CRCW PRAM Brent s Theorem BSP model 1 2 [15 points] Assume two sorted sequences of size n can be

More information

CS /21/2016. Paul Krzyzanowski 1. Can we make MapReduce easier? Distributed Systems. Apache Pig. Apache Pig. Pig: Loading Data.

CS /21/2016. Paul Krzyzanowski 1. Can we make MapReduce easier? Distributed Systems. Apache Pig. Apache Pig. Pig: Loading Data. Distributed Systems 1. Graph Computing Frameworks Can we make MapReduce easier? Paul Krzyzanowski Rutgers University Fall 016 1 Apache Pig Apache Pig Why? Make it easy to use MapReduce via scripting instead

More information

CS521 \ Notes for the Final Exam

CS521 \ Notes for the Final Exam CS521 \ Notes for final exam 1 Ariel Stolerman Asymptotic Notations: CS521 \ Notes for the Final Exam Notation Definition Limit Big-O ( ) Small-o ( ) Big- ( ) Small- ( ) Big- ( ) Notes: ( ) ( ) ( ) ( )

More information

ELEC 876: Software Reengineering

ELEC 876: Software Reengineering ELEC 876: Software Reengineering () Dr. Ying Zou Department of Electrical & Computer Engineering Queen s University Compiler and Interpreter Compiler Source Code Object Compile Execute Code Results data

More information

Operational Semantics 1 / 13

Operational Semantics 1 / 13 Operational Semantics 1 / 13 Outline What is semantics? Operational Semantics What is semantics? 2 / 13 What is the meaning of a program? Recall: aspects of a language syntax: the structure of its programs

More information

CS2 Algorithms and Data Structures Note 10. Depth-First Search and Topological Sorting

CS2 Algorithms and Data Structures Note 10. Depth-First Search and Topological Sorting CS2 Algorithms and Data Structures Note 10 Depth-First Search and Topological Sorting In this lecture, we will analyse the running time of DFS and discuss a few applications. 10.1 A recursive implementation

More information

Early Experience with Intergrating Charm++ Support to Green-Marl DSL

Early Experience with Intergrating Charm++ Support to Green-Marl DSL Early Experience with Intergrating Charm++ Support to Green-Marl DSL Alexander Frolov DISLab, «Scientific and Research Center on Computer Techonology» (NICEVT) 15th Annual Workshop on Charm++ and its Applications

More information

Solutions to relevant spring 2000 exam problems

Solutions to relevant spring 2000 exam problems Problem 2, exam Here s Prim s algorithm, modified slightly to use C syntax. MSTPrim (G, w, r): Q = V[G]; for (each u Q) { key[u] = ; key[r] = 0; π[r] = 0; while (Q not empty) { u = ExtractMin (Q); for

More information

Jordan Boyd-Graber University of Maryland. Thursday, March 3, 2011

Jordan Boyd-Graber University of Maryland. Thursday, March 3, 2011 Data-Intensive Information Processing Applications! Session #5 Graph Algorithms Jordan Boyd-Graber University of Maryland Thursday, March 3, 2011 This work is licensed under a Creative Commons Attribution-Noncommercial-Share

More information

STUDENT OUTLINE. Lesson 8: Structured Programming, Control Structures, if-else Statements, Pseudocode

STUDENT OUTLINE. Lesson 8: Structured Programming, Control Structures, if-else Statements, Pseudocode STUDENT OUTLINE Lesson 8: Structured Programming, Control Structures, if- Statements, Pseudocode INTRODUCTION: This lesson is the first of four covering the standard control structures of a high-level

More information

Data-Intensive Distributed Computing

Data-Intensive Distributed Computing Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 8: Analyzing Graphs, Redux (1/2) March 20, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

Chapter 2 Abstract Machine Models. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam

Chapter 2 Abstract Machine Models. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam Chapter 2 Abstract Machine Models Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam Parallel Computer Models (1) A parallel machine model (also known as programming model, type architecture, conceptual

More information

CS536 Spring 2011 FINAL ID: Page 2 of 11

CS536 Spring 2011 FINAL ID: Page 2 of 11 CS536 Spring 2011 FINAL ID: Page 2 of 11 Question 2. (30 POINTS) Consider adding forward function declarations to the Little language. A forward function declaration is a function header (including its

More information

Graph and Digraph Glossary

Graph and Digraph Glossary 1 of 15 31.1.2004 14:45 Graph and Digraph Glossary A B C D E F G H I-J K L M N O P-Q R S T U V W-Z Acyclic Graph A graph is acyclic if it contains no cycles. Adjacency Matrix A 0-1 square matrix whose

More information

Graph Processing & Bulk Synchronous Parallel Model

Graph Processing & Bulk Synchronous Parallel Model Graph Processing & Bulk Synchronous Parallel Model CompSci 590.03 Instructor: Ashwin Machanavajjhala Lecture 14 : 590.02 Spring 13 1 Recap: Graph Algorithms Many graph algorithms need iterafve computafon

More information

SociaLite: A Datalog-based Language for

SociaLite: A Datalog-based Language for SociaLite: A Datalog-based Language for Large-Scale Graph Analysis Jiwon Seo M OBIS OCIAL RESEARCH GROUP Overview Overview! SociaLite: language for large-scale graph analysis! Extensions to Datalog! Compiler

More information

Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets

Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets Nadathur Satish, Narayanan Sundaram, Mostofa Ali Patwary, Jiwon Seo, Jongsoo Park, M. Amber Hassaan, Shubho Sengupta, Zhaoming

More information

One Trillion Edges. Graph processing at Facebook scale

One Trillion Edges. Graph processing at Facebook scale One Trillion Edges Graph processing at Facebook scale Introduction Platform improvements Compute model extensions Experimental results Operational experience How Facebook improved Apache Giraph Facebook's

More information

Optimizing CPU Cache Performance for Pregel-Like Graph Computation

Optimizing CPU Cache Performance for Pregel-Like Graph Computation Optimizing CPU Cache Performance for Pregel-Like Graph Computation Songjie Niu, Shimin Chen* State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences

More information

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing /34 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of

More information

Distributed Graph Storage. Veronika Molnár, UZH

Distributed Graph Storage. Veronika Molnár, UZH Distributed Graph Storage Veronika Molnár, UZH Overview Graphs and Social Networks Criteria for Graph Processing Systems Current Systems Storage Computation Large scale systems Comparison / Best systems

More information

Recap: Functions as first-class values

Recap: Functions as first-class values Recap: Functions as first-class values Arguments, return values, bindings What are the benefits? Parameterized, similar functions (e.g. Testers) Creating, (Returning) Functions Iterator, Accumul, Reuse

More information

Batch & Stream Graph Processing with Apache Flink. Vasia

Batch & Stream Graph Processing with Apache Flink. Vasia Batch & Stream Graph Processing with Apache Flink Vasia Kalavri vasia@apache.org @vkalavri Outline Distributed Graph Processing Gelly: Batch Graph Processing with Flink Gelly-Stream: Continuous Graph

More information

CSE302: Compiler Design
