Asynchronous Graph Processing
1 Asynchronous Graph Processing CompSci Instructor: Ashwin Machanavajjhala (slides adapted from GraphLab talks at UAI '10 & VLDB '12 and Guozhang Wang's talk at CIDR 2013) Lecture 15 : Spring 13
2 Recap: Pregel. [Figure 2: Maximum Value Example, one panel per superstep up to Superstep 3. Dotted lines are messages. Shaded vertices have voted to halt.]
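A minimal Python sketch of the maximum-value example in the synchronous, superstep-by-superstep style recapped on this slide. The function name `pregel_max` and the adjacency-dict layout are illustrative assumptions, not Pregel's API.

```python
def pregel_max(graph, values):
    """graph: vertex -> list of neighbours; values: vertex -> number."""
    values = dict(values)
    inbox = {v: [] for v in graph}
    active = set(graph)              # vertices that have not voted to halt
    superstep = 0
    while active:
        outbox = {v: [] for v in graph}
        for v in active:
            new_value = max([values[v]] + inbox[v])
            changed = new_value > values[v]
            values[v] = new_value
            # Send only in the first superstep or when the value changed.
            if superstep == 0 or changed:
                for n in graph[v]:
                    outbox[n].append(new_value)
        # Barrier: every vertex votes to halt; a message reactivates it.
        inbox = outbox
        active = {v for v in graph if inbox[v]}
        superstep += 1
    return values, superstep
```

Messages sent in one superstep are only visible in the next, which is the synchronous behaviour the rest of the lecture contrasts with asynchronous execution.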
3 Graph Processing: Dependency Graph; Local Updates; Iterative Computation. [Figure labels: My Interests; Friends' Interests]
4 This Class: Asynchronous Graph Processing
5 Example: Belief Propagation. $p(x_1, x_2, \ldots, x_n) \propto \prod_{u \in V} \phi_u(x_u) \prod_{(u,v) \in E} \phi_{u,v}(x_u, x_v)$. Want to compute the marginal distribution at each node.
6 Belief Propagation: the belief at a vertex depends on messages received from neighboring vertices. $b_u(x_u) \propto \phi_u(x_u) \prod_{(w,u) \in E} m_{w \to u}(x_u)$
7 Belief Propagation: the belief at a vertex depends on messages received from neighboring vertices. $b_u(x_u) \propto \phi_u(x_u) \prod_{(w,u) \in E} m_{w \to u}(x_u)$, and $m_{u \to v}(x_v) \propto \sum_{x_u} \phi_{u,v}(x_u, x_v) \, \frac{b_u(x_u)}{m_{v \to u}(x_u)}$ (2)
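The belief and message updates can be sketched in plain Python as follows. This is a minimal illustration, assuming discrete variables, strictly positive messages, and potentials stored as nested lists; the names `belief`, `message`, and `run_bp` are hypothetical helpers, not part of any library.

```python
from math import prod

def normalize(vec):
    total = sum(vec)
    return [x / total for x in vec]

def belief(u, node_pot, msgs, nbrs):
    # b_u(x_u) ∝ φ_u(x_u) · Π_w m_{w→u}(x_u)
    return normalize([
        node_pot[u][x] * prod(msgs[(w, u)][x] for w in nbrs[u])
        for x in range(len(node_pot[u]))
    ])

def message(u, v, node_pot, edge_pot, msgs, nbrs):
    # m_{u→v}(x_v) ∝ Σ_{x_u} φ_{u,v}(x_u, x_v) · b_u(x_u) / m_{v→u}(x_u)
    b_u = belief(u, node_pot, msgs, nbrs)
    return normalize([
        sum(edge_pot[(u, v)][xu][xv] * b_u[xu] / msgs[(v, u)][xu]
            for xu in range(len(b_u)))
        for xv in range(len(node_pot[v]))
    ])

def run_bp(node_pot, edge_pot, nbrs, sweeps=5):
    # Start from uniform messages on every directed edge.
    msgs = {(u, v): [1.0 / len(node_pot[v])] * len(node_pot[v])
            for u in nbrs for v in nbrs[u]}
    for _ in range(sweeps):
        for (u, v) in list(msgs):
            msgs[(u, v)] = message(u, v, node_pot, edge_pot, msgs, nbrs)
    return {u: belief(u, node_pot, msgs, nbrs) for u in nbrs}
```

On a tree-structured graph these beliefs should match the exact marginals; the fixed sweep count is a simplification, since a real implementation would test for convergence.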
8 Original BP Algorithm. [Figure: vertices A through I]
9 Original BP Algorithm can be inefficient: it spends time updating nodes which have already converged. Challenge = Boundaries
10-12 Residual BP Implementation. [Figure, repeated on slides 10 to 12: vertices A through I and a Scheduler]
13 Residual BP Implementation: ordering based on residual (max change in message value). [Figure: vertices A through I and the Scheduler queue]
14-15 Residual BP Implementation. [Figure, continued on slides 14 and 15: vertices A through I and the Scheduler queue]
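Residual-ordered scheduling can be sketched with a priority queue. To stay short, this sketch swaps the BP message update for a simpler stand-in (each free vertex moves to the average of its neighbours); the function name `residual_schedule`, the `fixed` set, and the tolerance are assumptions made for illustration only.

```python
import heapq

def residual_schedule(neighbors, values, fixed, tol=1e-6, max_updates=100000):
    values = dict(values)

    def proposed(v):
        return sum(values[n] for n in neighbors[v]) / len(neighbors[v])

    def residual(v):
        # How much an update of v would change its value right now.
        return abs(proposed(v) - values[v])

    # heapq is a min-heap, so store negated residuals to pop the largest first.
    heap = [(-residual(v), v) for v in neighbors if v not in fixed]
    heapq.heapify(heap)
    updates = 0
    while heap and updates < max_updates:
        _, v = heapq.heappop(heap)
        if residual(v) < tol:        # queue entry may be stale; recheck
            continue
        values[v] = proposed(v)
        updates += 1
        # Neighbours' residuals changed, so reschedule them.
        for n in neighbors[v]:
            if n not in fixed:
                heapq.heappush(heap, (-residual(n), n))
    return values, updates
```

Vertices whose residual is already below the tolerance are never touched, which is the effect the slides describe as not spending time on converged nodes.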
16 Residual BP converges faster [Elidan et al., UAI 2006]. [Figure: % of runs converged vs. time in seconds, for AGBP and RGBP]
17 Summary: Asynchronous serial graph algorithms can converge faster than synchronous parallel graph algorithms. Is there a way to correctly transform asynchronous serial algorithms to run in a parallel setting?
18 GRAPHLAB
19 GraphLab: Data Graph; Shared Data Table; Scheduling; Update Functions and Scopes
20 Data Graph: a graph with data associated with every vertex and edge. [Figure: vertices X_1 through X_11. Vertex data, e.g. x_3: current belief. Edge data, e.g. Φ(X_6, X_9): binary potential]
21 Update Functions: update functions are operations which are applied on a vertex and transform the data in the scope of the vertex. BP Update: - Read messages on adjacent edges - Read edge potentials - Compute a new belief for the current vertex - Write new messages on edges
22-23 Update Function Schedule. [Figure, repeated on slides 22 and 23: scheduled vertex tasks a through k, CPU 1 and CPU 2]
24 Static Schedule: the scheduler determines the order of update function evaluations. Synchronous Schedule: every vertex updated simultaneously. Round Robin Schedule: every vertex updated sequentially.
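The difference between the two static schedules can be sketched as two sweep functions. This is an illustrative sketch: the `update` callback signature and the max-propagation example are assumptions, not GraphLab's interface.

```python
def synchronous_sweep(neighbors, values, update):
    # Every vertex reads the values from the previous sweep.
    old = dict(values)
    return {v: update(v, old, neighbors) for v in values}

def round_robin_sweep(neighbors, values, update):
    # Vertices are updated one after another and see earlier updates.
    current = dict(values)
    for v in sorted(current):
        current[v] = update(v, current, neighbors)
    return current

def max_update(v, values, neighbors):
    return max([values[v]] + [values[n] for n in neighbors[v]])

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
start = {0: 9, 1: 1, 2: 1, 3: 1}
print(synchronous_sweep(chain, start, max_update))  # {0: 9, 1: 9, 2: 1, 3: 1}
print(round_robin_sweep(chain, start, max_update))  # {0: 9, 1: 9, 2: 9, 3: 9}
```

In one sweep the synchronous schedule moves the maximum a single hop, while the round-robin schedule carries it along the whole chain because each vertex sees its predecessor's fresh value.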
25 Need for Dynamic Scheduling. [Figure labels: Converged; Slowly Converging; Focus Effort]
26 Dynamic Schedule. [Figure: scheduled vertex tasks a through k, CPU 1 and CPU 2]
27 Dynamic Schedule: update functions can insert new tasks into the schedule. FIFO Queue: Wildfire BP [Selvatici et al.]. Priority Queue: Residual BP [Elidan et al.]. Splash Schedule: Splash BP [Gonzalez et al.].
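A dynamic FIFO schedule can be sketched as a work queue that update functions feed. The example update below is a shortest-path relaxation chosen only because it is short; `run_dynamic` and `sssp_update` are hypothetical names, and the convention that an update returns the vertices to reschedule is an assumption of this sketch.

```python
from collections import deque

def run_dynamic(graph, data, update, initial):
    queue = deque(initial)
    queued = set(initial)            # avoid duplicate pending tasks
    executed = 0
    while queue:
        v = queue.popleft()
        queued.discard(v)
        # The update function returns the vertices it wants rescheduled.
        for task in update(v, graph, data):
            if task not in queued:
                queue.append(task)
                queued.add(task)
        executed += 1
    return executed

def sssp_update(v, graph, dist):
    # graph[v] maps each neighbour to the weight of the connecting edge.
    best = min([dist[v]] + [dist[u] + w for u, w in graph[v].items()])
    if best < dist[v]:
        dist[v] = best
        return list(graph[v])        # value changed: reschedule neighbours
    return []
```

Swapping the deque for a priority queue keyed on the size of the change would give a residual-style schedule with the same driver loop.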
28 Global Information: What if we need global information? Algorithm parameters? Sufficient statistics? Sum of all the vertices?
29 Shared Data Table (SDT): global constant parameters. Constant: Temperature. Constant: Total # Samples.
30 Sync Operation: Sync is a fold/reduce operation over the graph. Accumulate performs an aggregation over vertices. Apply makes a final modification to the accumulated data. Example: compute the average of all the vertices. Accumulate function: Add. Apply function: Divide by |V|.
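The average example can be written as a small fold followed by a final step. This is a sketch only; the `sync` signature, including passing the vertex count to `apply`, is an assumption and not GraphLab's actual interface.

```python
from functools import reduce

def sync(vertex_data, accumulate, apply, initial):
    # Accumulate: fold over the data on every vertex.
    accumulated = reduce(accumulate, vertex_data.values(), initial)
    # Apply: one final modification of the accumulated value.
    return apply(accumulated, len(vertex_data))

vertex_data = {"a": 1.0, "b": 2.0, "c": 6.0}
average = sync(
    vertex_data,
    accumulate=lambda acc, x: acc + x,      # Add
    apply=lambda total, n: total / n,       # Divide by |V|
    initial=0.0,
)
print(average)  # 3.0
```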
31 Shared Data Table (SDT): global constant parameters and global computation (Sync Operation). Constant: Temperature. Constant: Total # Samples. Sync: Loglikelihood. Sync: Sample Statistics.
32 Safety and Consistency
33 Write-Write Race: occurs if adjacent update functions write simultaneously. [Figure labels: left update writes; right update writes; final value]
34 Race Conditions + Deadlocks: just one of the many possible races. Race-free code is extremely difficult to write. GraphLab design ensures race-free operation.
35 Scope Rules: guaranteed safety for all update functions.
36 Full Consistency: only allow update functions two vertices apart to be run in parallel. Reduced opportunities for parallelism.
37 Obtaining More Parallelism: not all update functions will modify the entire scope! Belief Propagation: only uses edge data. Gibbs Sampling: only needs to read adjacent vertices.
38 Edge Consistency
39 Obtaining More Parallelism: map operations, e.g. feature extraction on vertex data.
40 Vertex Consistency
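One way to read the three consistency models is as a test for whether two update functions may run at the same time. The sketch below assumes an undirected adjacency dict and interprets full consistency as "scopes must not overlap" (not adjacent and no shared neighbour); `can_run_in_parallel` is an illustrative helper, not a GraphLab call.

```python
def can_run_in_parallel(graph, u, v, model):
    if u == v:
        return False
    adjacent = v in graph[u]
    if model == "vertex":
        # Each update touches only its own vertex data.
        return True
    if model == "edge":
        # Updates write their own vertex and adjacent edges,
        # so adjacent vertices must not run together.
        return not adjacent
    if model == "full":
        # Updates may write their whole scope, so scopes must not overlap.
        shared = set(graph[u]) & set(graph[v])
        return not adjacent and not shared
    raise ValueError("unknown consistency model: " + model)
```

On a chain a, b, c, d this allows (a, c) under edge consistency but only (a, d) under full consistency, which matches the reduced parallelism noted for the full model.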
41 Sequential Consistency: GraphLab guarantees sequential consistency. For every parallel execution, there exists a sequential execution of update functions which will produce the same result. [Figure: parallel execution on CPU 1 and CPU 2 over time vs. sequential execution on CPU 1]
42 GraphLab: Data Graph; Shared Data Table; Scheduling; Update Functions and Scopes
43 DISTRIBUTED GRAPHLAB
44 Distributing GraphLab: NOT SHARED-NOTHING (unlike MapReduce / Pregel). Need to have distributed shared memory. No change to the update step. Need to do distributed scheduling. Need to ensure distributed consistency. Need to ensure fault tolerance.
45 Distributed Graph: partition the graph across multiple machines.
46 Distributed Graph: ghost vertices maintain adjacency structure and replicate remote data.
47 Distributed Graph: cut efficiently using HPC graph partitioning tools (ParMetis / Scotch / ...).
48 Update Functions: user-defined program, applied to a vertex, that transforms data in the scope of the vertex.
Pagerank(scope){
  // Update the current vertex data
  vertex.pagerank = α
  ForEach inpage:
    vertex.pagerank += (1 - α) × inpage.pagerank
  // Reschedule neighbors if needed
  if vertex.pagerank changes then reschedule_all_neighbors;
}
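A runnable Python sketch of the PageRank update function with rescheduling. Two details are assumptions of this sketch rather than something the slide states: each in-page's rank is divided by its out-degree, and only out-neighbours are rescheduled because only they read the changed value. The function name `pagerank` and the tolerance are illustrative.

```python
from collections import deque

def pagerank(out_links, alpha=0.15, tol=1e-6):
    # Build the reverse adjacency so each vertex can read its in-pages.
    in_links = {v: [] for v in out_links}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)

    rank = {v: 1.0 for v in out_links}
    queue = deque(out_links)
    queued = set(out_links)
    while queue:
        v = queue.popleft()
        queued.discard(v)
        # Update the current vertex data.
        new_rank = alpha + (1 - alpha) * sum(
            rank[u] / len(out_links[u]) for u in in_links[v])
        changed = abs(new_rank - rank[v]) > tol
        rank[v] = new_rank
        # Reschedule neighbours only if the value actually changed.
        if changed:
            for n in out_links[v]:
                if n not in queued:
                    queue.append(n)
                    queued.add(n)
    return rank
```

Vertices whose rank has stopped changing drop out of the queue, so work concentrates on the part of the graph that is still converging.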
49 Distributed Scheduling: each machine maintains a schedule over the vertices it owns. Distributed consensus is used to identify completion. [Figure: per-machine schedules over vertices a through k]
50 Distributed Consistency. Solution 1: Graph Coloring. Solution 2: Distributed Locking.
51 Edge Consistency via Graph Coloring: vertices of the same color are all at least one vertex apart. Therefore, all vertices of the same color can be run in parallel!
52 Chromatic Distributed Engine. [Figure, timeline on each machine: execute tasks on all vertices of color 0; ghost synchronization, completion + barrier; execute tasks on all vertices of color 1; ghost synchronization, completion + barrier]
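The color-step idea can be sketched on a single machine: color the graph, then process one color at a time with a barrier in between. The greedy coloring and the names `greedy_coloring` and `chromatic_sweep` are assumptions of this sketch; the slides only require that some coloring is available.

```python
def greedy_coloring(graph):
    color = {}
    for v in sorted(graph):
        used = {color[n] for n in graph[v] if n in color}
        c = 0
        while c in used:             # smallest color not used by a neighbour
            c += 1
        color[v] = c
    return color

def chromatic_sweep(graph, data, update, color):
    for c in sorted(set(color.values())):
        batch = [v for v in graph if color[v] == c]
        # No two vertices in the batch are adjacent, so these updates
        # could run in parallel; here they are simply evaluated in turn.
        results = {v: update(v, graph, data) for v in batch}
        data.update(results)         # stands in for ghost sync + barrier
    return data

def max_of_scope(v, graph, data):
    return max([data[v]] + [data[n] for n in graph[v]])
```

Because results within a color are computed before any of them is written back, the order inside a batch cannot affect the outcome.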
53 Problems: requires a graph coloring to be available. Frequent barriers make it extremely inefficient for highly dynamic systems where only a small number of vertices are active in each round.
54 Distributed Consistency. Solution 1: Graph Coloring. Solution 2: Distributed Locking.
55 Distributed Locking: edge consistency can be guaranteed through locking. [Figure legend: RW Lock]
56 Consistency Through Locking: acquire a write-lock on the center vertex and read-locks on the adjacent vertices.
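The lock set for one update can be sketched as a plan of (vertex, mode) pairs. Sorting the plan by vertex id is a standard way to avoid deadlock and is an assumption here, as are the helper names `lock_plan` and `conflicts`; no real locks are taken in this sketch.

```python
def lock_plan(graph, v):
    # Write-lock on the center vertex, read-locks on adjacent vertices.
    plan = [(v, "write")] + [(n, "read") for n in graph[v]]
    # Acquire in a fixed global order so two updates cannot wait on each other.
    return sorted(plan)

def conflicts(plan_a, plan_b):
    modes_a = dict(plan_a)
    for vertex, mode_b in plan_b:
        mode_a = modes_a.get(vertex)
        # Two read-locks on the same vertex are compatible; anything else is not.
        if mode_a is not None and "write" in (mode_a, mode_b):
            return True
    return False

graph = {1: [2, 3], 2: [1], 3: [1]}
print(lock_plan(graph, 2))                                 # [(1, 'read'), (2, 'write')]
print(conflicts(lock_plan(graph, 2), lock_plan(graph, 3)))  # False
print(conflicts(lock_plan(graph, 1), lock_plan(graph, 2)))  # True
```

Vertices 2 and 3 share only a read-locked neighbour and can proceed together, while the adjacent pair 1 and 2 cannot, which is the edge-consistency rule expressed through locks.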
57 Consistency Through Locking. Multicore setting: PThread RW-locks. Distributed setting: distributed locks. Challenge: latency. Solution: pipelining. [Figure: vertices A through D on one CPU vs. split across Machine 1 and Machine 2]
58 No Pipelining. [Figure, timeline labels: lock scope 1; scope 1 acquired; update_function 1; release scope 1; process request 1; process release 1]
59 Pipelining / Latency Hiding: hide latency using pipelining. [Figure, timeline labels: lock scope 1, lock scope 2, lock scope 3; scope 1 acquired, scope 2 acquired, scope 3 acquired; update_function 1; release scope 1; update_function 2; release scope 2; process request 1, process request 2, process request 3; process release 1]
60 Checkpoints for Fault Tolerance. 1: Stop the world. 2: Write state to disk.
61 Snapshot Performance: because we have to stop the world, one slow machine slows everything down! [Figure: vertices updated vs. time elapsed (s); curve labels: no snapshot, sync. snapshot, async. snapshot, one slow machine]
62 Better Checkpointing: based on [Chandy, Lamport 85]. Edge consistent update function.
Algorithm 5: Snapshot Update on vertex v
  if v was already snapshotted then Quit
  Save D_v // Save current vertex
  foreach u ∈ N[v] do // Loop over neighbors
    if u was not snapshotted then
      Save data on edge D_{u↔v}
      Schedule u for a Snapshot Update
  Mark v as snapshotted
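Algorithm 5 translates almost line for line into Python. This sketch runs the snapshot updates from a single FIFO queue on one machine and assumes a connected, undirected graph whose edge data is keyed by `frozenset((u, v))`; the function name `async_snapshot` and these data layouts are illustrative.

```python
from collections import deque

def async_snapshot(graph, vertex_data, edge_data, start):
    saved_vertices, saved_edges = {}, {}
    snapshotted = set()
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v in snapshotted:                 # already snapshotted: quit
            continue
        saved_vertices[v] = vertex_data[v]   # save current vertex
        for u in graph[v]:                   # loop over neighbours
            if u not in snapshotted:
                key = frozenset((u, v))
                saved_edges[key] = edge_data[key]   # save data on edge u<->v
                queue.append(u)              # schedule u for a snapshot update
        snapshotted.add(v)                   # mark v as snapshotted
    return saved_vertices, saved_edges
```

Each edge is saved by whichever endpoint is snapshotted first, so the snapshot spreads through the graph as ordinary scheduled work instead of requiring a global stop.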
63 Async. Snapshot Performance: no penalty incurred by the slow machine! [Figure: vertices updated vs. time elapsed (s); curve labels: no snapshot, sync. snapshot, async. snapshot, one slow machine]
64 Summary: Asynchronous serial graph algorithms can converge faster than synchronous parallel graph algorithms. GraphLab provides high-level abstractions for writing asynchronous graph algorithms and takes care of consistency and scheduling. Distributed GraphLab: graph processing using color-steps; consistency ensured via pipelined distributed locking; fault tolerance via fine-grained checkpointing.
GraphLab: A New Framework for Parallel Machine Learning
GraphLab: A New Framework for Parallel Machine Learning Yucheng Low, Aapo Kyrola, Carlos Guestrin, Joseph Gonzalez, Danny Bickson, Joe Hellerstein Presented by Guozhang Wang DB Lunch, Nov.8, 2010 Overview
More informationPutting it together. Data-Parallel Computation. Ex: Word count using partial aggregation. Big Data Processing. COS 418: Distributed Systems Lecture 21
Big Processing -Parallel Computation COS 418: Distributed Systems Lecture 21 Michael Freedman 2 Ex: Word count using partial aggregation Putting it together 1. Compute word counts from individual files
More informationGraph Processing. Connor Gramazio Spiros Boosalis
Graph Processing Connor Gramazio Spiros Boosalis Pregel why not MapReduce? semantics: awkward to write graph algorithms efficiency: mapreduces serializes state (e.g. all nodes and edges) while pregel keeps
More informationGraph-Parallel Problems. ML in the Context of Parallel Architectures
Case Study 4: Collaborative Filtering Graph-Parallel Problems Synchronous v. Asynchronous Computation Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox February 20 th, 2014
More informationLarge Scale Graph Processing Pregel, GraphLab and GraphX
Large Scale Graph Processing Pregel, GraphLab and GraphX Amir H. Payberah amir@sics.se KTH Royal Institute of Technology Amir H. Payberah (KTH) Large Scale Graph Processing 2016/10/03 1 / 76 Amir H. Payberah
More informationarxiv: v1 [cs.db] 26 Apr 2012
Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud Yucheng Low Carnegie Mellon University ylow@cs.cmu.edu Joseph Gonzalez Carnegie Mellon University jegonzal@cs.cmu.edu
More informationIntro to dataflow analysis. CSE 501 Spring 15
Intro to dataflow analysis CSE 501 Spring 15 Announcements Paper commentaries Please post them 24 hours before class ApplicaBon paper presentabons Good training for conference talks! Will help go through
More informationJoseph hgonzalez. A New Parallel Framework for Machine Learning. Joint work with. Yucheng Low. Aapo Kyrola. Carlos Guestrin.
A New Parallel Framework for Machine Learning Joseph hgonzalez Joint work with Yucheng Low Aapo Kyrola Danny Bickson Carlos Guestrin Alex Smola Guy Blelloch Joe Hellerstein David O Hallaron Carnegie Mellon
More informationCmpE 138 Spring 2011 Special Topics L2
CmpE 138 Spring 2011 Special Topics L2 Shivanshu Singh shivanshu.sjsu@gmail.com Map Reduce ElecBon process Map Reduce Typical single node architecture Applica'on CPU Memory Storage Map Reduce Applica'on
More informationCase Study 4: Collaborative Filtering. GraphLab
Case Study 4: Collaborative Filtering GraphLab Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Carlos Guestrin March 14 th, 2013 Carlos Guestrin 2013 1 Social Media
More informationWEEK 4.3. ECE124 Digital Circuits and Systems Page 1
WEEK 4.3 ECE124 Digital Circuits and Systems Page 1 Decoders implemented with NAND gates SomeBmes, in implementabon decoders are done with NAND gates rather than AND gates. With NAND gates, the table illustrabng
More informationFault Tolerant Distributed Main Memory Systems
Fault Tolerant Distributed Main Memory Systems CompSci 590.04 Instructor: Ashwin Machanavajjhala Lecture 16 : 590.04 Fall 15 1 Recap: Map Reduce! map!!,!! list!!!!reduce!!, list(!! )!! Map Phase (per record
More informationPregel: A System for Large- Scale Graph Processing. Written by G. Malewicz et al. at SIGMOD 2010 Presented by Chris Bunch Tuesday, October 12, 2010
Pregel: A System for Large- Scale Graph Processing Written by G. Malewicz et al. at SIGMOD 2010 Presented by Chris Bunch Tuesday, October 12, 2010 1 Graphs are hard Poor locality of memory access Very
More informationAutomatic Scaling Iterative Computations. Aug. 7 th, 2012
Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics
More informationGraph Processing & Bulk Synchronous Parallel Model
Graph Processing & Bulk Synchronous Parallel Model CompSci 590.03 Instructor: Ashwin Machanavajjhala Lecture 14 : 590.02 Spring 13 1 Recap: Graph Algorithms Many graph algorithms need iterafve computafon
More informationDEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Year & Semester Section Subject Code Subject Name Degree & Branch : I & II : M.E : CP7204 : Advanced Operating Systems : M.E C.S.E. 1. Define Process? UNIT-1
More informationGraphHP: A Hybrid Platform for Iterative Graph Processing
GraphHP: A Hybrid Platform for Iterative Graph Processing Qun Chen, Song Bai, Zhanhuai Li, Zhiying Gou, Bo Suo and Wei Pan Northwestern Polytechnical University Xi an, China {chenbenben, baisong, lizhh,
More informationPREGEL: A SYSTEM FOR LARGE-SCALE GRAPH PROCESSING
PREGEL: A SYSTEM FOR LARGE-SCALE GRAPH PROCESSING Grzegorz Malewicz, Matthew Austern, Aart Bik, James Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski (Google, Inc.) SIGMOD 2010 Presented by : Xiu
More informationLecture on Storage Systems
Lecture on Storage Systems Storage Systems and OS Kernels André Brinkmann Agenda How can we represent block devices in the kernel and process requests? RepresentaBon of storage systems as block devices
More informationIO System. CP-226: Computer Architecture. Lecture 25 (24 April 2013) CADSL
IO System Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 14: Distributed Graph Processing Motivation Many applications require graph processing E.g., PageRank Some graph data sets are very large
More informationAuthors: Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, L., Leiser, N., Czjkowski, G.
Authors: Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, L., Leiser, N., Czjkowski, G. Speaker: Chong Li Department: Applied Health Science Program: Master of Health Informatics 1 Term
More informationDomain-specific programming on graphs
Lecture 25: Domain-specific programming on graphs Parallel Computer Architecture and Programming CMU 15-418/15-618, Fall 2016 1 Last time: Increasing acceptance of domain-specific programming systems Challenge
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 14: Distributed Graph Processing Motivation Many applications require graph processing E.g., PageRank Some graph data sets are very large
More informationmodern database systems lecture 10 : large-scale graph processing
modern database systems lecture 1 : large-scale graph processing Aristides Gionis spring 18 timeline today : homework is due march 6 : homework out april 5, 9-1 : final exam april : homework due graphs
More informationSTREAMER: a Distributed Framework for Incremental Closeness Centrality
STREAMER: a Distributed Framework for Incremental Closeness Centrality Computa@on A. Erdem Sarıyüce 1,2, Erik Saule 4, Kamer Kaya 1, Ümit V. Çatalyürek 1,3 1 Department of Biomedical InformaBcs 2 Department
More informationECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective
ECE 60 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 3: Programming Models Pregel: A System for Large-Scale Graph Processing
More informationFrameworks for Graph-Based Problems
Frameworks for Graph-Based Problems Dakshil Shah U.G. Student Computer Engineering Department Dwarkadas J. Sanghvi College of Engineering, Mumbai, India Chetashri Bhadane Assistant Professor Computer Engineering
More informationPREGEL AND GIRAPH. Why Pregel? Processing large graph problems is challenging Options
Data Management in the Cloud PREGEL AND GIRAPH Thanks to Kristin Tufte 1 Why Pregel? Processing large graph problems is challenging Options Custom distributed infrastructure Existing distributed computing
More informationCSE 153 Design of Operating Systems
CSE 153 Design of Operating Systems Winter 2018 Midterm Review Midterm in class on Monday Covers material through scheduling and deadlock Based upon lecture material and modules of the book indicated on
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS THANKS TO M. GROSSNIKLAUS
Sharding and Replica@on Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS THANKS TO M. GROSSNIKLAUS Sharding Breaking a database into several collecbons (shards) Each data item (e.g.,
More informationCase Study 4: Collabora1ve Filtering
Case Study 4: Collabora1ve Filtering Graph- Parallel Problems Synchronous v. Asynchronous ComputaPon Machine Learning for Big Data CSE547/STAT548, University of Washington Carlos Guestrin, guest lecturer
More informationPREGEL: A SYSTEM FOR LARGE- SCALE GRAPH PROCESSING
PREGEL: A SYSTEM FOR LARGE- SCALE GRAPH PROCESSING G. Malewicz, M. Austern, A. Bik, J. Dehnert, I. Horn, N. Leiser, G. Czajkowski Google, Inc. SIGMOD 2010 Presented by Ke Hong (some figures borrowed from
More informationLecture 1: Introduction to distributed Algorithms
Distributed Algorithms M.Tech., CSE, 2016 Lecture 1: Introduction to distributed Algorithms Faculty: K.R. Chowdhary : Professor of CS Disclaimer: These notes have not been subjected to the usual scrutiny
More informationParallel Gibbs Sampling From Colored Fields to Thin Junction Trees
Parallel Gibbs Sampling From Colored Fields to Thin Junction Trees Joseph Gonzalez Yucheng Low Arthur Gretton Carlos Guestrin Draw Samples Sampling as an Inference Procedure Suppose we wanted to know the
More informationParallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?
Parallel and Distributed Systems Instructor: Sandhya Dwarkadas Department of Computer Science University of Rochester What is a parallel computer? A collection of processing elements that communicate and
More informationDistributed Systems. 21. Graph Computing Frameworks. Paul Krzyzanowski. Rutgers University. Fall 2016
Distributed Systems 21. Graph Computing Frameworks Paul Krzyzanowski Rutgers University Fall 2016 November 21, 2016 2014-2016 Paul Krzyzanowski 1 Can we make MapReduce easier? November 21, 2016 2014-2016
More informationChapter 13: I/O Systems
COP 4610: Introduction to Operating Systems (Spring 2015) Chapter 13: I/O Systems Zhi Wang Florida State University Content I/O hardware Application I/O interface Kernel I/O subsystem I/O performance Objectives
More information61A LECTURE 27 PARALLELISM. Steven Tang and Eric Tzeng August 8, 2013
61A LECTURE 27 PARALLELISM Steven Tang and Eric Tzeng August 8, 2013 Announcements Practice Final Exam Sessions Worth 2 points extra credit just for taking it Sign-up instructions on Piazza (computer based
More informationEECS 571 Principles of Real-Time Embedded Systems. Lecture Note #10: More on Scheduling and Introduction of Real-Time OS
EECS 571 Principles of Real-Time Embedded Systems Lecture Note #10: More on Scheduling and Introduction of Real-Time OS Kang G. Shin EECS Department University of Michigan Mode Changes Changes in mission
More informationGraphs (Part II) Shannon Quinn
Graphs (Part II) Shannon Quinn (with thanks to William Cohen and Aapo Kyrola of CMU, and J. Leskovec, A. Rajaraman, and J. Ullman of Stanford University) Parallel Graph Computation Distributed computation
More informationThreads, Synchronization, and Scheduling. Eric Wu
Threads, Synchronization, and Scheduling Eric Wu (ericwu@cs) Topics for Today Project 2 Due tomorrow! Project 3 Due Feb. 17 th! Threads Synchronization Scheduling Project 2 Troubleshooting: Stock kernel
More informationPregel. Ali Shah
Pregel Ali Shah s9alshah@stud.uni-saarland.de 2 Outline Introduction Model of Computation Fundamentals of Pregel Program Implementation Applications Experiments Issues with Pregel 3 Outline Costs of Computation
More informationPregel: A System for Large-Scale Graph Proces sing
Pregel: A System for Large-Scale Graph Proces sing Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkwoski Google, Inc. SIGMOD July 20 Taewhi
More informationCSE 486/586 Distributed Systems
CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo CSE 486/586 Recap: Consensus On a synchronous system There s an algorithm that works. On
More informationLecture 23 Database System Architectures
CMSC 461, Database Management Systems Spring 2018 Lecture 23 Database System Architectures These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used
More informationECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning
ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Topics Markov Random Fields: Inference Exact: VE Exact+Approximate: BP Readings: Barber 5 Dhruv Batra
More informationPREGEL. A System for Large Scale Graph Processing
PREGEL A System for Large Scale Graph Processing The Problem Large Graphs are often part of computations required in modern systems (Social networks and Web graphs etc.) There are many graph computing
More informationI/O Management and Disk Scheduling. Chapter 11
I/O Management and Disk Scheduling Chapter 11 Categories of I/O Devices Human readable used to communicate with the user video display terminals keyboard mouse printer Categories of I/O Devices Machine
More informationThread Coordination -Managing Concurrency
Thread Coordination -Managing Concurrency David E. Culler CS162 Operating Systems and Systems Programming Lecture 8 Sept 17, 2014 h
More informationExam 2 Review. October 29, Paul Krzyzanowski 1
Exam 2 Review October 29, 2015 2013 Paul Krzyzanowski 1 Question 1 Why did Dropbox add notification servers to their architecture? To avoid the overhead of clients polling the servers periodically to check
More informationFrequently asked questions from the previous class survey
CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED COORDINATION/MUTUAL EXCLUSION] Shrideep Pallickara Computer Science Colorado State University L22.1 Frequently asked questions from the previous
More informationCS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University
Frequently asked questions from the previous class survey CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED COORDINATION/MUTUAL EXCLUSION] Shrideep Pallickara Computer Science Colorado State University
More informationFine-grained Transaction Scheduling in Replicated Databases via Symbolic Execution
Fine-grained Transaction Scheduling in Replicated Databases via Symbolic Execution Raminhas pedro.raminhas@tecnico.ulisboa.pt Stage: 2 nd Year PhD Student Research Area: Dependable and fault-tolerant systems
More informationLarge-Scale Graph Processing 1: Pregel & Apache Hama Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC
Large-Scale Graph Processing 1: Pregel & Apache Hama Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC Lecture material is mostly home-grown, partly taken with permission and courtesy from Professor Shih-Wei
More informationCMPSCI 677 Operating Systems Spring Lecture 14: March 9
CMPSCI 677 Operating Systems Spring 2014 Lecture 14: March 9 Lecturer: Prashant Shenoy Scribe: Nikita Mehra 14.1 Distributed Snapshot Algorithm A distributed snapshot algorithm captures a consistent global
More informationAsynchronous and Fault-Tolerant Recursive Datalog Evalua9on in Shared-Nothing Engines
Asynchronous and Fault-Tolerant Recursive Datalog Evalua9on in Shared-Nothing Engines Jingjing Wang, Magdalena Balazinska, Daniel Halperin University of Washington Modern Analy>cs Requires Itera>on Graph
More informationGraph Processing Frameworks
Graph Processing Frameworks Lecture 24 CSCI 4974/6971 5 Dec 2016 1 / 13 Today s Biz 1. Reminders 2. Review 3. Graph Processing Frameworks 4. 2D Partitioning 2 / 13 Reminders Assignment 6: due date Dec
More informationEECS 498 Introduction to Distributed Systems
EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Replicated State Machines Logical clocks Primary/ Backup Paxos? 0 1 (N-1)/2 No. of tolerable failures October 11, 2017 EECS 498
More informationProgramming Languages and Techniques (CIS120)
Programming Languages and Techniues (CIS120) Lecture 14 Feb 13, 2012 ImperaBve Queues Homework 4 due at midnight Announcements Homework 5 (ueues) will be available on the web aker the exam It is due Thurs
More informationData Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 14: Data Replication Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database Replication What is database replication The advantages of
More informationParallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez
Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez December 2012 CMU-ML-12-111 Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez December 2012 CMU-ML-12-111
More informationAchieving Century Uptimes An Informational Series on Enterprise Computing
Achieving Century Uptimes An Informational Series on Enterprise Computing As Seen in The Connection, A Connect Publication December 2006 Present About the Authors: Dr. Bill Highleyman, Paul J. Holenstein,
More informationConcurrent Disjoint Set Union. Robert E. Tarjan Princeton University & Intertrust Technologies joint work with Siddhartha JayanB, Princeton
Concurrent Disjoint Set Union Robert E. Tarjan Princeton University & Intertrust Technologies joint work with Siddhartha JayanB, Princeton Key messages Ideas and results from sequenbal algorithms can carry
More informationLecture 17. Intro to Instruc.on Scheduling. Reading: Chapter Carnegie Mellon Todd C. Mowry 15745: Intro to Scheduling 1
Lecture 17 Intro to Instruc.on Scheduling Reading: Chapter 10.1 10.2 15745: Intro to Scheduling 1 OpBmizaBon: What s the Point? (A Quick Review) Machine- Independent OpBmizaBons: e.g., constant propagabon
More informationConcurrency and OS recap. Based on Operating System Concepts with Java, Sixth Edition, 2003, Avi Silberschatz, Peter Galvin e Greg Gagne
Concurrency and OS recap Based on Operating System Concepts with Java, Sixth Edition, 2003, Avi Silberschatz, Peter Galvin e Greg Gagne 64 Process Concept An operating system executes a variety of programs:
More informationCMSC Computer Architecture Lecture 12: Multi-Core. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 12: Multi-Core Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 4 " Due: 11:49pm, Saturday " Two late days with penalty! Exam I " Grades out on
More informationLogisBcs. CS 6140: Machine Learning Spring K-means Algorithm. Today s Outline 3/27/16
LogisBcs CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and InformaBon Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Exam
More informationLecture 22 : Distributed Systems for ML
10-708: Probabilistic Graphical Models, Spring 2017 Lecture 22 : Distributed Systems for ML Lecturer: Qirong Ho Scribes: Zihang Dai, Fan Yang 1 Introduction Big data has been very popular in recent years.
More informationKing Abdullah University of Science and Technology. CS348: Cloud Computing. Large-Scale Graph Processing
King Abdullah University of Science and Technology CS348: Cloud Computing Large-Scale Graph Processing Zuhair Khayyat 10/March/2013 The Importance of Graphs A graph is a mathematical structure that represents
More informationWhy do we need graph processing?
Why do we need graph processing? Community detection: suggest followers? Determine what products people will like Count how many people are in different communities (polling?) Graphs are Everywhere Group
More informationCS /21/2016. Paul Krzyzanowski 1. Can we make MapReduce easier? Distributed Systems. Apache Pig. Apache Pig. Pig: Loading Data.