Nikos Anastopoulos, Konstantinos Nikas, Georgios Goumas and Nectarios Koziris

Size: px
Start display at page:

Download "Nikos Anastopoulos, Konstantinos Nikas, Georgios Goumas and Nectarios Koziris"

Transcription

1 Early Experiences on Accelerating Dijkstra s Algorithm Using Transactional Memory Nikos Anastopoulos, Konstantinos Nikas, Georgios Goumas and Nectarios Koziris Computing Systems Laboratory School of Electrical and Computer Engineering National Technical University of Athens {anastop,knikas,goumas,nkoziris}@cslab.ece.ntua.gr May 29, 29 CSLab National Technical University of Athens

2 Outline 1 Dijkstra s Basics 2 Straightforward Parallelization Scheme 3 Helper-Threading Scheme 4 Experimental Evaluation 5 Conclusions Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 2 / 27

3 The Basics of Dijkstra s Algorithm SSSP Problem Directed graph G = (V, E), weight function w : E R +, source vertex s v V : compute δ(v) = min{w(p) : s p v} Shortest path estimate d(v) gradually converges to δ(v) through relaxations relax (v,w): d(w) = min{d(w), d(v) + w(v, w)} can we find a better path s p w by going through v? Three partitions of vertices Settled: d(v) = δ(v) Queued: d(v) > δ(v) and d(v) Unreached: d(v) = Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 3 / 27

4 The Basics of Dijkstra s Algorithm Serial algorithm Input : G = (V, E), w : E R +, source vertex s, min Q Output : shortest distance array d, predecessor array π foreach v V do d[v] inf; π[v] nil; Insert(Q, v); end d[s] ; while Q do u ExtractMin(Q); foreach v adjacent to u do sum d[u] + w(u, v); if d[v] > sum then DecreaseKey(Q, v, sum); d[v] sum; π[v] u; end end 55 B A C D S E 7 Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 4 / 27

5 The Basics of Dijkstra s Algorithm j i k i: Min-priority queue implemented as binary min-heap maintains all but the settled vertices min-heap property: i : d(parent(i)) d(i) amortizes the cost of multiple ExtractMin s and DecreaseKey s O(( E + V )log V ) time complexity Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 5 / 27

6 Straightforward Parallelization Fine-grain parallelization at the inner loop level Fine-Grain Multi-Threaded S /* Initialization phase same to the serial code */ while Q do Barrier if tid = then u ExtractMin(Q); Barrier for v adjacent to u in parallel do sum d[u] + w(u, v); if d[v] > sum then Begin-Atomic DecreaseKey(Q, v, sum); End-Atomic d[v] sum; π[v] u; end end Issues: 55 B A C D E speedup bounded by average out-degree concurrent heap updates due to DecreaseKey s barrier synchronization overhead Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 6 / 27

7 Concurrent Heap Updates with Locks x 3 x j j j: i k i:17 6 k: i: k Coarse-grain synchronization (cgs-lock) enforces atomicity at the level of a DecreaseKey operation one lock for the entire heap serializes DecreaseKey s Fine-grain synchronization (fgs-lock) enforces atomicity at the level of a single swap allows multiple swap sequences to execute in parallel as long as they are temporally non-overlapping separate locks for each parent-child pair Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 7 / 27

8 Performance of FGMT with Locks Multithreaded speedup cgs-lock perfbar+cgs-lock perfbar+fgs-lock Software barriers dominate total execution time 72% with 2 threads, % with replace with idealized (simulated) zero-latency barriers Fgs-lock scheme more scalable, but still fails to outperform serial locking overhead (2 locks + 2 unlocks per swap) Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 / 27

9 Concurrent Heap Updates with TM x 3 x j j j: i k i:17 6 k: i: k TM-based Coarse-grain synchronization (cgs-tm) enclose DecreaseKey within a transaction allows multiple swap sequences to execute in parallel as long as they are spatially (and temporally) non-overlapping conflicting transaction stalls and retries or aborts Fine-grain synchronization (fgs-tm) enclose each swap operation within a transaction atomicity as in fgs-lock shorter but more transactions Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 9 / 27

10 Performance of FGMT with TM Multithreaded speedup perfbar+cgs-lock perfbar+fgs-lock perfbar+cgs-tm perfbar+fgs-tm TM-based schemes offer speedup up to 1.1 less overhead for cgs-tm, yet equally able to exploit available concurrency Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 1 / 27

11 Helper-Threading Scheme Motivation expose more parallelism to each thread eliminate costly barrier synchronization Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

12 Helper-Threading Scheme Motivation expose more parallelism to each thread eliminate costly barrier synchronization Rationale in serial, relaxations are performed only from the extracted (settled) vertex Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

13 Helper-Threading Scheme Motivation expose more parallelism to each thread eliminate costly barrier synchronization Rationale in serial, relaxations are performed only from the extracted (settled) vertex allow relaxations for out-edges of queued vertices, hoping that some of them might already be settled main thread operates as in the serial algorithm assign the next t vertices in the queue (x 2... x t+1 ) to t helper threads helper thread k relaxes all out-edges of vertex x k i 55 B i-1 A C D 2 7 S E 7 speculation on the status of d(x k ) if already optimal, main thread will be offloaded if not optimal, any suboptimal relaxations will be corrected eventually by main thread Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

14 Execution Pattern Serial FGMT Helper Threads step k step k+1 step k+2 Main step k step k+1 step k+2 Thread 4 Thread 3 Thread 2 Thread 1 extract-min read tid th -min relax edges step k step k+1 step k+2 Main kill kill kill Helper 1 kill Helper 2 kill Helper 3 the main thread stops all helpers at the end of each iteration unfinished work will be corrected, as with mis-speculated distances Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

15 Helper-Threading Scheme Main thread Helper thread while Q do u ExtractMin(Q); done ; foreach v adjacent to u do sum d[u] + w(u, v); Begin-Xact if d[v] > sum then DecreaseKey(Q, v, sum); d[v] sum; π[v] u; End-Xact end Begin-Xact done 1; End-Xact end while Q do while done = 1 do ; x ReadMin(Q, tid) stop foreach y adjacent to x and while stop = do Begin-Xact if done = then sum d[x] + w(x, y) if d[y] > sum then DecreaseKey(Q, y, sum) d[y] sum π[y] x else stop 1 End-Xact end end for a single neighbour, the check for relaxation, updates to the heap, and updates to d,π arrays, are enclosed within a transaction performed all-or-none on a conflict, only one thread commits interruption of helper threads implemented through TM, as well Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

16 Helper-Threading Scheme Main thread 1 while Q do 2 u ExtractMin(Q); 3 done ; 4 foreach v adjacent to u do 5 sum d[u] + w(u, v); 6 Begin-Xact 7 if d[v] > sum then DecreaseKey(Q, v, sum); 9 d[v] sum; 1 π[v] u; 11 End-Xact 12 end 13 Begin-Xact 14 done 1; 15 End-Xact 16 end Why with TM? Helper thread while Q do while done = 1 do ; x ReadMin(Q, tid) stop foreach y adjacent to x and while stop = do Begin-Xact if done = then sum d[x] + w(x, y) if d[y] > sum then DecreaseKey(Q, y, sum) d[y] sum π[y] x else stop 1 End-Xact end end Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

17 Helper-Threading Scheme Main thread 1 while Q do 2 u ExtractMin(Q); 3 done ; 4 foreach v adjacent to u do 5 sum d[u] + w(u, v); 6 Begin-Xact 7 if d[v] > sum then DecreaseKey(Q, v, sum); 9 d[v] sum; 1 π[v] u; 11 End-Xact 12 end 13 Begin-Xact 14 done 1; 15 End-Xact 16 end Helper thread 1 while Q do 2 while done = 1 do ; 3 x ReadMin(Q, tid) 4 stop 5 foreach y adjacent to x and while stop = do 6 Begin-Xact 7 if done = then sum d[x] + w(x, y) 9 if d[y] > sum then 1 DecreaseKey(Q, y, sum) 11 d[y] sum 12 π[y] x 13 else 14 stop 1 15 End-Xact 16 end 17 end Why with TM? composable all dependent atomic sub-operations composed into a large atomic operation, without limiting concurrency optimistic easily programmable Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

18 Experimental Setup Full-system simulation Simics in conjunction with GEMS toolset 2.1 boots unmodified Solaris 1 (UltraSPARC III Cu) LogTM ( Signature Edition ) eager version management eager conflict detection on a conflict, a transaction stalls and either retries or aborts HYBRID conflict resolution policy favors older transactions Hardware platform Software single CMP system (configurations up to 32 cores) private L1 caches (64KB), shared L2 cache (2MB) Pthreads for threading and synchronization Simics magic instructions to simulate idealized barriers Sun Studio 12 C compiler (-xo3) Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

19 Graphs Three graph families Random: G (n, m) model SSCA#2: cliques with varying size (1 C ) connected with probability P R-MAT: power-law degree distributions GTgraph graph generator Fixed #nodes (1K), varying density sparse ( 1K edges) medium ( 1K edges) dense ( 2K edges) Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

20 Speedups ssca2-1x2351 ssca2-1x1153 rmat-1x perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper rmat-1x1 rand-1x1 rand-1x perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper Helper-Threading perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper perfbar+cgs-lock perfbar+cgs-tm perfbar+fgs-lock perfbar+fgs-tm helper speedups in 6 out of 9 cases (not all shown), up to 1.46 performance improves with increasing density main thread not obstructed by helpers (<1% abort rate in all cases) FGMT with TM speedups only with perfect barriers optimistic parallelism does exist in concurrent queue updates Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

21 Conclusions FGMT conventional synchronization mechanisms incur unacceptable overhead TM reduces overheads and highlights the existence of parallelism, but still requires very efficient barriers to offer some speedup HT with TM exposes more parallelism and eliminates barrier synchronization noteworthy speedups with minimal code extensions Future work more aggressive parallelization schemes dynamic adaptation of helper threads to algorithm s execution phases explore impact of TM characteristics applicability of HT on other SSSP algorithms ( -stepping, Bellman-Ford) and other similar ( greedy ) applications Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 1 / 27

22 Thank you! Questions? Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

23 BACKUP SLIDES Anastopoulos et al. (NTUA) MTAAP 9 May 29, 29 2 / 27

24 DecreaseKey Input : min-priority queue Q, vertex u, new key value for vertex u Q[u] value; i u; while (parent(i).key value) do Begin-Xact if swap(u, i) then i parent(i); End-Xact end Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

25 Distribution of DecreaseKey s between main and helper threads (a) 1Kx1K (b) 1Kx2K (c) 1Kx1K % DecreaseKey ops w.r.t. serial execution serial HT-main HT-helper HT-all % DecreaseKey ops w.r.t. serial execution serial HT-main HT-helper HT-all % DecreaseKey ops w.r.t. serial execution serial HT-main HT-helper HT-all Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

26 Overall and main thread transaction abort rates (a) 1Kx1K (b) 1Kx2K (c) 1Kx1K 3 25 overall main thread overall main thread..7 overall main thread Abort rate Abort rate Abort rate Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

27 Breakdown of main thread s total cycles (a) 1Kx1K (b) 1Kx2K (c) 1Kx1K Main thread cycle breakdown 1.6e+7 1.4e+7 1.2e+7 1e+7 e+6 6e+6 4e+6 2e+6 total cycles non-xact cycles xact cycles xact overhead cycles Main thread cycle breakdown 6e+7 5e+7 4e+7 3e+7 2e+7 1e+7 total cycles non-xact cycles xact cycles xact overhead cycles Main thread cycle breakdown 1.6e+ 1.4e+ 1.2e+ 1e+ e+7 6e+7 4e+7 2e+7 total cycles non-xact cycles xact cycles xact overhead cycles Non-transactional (non-xact), transactional (xact) and overhead (xact-overhead) Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

28 Percentage of useful (xact) and wasted (xact-overhead) transactional cycles (a) 1Kx1K (b) 1Kx2K (c) 1Kx1K %Cycle breakdown for average transaction xact cycles xact overhead cycles %Cycle breakdown for average transaction xact cycles xact overhead cycles %Cycle breakdown for average transaction xact cycles xact overhead cycles Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

29 Write-set sizes Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

30 The timeline of execution Cycles of main thread (per 1 iterations) 3.5e+6 3e+6 2.5e+6 2e+6 1.5e+6 1e+6 5 serial HT-2thr HT-4thr HT-best Outer loop iterations Anastopoulos et al. (NTUA) MTAAP 9 May 29, / 27

CSLab. Utilizing Underlying Synchronization Mechanisms for Efficient Support of Different Programming Models. Nikos Anastopoulos.

CSLab. Utilizing Underlying Synchronization Mechanisms for Efficient Support of Different Programming Models. Nikos Anastopoulos. Utilizing Underlying Synchronization Mechanisms for Efficient Support of Different Programming Models Nikos Anastopoulos Computing Systems Laboratory School of Electrical and Computer Engineering National

More information

Early Experiences on Accelerating Dijkstra s Algorithm Using Transactional Memory

Early Experiences on Accelerating Dijkstra s Algorithm Using Transactional Memory Early Experiences on Accelerating Dijkstra s Algorithm Using Transactional Memory Nikos Anastopoulos, Konstantinos Nikas, Georgios Goumas and Nectarios Koziris National Technical University of Athens School

More information

PPTA: Prefetch-Process-Thread-Alternation to Speed Up Dijkstra s Algorithm

PPTA: Prefetch-Process-Thread-Alternation to Speed Up Dijkstra s Algorithm : Prefetch-Process-Thread-Alternation to Speed Up Dijkstra s Algorithm Konstantinos Kanellopoulos, Konstantinos Nikas, Dimitrios Siakavaras, Vasileios Karakostas, Georgios Goumas and Nectarios Koziris

More information

Single Source Shortest Path

Single Source Shortest Path Single Source Shortest Path A directed graph G = (V, E) and a pair of nodes s, d is given. The edges have a real-valued weight W i. This time we are looking for the weight and the shortest path from s

More information

Computer Science & Engineering 423/823 Design and Analysis of Algorithms

Computer Science & Engineering 423/823 Design and Analysis of Algorithms Computer Science & Engineering 423/823 Design and Analysis of Algorithms Lecture 07 Single-Source Shortest Paths (Chapter 24) Stephen Scott and Vinodchandran N. Variyam sscott@cse.unl.edu 1/36 Introduction

More information

Algorithms on Graphs: Part III. Shortest Path Problems. .. Cal Poly CSC 349: Design and Analyis of Algorithms Alexander Dekhtyar..

Algorithms on Graphs: Part III. Shortest Path Problems. .. Cal Poly CSC 349: Design and Analyis of Algorithms Alexander Dekhtyar.. .. Cal Poly CSC 349: Design and Analyis of Algorithms Alexander Dekhtyar.. Shortest Path Problems Algorithms on Graphs: Part III Path in a graph. Let G = V,E be a graph. A path p = e 1,...,e k, e i E,

More information

Shortest Path Problem

Shortest Path Problem Shortest Path Problem CLRS Chapters 24.1 3, 24.5, 25.2 Shortest path problem Shortest path problem (and variants) Properties of shortest paths Algorithmic framework Bellman-Ford algorithm Shortest paths

More information

Dijkstra s Shortest Path Algorithm

Dijkstra s Shortest Path Algorithm Dijkstra s Shortest Path Algorithm DPV 4.4, CLRS 24.3 Revised, October 23, 2014 Outline of this Lecture Recalling the BFS solution of the shortest path problem for unweighted (di)graphs. The shortest path

More information

1 Dijkstra s Algorithm

1 Dijkstra s Algorithm Lecture 11 Dijkstra s Algorithm Scribes: Himanshu Bhandoh (2015), Virginia Williams, and Date: November 1, 2017 Anthony Kim (2016), G. Valiant (2017), M. Wootters (2017) (Adapted from Virginia Williams

More information

Single Source Shortest Paths

Single Source Shortest Paths Single Source Shortest Paths Given a connected weighted directed graph G(V, E), associated with each edge u, v E, there is a weight w(u, v). The single source shortest paths (SSSP) problem is to find a

More information

Algorithms for Data Science

Algorithms for Data Science Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Shortest paths in weighted graphs (Bellman-Ford, Floyd-Warshall) Outline 1 Shortest paths in graphs with

More information

from notes written mostly by Dr. Carla Savage: All Rights Reserved

from notes written mostly by Dr. Carla Savage: All Rights Reserved CSC 505, Fall 2000: Week 9 Objectives: learn about various issues related to finding shortest paths in graphs learn algorithms for the single-source shortest-path problem observe the relationship among

More information

Lecture 11: Analysis of Algorithms (CS ) 1

Lecture 11: Analysis of Algorithms (CS ) 1 Lecture 11: Analysis of Algorithms (CS583-002) 1 Amarda Shehu November 12, 2014 1 Some material adapted from Kevin Wayne s Algorithm Class @ Princeton 1 2 Dynamic Programming Approach Floyd-Warshall Shortest

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 6 Greedy Graph Algorithms Shortest paths Adam Smith 9/8/14 The (Algorithm) Design Process 1. Work out the answer for some examples. Look for a general principle Does

More information

Analysis of Algorithms, I

Analysis of Algorithms, I Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, March 8, 2016 Outline 1 Recap Single-source shortest paths in graphs with real edge weights:

More information

Graphs. Part II: SP and MST. Laura Toma Algorithms (csci2200), Bowdoin College

Graphs. Part II: SP and MST. Laura Toma Algorithms (csci2200), Bowdoin College Laura Toma Algorithms (csci2200), Bowdoin College Topics Weighted graphs each edge (u, v) has a weight denoted w(u, v) or w uv stored in the adjacency list or adjacency matrix The weight of a path p =

More information

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

More information

Theory of Computing. Lecture 7 MAS 714 Hartmut Klauck

Theory of Computing. Lecture 7 MAS 714 Hartmut Klauck Theory of Computing Lecture 7 MAS 714 Hartmut Klauck Shortest paths in weighted graphs We are given a graph G (adjacency list with weights W(u,v)) No edge means W(u,v)=1 We look for shortest paths from

More information

Parallel Graph Algorithms

Parallel Graph Algorithms Parallel Graph Algorithms Design and Analysis of Parallel Algorithms 5DV050/VT3 Part I Introduction Overview Graphs definitions & representations Minimal Spanning Tree (MST) Prim s algorithm Single Source

More information

4/8/11. Single-Source Shortest Path. Shortest Paths. Shortest Paths. Chapter 24

4/8/11. Single-Source Shortest Path. Shortest Paths. Shortest Paths. Chapter 24 /8/11 Single-Source Shortest Path Chapter 1 Shortest Paths Finding the shortest path between two nodes comes up in many applications o Transportation problems o Motion planning o Communication problems

More information

Introduction. I Given a weighted, directed graph G =(V, E) with weight function

Introduction. I Given a weighted, directed graph G =(V, E) with weight function ntroduction Computer Science & Engineering 2/82 Design and Analysis of Algorithms Lecture 06 Single-Source Shortest Paths (Chapter 2) Stephen Scott (Adapted from Vinodchandran N. Variyam) sscott@cse.unl.edu

More information

LogTM: Log-Based Transactional Memory

LogTM: Log-Based Transactional Memory LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet

More information

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path (SSSP) Problem Single Source Shortest Path (SSSP) Problem Single Source Shortest Path Problem Input: A directed graph G = (V, E); an edge weight function w : E R, and a start vertex s V. Find: for each vertex u V, δ(s,

More information

Improving the Practicality of Transactional Memory

Improving the Practicality of Transactional Memory Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter

More information

Minimum Spanning Trees

Minimum Spanning Trees Minimum Spanning Trees Problem A town has a set of houses and a set of roads. A road connects 2 and only 2 houses. A road connecting houses u and v has a repair cost w(u, v). Goal: Repair enough (and no

More information

The Common Case Transactional Behavior of Multithreaded Programs

The Common Case Transactional Behavior of Multithreaded Programs The Common Case Transactional Behavior of Multithreaded Programs JaeWoong Chung Hassan Chafi,, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, Christos Kozyrakis, Kunle Olukotun Computer Systems Lab

More information

Algorithms for Data Science

Algorithms for Data Science Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Thursday, October 1, 2015 Outline 1 Recap 2 Shortest paths in graphs with non-negative edge weights (Dijkstra

More information

Outline. Computer Science 331. Definitions: Paths and Their Costs. Computation of Minimum Cost Paths

Outline. Computer Science 331. Definitions: Paths and Their Costs. Computation of Minimum Cost Paths Outline Computer Science 33 Computation of Minimum-Cost Paths Dijkstra s Mike Jacobson Department of Computer Science University of Calgary Lecture #34 Dijkstra s to Find Min-Cost Paths 3 Example 4 5 References

More information

Lecture I: Shortest Path Algorithms

Lecture I: Shortest Path Algorithms Lecture I: Shortest Path Algorithms Dr Kieran T. Herley Department of Computer Science University College Cork October 201 KH (21/10/1) Lecture I: Shortest Path Algorithms October 201 1 / 28 Background

More information

Algorithms and Data Structures 2014 Exercises and Solutions Week 9

Algorithms and Data Structures 2014 Exercises and Solutions Week 9 Algorithms and Data Structures 2014 Exercises and Solutions Week 9 November 26, 2014 1 Directed acyclic graphs We are given a sequence (array) of numbers, and we would like to find the longest increasing

More information

CSE 502 Class 23. Jeremy Buhler Steve Cole. April BFS solves unweighted shortest path problem. Every edge traversed adds one to path length

CSE 502 Class 23. Jeremy Buhler Steve Cole. April BFS solves unweighted shortest path problem. Every edge traversed adds one to path length CSE 502 Class 23 Jeremy Buhler Steve Cole April 14 2015 1 Weighted Version of Shortest Paths BFS solves unweighted shortest path problem Every edge traversed adds one to path length What if edges have

More information

EazyHTM: Eager-Lazy Hardware Transactional Memory

EazyHTM: Eager-Lazy Hardware Transactional Memory EazyHTM: Eager-Lazy Hardware Transactional Memory Saša Tomić, Cristian Perfumo, Chinmay Kulkarni, Adrià Armejach, Adrián Cristal, Osman Unsal, Tim Harris, Mateo Valero Barcelona Supercomputing Center,

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 7 Greedy Graph Algorithms Topological sort Shortest paths Adam Smith The (Algorithm) Design Process 1. Work out the answer for some examples. Look for a general principle

More information

Introduction. I Given a weighted, directed graph G =(V, E) with weight function

Introduction. I Given a weighted, directed graph G =(V, E) with weight function ntroduction Computer Science & Engineering 2/82 Design and Analysis of Algorithms Lecture 05 Single-Source Shortest Paths (Chapter 2) Stephen Scott (Adapted from Vinodchandran N. Variyam) sscott@cse.unl.edu

More information

Scheduling Transactions in Replicated Distributed Transactional Memory

Scheduling Transactions in Replicated Distributed Transactional Memory Scheduling Transactions in Replicated Distributed Transactional Memory Junwhan Kim and Binoy Ravindran Virginia Tech USA {junwhan,binoy}@vt.edu CCGrid 2013 Concurrency control on chip multiprocessors significantly

More information

Potential violations of Serializability: Example 1

Potential violations of Serializability: Example 1 CSCE 6610:Advanced Computer Architecture Review New Amdahl s law A possible idea for a term project Explore my idea about changing frequency based on serial fraction to maintain fixed energy or keep same

More information

Week 12: Minimum Spanning trees and Shortest Paths

Week 12: Minimum Spanning trees and Shortest Paths Agenda: Week 12: Minimum Spanning trees and Shortest Paths Kruskal s Algorithm Single-source shortest paths Dijkstra s algorithm for non-negatively weighted case Reading: Textbook : 61-7, 80-87, 9-601

More information

COMP251: Single source shortest paths

COMP251: Single source shortest paths COMP51: Single source shortest paths Jérôme Waldispühl School of Computer Science McGill University Based on (Cormen et al., 00) Problem What is the shortest road to go from one city to another? Eample:

More information

Routing Protocols and the IP Layer

Routing Protocols and the IP Layer Routing Protocols and the IP Layer CS244A Review Session 2/0/08 Ben Nham Derived from slides by: Paul Tarjan Martin Casado Ari Greenberg Functions of a router Forwarding Determine the correct egress port

More information

Parallel Graph Algorithms

Parallel Graph Algorithms Parallel Graph Algorithms Design and Analysis of Parallel Algorithms 5DV050 Spring 202 Part I Introduction Overview Graphsdenitions, properties, representation Minimal spanning tree Prim's algorithm Shortest

More information

COT 6405 Introduction to Theory of Algorithms

COT 6405 Introduction to Theory of Algorithms COT 6405 Introduction to Theory of Algorithms Topic 16. Single source shortest path 11/18/2015 1 Problem definition Problem: given a weighted directed graph G, find the minimum-weight path from a given

More information

Implementation of Parallel Path Finding in a Shared Memory Architecture

Implementation of Parallel Path Finding in a Shared Memory Architecture Implementation of Parallel Path Finding in a Shared Memory Architecture David Cohen and Matthew Dallas Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 12180 Email: {cohend4, dallam}

More information

5.4 Shortest Paths. Jacobs University Visualization and Computer Graphics Lab. CH : Algorithms and Data Structures 456

5.4 Shortest Paths. Jacobs University Visualization and Computer Graphics Lab. CH : Algorithms and Data Structures 456 5.4 Shortest Paths CH08-320201: Algorithms and Data Structures 456 Definitions: Path Consider a directed graph G=(V,E), where each edge e є E is assigned a non-negative weight w: E -> R +. A path is a

More information

Chí Cao Minh 28 May 2008

Chí Cao Minh 28 May 2008 Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor

More information

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,

More information

Introduction Single-source shortest paths All-pairs shortest paths. Shortest paths in graphs

Introduction Single-source shortest paths All-pairs shortest paths. Shortest paths in graphs Shortest paths in graphs Remarks from previous lectures: Path length in unweighted graph equals to edge count on the path Oriented distance (δ(u, v)) between vertices u, v equals to the length of the shortest

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Year & Semester : III/VI Section : CSE-1 & CSE-2 Subject Code : CS2354 Subject Name : Advanced Computer Architecture Degree & Branch : B.E C.S.E. UNIT-1 1.

More information

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 13: Shortest Paths: Dijkstra s Algorithm, Heaps DFS(?) Feb 0, 018 Navigation s 9 15 14 5 6 3 18 30 11 5 0 16 4 6 6 3 19 t Weighted Graphs A graph with

More information

CS 3410 Ch 14 Graphs and Paths

CS 3410 Ch 14 Graphs and Paths CS 3410 Ch 14 Graphs and Paths Sections 14.1-14.3 Pages 527-552 Exercises 1,2,5,7-9 14.1 Definitions 1. A vertex (node) and an edge are the basic building blocks of a graph. Two vertices, (, ) may be connected

More information

Relaxing Concurrency Control in Transactional Memory. Utku Aydonat

Relaxing Concurrency Control in Transactional Memory. Utku Aydonat Relaxing Concurrency Control in Transactional Memory by Utku Aydonat A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of The Edward S. Rogers

More information

Concurrency Control. Chapter 17. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Concurrency Control. Chapter 17. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Concurrency Control Chapter 17 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Conflict Schedules Two actions conflict if they operate on the same data object and at least one of them

More information

Betweenness Metric Dmitry Vyukov November 1, 2010

Betweenness Metric Dmitry Vyukov November 1, 2010 Problem Statement Betweenness Metric Dmitry Vyukov mailto:dvyukov@gmail.com November 1, 2010 Betweenness is a metric applied to a vertex within a weighted graph. For the purposes of this problem we will

More information

Software Speculative Multithreading for Java

Software Speculative Multithreading for Java Software Speculative Multithreading for Java Christopher J.F. Pickett and Clark Verbrugge School of Computer Science, McGill University {cpicke,clump}@sable.mcgill.ca Allan Kielstra IBM Toronto Lab kielstra@ca.ibm.com

More information

Eliminating Global Interpreter Locks in Ruby through Hardware Transactional Memory

Eliminating Global Interpreter Locks in Ruby through Hardware Transactional Memory Eliminating Global Interpreter Locks in Ruby through Hardware Transactional Memory Rei Odaira, Jose G. Castanos and Hisanobu Tomari IBM Research and University of Tokyo April 8, 2014 Rei Odaira, Jose G.

More information

Practical Near-Data Processing for In-Memory Analytics Frameworks

Practical Near-Data Processing for In-Memory Analytics Frameworks Practical Near-Data Processing for In-Memory Analytics Frameworks Mingyu Gao, Grant Ayers, Christos Kozyrakis Stanford University http://mast.stanford.edu PACT Oct 19, 2015 Motivating Trends End of Dennard

More information

Shortest Path Algorithms

Shortest Path Algorithms Shortest Path Algorithms Andreas Klappenecker [based on slides by Prof. Welch] 1 Single Source Shortest Path Given: a directed or undirected graph G = (V,E) a source node s in V a weight function w: E

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:

More information

Implementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University

Implementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Implementing and Evaluating Nested Parallel Transactions in STM Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Introduction // Parallelize the outer loop for(i=0;i

More information

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 18-742 Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 Past Due: Review Assignments Was Due: Tuesday, October 9, 11:59pm. Sohi

More information

Shortest path problems

Shortest path problems Next... Shortest path problems Single-source shortest paths in weighted graphs Shortest-Path Problems Properties of Shortest Paths, Relaxation Dijkstra s Algorithm Bellman-Ford Algorithm Shortest-Paths

More information

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University)

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) Cost of Concurrency in Hybrid Transactional Memory Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) 1 Transactional Memory: a history Hardware TM Software TM Hybrid TM 1993 1995-today

More information

Introduction to Algorithms. Lecture 11

Introduction to Algorithms. Lecture 11 Introduction to Algorithms Lecture 11 Last Time Optimization Problems Greedy Algorithms Graph Representation & Algorithms Minimum Spanning Tree Prim s Algorithm Kruskal s Algorithm 2 Today s Topics Shortest

More information

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of lazy TM 1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion

More information

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018 Object-oriented programming and data-structures CS/ENGRD 20 SUMMER 208 Lecture 3: Shortest Path http://courses.cs.cornell.edu/cs20/208su Graph Algorithms Search Depth-first search Breadth-first search

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

Outline. (single-source) shortest path. (all-pairs) shortest path. minimum spanning tree. Dijkstra (Section 4.4) Bellman-Ford (Section 4.

Outline. (single-source) shortest path. (all-pairs) shortest path. minimum spanning tree. Dijkstra (Section 4.4) Bellman-Ford (Section 4. Weighted Graphs 1 Outline (single-source) shortest path Dijkstra (Section 4.4) Bellman-Ford (Section 4.6) (all-pairs) shortest path Floyd-Warshall (Section 6.6) minimum spanning tree Kruskal (Section 5.1.3)

More information

Multi-Core Computing with Transactional Memory. Johannes Schneider and Prof. Roger Wattenhofer

Multi-Core Computing with Transactional Memory. Johannes Schneider and Prof. Roger Wattenhofer Multi-Core Computing with Transactional Memory Johannes Schneider and Prof. Roger Wattenhofer 1 Overview Introduction Difficulties with parallel (multi-core) programming A (partial) solution: Transactional

More information

CS 5220: Parallel Graph Algorithms. David Bindel

CS 5220: Parallel Graph Algorithms. David Bindel CS 5220: Parallel Graph Algorithms David Bindel 2017-11-14 1 Graphs Mathematically: G = (V, E) where E V V Convention: V = n and E = m May be directed or undirected May have weights w V : V R or w E :

More information

Speculative Synchronization

Speculative Synchronization Speculative Synchronization José F. Martínez Department of Computer Science University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu/martinez Problem 1: Conservative Parallelization No parallelization

More information

CMSC Computer Architecture Lecture 12: Multi-Core. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Multi-Core. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Multi-Core Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 4 " Due: 11:49pm, Saturday " Two late days with penalty! Exam I " Grades out on

More information

Parallel Shortest Path Algorithms for Solving Large-Scale Instances

Parallel Shortest Path Algorithms for Solving Large-Scale Instances Parallel Shortest Path Algorithms for Solving Large-Scale Instances Kamesh Madduri Georgia Institute of Technology kamesh@cc.gatech.edu Jonathan W. Berry Sandia National Laboratories jberry@sandia.gov

More information

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 20: Transactional Memory Parallel Computer Architecture and Programming Slide credit Many of the slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford University) Raising

More information

Solution. violating the triangle inequality. - Initialize U = {s} - Add vertex v to U whenever DIST[v] decreases.

Solution. violating the triangle inequality. - Initialize U = {s} - Add vertex v to U whenever DIST[v] decreases. Solution Maintain a set U of all those vertices that might have an outgoing edge violating the triangle inequality. - Initialize U = {s} - Add vertex v to U whenever DIST[v] decreases. 1. Check if the

More information

Dr. Alexander Souza. Winter term 11/12

Dr. Alexander Souza. Winter term 11/12 Algorithms Theory 11 Shortest t Paths Dr. Alexander Souza 1. Shortest-paths problem Directed graph G = (V, E) Cost function c : E R 1 2 1 3 3 2 4 4 2 6 6 5 3 2 Distance between two vertices Cost of a path

More information

Dijkstra s Algorithm

Dijkstra s Algorithm Dijkstra s Algorithm 7 1 3 1 Weighted Graphs In a weighted graph, each edge has an associated numerical value, called the weight of the edge Edge weights may represent, distances, costs, etc. Example:

More information

CS 161 Lecture 11 BFS, Dijkstra s algorithm Jessica Su (some parts copied from CLRS) 1 Review

CS 161 Lecture 11 BFS, Dijkstra s algorithm Jessica Su (some parts copied from CLRS) 1 Review 1 Review 1 Something I did not emphasize enough last time is that during the execution of depth-firstsearch, we construct depth-first-search trees. One graph may have multiple depth-firstsearch trees,

More information

Optimizing Irregular Adaptive Applications on Multi-threaded Processors: The Case of Medium-Grain Parallel Delaunay Mesh Generation

Optimizing Irregular Adaptive Applications on Multi-threaded Processors: The Case of Medium-Grain Parallel Delaunay Mesh Generation Optimizing Irregular Adaptive Applications on Multi-threaded rocessors: The Case of Medium-Grain arallel Delaunay Mesh Generation Filip Blagojević The College of William & Mary CSci 710 Master s roject

More information

MST & Shortest Path -Prim s -Djikstra s

MST & Shortest Path -Prim s -Djikstra s MST & Shortest Path -Prim s -Djikstra s PRIM s - Minimum Spanning Tree - A spanning tree of a graph is a tree that has all the vertices of the graph connected by some edges. - A graph can have one or more

More information

CS 270 Algorithms. Oliver Kullmann. Breadth-first search. Analysing BFS. Depth-first. search. Analysing DFS. Dags and topological sorting.

CS 270 Algorithms. Oliver Kullmann. Breadth-first search. Analysing BFS. Depth-first. search. Analysing DFS. Dags and topological sorting. Week 5 General remarks and 2 We consider the simplest graph- algorithm, breadth-first (). We apply to compute shortest paths. Then we consider the second main graph- algorithm, depth-first (). And we consider

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Shortest Path Problem G. Guérard Department of Nouvelles Energies Ecole Supérieur d Ingénieurs Léonard de Vinci Lecture 3 GG A.I. 1/42 Outline 1 The Shortest Path Problem Introduction

More information

Automatic Compiler-Based Optimization of Graph Analytics for the GPU. Sreepathi Pai The University of Texas at Austin. May 8, 2017 NVIDIA GTC

Automatic Compiler-Based Optimization of Graph Analytics for the GPU. Sreepathi Pai The University of Texas at Austin. May 8, 2017 NVIDIA GTC Automatic Compiler-Based Optimization of Graph Analytics for the GPU Sreepathi Pai The University of Texas at Austin May 8, 2017 NVIDIA GTC Parallel Graph Processing is not easy 299ms HD-BFS 84ms USA Road

More information

TIE Graph algorithms

TIE Graph algorithms TIE-20106 1 1 Graph algorithms This chapter discusses the data structure that is a collection of points (called nodes or vertices) and connections between them (called edges or arcs) a graph. The common

More information

AN EXECUTION MODEL FOR FINE-GRAIN NESTED SPECULATIVE PARALLELISM

AN EXECUTION MODEL FOR FINE-GRAIN NESTED SPECULATIVE PARALLELISM FRACTAL AN EXECUTION MODEL FOR FINE-GRAIN NESTED SPECULATIVE PARALLELISM SUVINAY SUBRAMANIAN, MARK C. JEFFREY, MALEEN ABEYDEERA, HYUN RYONG LEE, VICTOR A. YING, JOEL EMER, DANIEL SANCHEZ ISCA 017 Current

More information

DBT Tool. DBT Framework

DBT Tool. DBT Framework Thread-Safe Dynamic Binary Translation using Transactional Memory JaeWoong Chung,, Michael Dalton, Hari Kannan, Christos Kozyrakis Computer Systems Laboratory Stanford University http://csl.stanford.edu

More information

2) Multi-Criteria 1) Contraction Hierarchies 3) for Ride Sharing

2) Multi-Criteria 1) Contraction Hierarchies 3) for Ride Sharing ) Multi-Criteria ) Contraction Hierarchies ) for Ride Sharing Robert Geisberger Bad Herrenalb, 4th and th December 008 Robert Geisberger Contraction Hierarchies Contraction Hierarchies Contraction Hierarchies

More information

More Graph Algorithms: Topological Sort and Shortest Distance

More Graph Algorithms: Topological Sort and Shortest Distance More Graph Algorithms: Topological Sort and Shortest Distance Topological Sort The goal of a topological sort is given a list of items with dependencies, (ie. item 5 must be completed before item 3, etc.)

More information

A Comparison of Capacity Management Schemes for Shared CMP Caches

A Comparison of Capacity Management Schemes for Shared CMP Caches A Comparison of Capacity Management Schemes for Shared CMP Caches Carole-Jean Wu and Margaret Martonosi Princeton University 7 th Annual WDDD 6/22/28 Motivation P P1 P1 Pn L1 L1 L1 L1 Last Level On-Chip

More information

INTRODUCTION. Hybrid Transactional Memory. Transactional Memory. Problems with Transactional Memory Problems

INTRODUCTION. Hybrid Transactional Memory. Transactional Memory. Problems with Transactional Memory Problems Hybrid Transactional Memory Peter Damron Sun Microsystems peter.damron@sun.com Alexandra Fedorova Harvard University and Sun Microsystems Laboratories fedorova@eecs.harvard.edu Yossi Lev Brown University

More information

Decreasing a key FIB-HEAP-DECREASE-KEY(,, ) 3.. NIL. 2. error new key is greater than current key 6. CASCADING-CUT(, )

Decreasing a key FIB-HEAP-DECREASE-KEY(,, ) 3.. NIL. 2. error new key is greater than current key 6. CASCADING-CUT(, ) Decreasing a key FIB-HEAP-DECREASE-KEY(,, ) 1. if >. 2. error new key is greater than current key 3.. 4.. 5. if NIL and.

More information

Parallel Shortest Path Algorithms for Solving Large-Scale Instances

Parallel Shortest Path Algorithms for Solving Large-Scale Instances Parallel Shortest Path Algorithms for Solving Large-Scale Instances Kamesh Madduri Georgia Institute of Technology kamesh@cc.gatech.edu Jonathan W. Berry Sandia National Laboratories jberry@sandia.gov

More information

Exploring Hardware Support For Scaling Irregular Applications on Multi-node Multi-core Architectures

Exploring Hardware Support For Scaling Irregular Applications on Multi-node Multi-core Architectures Exploring Hardware Support For Scaling Irregular Applications on Multi-node Multi-core Architectures MARCO CERIANI SIMONE SECCHI ANTONINO TUMEO ORESTE VILLA GIANLUCA PALERMO Politecnico di Milano - DEI,

More information

CSE 431/531: Analysis of Algorithms. Greedy Algorithms. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo

CSE 431/531: Analysis of Algorithms. Greedy Algorithms. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo CSE 431/531: Analysis of Algorithms Greedy Algorithms Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Main Goal of Algorithm Design Design fast algorithms to solve

More information

Review: Creating a Parallel Program. Programming for Performance

Review: Creating a Parallel Program. Programming for Performance Review: Creating a Parallel Program Can be done by programmer, compiler, run-time system or OS Steps for creating parallel program Decomposition Assignment of tasks to processes Orchestration Mapping (C)

More information

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11 Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed

More information

Performance Improvement via Always-Abort HTM

Performance Improvement via Always-Abort HTM 1 Performance Improvement via Always-Abort HTM Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing

More information

Taking Stock. IE170: Algorithms in Systems Engineering: Lecture 20. Example. Shortest Paths Definitions

Taking Stock. IE170: Algorithms in Systems Engineering: Lecture 20. Example. Shortest Paths Definitions Taking Stock IE170: Algorithms in Systems Engineering: Lecture 20 Jeff Linderoth Department of Industrial and Systems Engineering Lehigh University March 19, 2007 Last Time Minimum Spanning Trees Strongly

More information

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) Past & Present Have looked at two constraints: Mutual exclusion constraint between two events is a requirement that

More information

Scalable Software Transactional Memory for Chapel High-Productivity Language

Scalable Software Transactional Memory for Chapel High-Productivity Language Scalable Software Transactional Memory for Chapel High-Productivity Language Srinivas Sridharan and Peter Kogge, U. Notre Dame Brad Chamberlain, Cray Inc Jeffrey Vetter, Future Technologies Group, ORNL

More information

Lecture 25: Board Notes: Threads and GPUs

Lecture 25: Board Notes: Threads and GPUs Lecture 25: Board Notes: Threads and GPUs Announcements: - Reminder: HW 7 due today - Reminder: Submit project idea via (plain text) email by 11/24 Recap: - Slide 4: Lecture 23: Introduction to Parallel

More information