Programming Language Seminar Concurrency 2: Lock-free algorithms
- Colin Curtis
1 Programming Language Seminar
Concurrency 2: Lock-free algorithms
Peter Sestoft
Friday
2 Outline for today
- Compare-and-swap instruction
  - Atomic test-then-set operation
  - Implemented directly in x86 and other hardware
- Treiber stack
- Michael & Scott non-blocking queue
- Correctness: sequential consistency, linearizability
- Liveness: non-blocking, lock-free
- Queue benchmarks
- Miniproject 2
3 Unsafe number range [lower,upper]

    public class NumberRange {
        // INVARIANT: lower <= upper
        private final AtomicInteger lower = new AtomicInteger(0);
        private final AtomicInteger upper = new AtomicInteger(0);

        public void setLower(int i) {
            if (i > upper.get())
                throw new IllegalArgumentException("can't set lower");
            lower.set(i);
        }

        public void setUpper(int i) {
            if (i < lower.get())
                throw new IllegalArgumentException("can't set upper");
            upper.set(i);
        }
    }

- Test-then-set is not atomic
- Unsafe, may violate the invariant
Goetz Listing, p 67

4 Immutable integer pairs

    private class IntPair {
        // INVARIANT: lower <= upper
        final int lower, upper;

        public IntPair(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

- Immutable, and safely publishable
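To make the check-then-set race in NumberRange concrete, the following sketch replays one bad interleaving by hand in a single thread. The class name UnsafeRangeDemo, the initial upper bound of 10, and the values 5 and 4 are assumptions chosen for illustration; they are not from the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hand-interleaved replay of setLower(5) and setUpper(4) on a check-then-set range.
class UnsafeRangeDemo {
    // Intended invariant: lower <= upper (initial values are an assumption).
    final AtomicInteger lower = new AtomicInteger(0);
    final AtomicInteger upper = new AtomicInteger(10);

    // Returns true if lower <= upper still holds after the replay.
    boolean replayBadInterleaving() {
        // Thread A starts setLower(5): its check passes because 5 <= upper (10).
        boolean checkA = 5 <= upper.get();
        // Thread B starts setUpper(4): its check passes because 4 >= lower (0).
        boolean checkB = 4 >= lower.get();
        // Only now do both threads perform their writes.
        if (checkA) lower.set(5);
        if (checkB) upper.set(4);
        return lower.get() <= upper.get();
    }

    public static void main(String[] args) {
        UnsafeRangeDemo d = new UnsafeRangeDemo();
        System.out.println("invariant holds: " + d.replayBadInterleaving());
    }
}
```

Each individual get and set is atomic, yet the range ends up as [5,4], because nothing ties a check to the write that follows it.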
5 Safe number range rep.

    public class CasNumberRange {
        private final AtomicReference<IntPair> values
            = new AtomicReference<IntPair>(new IntPair(0, 0));

        public int getLower() { return values.get().lower; }

        public void setLower(int i) {
            while (true) {
                IntPair oldv = values.get();
                if (i > oldv.upper)
                    throw new IllegalArgumentException("Can't set lower");
                IntPair newv = new IntPair(i, oldv.upper);
                if (values.compareAndSet(oldv, newv))   // set if nobody else changed it
                    return;
            }
        }
    }

- Atomic replacement of one pair by another
- But may create many pairs before success...
Goetz Listing
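The slide shows only setLower. As a sketch of how the complete class might look, the version below adds a symmetric setUpper and getUpper; those two methods and the name CasRangeSketch are assumptions, written by analogy with the code shown.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable pair; INVARIANT: lower <= upper.
final class IntPair {
    final int lower, upper;
    IntPair(int lower, int upper) { this.lower = lower; this.upper = upper; }
}

class CasRangeSketch {
    private final AtomicReference<IntPair> values =
        new AtomicReference<IntPair>(new IntPair(0, 0));

    public int getLower() { return values.get().lower; }
    public int getUpper() { return values.get().upper; }

    public void setLower(int i) {
        while (true) {
            IntPair oldv = values.get();
            if (i > oldv.upper)
                throw new IllegalArgumentException("Can't set lower");
            // Succeeds only if no other thread replaced the pair in the meantime.
            if (values.compareAndSet(oldv, new IntPair(i, oldv.upper)))
                return;
        }
    }

    // Assumed counterpart of setLower; not shown on the slide.
    public void setUpper(int i) {
        while (true) {
            IntPair oldv = values.get();
            if (i < oldv.lower)
                throw new IllegalArgumentException("Can't set upper");
            if (values.compareAndSet(oldv, new IntPair(oldv.lower, i)))
                return;
        }
    }
}
```

Because the check and the replacement refer to the same snapshot oldv, a concurrent change makes the CAS fail and the check is repeated against fresh values.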
6 Compare-and-swap (CAS)
- Atomic test-then-set sequence (IBM 1970)
- Java AtomicReference<T>: val.compareAndSet(T oldval, T newval): if val holds oldval, set it to newval and return true; otherwise return false
- .NET/CLI System.Threading.Interlocked: CompareExchange<T>(ref T loc, T newval, T oldval): if loc holds oldval, set it to newval; returns the original value of loc, which equals oldval exactly when the exchange happened
- Implemented directly in x86 hardware
  - See machine code in Interlocked.cs
- Optimistic concurrency
  - Try to update; if it fails, retry (maybe)
  - Just like database systems
  - In contrast to pessimistic synchronized/lock
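The optimistic try-then-retry pattern can be sketched with a counter built on AtomicInteger.compareAndSet. The class CasCounter and its helper runThreads are hypothetical names introduced for this example only.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Optimistic increment: read, compute, CAS; retry if another thread got in first.
    int increment() {
        while (true) {
            int old = value.get();
            if (value.compareAndSet(old, old + 1))
                return old + 1;
        }
    }

    int get() { return value.get(); }

    // Runs the given number of threads, each incrementing perThread times,
    // and returns the final counter value.
    static int runThreads(int threads, int perThread) {
        CasCounter c = new CasCounter();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) c.increment();
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return c.get();
    }
}
```

No increment is lost, since a CAS that fails never writes and the loop simply starts over from a fresh read.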
7 CAS has visibility effects
- Java's AtomicReference.compareAndSet etc. have the same visibility effects as volatile:
  "The memory effects for accesses and updates of atomics generally follow the rules for volatiles" (java.util.concurrent.atomic package documentation)
- Also in C#/.NET/CLI, Ecma-335, I:
  "... atomic operations in the System.Threading.Interlocked class... perform implicit acquire/release operations"
8 Treiber's lock-free stack (1986)

    class ConcurrentStack<E> {
        private static class Node<E> {
            public final E item;
            public Node<E> next;

            public Node(E item) {
                this.item = item;
            }
        }

        AtomicReference<Node<E>> top = new AtomicReference<Node<E>>();
        ...
    }

Goetz Listing 15.6
9 Treiber's stack operations

    public void push(E item) {
        Node<E> newhead = new Node<E>(item);
        Node<E> oldhead;
        do {
            oldhead = top.get();
            newhead.next = oldhead;
        } while (!top.compareAndSet(oldhead, newhead));   // set top to new if not changed
    }

    public E pop() {
        Node<E> oldhead, newhead;
        do {
            oldhead = top.get();
            if (oldhead == null)
                return null;
            newhead = oldhead.next;
        } while (!top.compareAndSet(oldhead, newhead));   // set top to next if not changed
        return oldhead.item;
    }
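Putting the two slides together gives a small runnable stack. This is a sketch assembled from the code above; the class name TreiberStack and the helper pushThenDrain, which pushes from several threads and then counts what can be popped, are assumptions added for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

class TreiberStack<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<Node<E>>();

    public void push(E item) {
        Node<E> newhead = new Node<E>(item);
        Node<E> oldhead;
        do {
            oldhead = top.get();
            newhead.next = oldhead;
        } while (!top.compareAndSet(oldhead, newhead));
    }

    // Returns null when the stack is empty.
    public E pop() {
        Node<E> oldhead, newhead;
        do {
            oldhead = top.get();
            if (oldhead == null)
                return null;
            newhead = oldhead.next;
        } while (!top.compareAndSet(oldhead, newhead));
        return oldhead.item;
    }

    // Pushes perThread items from each of several threads, then pops everything
    // and returns how many items came out.
    static int pushThenDrain(int threads, int perThread) {
        TreiberStack<Integer> s = new TreiberStack<Integer>();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) s.push(i);
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        int count = 0;
        while (s.pop() != null) count++;
        return count;
    }
}
```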
10 CAS versus mutual exclusion (locks)
Optimistic versus pessimistic concurrency
- Pro CAS
  - Modern CAS is quite fast, [...] cycles (CAS is used to implement lock acquisition itself)
  - A failed CAS, unlike a failed lock acquisition, requires no context switch; see Java Precisely p. 67
  - Therefore fast when contention is moderate
- Con CAS
  - May fail arbitrarily many times
  - Therefore slow when contention is very high
  - Not all hardware implements CAS
11 Lock-based queue with sentinel

    private static class Node {
        final int item;
        volatile Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    class SentinelLockQueue {
        private Node head, tail;

        public SentinelLockQueue() {
            head = tail = new Node(-444, null);   // allocate sentinel node
        }
        ...
    }

Invariants:
- If empty: head==tail and head.next==null
- If non-empty: head!=tail, head.next is first item, tail points to last item
File TestQueues.java
12 Lock-based queue operations

    // Enqueue at tail
    public synchronized boolean put(int item) {
        Node node = new Node(item, null);
        tail.next = node;
        tail = node;
        return true;
    }

    // Dequeue from second node, which becomes the new sentinel
    public synchronized int get() {
        if (head.next == null)
            return -999;
        Node first = head;
        head = first.next;
        return head.item;
    }

Important property:
- Enqueue (put) updates tail but not head
- Dequeue (get) updates head but not tail
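For reference, the lock-based queue from the last two slides can be assembled into one compilable class. The sketch below follows the slide code; only the class name SentinelLockQueueSketch and the Node constructor are additions.

```java
class SentinelLockQueueSketch {
    private static final class Node {
        final int item;
        volatile Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    // head always points to the sentinel; head.next is the first real item.
    private Node head, tail;

    public SentinelLockQueueSketch() {
        head = tail = new Node(-444, null);
    }

    // Enqueue at tail; touches tail but never head.
    public synchronized boolean put(int item) {
        Node node = new Node(item, null);
        tail.next = node;
        tail = node;
        return true;
    }

    // Dequeue: the node holding the returned item becomes the new sentinel.
    // Returns -999 when the queue is empty, as on the slide.
    public synchronized int get() {
        if (head.next == null)
            return -999;
        Node first = head;
        head = first.next;
        return head.item;
    }
}
```

Note that -999 doubles as the "empty" signal, so this sketch cannot distinguish an empty queue from a stored item equal to -999.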
13 Michael-Scott lock-free queue with sentinel

    private static class Node {
        final int item;
        final AtomicReference<Node> next;
        Node(int item, Node next) {
            this.item = item;
            this.next = new AtomicReference<Node>(next);
        }
    }

    class MSNonblockingQueue {
        private final AtomicReference<Node> head, tail;

        public MSNonblockingQueue() {
            Node dummy = new Node(-444, null);
            head = new AtomicReference<Node>(dummy);
            tail = new AtomicReference<Node>(dummy);
        }
        ...
    }

If non-empty: head.next is first item, tail points to last item ("quiescent state") or the second-last item ("intermediate state")
14 Michael-Scott queue operations
- Two-step enqueue at tail
- Two-step dequeue at head
Herlihy & Shavit ch 10
15 Michael-Scott enqueue (put)

    public boolean put(int item) {
        Node node = new Node(item, null);
        while (true) {
            Node last = tail.get();
            Node next = last.next.get();
            if (last == tail.get()) {                            // needed?
                if (next == null) {                              // quiescent, try add
                    if (last.next.compareAndSet(next, node)) {   // (1)
                        tail.compareAndSet(last, node);          // (2) success, try move tail
                        return true;
                    }
                } else {
                    tail.compareAndSet(last, next);              // intermediate, try move tail:
                }                                                // "help another enqueuer"
            }
        }
    }
16 Intermediate state and "help"
[Figure from Goetz et al.]
17 Michael-Scott dequeue (get)

    public int get() {
        while (true) {
            Node first = head.get(), last = tail.get(), next = first.next.get();
            if (first == head.get()) {                     // needed?
                if (first == last) {
                    if (next == null)
                        return -999;
                    tail.compareAndSet(last, next);        // intermediate, try move tail (*)
                } else {
                    int result = next.item;                // (1)
                    if (head.compareAndSet(first, next))   // (2) try move head
                        return result;
                }
            }
        }
    }

In Java or C#, but not C, (1) could go after (2)
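The enqueue and dequeue slides combine into the following runnable sketch. The class name MSQueueSketch and the helper concurrentSum, which enqueues from several threads and then sums everything dequeued, are assumptions added for the example; items must differ from -999, which signals an empty queue.

```java
import java.util.concurrent.atomic.AtomicReference;

class MSQueueSketch {
    private static final class Node {
        final int item;
        final AtomicReference<Node> next;
        Node(int item, Node next) {
            this.item = item;
            this.next = new AtomicReference<Node>(next);
        }
    }

    private final AtomicReference<Node> head, tail;

    public MSQueueSketch() {
        Node dummy = new Node(-444, null);          // sentinel
        head = new AtomicReference<Node>(dummy);
        tail = new AtomicReference<Node>(dummy);
    }

    public boolean put(int item) {
        Node node = new Node(item, null);
        while (true) {
            Node last = tail.get();
            Node next = last.next.get();
            if (last == tail.get()) {
                if (next == null) {
                    // Quiescent state: try to link the new node after the last one.
                    if (last.next.compareAndSet(null, node)) {
                        tail.compareAndSet(last, node);   // may fail if someone helped
                        return true;
                    }
                } else {
                    // Intermediate state: help the other enqueuer move tail.
                    tail.compareAndSet(last, next);
                }
            }
        }
    }

    // Returns -999 when the queue is empty.
    public int get() {
        while (true) {
            Node first = head.get(), last = tail.get(), next = first.next.get();
            if (first == head.get()) {
                if (first == last) {
                    if (next == null)
                        return -999;
                    tail.compareAndSet(last, next);       // tail lags behind, move it
                } else {
                    int result = next.item;
                    if (head.compareAndSet(first, next))
                        return result;
                }
            }
        }
    }

    // Each producer enqueues 1..perProducer; returns the sum of all dequeued items.
    static long concurrentSum(int producers, int perProducer) {
        MSQueueSketch q = new MSQueueSketch();
        Thread[] ts = new Thread[producers];
        for (int t = 0; t < producers; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 1; i <= perProducer; i++) q.put(i);
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        long sum = 0;
        for (int x = q.get(); x != -999; x = q.get()) sum += x;
        return sum;
    }
}
```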
18 (*) Why must dequeue mess with the tail?
[Figure: queue is empty; A dequeues a; B enqueues b]
Scenario:
- B has set a.next=b but not yet tail=b
- A reads a's item
- A sets head=a.next
- Now the tail item is not reachable from head

    while (true) {
        ...
        if (first == last) {
            if (next == null)
                return -999;
            tail.compareAndSet(last, next);
        } else
            ...
    }

Herlihy & Shavit
19 But... creates a lot of AtomicReference objects

OLD:

    private static class Node {
        final int item;
        final AtomicReference<Node> next;          // must be CAS'able

        public Node(int item, Node next) {
            this.item = item;
            this.next = new AtomicReference<Node>(next);   // one AR per Node
        }
    }

NEW:

    private static class Node {
        final int item;
        volatile Node next;
        ...
    }

- Better, no AtomicReference object needed
- Instead, use an "updater":

    AtomicReferenceFieldUpdater<Node, Node> nextUpdater
        = AtomicReferenceFieldUpdater.newUpdater(Node.class, Node.class, "next");
20 Michael-Scott enqueue, using "updater" for .next

    public boolean put(int item) {
        Node node = new Node(item, null);
        while (true) {
            Node last = tail.get();
            Node next = last.next;
            if (last == tail.get()) {
                if (next == null) {
                    if (nextUpdater.compareAndSet(last, next, node)) {
                        tail.compareAndSet(last, node);
                        return true;
                    }
                } else {
                    tail.compareAndSet(last, next);
                }
            }
        }
    }
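A complete queue using the updater might look as follows. This is a sketch: the class name MSQueueUpdaterSketch is an assumption, the get method is adapted from the AtomicReference version by analogy, and Node and its next field are deliberately left non-private so that the reflective updater can reach the field from the enclosing class.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

class MSQueueUpdaterSketch {
    static final class Node {
        final int item;
        volatile Node next;          // must be volatile for the updater
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    // One shared updater instead of one AtomicReference object per node.
    private static final AtomicReferenceFieldUpdater<Node, Node> nextUpdater =
        AtomicReferenceFieldUpdater.newUpdater(Node.class, Node.class, "next");

    private final AtomicReference<Node> head, tail;

    public MSQueueUpdaterSketch() {
        Node dummy = new Node(-444, null);
        head = new AtomicReference<Node>(dummy);
        tail = new AtomicReference<Node>(dummy);
    }

    public boolean put(int item) {
        Node node = new Node(item, null);
        while (true) {
            Node last = tail.get();
            Node next = last.next;
            if (last == tail.get()) {
                if (next == null) {
                    if (nextUpdater.compareAndSet(last, null, node)) {
                        tail.compareAndSet(last, node);
                        return true;
                    }
                } else {
                    tail.compareAndSet(last, next);
                }
            }
        }
    }

    // Returns -999 when the queue is empty.
    public int get() {
        while (true) {
            Node first = head.get(), last = tail.get(), next = first.next;
            if (first == head.get()) {
                if (first == last) {
                    if (next == null)
                        return -999;
                    tail.compareAndSet(last, next);
                } else {
                    int result = next.item;
                    if (head.compareAndSet(first, next))
                        return result;
                }
            }
        }
    }
}
```

The design trade-off is the one the slides name: fewer allocations, at the price of a field that can also be read and written directly, bypassing the updater.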
21 CAS in Java versus .NET
- .NET uses static CAS methods in Interlocked
  - Allows non-atomic access by mistake, bad
- Java's AtomicReference<T> seems safer
  - Because the field must be accessed through that class
- But, for efficiency, Java allows standard field access through AtomicReferenceFieldUpdater
  - This is at least as bad as the .NET design
  - And gives poor tool support (IDE, refactoring, ...)
22 Queue benchmarks
- Queue implementations
  - Lock-based
  - Lock-based, sentinel node
  - Lock-free, sentinel node, AtomicReference
  - Lock-free, sentinel node, Atomic...FieldUpdater
- Platforms
  - Hotspot 64 bit Java 1.7.0_b147, Windows 7, Xeon W3505, 2.53GHz, 2 cores, 2009Q1
  - Hotspot 64 bit Java 1.6.0_37, MacOS, Core 2 Duo, 2.66GHz, 2 cores, 2008Q1
  - Icedtea Java 1.7.0_b21, Linux, Xeon E5320, 1.86GHz, 4/8 cores, 2006Q4
  - Hotspot 64 bit Java 1.7.0_25-b15, Linux, AMD Opteron 6386 SE, 32 cores, 2012Q4
- Measurements probably flawed: the client threads do no useful work, only en/dequeue
- Nevertheless, big differences between machines
23 Java 1.7, Xeon W3505, 2 cores
[Plot: time as function of number of concurrent threads; series LockQueue, MSNonblockingQueue, MSNonblockingQueueRefl, SentinelLockQueue]
24 Java 1.6, Core 2 Duo, 2 cores
[Plot: series LockQueue, MSNonblockingQueue, MSNonblockingQueueRefl, SentinelLockQueue]
25 Java 1.7, Xeon E5320, 4/8 cores
[Plot: series LockQueue, MSNonblockingQueue, MSNonblockingQueueRefl, SentinelLockQueue]
26 Java 1.7, AMD Opteron, 32 cores
[Plot: series LockQueue, MSNonblockingQueue, MSNonblockingQueueRefl, SentinelLockQueue; annotation: Very slow?!]
27 Gross performance bug in Goetz's Listing 15.7
Initial sentinel node is stored in a queue field:

    class LinkedQueue<E> {
        private final Node<E> dummy = new Node<E>(null, null);
        private final AtomicReference<Node<E>> head
            = new AtomicReference<Node<E>>(dummy);
        private final AtomicReference<Node<E>> tail
            = new AtomicReference<Node<E>>(dummy);
        ...
    }

- This makes the initial sentinel live forever!
- Worse: it keeps all nodes ever allocated live!
  - Because it links to them, directly or indirectly
- HUGE space leak, BAD performance
- Very bad
28 Obvious fix (after 8 hours)
Don't bind the sentinel to a field:

    class LinkedQueue<E> {
        private final AtomicReference<Node<E>> head, tail;

        public LinkedQueue() {
            Node<E> dummy = new Node<E>(null, null);
            head = new AtomicReference<Node<E>>(dummy);
            tail = new AtomicReference<Node<E>>(dummy);
        }
        ...
    }

- Now the sentinel dies as soon as the first item is dequeued
- No space leak, fine performance
- Polite message sent to Mr. Goetz... no reply
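The leak can be illustrated without relying on the garbage collector by counting which nodes remain reachable. The sketch below uses a deliberately simplified single-threaded queue (not Goetz's code; all names are hypothetical) that keeps its initial sentinel in a field, as in the buggy listing.

```java
class LeakDemo {
    static final class Node {
        final int item;
        Node next;
        Node(int item) { this.item = item; }
    }

    // Simplified, single-threaded queue that retains its initial sentinel in a field.
    static final class LeakyQueue {
        final Node dummy = new Node(-444);
        Node head = dummy, tail = dummy;

        void put(int item) { Node n = new Node(item); tail.next = n; tail = n; }

        int get() {
            if (head.next == null) return -999;
            head = head.next;
            return head.item;
        }
    }

    // Number of nodes reachable by following next links from start.
    static int reachableFrom(Node start) {
        int n = 0;
        for (Node p = start; p != null; p = p.next) n++;
        return n;
    }

    // Enqueues and then dequeues n items; returns
    // { nodes reachable from the retained sentinel, nodes reachable from head }.
    static int[] run(int n) {
        LeakyQueue q = new LeakyQueue();
        for (int i = 0; i < n; i++) q.put(i);
        for (int i = 0; i < n; i++) q.get();
        return new int[] { reachableFrom(q.dummy), reachableFrom(q.head) };
    }
}
```

After n enqueues and n dequeues the queue itself needs only one node (the current sentinel), but the retained field still reaches all n+1 nodes, so none of them can be collected.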
29 Non-blocking, lock-free, wait-free
- Non-blocking: the delay (crash...) of one thread cannot delay another thread
  - Generally, lock-based algorithms are not non-blocking
- Lock-free: some call will complete
  - Michael & Scott queue operations are lock-free
  - (But the paper calls it non-blocking; obsolete terminology)
- Wait-free: every call will complete
  - Michael & Scott queue operations are not wait-free; an enqueue could forever be delayed by other enqueues
- NB: An algorithm is not lock-free just because it uses no locks! (Why?)
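One way to approach the closing question: CAS can itself be used to build something that behaves like a lock. In the sketch below (hypothetical class, for illustration only) no synchronized or Lock appears, yet a thread that stalls while holding the flag prevents every other thread from making progress, so the counter is not lock-free.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinFlagCounter {
    private final AtomicBoolean busy = new AtomicBoolean(false);
    private volatile int count = 0;

    // One attempt: returns false if another thread currently holds the flag.
    boolean tryIncrement() {
        if (!busy.compareAndSet(false, true))
            return false;
        try {
            count++;                 // protected by the flag, so no lost updates
        } finally {
            busy.set(false);
        }
        return true;
    }

    // Spins until the flag is free; blocks forever if the holder never releases.
    void increment() {
        while (!tryIncrement()) { /* spin */ }
    }

    int get() { return count; }

    // Simulates a thread that acquired the flag and then stalled or crashed.
    void simulateStalledHolder() {
        busy.set(true);
    }
}
```

Compare this with the Treiber stack or the Michael-Scott queue: there a stalled thread leaves the structure in a state from which other threads can still complete their own calls.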
30 Correctness notions (H&S ch 3): Quiescent consistency
Principles
- Method calls should appear to happen in a one-at-a-time, sequential order
- Method calls separated by a period of quiescence should appear to take effect in their real-time order
[Figure: method calls overlap in time; Herlihy & Shavit]
31 Sequential consistency
Principles
- Method calls should appear to happen in a one-at-a-time, sequential order
- Method calls should appear to take effect in program order
- Reads and writes on non-volatile fields fail this
[Figure: threads A and B; two possible orders explaining the execution]
- A q.enq(x), B q.enq(y), A q.deq(x), B q.deq(y)
- B q.enq(y), A q.enq(x), A q.deq(y), B q.deq(x)
32 Drawbacks of sequential consistency
- Not compositional
33 Linearizability
Principles
- Method calls should appear to happen in a one-at-a-time, sequential order
- Each method call should appear to take effect instantaneously at some moment between its invocation and response
- That "moment" is the linearization point of the method
- Linearizability implies sequential consistency
34 Lock-free is a bit difficult...
java.util.concurrent.ConcurrentLinkedDeque source comments (OpenJDK 7-b147):
"We believe (without full proof) that all single-element deque operations (e.g., addFirst, peekLast, pollLast) are linearizable (see Herlihy and Shavit's book). However, some combinations of operations are known not to be linearizable. In particular, when an addFirst(A) is racing with pollFirst() removing B, it is possible for an observer iterating over the elements to observe A B C and subsequently observe A C, even though no interior removes are ever performed. Nevertheless, iterators behave reasonably, providing the "weakly consistent" guarantees."
35 Reading
Week 1
- Read Goetz et al.: Java Concurrency in Practice, chapters 1, 2, 3, 4, 5
- Look at Java Language Specification, sections 17.4 and 17.5
Week 2 (this week)
- Goetz et al.: Java Concurrency in Practice, chapter 15
- Michael and Scott: Simple, fast, and practical...
- Herlihy & Shavit: The Art of Multiprocessor Programming, chapters 3 (and 9)
36 Miniproject 2
- Groves: Verifying Michael and Scott's Lock-Free Queue Algorithm Using Trace Reduction, 2008
- Hand in to sestoft@itu.dk on 15 November
- See rules in miniproject-2.html
Practical Concurrent and Parallel Programming 10
Practical Concurrent and Parallel Programming 10 Peter Sestoft IT University of Copenhagen Friday 2016-11-11* IT University of Copenhagen 1 Plan for today Compare and swap (CAS) low-level atomicity Examples:
More informationPractical Concurrent and Parallel Programming 11
Practical Concurrent and Parallel Programming 11 Peter Sestoft Friday 2015-11-13* 1 Plan for today Compare and swap (CAS) low-level atomicity Examples: AtomicInteger and NumberRange How to implement a
More informationAtomic Variables & Nonblocking Synchronization
Atomic Variables & Nonblocking Synchronization CMSC 433 Fall 2014 Michael Hicks (with some slides due to Rance Cleaveland) A Locking Counter public final class Counter { private long value = 0; public
More informationBlocking Non-blocking Caveat:
Overview of Lecture 5 1 Progress Properties 2 Blocking The Art of Multiprocessor Programming. Maurice Herlihy and Nir Shavit. Morgan Kaufmann, 2008. Deadlock-free: some thread trying to get the lock eventually
More informationCSE 613: Parallel Programming. Lecture 17 ( Concurrent Data Structures: Queues and Stacks )
CSE 613: Parallel Programming Lecture 17 ( Concurrent Data Structures: Queues and Stacks ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2012 Desirable Properties of Concurrent
More informationNon-blocking Array-based Algorithms for Stacks and Queues!
Non-blocking Array-based Algorithms for Stacks and Queues! Niloufar Shafiei! Department of Computer Science and Engineering York University ICDCN 09 Outline! Introduction! Stack algorithm! Queue algorithm!
More informationProgramming Paradigms for Concurrency Lecture 3 Concurrent Objects
Programming Paradigms for Concurrency Lecture 3 Concurrent Objects Based on companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Modified by Thomas Wies New York University
More informationAdvances in Programming Languages
O T Y H Advances in Programming Languages APL5: Further language concurrency mechanisms David Aspinall (including slides by Ian Stark) School of Informatics The University of Edinburgh Tuesday 5th October
More informationIntroduction. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Introduction Companion slides for The by Maurice Herlihy & Nir Shavit Moore s Law Transistor count still rising Clock speed flattening sharply 2 Moore s Law (in practice) 3 Nearly Extinct: the Uniprocesor
More informationConcurrent Objects. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Concurrent Objects Companion slides for The by Maurice Herlihy & Nir Shavit Concurrent Computation memory object object 2 Objectivism What is a concurrent object? How do we describe one? How do we implement
More informationWhatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL
Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL 251 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor
More informationWhatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs 3.
Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs 3. LOCK FREE KERNEL 309 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor
More informationOverview of Lecture 4. Memory Models, Atomicity & Performance. Ben-Ari Concurrency Model. Dekker s Algorithm 4
Concurrent and Distributed Programming http://fmt.cs.utwente.nl/courses/cdp/ Overview of Lecture 4 2 Memory Models, tomicity & Performance HC 4 - Tuesday, 6 December 2011 http://fmt.cs.utwente.nl/~marieke/
More informationA Non-Blocking Concurrent Queue Algorithm
A Non-Blocking Concurrent Queue Algorithm Bruno Didot bruno.didot@epfl.ch June 2012 Abstract This report presents a new non-blocking concurrent FIFO queue backed by an unrolled linked list. Enqueue and
More informationCache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency
Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed
More informationConcurrent Objects and Linearizability
Chapter 3 Concurrent Objects and Linearizability 3.1 Specifying Objects An object in languages such as Java and C++ is a container for data. Each object provides a set of methods that are the only way
More informationNon-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei
Non-blocking Array-based Algorithms for Stacks and Queues Niloufar Shafiei Outline Introduction Concurrent stacks and queues Contributions New algorithms New algorithms using bounded counter values Correctness
More informationLinked Structures. See Section 3.2 of the text.
Linked Structures See Section 3.2 of the text. First, notice that Java allows classes to be recursive, in the sense that a class can have an element which is itself an object of that class: class Person
More information+ Today. Lecture 26: Concurrency 3/31/14. n Reading. n Objectives. n Announcements. n P&C Section 7. n Race conditions.
+ Lecture 26: Concurrency Slides adapted from Dan Grossman + Today n Reading n P&C Section 7 n Objectives n Race conditions n Announcements n Quiz on Friday 1 + This week s programming assignment n Answer
More informationLindsay Groves, Simon Doherty. Mark Moir, Victor Luchangco
Lindsay Groves, Simon Doherty Victoria University of Wellington Mark Moir, Victor Luchangco Sun Microsystems, Boston (FORTE, Madrid, September, 2004) Lock-based concurrency doesn t scale Lock-free/non-blocking
More informationGet out, you will, of this bind If, your objects, you have confined
CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [THREAD SAFETY] Putting the brakes, on impending code breaks Let a reference escape, have you? Misbehave, your code will, out of the blue Get out, you will,
More informationThread Safety. Review. Today o Confinement o Threadsafe datatypes Required reading. Concurrency Wrapper Collections
Thread Safety Today o Confinement o Threadsafe datatypes Required reading Concurrency Wrapper Collections Optional reading The material in this lecture and the next lecture is inspired by an excellent
More informationLinked List Nodes (reminder)
Outline linked lists reminders: nodes, implementation, invariants circular linked list doubly-linked lists iterators the Java foreach statement iterator implementation the ListIterator interface Linked
More informationCS 241 Honors Concurrent Data Structures
CS 241 Honors Concurrent Data Structures Bhuvan Venkatesh University of Illinois Urbana Champaign March 27, 2018 CS 241 Course Staff (UIUC) Lock Free Data Structures March 27, 2018 1 / 43 What to go over
More informationUnit 6: Indeterminate Computation
Unit 6: Indeterminate Computation Martha A. Kim October 6, 2013 Introduction Until now, we have considered parallelizations of sequential programs. The parallelizations were deemed safe if the parallel
More informationIntroduction to Concurrency and Multicore Programming. Slides adapted from Art of Multicore Programming by Herlihy and Shavit
Introduction to Concurrency and Multicore Programming Slides adapted from Art of Multicore Programming by Herlihy and Shavit Overview Introduction Mutual Exclusion Linearizability Concurrent Data Structure
More informationCS510 Concurrent Systems. Jonathan Walpole
CS510 Concurrent Systems Jonathan Walpole Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms utline Background Non-Blocking Queue Algorithm Two Lock Concurrent Queue Algorithm
More informationLab. Lecture 26: Concurrency & Responsiveness. Assignment. Maze Program
Lab Lecture 26: Concurrency & Responsiveness CS 62 Fall 2016 Kim Bruce & Peter Mawhorter Using parallelism to speed up sorting using Threads and ForkJoinFramework Review relevant material. Some slides
More informationCSE332: Data Abstractions Lecture 19: Mutual Exclusion and Locking
CSE332: Data Abstractions Lecture 19: Mutual Exclusion and Locking James Fogarty Winter 2012 Including slides developed in part by Ruth Anderson, James Fogarty, Dan Grossman Banking Example This code is
More informationKeeping Order:! Stacks, Queues, & Deques. Travis W. Peters Dartmouth College - CS 10
Keeping Order:! Stacks, Queues, & Deques 1 Stacks 2 Stacks A stack is a last in, first out (LIFO) data structure Primary Operations: push() add item to top pop() return the top item and remove it peek()
More informationA Sophomoric Introduction to Shared-Memory Parallelism and Concurrency Lecture 5 Programming with Locks and Critical Sections
A Sophomoric Introduction to Shared-Memory Parallelism and Concurrency Lecture 5 Programming with Locks and Critical Sections Dan Grossman Last Updated: May 2012 For more information, see http://www.cs.washington.edu/homes/djg/teachingmaterials/
More informationLinearizability Testing Manual
Linearizability Testing Manual Gavin Lowe April 8, 2016 This manual describes how to use the linearizability testing framework, described in [Low16]. The framework is available from http://www.cs.ox. ac.uk/people/gavin.lowe/linearizabiltytesting/.
More informationCS377P Programming for Performance Multicore Performance Synchronization
CS377P Programming for Performance Multicore Performance Synchronization Sreepathi Pai UTCS October 21, 2015 Outline 1 Synchronization Primitives 2 Blocking, Lock-free and Wait-free Algorithms 3 Transactional
More informationLecture 19: Composing Objects in Java
COMP 150-CCP Concurrent Programming Lecture 19: Composing Objects in Java Dr. Richard S. Hall rickhall@cs.tufts.edu Concurrent programming April 1, 2008 Reference The content of this lecture is based on
More informationPhantom Monitors: A Simple Foundation for Modular Proofs of Fine-Grained Concurrent Programs
Phantom Monitors: A Simple Foundation for Modular Proofs of Fine-Grained Concurrent Programs Christian J. Bell, Mohsen Lesani, Adam Chlipala, Stephan Boyer, Gregory Malecha, Peng Wang MIT CSAIL Goal: verification
More informationG Programming Languages Spring 2010 Lecture 13. Robert Grimm, New York University
G22.2110-001 Programming Languages Spring 2010 Lecture 13 Robert Grimm, New York University 1 Review Last week Exceptions 2 Outline Concurrency Discussion of Final Sources for today s lecture: PLP, 12
More informationProgram Graph. Lecture 25: Parallelism & Concurrency. Performance. What does it mean?
Program Graph Lecture 25: Parallelism & Concurrency CS 62 Fall 2015 Kim Bruce & Michael Bannister Some slides based on those from Dan Grossman, U. of Washington Program using fork and join can be seen
More information3/25/14. Lecture 25: Concurrency. + Today. n Reading. n P&C Section 6. n Objectives. n Concurrency
+ Lecture 25: Concurrency + Today n Reading n P&C Section 6 n Objectives n Concurrency 1 + Concurrency n Correctly and efficiently controlling access by multiple threads to shared resources n Programming
More informationScale Up with Lock-Free Algorithms. Non-blocking concurrency on JVM Presented at JavaOne 2017 /Roman JetBrains
Scale Up with Lock-Free Algorithms Non-blocking concurrency on JVM Presented at JavaOne 2017 /Roman Elizarov @ JetBrains Speaker: Roman Elizarov 16+ years experience Previously developed high-perf trading
More informationComposing Objects. Java and Android Concurrency.
Java and Android Concurrency Composing Objects fausto.spoto@univr.it git@bitbucket.org:spoto/java-and-android-concurrency.git git@bitbucket.org:spoto/java-and-android-concurrency-examples.git Fausto Spoto
More informationAtomicity CS 2110 Fall 2017
Atomicity CS 2110 Fall 2017 Parallel Programming Thus Far Parallel programs can be faster and more efficient Problem: race conditions Solution: synchronization Are there more efficient ways to ensure the
More informationPhantom Monitors: A Simple Foundation for Modular Proofs of Fine-Grained Concurrent Programs
Phantom Monitors: A Simple Foundation for Modular Proofs of Fine-Grained Concurrent Programs Christian J Bell, Mohsen Lesani, Adam Chlipala, Stephan Boyer, Gregory Malecha, Peng Wang MIT CSAIL cj@csailmitedu
More informationOrder Is A Lie. Are you sure you know how your code runs?
Order Is A Lie Are you sure you know how your code runs? Order in code is not respected by Compilers Processors (out-of-order execution) SMP Cache Management Understanding execution order in a multithreaded
More informationFast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas Outline Synchronization Methods Priority Queues Concurrent Priority Queues Lock-Free Algorithm: Problems
More informationHazard Pointers. Number of threads unbounded time to check hazard pointers also unbounded! difficult dynamic bookkeeping! thread B - hp1 - hp2
Hazard Pointers Store pointers of memory references about to be accessed by a thread Memory allocation checks all hazard pointers to avoid the ABA problem thread A - hp1 - hp2 thread B - hp1 - hp2 thread
More informationCS5460: Operating Systems
CS5460: Operating Systems Lecture 9: Implementing Synchronization (Chapter 6) Multiprocessor Memory Models Uniprocessor memory is simple Every load from a location retrieves the last value stored to that
More informationMidterm #1. CMSC 433: Programming Language Technologies and Paradigms. October 14, 2013
Midterm #1 CMSC 433: Programming Language Technologies and Paradigms October 14, 2013 Name Instructions Do not start until told to do so! This exam has 10 double-sided pages (including this one); make
More informationSharing Objects Ch. 3
Sharing Objects Ch. 3 Visibility What is the source of the issue? Volatile Dekker s algorithm Publication and Escape Thread Confinement Immutability Techniques of safe publication Assignment 1 Visibility
More informationThe Java Memory Model
Jeremy Manson 1, William Pugh 1, and Sarita Adve 2 1 University of Maryland 2 University of Illinois at Urbana-Champaign Presented by John Fisher-Ogden November 22, 2005 Outline Introduction Sequential
More informationLinked lists (6.5, 16)
Linked lists (6.5, 16) Linked lists Inserting and removing elements in the middle of a dynamic array takes O(n) time (though inserting at the end takes O(1) time) (and you can also delete from the middle
More informationDesign of Thread-Safe Classes
Design of Thread-Safe Classes 1 Topic Outline Thread-Safe Classes Principles Confinement Delegation Synchronization policy documentation 2 Thread-safe Class Design Process Identify the object s state (variables)
More informationMulti-threaded Performance And Scalability
1 Multi-threaded Performance And Scalability Dr Heinz M. Kabutz http://www.javaspecialists.eu/talks/wjax12/kabutz.pdf 2012 Heinz Kabutz All Rights Reserved 2 Dr Heinz Kabutz Brief Biography German from
More informationFast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors
Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA April 27, 2016 Mutual Exclusion
More informationLinked Lists: Locking, Lock-Free, and Beyond. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Linked Lists: Locking, Lock-Free, and Beyond Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Objects Adding threads should not lower throughput Contention
More informationParallel linked lists
Parallel linked lists Lecture 10 of TDA384/DIT391 (Principles of Conent Programming) Carlo A. Furia Chalmers University of Technology University of Gothenburg SP3 2017/2018 Today s menu The burden of locking
More informationFast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors
Background Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA December 18,
More informationCSCE 314 Programming Languages
CSCE 314 Programming Languages! Concurrency in Java Dr. Hyunyoung Lee 1 World Is Concurrent Concurrent programs: more than one activities execute simultaneously (concurrently) no interference between activities,
More informationConcurrent Queues and Stacks. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Concurrent Queues and Stacks Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit The Five-Fold Path Coarse-grained locking Fine-grained locking Optimistic synchronization
More informationCSE332: Data Abstractions Lecture 23: Programming with Locks and Critical Sections. Tyler Robison Summer 2010