Concurrent Queues, Monitors, and the ABA Problem
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Queues: Often used as buffers between producers and consumers. Producers enqueue data or requests to be processed; consumers dequeue them and process them.
Queues: enq() adds an item and deq() removes one. Items are totally ordered: first in, first out (FIFO).
Bounded: Fixed capacity. Good when resources are an issue.
Unbounded: Unlimited capacity. Often more convenient.
Blocking: A thread blocks on an attempt to remove from an empty queue, or to add to a full bounded queue.
Non-Blocking: Throw an exception on an attempt to remove from an empty queue.
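The difference between blocking and exception-throwing behavior can be seen with a library queue. The sketch below uses java.util.concurrent.ArrayBlockingQueue as a stand-in (it is not the queue built in this lecture); the class name QueueFlavors and its helper methods are made up for illustration.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueFlavors {
    // Exception-throwing style: remove() throws if the queue is empty.
    static boolean removeThrowsOnEmpty(BlockingQueue<String> q) {
        try {
            q.remove();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // Exception-throwing style: add() throws if a bounded queue is full.
    static boolean addThrowsOnFull(BlockingQueue<String> q, String item) {
        try {
            q.add(item);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> q = new ArrayBlockingQueue<>(1); // bounded, capacity 1
        System.out.println(removeThrowsOnEmpty(q));  // empty queue: remove() throws
        q.put("a");                                  // blocking enqueue; there is room, so it returns at once
        System.out.println(addThrowsOnFull(q, "b")); // full queue: add() throws
        System.out.println(q.take());                // blocking dequeue; an item is present, so it returns at once
    }
}
```

With put() and take(), a thread that finds the queue full or empty would sleep until another thread makes progress, which is the behavior the bounded queue in this lecture implements by hand.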
This Lecture: Two queue implementations. (1) Bounded, blocking, lock-based. (2) Unbounded, non-blocking, lock-free.
Queue Concurrency: enq(x) works at the tail and y = deq() works at the head, so enq() and deq() work at different ends of the object.
Concurrency: Challenge: what if the queue is empty or full?
Bounded Queue: A linked list with head and tail references. head always points to a sentinel node; the node after the sentinel holds the first actual item.
Bounded Queue: deqLock locks out other deq() calls; enqLock locks out other enq() calls.
Not Done Yet: We still need to tell whether the queue is full or empty. Keep a shared size counter, incremented by enq() and decremented by deq(). In the running example, size is 1 and the maximum size is 8 items.
Enqueuer (size is 1, capacity is 8):
1. Lock enqLock.
2. Read size: it is below capacity, so it is OK to proceed.
3. Enqueue the node. There is no need to lock tail, because only enqueuers touch it and enqLock already excludes other enqueuers.
4. Call getAndIncrement() on size, which goes from 1 to 2.
5. Release the lock.
6. If the queue was empty, notify waiting dequeuers.
Unsuccessful Enqueuer: Read size and find 8. Uh-oh: the queue is full.
Dequeuer (size is 2):
1. Lock deqLock.
2. Read the sentinel's next field: it is non-null, so it is OK to proceed.
3. Read the value from the first node.
4. Make the first node the new sentinel by advancing head.
5. Decrement size, which goes from 2 to 1.
6. Release deqLock.
Unsuccessful Dequeuer: size is 0. Read the sentinel's next field. Uh-oh: the queue is empty.
Bounded Queue:
    public class BoundedQueue<T> {
        ReentrantLock enqLock, deqLock;                  // enq & deq locks
        Condition notEmptyCondition, notFullCondition;
        AtomicInteger size;
        Node head;
        Node tail;
        int capacity;

        public BoundedQueue(int capacity) {              // other field initialization is not shown on the slide
            this.capacity = capacity;
            enqLock = new ReentrantLock();
            notFullCondition = enqLock.newCondition();
            deqLock = new ReentrantLock();
            notEmptyCondition = deqLock.newCondition();
        }
    }
Digression: Monitor Locks: Java synchronized objects and ReentrantLocks are monitors. They allow a thread to block on a condition rather than spin. Threads acquire and release the lock, and wait on a condition.
The Java Lock Interface:
    public interface Lock {
        void lock();                                             // acquire lock
        void lockInterruptibly() throws InterruptedException;    // never mind what this method does
        boolean tryLock();                                       // try for lock, but not too hard
        boolean tryLock(long time, TimeUnit unit) throws InterruptedException;
        Condition newCondition();                                // create condition to wait on
        void unlock();                                           // release lock
    }
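As a rough illustration of how these methods are typically used, the sketch below shows the usual lock()/finally/unlock() idiom next to a timed tryLock(). The class TryLockSketch and its counter are hypothetical names chosen for this example only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class TryLockSketch {
    private final Lock lock = new ReentrantLock();
    private int counter = 0;

    // Usual idiom: acquire, then release in a finally block.
    void increment() {
        lock.lock();
        try {
            counter++;
        } finally {
            lock.unlock();
        }
    }

    // "Try for lock, but not too hard": give up once the timeout expires.
    boolean tryIncrement(long timeoutMillis) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) counter++;
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt status
            return false;
        } finally {
            if (acquired) lock.unlock();          // only unlock what we actually hold
        }
    }

    int get() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }
}
```

Releasing in finally matters because an exception thrown inside the critical section would otherwise leave the lock held forever.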
Lock Conditions:
    public interface Condition {
        void await() throws InterruptedException;        // release lock and wait on condition
        boolean await(long time, TimeUnit unit) throws InterruptedException;
        void signal();                                   // wake up one waiting thread
        void signalAll();                                // wake up all waiting threads
    }
Await: q.await() releases the lock associated with q, sleeps (gives up the processor), awakens (resumes running), then reacquires the lock and returns.
Signal: q.signal() awakens one waiting thread, which will reacquire the lock.
Signal All: q.signalAll() awakens all waiting threads, each of which will reacquire the lock.
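The await/signalAll protocol can be sketched with a one-slot mailbox. This is an illustrative example, not code from the slides: the class name Mailbox, the rule that put() overwrites any unread item, and the use of null to mean "empty" are all assumptions made to keep it short.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class Mailbox<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition nonEmpty = lock.newCondition();
    private T item;   // null means the mailbox is empty (assumption of this sketch)

    // Store an item and wake every thread waiting for one.
    void put(T x) {
        lock.lock();
        try {
            item = x;
            nonEmpty.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Wait until an item is present, then remove and return it.
    T take() throws InterruptedException {
        lock.lock();
        try {
            // await() releases the lock while sleeping and reacquires it before
            // returning, so the condition must be rechecked in a loop.
            while (item == null) {
                nonEmpty.await();
            }
            T result = item;
            item = null;
            return result;
        } finally {
            lock.unlock();
        }
    }
}
```

The while loop (rather than an if) is the important design choice: a woken thread may find that another thread has already taken the item.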
A Monitor Lock: A thread calls lock() to enter the critical section and unlock() to leave it. Threads that cannot proceed wait in a waiting room.
Unsuccessful Deq: A thread calls deq() in the critical section, finds the queue empty ("Oh no, empty!"), and calls await(), which moves it to the waiting room.
Another One: A second dequeuer does the same and joins the first in the waiting room.
Enqueuer to the Rescue: An enqueuer calls lock(), performs enq() in the critical section, calls signalAll() to wake the waiting threads ("Yawn! Yawn!"), then calls unlock().
Monitor Signalling: An awakened thread might still lose the lock to an outside contender.
Dequeuers Signalled: One awakened dequeuer enters the critical section and finds the item ("Found it"). The other enters afterwards and finds the queue still empty.
Dollar Short + Day Late: The late dequeuer must wait again, which is why await() is called in a loop.
Java Synchronized Methods:
    public class Queue<T> {
        int head = 0, tail = 0;
        T[] items = (T[]) new Object[QSIZE];
        // each object has an implicit lock with an implicit condition
        public synchronized T deq() throws InterruptedException {   // lock on entry, unlock on return
            while (tail - head == 0)
                wait();                      // wait on implicit condition
            T result = items[head % QSIZE];
            head++;
            notifyAll();                     // signal all threads waiting on condition
            return result;
        }
    }
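The slide shows only deq(). A matching enq() might look like the sketch below, which fills in the rest of the class under stated assumptions: the name SyncQueue, the constructor taking a capacity (in place of the slide's QSIZE constant), and the full-queue test tail - head == items.length are not from the slides.

```java
class SyncQueue<T> {
    private final T[] items;
    private int head = 0, tail = 0;   // head: next slot to read; tail: next slot to write

    @SuppressWarnings("unchecked")
    SyncQueue(int capacity) {
        items = (T[]) new Object[capacity];   // generic arrays must be created via Object[]
    }

    public synchronized void enq(T x) throws InterruptedException {
        while (tail - head == items.length) {
            wait();                           // full: wait on the object's implicit condition
        }
        items[tail % items.length] = x;
        tail++;
        notifyAll();                          // wake any dequeuers waiting for an item
    }

    public synchronized T deq() throws InterruptedException {
        while (tail - head == 0) {
            wait();                           // empty: wait on the same implicit condition
        }
        T result = items[head % items.length];
        head++;
        notifyAll();                          // wake any enqueuers waiting for room
        return result;
    }
}
```

Because enqueuers and dequeuers share the single implicit condition, notifyAll() is used so that a wake-up meant for one kind of thread is not consumed by the other.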
(Pop!) Bounded Queue Fields: Back to the BoundedQueue class and its fields.
enqLock, deqLock: the enq & deq locks.
notFullCondition: the enq lock's associated condition (notEmptyCondition is created from the deq lock in the same way).
size: the number of items, from 0 to capacity.
head, tail: the head and tail of the linked list.
Enq Method Part One:
    public void enq(T x) throws InterruptedException {
        boolean mustWakeDequeuers = false;
        enqLock.lock();                              // lock and unlock enq lock
        try {
            while (size.get() == capacity)
                notFullCondition.await();            // wait while queue is full
            Node e = new Node(x);                    // add new node
            tail.next = e;
            tail = tail.next;
            if (size.getAndIncrement() == 0)
                mustWakeDequeuers = true;            // if queue was empty, wake frustrated dequeuers
        } finally {
            enqLock.unlock();
        }
        // ... continued in Part Deux
    }
Be Afraid: How do we know the queue won't become full again once the wait loop exits? Only enqueuers increase size, and this thread holds enqLock, so no other enqueuer can run.
Beware Lost Wake-Ups: Dequeuers are asleep in the waiting room. An enqueuer calls lock(), performs enq(), calls signal(), which wakes only one of them ("Yawn!"), then calls unlock().
Lost Wake-Up: A second enqueuer calls lock(), performs enq(), and calls unlock() without signalling.
Lost Wake-Up: The awakened dequeuer enters the critical section and finds an item ("Found it").
What's Wrong Here? Another dequeuer is still waiting.
Solution to Lost Wakeup: Always use signalAll() and notifyAll(), not signal() and notify().
Enq Method Part Deux:
    public void enq(T x) throws InterruptedException {
        // ... Part One ...
        if (mustWakeDequeuers) {                     // are there dequeuers to be signaled?
            deqLock.lock();                          // lock and unlock deq lock
            try {
                notEmptyCondition.signalAll();       // signal dequeuers that queue is no longer empty
            } finally {
                deqLock.unlock();
            }
        }
    }
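Putting the two halves of enq() together with a symmetric deq() gives a complete class. The sketch below is one way to assemble it, not a transcription of the slides: the deq() method, the mustWakeEnqueuers flag, the generic Node class, and the name BoundedQueueSketch are assumptions that mirror the pattern shown for enq().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class BoundedQueueSketch<T> {
    private static class Node<T> {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private final Condition notFullCondition = enqLock.newCondition();
    private final Condition notEmptyCondition = deqLock.newCondition();
    private final AtomicInteger size = new AtomicInteger(0);
    private final int capacity;
    private Node<T> head;   // sentinel; touched only while holding deqLock
    private Node<T> tail;   // last node; touched only while holding enqLock

    BoundedQueueSketch(int capacity) {
        this.capacity = capacity;
        head = new Node<>(null);
        tail = head;
    }

    public void enq(T x) throws InterruptedException {
        boolean mustWakeDequeuers = false;
        enqLock.lock();
        try {
            while (size.get() == capacity) notFullCondition.await();
            Node<T> e = new Node<>(x);
            tail.next = e;
            tail = e;
            if (size.getAndIncrement() == 0) mustWakeDequeuers = true;
        } finally {
            enqLock.unlock();
        }
        if (mustWakeDequeuers) {            // signalling requires holding the condition's lock
            deqLock.lock();
            try {
                notEmptyCondition.signalAll();
            } finally {
                deqLock.unlock();
            }
        }
    }

    public T deq() throws InterruptedException {
        T result;
        boolean mustWakeEnqueuers = false;
        deqLock.lock();
        try {
            while (size.get() == 0) notEmptyCondition.await();
            result = head.next.value;
            head = head.next;               // first node becomes the new sentinel
            if (size.getAndDecrement() == capacity) mustWakeEnqueuers = true;
        } finally {
            deqLock.unlock();
        }
        if (mustWakeEnqueuers) {
            enqLock.lock();
            try {
                notFullCondition.signalAll();
            } finally {
                enqLock.unlock();
            }
        }
        return result;
    }
}
```

Neither method ever holds both locks at once, so the two lock acquisitions cannot deadlock against each other.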
The enq() & deq() Methods: They share no locks. That's good. But they do share an atomic counter, accessed on every method call. That's not so good. Can we alleviate this bottleneck?
Split the Counter: The enq() method increments only, and cares only if the value is capacity. The deq() method decrements only, and cares only if the value is zero.
Split Counter: The enqueuer increments enqSize; the dequeuer decrements deqSize. When the enqueuer runs out, it locks deqLock and computes size = enqSize - deqSize. Synchronization is intermittent, not with each method call. This step needs both locks! (Careful.)
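One possible reading of the split-counter idea is sketched below. It is a simplification under several assumptions that are not in the slides: the methods are non-blocking (tryEnq() returns false when full, tryDeq() returns null when empty), the dequeuer's counter deqCount counts completed dequeues upward so that the reconciliation is a subtraction, and the dequeuer detects emptiness from the sentinel's next field instead of a counter.

```java
import java.util.concurrent.locks.ReentrantLock;

class SplitCounterQueue<T> {
    private static class Node<T> {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private final int capacity;
    private int enqSize = 0;    // guarded by enqLock: upper bound on the number of items
    private int deqCount = 0;   // guarded by deqLock: dequeues not yet subtracted from enqSize
    private Node<T> head = new Node<>(null);   // sentinel, guarded by deqLock
    private Node<T> tail = head;               // guarded by enqLock

    SplitCounterQueue(int capacity) { this.capacity = capacity; }

    public boolean tryEnq(T x) {
        enqLock.lock();
        try {
            if (enqSize == capacity) {
                // The enqueuer has run out: reconcile the counters while holding both locks.
                deqLock.lock();
                try {
                    enqSize -= deqCount;
                    deqCount = 0;
                } finally {
                    deqLock.unlock();
                }
                if (enqSize == capacity) return false;   // really full
            }
            Node<T> e = new Node<>(x);
            tail.next = e;
            tail = e;
            enqSize++;
            return true;
        } finally {
            enqLock.unlock();
        }
    }

    public T tryDeq() {
        deqLock.lock();
        try {
            Node<T> first = head.next;
            if (first == null) return null;   // empty
            head = first;                     // first node becomes the new sentinel
            deqCount++;
            return first.value;
        } finally {
            deqLock.unlock();
        }
    }
}
```

Locks are always taken in the order enqLock then deqLock, and tryDeq() never takes enqLock, which is what keeps the two-lock step free of deadlock.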
A Lock-Free Queue: Again a linked list with head and tail references, where head points to a sentinel node.
Compare and Set: The lock-free queue updates its references with compare-and-set (CAS).
Enqueue: enq() creates a new node to append after the current last node.
Logical Enqueue: CAS the last node's next field so that it refers to the new node.
Physical Enqueue: CAS the queue's tail field so that it refers to the new node.
Enqueue: These two steps are not atomic. The tail field refers to either the actual last node (good) or the penultimate node (not so good). Be prepared!
Enqueue: What do you do if you find a trailing tail? Stop and help fix it: if the tail node has a non-null next field, CAS the queue's tail field to tail.next. As in the universal construction.
When CASs Fail: During the logical enqueue, abandon hope and restart. Still lock-free (why?). During the physical enqueue, ignore it (why?).
Dequeuer: Read the value from the first node after the sentinel.
Dequeuer: Make the first node the new sentinel by using CAS to advance head.
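The enqueue and dequeue steps described above can be sketched with AtomicReference fields. The class name LockFreeQueueSketch and the choice to throw NoSuchElementException on an empty queue are assumptions of this sketch; it relies on the garbage collector to reclaim dequeued nodes.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicReference;

class LockFreeQueueSketch<T> {
    private static class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    LockFreeQueueSketch() {
        Node<T> sentinel = new Node<>(null);
        head = new AtomicReference<>(sentinel);
        tail = new AtomicReference<>(sentinel);
    }

    public void enq(T x) {
        Node<T> node = new Node<>(x);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (last != tail.get()) continue;             // tail moved under us: retry
            if (next == null) {
                // Logical enqueue: link the new node after the last node.
                if (last.next.compareAndSet(null, node)) {
                    // Physical enqueue: swing tail. A failure means another thread helped.
                    tail.compareAndSet(last, node);
                    return;
                }
            } else {
                tail.compareAndSet(last, next);           // trailing tail: help fix it
            }
        }
    }

    public T deq() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first != head.get()) continue;            // head moved under us: retry
            if (first == last) {
                if (next == null) throw new NoSuchElementException("queue is empty");
                tail.compareAndSet(last, next);           // trailing tail: help fix it
            } else {
                T value = next.value;                     // read value before moving head
                if (head.compareAndSet(first, next)) return value;   // first node is the new sentinel
            }
        }
    }
}
```

A failed logical-enqueue CAS means some other thread's CAS succeeded, so the system as a whole made progress even though this thread must retry.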
Memory Reuse? What do we do with nodes after we dequeue them? In Java, we can let the garbage collector deal with them. But suppose there is no GC, or we prefer not to use it?
Dequeuer: Once the CAS has moved head past the old sentinel, that node can be recycled.
Simple Solution: Each thread has a free list of unused queue nodes. To allocate a node, pop it from the list. To free a node, push it onto the list. Deal with underflow somehow.
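A per-thread free list could be sketched with ThreadLocal as below. The names NodePool, allocate(), and free() are hypothetical, and the underflow policy (fall back to creating a fresh node) is just one assumption about how to "deal with underflow somehow".

```java
import java.util.ArrayDeque;

class NodePool {
    static class Node {
        Object value;
        Node next;
    }

    // One free list per thread, so push and pop need no synchronization.
    private static final ThreadLocal<ArrayDeque<Node>> FREE_LIST =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Allocate: pop a recycled node if one is available.
    static Node allocate(Object value) {
        Node node = FREE_LIST.get().pollFirst();   // returns null when the list is empty
        if (node == null) {
            node = new Node();                     // underflow policy (assumption): create a new node
        }
        node.value = value;
        node.next = null;                          // reinitialize the recycled node
        return node;
    }

    // Free: push the node onto the calling thread's list.
    static void free(Node node) {
        node.value = null;
        FREE_LIST.get().addFirst(node);
    }
}
```

Note that allocate() can hand back the very same object that was just freed, which is exactly the situation that makes recycling hazardous for CAS-based code.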
Why Recycling is Hard: A thread wants to redirect head from the grey node to the red node, but falls asleep ("zzz") before its CAS.
Both Nodes Reclaimed: While it sleeps, both nodes are dequeued and placed in the free pool.
One Node Recycled: The grey node is taken from the free pool and reused in the queue. The sleeper wakes up ("Yawn!").
Why Recycling is Hard: The sleeper performs its CAS ("OK, here I go!").
Epic FAIL: head now refers to the red node, which is still in the free pool. ("zomg what went wrong?")
The Dreaded ABA Problem: The head reference has value A. A thread reads value A.
Dreaded ABA continued: The thread sleeps. The head reference now has value B, and node A is freed.
Dreaded ABA continued: The thread wakes. Node A has been recycled and reinitialized, and the head reference has value A again.
Dreaded ABA continued: The CAS succeeds because the references match, even though the reference's meaning has changed.
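The interleaving above can be replayed deterministically in a single thread. This is only a simulation under assumptions: the class AbaDemo, the node names A, B, and C, and the replay() method are invented for illustration, and the "other threads" are played by ordinary statements between the slow thread's read and its CAS.

```java
import java.util.concurrent.atomic.AtomicReference;

class AbaDemo {
    static class Node {
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    // Replays the A -> B -> A interleaving and returns where head ends up.
    static Node replay() {
        Node b = new Node("B");
        Node a = new Node("A");
        a.next = b;
        AtomicReference<Node> head = new AtomicReference<>(a);

        // Slow thread: reads head (A) and its successor (B), then falls asleep.
        Node seenHead = head.get();
        Node seenNext = seenHead.next;

        // Meanwhile, other threads run:
        head.set(b);                  // head has value B; node A is freed
        Node c = new Node("C");
        a.next = c;                   // node A is recycled and reinitialized in front of C
        head.set(a);                  // head has value A again; B is no longer in the queue

        // Slow thread wakes up: the CAS succeeds because the references match,
        // even though A's successor is now C rather than B.
        head.compareAndSet(seenHead, seenNext);
        return head.get();
    }

    public static void main(String[] args) {
        System.out.println(replay().name);   // prints "B", a node that has left the queue
    }
}
```

The correct outcome would have been for head to advance to C; the stale CAS instead installs B.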
The Dreaded ABA FAIL: ABA is a result of CAS() semantics (I blame Sun, Intel, AMD, ...). It does not arise with Load-Locked/Store-Conditional. Good for IBM?
Dreaded ABA, A Solution: Tag each pointer with a counter that is unique over the lifetime of the node. Issues: pointer size vs. word size, and overflow (don't worry, be happy? bounded tags?). Java provides the AtomicStampedReference class.
Atomic Stamped Reference: AtomicStampedReference<T> is a class in the java.util.concurrent.atomic package. It holds a reference (an address) together with an integer stamp, and can get the reference and stamp atomically.
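A short sketch of how the stamp defeats a stale CAS follows. The class StampedHeadDemo and its method are hypothetical; the two AtomicStampedReference calls used, get(int[] stampHolder) and compareAndSet(expectedReference, newReference, expectedStamp, newStamp), are part of the standard library.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedHeadDemo {
    // Returns true if a CAS based on a stale (reference, stamp) pair succeeds.
    static boolean staleCasSucceeds() {
        Object a = new Object();
        Object b = new Object();
        AtomicStampedReference<Object> head = new AtomicStampedReference<>(a, 0);

        // Slow thread: read the reference and its stamp atomically.
        int[] stampHolder = new int[1];
        Object seenRef = head.get(stampHolder);
        int seenStamp = stampHolder[0];

        // Meanwhile: head goes A -> B -> A, and every update bumps the stamp.
        head.compareAndSet(a, b, 0, 1);
        head.compareAndSet(b, a, 1, 2);

        // Slow thread wakes up: the reference matches (A) but the stamp does not (0 vs 2).
        return head.compareAndSet(seenRef, b, seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(staleCasSucceeds());   // prints "false"
    }
}
```

The stamp is an int, so it can wrap around in principle; this is the overflow concern raised on the previous slide.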
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
You are free: to Share (to copy, distribute and transmit the work); to Remix (to adapt the work).
Under the following conditions: Attribution. You must attribute the work to "The Art of Multiprocessor Programming" (but not in any way that suggests that the authors endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to http://creativecommons.org/licenses/by-sa/3.0/.
Any of the above conditions can be waived if you get permission from the copyright holder.
Nothing in this license impairs or restricts the author's moral rights.