Non-blocking Array-based Algorithms for Stacks and Queues
1 Non-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei, Department of Computer Science and Engineering, York University. ICDCN 09
2 Outline: Introduction; Stack algorithm; Queue algorithm; Time analysis; Verification; Implementation and comparison; Conclusion.
3 Computation model. Asynchronous distributed system: each process has its own independent clock. Shared memory: processes communicate through shared memory. Failures: any process may fail.
4 Shared stacks and queues. Fundamental data structures. Used in parallel applications, operating systems, and garbage collection.
5 Traditional implementation. Mutual exclusion (using locks). Disadvantages: one process may slow down the whole system; not fault tolerant; priority inversion; deadlock.
6 Non-blocking implementations. Non-blocking: some operation always completes in a finite number of steps. Advantages: immune to deadlock; fault tolerant; a slow process does not slow down the whole system. Disadvantage: complex and subtle.
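7 Linearizability. Correctness condition for shared objects: each operation appears to take effect instantaneously at some point (its linearization point) between its invocation and its response.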
8-11 Linearizability (figure, built up over four slides). A timeline of concurrent push(v1), push(v2) and pop operations on a stack. An X inside each operation's interval marks a candidate linearization point; different placements of the points give different results for pop (empty, v1 or v2) and different stack contents.
12 Compare and Swap. It is impossible to implement a non-blocking stack or queue using only atomic read/write registers, so a universal synchronization primitive is used: Compare and Swap (C&S).
13 Compare and Swap. The primitive executes atomically:
   C&S(X, old, new)
     if (X = old)
       X := new
       return true
     else
       return false
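The deck later says the implementation uses java.util.concurrent.atomic, where C&S corresponds to compareAndSet. The following is a minimal sketch of that correspondence plus the usual read, compute, C&S retry loop. The class name CasDemo and its two helper methods are illustrative and do not come from the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasDemo {
    // Same contract as C&S(X, old, new) on the slide, applied to an AtomicInteger X.
    static boolean cas(AtomicInteger x, int old, int newValue) {
        return x.compareAndSet(old, newValue);
    }

    // Typical non-blocking retry loop: read X, compute the new value,
    // and retry if another thread changed X before the C&S.
    static int increment(AtomicInteger x) {
        while (true) {
            int old = x.get();
            if (cas(x, old, old + 1)) {
                return old + 1;
            }
        }
    }
}
```

A failed C&S in increment means some other thread's C&S succeeded in the meantime, which is the sense in which "some operation always completes".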
14-16 ABA problem. Initially X = A. A process reads old = X. Before it performs its C&S, other processes set X = B and then X = A again. The process then executes C&S(X, old, new), which succeeds as if X had not been changed. Solution: counter values.
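The scenario can be replayed in a single thread. The sketch below contrasts a plain reference with a reference paired with a counter. It uses the JDK class AtomicStampedReference as a stand-in for the idea of a counter stored next to the value; the slides pack the counter into the same word instead, so this illustrates the principle rather than the author's layout. The class and method names are illustrative.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    // Plain C&S: the change A -> B -> A goes unnoticed, so the stale C&S succeeds.
    static boolean plainCasAfterAba() {
        AtomicReference<String> x = new AtomicReference<>("A");
        String old = x.get();
        x.set("B");                       // interference by another process
        x.set("A");                       // value is back to A
        return x.compareAndSet(old, "C");
    }

    // C&S on (value, counter): every write bumps the counter, so the stale C&S fails.
    static boolean countedCasAfterAba() {
        AtomicStampedReference<String> x = new AtomicStampedReference<>("A", 0);
        int[] counter = new int[1];
        String old = x.get(counter);      // reads the value and its counter together
        x.set("B", 1);                    // interference: counter 0 -> 1
        x.set("A", 2);                    // value is back to A, counter is now 2
        return x.compareAndSet(old, "C", counter[0], counter[0] + 1);
    }
}
```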
17 Categories of implementations. Link-based: use dynamically allocated nodes. Array-based: primarily use arrays.
18 Link-based versus array-based. Link-based: extra space required for pointers; memory management overhead. Array-based: compact data structure; leaves enough space in a word for counters; good locality of reference; fixed size.
19 Related work on stacks.
   Stack                        Array-based      Primitive used              Progress property
   Treiber [1986]               No               C&S                         Non-blocking
   Herlihy [1993]               No               C&S                         Wait-free
   Hendler et al. [2004]        No               C&S                         Non-blocking
   Boehm [2004]                 No               C&S                         Almost non-blocking
   Afek et al. [2006]           Infinite array   Swap, Test&Set, Fetch&Add   Wait-free
   Massalin and Pu [1992]       Yes              DC&S                        Non-blocking
   Shavit and Touitou [1997]    Yes              Read/Write                  Non-blocking (not linearizable)
   Shavit and Zemach [2000]     Yes              Read/Write                  Lock-based
20 Related work on queues.
   Queue                           Array-based      Primitive used     Progress property
   Herlihy [1993]                  No               C&S                Wait-free
   Prakash et al. [1994]           No               C&S                Non-blocking
   Valois [1994]                   No               C&S                Non-blocking
   Michael and Scott [1996]        No               C&S                Non-blocking
   Ladan-Mozes and Shavit [2004]   No               C&S                Non-blocking
   Moir et al. [2005]              No               C&S                Non-blocking
   Herlihy and Wing [2006]         Infinite array   Swap, Fetch&Add    Wait-free
   Valois [1994]                   Yes              DC&S               Non-blocking
   Tsigas and Zhang [2001]         Yes              C&S                Non-blocking (low probability of error)
   Colvin and Groves [2005]        Yes              C&S                Non-blocking
21 Contributions. A non-blocking array-based algorithm for stacks: the first practical non-blocking array-based stack implementation. A non-blocking array-based algorithm for queues using bounded counter values: the first time that bounded counter values are used to implement shared queues (and stacks).
22-27 Stack algorithm. Shared variables: an array (Stack) that stores the entries of the stack, and a register (Top) that stores the index of the top element of the stack. Both Stack and Top are changed using C&S. Each Stack entry holds (value, counter); Top holds (index, value, counter).
28-29 Example. In the middle of some execution: Top = (index 2, value V2, counter C2); Stack[3] = (Null, 0), Stack[2] = (V2, C2), Stack[1] = (V1, C1), Stack[0] = (Null, 0). Stack[0] is a dummy entry.
30-37 Example: Push(V3), with Top = (2, V2, C2) and Stack[3] = (Null, 0).
   step 1: read Top: (index, value, counter) = (2, V2, C2).
   step 2: try to complete the previous operation with a C&S on Stack[index] (helping method).
   step 3: check whether the stack is full: if index = 3, return Full. If Push returns Full, its linearization point is step 1.
   step 4: read the counter of the entry above the top, Stack[3] (here 0).
   step 5: try to change Top to (3, V3, 0+1) with a C&S on Top. If the C&S returns False, the attempt was unsuccessful. If it returns True, Top = (3, V3, 1) and this C&S is the linearization point.
38-40 Example: the next operation (Pop or Push) updates Stack.
   step 1: read Top: (index, value, counter) = (3, V3, 1).
   step 2: try to complete the previous operation with a C&S on Stack[index] (helping method). The C&S returns True and Stack[3] changes from (Null, 0) to (V3, 1). The array is updated based on the information (index, value, counter) in Top.
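The push steps and the helping step walked through above can be sketched in Java as follows. This is an illustration under stated assumptions, not the author's code. Entries and Top are immutable objects swapped by reference, so the packing of (index, value, counter) into a single word is not modelled. The names ArrayStackSketch, Entry, TopRecord and finish are hypothetical. The slides show no Pop, so the pop method here mirrors push and is an assumption. An empty stack is reported as null, so null values should not be pushed.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

class ArrayStackSketch<T> {
    // Array slot: (value, counter).
    private static final class Entry<T> {
        final T value; final int counter;
        Entry(T value, int counter) { this.value = value; this.counter = counter; }
    }
    // Top register: (index, value, counter).
    private static final class TopRecord<T> {
        final int index; final T value; final int counter;
        TopRecord(int index, T value, int counter) {
            this.index = index; this.value = value; this.counter = counter;
        }
    }

    private final AtomicReferenceArray<Entry<T>> stack;
    private final AtomicReference<TopRecord<T>> top;

    ArrayStackSketch(int capacity) {
        stack = new AtomicReferenceArray<>(capacity + 1);   // slot 0 is the dummy entry
        for (int i = 0; i <= capacity; i++) stack.set(i, new Entry<>(null, 0));
        top = new AtomicReference<>(new TopRecord<>(0, null, 0));
    }

    // Helping step: copy the pending (value, counter) from Top into Stack[index].
    private void finish(TopRecord<T> t) {
        Entry<T> e = stack.get(t.index);
        if (e.counter == t.counter - 1) {                   // slot not yet updated
            stack.compareAndSet(t.index, e, new Entry<>(t.value, t.counter));
        }
    }

    boolean push(T v) {
        while (true) {
            TopRecord<T> t = top.get();                     // step 1: read Top
            finish(t);                                      // step 2: help previous operation
            if (t.index == stack.length() - 1) return false; // step 3: stack is full
            int above = stack.get(t.index + 1).counter;     // step 4: counter above the top
            if (top.compareAndSet(t, new TopRecord<>(t.index + 1, v, above + 1))) {
                return true;                                // step 5: linearization point
            }
        }
    }

    // Assumed mirror image of push; not shown on the slides.
    T pop() {
        while (true) {
            TopRecord<T> t = top.get();
            finish(t);
            if (t.index == 0) return null;                  // only the dummy entry: empty
            Entry<T> below = stack.get(t.index - 1);
            if (top.compareAndSet(t,
                    new TopRecord<>(t.index - 1, below.value, below.counter + 1))) {
                return t.value;
            }
        }
    }
}
```

A successful operation changes only Top. The matching array slot is written later by whichever operation next reads Top, which is why every operation begins with the helping step.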
41-42 An execution: an interleaving of the steps of the processes. On the timeline (figure), every successful change of Top is followed by an update of the array before the next change of Top: Change Top, Update array, Change Top, Update array, Change Top, Update array, Change Top.
43-45 Structure of proof. (1) Top is not set to the same value twice, so the ABA problem on Top is avoided. (2) Exactly how the shared array is changed. (3) At any time during an execution, the data structure correctly represents the abstract stack.
46 Queue algorithm. The same technique can be used to implement an array-based algorithm for queues. For more details, refer to my M.Sc. thesis.
47-53 Queue algorithm. Shared variables: a circular array (Queue) that stores the entries of the queue, a register (Rear) that stores the index of the rear element of the queue, and a register (Front) that stores the index of the front element of the queue. Queue, Rear and Front are changed using C&S. Each Queue entry holds (value, counter); Rear holds (index, value, counter); Front holds (index, counter), and the counter in Front is independent from Queue.
54 Example. In the middle of some execution: Rear = (index 2, value V2, counter C2); Front = (index 1, counter 0); Queue[3] = (Null, 0), Queue[2] = (V2, C2), Queue[1] = (V1, C1), Queue[0] = (Null, 0).
55-64 Example: Enqueue(V3), with Rear = (2, V2, C2) and Queue[3] = (Null, 0).
   step 1: read Rear.
   step 2: read Front.
   step 3: if Rear changed since step 1, go to step 1 (this gives a consistent view of Rear and Front).
   step 4: try to complete the previous enqueue with a C&S on Queue[2] (helping method).
   step 5: check whether the queue is full. If Enqueue returns Full, its linearization point is step 2.
   step 6: read the counter of the entry above the rear, Queue[3].
   step 7: try to change Rear to (3, V3, 1) with a C&S on Rear. If the C&S returns False, the attempt was unsuccessful. If it returns True, Rear = (3, V3, 1) and this C&S is the linearization point.
65-66 Queue algorithm. Concurrent dequeue and enqueue operations interfere with each other only if there is only one element in the queue. (Figure: Rear = (1, V1, C1) and Front, with index 1, both refer to the single entry Queue[1] = (V1, C1); all other entries are (Null, 0).)
67 How to make counters bounded? Main idea: reuse counter values. Employ an adaptive collect object.
68 Collect object. Store: each process can store some value into its component of the collect object. Collect: each process can collect the values that have been stored in the collect object.
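The Store/Collect interface can be sketched with one array slot per process. This is a deliberately simple, non-adaptive version: collect reads every component, whereas the deck calls for an adaptive collect object whose cost depends on contention. The class name SimpleCollect and the use of integer process ids are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReferenceArray;

class SimpleCollect<V> {
    // One component per process; null means "nothing stored yet".
    private final AtomicReferenceArray<V> components;

    SimpleCollect(int processes) {
        components = new AtomicReferenceArray<>(processes);
    }

    // Store: process pid overwrites its own component.
    void store(int pid, V value) {
        components.set(pid, value);
    }

    // Collect: read every component once and return the values found.
    List<V> collect() {
        List<V> seen = new ArrayList<>();
        for (int i = 0; i < components.length(); i++) {
            V v = components.get(i);
            if (v != null) {
                seen.add(v);
            }
        }
        return seen;
    }
}
```

Each process writes only its own slot, so store needs no C&S. A collect is not an atomic snapshot: it may combine values written at different times.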
69 Queue algorithm using bounded counters. Rear holds (index, value, old counter, new counter); Front holds (index, counter), with a counter independent from Queue.
70-71 Queue algorithm using bounded counters. Main idea: processes store the counter values that they may use into the collect object. To choose a new counter value, a process chooses a value different from the values in the collect object.
72-75 Queue algorithm using bounded counters: Enqueue(V3).
   step 1: read Rear.
   step 1': announce the counter values that it may use (storing).
   step 2: read Front.
   step 3: if Rear changed since step 1, go to step 1.
   step 4: try to complete the previous operation with a C&S on Queue.
   step 5: check whether the queue is full.
   step 6: read the counter of the entry above the rear, Queue[3].
   step 6': choose a new counter different from the values in use (collecting).
   step 7: try to change Rear to (3, V3, old, new) with a C&S on Rear.
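The choice in step 6' can be illustrated in isolation. The sketch assumes counters are non-negative integers and picks the smallest one that does not appear among the collected values. The slides do not say which unused value is chosen, so this selection rule and the name FreshCounter are assumptions.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class FreshCounter {
    // Returns the smallest non-negative counter that is not in use.
    // With at most k values in use, the result lies in the range 0..k,
    // which is why the counters stay bounded.
    static int choose(Collection<Integer> inUse) {
        Set<Integer> taken = new HashSet<>(inUse);
        int candidate = 0;
        while (taken.contains(candidate)) {
            candidate++;
        }
        return candidate;
    }
}
```

For example, if the collected values are 0, 1 and 3, the call returns 2.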
76-79 Structure of proof. (1) If a C&S succeeds, Rear or Front has not been changed since the process last read it (this avoids the ABA problem). (2) Exactly how the shared array is changed. (3) What happens in the data structure exactly matches the abstract queue. (4) Operations return results consistent with their linearization order.
80 Time analysis. In a non-blocking implementation, an operation can take arbitrarily many steps as long as some other operation is making progress. Amortized analysis evaluates the system as a whole: each unsuccessful loop iteration is blamed on another operation that did successfully change the shared variables. The worst-case amortized cost of our algorithms depends only on point contention. Point contention: the maximum number of processes running concurrently at any time.
81-89 Time analysis (figure, built up over nine slides). A timeline of four concurrent operations op1 to op4, with successful changes at times T1 (op4), T2 (op2), T3 (op3) and T4 (op1). Unsuccessful loop iterations are blamed on the operations that succeeded: op1 blames op4, op2 and op3; op2 blames op4; op3 blames op4 and op2. Number of unsuccessful loop iterations: (point contention(t_i) - 1).
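Point contention, as defined on slide 80, can be computed from the start and end times of the operations. The sketch below is only a worked illustration of the definition and is not part of the analysis in the deck. It assumes each operation is a half-open interval [start, end) with start < end, and the class name PointContention is hypothetical.

```java
import java.util.Arrays;

class PointContention {
    // intervals[k] = {start, end} of operation k, with start < end.
    // Returns the maximum number of operations running at the same time.
    static int of(int[][] intervals) {
        int n = intervals.length;
        int[] starts = new int[n];
        int[] ends = new int[n];
        for (int k = 0; k < n; k++) {
            starts[k] = intervals[k][0];
            ends[k] = intervals[k][1];
        }
        Arrays.sort(starts);
        Arrays.sort(ends);
        int running = 0, max = 0, i = 0, j = 0;
        while (i < n) {
            if (starts[i] < ends[j]) {   // next event is a start
                running++;
                max = Math.max(max, running);
                i++;
            } else {                     // next event is an end
                running--;
                j++;
            }
        }
        return max;
    }
}
```

For the intervals {0,10}, {1,4}, {2,6}, {5,8} the result is 3, so by the bound on slide 89 a successful change would be blamed for at most 3 - 1 = 2 unsuccessful loop iterations.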
90 Model checking with the Spin model checker. Define abstract stack/queue variables. Atomically change the abstract stack/queue at the linearization points of successful operations. At linearization points, assert that the contents of the shared data structures are the same as the state of the abstract stack/queue. Define end-state labels where operations return, to make sure all operations terminate.
91 Model checking. Our algorithms are verified for four operations and an array of size three, using exhaustive search with partial order reduction.
92 Implementations. Our stack algorithm using unbounded counter values is compared with Treiber's stack algorithm. Our queue algorithm using unbounded counter values is compared with the queue algorithm of Michael and Scott and the array-based queue algorithm of Colvin and Groves. Implemented in Java (java.util.concurrent.atomic) on a system with two quad processors.
93 Comparison. Compared under both low and high contention. The total number of operations is the same in all executions; executions have different numbers of threads. 50 runs.
94 Comparison of concurrent stack algorithms (figure).
95 Comparison of concurrent queue algorithms (figure).
96 Conclusions. New array-based algorithms for stacks and queues. The amortized time complexity of an operation depends on point contention. Verification using the Spin model checker. Implementation and comparison. Our stack implementation is the first practical non-blocking array-based stack implementation. It is the first time that bounded counter values are used to implement a shared queue (and stack).
97 Thank you!
More informationCS377P Programming for Performance Multicore Performance Synchronization
CS377P Programming for Performance Multicore Performance Synchronization Sreepathi Pai UTCS October 21, 2015 Outline 1 Synchronization Primitives 2 Blocking, Lock-free and Wait-free Algorithms 3 Transactional
More informationNonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors 1
journal of parallel and distributed computing 51, 126 (1998) article no. PC981446 Nonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors 1 Maged M. Michael
More informationConcurrent Preliminaries
Concurrent Preliminaries Sagi Katorza Tel Aviv University 09/12/2014 1 Outline Hardware infrastructure Hardware primitives Mutual exclusion Work sharing and termination detection Concurrent data structures
More informationIndistinguishability: Friend and Foe of Concurrent Data Structures. Hagit Attiya CS, Technion
Indistinguishability: Friend and Foe of Concurrent Data Structures Hagit Attiya CS, Technion Uncertainty is a main obstacle for designing correct applications in concurrent systems Formally captured by
More informationSynchronization and memory consistency on Intel Single-chip Cloud Computer. Ivan Walulya
Synchronization and memory consistency on Intel Single-chip Cloud Computer Master of Science Thesis in Programme Computer Systems and Networks Ivan Walulya Chalmers University of Technology University
More informationPer-Thread Batch Queues For Multithreaded Programs
Per-Thread Batch Queues For Multithreaded Programs Tri Nguyen, M.S. Robert Chun, Ph.D. Computer Science Department San Jose State University San Jose, California 95192 Abstract Sharing resources leads
More informationParallel Programming in Distributed Systems Or Distributed Systems in Parallel Programming
Parallel Programming in Distributed Systems Or Distributed Systems in Parallel Programming Philippas Tsigas Chalmers University of Technology Computer Science and Engineering Department Philippas Tsigas
More informationHåkan Sundell University College of Borås Parallel Scalable Solutions AB
Brushing the Locks out of the Fur: A Lock-Free Work Stealing Library Based on Wool Håkan Sundell University College of Borås Parallel Scalable Solutions AB Philippas Tsigas Chalmers University of Technology
More informationLinked Lists: The Role of Locking. Erez Petrank Technion
Linked Lists: The Role of Locking Erez Petrank Technion Why Data Structures? Concurrent Data Structures are building blocks Used as libraries Construction principles apply broadly This Lecture Designing
More informationSynchronization COMPSCI 386
Synchronization COMPSCI 386 Obvious? // push an item onto the stack while (top == SIZE) ; stack[top++] = item; // pop an item off the stack while (top == 0) ; item = stack[top--]; PRODUCER CONSUMER Suppose
More informationConcurrent Queues and Stacks. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Concurrent Queues and Stacks Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit The Five-Fold Path Coarse-grained locking Fine-grained locking Optimistic synchronization
More informationCS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University
CS 112 Introduction to Computing II Wayne Snyder Department Boston University Today Introduction to Linked Lists Stacks and Queues using Linked Lists Next Time Iterative Algorithms on Linked Lists Reading:
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 31 October 2012 Lecture 6 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability
More informationLinearizability Checking: Reductions to State Reachability
Linearizability Checking: Reductions to State Reachability Ahmed Bouajjani Univ Paris Diderot - Paris 7 Joint work with Michael Emmi Constantin Enea Jad Hamza Bell Labs, Nokia Univ Paris Diderot EPFL IMS-NUS,
More informationHazard Pointers. Number of threads unbounded time to check hazard pointers also unbounded! difficult dynamic bookkeeping! thread B - hp1 - hp2
Hazard Pointers Store pointers of memory references about to be accessed by a thread Memory allocation checks all hazard pointers to avoid the ABA problem thread A - hp1 - hp2 thread B - hp1 - hp2 thread
More informationLock-free Serializable Transactions
Lock-free Serializable Transactions Jeff Napper jmn@cs.utexas.edu Lorenzo Alvisi lorenzo@cs.utexas.edu Laboratory for Advanced Systems Research Department of Computer Science The University of Texas at
More informationAgenda. Lecture. Next discussion papers. Bottom-up motivation Shared memory primitives Shared memory synchronization Barriers and locks
Agenda Lecture Bottom-up motivation Shared memory primitives Shared memory synchronization Barriers and locks Next discussion papers Selecting Locking Primitives for Parallel Programming Selecting Locking
More informationClustered Communication for Efficient Pipelined Multithreading on Commodity MCPs
Clustered Communication for Efficient Pipelined Multithreading on Commodity MCPs Yuanming Zhang, Kanemitsu Ootsu, Takashi Yokota, and Takanobu Baba Abstract Low inter-core communication overheads are critical
More informationSimple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms. M.M. Michael and M.L. Scott. Technical Report 600 December 1995
Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms M.M. Michael and M.L. Scott Technical Report 600 December 1995 UNIVERSITY OF ROCHESTER COMPUTER SCIENCE 9960605 014 rroggtmo?rstäi
More informationCSc33200: Operating Systems, CS-CCNY, Fall 2003 Jinzhong Niu December 10, Review
CSc33200: Operating Systems, CS-CCNY, Fall 2003 Jinzhong Niu December 10, 2003 Review 1 Overview 1.1 The definition, objectives and evolution of operating system An operating system exploits and manages
More informationCS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University
CS 333 Introduction to Operating Systems Class 3 Threads & Concurrency Jonathan Walpole Computer Science Portland State University 1 The Process Concept 2 The Process Concept Process a program in execution
More informationC09: Process Synchronization
CISC 7310X C09: Process Synchronization Hui Chen Department of Computer & Information Science CUNY Brooklyn College 3/29/2018 CUNY Brooklyn College 1 Outline Race condition and critical regions The bounded
More informationInformation Science 2
Information Science 2 - Basic Data Structures- Week 02 College of Information Science and Engineering Ritsumeikan University Today s class outline l Basic data structures: Definitions and implementation
More informationConcurrency: Mutual Exclusion and Synchronization. Concurrency
Concurrency: Mutual Exclusion and Synchronization Chapter 5 1 Concurrency Multiple applications Structured applications Operating system structure 2 1 Concurrency 3 Difficulties of Concurrency Sharing
More informationReview: Easy Piece 1
CS 537 Lecture 10 Threads Michael Swift 10/9/17 2004-2007 Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift 1 Review: Easy Piece 1 Virtualization CPU Memory Context Switch Schedulers
More informationThreading and Synchronization. Fahd Albinali
Threading and Synchronization Fahd Albinali Parallelism Parallelism and Pseudoparallelism Why parallelize? Finding parallelism Advantages: better load balancing, better scalability Disadvantages: process/thread
More informationDr. D. M. Akbar Hussain DE5 Department of Electronic Systems
Concurrency 1 Concurrency Execution of multiple processes. Multi-programming: Management of multiple processes within a uni- processor system, every system has this support, whether big, small or complex.
More informationDCAS-Based Concurrent Deques
DCAS-Based Concurrent Deques Ole Agesen 1 David. Detlefs Christine H. Flood Alexander T. Garthwaite Paul A. Martin Mark Moir Nir N. Shavit 1 Guy. Steele Jr. VMware Tel Aviv University Sun Microsystems
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 17 November 2017 Lecture 7 Linearizability Lock-free progress properties Hashtables and skip-lists Queues Reducing contention Explicit
More informationConcurrent specifications beyond linearizability
Concurrent specifications beyond linearizability Éric Goubault Jérémy Ledent Samuel Mimram École Polytechnique, France OPODIS 2018, Hong Kong December 19, 2018 1 / 14 Objects Processes communicate through
More informationLecture 10: Avoiding Locks
Lecture 10: Avoiding Locks CSC 469H1F Fall 2006 Angela Demke Brown (with thanks to Paul McKenney) Locking: A necessary evil? Locks are an easy to understand solution to critical section problem Protect
More informationChapter 6: Process Synchronization
Chapter 6: Process Synchronization Objectives Introduce Concept of Critical-Section Problem Hardware and Software Solutions of Critical-Section Problem Concept of Atomic Transaction Operating Systems CS
More informationCSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)
CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) Past & Present Have looked at two constraints: Mutual exclusion constraint between two events is a requirement that
More information.:: UNIT 4 ::. STACK AND QUEUE
.:: UNIT 4 ::. STACK AND QUEUE 4.1 A stack is a data structure that supports: Push(x) Insert x to the top element in stack Pop Remove the top item from stack A stack is collection of data item arrange
More informationA Wait-Free Queue for Multiple Enqueuers and Multiple Dequeuers Using Local Preferences and Pragmatic Extensions
A Wait-Free Queue for Multiple Enqueuers and Multiple Dequeuers Using Local Preferences and Pragmatic Extensions Philippe Stellwag, Alexander Ditter, Wolfgang Schröder-Preikschat Friedrich-Alexander University
More informationMidterm Exam Amy Murphy 6 March 2002
University of Rochester Midterm Exam Amy Murphy 6 March 2002 Computer Systems (CSC2/456) Read before beginning: Please write clearly. Illegible answers cannot be graded. Be sure to identify all of your
More informationDCAS-Based Concurrent Deques
DCAS-Based Concurrent Deques Ole Agesen David. Detlefs Christine H. Flood Alexander T. Garthwaite Paul A. Martin Nir N. Shavit VMware Sun Microsystems aboratories Guy. Steele Jr. Abstract The computer
More informationSynchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011
Synchronization CS61, Lecture 18 Prof. Stephen Chong November 3, 2011 Announcements Assignment 5 Tell us your group by Sunday Nov 6 Due Thursday Nov 17 Talks of interest in next two days Towards Predictable,
More informationScalable Flat-Combining Based Synchronous Queues
Scalable Flat-Combining Based Synchronous Queues Danny Hendler 1, Itai Incze 2, Nir Shavit 2,3 and Moran Tzafrir 2 1 Ben-Gurion University 2 Tel-Aviv University 3 Sun Labs at Oracle Abstract. In a synchronous
More informationConcurrent Programming: Algorithms, Principles, and Foundations
Concurrent Programming: Algorithms, Principles, and Foundations Algorithms, Principles, and Foundations Bearbeitet von Michel Raynal 1. Auflage 2012. Buch. xxxii, 516 S. Hardcover ISBN 978 3 642 32026
More informationCPSC/ECE 3220 Summer 2018 Exam 2 No Electronics.
CPSC/ECE 3220 Summer 2018 Exam 2 No Electronics. Name: Write one of the words or terms from the following list into the blank appearing to the left of the appropriate definition. Note that there are more
More informationThread-Local. Lecture 27: Concurrency 3. Dealing with the Rest. Immutable. Whenever possible, don t share resources
Thread-Local Lecture 27: Concurrency 3 CS 62 Fall 2016 Kim Bruce & Peter Mawhorter Some slides based on those from Dan Grossman, U. of Washington Whenever possible, don t share resources Easier to have
More informationStanford University Computer Science Department CS 140 Midterm Exam Dawson Engler Winter 1999
Stanford University Computer Science Department CS 140 Midterm Exam Dawson Engler Winter 1999 Name: Please initial the bottom left corner of each page. This is an open-book exam. You have 50 minutes to
More informationBQ: A Lock-Free Queue with Batching
BQ: A Lock-Free Queue with Batching Gal Milman Technion, Israel galy@cs.technion.ac.il Alex Kogan Oracle Labs, USA alex.kogan@oracle.com Yossi Lev Oracle Labs, USA levyossi@icloud.com ABSTRACT Victor Luchangco
More informationDESIGN CHALLENGES FOR SCALABLE CONCURRENT DATA STRUCTURES for Many-Core Processors
DESIGN CHALLENGES FOR SCALABLE CONCURRENT DATA STRUCTURES for Many-Core Processors DIMACS March 15 th, 2011 Philippas Tsigas Data Structures In Manycore Sys. Decomposition Synchronization Load Balancing
More informationConcurrency. Chapter 5
Concurrency 1 Chapter 5 2 Concurrency Is a fundamental concept in operating system design Processes execute interleaved in time on a single processor Creates the illusion of simultaneous execution Benefits
More informationLinearizability of Persistent Memory Objects
Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Workshop on the Theory of Transactional
More informationAn Efficient Synchronisation Mechanism for Multi-Core Systems
An Efficient Synchronisation Mechanism for Multi-Core Systems Marco Aldinucci 1, Marco Danelutto 2, Peter Kilpatrick 3, Massimiliano Meneghin 4, and Massimo Torquati 2 1 Computer Science Department, University
More information