Lock-free algorithms for Kotlin coroutines. It is all about scalability Presented at SPTCC 2017 /Roman JetBrains
|
|
- Preston Robinson
- 5 years ago
- Views:
Transcription
1 Lock-free algorithms for Kotlin coroutines It is all about scalability resented at TCC 2017 /Roman JetBrains
2 peaker: Roman Elizarov 16+ years experience reviously developed high-perf trading Devexperts Teach concurrent & distributed t. etersburg ITMO University Chief Northeastern European Region of ACM ICC Now work on JetBrains
3 Agenda Kotlin coroutines overview & motivation for lockfree algorithms Lock-free doubly linked list Lock-free multi-word compare-and-swap Combining them to get more complex atomic operations (without TM)
4 Kotlin basic facts Kotlin is a JVM language developed by JetBrains General purpose and statically-typed Object-oriented and functional paradigms Open source under Apache 2.0 Reached version 1.0 in 2016 Compatibility commitment Now at version 1.1 Officially supported by Google on Android
5 Kotlin is Modern Concise afe Extensible ragmatic Fun to work with!
6 Kotlin is pragmatic and easy to learn
7 Coroutines Asynchronous programming made easy
8 How do we write code that waits for something most of the time?
9 Blocking threads Kotlin fun postitem(item: Item) { val token = requesttoken() val post = submitost(token, item) processost(post) }
10 Callbacks Kotlin fun postitem(item: Item) { requesttoken { token -> submitost(token, item) { post -> processost(post) } } }
11 Futures/romises/Rx Kotlin fun postitem(item: Item) { requesttoken().thencompose { token -> submitost(token, item) }.thenaccept { post -> processost(post) } }
12 Coroutines Kotlin fun postitem(item: Item) { launch(commonool) { val token = requesttoken() val post = submitost(token, item) processost(post) } }
13 C & Actor models A style of programming for modern systems Lots of concurrent tasks / jobs Waiting most of the time Communicating all the time hare data by communicating
14 Kotlin coroutines primitives Jobs/Deferreds (futures) join/await Channels send & receive synchronous & buffered channels elect/alternatives Atomically wait on multiple events Cancellation arent-child hierarchies
15 Implementation challenges Coroutines are like light-weight threads All the low-level scheduling & communication mechanisms have to scale to lots of coroutines
16 Lock-free algorithms
17 Building blocks ingle-word CA (that s all we have on JVM) Automatic memory management (GC) ractical lock-free algorithms Lock-Free and ractical Doubly Linked List-Based Deques Using ingle-word Compare-and-wap by undell and Tsigas A ractical Multi-Word Compare-and-wap Operation by Timothy L. Harris, Keir Fraser and Ian A. ratt.
18 Doubly linked list next links form logical list contents prev links are auxiliary H N 1 N T sentinel sentinel Use same node in practice
19 Insert ushright (like in queue)
20 Doubly linked list (insert 0) 2 2 N create & init H N 1 N 1 T
21 Doubly linked list (insert 1) 2 N H N 1 N CA T Retry insert on CA failure
22 Doubly linked list (insert 2) finish insert 2 N H N 1 N CA T Ignore CA failure
23 Remove opleft (like in queue)
24 Doubly linked list (remove 1) 2 Mark removed node s next link H N 1 1 N CA T Retry remove on CA failure Use wrapper object for mark in practice Don t use AtomicMarkableReference Cache wrappers in pointed-to nodes
25 Doubly linked list (remove 2) finish remove Mark removed node s prev link H N CA 1 N T Retry marking on CA failure
26 Doubly linked list (remove 3) help remove fixup next links H CA N 1 N T
27 Doubly linked list (remove 4) correct prev fixup prev links H N 1 N T CA
28 tate transitions
29 Node states correct prev Init next: Ok prev: Ok prev.next: -- next.prev: -- Insert 1 next: Ok prev: Ok prev.next: me next.prev: -- Insert 2 next: Ok prev: Ok prev.next: me next.prev: me Remove 1 next: Rem prev: Ok prev.next: me next.prev: me Remove 2 next: Rem prev: Rem prev.next: me next.prev: me Remove 3 next: Rem prev: Rem prev.next: ++ next.prev: me Remove 4 next: Rem prev: Rem prev.next: ++ next.prev: ++ help remove correct prev
30 Helping
31 Concurrent insert
32 Concurrent insert (0) I2 I3 2 N H N 1 N T 3 N
33 Concurrent insert (1) I2 I3 2 N H N 1 N CA ok T CA fail 3 N
34 Concurrent insert (2) I2 I3 2 N H N 1 N T 3 N detect wrong prev (t.prev.next!= t)
35 Concurrent insert (3) I2 I3 2 N correct prev H N 1 N T 3 N
36 Concurrent insert (4) I2 I3 2 N H N 1 N T 3 N reinit & repeat
37 Concurrent remove
38 Concurrent remove (0) R1 R2 H N 1 N 2 N T
39 Concurrent remove (1) R1 R2 R1 H N 1 N 2 N T
40 Concurrent remove (2) R1 R2 R1 H N 1 N 2 N T Finds already removed R2
41 Concurrent remove (3) R1 R2 R1 help remove H N 1 N 2 N T mark prev R2
42 Concurrent remove (4) R1 R2 R1 H N 1 N 2 N T Retry with corrected next R2
43 Concurrent remove (5) R1 R2 R1 help remove H N 1 N 2 N T R2
44 Concurrent remove (6) R1 R2 R1 H N 1 N 2 N T correct prev R2
45 Concurrent remove & insert When remove wins
46 Concurrent remove & insert (0) R1 I2 2 N create & init H N 1 N T R1
47 Concurrent remove & insert (1) R1 I2 2 N H N remove first 1 N T R1
48 Concurrent remove & insert (2) R1 I2 2 N CA fail H N 1 N T R1
49 Concurrent remove & insert (3) R1 I2 2 N H N 1 N T R1 detect wrong prev (t.prev.next -- removed) do correct prev
50 Concurrent remove & insert (4) R1 I2 2 N fixup next H N mark prev 1 N T R1
51 Concurrent remove & insert (5) R1 I2 2 N H N 1 N T update prev R1
52 Concurrent remove & insert (6) R1 I2 2 N reinit & repeat H N 1 N T R1
53 Concurrent remove & insert When insert wins
54 Concurrent remove & insert (0) R1 I2 2 N create & init H N 1 N T R1
55 Concurrent remove & insert (1) R1 I2 2 N H N 1 N CA T R1
56 Concurrent remove & insert (2) R1 I2 2 N H N 1 N T will succeed marking on remove retry R1
57 Concurrent remove & insert (3) R1 I2 help remove 2 N H N 1 N T mark prev R1
58 Concurrent remove & insert (4) R1 I2 correct prev 2 N H N 1 N T R1 Remove is over!
59 Concurrent remove & insert (5) R1 I2 2 N H N 1 N T correct prev R1
60 Takeaways A kind of algo you need a paper for Hard to improve w/o writing another paper Good news: stress tests uncover most impl bugs Bad news: when stress test fails, you up to long hours More bad news: hard to find bugs that violate lockfreedomness of algorithm
61 ummary: what we can do Insert items (at the end of the queue) Remove items (at the front of the queue) Traverse the list Remove items at arbitrary locations In O(1)
62 Linearizability Insert last Linearizes at CA of next Remove first / arbitrary uccess at CA of next Fail at read of head.next
63 More about algorithm undell & Tsigas algo supports deque operations Can ushleft & opright opleft is simple read head.next & remove But cannot linearize them all at cas points ushleft, ushright, opright - Ok opleft linearizes at head.next read (!!!)
64 ummary of impl notes Use GC (drop all memory management details) Merge head & tail into a single sentinel node Empty list is just one object (prev & next onto itself) One item += one object Q N 1 N Reuse remove mark objects One-element lists reuse of ptrs to sentinel all the time Encapsulate!
65
66 Mods More complex atomic operations
67 Basic mods (1) Insert item conditionally on prev tail value 2 N check & bailout before CA H N 1 N T
68 Basic mods (2) Remove head conditionally on prev head value check & bailout before CA H N 1 N T R1
69 ractical use-case: synchronous channels 1 2 val channel = Channel<Int>() // coroutine #1 for (x in 1..5) { channel.send(x * x) } 3 // coroutine #2 repeat(5) { println(channel.receive()) }
70 enders wait Incoming receivers More senders ender #1 H ender #2 ender #3 T Receiver removes first if it is a sender node ender inserts last if it is not a receiver node
71 Receivers wait Incoming senders More receivers Receiver #1 H Receiver #2 Receiver #3 T ender removes first if it is a receiver node Receiver inserts last if it is not a sender node
72 end function sketch 1 fun send(element: T) { while (true) { // try to add sender, unless prev is receiver 2 if (enqueueend(element)) break // try to remove first receiver 3 val receiver = removefirstreceiver() if (receiver!= null) { 4 receiver.resume(element) // resume receiver break } } }
73 Channel use-case recap Uses insert/remove ops conditional on tail/head node Can abort (cancel) wait to receive/send at any time by using remove Full removal -- no garbage is left retty efficient in practice One item lists one garbage object
74 Multi-word compare and swap (CAN) Build even bigger atomic operations
75 Use-case: select expression val channel1 = Channel<Int>() val channel2 = Channel<Int>() select { channel1.onreceive { e ->... } channel2.onreceive { e ->... } }
76 Impl summary: register (1) elect status: N 1. Not selected 2. elected Channel1 Queue Channel2 Queue
77 Impl summary: register (2) elect status: N Add node to channel1 queue if not selected (N) yet Channel1 Queue N1 Channel2 Queue
78 Impl summary: register (3) elect status: N Add node to channel2 queue if not selected (N) yet Channel1 Queue N1 Channel2 Queue N2
79 Impl summary: wait elect status: N Channel1 Queue N1 Channel2 Queue N2
80 Impl summary: select (resume) elect status: Make selected and remove node from queue Channel1 Queue N1 Channel2 Queue
81 Impl summary: clean up rest elect status: Remove non-selected waiters from queue Channel1 Queue Channel2 Queue
82 Double-Compare ingle-wap (DC) Building block for CAN
83 DC spec in pseudo-code A B 1 2 fun <A,B> dcss( a: Ref<A>, expecta: A, updatea: A, b: Ref<B>, expectb: B) = atomic { 3 if (a.value == expecta && b.value == expectb) { 4 a.value = updatea } }
84 DC: init descriptor expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb)
85 DC: prepare expecta A B expectb updatea CA ptr to descriptor if a.value == expecta DC Descriptor (a, expecta, updatea, b, expectb)
86 DC: read b.value expecta A B expectb updatea CA ptr to descriptor if a.value == expecta DC Descriptor (a, expecta, updatea, b, expectb)
87 DC: complete (when success) expecta A B expectb updatea CA to updated value if a still points to descriptor DC Descriptor (a, expecta, updatea, b, expectb)
88 DC: complete (alternative) expecta A B!expectB updatea DC Descriptor (a, expecta, updatea, b, expectb)
89 DC: complete (when fail) expecta A B!expectB updatea CA to original value if a still points to descriptor DC Descriptor (a, expecta, updatea, b, expectb)
90 DC: tates Any other thread encountering descriptor helps complete 1 2 Init prep ok A:??? (desc created) A: desc A was expecta success 3 A: updatea B was expectb prep fail fail 4 5 A:??? A was!expecta A: expecta B was!expectb Originator cannot learn what was the outcome one tread Lock-free algorithm without loops!
91 Caveats A & B locations must be totally ordered or risk stack-overflow while helping One way to look at it: Restricted DC (RDC)
92 DC Mod: learn outcome A B fun <A,B> dcssmod( a: Ref<A>, expecta: A, updatea: A, b: Ref<B>, expectb: B): Boolean = atomic { if (a.value == expecta && b.value == expectb) { a.value = updatea true } else false }
93 DC Mod: init descriptor expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb) Outcome: UNDECIDED Consensus
94 DC Mod: prepare expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb) Outcome: UNDECIDED
95 DC Mod: read b.value expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb) Outcome: UNDECIDED
96 DC Mod: reach consensus expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb) Outcome: UCCE CA(UNDECIDED, DECIION)
97 DC Mod: complete expecta A B expectb updatea DC Descriptor (a, expecta, updatea, b, expectb) Outcome: UCCE
98 DC Mod: tates 1 2 Init A:??? prep ok A: desc Outcome: UND Outcome: UND A was expecta (desc created) success 3 A: desc Outcome: UCC B was expectb prep fail fail 5 A:??? Outcome: FAIL A was!expecta 6 A: desc Outcome: FAIL 4 7 A: updatea A: expecta one tread till no loops!
99 Compare-And-wap N-words (CAN) The ultimate atomic update
100 CAN spec in pseudo-code For two words, for simplicity A B 1 2 fun <A,B> cas2( a: Ref<A>, expecta: A, updatea: A, b: Ref<B>, expectb: B, updateb: B): Boolean = 3 atomic { if (a.value == expecta && b.value == expectb) { } 4 5 a.value = updatea b.value = updateb true } else false
101 CAN: init descriptor expecta A B expectb updatea updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UNDECIDED
102 CAN: prepare (1) expecta A B expectb updatea CA updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UNDECIDED
103 CAN: prepare (2) expecta A B expectb updatea DC updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UNDECIDED Use DC to update B if Outcome == UNDECIDED
104 CAN: decide expecta A B expectb updatea updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UCCE CA outcome
105 CAN: complete (1) expecta A B expectb updatea CA updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UCCE
106 CAN: complete (2) expecta A B expectb updatea CA updateb DC Descriptor (a, expecta, updatea, b, expectb, updateb) Outcome: UCCE
107 CAN: tates A!= expecta A:??? B:??? O: UND 7 8 A:??? B:??? O: FAIL one tread B!= expectb A: desc B:??? O: UND A: desc B:??? O: FAIL DC 9 A: desc B: desc O: UND A: expecta B:??? O: FAIL 4 A: desc B: desc O: UCC revents from going back in this M 5 6 A: updatea B: desc O: UCC Init A: updatea B: updateb O: UCC descriptor is known to other (helping) threads
108 Using it in practice All the little things that matter
109 It is easy to combine multiple operations with DC/CAN that linearize on a CA with a descriptor parameters that are known in advance
110 Trivial example: Treiber stack expect TO 1 2 CA update 3 New node
111 Let s go deeper Into unpublished territory
112 Doubly linked list: insert last
113 Doubly linked list: insert (0) 2 N CA here H N 1 N T Operation Descriptor A ref:??? expecta: entinel updatea: Node #2 Outcome: UNDECIDED DC here is needed (always!)??? can fill in A before CA & update on retry We know expected value for CA in advance We know updated value for CA in advance
114 Doubly linked list: insert (1) 2 N H N 1 N T Operation Descriptor A ref:??? expecta: entinel updatea: Node #2 Outcome: UNDECIDED DC Descriptor affected node: #1 operation ref
115 Doubly linked list: insert (2) Helpers are a bound to stumble upon the same descriptor Competing inserts will complete (help) us first 2 N H N 1 N T CA can only succeed on last node Operation Descriptor A ref:??? expecta: entinel updatea: Node #2 Outcome: UNDECIDED DC Descriptor affected node: #1 operation ref
116 Doubly linked list: insert (3) 2 N H N 1 N T desc is updated after successful DC Operation Descriptor A ref: Node #1 expecta: entinel updatea: Node #2 Outcome: UNDECIDED DC Descriptor affected node: #1 operation ref
117 Doubly linked list: insert (4) 2 N H N 1 N T Operation Descriptor tays pointed until operation outcome is decided A ref: Node #1 expecta: entinel updatea: Node #2 Outcome: UNDECIDED
118 Doubly linked list: remove first
119 Doubly linked list: remove (0) R1 CA here H N 1 N 2 N T Operation Descriptor A ref:??? expecta:??? updatea: Rem[???] Outcome: UNDECIDED Both not known in advance Deterministic f(expecta)
120 Doubly linked list: remove (1) R1 H N 1 N 2 N T Operation Descriptor A ref:??? expecta:??? updatea: Rem[???] Outcome: UNDECIDED DC Descriptor affected node: #1 old value: #2 operation ref
121 Doubly linked list: remove (2) Cannot change w/o removal of #1 We don t support ushleft!!! R1 It locks what node we are to remove H N 1 N 2 N T Operation Descriptor A ref:??? expecta:??? updatea: Rem[???] Outcome: UNDECIDED DC Descriptor affected node: #1 old value: #2 operation ref
122 Doubly linked list: remove (3) R1 H N 1 N 2 N T desc is updated after successful DC Operation Descriptor A ref: Node #1 expecta: Node #2 updatea: Rem[#2] Outcome: UNDECIDED DC Descriptor affected node: #1 old value: #2 operation ref
123 Doubly linked list: remove (4) R1 H N 1 N 2 N T Operation Descriptor tays pointed until operation outcome is decided A ref: Node #1 expecta: Node #2 updatea: Rem[#2] Outcome: UNDECIDED
124 Closing notes All we care about is CA that linearizes operation ubsequent updates are helper moves Invoke regular help/correct functions erfect algorithm to combine with optional Hardware Transactional Memory (HTM)
125 Let s enjoy what we ve accomplished
126 References Kotlin language Kotlin coroutines support library
127 Thank you Any questions? me to elizarov at gmail relizarov
Fresh Async With Kotlin. Presented at QCon SF, 2017 /Roman JetBrains
Fresh Async With Kotlin Presented at QCon SF, 2017 /Roman Elizarov @ JetBrains Speaker: Roman Elizarov 16+ years experience Previously developed high-perf trading software @ Devexperts Teach concurrent
More informationDeep dive into Coroutines on JVM. Roman Elizarov elizarov at JetBrains
Deep dive into Coroutines on JVM Roman Elizarov elizarov at JetBrains There is no magic Continuation Passing Style (CPS) A toy problem fun postitem(item: Item) { val token = requesttoken() val post = createpost(token,
More informationScale Up with Lock-Free Algorithms. Non-blocking concurrency on JVM Presented at JavaOne 2017 /Roman JetBrains
Scale Up with Lock-Free Algorithms Non-blocking concurrency on JVM Presented at JavaOne 2017 /Roman Elizarov @ JetBrains Speaker: Roman Elizarov 16+ years experience Previously developed high-perf trading
More informationOrder Is A Lie. Are you sure you know how your code runs?
Order Is A Lie Are you sure you know how your code runs? Order in code is not respected by Compilers Processors (out-of-order execution) SMP Cache Management Understanding execution order in a multithreaded
More informationLock-Free and Practical Doubly Linked List-Based Deques using Single-Word Compare-And-Swap
Lock-Free and Practical Doubly Linked List-Based Deques using Single-Word Compare-And-Swap Håkan Sundell Philippas Tsigas OPODIS 2004: The 8th International Conference on Principles of Distributed Systems
More informationIntroduction to Coroutines. Roman Elizarov elizarov at JetBrains
Introduction to Coroutines Roman Elizarov elizarov at JetBrains Asynchronous programming How do we write code that waits for something most of the time? A toy problem Kotlin 1 fun requesttoken(): Token
More informationNon-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei
Non-blocking Array-based Algorithms for Stacks and Queues Niloufar Shafiei Outline Introduction Concurrent stacks and queues Contributions New algorithms New algorithms using bounded counter values Correctness
More informationLOCKLESS ALGORITHMS. Lockless Algorithms. CAS based algorithms stack order linked list
Lockless Algorithms CAS based algorithms stack order linked list CS4021/4521 2017 jones@scss.tcd.ie School of Computer Science and Statistics, Trinity College Dublin 2-Jan-18 1 Obstruction, Lock and Wait
More informationMarwan Burelle. Parallel and Concurrent Programming
marwan.burelle@lse.epita.fr http://wiki-prog.infoprepa.epita.fr Outline 1 2 Solutions Overview 1 Sharing Data First, always try to apply the following mantra: Don t share data! When non-scalar data are
More informationProving linearizability & lock-freedom
Proving linearizability & lock-freedom Viktor Vafeiadis MPI-SWS Michael & Scott non-blocking queue head tail X 1 3 2 null CAS compare & swap CAS (address, expectedvalue, newvalue) { atomic { if ( *address
More informationCS 241 Honors Concurrent Data Structures
CS 241 Honors Concurrent Data Structures Bhuvan Venkatesh University of Illinois Urbana Champaign March 27, 2018 CS 241 Course Staff (UIUC) Lock Free Data Structures March 27, 2018 1 / 43 What to go over
More informationExam Distributed Systems
Exam Distributed Systems 5 February 2010, 9:00am 12:00pm Part 2 Prof. R. Wattenhofer Family Name, First Name:..................................................... ETH Student ID Number:.....................................................
More informationInside core.async Channels. Rich Hickey
Inside core.async s Rich Hickey Warning! Implementation details Subject to change The Problems Single channel implementation For use from both dedicated threads and go threads simultaneously, on same channel
More informationParallel GC. (Chapter 14) Eleanor Ainy December 16 th 2014
GC (Chapter 14) Eleanor Ainy December 16 th 2014 1 Outline of Today s Talk How to use parallelism in each of the 4 components of tracing GC: Marking Copying Sweeping Compaction 2 Introduction Till now
More information<Insert Picture Here> Revisting Condition Variables and Transactions
Revisting Condition Variables and Transactions Victor Luchangco and Virendra Marathe Oracle Labs Copyright 2011 Oracle and/or its affiliates ( Oracle ). All rights are reserved by
More informationSynchronization (contd.)
Outline Synchronization (contd.) http://net.pku.edu.cn/~course/cs501/2008 Hongfei Yan School of EECS, Peking University 3/17/2008 Mutual Exclusion Permission-based Token-based Election Algorithms The Bully
More informationEECS 482 Introduction to Operating Systems
EECS 482 Introduction to Operating Systems Winter 2018 Baris Kasikci Slides by: Harsha V. Madhyastha http://knowyourmeme.com/memes/mind-blown 2 Recap: Processes Hardware interface: app1+app2+app3 CPU +
More informationFast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas Outline Synchronization Methods Priority Queues Concurrent Priority Queues Lock-Free Algorithm: Problems
More informationLinked Lists: The Role of Locking. Erez Petrank Technion
Linked Lists: The Role of Locking Erez Petrank Technion Why Data Structures? Concurrent Data Structures are building blocks Used as libraries Construction principles apply broadly This Lecture Designing
More informationLecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory
Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section
More informationNon-blocking Array-based Algorithms for Stacks and Queues!
Non-blocking Array-based Algorithms for Stacks and Queues! Niloufar Shafiei! Department of Computer Science and Engineering York University ICDCN 09 Outline! Introduction! Stack algorithm! Queue algorithm!
More informationUniversality of Consensus. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Universality of Consensus Companion slides for The by Maurice Herlihy & Nir Shavit Turing Computability 1 0 1 1 0 1 0 A mathematical model of computation Computable = Computable on a T-Machine 2 Shared-Memory
More informationINTERFACING REAL-TIME AUDIO AND FILE I/O
INTERFACING REAL-TIME AUDIO AND FILE I/O Ross Bencina portaudio.com rossbencina.com rossb@audiomulch.com @RossBencina Example source code: https://github.com/rossbencina/realtimefilestreaming Q: How to
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 31 October 2012 Lecture 6 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability
More informationConcurrent Preliminaries
Concurrent Preliminaries Sagi Katorza Tel Aviv University 09/12/2014 1 Outline Hardware infrastructure Hardware primitives Mutual exclusion Work sharing and termination detection Concurrent data structures
More informationStanford University Computer Science Department CS 140 Midterm Exam Dawson Engler Winter 1999
Stanford University Computer Science Department CS 140 Midterm Exam Dawson Engler Winter 1999 Name: Please initial the bottom left corner of each page. This is an open-book exam. You have 50 minutes to
More informationFine-grained synchronization & lock-free programming
Lecture 17: Fine-grained synchronization & lock-free programming Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Tunes Minnie the Moocher Robbie Williams (Swings Both Ways)
More informationThe Universality of Consensus
Chapter 6 The Universality of Consensus 6.1 Introduction In the previous chapter, we considered a simple technique for proving statements of the form there is no wait-free implementation of X by Y. We
More informationLecture 10: Avoiding Locks
Lecture 10: Avoiding Locks CSC 469H1F Fall 2006 Angela Demke Brown (with thanks to Paul McKenney) Locking: A necessary evil? Locks are an easy to understand solution to critical section problem Protect
More informationAdvances in Programming Languages
O T Y H Advances in Programming Languages APL5: Further language concurrency mechanisms David Aspinall (including slides by Ian Stark) School of Informatics The University of Edinburgh Tuesday 5th October
More informationProgramming Safe Agents in Blueprint. Alex Muscar University of Craiova
Programming Safe Agents in Blueprint Alex Muscar University of Craiova Programmers are craftsmen, and, as such, they are only as productive as theirs tools allow them to be Introduction Agent Oriented
More information[2:3] Linked Lists, Stacks, Queues
[2:3] Linked Lists, Stacks, Queues Helpful Knowledge CS308 Abstract data structures vs concrete data types CS250 Memory management (stack) Pointers CS230 Modular Arithmetic !!!!! There s a lot of slides,
More informationUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence
More informationSymmetric Multiprocessors: Synchronization and Sequential Consistency
Constructive Computer Architecture Symmetric Multiprocessors: Synchronization and Sequential Consistency Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November
More informationLecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation
Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end
More informationRelaxed Memory-Consistency Models
Relaxed Memory-Consistency Models [ 9.1] In Lecture 13, we saw a number of relaxed memoryconsistency models. In this lecture, we will cover some of them in more detail. Why isn t sequential consistency
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Multithreading Multiprocessors Description Multiple processing units (multiprocessor) From single microprocessor to large compute clusters Can perform multiple
More informationSynchronization. Chapter 5
Synchronization Chapter 5 Clock Synchronization In a centralized system time is unambiguous. (each computer has its own clock) In a distributed system achieving agreement on time is not trivial. (it is
More informationCMSC 330: Organization of Programming Languages. Threads Classic Concurrency Problems
: Organization of Programming Languages Threads Classic Concurrency Problems The Dining Philosophers Problem Philosophers either eat or think They must have two forks to eat Can only use forks on either
More informationAST: scalable synchronization Supervisors guide 2002
AST: scalable synchronization Supervisors guide 00 tim.harris@cl.cam.ac.uk These are some notes about the topics that I intended the questions to draw on. Do let me know if you find the questions unclear
More informationReuse, don t Recycle: Transforming Lock-free Algorithms that Throw Away Descriptors
Reuse, don t Recycle: Transforming Lock-free Algorithms that Throw Away Descriptors Maya Arbel-Raviv and Trevor Brown Technion, Computer Science Department, Haifa, Israel mayaarl@cs.technion.ac.il, me@tbrown.pro
More informationRsyslog: going up from 40K messages per second to 250K. Rainer Gerhards
Rsyslog: going up from 40K messages per second to 250K Rainer Gerhards What's in it for you? Bad news: will not teach you to make your kernel component five times faster Perspective user-space application
More informationLocking: A necessary evil? Lecture 10: Avoiding Locks. Priority Inversion. Deadlock
Locking: necessary evil? Lecture 10: voiding Locks CSC 469H1F Fall 2006 ngela Demke Brown (with thanks to Paul McKenney) Locks are an easy to understand solution to critical section problem Protect shared
More informationProgress Guarantees When Composing Lock-Free Objects
Progress Guarantees When Composing Lock-Free Objects Nhan Nguyen Dang and Philippas Tsigas Department of Computer Science and Engineering Chalmers University of Technology Gothenburg, Sweden {nhann,tsigas}@chalmers.se
More informationOperating Systems. Lecture 4 - Concurrency and Synchronization. Master of Computer Science PUF - Hồ Chí Minh 2016/2017
Operating Systems Lecture 4 - Concurrency and Synchronization Adrien Krähenbühl Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Mutual exclusion Hardware solutions Semaphores IPC: Message passing
More informationAC: COMPOSABLE ASYNCHRONOUS IO FOR NATIVE LANGUAGES. Tim Harris, Martín Abadi, Rebecca Isaacs & Ross McIlroy
AC: COMPOSABLE ASYNCHRONOUS IO FOR NATIVE LANGUAGES Tim Harris, Martín Abadi, Rebecca Isaacs & Ross McIlroy Synchronous IO in the Windows API Read the contents of h, and compute a result BOOL ProcessFile(HANDLE
More informationCS 134: Operating Systems
CS 134: Operating Systems Locks and Low-Level Synchronization CS 134: Operating Systems Locks and Low-Level Synchronization 1 / 25 2 / 25 Overview Overview Overview Low-Level Synchronization Non-Blocking
More informationThis article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution
More informationDEEPIKA KAMBOJ UNIT 2. What is Stack?
What is Stack? UNIT 2 Stack is an important data structure which stores its elements in an ordered manner. You must have seen a pile of plates where one plate is placed on top of another. Now, when you
More informationSynchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17
Synchronization Part 2 REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17 1 Outline Part 2! Clock Synchronization! Clock Synchronization Algorithms!
More informationLecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM
Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of lazy TM 1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion
More informationDistributed Systems Synchronization. Marcus Völp 2007
Distributed Systems Synchronization Marcus Völp 2007 1 Purpose of this Lecture Synchronization Locking Analysis / Comparison Distributed Operating Systems 2007 Marcus Völp 2 Overview Introduction Hardware
More informationFine-grained synchronization & lock-free data structures
Lecture 19: Fine-grained synchronization & lock-free data structures Parallel Computer Architecture and Programming Redo Exam statistics Example: a sorted linked list struct Node { int value; Node* next;
More informationCPS 310 first midterm exam, 10/6/2014
CPS 310 first midterm exam, 10/6/2014 Your name please: Part 1. More fun with fork and exec* What is the output generated by this program? Please assume that each executed print statement completes, e.g.,
More informationComposable Shared Memory Transactions Lecture 20-2
Composable Shared Memory Transactions Lecture 20-2 April 3, 2008 This was actually the 21st lecture of the class, but I messed up the naming of subsequent notes files so I ll just call this one 20-2. This
More informationA Non-Blocking Concurrent Queue Algorithm
A Non-Blocking Concurrent Queue Algorithm Bruno Didot bruno.didot@epfl.ch June 2012 Abstract This report presents a new non-blocking concurrent FIFO queue backed by an unrolled linked list. Enqueue and
More informationWhat's wrong with Semaphores?
Next: Monitors and Condition Variables What is wrong with semaphores? Monitors What are they? How do we implement monitors? Two types of monitors: Mesa and Hoare Compare semaphore and monitors Lecture
More informationLock-Free and Practical Deques and Doubly Linked Lists using Single-Word Compare-And-Swap 1
Chapter 7 Lock-Free and Practical Deques and Doubly Linked Lists using Single-Word Compare-And-Swap 1 Håkan Sundell, Philippas Tsigas Department of Computing Science Chalmers Univ. of Technol. and Göteborg
More informationUsing Relaxed Consistency Models
Using Relaxed Consistency Models CS&G discuss relaxed consistency models from two standpoints. The system specification, which tells how a consistency model works and what guarantees of ordering it provides.
More informationDCAS-Based Concurrent Deques
DCAS-Based Concurrent Deques Ole Agesen 1 David. Detlefs Christine H. Flood Alexander T. Garthwaite Paul A. Martin Mark Moir Nir N. Shavit 1 Guy. Steele Jr. VMware Tel Aviv University Sun Microsystems
More informationChapter 3 Processes. Process Concept. Process Concept. Process Concept (Cont.) Process Concept (Cont.) Process Concept (Cont.)
Process Concept Chapter 3 Processes Computers can do several activities at a time Executing user programs, reading from disks writing to a printer, etc. In multiprogramming: CPU switches from program to
More informationThe Dining Philosophers Problem CMSC 330: Organization of Programming Languages
The Dining Philosophers Problem CMSC 0: Organization of Programming Languages Threads Classic Concurrency Problems Philosophers either eat or think They must have two forks to eat Can only use forks on
More informationWritten by Marco Tusa Wednesday, 23 February :03 - Last Updated Sunday, 18 August :39
The Binary Log The binary log in MySQL has two main declared purpose, replication and PTR (point in time recovery), as declared in the MySQL manual. In the MySQL binary log are stored all that statements
More informationChapter 3: Process-Concept. Operating System Concepts 8 th Edition,
Chapter 3: Process-Concept, Silberschatz, Galvin and Gagne 2009 Chapter 3: Process-Concept Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2 Silberschatz, Galvin
More informationTRANSACTION MEMORY. Presented by Hussain Sattuwala Ramya Somuri
TRANSACTION MEMORY Presented by Hussain Sattuwala Ramya Somuri AGENDA Issues with Lock Free Synchronization Transaction Memory Hardware Transaction Memory Software Transaction Memory Conclusion 1 ISSUES
More informationWhatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs 3.
Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs 3. LOCK FREE KERNEL 309 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor
More informationLecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations
Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,
More informationCS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019
CS 31: Introduction to Computer Systems 22-23: Threads & Synchronization April 16-18, 2019 Making Programs Run Faster We all like how fast computers are In the old days (1980 s - 2005): Algorithm too slow?
More informationComputer Architecture and Engineering CS152 Quiz #5 April 27th, 2016 Professor George Michelogiannakis Name: <ANSWER KEY>
Computer Architecture and Engineering CS152 Quiz #5 April 27th, 2016 Professor George Michelogiannakis Name: This is a closed book, closed notes exam. 80 Minutes 19 pages Notes: Not all questions
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 21 November 2014
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 21 November 2014 Lecture 7 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability
More informationCMSC 330: Organization of Programming Languages. The Dining Philosophers Problem
CMSC 330: Organization of Programming Languages Threads Classic Concurrency Problems The Dining Philosophers Problem Philosophers either eat or think They must have two forks to eat Can only use forks
More informationConcurrent Data Structures Concurrent Algorithms 2016
Concurrent Data Structures Concurrent Algorithms 2016 Tudor David (based on slides by Vasileios Trigonakis) Tudor David 11.2016 1 Data Structures (DSs) Constructs for efficiently storing and retrieving
More informationDistributed Operating Systems
Distributed Operating Systems Synchronization in Parallel Systems Marcus Völp 2009 1 Topics Synchronization Locking Analysis / Comparison Distributed Operating Systems 2009 Marcus Völp 2 Overview Introduction
More information6.852: Distributed Algorithms Fall, Class 20
6.852: Distributed Algorithms Fall, 2009 Class 20 Today s plan z z z Transactional Memory Reading: Herlihy-Shavit, Chapter 18 Guerraoui, Kapalka, Chapters 1-4 Next: z z z Asynchronous networks vs asynchronous
More informationHazard Pointers. Number of threads unbounded time to check hazard pointers also unbounded! difficult dynamic bookkeeping! thread B - hp1 - hp2
Hazard Pointers Store pointers of memory references about to be accessed by a thread Memory allocation checks all hazard pointers to avoid the ABA problem thread A - hp1 - hp2 thread B - hp1 - hp2 thread
More informationLecture #7: Implementing Mutual Exclusion
Lecture #7: Implementing Mutual Exclusion Review -- 1 min Solution #3 to too much milk works, but it is really unsatisfactory: 1) Really complicated even for this simple example, hard to convince yourself
More informationProblems with Concurrency. February 19, 2014
with Concurrency February 19, 2014 s with concurrency interleavings race conditions dead GUI source of s non-determinism deterministic execution model 2 / 30 General ideas Shared variable Access interleavings
More informationChapter 3: Process Concept
Chapter 3: Process Concept Chapter 3: Process Concept Process Concept Process Scheduling Operations on Processes Inter-Process Communication (IPC) Communication in Client-Server Systems Objectives 3.2
More informationChapter 3: Process Concept
Chapter 3: Process Concept Chapter 3: Process Concept Process Concept Process Scheduling Operations on Processes Inter-Process Communication (IPC) Communication in Client-Server Systems Objectives 3.2
More informationLecture 4. The Java Collections Framework
Lecture 4. The Java s Framework - 1 - Outline Introduction to the Java s Framework Iterators Interfaces, Classes and Classes of the Java s Framework - 2 - Learning Outcomes From this lecture you should
More informationLock-free Serializable Transactions
Lock-free Serializable Transactions Jeff Napper jmn@cs.utexas.edu Lorenzo Alvisi lorenzo@cs.utexas.edu Laboratory for Advanced Systems Research Department of Computer Science The University of Texas at
More informationLinked List Implementation of Queues
Outline queue implementation: linked queue application of queues and stacks: data structure traversal double-ended queues application of queues: simulation of an airline counter random numbers recursion
More informationWait-Free Multi-Word Compare-And-Swap using Greedy Helping and Grabbing
Wait-Free Multi-Word Compare-And-Swap using Greedy Helping and Grabbing H. Sundell 1 1 School of Business and Informatics, University of Borås, Borås, Sweden Abstract We present a new algorithm for implementing
More informationIntroduction to OS Synchronization MOS 2.3
Introduction to OS Synchronization MOS 2.3 Mahmoud El-Gayyar elgayyar@ci.suez.edu.eg Mahmoud El-Gayyar / Introduction to OS 1 Challenge How can we help processes synchronize with each other? E.g., how
More informationWhatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL
Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL 251 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor
More informationHigh Performance Transactions in Deuteronomy
High Performance Transactions in Deuteronomy Justin Levandoski, David Lomet, Sudipta Sengupta, Ryan Stutsman, and Rui Wang Microsoft Research Overview Deuteronomy: componentized DB stack Separates transaction,
More information30. Parallel Programming IV
924 30. Parallel Programming IV Futures, Read-Modify-Write Instructions, Atomic Variables, Idea of lock-free programming [C++ Futures: Williams, Kap. 4.2.1-4.2.3] [C++ Atomic: Williams, Kap. 5.2.1-5.2.4,
More informationParallel Programming in Distributed Systems Or Distributed Systems in Parallel Programming
Parallel Programming in Distributed Systems Or Distributed Systems in Parallel Programming Philippas Tsigas Chalmers University of Technology Computer Science and Engineering Department Philippas Tsigas
More informationOperating Systems (1DT020 & 1TT802)
Uppsala University Department of Information Technology Name: Perso. no: Operating Systems (1DT020 & 1TT802) 2009-05-27 This is a closed book exam. Calculators are not allowed. Answers should be written
More informationMarwan Burelle. Parallel and Concurrent Programming
marwan.burelle@lse.epita.fr http://wiki-prog.infoprepa.epita.fr Outline 1 2 3 OpenMP Tell Me More (Go, OpenCL,... ) Overview 1 Sharing Data First, always try to apply the following mantra: Don t share
More informationOperating Systems EDA092, DIT 400 Exam
Chalmers University of Technology and Gothenburg University Operating Systems EDA092, DIT 400 Exam 2015-04-14 Date, Time, Place: Tuesday 2015/04/14, 14:00 18:00, Väg och vatten -salar Course Responsible:
More informationDistributed Algorithms 6.046J, Spring, Nancy Lynch
Distributed Algorithms 6.046J, Spring, 205 Nancy Lynch What are Distributed Algorithms? Algorithms that run on networked processors, or on multiprocessors that share memory. They solve many kinds of problems:
More informationErlang. Joe Armstrong.
Erlang Joe Armstrong joe.armstrong@ericsson.com 1 Who is Joe? Inventor of Erlang, UBF, Open Floppy Grid Chief designer of OTP Founder of the company Bluetail Currently Software Architect Ericsson Current
More informationAgenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?
Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based
More informationFORTH SEMESTER DIPLOMA EXAMINATION IN ENGINEERING/ TECHNOLIGY- OCTOBER, 2012 DATA STRUCTURE
TED (10)-3071 Reg. No.. (REVISION-2010) Signature. FORTH SEMESTER DIPLOMA EXAMINATION IN ENGINEERING/ TECHNOLIGY- OCTOBER, 2012 DATA STRUCTURE (Common to CT and IF) [Time: 3 hours (Maximum marks: 100)
More informationProving liveness. Alexey Gotsman IMDEA Software Institute
Proving liveness Alexey Gotsman IMDEA Software Institute Safety properties Ensure bad things don t happen: - the program will not commit a memory safety fault - it will not release a lock it does not hold
More informationImplementing caches. Example. Client. N. America. Client System + Caches. Asia. Client. Africa. Client. Client. Client. Client. Client.
N. America Example Implementing caches Doug Woos Asia System + Caches Africa put (k2, g(get(k1)) put (k2, g(get(k1)) What if clients use a sharded key-value store to coordinate their output? Or CPUs use
More informationFinal Exam Solutions May 11, 2012 CS162 Operating Systems
University of California, Berkeley College of Engineering Computer Science Division EECS Spring 2012 Anthony D. Joseph and Ion Stoica Final Exam May 11, 2012 CS162 Operating Systems Your Name: SID AND
More informationCase Study: Using Inspect to Verify and fix bugs in a Work Stealing Deque Simulator
Case Study: Using Inspect to Verify and fix bugs in a Work Stealing Deque Simulator Wei-Fan Chiang 1 Introduction Writing bug-free multi-threaded programs is hard. Bugs in these programs have various types
More informationReview sheet for Final Exam (List of objectives for this course)
Review sheet for Final Exam (List of objectives for this course) Please be sure to see other review sheets for this semester Please be sure to review tests from this semester Week 1 Introduction Chapter
More information