Last Lecture: Spin-Locks

Similar documents
Linked Lists: Locking, Lock- Free, and Beyond. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Programming Paradigms for Concurrency Lecture 6 Synchronization of Concurrent Objects

Linked Lists: Locking, Lock-Free, and Beyond. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Linked Lists: Locking, Lock- Free, and Beyond. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Locking Granularity. CS 475, Spring 2019 Concurrent & Distributed Systems. With material from Herlihy & Shavit, Art of Multiprocessor Programming

Coarse-grained and fine-grained locking Niklas Fors

Concurrency. Part 2, Chapter 10. Roger Wattenhofer. ETH Zurich Distributed Computing

Concurrent Data Structures

Non-Blocking Synchronization

Coarse-Grained Synchronization

Linked Lists: The Role of Locking. Erez Petrank Technion

The Art of Multiprocessor Programming Copyright 2007 Elsevier Inc. All rights reserved. October 3, 2007 DRAFT COPY

Introduction to Concurrency and Multicore Programming. Slides adapted from Art of Multicore Programming by Herlihy and Shavit

Concurrent Skip Lists. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Locking. Part 2, Chapter 7. Roger Wattenhofer. ETH Zurich Distributed Computing

Practice: Small Systems Part 2, Chapter 3

Programming Paradigms for Concurrency Lecture 3 Concurrent Objects

Parallel linked lists

Shared-Memory Computability

Spin Locks and Contention. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Spin Locks and Contention. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Modern High-Performance Locking

Spin Locks and Contention

Lecture 7: Mutual Exclusion 2/16/12. slides adapted from The Art of Multiprocessor Programming, Herlihy and Shavit

Solution: a lock (a/k/a mutex) public: virtual void unlock() =0;

A Concurrent Skip List Implementation with RTM and HLE

Concurrent Queues, Monitors, and the ABA problem. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

6.852: Distributed Algorithms Fall, Class 20

Distributed Computing Group

Concurrent Data Structures Concurrent Algorithms 2016

Locks Chapter 4. Overview. Introduction Spin Locks. Queue locks. Roger Wattenhofer. Test-and-Set & Test-and-Test-and-Set Backoff lock

Universality of Consensus. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Hashing and Natural Parallism. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019

A Simple Optimistic skip-list Algorithm

Building Efficient Concurrent Graph Object through Composition of List-based Set

Lecture 10: Avoiding Locks

Introduction. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory

Integrating Transactionally Boosted Data Structures with STM Frameworks: A Case Study on Set

Advance Operating Systems (CS202) Locks Discussion

Goldibear and the 3 Locks. Programming With Locks Is Tricky. More Lock Madness. And To Make It Worse. Transactional Memory: The Big Idea

Mutual Exclusion. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL

Concurrent Queues and Stacks. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Thread Synchronization: Foundations. Properties. Safety properties. Edsger s perspective. Nothing bad happens

Advanced Topic: Efficient Synchronization

Peterson s Algorithm

LOCKLESS ALGORITHMS. Lockless Algorithms. CAS based algorithms stack order linked list

A simple correctness proof of the MCS contention-free lock. Theodore Johnson. Krishna Harathi. University of Florida. Abstract

Locking. Part 2, Chapter 11. Roger Wattenhofer. ETH Zurich Distributed Computing

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)

Spin Locks and Contention. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

15 418/618 Project Final Report Concurrent Lock free BST

Concurrent Preliminaries

Spin Locks and Contention. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Spin Locks and Contention. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Spin Locks and Contention. Companion slides for Chapter 7 The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Locking: A necessary evil? Lecture 10: Avoiding Locks. Priority Inversion. Deadlock

Multi-Core Computing with Transactional Memory. Johannes Schneider and Prof. Roger Wattenhofer

Non-blocking Array-based Algorithms for Stacks and Queues!

Lecture 10: Multi-Object Synchronization

Models of concurrency & synchronization algorithms

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs 3.

Reagent Based Lock Free Concurrent Link List Spring 2012 Ancsa Hannak and Mitesh Jain April 28, 2012

Concurrent Objects and Linearizability

Lecture: Consistency Models, TM

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6)

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430

Introduction. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit. Art of Multiprocessor Programming

10/17/ Gribble, Lazowska, Levy, Zahorjan 2. 10/17/ Gribble, Lazowska, Levy, Zahorjan 4

Concurrent Objects. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

The New Java Technology Memory Model

CS510 Concurrent Systems. Jonathan Walpole

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Page 1. Challenges" Concurrency" CS162 Operating Systems and Systems Programming Lecture 4. Synchronization, Atomic operations, Locks"

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago

Non-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei

6.852: Distributed Algorithms Fall, Class 15

Agenda. Lecture. Next discussion papers. Bottom-up motivation Shared memory primitives Shared memory synchronization Barriers and locks

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 3 Nov 2017

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

HydraVM: Mohamed M. Saad Mohamed Mohamedin, and Binoy Ravindran. Hot Topics in Parallelism (HotPar '12), Berkeley, CA

Scalable Concurrent Hash Tables via Relativistic Programming

Multiprocessors and Locking

Per-Thread Batch Queues For Multithreaded Programs

Symmetric Multiprocessors: Synchronization and Sequential Consistency

Operating Systems. Synchronization

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES

CSE 153 Design of Operating Systems

Parallel GC. (Chapter 14) Eleanor Ainy December 16 th 2014

CSE332: Data Abstractions Lecture 23: Programming with Locks and Critical Sections. Tyler Robison Summer 2010

CSE 153 Design of Operating Systems Fall 2018

CMSC 330: Organization of Programming Languages

Monitors; Software Transactional Memory

Concurrent Counting using Combining Tree

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 14 November 2014

Transcription:

Linked Lists: Locking, Lock- Free, and Beyond Last Lecture: Spin-Locks. spin lock CS critical section Resets lock upon exit Today: Concurrent Objects Adding threads should not lower throughput Contention effects Mostly fixed by Queue locks Should increase throughput Not possible if inherently sequential Surprising things are parallelizable Coarse-Grained Synchronization Each method locks the object Avoid contention using queue locks Easy to reason about In simple cases Standard Java model Synchronized blocks and methods So, are we done? 3 4 Coarse-Grained Synchronization Sequential bottleneck Threads stand in line Adding more threads Does not improve throughput Struggle to keep it from getting worse So why even use a multiprocessor? Well, some apps inherently parallel This Lecture Introduce four patterns Bag of tricks Methods that work more than once For highly-concurrent objects Goal: Concurrent access More threads, more throughput 5 6 1

First: Fine-Grained Synchronization Instead of using a single lock.. Split object into Independently-synchronized components Methods conflict when they access The same component At the same time Second: Optimistic Synchronization Search without locking If you find it, lock and check OK: we are done Oops: start over Evaluation Usually cheaper than locking Mistakes are expensive 7 8 Third: Lazy Synchronization Postpone hard work Removing components is tricky Logical removal Mark component to be deleted Physical removal Do what needs to be done Fourth: Lock-Free Synchronization Don t use locks at all Use compareandset() & relatives Advantages No Scheduler Assumptions/Support Disadvantages Complex Sometimes high overhead 9 10 Linked List Illustrate these patterns Using a list-based Set Common application Building block for other apps Set Interface Unordered collection of items No duplicates Methods add(x) put x in set remove(x) take x out of set contains(x) tests if x in set 11 12 2

List-Based Sets public interface Set<T> { public boolean add(t x); public boolean remove(t x); public boolean contains(t x); List-Based Sets public interface Set<T> { public boolean add(t x); public boolean remove(t x); public boolean contains(t x); Add item to set 13 14 List-Based Sets public interface Set<T> { public boolean add(t x); public boolean remove(t x); public boolean contains(tt x); List-Based Sets public interface Set<T> { public boolean add(t x); public boolean remove(t x); public boolean contains(t x); Remove item from set Is item in set? 15 16 public class Node { public T item; public int key; public Node next; List Node public class Node { public T item; public int key; public Node next; List Node item of interest 17 18 3

public class Node { public T item; public int key; public Node next; List Node public class Node { public T item; public int key; public Node next; List Node Usually hash code Reference to next node 19 20 The List-Based Set Reasoning about Concurrent Objects - a b c Sorted with Sentinel nodes (min & max possible keys) + Invariant Property that always holds Established because True when object is created Truth preserved by each method Each step of each method 21 22 Specifically Invariants preserved by add() remove() contains() Most steps are trivial Usually one step tricky Often linearization point Interference Invariants make sense only if methods considered are the only modifiers Language encapsulation helps List nodes not visible outside class 23 24 4

Interference Freedom from interference needed even for removed nodes Some algorithms traverse removed nodes Careful with malloc() & free()! Garbage-collection helps here Abstract Data Types Concrete representation a Abstract Type {a, b b 25 26 Abstract Data Types Meaning of rep given by abstraction map S( a b ) = {a,b Rep Invariant Which concrete values meaningful? Sorted? Duplicates? Rep invariant Characterizes legal concrete reps Preserved by methods Relied on by methods 27 28 Blame Game Rep invariant is a contract Suppose add() leaves behind 2 copies of x remove() removes only 1 Which one is incorrect? 29 Blame Game Suppose add() leaves behind 2 copies of x remove() removes only 1 Which one is incorrect? If rep invariant says no duplicates add() is incorrect Otherwise remove() is incorrect 30 5

Rep Invariant (partly) Sentinel nodes tail reachable from head Sorted No duplicates Abstraction Map S(head) = { x there exists a such that a reachable from head and a.item = x 31 32 Sequential List Based Set Add() a c d Sequential List Based Set Add() a c d Remove() a b c Remove() b a b c 33 34 Course Grained Locking Course Grained Locking a b d a b d c 35 36 6

Course Grained Locking Coarse-Grained Locking a b d honk! honk! Simple but hotspot + bottleneck c Easy, same as synchronized methods One lock to rule them all Simple, clearly correct Deserves respect! Works poorly with contention Queue locks help But bottleneck still an issue 37 38 Fine-grained Locking Requires careful thought Do not meddle in the affairs of wizards, for they are subtle and quick to anger Split object into pieces Each piece has own lock Methods that work on disjoint pieces need not exclude each other Hand-over-Hand locking a b c 39 40 Hand-over-Hand locking Hand-over-Hand locking a b c a b c 41 42 7

Hand-over-Hand locking Hand-over-Hand locking a b c a b c 43 44 45 46 47 48 8

a c d Why do we need to always hold 2 locks? 49 50 Concurrent Removes Concurrent Removes 51 52 Concurrent Removes Concurrent Removes 53 54 9

Concurrent Removes Concurrent Removes 55 56 Concurrent Removes Concurrent Removes 57 58 Uh, Oh Uh, Oh Bad news, C not removed a c d a c d 59 60 10

Problem To delete node c Swing node b s next field to d Problem is, Someone deleting b concurrently could direct a pointer a b c to c a b c Insight If a node is locked No one can delete node s successor If a thread locks Node to be deleted And its predecessor Then it works 61 62 Hand-Over-Hand Again Hand-Over-Hand Again 63 64 Hand-Over-Hand Again Hand-Over-Hand Again Found it! 65 66 11

Hand-Over-Hand Again Hand-Over-Hand Again a c d Found it! 67 68 69 70 71 72 12

73 74 75 76 Must acquire Lock of b Cannot acquire lock of b 77 78 13

a b d Wait! Proceed to 79 80 a b d a b d 81 82 a d a d 83 84 14

Remove method Remove method public boolean remove(item item) { int key = item.hashcode(); Node pred, curr; try { finally { curr.unlock(); 85 public boolean remove(item item) { int key = item.hashcode(); Node pred, curr; try { finally { curr.unlock(); Key used to order node 86 Remove method Remove method public boolean remove(item item) { int key = item.hashcode(); Node pred, curr; try { finally { currnode.unlock(); prednode.unlock(); Predecessor and current nodes 87 public boolean remove(item item) { int key = item.hashcode(); Node pred, curr; try { Make sure finally { locks released curr.unlock(); 88 Remove method public boolean remove(item item) { int key = item.hashcode(); Node pred, curr; try { finally { curr.unlock(); Everything else Remove method try { pred = this.head; pred.lock(); curr = pred.next; curr.lock(); finally { 89 90 15

Remove method lock pred == head try { pred = this.head; pred.lock(); curr = pred.next; curr.lock(); finally { Remove method try { Lock current pred = this.head; pred.lock(); curr = pred.next; curr.lock(); finally { 91 92 Remove method Remove: searching try { pred = this.head; pred.lock(); Traversing list curr = pred.next; curr.lock(); finally { 93 if (item == curr.item) { curr.lock(); 94 Remove: searching Remove: searching if (item == curr.item) { Search key range curr.lock(); 95 if (item == curr.item) { At start of each loop: curr and pred locked curr.lock(); 96 16

Remove: searching Remove: searching if (item == curr.item) { curr.lock(); If return item false; found, remove node 97 if (item == curr.item) { curr.lock(); If return node false; found, remove it Remove: searching Unlock predecessor if (item == curr.item) { curr.lock(); 99 Remove: searching Only one node locked! if (item == curr.item) { curr.lock(); 100 Remove: searching Remove: searching demote current if (item == curr.item) { curr.lock(); 101 if Find (item and == lock curr.item) new current { pred = currnode; curr.lock(); 102 17

Remove: searching Lock if (item invariant == curr.item) restored { pred = currnode; curr.lock(); 103 Remove: searching if (item == curr.item) { Otherwise, not present curr.lock(); 104 Why does this work? Why remove() is linearizable To remove node e Must lock e Must lock e s predecessor Therefore, if you lock a node It can t be removed And neither can its successor 105 if (item == curr.item) { curr.lock(); pred reachable from head curr is pred.next So curr.item is in the set 106 Why remove() is linearizable Why remove() is linearizable if (item == curr.item) { curr.lock(); Linearization point if item is present 107 if (item == curr.item) { curr.lock(); Node locked, so no other thread can remove it. 108 18

Why remove() is linearizable Why remove() is linearizable if (item == curr.item) { Item not present curr.lock(); 109 if (item == curr.item) { pred reachable from head curr.lock(); curr is pred.next pred.key < key key < curr.key 110 Why remove() is linearizable Adding Nodes if (item == curr.item) { Linearization point curr.lock(); 111 To add node e Must lock predecessor Must lock successor Neither can be deleted (Is successor lock actually required?) 112 Same Abstraction Map S(head) = { x there exists a such that a reachable from head and a.item = x Rep Invariant Easy to check that tail always reachable from head Nodes sorted, no duplicates 113 114 19

Drawbacks Better than coarse-grained lock Threads can traverse in parallel Still not ideal Long chain of acquire/release Inefficient Optimistic Synchronization Find nodes without locking Lock nodes Check that everything is OK 115 116 Optimistic: Traverse without Locking Optimistic: Lock and Load a b d e a b d e add(c) Aha! add(c) 117 118 What could go wrong? Validate Part 1 (while holding locks) a b d e a b d e add(c) Aha! remove(b ) add(c) Yes, b still reachable from head 119 120 20

What Else Can Go Wrong? What Else Can Go Wrong? b a b d e a b d e add(c) add(c) add(b ) 121 122 What Else Can Go Wrong? b Validate Part 2 (while holding locks) a b d e a b d e add(c) Aha! add(c) Yes, b still points to d 123 124 Optimistic: Linearization Point Same Abstraction Map add(c) a b d e c S(head) = { x there exists a such that a reachable from head and a.item = x 125 126 21

Invariants Careful: we may traverse deleted nodes But we establish properties by Validation After we lock target nodes Correctness If Nodes b and c both locked Node b still accessible Node c still successor to b Then Neither will be deleted OK to delete and return true 127 128 Unsuccessful Remove Validate (1) a b d e a b d e Aha! Yes, b still reachable from head 129 130 Validate (2) OK Computer a b d e a b d e Yes, b still points to d return false 131 132 22

Correctness If Nodes b and d both locked Node b still accessible Node d still successor to b Then Neither will be deleted No thread can add c after b OK to return false Validation private boolean validate(node pred, Node curry) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; 133 134 Validation private boolean validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Predecessor & current nodes Validation private boolean validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Begin at the beginning 135 136 Validation private boolean validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Search range of keys Validation private boolean validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Predecessor reachable 137 138 23

Validation private boolean validate(node pred, Node curry) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Is current node next? Validation private boolean Otherwise move on validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; 139 140 Validation private boolean Predecessor not reachable validate(node pred, Node curr) { Node node = head; while (node.key <= pred.key) { if (node == pred) return pred.next == curr; node = node.next; Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true true) ) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; 141 142 Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; Search key Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true true) ) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; Retry on synchronization conflict 143 144 24

Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; Examine predecessor and current nodes Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; Search by key 145 146 Remove: searching public boolean remove(item item) { int key = item.hashcode(); retry: while (true) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; Stop if we find item Remove: searching public boolean remove(item item) { int key Move = along item.hashcode(); retry: while (true) { Node pred = this.head; Node curr = pred.next; if (item == curr.item) break; 147 148 On Exit from Loop If item is present curr holds item pred just before curr If item is absent curr has first higher key pred just before curr Assuming no synchronization problems 149 Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { finally { curr.unlock(); 150 25

Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { Always unlock finally { curr.unlock(); 151 Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { finally { Lock both nodes curr.unlock(); 152 Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { Check for synchronization conflicts finally { curr.unlock(); 153 Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { target found, finally { remove node curr.unlock(); 154 Remove Method try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.item == item) { else { target not found finally { curr.unlock(); 155 Optimistic List Limited hot-spots Targets of add(), remove(), contains() No contention on traversals Moreover Traversals are wait-free Food for thought 156 26

So Far, So Good Much less lock acquisition/release Performance Concurrency Problems Need to traverse list twice contains() method acquires locks Evaluation Optimistic is effective if cost of scanning twice without locks is less than cost of scanning once with locks Drawback contains() acquires locks 90% of calls in many apps 157 158 Lazy List Like optimistic, except Scan once contains(x) never locks Key insight Removing nodes causes trouble Do it lazily Lazy List remove() Scans list (as before) Locks predecessor & current (as before) Logical delete Marks current node as removed (new!) Physical delete Redirects predecessor s next (as before) 159 160 Lazy Removal Lazy Removal Present in list 161 162 27

Lazy Removal Lazy Removal Logically deleted Physically deleted 163 164 Lazy Removal a b d Physically deleted Lazy List All Methods Scan through locked and marked nodes Removing a node doesn t slow down other method calls Must still lock pred and curr nodes. 165 166 Validation No need to rescan list! Check that pred is not marked Check that curr is not marked Check that pred points to curr Business as Usual a b c 167 168 28

Business as Usual Business as Usual a b c a b c 169 170 Business as Usual Business as Usual a b c a b c a not marked 171 172 Business as Usual Business as Usual a b c a b c a still points to b Logical delete 173 174 29

Business as Usual Business as Usual a b c a b c physical delete 175 176 New Abstraction Map S(head) = { x there exists node a such that a reachable from head and a.item = x and a is unmarked Invariant If not marked then item in the set and reachable from head and if not yet traversed it is reachable from pred 177 178 Validation private boolean validate(node pred, Node curr) { return!pred.marked &&!curr.marked && pred.next == curr); List Validate Method private boolean validate(node pred, Node curr) { return!pred.marked &&!curr.marked && pred.next == curr); Predecessor not Logically removed 179 180 30

List Validate Method private boolean validate(node pred, Node curr) { return!pred.marked &&!curr.marked && pred.next == curr); List Validate Method private boolean validate(node pred, Node curr) { return!pred.marked &&!curr.marked && pred.next == curr); Current not Logically removed Predecessor still Points to current 181 182 try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.key == key) { curr.marked = true; else { finally { curr.unlock(); Remove Remove try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.key == key) { curr.marked = true; else { Validate as before finally { curr.unlock(); 183 184 Remove try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.key == key) { curr.marked = true; else { finally { Key found curr.unlock(); 185 Remove try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.key == key) { curr.marked = true; else { finally { Logical remove curr.unlock(); 186 31

Remove try { pred.lock(); curr.lock(); if (validate(pred,curr) { if (curr.key == key) { curr.marked = true; else { finally { physical remove curr.unlock(); 187 Contains public boolean contains(item item) { int key = item.hashcode(); Node curr = this.head; while (curr.key < key) { return curr.key == key &&!curr.marked; 188 Contains public boolean contains(item item) { int key = item.hashcode(); Node curr = this.head; while (curr.key < key) { return curr.key == key &&!curr.marked; Contains public boolean contains(item item) { int key = item.hashcode(); Node curr = this.head; while (curr.key < key) { return curr.key == key &&!curr.marked; Start at the head 189 Search key range 190 Contains public boolean contains(item item) { int key = item.hashcode(); Node curr = this.head; while (curr.key < key) { return curr.key == key &&!curr.marked; Traverse without locking (nodes may have been removed) 191 Contains public boolean contains(item item) { int key = item.hashcode(); Node curr = this.head; while (curr.key < key) { return curr.key == key &&!curr.marked; Present and undeleted? 192 32

Summary: Wait-free Contains Lazy List a 0 b 0 dc 10 e 0 a 0 b 0 dc 10 e 0 Use Mark bit + Fact that List is ordered 1. Not marked in the set 2. Marked or missing not in the set Lazy add() and remove() + Wait-free contains() 193 194 Evaluation Traffic Jam Good: contains() doesn t lock In fact, its wait-free! Good because typically high % contains() Uncontended calls don t re-traverse Bad Contended add() and remove() calls do re-traverse Traffic jam if one thread delays 195 Any concurrent data structure based on mutual exclusion has a weakness If one thread Enters critical section And eats the big muffin Cache miss, page fault, descheduled Everyone else using that lock is stuck! Need to trust the scheduler. 196 Reminder: Lock-Free Data Structures No matter what Guarantees minimal progress in any execution i.e. Some thread will always complete a method call Even if others halt at malicious times Implies that implementation can t use locks Lock-free Lists Next logical step Eliminate locking entirely contains() wait-free and add() and remove() lock-free Use only compareandset() What could go wrong? 197 198 33

Remove Using CAS Logical Removal = Set Mark Bit a 0 b 0 c 10 e 0 Problem Logical Removal = Set Mark Bit a 0 b 0 c 10 e 0 Use CAS to verify pointer is correct Not enough! Physical Removal CAS pointer Problem: Physical d not added to list Removal Must Prevent CAS manipulation of removed node s pointer d 0 Node added Before Physical Removal CAS 199 200 The Solution: Combine Bit and Pointer Logical Removal = Set Mark Bit a 0 b 0 c 10 e 0 Physical Removal CAS Mark-Bit and Pointer are CASed together (AtomicMarkableReference) d 0 Fail CAS: Node not added after logical Removal Solution Use AtomicMarkableReference Atomically Swing reference and Update flag Remove in two steps Set mark bit in next field Redirect predecessor s pointer 201 202 Marking a Node AtomicMarkableReference class Java.util.concurrent.atomic package Extracting Reference & Mark Public Object get(boolean boolean[] marked); Reference address F mark bit 203 34

Extracting Reference & Mark Public Object get(boolean boolean[] marked); Extracting Reference Only public boolean ismarked(); Returns reference Returns mark at array index 0! Value of mark Changing State Public boolean compareandset( Object expectedref, Object updateref, boolean expectedmark, boolean updatemark); Changing State If this is the current reference Public boolean compareandset( Object expectedref, Object updateref, boolean expectedmark, boolean updatemark); And this is the current mark Changing State Changing State then change to this new reference Public boolean compareandset( Object expectedref, Object updateref, boolean expectedmark, boolean updatemark); and this new mark public boolean attemptmark( Object expectedref, boolean updatemark); 35

Changing State public boolean attemptmark( Object expectedref, boolean updatemark); Changing State public boolean attemptmark( Object expectedref, boolean updatemark); If this is the current reference.. then change to this new mark. failed a bcas c d a CAS bcas c d remov e c remov e b remov e c 213 214 a d remov e b remov e c remov e b remov e c 215 216 36

Traversing the List Q: what do you do when you find a logically deleted node in your path? A: finish the job. CAS the predecessor s next field Proceed (repeat as needed) pred Lock-Free Traversal (only Add and Remove) curr pred curr CAS Uh-oh 217 218 The Window Class The Window Class class Window { public Node pred; public Node curr; Window(Node pred, Node curr) { this.pred = pred; this.curr = curr; class Window { public Node pred; public Node curr; Window(Node pred, Node curr) { this.pred = pred; this.curr = curr; A container for pred and current values Using the Find Method Using the Find Method Window window = find(head, key); Node pred = window.pred; curr = window.curr; Window window = find(head, key); Node pred = window.pred; curr = window.curr; Find returns window 37

Using the Find Method The Find Method Window window = find(head, key); Node pred = window.pred; curr = window.curr; Window window = find(item); Extract pred and curr At some instant, item or pred curr succ Herlihy-Shavit 2007 The Find Method Remove Window window = find(item); At some instant, item pred curr= null succ Herlihy-Shavit 2007 not in list public boolean remove(t item) { Boolean snip; while (true true) ) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Remove Remove public boolean remove(t item) { Boolean snip; while (true true) ) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Keep trying public boolean remove(t item) { Boolean snip; while (true) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Find neighbors 38

Remove public boolean remove(t item) { Boolean snip; while (true) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); She s not there Remove public boolean remove(t item) { Boolean Try snip; to mark node as deleted while (true) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Remove public boolean remove(t item) { Boolean If it snip; doesn t work, while (true) just retry, { Window window = find(head, key); Node if it pred does, = window.pred, job curr = window.curr; essentially if (curr.key done!= key) { else { Node succ = curr.next.getreference(); snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Remove public boolean remove(t item) { Boolean snip; while (true) { a Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key!= key) { Try to advance reference else { Node (if we succ don t = curr.next.getreference(); succeed, someone else did or will). snip = curr.next.attemptmark(succ, true); if (!snip) continue; pred.next.compareandset(curr, succ, false, false); Add public boolean add(t item) { boolean splice; while (true true) ) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key == key) { else { Node node = new Node(item); node.next = new AtomicMarkableRef(curr, false); if (pred.next.compareandset(curr, node, false, false)) {return { true; Add public boolean add(t item) { boolean splice; while (true) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key == key) { else { Node node = new Node(item); node.next = new AtomicMarkableRef(curr, false); if (pred.next.compareandset(curr, node, false, false)) { Item already there. 234 39

Add public boolean add(t item) { boolean splice; while (true) { Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key == key) { else { Node node = new Node(item); node.next = new AtomicMarkableRef(curr, false); if (pred.next.compareandset(curr, node, false, false)) { create new node Add public boolean add(t item) { boolean splice; Install new node, while (true) { else retry loop Window window = find(head, key); Node pred = window.pred, curr = window.curr; if (curr.key == key) { else { Node node = new Node(item); node.next = new AtomicMarkableRef(curr, false); if (pred.next.compareandset(curr, node, false, false)) {return { true; Wait-free Contains Wait-free Contains public boolean contains(tt item) { boolean marked; int key = item.hashcode(); Node curr = this.head; while (curr.key < key) Node succ = curr.next.get(marked); return (curr.key == key &&!marked[0]) public boolean contains(t item) { boolean marked; Only diff is that we int key = item.hashcode(); get and check Node curr = this.head; marked while (curr.key < key) Node succ = curr.next.get(marked); return (curr.key == key &&!marked[0]) 237 238 Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false { false; boolean snip; retry: while (true true) ) { pred = head; curr = pred.next.getreference(); while (true true) ) { succ = curr.next.get(marked); while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = succ; Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true true) ) { pred = head; curr = pred.next.getreference(); while (true) { If list changes succ = curr.next.get(marked); while while (marked[0]) { traversed, start over Lock-Free if (curr.key >= key) return new Window(pred, curr); because we start over only curr = succ; if someone else makes progress 40

Lock-free Find public Window find(node head, int key) { Node pred = null, Start curr looking = null, succ from = head null; boolean[] marked = {false; boolean snip; retry: while (true) { pred = head; curr = pred.next.getreference(); while (true) { succ = curr.next.get(marked); while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = succ; 241 Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true) { Move down the list pred = head; curr = pred.next.getreference(); while (true true) ) { succ = curr.next.get(marked); while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = succ; 242 Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true) { pred = head; curr = pred.next.getreference(); while (true) { succ = curr.next.get(marked); while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = Get succ; ref to successor and current deleted bit 243 Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true) { pred = head; curr = pred.next.getreference(); while (true) { succ = curr.next.get(marked); while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = succ; pathcode details soon 244 Try to remove deleted nodes in Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true) { pred = head; curr = pred.next.getreference(); If while curr (true) key that { is greater or succ = curr.next.get(marked); equal, return pred and curr while (marked[0]) { if (curr.key >= key) return new Window(pred, curr); curr = succ; Lock-free Find public Window find(node head, int key) { Node pred = null, curr = null, succ = null; boolean[] marked = {false; boolean snip; retry: while (true) { pred = head; curr = pred.next.getreference(); while (true) { succ = curr.next.get(marked); Otherwise while (marked[0]) advance { window and loop again if (curr.key >= key) return new Window(pred, curr); curr = succ; 41

Lock-free Find retry: while (true true) ) { while (marked[0]) { snip = pred.next.compareandset(curr, succ, false, false); if (!snip) continue retry; curr = succ; succ = curr.next.get(marked); Lock-free Find Try to snip out node retry: while (true) { while (marked[0]) { snip = pred.next.compareandset(curr, succ, false, false); if (!snip) continue retry; curr = succ; succ = curr.next.get(marked); Lock-free Find if predecessor s next field changed must retry whole traversal retry: while (true) { while (marked[0]) { snip = pred.next.compareandset(curr, succ, false, false); if (!snip) continue retry; curr = succ; succ = curr.next.get(marked); Lock-free Find Otherwise move on to check if next node deleted retry: while (true) { while (marked[0]) { snip = pred.next.compareandset(curr, succ, false, false); if (!snip) continue retry; curr = succ; succ = curr.next.get(marked); Performance High Contains Ratio On 16 node shared memory machine Benchmark throughput of Java List-based Set algs. Vary % of Contains() method Calls. Lock-free Lazy list Course Grained Fine Lock-coupling 251 252 42

Low Contains Ratio As Contains Ratio Increases Lock-free Lock-free Lazy list Lazy list Course Grained Fine Lock-coupling Course Grained Fine Lock-coupling % Contains() 253 254 Summary Coarse-grained locking Fine-grained locking Optimistic synchronization Lock-free synchronization To Lock or Not to Lock Locking vs. Non-blocking: Extremist views on both sides The answer: nobler to compromise, combine locking and non-blocking Example: Lazy list combines blocking add() and remove() and a wait-free contains() Remember: Blocking/non-blocking is a property of a method 255 256 This work is licensed under a Creative Commons Attribution- ShareAlike 2.5 License. You are free: to Share to copy, distribute and transmit the work to Remix to adapt the work Under the following conditions: Attribution. You must attribute the work to The Art of Multiprocessor Programming (but not in any way that suggests that the authors endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to http://creativecommons.org/licenses/by-sa/3.0/. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights. 43