Speculative Locks. Dept. of Computer Science


1 Speculative Locks
José F. Martínez and Josep Torrellas
Dept. of Computer Science, University of Illinois at Urbana-Champaign

2 Motivation
- Lock granularity is a trade-off:
  - Fine grain: greater concurrency
  - Coarse grain: greater programmability, shorter time to market
- Our goal:
  - Pursue fine-grain concurrency
  - Provide coarse-grain programmability
- Our proposal, Speculative Locks:
  - Execute critical section assuming no dependencies
  - Monitor for dependence violations
  - Squash and roll back on the fly if violation detected

3 Example of Critical Section
[Figure: one thread performs ST B, another performs LD B]
- Conflicting accesses on variable B

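4 Enforcing Serialisation
[Figure: two threads each execute ACQ ... REL, one around ST B and the other around LD B; the two acquires are numbered 1 and 2]
- Acquire sequence determines order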

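5 Race-Free Execution*
[Figure: one thread executes ACQ, ST B, REL; the other executes ACQ, LD B, REL]
- Ordering chain: ST B -po-> REL -so-> ACQ -po-> LD B (po: program order, so: synchronisation order)
*Adve et al., ISCA 1990, 1993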
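The ordering chain on this slide can be checked mechanically. The sketch below is only an illustration, not the authors' formalism: it treats po (program order) and so (synchronisation order) as edges, takes their transitive closure, and tests whether the conflicting ST B and LD B are ordered. The thread names T1 and T2 and the helper happens_before are assumptions made for the example.

```python
from itertools import product

def happens_before(events, edges):
    """Return the transitive closure of the ordering edges as a set of pairs."""
    order = set(edges)
    # Warshall-style closure: the intermediate event k is the outermost loop.
    for k, i, j in product(events, repeat=3):
        if (i, k) in order and (k, j) in order:
            order.add((i, j))
    return order

# Events are (thread, operation) pairs; T1 and T2 are hypothetical thread names.
ST_B, REL = ("T1", "ST B"), ("T1", "REL")
ACQ, LD_B = ("T2", "ACQ"), ("T2", "LD B")
events = [ST_B, REL, ACQ, LD_B]

po = [(ST_B, REL), (ACQ, LD_B)]  # program order within each thread
so = [(REL, ACQ)]                # synchronisation order: release before acquire

hb = happens_before(events, po + so)
conflict_ordered = (ST_B, LD_B) in hb or (LD_B, ST_B) in hb
print("conflicting accesses ordered:", conflict_ordered)
```

Dropping the so edge leaves ST B and LD B unordered, which is the racy case the lock is meant to exclude.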

6 Release Consistency Suboptimal
[Figure: two threads, one executing ACQ, ST B, REL and the other ACQ, LD B, REL]

7 Actual RC Implementations
- Can speculatively execute ops past acquire point*
- Use processor's memory buffers as temporary storage
- Retire memory ops when acquire successful
- Redo accesses if externally invalidated
- Memory buffers small
- Good enough to hide acquire latency
- Not suitable to exploit thread-level parallelism
*Gharachorloo et al., ICPP 1991; Ranganathan et al., SPAA 1997

8 Lock-Free Accesses
- Specialised code without synchronisation primitives
- Typically work on private copy
- Check for conflicts at commit time
- Discard changes and repeat if conflicts detected
- Livelock avoidance typically tackled using exponential back-off
- Limitations:
  - Private copy generally expensive
  - Use of specialised lock-free code
  - Forward progress not guaranteed automatically
  - No memory conflicts tolerated
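The private-copy, commit-time-check, retry-with-back-off pattern listed on this slide might look roughly as follows. This is a minimal sketch under stated assumptions: VersionedCell and lock_free_update are hypothetical names, and the internal guard lock only stands in for an atomic compare-and-swap that real lock-free code would use.

```python
import random
import threading
import time

class VersionedCell:
    """Shared value plus a version counter; a commit succeeds only if the version is unchanged."""
    def __init__(self, value):
        self._value, self._version = value, 0
        self._guard = threading.Lock()  # assumption: emulates an atomic compare-and-swap

    def snapshot(self):
        with self._guard:
            return self._value, self._version

    def try_commit(self, new_value, seen_version):
        with self._guard:
            if self._version != seen_version:
                return False  # conflict detected at commit time
            self._value, self._version = new_value, self._version + 1
            return True

def lock_free_update(cell, fn, max_backoff=0.01):
    """Apply fn to a private copy; on conflict, discard the result and retry after a back-off."""
    delay, attempts = 1e-5, 0
    while True:
        attempts += 1
        value, version = cell.snapshot()      # work on a private copy
        if cell.try_commit(fn(value), version):
            return attempts
        time.sleep(random.uniform(0, delay))  # randomised exponential back-off
        delay = min(2 * delay, max_backoff)

cell = VersionedCell(0)

def worker():
    for _ in range(200):
        lock_free_update(cell, lambda v: v + 1)

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(cell.snapshot())
```

Every conflict throws away completed work, which is the "no memory conflicts tolerated" limitation the slide points out.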

9 Speculative Locks
- Write conventional lock-based code
- Guarantee forward progress through lock owner
- Execute transactions speculatively (except lock owner)
- Use caches as temporary buffer (except lock owner)
- Monitor for violations on the fly
- Use existing coherence protocol
- All in-order (safe to speculative) conflicts tolerated
- Restart at once if violation detected
- Partial parallelism achievable

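10 Example of Speculative Lock
[Figure: threads A, B, C, D and E; critical section bounded by ACQ and REL]

11 Example of Speculative Lock
[Figure: B, C, D and E before ACQ; A inside the critical section]
- A enters CS

12 Example of Speculative Lock
[Figure: C and D before ACQ; A, B and E inside the critical section]
- B enters CS
- E enters CS

13 Example of Speculative Lock
[Figure: D before ACQ; A, B and C inside the critical section; E at REL]
- C enters CS
- E exits CS

14 Example of Speculative Lock
[Figure: D before ACQ; B and C inside the critical section; A and E at REL]
- A exits CS
- C chosen new owner
- E becomes safe (all CS accesses complete)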
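The five-slide walkthrough can be replayed as a small state machine. The sketch below is an illustration only: SpeculativeLock and its fields are hypothetical names, and because the slides do not say how the new owner is selected, the choice is passed in explicitly as next_owner.

```python
class SpeculativeLock:
    """Toy model of the owner / speculative / safe roles in the slide example."""
    def __init__(self):
        self.owner = None
        self.speculative = set()  # threads inside the CS without holding the lock
        self.released = set()     # speculative threads that already passed REL
        self.safe = []            # threads whose CS work is committed

    def enter(self, t):
        if self.owner is None:
            self.owner = t            # first thread in becomes the owner
        else:
            self.speculative.add(t)   # later threads proceed speculatively

    def exit(self, t, next_owner=None):
        if t != self.owner:
            self.released.add(t)      # past REL, but still speculative
            return
        self.safe.append(t)           # the owner is always safe
        for r in sorted(self.released):
            self.speculative.discard(r)
            self.safe.append(r)       # lock seen free: early-released threads commit
        self.released.clear()
        self.owner = None
        if next_owner is not None:    # assumption: selection policy supplied by caller
            self.speculative.remove(next_owner)
            self.owner = next_owner

lock = SpeculativeLock()
for t in ("A", "B", "E", "C"):  # A becomes owner; B, E and C enter speculatively
    lock.enter(t)
lock.exit("E")                  # E exits the CS but stays speculative
lock.exit("A", next_owner="C")  # A exits; C chosen new owner; E becomes safe
print(lock.owner, sorted(lock.speculative), lock.safe)
```

Thread D never enters the critical section in the slides, so it does not appear in the replay.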

15 Speculative Concurrency Unit

16 Handling of Critical Section
- SCU: off-loads processor from T&T&S

      L: ld  $1,loc
         bnz L
         t&s $1,loc
         bnz L

- Processor: checkpoints state and continues
- Lines accessed will be marked as Speculative
- Dirty lines will be written back prior to speculative access
- When lock is acquired:
  - Set Owner bit
  - Clear all Speculative bits
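For readers unfamiliar with T&T&S (test-and-test-and-set), the four-instruction loop on this slide corresponds to the software spin lock sketched here. This is the conventional lock that the SCU takes over, not the SCU itself. TestAndSetWord is a hypothetical class whose guard lock merely emulates an atomic test-and-set instruction, and the time.sleep(0) yield is a Python-specific courtesy that is not part of the original loop.

```python
import threading
import time

class TestAndSetWord:
    """Memory word with a test-and-set operation (atomicity emulated by a guard lock)."""
    def __init__(self):
        self.value = 0
        self._guard = threading.Lock()

    def load(self):
        return self.value

    def test_and_set(self):
        with self._guard:
            old, self.value = self.value, 1
            return old

    def clear(self):
        self.value = 0

def acquire(loc):
    while True:
        while loc.load() != 0:       # L: ld $1,loc ; bnz L  (spin on plain reads)
            time.sleep(0)            # yield so the lock holder can run
        if loc.test_and_set() == 0:  # t&s $1,loc ; bnz L    (retry if we lost the race)
            return

def release(loc):
    loc.clear()

loc, counter = TestAndSetWord(), [0]

def worker():
    for _ in range(500):
        acquire(loc)
        counter[0] += 1  # critical section
        release(loc)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter[0])
```

Spinning on an ordinary load first keeps waiting threads from issuing the more expensive test-and-set until the lock looks free.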

17 Out-of-Order Accesses
- External action on speculative line:
  - Invalidate all Speculative+Dirty lines
  - Squash thread and roll back to checkpointed state
- Early Release:
  - Complete all accesses before release point, then
  - Set Release bit (thread still speculative)
- Lock detected available after Release:
  - Clear all Speculative bits
  - Lock variable left alone!
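The bit manipulations named on the two SCU slides (Owner, Release, per-line Speculative, squash on external access) can be collected in a small software model. This is a rough sketch and not the hardware design: the class and method names are hypothetical, and clearing the Speculative bit of clean lines on a squash is an assumption the slides do not spell out.

```python
from dataclasses import dataclass, field

@dataclass
class Line:
    speculative: bool = False
    dirty: bool = False

@dataclass
class SCU:
    owner: bool = False
    release: bool = False
    lines: dict = field(default_factory=dict)
    checkpoint: object = None
    squashes: int = 0

    def begin(self, processor_state):
        """Processor checkpoints its state and continues into the critical section."""
        self.checkpoint = processor_state
        self.owner = self.release = False

    def access(self, addr, write=False):
        line = self.lines.setdefault(addr, Line())
        if not self.owner:
            line.speculative = True  # lines accessed are marked Speculative
        line.dirty = line.dirty or write

    def _clear_speculative_bits(self):
        for line in self.lines.values():
            line.speculative = False

    def lock_acquired(self):
        self.owner = True            # set Owner bit
        self._clear_speculative_bits()

    def early_release(self):
        self.release = True          # thread is still speculative

    def lock_seen_free(self):
        """Lock detected available: commit if already released; lock variable untouched."""
        if self.release:
            self._clear_speculative_bits()
        return self.release

    def external_access(self, addr):
        """Return the checkpoint to roll back to if this access causes a squash, else None."""
        line = self.lines.get(addr)
        if line is None or not line.speculative:
            return None
        # Invalidate all Speculative+Dirty lines.
        self.lines = {a: l for a, l in self.lines.items()
                      if not (l.speculative and l.dirty)}
        self._clear_speculative_bits()  # assumption: clean lines simply lose the bit
        self.release = False
        self.squashes += 1
        return self.checkpoint

scu = SCU()
scu.begin("checkpoint-0")
scu.access(0x10, write=True)
scu.access(0x20)
rollback = scu.external_access(0x20)  # another node touches a speculative line
print(rollback, sorted(scu.lines))
```

In the usage lines, the external access to address 0x20 squashes the thread: the speculatively written line 0x10 is dropped and the caller is handed the checkpoint to restore.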

18 Summary of Benefits
- Concurrent access to critical section
- Does not require major code changes
- Utilises existing caches, coherence protocol
- Fast acquire and on-the-fly squash operations
- Forward progress guaranteed (limited worst case)
- All in-order conflicts tolerated

19 Limitations
- Cache overflow of speculative state
  - Stall and wait to become owner
  - Need not restart: owner guarantees forward progress
  - Can use victim buffer to reduce overflow due to conflict*
- Non-cacheable/non-repeatable operations
  - Spin on SCU Owner bit itself before performing these

20 Example: (Very) Simplified TPC-C
- Preliminary feasibility experiment
- Results shown for PRAM w/ infinite caches
- Synthetic OLTP on 64 processors
- Five branches, five tellers/branch
- Variable number of accounts/teller (5-1000)
- One million randomly-generated synthetic transactions:
  - Pre-processing
  - Balance read
  - Internal processing
  - Balance update
  - Post-processing
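To get a feel for how lock granularity interacts with this workload, the sketch below draws one batch of 64 random transactions (one per processor) and counts how many distinct locks the batch would touch at branch, teller and account granularity. It is not the authors' experiment: the uniform random choice, the (branch, teller, account) keying and the function names are assumptions made for illustration.

```python
import random

BRANCHES, TELLERS_PER_BRANCH, PROCESSORS = 5, 5, 64

def random_transaction(rng, accounts_per_teller):
    """Pick the (branch, teller, account) whose balance a transaction reads and updates."""
    return (rng.randrange(BRANCHES),
            rng.randrange(TELLERS_PER_BRANCH),
            rng.randrange(accounts_per_teller))

def distinct_lock_targets(batch):
    """Number of distinct locks the batch would touch at each granularity."""
    return {
        "branch": len({b for b, _, _ in batch}),
        "teller": len({(b, t) for b, t, _ in batch}),
        "account": len(set(batch)),
    }

rng = random.Random(0)  # fixed seed so runs are repeatable
results = {}
for accounts in (5, 1000):
    batch = [random_transaction(rng, accounts) for _ in range(PROCESSORS)]
    results[accounts] = distinct_lock_targets(batch)
    print(accounts, results[accounts])
```

The fewer distinct locks a batch touches, the more of its transactions a conventional lock would serialise, even when they end up updating different accounts.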

21 Example: (Very) Simplified TPC-C

22 Results against Finest Grain

23 Conclusions and Future Work
- Speculative Locks (hardware-controlled concurrent execution of critical sections) may:
  - Improve performance of existing codes
  - Offer better programmability/concurrency trade-off
- SCU: simple, yet effective implementation
- In progress: analysis of effect on full applications through detailed execution-driven simulation

24 Speculative Locks
José F. Martínez and Josep Torrellas

25 Transactional Memory*
- Lock-free technique (multivariable LL/SC)
- Uses special memory and control operations
- Executes all transactions speculatively (no owner)
- Requires provisions for deadlock prevention
- Monitors for conflicts (no in- or out-of-order concept)
- Repeats if any conflict detected at commit time
- No gain if conflicts arise
*Herlihy et al., ISCA 1993

26 Oklahoma Update*
- Lock-free technique (multivariable LL/SC)
- Load variables and modify speculatively in reservation registers
- Request exclusive rights to all such variables at commit time
- Success if permissions granted w/o intervening invalidation
- Limitations:
  - Speculative state limited to dedicated reservation registers
  - Commit phase potentially slow, traffic-intensive
  - Forward progress guarantee based on:
    - Absence of false sharing
    - Buffering capacity at node to defer incoming invalidations
*Stone et al., IEEE Parallel & Distributed Technology, 1993

27 Support for Multiple Locks

28 Support for Flags and Barriers
- SCU-based synchronisation possible:
  - Flag: speculative lock with Release bit set
  - Barrier: lock+flag
- Consequence: SCU useful in multiple workload types
  - Barriers, flags: parallel applications, e.g. numerical
  - Locks: multi-threaded environments, e.g. OLTP, OS kernel
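The "barrier = lock + flag" decomposition on this slide is the usual software construction: a lock protects an arrival counter, and the last arrival raises a flag that everyone else waits on. The sketch below shows that construction in plain software, not the SCU mechanism. LockFlagBarrier is a hypothetical single-use class, with threading.Event standing in for the flag.

```python
import threading

class LockFlagBarrier:
    """Single-use barrier built from a lock-protected counter plus a flag."""
    def __init__(self, parties):
        self._parties = parties
        self._arrived = 0
        self._lock = threading.Lock()
        self._flag = threading.Event()

    def wait(self):
        with self._lock:          # lock part: count arrivals atomically
            self._arrived += 1
            last = self._arrived == self._parties
        if last:
            self._flag.set()      # flag part: last arrival releases everyone
        else:
            self._flag.wait()     # the others wait for the flag

PARTIES = 4
barrier, log = LockFlagBarrier(PARTIES), []

def worker():
    log.append("before")
    barrier.wait()
    log.append("after")

threads = [threading.Thread(target=worker) for _ in range(PARTIES)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(log)
```

No thread can log "after" until every thread has logged "before", since the flag is raised only by the final arrival.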
