bool Account::withdraw(int val) { atomic { if (balance > val) { balance = balance - val; return true; } else return false; } }


1 Transactional Memory
Acknowledgement: slides adopted in part from (1) a talk on Intel TSX from the Intel Developer Forum and (2) the companion slides for the book "The Art of Multiprocessor Programming" by Maurice Herlihy and Nir Shavit.

2 Transactional Memory
A high-level programming construct for optimistic concurrency control. The programmer denotes the critical section, and the underlying TM system ensures that the critical section executes atomically.

    bool Account::withdraw(int val) {
      atomic {
        if (balance > val) {
          balance = balance - val;
          return true;
        } else
          return false;
      }
    }

3 Transactional Memory
The execution of an atomic block constitutes a transaction. The idea of a transaction came from the database world [1].
The underlying system ensures atomicity and isolation of concurrently executing transactions.
- Atomicity: program state changes performed by a transaction are indivisible from the perspective of other transactions.
- Isolation: a transaction appears to execute serially, one at a time (i.e., other concurrently executing transactions do not affect its result).
Goal: make lock-free synchronization efficient and easy.
The first hardware TM design was proposed by Herlihy and Moss (1993) [2].

4 Advantages of Transactional Memory
Avoids common pitfalls of locks:
- priority inversion
- convoying
- deadlock
Provides better composability.
Potentially allows higher concurrency.

5 TM Provides Better Composability
Example: implement transfers between accounts.

    void transfer(Account *from, Account *to, int amount) {
      if (from->withdraw(amount)) {
        to->deposit(amount);
      }
    }

6 TM Provides Better Composability
Example: implement transfers between accounts.

    Lock *global = new Lock();

    void transfer(Account *from, Account *to, int amount) {
      global->lock();
      if (from->withdraw(amount)) {
        to->deposit(amount);
      }
      global->unlock();
    }

A global lock limits concurrency.

7 TM Provides Better Composability
Example: implement transfers between accounts.

    void transfer(Account *from, Account *to, int amount) {
      from->lock();
      to->lock();
      if (from->withdraw(amount)) {
        to->deposit(amount);
      }
      to->unlock();
      from->unlock();
    }

Fine-grained locking breaks abstraction: the caller must know about the accounts' locks. The code may also deadlock, for example when two threads transfer in opposite directions between the same pair of accounts.
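The deadlock risk in the fine-grained version can be seen, and avoided without TM, in standard C++. The following is a minimal sketch, not the slides' code: it assumes each account owns a `std::mutex` and uses `std::lock`, which acquires several locks with a deadlock-avoidance algorithm. The `Account` layout and the `*_locked` method names are hypothetical. The abstraction problem remains, since `transfer` still has to reach into each account's lock.

```cpp
#include <cassert>
#include <mutex>

class Account {
    int balance_;
    std::mutex m_;
public:
    explicit Account(int initial) : balance_(initial) {}
    std::mutex& mutex() { return m_; }  // exposed on purpose: callers must lock it

    // The *_locked methods assume the caller already holds mutex().
    bool withdraw_locked(int val) {
        if (balance_ > val) { balance_ -= val; return true; }
        return false;
    }
    void deposit_locked(int val) { balance_ += val; }

    int balance() {
        std::lock_guard<std::mutex> guard(m_);
        return balance_;
    }
};

// Transfers between two distinct accounts without risking lock-order deadlock.
inline void transfer(Account& from, Account& to, int amount) {
    assert(&from != &to);  // locking the same mutex twice is not allowed
    std::unique_lock<std::mutex> lock_from(from.mutex(), std::defer_lock);
    std::unique_lock<std::mutex> lock_to(to.mutex(), std::defer_lock);
    std::lock(lock_from, lock_to);  // acquires both, whatever order other threads use
    if (from.withdraw_locked(amount)) {
        to.deposit_locked(amount);
    }
}   // both locks are released here by the unique_lock destructors
```

Calling `transfer(a, b, 30)` and `transfer(b, a, 10)` from different threads is safe here, whereas the hand-ordered `from->lock(); to->lock();` version could block forever.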

8 TM Provides Better Composability
Example: implement transfers between accounts.

    void transfer(Account *from, Account *to, int amount) {
      atomic {
        if (from->withdraw(amount)) {
          to->deposit(amount);
        }
      }
    }

The transactional interface allows the two method calls to execute atomically without these issues.
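Standard C++ has no `atomic { ... }` block, so the composition above cannot be compiled as written. The sketch below only mimics its semantics: a hypothetical helper `atomically` runs a lambda under one process-wide recursive lock, so an "atomic" method can be called from inside another "atomic" region. This is a stand-in for illustration, not a TM implementation; a real TM system would run such regions optimistically and concurrently instead of serializing them.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for `atomic { ... }`: a single recursive global lock.
inline std::recursive_mutex& tm_global_lock() {
    static std::recursive_mutex m;
    return m;
}

// Runs `body` as if it were an atomic block; nesting is allowed.
template <class F>
auto atomically(F&& body) -> decltype(body()) {
    std::lock_guard<std::recursive_mutex> guard(tm_global_lock());
    return body();
}

class Account {
    int balance_;
public:
    explicit Account(int initial) : balance_(initial) {}

    bool withdraw(int val) {
        return atomically([&]() -> bool {
            if (balance_ > val) { balance_ -= val; return true; }
            return false;
        });
    }
    void deposit(int val) {
        atomically([&]() { balance_ += val; });
    }
    int balance() const {
        return atomically([&]() -> int { return balance_; });
    }
};

// Composes two atomic methods into one larger atomic operation.
inline void transfer(Account* from, Account* to, int amount) {
    atomically([&]() {
        if (from->withdraw(amount)) {
            to->deposit(amount);
        }
    });
}
```

The point of the example is the shape of `transfer`: it needs no knowledge of how `withdraw` and `deposit` synchronize internally, which is the composability property claimed on the slide.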

9 TM Potentially Allows Higher Concurrency
Example: concurrent search and insert in a chained hash table.
Example: concurrent search and insert in a balanced binary search tree.

10 How Does TM Work?
Each transaction maintains a read set and a write set.
Two transactions conflict when they access the same memory location in a conflicting way (i.e., at least one of the accesses is a write).
The underlying TM system performs conflict detection during execution. A transaction commits if it executes to completion without a conflict; otherwise, it aborts.
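The conflict rule above can be written down directly. The following is a small sketch under the assumption that a transaction's read and write sets are plain sets of addresses; the names `AccessSets` and `conflicts` are illustrative and do not come from any TM library.

```cpp
#include <cassert>
#include <set>

// Addresses a transaction has read and written so far.
struct AccessSets {
    std::set<const void*> reads;
    std::set<const void*> writes;
};

// True if the two sets share at least one address.
inline bool intersects(const std::set<const void*>& a,
                       const std::set<const void*>& b) {
    for (const void* p : a) {
        if (b.count(p) != 0) return true;
    }
    return false;
}

// Two transactions conflict when they touch the same location and at least
// one of the two accesses is a write. Read/read sharing is not a conflict.
inline bool conflicts(const AccessSets& t1, const AccessSets& t2) {
    return intersects(t1.writes, t2.writes) ||
           intersects(t1.writes, t2.reads) ||
           intersects(t1.reads, t2.writes);
}
```

Two transactions that both only read `x` do not conflict; as soon as one of them adds `x` to its write set, `conflicts` returns true and one of the two has to abort.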

11 Implementation Alternatives
Hardware transactional memory:
- Intel's Haswell TSX
- IBM's Blue Gene/Q, System z, and POWER8
Software transactional memory
Hybrid transactional memory

12 Different Implementation Strategies
Deferred update vs. direct update:
- Direct update: keep old values off to the side.
- Deferred update: keep new values off to the side.
Early vs. lazy conflict detection:
- Early conflict detection: detect a conflict as soon as it occurs.
- Lazy conflict detection: detect a conflict when the transaction is about to complete.
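The two update strategies differ in where the "other" copy of the data lives. The sketch below contrasts them for single-threaded use only, with conflict detection left out entirely; the class names `DirectTx` and `DeferredTx` are made up for the illustration.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Direct update: write in place and keep the OLD values in an undo log.
class DirectTx {
    std::vector<std::pair<long*, long>> undo_log_;
public:
    void write(long* addr, long v) {
        undo_log_.emplace_back(addr, *addr);  // remember the old value
        *addr = v;                            // memory changes immediately
    }
    long read(const long* addr) const { return *addr; }
    void commit() { undo_log_.clear(); }      // commit is cheap
    void abort() {                            // abort restores, newest entry first
        for (auto it = undo_log_.rbegin(); it != undo_log_.rend(); ++it) {
            *it->first = it->second;
        }
        undo_log_.clear();
    }
};

// Deferred update: keep the NEW values in a write buffer until commit.
class DeferredTx {
    std::unordered_map<long*, long> buffer_;
public:
    void write(long* addr, long v) { buffer_[addr] = v; }  // memory untouched
    long read(long* addr) const {
        auto it = buffer_.find(addr);          // a transaction sees its own writes
        return it != buffer_.end() ? it->second : *addr;
    }
    void commit() {                            // commit installs the buffer
        for (auto& kv : buffer_) *kv.first = kv.second;
        buffer_.clear();
    }
    void abort() { buffer_.clear(); }          // abort is cheap
};
```

The trade-off shows up in the method bodies: direct update makes commit trivial and abort expensive, while deferred update does the opposite and must check the buffer on every read.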

13 Lecture Today
- TSX features in Intel Haswell
- TL2 (an implementation of STM)
- Limitations / open issues of TM

14 Architectural Support for HTM
The L1 data cache serves as a write buffer: writes within a transaction are not visible to others until commit.
Each cache line has additional bits*:
- RS: the cache line was brought in for a transactional load.
- WS: the cache line was brought in for a transactional store.
Conflict detection piggybacks on the underlying cache coherence protocol (so granularity is at the cache-line level).
Abort and commit are local to the L1 cache:
- Upon a conflict, abort the transaction: invalidate dirty transactional cache lines and revert the register state.
- When the transaction completes, commit if no conflict occurred: clear the RS and WS bits (the cache coherence protocol takes care of the rest).
Deferred update and eager (early) conflict detection.
Best-effort HTM.
* This is just a guess; Intel did not reveal the architectural details of how TM is supported, but they should be similar to the architectural support proposed by Herlihy and Moss [2].
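Because conflicts are detected per cache line, two transactions can conflict even when they touch different variables, as long as those variables share a line. The helper below illustrates that consequence. It assumes a 64-byte line, which is common on x86 but is an assumption here, and the names `kLineSize`, `may_conflict`, and `Counters` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineSize = 64;  // assumed cache-line size in bytes

// Index of the cache line that contains address p.
inline std::uintptr_t line_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kLineSize;
}

// At cache-line granularity, accesses to the same line are indistinguishable,
// so the hardware would treat them as touching the same location.
inline bool may_conflict(const void* a, const void* b) {
    return line_of(a) == line_of(b);
}

// a and b share a line; c is pushed onto the next line by its alignment.
struct Counters {
    alignas(kLineSize) long a;
    long b;
    alignas(kLineSize) long c;
};
```

With this layout, a transaction writing `a` and another writing `b` would abort each other despite being logically independent, while `a` and `c` would not interfere. Padding hot fields onto separate lines is the usual remedy.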

15 Intel Transactional Synchronization Extensions (TSX)
Hardware Lock Elision (HLE): XACQUIRE/XRELEASE
- Software uses legacy-compatible hints to identify the critical section; the hints are ignored on hardware without TSX.
- Hardware support to execute transactionally without acquiring the lock.
- An abort causes re-execution without elision.
- Hardware manages all architectural state.
Restricted Transactional Memory (RTM): XBEGIN/XEND
- Software uses new instructions to specify critical sections.
- The critical section executes transactionally.
- An abort transfers control to the target specified by the XBEGIN operand.
- Abort information is returned in a general-purpose register (EAX).
- Additional instructions: XTEST and XABORT.
Slides from the Intel TSX talk credited in the acknowledgement.

16 Hardware Lock Elision
HLE is backward compatible and allows locking code to execute transactionally on hardware that supports TM.

17 Intel TSX Interface: HLE
Library, acquire_lock(mutex), legacy code:

          mov eax, 1
    Try:  lock xchg mutex, eax
          cmp eax, 0
          jz Success
    Spin: pause
          cmp mutex, 1
          jz Spin
          jmp Try

Library, acquire_lock(mutex), with HLE:

          mov eax, 1
    Try:  xacquire lock xchg mutex, eax   ; enter HLE execution
          cmp eax, 0
          jz Success
    Spin: pause                           ; if the lock is not free, execution aborts either
          cmp mutex, 1                    ; early (if pause is used) or when the lock becomes free
          jz Spin
          jmp Try

Application:

    acquire_lock(mutex)
    ; do critical section
    ; function calls,
    ; memory operations, ...
    release_lock(mutex)

Library, release_lock(mutex):

    mov mutex, 0                          ; legacy code
    xrelease mov mutex, 0                 ; with HLE: commit HLE execution
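From C or C++, the same hints can be requested through compiler built-ins instead of hand-written assembly. The sketch below assumes GCC on x86, which documents the `__ATOMIC_HLE_ACQUIRE` and `__ATOMIC_HLE_RELEASE` flags for its `__atomic` built-ins; on other compilers or targets the macros are absent and the code falls back to an ordinary spinlock. The class name `ElidedSpinLock` and the `ELIDE_*` macros are made up for the example.

```cpp
#include <cassert>

// Use the HLE hint flags when the compiler provides them (GCC on x86);
// otherwise build a plain spinlock with the same interface.
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
#define ELIDE_ACQUIRE (__ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE)
#define ELIDE_RELEASE (__ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE)
#else
#define ELIDE_ACQUIRE __ATOMIC_ACQUIRE
#define ELIDE_RELEASE __ATOMIC_RELEASE
#endif

class ElidedSpinLock {
    int mutex_ = 0;  // 0 = free, 1 = held
public:
    void lock() {
        // Corresponds to "xacquire lock xchg mutex, eax" in the listing.
        while (__atomic_exchange_n(&mutex_, 1, ELIDE_ACQUIRE) != 0) {
            // Corresponds to the Spin loop: wait until the lock looks free.
            while (__atomic_load_n(&mutex_, __ATOMIC_RELAXED) != 0) {
            }
        }
    }
    void unlock() {
        // Corresponds to "xrelease mov mutex, 0".
        __atomic_store_n(&mutex_, 0, ELIDE_RELEASE);
    }
};
```

Because the prefixes are only hints, the lock behaves like a normal spinlock on processors that do not implement HLE, which is the backward compatibility the slides describe.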

18 Hardware Lock Elision: Identify and Elide
Acquire: xacquire lock cmpxchg mutex, ebx (mutex: 0)
- Hardware executes the XACQUIRE hint.
- Hardware elides the acquire write to mutex.
- Hardware starts transactional execution.
Inside the critical section: mov ecx, mutex
- Reading mutex in the critical section sees the last value written (1).
- Other threads reading mutex see the free value (0).
Release: xrelease mov mutex, 0
- Hardware executes the XRELEASE hint.
- Hardware elides the release write to mutex.
- Hardware commits transactional execution.
[Figure: the value of mutex as seen by the eliding thread (self) versus other threads at each step.]

19 Hardware Lock Elision
Hardware support to elide multiple locks:
- A hardware elision buffer manages actively elided locks.
- XACQUIRE/XRELEASE allocate/free elision buffer entries.
- Elision is skipped, without aborting, if no free entry is available.
Hardware treats XACQUIRE/XRELEASE as hints:
- Functionally correct even if the hints are used improperly.
- Hardware checks whether locks meet the requirements for elision.
- May expose latent bugs and incorrect timing assumptions.

20 Intel TSX Interface: RTM
Library, acquire_lock(mutex):

    Retry: xbegin Abort      ; enter RTM execution; Abort is the fallback path
           cmp mutex, 0      ; check to see if mutex is free
           jz Success
           xabort $0xff      ; abort transactional execution if mutex is busy
    Abort: ; fallback path in software: check EAX and apply the retry policy,
           ; i.e., retry RTM, wait to retry, or explicitly acquire the mutex

Application:

    acquire_lock(mutex)
    ; do critical section
    ; function calls,
    ; memory operations, ...
    release_lock(mutex)

Library, release_lock(mutex):

    cmp mutex, 0
    jnz release_lock         ; mutex not free: this was not an RTM execution
    xend                     ; commit RTM execution

21 Intel TSX Interface: RTM
acquire_lock(mutex), RTM version:

    Retry: xbegin Abort
           cmp mutex, 0
           jz Success
           xabort $0xff
    Abort: ; check EAX and apply the retry policy:
           ; actually acquire the lock or wait to retry

acquire_lock(mutex), legacy version:

    Try:   mov eax, 1
           lock xchg mutex, eax
           cmp eax, 0
           jz Success
    Spin:  pause
           cmp mutex, 1
           jz Spin
           jmp Try

Application (the same in both versions):

    acquire_lock(mutex)
    ; do critical section
    ; function calls,
    ; memory operations, ...
    release_lock(mutex)

release_lock(mutex), RTM version:

    cmp mutex, 0
    jnz release_lock
    xend

release_lock(mutex), legacy version:

    mov mutex, 0
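The RTM pattern in the listings (start a transaction, check that the fallback lock is free, abort explicitly if it is not, otherwise run and commit) can also be written with the compiler intrinsics `_xbegin`, `_xend`, and `_xabort` from `<immintrin.h>`. The following is a sketch under several assumptions: the intrinsics are only used when the compiler defines `__RTM__` (for example with `-mrtm` on GCC or Clang), the fallback lock is a simple spinlock named `fallback_busy`, the retry budget `kMaxAttempts` is arbitrary, and the critical section is just an increment. Without RTM support the function always takes the fallback lock.

```cpp
#include <atomic>
#include <cassert>
#if defined(__RTM__)
#include <immintrin.h>  // _xbegin, _xend, _xabort, _XBEGIN_STARTED
#endif

// Hypothetical fallback spinlock; plays the role of `mutex` in the listing.
std::atomic<bool> fallback_busy{false};

inline void lock_fallback() {
    bool expected = false;
    while (!fallback_busy.compare_exchange_weak(expected, true,
                                                std::memory_order_acquire)) {
        expected = false;  // compare_exchange overwrote it with the seen value
    }
}

inline void unlock_fallback() {
    fallback_busy.store(false, std::memory_order_release);
}

// Runs `++counter` as the critical section.
inline void critical_increment(long& counter) {
#if defined(__RTM__)
    const int kMaxAttempts = 3;  // hypothetical retry budget
    for (int attempt = 0; attempt < kMaxAttempts; ++attempt) {
        unsigned status = _xbegin();           // "xbegin Abort"
        if (status == _XBEGIN_STARTED) {
            // Reading the lock puts it in the read set, so a thread that
            // later takes the fallback lock aborts this transaction.
            if (fallback_busy.load(std::memory_order_relaxed)) {
                _xabort(0xff);                 // "xabort $0xff": lock is busy
            }
            ++counter;
            _xend();                           // commit
            return;
        }
        // Aborted: `status` holds the abort information (EAX); retry.
    }
#endif
    // Fallback path: actually acquire the lock.
    lock_fallback();
    ++counter;
    unlock_fallback();
}
```

A production retry policy would inspect the abort status before retrying and would wait for the fallback lock to become free rather than retrying immediately; both are left out to keep the sketch short.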

22 Haswell: Best-Effort HTM
Transactional execution can abort due to:
- data conflicts
- cache overflow
- cache eviction of a transactional write (ran out of associativity)
- system activity: context switch, interrupt, page fault
- use of special instructions (e.g., pause, cpuid)
- transaction nesting depth exceeding the maximum limit
A bug was discovered in TSX (2014), so Intel issued a software "microcode update" to turn TSX off rather than fixing it.
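Since any of these causes can abort a transaction, the software fallback path has to decide whether retrying is worthwhile. The sketch below decodes the RTM abort status returned in EAX and applies one possible policy. The bit positions follow the status layout documented for RTM (bit 0 explicit XABORT, bit 1 retry may succeed, bit 2 conflict, bit 3 capacity, bit 4 debug, bit 5 nested, bits 31:24 the XABORT code); the constant names, the `Action` enum, and the policy itself are illustrative choices, not Intel's recommendation.

```cpp
#include <cassert>
#include <cstdint>

// Abort status bits as reported in EAX after an RTM abort.
constexpr std::uint32_t kAbortExplicit = 1u << 0;  // XABORT was executed
constexpr std::uint32_t kAbortRetry    = 1u << 1;  // a retry may succeed
constexpr std::uint32_t kAbortConflict = 1u << 2;  // data conflict
constexpr std::uint32_t kAbortCapacity = 1u << 3;  // internal buffer overflow
constexpr std::uint32_t kAbortDebug    = 1u << 4;  // debug breakpoint hit
constexpr std::uint32_t kAbortNested   = 1u << 5;  // abort in a nested transaction

// The 8-bit immediate passed to XABORT, valid when kAbortExplicit is set.
inline std::uint32_t abort_code(std::uint32_t status) {
    return (status >> 24) & 0xffu;
}

enum class Action { Retry, Fallback };

// One possible retry policy for the software fallback path.
inline Action next_action(std::uint32_t status, int attempts, int max_attempts) {
    if (attempts >= max_attempts) return Action::Fallback;   // budget used up
    if (status & kAbortCapacity) return Action::Fallback;    // will overflow again
    if ((status & kAbortExplicit) && abort_code(status) == 0xffu) {
        return Action::Fallback;   // the lock was busy (code 0xff, as in xabort $0xff)
    }
    if (status & kAbortRetry) return Action::Retry;          // e.g., a transient conflict
    return Action::Fallback;       // interrupts, page faults, etc. set no retry hint
}
```

For example, a status with the conflict and retry bits set leads to another transactional attempt, while a capacity abort goes straight to the lock because the same footprint would overflow the cache again.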

23 A Simple Lock-Based STM
STMs come in different forms: lock-based and lock-free.
First, a simple lock-based STM:
- Deferred update: changes are installed at commit.
- Lazy conflict detection: conflicts are detected at commit.

24 STM: Transactional Locking [3]
[Figure: application memory is mapped to an array of version numbers (V#) and locks.]
Companion slides from The Art of Multiprocessor Programming.

25 Commit Time Locking (Write Buffer)
[Figure: memory locations X and Y, each with a versioned lock (version number V# and lock bit).]
Read: in the write set? Unlocked? Then add the value to the read set.
Write: add the value to the write set.
Validate: acquire locks, check that version numbers are unchanged, install changes, increment version numbers, unlock.

26 Commit Time Locking (Write Buffer)
[Figure: memory locations X and Y, each with a versioned lock (version number V# and lock bit).]
1. To read: load the lock and the location.
2. Is the location in the write set? (Bloom filter)
3. Check that it is unlocked, then add it to the read set.
4. To write: add the value to the write set.
5. Acquire locks.
6. Validate that read/write v#s are unchanged.
7. Release each lock with v#+1.
Locks are held for a very short duration.
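The seven steps above can be sketched as a small word-based STM. This is a simplified illustration, not the implementation from [3]: each `Cell` pairs a value with its own versioned lock instead of hashing addresses into a shared lock array, the write-set lookup uses a `std::map` instead of a Bloom filter, and a failed commit simply returns `false` for the caller to retry. The names `Cell` and `Tx` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// A memory word plus its versioned lock: bit 0 = locked, upper bits = version.
struct Cell {
    std::atomic<std::uint64_t> vlock{0};
    std::atomic<long> value{0};
};

class Tx {
    std::map<Cell*, std::uint64_t> read_set_;  // cell -> lock word seen
    std::map<Cell*, long> write_set_;          // cell -> buffered new value
    bool doomed_ = false;

    // Clears the lock bit on every held cell; bump = 2 means version + 1.
    static void release(const std::vector<Cell*>& held, std::uint64_t bump) {
        for (Cell* c : held) {
            c->vlock.store((c->vlock.load() & ~std::uint64_t(1)) + bump);
        }
    }
public:
    long read(Cell& c) {
        auto w = write_set_.find(&c);                  // step 2: own write?
        if (w != write_set_.end()) return w->second;
        std::uint64_t v1 = c.vlock.load();             // step 1: lock + location
        long val = c.value.load();
        std::uint64_t v2 = c.vlock.load();
        if ((v1 & 1) || v1 != v2) doomed_ = true;      // step 3: must be unlocked
        auto ins = read_set_.insert(std::make_pair(&c, v1));
        if (!ins.second && ins.first->second != v1) doomed_ = true;
        return val;
    }
    void write(Cell& c, long v) { write_set_[&c] = v; }  // step 4

    bool commit() {
        if (doomed_) return false;
        std::vector<Cell*> held;
        for (auto& kv : write_set_) {                  // step 5: acquire locks
            Cell* c = kv.first;
            std::uint64_t v = c->vlock.load();
            if ((v & 1) || !c->vlock.compare_exchange_strong(v, v | 1)) {
                release(held, 0);
                return false;
            }
            held.push_back(c);
        }
        for (auto& kv : read_set_) {                   // step 6: validate
            std::uint64_t now = kv.first->vlock.load();
            bool mine = write_set_.count(kv.first) != 0;
            if ((now & ~std::uint64_t(1)) != kv.second || ((now & 1) && !mine)) {
                release(held, 0);
                return false;
            }
        }
        for (auto& kv : write_set_) kv.first->value.store(kv.second);
        release(held, 2);                              // step 7: unlock, v# + 1
        return true;
    }
};
```

Memory is modified only between acquiring the locks and releasing them in `commit`, which is why the locks are held for such a short time.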

27 Problem: Internal Inconsistency
A zombie is an active transaction destined to abort.
If zombies see inconsistent states, bad things can happen.

28 Internal Consistency
Invariant: x = 2y
- Transaction A reads x = 4.
- Transaction B writes 8 to x and 4 to y, which dooms A to abort.
- Transaction A (now a zombie) reads y = 4 and computes 1/(x-y): divide by zero. FAIL!

29 Solution: The Global Clock
The TL2 algorithm [4]: have one shared global clock that is
- incremented by (a small subset of) writing transactions,
- read by all transactions,
- used to validate that the state worked on is always consistent.

30 Example
[Figure: memory locations X and Y, their versioned locks (V#), and the shared version clock.]
1. RV ← shared version clock
2. On read/write: check unlocked and v# <= RV, then add to the read/write set
3. Acquire locks
4. WV = F&I(VClock) (fetch-and-increment of the version clock)
5. Validate each v# <= RV
6. Release locks with v# ← WV
Reads + Inc + Writes = serializable
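The six steps can be sketched in the same style as a commit-time-locking STM, with the global clock added. This is a simplified illustration of the idea, not the TL2 implementation from [4]: every `Cell` carries its own versioned lock, the version check is applied on reads only (the slide also applies it on writes), and `read` reports failure through its return value so that a doomed transaction stops before it can act on inconsistent data. The names `global_clock`, `Cell`, and `Tl2Tx` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

std::atomic<std::uint64_t> global_clock{0};  // the shared version clock

// Lock word layout: (version << 1) | lock bit.
struct Cell {
    std::atomic<std::uint64_t> vlock{0};
    std::atomic<long> value;
    explicit Cell(long v) : value(v) {}
};

class Tl2Tx {
    std::uint64_t rv_;                 // step 1: read version
    std::vector<Cell*> read_set_;
    std::map<Cell*, long> write_set_;

    static void unlock(const std::vector<Cell*>& held) {
        for (Cell* c : held) c->vlock.store(c->vlock.load() & ~std::uint64_t(1));
    }
public:
    Tl2Tx() : rv_(global_clock.load()) {}

    // Returns false if the transaction must abort (it would be a zombie).
    bool read(Cell& c, long& out) {
        auto w = write_set_.find(&c);
        if (w != write_set_.end()) { out = w->second; return true; }
        std::uint64_t v1 = c.vlock.load();
        long val = c.value.load();
        std::uint64_t v2 = c.vlock.load();
        // Step 2: unlocked, stable, and not newer than our snapshot.
        if (v1 != v2 || (v1 & 1) || (v1 >> 1) > rv_) return false;
        read_set_.push_back(&c);
        out = val;
        return true;
    }
    void write(Cell& c, long v) { write_set_[&c] = v; }

    bool commit() {
        if (write_set_.empty()) return true;  // read-only: already validated
        std::vector<Cell*> held;
        for (auto& kv : write_set_) {         // step 3: acquire locks
            std::uint64_t v = kv.first->vlock.load();
            if ((v & 1) || !kv.first->vlock.compare_exchange_strong(v, v | 1)) {
                unlock(held);
                return false;
            }
            held.push_back(kv.first);
        }
        std::uint64_t wv = global_clock.fetch_add(1) + 1;  // step 4: WV
        for (Cell* c : read_set_) {           // step 5: validate v# <= RV
            std::uint64_t v = c->vlock.load();
            bool mine = write_set_.count(c) != 0;
            if ((v >> 1) > rv_ || ((v & 1) && !mine)) {
                unlock(held);
                return false;
            }
        }
        for (auto& kv : write_set_) kv.first->value.store(kv.second);
        for (Cell* c : held) c->vlock.store(wv << 1);  // step 6: v# = WV, unlocked
        return true;
    }
};
```

Replaying the zombie scenario with x = 4 and y = 2: transaction A reads x, transaction B commits x = 8 and y = 4 with a write version above A's read version, and A's later read of y is refused instead of returning 4, so A never computes 1/(x-y).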

31 Open Issues of TM
Lack of a performance model:
- difficult to guarantee forward progress
- contention management is still an active research area
Irrevocable actions within a transaction
Semantic issues:
- interactions with exceptions
- nested transactions
- strong atomicity versus weak atomicity
- does not support certain synchronization patterns

32 References
[1] Lomet, D.B. Process structuring, synchronization, and recovery using atomic actions. In Proceedings of the ACM Conference on Language Design for Reliable Software (Raleigh, NC, 1977). ACM, NY.
[2] Herlihy, M. and Moss, J.E.B. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th International Symposium on Computer Architecture. ACM, 1993.
[3] Dice, D. and Shavit, N. What really makes transactions fast? In TRANSACT06 ACM Workshop (2006).
[4] Dice, D., Shalev, O., and Shavit, N. Transactional locking II. In Proceedings of the 20th International Conference on Distributed Computing (DISC'06). Shlomi Dolev (Ed.). Springer-Verlag, Berlin, Heidelberg, 2006.

Transactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Transactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Transactional Memory Companion slides for The by Maurice Herlihy & Nir Shavit Our Vision for the Future In this course, we covered. Best practices New and clever ideas And common-sense observations. 2

More information

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28 Transactional Memory or How to do multiple things at once Benjamin Engel Transactional Memory 1 / 28 Transactional Memory: Architectural Support for Lock-Free Data Structures M. Herlihy, J. Eliot, and

More information

A Concurrent Skip List Implementation with RTM and HLE

A Concurrent Skip List Implementation with RTM and HLE A Concurrent Skip List Implementation with RTM and HLE Fan Gao May 14, 2014 1 Background Semester Performed: Spring, 2014 Instructor: Maurice Herlihy The main idea of my project is to implement a skip

More information

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 28 November 2014

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 28 November 2014 NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 28 November 2014 Lecture 8 Problems with locks Atomic blocks and composition Hardware transactional memory Software transactional memory

More information

Going Under the Hood with Intel s Next Generation Microarchitecture Codename Haswell

Going Under the Hood with Intel s Next Generation Microarchitecture Codename Haswell Going Under the Hood with Intel s Next Generation Microarchitecture Codename Haswell Ravi Rajwar Intel Corporation QCon San Francisco Nov 9, 2012 1 What is Haswell? 45nm 32nm 22nm Nehalem Westmere Sandy

More information

Hardware Transactional Memory on Haswell

Hardware Transactional Memory on Haswell Hardware Transactional Memory on Haswell Viktor Leis Technische Universität München 1 / 15 Introduction transactional memory is a very elegant programming model transaction { transaction { a = a 10; c

More information

Intel Transactional Synchronization Extensions (Intel TSX) Linux update. Andi Kleen Intel OTC. Linux Plumbers Sep 2013

Intel Transactional Synchronization Extensions (Intel TSX) Linux update. Andi Kleen Intel OTC. Linux Plumbers Sep 2013 Intel Transactional Synchronization Extensions (Intel TSX) Linux update Andi Kleen Intel OTC Linux Plumbers Sep 2013 Elision Elision : the act or an instance of omitting something : omission On blocking

More information

Performance Evaluation of Intel Transactional Synchronization Extensions for High-Performance Computing

Performance Evaluation of Intel Transactional Synchronization Extensions for High-Performance Computing Performance Evaluation of Intel Transactional Synchronization Extensions for High-Performance Computing Richard Yoo, Christopher Hughes: Intel Labs Konrad Lai, Ravi Rajwar: Intel Architecture Group Agenda

More information

CS5460/6460: Operating Systems. Lecture 14: Scalability techniques. Anton Burtsev February, 2014

CS5460/6460: Operating Systems. Lecture 14: Scalability techniques. Anton Burtsev February, 2014 CS5460/6460: Operating Systems Lecture 14: Scalability techniques Anton Burtsev February, 2014 Recap: read and write barriers void foo(void) { a = 1; smp_wmb(); b = 1; } void bar(void) { while (b == 0)

More information

Invyswell: A HyTM for Haswell RTM. Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy

Invyswell: A HyTM for Haswell RTM. Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy Invyswell: A HyTM for Haswell RTM Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy Multicore Performance Scaling u Problem: Locking u Solution: HTM? u IBM BG/Q, zec12,

More information

6 Transactional Memory. Robert Mullins

6 Transactional Memory. Robert Mullins 6 Transactional Memory ( MPhil Chip Multiprocessors (ACS Robert Mullins Overview Limitations of lock-based programming Transactional memory Programming with TM ( STM ) Software TM ( HTM ) Hardware TM 2

More information

Lecture 12 Transactional Memory

Lecture 12 Transactional Memory CSCI-UA.0480-010 Special Topics: Multicore Programming Lecture 12 Transactional Memory Christopher Mitchell, Ph.D. cmitchell@cs.nyu.edu http://z80.me Database Background Databases have successfully exploited

More information

Database design and implementation CMPSCI 645. Lectures 18: Transactions and Concurrency

Database design and implementation CMPSCI 645. Lectures 18: Transactions and Concurrency Database design and implementation CMPSCI 645 Lectures 18: Transactions and Concurrency 1 DBMS architecture Query Parser Query Rewriter Query Op=mizer Query Executor Lock Manager Concurrency Control Access

More information

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 20: Transactional Memory Parallel Computer Architecture and Programming Slide credit Many of the slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford University) Raising

More information

Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * Technion Yahoo Research

Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * Technion Yahoo Research Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * * Technion Yahoo Research 1 Mul'-Threading is Everywhere 2 Agenda Mo@va@on Concurrent Data Structure Libraries (CDSLs)

More information

Atomic Transac1ons. Atomic Transactions. Q1: What if network fails before deposit? Q2: What if sequence is interrupted by another sequence?

Atomic Transac1ons. Atomic Transactions. Q1: What if network fails before deposit? Q2: What if sequence is interrupted by another sequence? CPSC-4/6: Operang Systems Atomic Transactions The Transaction Model / Primitives Serializability Implementation Serialization Graphs 2-Phase Locking Optimistic Concurrency Control Transactional Memory

More information

6.852: Distributed Algorithms Fall, Class 20

6.852: Distributed Algorithms Fall, Class 20 6.852: Distributed Algorithms Fall, 2009 Class 20 Today s plan z z z Transactional Memory Reading: Herlihy-Shavit, Chapter 18 Guerraoui, Kapalka, Chapters 1-4 Next: z z z Asynchronous networks vs asynchronous

More information

Sequen&al Consistency and Linearizability

Sequen&al Consistency and Linearizability Sequen&al Consistency and Linearizability (Or, Reasoning About Concurrent Objects) Acknowledgement: Slides par&ally adopted from the companion slides for the book "The Art of Mul&processor Programming"

More information

Mutex Locking versus Hardware Transactional Memory: An Experimental Evaluation

Mutex Locking versus Hardware Transactional Memory: An Experimental Evaluation Mutex Locking versus Hardware Transactional Memory: An Experimental Evaluation Thesis Defense Master of Science Sean Moore Advisor: Binoy Ravindran Systems Software Research Group Virginia Tech Multiprocessing

More information

Transactional Memory. Concurrency unlocked Programming. Bingsheng Wang TM Operating Systems

Transactional Memory. Concurrency unlocked Programming. Bingsheng Wang TM Operating Systems Concurrency unlocked Programming Bingsheng Wang TM Operating Systems 1 Outline Background Motivation Database Transaction Transactional Memory History Transactional Memory Example Mechanisms Software Transactional

More information

VMM Emulation of Intel Hardware Transactional Memory

VMM Emulation of Intel Hardware Transactional Memory VMM Emulation of Intel Hardware Transactional Memory Maciej Swiech, Kyle Hale, Peter Dinda Northwestern University V3VEE Project www.v3vee.org Hobbes Project 1 What will we talk about? We added the capability

More information

Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory

Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory Rei Odaira Takuya Nakaike IBM Research Tokyo Thread-Level Speculation (TLS) [Franklin et al., 92] or Speculative Multithreading (SpMT)

More information

Transactional Memory: Architectural Support for Lock-Free Data Structures Maurice Herlihy and J. Eliot B. Moss ISCA 93

Transactional Memory: Architectural Support for Lock-Free Data Structures Maurice Herlihy and J. Eliot B. Moss ISCA 93 Transactional Memory: Architectural Support for Lock-Free Data Structures Maurice Herlihy and J. Eliot B. Moss ISCA 93 What are lock-free data structures A shared data structure is lock-free if its operations

More information

Work Report: Lessons learned on RTM

Work Report: Lessons learned on RTM Work Report: Lessons learned on RTM Sylvain Genevès IPADS September 5, 2013 Sylvain Genevès Transactionnal Memory in commodity hardware 1 / 25 Topic Context Intel launches Restricted Transactional Memory

More information

Transac.on Management. Transac.ons. CISC437/637, Lecture #16 Ben Cartere?e

Transac.on Management. Transac.ons. CISC437/637, Lecture #16 Ben Cartere?e Transac.on Management CISC437/637, Lecture #16 Ben Cartere?e Copyright Ben Cartere?e 1 Transac.ons A transac'on is a unit of program execu.on that accesses and possibly updates rela.ons The DBMS s view

More information

1 RCU. 2 Improving spinlock performance. 3 Kernel interface for sleeping locks. 4 Deadlock. 5 Transactions. 6 Scalable interface design

1 RCU. 2 Improving spinlock performance. 3 Kernel interface for sleeping locks. 4 Deadlock. 5 Transactions. 6 Scalable interface design Overview of Monday s and today s lectures Outline Locks create serial code - Serial code gets no speedup from multiprocessors Test-and-set spinlock has additional disadvantages - Lots of traffic over memory

More information

Transactional Memory. Lecture 19: Parallel Computer Architecture and Programming CMU /15-618, Spring 2015

Transactional Memory. Lecture 19: Parallel Computer Architecture and Programming CMU /15-618, Spring 2015 Lecture 19: Transactional Memory Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of the slides in today s talk are borrowed from Professor Christos Kozyrakis

More information

CS4021/4521 INTRODUCTION

CS4021/4521 INTRODUCTION CS4021/4521 Advanced Computer Architecture II Prof Jeremy Jones Rm 4.16 top floor South Leinster St (SLS) jones@scss.tcd.ie South Leinster St CS4021/4521 2018 jones@scss.tcd.ie School of Computer Science

More information

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 18-742 Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 Past Due: Review Assignments Was Due: Tuesday, October 9, 11:59pm. Sohi

More information

Managing Resource Limitation of Best-Effort HTM

Managing Resource Limitation of Best-Effort HTM Managing Resource Limitation of Best-Effort HTM Mohamed Mohamedin, Roberto Palmieri, Ahmed Hassan, Binoy Ravindran Abstract The first release of hardware transactional memory (HTM) as commodity processor

More information

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,

More information

Implementing Transactional Memory in Kernel space

Implementing Transactional Memory in Kernel space Implementing Transactional Memory in Kernel space Breno Leitão Linux Technology Center leitao@debian.org leitao@freebsd.org Problem Statement Mutual exclusion concurrency control (Lock) Locks type: Semaphore

More information

Transactional Memory

Transactional Memory Transactional Memory Michał Kapałka EPFL, LPD STiDC 08, 1.XII 2008 Michał Kapałka (EPFL, LPD) Transactional Memory STiDC 08, 1.XII 2008 1 / 25 Introduction How to Deal with Multi-Threading? Locks? Wait-free

More information

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6)

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6) Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) 1 Coherence Vs. Consistency Recall that coherence guarantees (i) that a write will eventually be seen by other processors,

More information

Software transactional memory

Software transactional memory Transactional locking II (Dice et. al, DISC'06) Time-based STM (Felber et. al, TPDS'08) Mentor: Johannes Schneider March 16 th, 2011 Motivation Multiprocessor systems Speed up time-sharing applications

More information

HTM in the wild. Konrad Lai June 2015

HTM in the wild. Konrad Lai June 2015 HTM in the wild Konrad Lai June 2015 Industrial Considerations for HTM Provide a clear benefit to customers Improve performance & scalability Ease programmability going forward Improve something common

More information

Lecture: Consistency Models, TM

Lecture: Consistency Models, TM Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) No class on Monday (please watch TM videos) Wednesday: TM wrap-up, interconnection networks 1 Coherence Vs. Consistency

More information

Scheduling Transactions in Replicated Distributed Transactional Memory

Scheduling Transactions in Replicated Distributed Transactional Memory Scheduling Transactions in Replicated Distributed Transactional Memory Junwhan Kim and Binoy Ravindran Virginia Tech USA {junwhan,binoy}@vt.edu CCGrid 2013 Concurrency control on chip multiprocessors significantly

More information

Reminder from last <me

Reminder from last <me Concurrent systems Lecture 2: Mutual exclusion and process synchronisa

More information

COMP3151/9151 Foundations of Concurrency Lecture 8

COMP3151/9151 Foundations of Concurrency Lecture 8 1 COMP3151/9151 Foundations of Concurrency Lecture 8 Transactional Memory Liam O Connor CSE, UNSW (and data61) 8 Sept 2017 2 The Problem with Locks Problem Write a procedure to transfer money from one

More information

CSE Opera,ng System Principles

CSE Opera,ng System Principles CSE 30341 Opera,ng System Principles Synchroniza2on Overview Background The Cri,cal-Sec,on Problem Peterson s Solu,on Synchroniza,on Hardware Mutex Locks Semaphores Classic Problems of Synchroniza,on Monitors

More information

UNIT V: CENTRAL PROCESSING UNIT

UNIT V: CENTRAL PROCESSING UNIT UNIT V: CENTRAL PROCESSING UNIT Agenda Basic Instruc1on Cycle & Sets Addressing Instruc1on Format Processor Organiza1on Register Organiza1on Pipeline Processors Instruc1on Pipelining Co-Processors RISC

More information

An Update on Haskell H/STM 1

An Update on Haskell H/STM 1 An Update on Haskell H/STM 1 Ryan Yates and Michael L. Scott University of Rochester TRANSACT 10, 6-15-2015 1 This work was funded in part by the National Science Foundation under grants CCR-0963759, CCF-1116055,

More information

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University)

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) Cost of Concurrency in Hybrid Transactional Memory Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) 1 Transactional Memory: a history Hardware TM Software TM Hybrid TM 1993 1995-today

More information

Reminder from last ;me

Reminder from last ;me Concurrent systems Lecture 5: Concurrency without shared data; transac;ons Dr Robert N. M. Watson 1 Reminder from last ;me Liveness proper;es Deadlock (requirements; resource alloca;on graphs; detec;on;

More information

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

More information

Concurrent programming: From theory to practice. Concurrent Algorithms 2015 Vasileios Trigonakis

Concurrent programming: From theory to practice. Concurrent Algorithms 2015 Vasileios Trigonakis oncurrent programming: From theory to practice oncurrent Algorithms 2015 Vasileios Trigonakis From theory to practice Theoretical (design) Practical (design) Practical (implementation) 2 From theory to

More information

Lock vs. Lock-free Memory Project proposal

Lock vs. Lock-free Memory Project proposal Lock vs. Lock-free Memory Project proposal Fahad Alduraibi Aws Ahmad Eman Elrifaei Electrical and Computer Engineering Southern Illinois University 1. Introduction The CPU performance development history

More information

Lecture: Transactional Memory. Topics: TM implementations

Lecture: Transactional Memory. Topics: TM implementations Lecture: Transactional Memory Topics: TM implementations 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock 2 Design Space Data Versioning

More information

DD2451 Parallel and Distributed Computing --- FDD3008 Distributed Algorithms




