Improving STM Performance with Transactional Structs

1 Improving STM Performance with Transactional Structs
Ryan Yates and Michael L. Scott
University of Rochester
IFL
This work was funded in part by the National Science Foundation under grants CCR , CCF , CCF , and CCF , and by support from the IBM Canada Centres for Advanced Studies.

2 Outline
  Haskell STM
  TStruct
  Performance results
  Future work

3 What is Transactional Memory?
Transactional memory is the joining of two ideas:
  The ability to express what should be atomic without saying how.
  An implementation that uses speculation to optimistically execute and try again if needed.

4 Existing Haskell STM Implementation
In STM execution, reads and writes to TVars are tracked in a transactional record (TRec).
Execution continues under the assumption that there have been no conflicts.
A conflict occurs when two threads access the same location and at least one of the accesses is a write.

5 Existing Haskell STM Implementation
At the end of the transaction:
  The RTS validates that reads still match the values in the TVars,
  and commits by performing the writes atomically.
  If validation fails, the transaction starts over.
Similar to OSTM [Fraser, 2004].
No global bottlenecks.
Read-only transactions do not acquire locks.
OSTM is non-blocking; GHC's STM can livelock.
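As a concrete reference point for the two slides above, here is a minimal sketch of a TVar transaction written against the standard `stm` library. The `transfer` and `demo` names are made up for illustration and do not come from the talk; the comments describe the runtime behaviour only at the level the slides do.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- Move `amount` from one counter to another as a single atomic step.
-- Each readTVar/writeTVar is recorded in the transaction record; at commit
-- the runtime validates the reads and reruns the block if one is stale.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  a <- readTVar from
  b <- readTVar to
  writeTVar from (a - amount)
  writeTVar to   (b + amount)

-- Hypothetical demo: the two balances after one transfer.
demo :: IO (Int, Int)
demo = do
  x <- newTVarIO 100
  y <- newTVarIO 0
  atomically (transfer x y 30)
  (,) <$> readTVarIO x <*> readTVarIO y
```

The code says what must be atomic (`atomically (transfer x y 30)`) and leaves the how, including retry on conflict, to the runtime.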

6 Motivation
Haskell STM concurrent data structures suffer bloat from indirection.
Diagram (fields of each heap object):
  Node: key, value, color, parent, left, right
  TVar: version, watch-list, value
  Boxed value: Int#

7 Our Work: TStruct
Removes indirection required by TVars.
Mutable unboxed values paired with mutable pointer values.
Increases locality of data structure nodes.
Maintains properties:
  Commit parallels TVar commit.
  No global bottleneck.
  Read-only transactions do not acquire locks.
Avoids conflating lock and value.
Flexible transactional variable granularity.

8 Data Structures
  Red-Black Tree
  Skip List
  Cuckoo Hash Table
  Hashed Array Mapped Trie (HAMT)

9 Red-Black Tree
Rebalancing
TVar
  Extra indirection
  No mutable unboxed values (Color field)
TStruct
  Node initialization
  Nil node and indirection
  Accesses at constant offsets

10 Extra Indirection
    -- TVar
    data Node k v
      = Node { _key    :: !k
             , _value  :: !v
             , _color  :: TVar Color
             , _parent :: TVar (Node k v)
             , _left   :: TVar (Node k v)
             , _right  :: TVar (Node k v)
             }
      | Nil
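To make the indirection visible, the following sketch walks a tree built from the TVar-based node type using only the standard `stm` library. It is an illustration, not the authors' code: `lookupNode`, `newLeaf`, and `demo` are hypothetical helpers, and no rebalancing is performed. Each step to a child goes through `readTVar`, while the immutable `_key` and `_value` fields are read directly.

```haskell
import Control.Concurrent.STM

data Color = Red | Black deriving (Eq, Show)

data Node k v
  = Node { _key    :: !k
         , _value  :: !v
         , _color  :: TVar Color
         , _parent :: TVar (Node k v)
         , _left   :: TVar (Node k v)
         , _right  :: TVar (Node k v)
         }
  | Nil

-- Plain binary-search descent; every child pointer costs one readTVar.
lookupNode :: Ord k => k -> Node k v -> STM (Maybe v)
lookupNode _ Nil = pure Nothing
lookupNode k n = case compare k (_key n) of
  EQ -> pure (Just (_value n))          -- immutable field, no TVar involved
  LT -> readTVar (_left n)  >>= lookupNode k
  GT -> readTVar (_right n) >>= lookupNode k

-- Allocate a red node whose parent and children are all Nil.
newLeaf :: k -> v -> STM (Node k v)
newLeaf k v = Node k v <$> newTVar Red <*> newTVar Nil <*> newTVar Nil <*> newTVar Nil

-- Hypothetical demo: a root with one left child, then two lookups.
demo :: IO (Maybe String, Maybe String)
demo = atomically $ do
  root <- newLeaf (2 :: Int) "two"
  l    <- newLeaf 1 "one"
  writeTVar (_left root) l
  writeTVar (_parent l) root
  (,) <$> lookupNode 1 root <*> lookupNode 3 root
```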

11 Avoiding Sum Type Indirection
    -- TStruct with sum type
    data Node k v
      = Node { _tstruct :: TStruct# RealWorld (Node k v) }
      | Nil

    -- TStruct without sum type
    data Node k v = Node { _tstruct :: TStruct# RealWorld Any }

12 Skip List
No rebalancing needed
Random number source
TVar
  Pure node with TArray of next pointers
  Extra indirection
TStruct
  TStruct containing both values and next pointers
  Node initialization

13 Node Initialization
When a new node is made in a transaction, no other thread can see it until the transaction has committed.
We take advantage of this and access these nodes non-transactionally.
In the skip list, this happens on insertion: the new node is created and its next pointers are written to match the previous node at that level.
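The same idea can be illustrated with ordinary TVars, where a freshly allocated variable receives its initial value at allocation time. The sketch below is a single-level sorted list rather than a real skip list, and `List`, `insert`, `toList`, and `demo` are hypothetical names; it only shows that the new node's next pointer is set before the node becomes reachable, and that the one write other threads can observe is the update of the predecessor. Duplicates are not filtered.

```haskell
import Control.Concurrent.STM

data List = Cons { _item :: !Int, _next :: TVar List } | End

-- Insert x into the sorted list whose first cell is held in ptr.
insert :: TVar List -> Int -> STM ()
insert ptr x = do
  cur <- readTVar ptr
  case cur of
    Cons y nxt | y < x -> insert nxt x
    _ -> do
      -- The new node is private to this transaction until commit, so its
      -- next pointer is initialised to match what the predecessor held.
      nxt' <- newTVar cur
      -- Publishing the node: the only update visible to other threads.
      writeTVar ptr (Cons x nxt')

toList :: TVar List -> STM [Int]
toList ptr = do
  cur <- readTVar ptr
  case cur of
    End        -> pure []
    Cons y nxt -> (y :) <$> toList nxt

-- Hypothetical demo: three separate insert transactions, then a snapshot.
demo :: IO [Int]
demo = do
  hd <- newTVarIO End
  mapM_ (atomically . insert hd) [3, 1, 2]
  atomically (toList hd)
```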

14 Cuckoo Hash Table
Only insert needs to do significant work.
TVar
  TArray of immutable buckets
TStruct
  Immutable array of TStruct buckets

15 Cuckoo Hash Table
Diagram of the two layouts:
  TVar: a TArray (size, ...) of TVars (version, watch-list, value), each pointing to a bucket Array (size, entry, ...).
  TStruct: an Array (size, entry, ...) of TStruct buckets (lock, lock-count, version, watch-list, count, key1, key2, ..., keyn, value1, value2, ..., valuen).

16 Hashed Array Mapped Trie (HAMT)
No rebalancing
TVar
  Extra indirection
TStruct
  Node initialization
  Immutable fields
  Node tags

17 Immutable fields
Fields that are immutable can safely be read non-transactionally. No bookkeeping needed!
Two primitive read functions:
  Transactional: readtstruct#, implemented in Cmm and C.
  Non-transactional: readtstructnt#, implemented in the code generator.
Things can go wrong in very unexpected ways!

18 Node tags
    -- TVar
    data Node a
      = Nodes (TVar (WordArray a))
      | Leaf Hash a
      | Leaves Hash (SizedArray a)

    data WordArray a  = WordArray Bitmap (Array (Node a))
    data SizedArray a = SizedArray Size (Array a)

19 Node tags
    -- TStruct
    data Node a
      = WordArray Size Bitmap (Array (Node a))
      | SizedArray Size Hash (Array a)

    data Node a = Node { _tstruct :: TStruct# RealWorld Any }

20 HAMT Nodes
Diagram (TVar-based layout): a Node (tag=nodes, tvar) points to a TVar (version, watch-list, value), which points to a WordArray (bitmap, array), which points to a SizedArray (size, entry, ...) of child Nodes. A Node with tag=leaves (hash, array) points to a SizedArray (size, entry, ...).

21 HAMT Nodes
Diagram (TStruct-based layout): a TVar (version, watch-list, value) points to a WordArray (lock, lock-count, version, watch-list, tag=0, size, bitmap, entry, ...), whose entries point to further WordArrays of the same shape or to a SizedArray (lock, lock-count, version, watch-list, tag=1, size, hash, entry, ...).

22 Code Example
Example from lookup in the TVar-based HAMT. Pattern matching ensures we do not handle a leaf as a node.
    lookuptvar ... = do
      arr <- readTVar ...
      case wordarraylookup i arr of
        Nothing            -> ...
        Just (Nodes ns)    -> ...
        Just (Leaves h la) -> ...
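The slides do not show how the bitmap-indexed lookup itself works. A common way to implement it in a HAMT, sketched here as an assumption rather than as the authors' code, is to test the bit for logical position `i` and, if it is set, count the set bits below it to find the slot in the dense array. `Sparse` and `sparseLookup` are hypothetical stand-ins for the slides' `WordArray` and its lookup, with a list in place of an array and a 32-bit bitmap.

```haskell
import Data.Bits (bit, popCount, shiftL, testBit, (.&.), (.|.))
import Data.Word (Word32)

-- A bitmap marking which of 32 logical positions are occupied, plus the
-- occupied entries stored densely in position order.
data Sparse a = Sparse Word32 [a]

-- Look up logical position i (0..31).
sparseLookup :: Int -> Sparse a -> Maybe a
sparseLookup i (Sparse bitmap entries)
  | testBit bitmap i = Just (entries !! popCount (bitmap .&. below))
  | otherwise        = Nothing
  where
    below = (1 `shiftL` i) - 1   -- mask selecting the bits below position i

-- Hypothetical demo: positions 1, 4, and 9 are occupied.
demo :: (Maybe Char, Maybe Char, Maybe Char)
demo = (sparseLookup 4 s, sparseLookup 3 s, sparseLookup 9 s)
  where
    s = Sparse (bit 1 .|. bit 4 .|. bit 9) "abc"
```

Because the result is a `Maybe` of an ordinary sum type, the caller's pattern match distinguishes interior nodes from leaves, which is the safety property the slide points to.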

23 Code Example
In the TStruct-based HAMT lookup we lose safety.
  No bounds check in readtstructwordnt#.
  Nodes can be confused with leaves.
    readtagnt (WordArray arr#) = STM $ \s1# ->
      case readtstructwordnt# arr# 0# s1# of
        (# s2#, w# #) -> (# s2#, W# w# #)

    lookuptstruct ... arr = do
      t <- readtagnt arr
      case t of
        0   -> do ... readindicesnt arr ...
        ... -> do ... readhashnt arr ...

24 Benchmarks
Machine
  Intel Xeon E v3, two-socket, 36-core, 72-thread
Tests
  Data structure with concurrent inserts (5%), deletes (5%), and lookups (90%), measuring throughput at steady state.
  Structure initially has 50,000 entries in a key space of 100,000 keys.
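The operation mix can be sketched as follows. This is not the benchmark harness from the talk: it runs on one thread, measures nothing, uses a `TVar` holding a `Data.Map` as a stand-in for the structures under test, and draws keys from a small hand-written linear congruential generator. All names (`pickOp`, `nextSeed`, `runOp`, `initial`, `demo`) are assumptions made for illustration.

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as M

data Op = Lookup | Insert | Delete deriving (Eq, Show)

-- Map a number in [0,100) to an operation: 90% lookups, 5% inserts, 5% deletes.
pickOp :: Int -> Op
pickOp r
  | r < 90    = Lookup
  | r < 95    = Insert
  | otherwise = Delete

-- Small linear congruential generator (assumes a 64-bit Int).
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- One operation as one transaction body.
runOp :: TVar (M.Map Int Int) -> Op -> Int -> STM ()
runOp tv Lookup k = do
  m <- readTVar tv
  M.member k m `seq` pure ()
runOp tv Insert k = modifyTVar' tv (M.insert k k)
runOp tv Delete k = modifyTVar' tv (M.delete k)

-- 50,000 initial entries drawn from a key space of 100,000 keys.
initial :: M.Map Int Int
initial = M.fromList [(k, k) | k <- [0, 2 .. 99998]]

-- Run n operations sequentially and report the final size.
demo :: Int -> IO Int
demo n = do
  tv <- newTVarIO initial
  let loop 0 _ = pure ()
      loop i s = do
        let s1 = nextSeed s
            s2 = nextSeed s1
        atomically (runOp tv (pickOp ((s1 `div` 65536) `mod` 100)) (s2 `mod` 100000))
        loop (i - 1 :: Int) s2
  loop n 42
  M.size <$> readTVarIO tv
```

With inserts and deletes equally likely and half the key space occupied, the structure size stays close to its initial value, which is what allows throughput to be measured at steady state.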

25 TVar (Intel Xeon E v3, two-socket, 36-core)
Plot: operations per second vs. threads for RBTree, SkipList, Cuckoo, and HAMT.

26 HAMT (Intel Xeon E v3, two-socket, 36-core)
Plot: operations per second vs. threads for TVar, TStruct, and CTrie.

27 Cuckoo Hash (Intel Xeon E v3, two-socket, 36-core)
Plot: operations per second vs. threads for TVar and TStruct.

28 Skip List (Intel Xeon E v3, two-socket, 36-core)
Plot: operations per second vs. threads for TVar and TStruct.

29 Red-Black Tree (Intel Xeon E v3, two-socket, 36-core)
Plot: operations per second vs. threads for TVar and TStruct.

30 Future Work
Continue to improve performance and understand what factors contribute to good performance.
Recover safety for TStruct features:
  Node initialization
  Node tagging
  Accesses at constant offsets
Other data structures

31 Thanks!

33 Haskell STM Metadata Structure
Diagram (fields of each heap object):
  Node: key, value, parent, left, right, color
  TVar: value, watch
  Watch Queue: thread, next, prev
  TRec: prev, index, and entries of (tvar, old, new)

34 Haskell Before TStruct
Diagram: Nodes (key, value, parent, left, right, color) linked through a TVar (value, watch).

35 Haskell with TStruct
Diagram: Nodes (lock, watch, key, value, color, parent, left, right) with no separate TVar objects.

36 Haskell STM commit
    commit(trec* trec) {
      if (validate(trec)) {
        if (read_check(trec)) {
          update(trec)
          return true
        }
      }
      return false
    }

37 Haskell STM commit
    bool validate(trec* trec) {
      for (e in trec) {
        if (is_write(e)) {
          // Lock each written TVar and check that it still holds the expected value.
          if (!lock(e) || e->value != e->tvar->value) {
            release_locks(trec)
            return false
          }
        } else {
          // For reads, remember the version seen now for the later read check.
          e->version = e->tvar->version
        }
      }
      return true
    }

38 Haskell STM commit
    bool read_check(trec* trec) {
      for (e in trec) {
        if (is_read(e)) {
          if (e->value != e->tvar->value || e->version != e->tvar->version) {
            release_locks(trec)
            return false
          }
        }
      }
      return true
    }

39 Haskell STM commit
    update(trec* trec) {
      for (e in trec) {
        if (is_write(e)) {
          e->tvar->version++
          e->tvar->value = e->new_value
        }
      }
    }
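The validate-then-update flow in the pseudocode can be modelled in a few lines of Haskell. This is a deliberately simplified, single-threaded model and not the GHC runtime: there is no locking, `validate` and `read_check` are collapsed into one value comparison, and `Var`, `Entry`, `commit`, and the other names are hypothetical. It only shows that a transaction record whose expected values no longer match memory is rejected without applying any writes.

```haskell
import Data.IORef

-- A model variable: a version counter and a value.
data Var = Var { varVersion :: IORef Int, varValue :: IORef Int }

-- One transaction-record entry: the variable, the value the transaction
-- saw, and the value to write (Nothing marks a read-only entry).
data Entry = Entry { eVar :: Var, eOld :: Int, eNew :: Maybe Int }

newVar :: Int -> IO Var
newVar x = Var <$> newIORef 0 <*> newIORef x

-- An entry is valid if memory still holds the value the transaction saw.
stillValid :: Entry -> IO Bool
stillValid e = (== eOld e) <$> readIORef (varValue (eVar e))

-- Apply a write entry: bump the version, then store the new value.
update :: Entry -> IO ()
update e = case eNew e of
  Nothing -> pure ()
  Just x  -> do
    modifyIORef' (varVersion (eVar e)) (+ 1)
    writeIORef (varValue (eVar e)) x

-- Commit succeeds only if every entry validates; otherwise nothing is written.
commit :: [Entry] -> IO Bool
commit trec = do
  ok <- and <$> mapM stillValid trec
  if ok then mapM_ update trec >> pure True else pure False

-- Hypothetical demo: the second record is stale because the first one committed.
demo :: IO (Bool, Bool, Int)
demo = do
  v     <- newVar 10
  ok    <- commit [Entry v 10 (Just 11)]
  stale <- commit [Entry v 10 (Just 12)]
  final <- readIORef (varValue v)
  pure (ok, stale, final)
```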

40 References
[Fraser, 2004] Fraser, K. (2004). Practical lock-freedom. PhD thesis, University of Cambridge Computer Laboratory.

An Update on Haskell H/STM 1

An Update on Haskell H/STM 1 An Update on Haskell H/STM 1 Ryan Yates and Michael L. Scott University of Rochester TRANSACT 10, 6-15-2015 1 This work was funded in part by the National Science Foundation under grants CCR-0963759, CCF-1116055,

More information

Improving STM Performance with Transactional Structs

Improving STM Performance with Transactional Structs Improving STM Performance with Transactional Structs Abstract Ryan Yates Computer Science Department University of Rochester Rochester, NY, USA ryates@cs.rochester.edu Software transactional memory (STM)

More information

Improving STM Performance with Transactional Structs

Improving STM Performance with Transactional Structs Improving STM Performance with Transactional Structs Ryan Yates Computer Science Department University of Rochester Rochester, NY, USA ryates@cs.rochester.edu Michael L. Scott Computer Science Department

More information

Leveraging Hardware TM in Haskell

Leveraging Hardware TM in Haskell Abstract Ryan Yates Houghton College Houghton, NY, USA ryan.yates@houghton.edu Transactional memory (TM) is heavily used for synchronization in the Haskell programming language, but its performance has

More information

COMP3151/9151 Foundations of Concurrency Lecture 8

COMP3151/9151 Foundations of Concurrency Lecture 8 1 COMP3151/9151 Foundations of Concurrency Lecture 8 Transactional Memory Liam O Connor CSE, UNSW (and data61) 8 Sept 2017 2 The Problem with Locks Problem Write a procedure to transfer money from one

More information

Conflict Detection and Validation Strategies for Software Transactional Memory

Conflict Detection and Validation Strategies for Software Transactional Memory Conflict Detection and Validation Strategies for Software Transactional Memory Michael F. Spear, Virendra J. Marathe, William N. Scherer III, and Michael L. Scott University of Rochester www.cs.rochester.edu/research/synchronization/

More information

Revisiting Software Transactional Memory in Haskell 1

Revisiting Software Transactional Memory in Haskell 1 Revisiting Software Transactional Memory in Haskell Matthew Le Rochester Institute of Technology ml995@cs.rit.edu Ryan Yates University of Rochester ryates@cs.rochester.edu Matthew Fluet Rochester Institute

More information

A Hybrid TM for Haskell

A Hybrid TM for Haskell A Hybrid TM for Haskell Ryan Yates Michael L. Scott Computer Science Department, University of Rochester {ryates,scott}@cs.rochester.edu Abstract Much of the success of Haskell s Software Transactional

More information

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based? Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based

More information

A Practical Scalable Distributed B-Tree

A Practical Scalable Distributed B-Tree A Practical Scalable Distributed B-Tree CS 848 Paper Presentation Marcos K. Aguilera, Wojciech Golab, Mehul A. Shah PVLDB 08 March 8, 2010 Presenter: Evguenia (Elmi) Eflov Presentation Outline 1 Background

More information

Split-Ordered Lists: Lock-Free Extensible Hash Tables. Pierre LaBorde

Split-Ordered Lists: Lock-Free Extensible Hash Tables. Pierre LaBorde 1 Split-Ordered Lists: Lock-Free Extensible Hash Tables Pierre LaBorde Nir Shavit 2 Tel-Aviv University, Israel Ph.D. from Hebrew University Professor at School of Computer Science at Tel-Aviv University

More information

CSE 230. Concurrency: STM. Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart

CSE 230. Concurrency: STM. Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart CSE 230 Concurrency: STM Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart The Grand Challenge How to properly use multi-cores? Need new programming models! Parallelism vs Concurrency

More information

Composable Shared Memory Transactions Lecture 20-2

Composable Shared Memory Transactions Lecture 20-2 Composable Shared Memory Transactions Lecture 20-2 April 3, 2008 This was actually the 21st lecture of the class, but I messed up the naming of subsequent notes files so I ll just call this one 20-2. This

More information

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 20: Transactional Memory Parallel Computer Architecture and Programming Slide credit Many of the slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford University) Raising

More information

Atomicity via Source-to-Source Translation

Atomicity via Source-to-Source Translation Atomicity via Source-to-Source Translation Benjamin Hindman Dan Grossman University of Washington 22 October 2006 Atomic An easier-to-use and harder-to-implement primitive void deposit(int x){ synchronized(this){

More information

Lowering the Overhead of Nonblocking Software Transactional Memory

Lowering the Overhead of Nonblocking Software Transactional Memory Lowering the Overhead of Nonblocking Software Transactional Memory Virendra J. Marathe Michael F. Spear Christopher Heriot Athul Acharya David Eisenstat William N. Scherer III Michael L. Scott Background

More information

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012 NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 31 October 2012 Lecture 6 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability

More information

Multicore programming in Haskell. Simon Marlow Microsoft Research

Multicore programming in Haskell. Simon Marlow Microsoft Research Multicore programming in Haskell Simon Marlow Microsoft Research A concurrent web server server :: Socket -> IO () server sock = forever (do acc

More information

Tackling Concurrency With STM. Mark Volkmann 10/22/09

Tackling Concurrency With STM. Mark Volkmann 10/22/09 Tackling Concurrency With Mark Volkmann mark@ociweb.com 10/22/09 Two Flavors of Concurrency Divide and conquer divide data into subsets and process it by running the same code on each subset concurrently

More information

Tackling Concurrency With STM

Tackling Concurrency With STM Tackling Concurrency With Mark Volkmann mark@ociweb.com 10/22/09 Two Flavors of Concurrency Divide and conquer divide data into subsets and process it by running the same code on each subset concurrently

More information

Understanding Hardware Transactional Memory

Understanding Hardware Transactional Memory Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence

More information

SILT: A Memory-Efficient, High- Performance Key-Value Store

SILT: A Memory-Efficient, High- Performance Key-Value Store SILT: A Memory-Efficient, High- Performance Key-Value Store SOSP 11 Presented by Fan Ni March, 2016 SILT is Small Index Large Tables which is a memory efficient high performance key value store system

More information

A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention

A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention Jonatan Lindén and Bengt Jonsson Uppsala University, Sweden December 18, 2013 Jonatan Lindén 1 Contributions Motivation: Improve

More information

Design Tradeoffs in Modern Software Transactional Memory Systems

Design Tradeoffs in Modern Software Transactional Memory Systems Design Tradeoffs in Modern Software al Memory Systems Virendra J. Marathe, William N. Scherer III, and Michael L. Scott Department of Computer Science University of Rochester Rochester, NY 14627-226 {vmarathe,

More information

Locks and Threads and Monads OOo My. Stephan Bergmann StarOffice/OpenOffice.org Sun Microsystems

Locks and Threads and Monads OOo My. Stephan Bergmann StarOffice/OpenOffice.org Sun Microsystems Locks and Threads and Monads OOo My Stephan Bergmann StarOffice/OpenOffice.org Sun Microsystems Locks and Threads and Monads OOo My 1 - Tomorrow's hardware... 2 -...and today's software 3 - Stateful vs.

More information

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017 NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 17 November 2017 Lecture 7 Linearizability Lock-free progress properties Hashtables and skip-lists Queues Reducing contention Explicit

More information

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University)

Cost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) Cost of Concurrency in Hybrid Transactional Memory Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) 1 Transactional Memory: a history Hardware TM Software TM Hybrid TM 1993 1995-today

More information

TRANSACTION MEMORY. Presented by Hussain Sattuwala Ramya Somuri

TRANSACTION MEMORY. Presented by Hussain Sattuwala Ramya Somuri TRANSACTION MEMORY Presented by Hussain Sattuwala Ramya Somuri AGENDA Issues with Lock Free Synchronization Transaction Memory Hardware Transaction Memory Software Transaction Memory Conclusion 1 ISSUES

More information

unreadtvar: Extending Haskell Software Transactional Memory for Performance

unreadtvar: Extending Haskell Software Transactional Memory for Performance unreadtvar: Extending Haskell Software Transactional Memory for Performance Nehir Sonmez, Cristian Perfumo, Srdjan Stipic, Adrian Cristal, Osman S. Unsal, and Mateo Valero Barcelona Supercomputing Center,

More information

Synchronising Threads

Synchronising Threads Synchronising Threads David Chisnall March 1, 2011 First Rule for Maintainable Concurrent Code No data may be both mutable and aliased Harder Problems Data is shared and mutable Access to it must be protected

More information

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,

More information

Implementierungstechniken für Hauptspeicherdatenbanksysteme: The Bw-Tree

Implementierungstechniken für Hauptspeicherdatenbanksysteme: The Bw-Tree Implementierungstechniken für Hauptspeicherdatenbanksysteme: The Bw-Tree Josef Schmeißer January 9, 218 Abstract The Bw-Tree as presented by Levandoski et al. was designed to accommodate the emergence

More information

Hardware Transactional Memory on Haswell

Hardware Transactional Memory on Haswell Hardware Transactional Memory on Haswell Viktor Leis Technische Universität München 1 / 15 Introduction transactional memory is a very elegant programming model transaction { transaction { a = a 10; c

More information

! Part I: Introduction. ! Part II: Obstruction-free STMs. ! DSTM: an obstruction-free STM design. ! FSTM: a lock-free STM design

! Part I: Introduction. ! Part II: Obstruction-free STMs. ! DSTM: an obstruction-free STM design. ! FSTM: a lock-free STM design genda Designing Transactional Memory ystems Part II: Obstruction-free TMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch! Part I: Introduction! Part II: Obstruction-free TMs! DTM: an obstruction-free

More information

Introduction to Locks. Intrinsic Locks

Introduction to Locks. Intrinsic Locks CMSC 433 Programming Language Technologies and Paradigms Spring 2013 Introduction to Locks Intrinsic Locks Atomic-looking operations Resources created for sequential code make certain assumptions, a large

More information

Transactional Memory. Yaohua Li and Siming Chen. Yaohua Li and Siming Chen Transactional Memory 1 / 41

Transactional Memory. Yaohua Li and Siming Chen. Yaohua Li and Siming Chen Transactional Memory 1 / 41 Transactional Memory Yaohua Li and Siming Chen Yaohua Li and Siming Chen Transactional Memory 1 / 41 Background Processor hits physical limit on transistor density Cannot simply put more transistors to

More information

!!"!#"$%& Atomic Blocks! Atomic blocks! 3 primitives: atomically, retry, orelse!

!!!#$%& Atomic Blocks! Atomic blocks! 3 primitives: atomically, retry, orelse! cs242! Kathleen Fisher!! Multi-cores are coming!! - For 50 years, hardware designers delivered 40-50% increases per year in sequential program speed.! - Around 2004, this pattern failed because power and

More information

Lecture 21 Concurrency Control Part 1

Lecture 21 Concurrency Control Part 1 CMSC 461, Database Management Systems Spring 2018 Lecture 21 Concurrency Control Part 1 These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used from

More information

Chapter 15 : Concurrency Control

Chapter 15 : Concurrency Control Chapter 15 : Concurrency Control What is concurrency? Multiple 'pieces of code' accessing the same data at the same time Key issue in multi-processor systems (i.e. most computers today) Key issue for parallel

More information

Fine-grained synchronization & lock-free programming

Fine-grained synchronization & lock-free programming Lecture 17: Fine-grained synchronization & lock-free programming Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Tunes Minnie the Moocher Robbie Williams (Swings Both Ways)

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution

Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution Ravi Rajwar and Jim Goodman University of Wisconsin-Madison International Symposium on Microarchitecture, Dec. 2001 Funding

More information

Improving the Practicality of Transactional Memory

Improving the Practicality of Transactional Memory Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter

More information

Concurrency Control. R &G - Chapter 19

Concurrency Control. R &G - Chapter 19 Concurrency Control R &G - Chapter 19 Smile, it is the key that fits the lock of everybody's heart. Anthony J. D'Angelo, The College Blue Book Review DBMSs support concurrency, crash recovery with: ACID

More information

Bw-Tree. Josef Schmeißer. January 9, Josef Schmeißer Bw-Tree January 9, / 25

Bw-Tree. Josef Schmeißer. January 9, Josef Schmeißer Bw-Tree January 9, / 25 Bw-Tree Josef Schmeißer January 9, 2018 Josef Schmeißer Bw-Tree January 9, 2018 1 / 25 Table of contents 1 Fundamentals 2 Tree Structure 3 Evaluation 4 Further Reading Josef Schmeißer Bw-Tree January 9,

More information

Transactional Memory. Lecture 18: Parallel Computer Architecture and Programming CMU /15-618, Spring 2017

Transactional Memory. Lecture 18: Parallel Computer Architecture and Programming CMU /15-618, Spring 2017 Lecture 18: Transactional Memory Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2017 Credit: many slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford

More information

Concurrency Control CHAPTER 17 SINA MERAJI

Concurrency Control CHAPTER 17 SINA MERAJI Concurrency Control CHAPTER 17 SINA MERAJI Announcement Sign up for final project presentations here: https://docs.google.com/spreadsheets/d/1gspkvcdn4an3j3jgtvduaqm _x4yzsh_jxhegk38-n3k/edit#gid=0 Deadline

More information

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

More information

Transactional Memory. Lecture 19: Parallel Computer Architecture and Programming CMU /15-618, Spring 2015

Transactional Memory. Lecture 19: Parallel Computer Architecture and Programming CMU /15-618, Spring 2015 Lecture 19: Transactional Memory Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of the slides in today s talk are borrowed from Professor Christos Kozyrakis

More information

Linked Lists: The Role of Locking. Erez Petrank Technion

Linked Lists: The Role of Locking. Erez Petrank Technion Linked Lists: The Role of Locking Erez Petrank Technion Why Data Structures? Concurrent Data Structures are building blocks Used as libraries Construction principles apply broadly This Lecture Designing

More information

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Steve Gribble. Synchronization. Threads cooperate in multithreaded programs

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Steve Gribble. Synchronization. Threads cooperate in multithreaded programs CSE 451: Operating Systems Winter 2005 Lecture 7 Synchronization Steve Gribble Synchronization Threads cooperate in multithreaded programs to share resources, access shared data structures e.g., threads

More information

Chapter 13 : Concurrency Control

Chapter 13 : Concurrency Control Chapter 13 : Concurrency Control Chapter 13: Concurrency Control Lock-Based Protocols Timestamp-Based Protocols Validation-Based Protocols Multiple Granularity Multiversion Schemes Insert and Delete Operations

More information

Transactional Memory

Transactional Memory Transactional Memory Architectural Support for Practical Parallel Programming The TCC Research Group Computer Systems Lab Stanford University http://tcc.stanford.edu TCC Overview - January 2007 The Era

More information

Introduction to the HAMT: Opportunity for Tcl Tcl Conference Don Porter Tcl/Tk Release Manager

Introduction to the HAMT: Opportunity for Tcl Tcl Conference Don Porter Tcl/Tk Release Manager Introduction to the HAMT: Opportunity for Tcl 2017 Tcl Conference Don Porter Tcl/Tk Release Manager Hash Maps in Tcl Dictionaries Array variables Name lookups (commands, vars, etc.) Much much more Most

More information

Implementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University

Implementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Implementing and Evaluating Nested Parallel Transactions in STM Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Introduction // Parallelize the outer loop for(i=0;i

More information

MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing

MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing Bin Fan (CMU), Dave Andersen (CMU), Michael Kaminsky (Intel Labs) NSDI 2013 http://www.pdl.cmu.edu/ 1 Goal: Improve Memcached 1. Reduce space overhead

More information

Distributed Transaction Management 2003

Distributed Transaction Management 2003 Distributed Transaction Management 2003 Jyrki Nummenmaa http://www.cs.uta.fi/~dtm jyrki@cs.uta.fi General information We will view this from the course web page. Motivation We will pick up some motivating

More information

Chí Cao Minh 28 May 2008

Chí Cao Minh 28 May 2008 Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor

More information

Teleportation as a Strategy for Improving Concurrent Skiplist Performance. Frances Steen

Teleportation as a Strategy for Improving Concurrent Skiplist Performance. Frances Steen Teleportation as a Strategy for Improving Concurrent Skiplist Performance by Frances Steen Submitted to the Department of Computer Science in partial fulfillment of the requirements for the degree of Bachelor

More information

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES Anish Athalye and Patrick Long Mentors: Austin Clements and Stephen Tu 3 rd annual MIT PRIMES Conference Sequential

More information

Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems

Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas Outline Synchronization Methods Priority Queues Concurrent Priority Queues Lock-Free Algorithm: Problems

More information

NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems

NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems Haris Volos 1, Adam Welc 2, Ali-Reza Adl-Tabatabai 2, Tatiana Shpeisman 2, Xinmin Tian 2, and Ravi Narayanaswamy

More information

Blurred Persistence in Transactional Persistent Memory

Blurred Persistence in Transactional Persistent Memory Blurred Persistence in Transactional Persistent Memory Youyou Lu, Jiwu Shu, Long Sun Tsinghua University Overview Problem: high performance overhead in ensuring storage consistency of persistent memory

More information

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 21 November 2014

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 21 November 2014 NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 21 November 2014 Lecture 7 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability

More information

Fine-grained synchronization & lock-free data structures

Fine-grained synchronization & lock-free data structures Lecture 19: Fine-grained synchronization & lock-free data structures Parallel Computer Architecture and Programming Redo Exam statistics Example: a sorted linked list struct Node { int value; Node* next;

More information

Linearizability of Persistent Memory Objects

Linearizability of Persistent Memory Objects Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Workshop on the Theory of Transactional

More information

Simon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010

Simon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010 Simon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010 Geeks Practitioners 1,000,000 10,000 100 1 The quick death 1yr 5yr 10yr 15yr Geeks Practitioners 1,000,000 10,000 100 The slow

More information

Non-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei

Outline Introduction Concurrent stacks and queues Contributions New algorithms New algorithms using bounded counter values Correctness

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Hank Levy 412 Sieg Hall

CSE 451: Operating Systems Winter 2003 Lecture 7 Synchronization Hank Levy Levy@cs.washington.edu 412 Sieg Hall Synchronization Threads cooperate in multithreaded programs to share resources, access shared

IMPORTANT: Circle the last two letters of your class account:

Spring 2011 University of California, Berkeley College of Engineering Computer Science Division EECS MIDTERM I CS 186 Introduction to Database Systems Prof. Michael J. Franklin NAME: STUDENT ID: IMPORTANT:

Performance Improvement via Always-Abort HTM

Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing

Programmazione di sistemi multicore

A.A. 2015-2016 LECTURE 14 IRENE FINOCCHI http://wwwusers.di.uniroma1.it/~finocchi/ Programming with locks and critical sections MORE BAD INTERLEAVINGS GUIDELINES FOR

RocksDB Key-Value Store Optimized For Flash

Siying Dong Software Engineer, Database Engineering Team @ Facebook April 20, 2016 Agenda 1 What is RocksDB? 2 RocksDB Design 3 Other Features What is RocksDB?

Low Overhead Concurrency Control for Partitioned Main Memory Databases

Evan Jones, Daniel Abadi, Samuel Madden, June 2010, SIGMOD CS 848 May, 2016 Michael Abebe Background Motivations Database partitioning

A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes:

Lock-Based Protocols Concurrency Control A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes: 1 exclusive (X) mode Data item can be both read as well

Implementing Symmetric Multiprocessing in LispWorks

Making a multithreaded application more multithreaded Martin Simmons, LispWorks Ltd Copyright 2009 LispWorks Ltd Outline Introduction Changes in LispWorks

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed

Heckaton. SQL Server's Memory Optimized OLTP Engine

Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability

A Comparison of Relativistic and Reader-Writer Locking Approaches to Shared Data Access

Philip W. Howard, Josh Triplett, and Jonathan Walpole Portland State University Abstract. This paper explores the

1 RCU. 2 Improving spinlock performance. 3 Kernel interface for sleeping locks. 4 Deadlock. 5 Transactions. 6 Scalable interface design

Overview of Monday's and today's lectures Outline Locks create serial code - Serial code gets no speedup from multiprocessors Test-and-set spinlock has additional disadvantages - Lots of traffic over memory

Tom Hart, University of Toronto Paul E. McKenney, IBM Beaverton Angela Demke Brown, University of Toronto

Making Lockless Synchronization Fast: Performance Implications of Memory Reclamation Outline Motivation

A Skip List for Multicore

Ian Dick University of Sydney Alan Fekete University of Sydney Vincent Gramoli University of Sydney Abstract In this paper, we introduce the Rotating skip list, the fastest concurrent

LogTM: Log-Based Transactional Memory

Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Multifacet

Dalí: A Periodically Persistent Hash Map

Faisal Nawab* 1, Joseph Izraelevitz* 2, Terence Kelly*, Charles B. Morrey III*, Dhruva R. Chakrabarti*, and Michael L. Scott 2 1 Department of Computer Science

Advances in Programming Languages

APL5: Further language concurrency mechanisms David Aspinall (including slides by Ian Stark) School of Informatics The University of Edinburgh Tuesday 5th October

COMP3151/9151 Foundations of Concurrency Lecture 8

Liam O'Connor CSE, UNSW (and data61) 8 Sept 2017 Shared Data Consider the Readers and Writers problem from Lecture 6: Problem We have a large data

Comparing the Performance of Concurrent Linked-List Implementations in Haskell

Martin Sulzmann IT University of Copenhagen, Denmark martin.sulzmann@gmail.com Edmund S. L. Lam National University of Singapore,

Fall 2015 COMP Operating Systems. Lab 06

Fall 2015 COMP 3511 Operating Systems Lab 06 Outline Monitor Deadlocks Logical vs. Physical Address Space Segmentation Example of segmentation scheme Paging Example of paging scheme Paging-Segmentation

Transaction Management: Concurrency Control, part 2

CS634 Class 16 Slides based on Database Management Systems 3rd ed, Ramakrishnan and Gehrke Locking for B+ Trees Naïve solution Ignore tree structure,

Locking for B+ Trees. Transaction Management: Concurrency Control, part 2. Locking for B+ Trees (contd.) Locking vs. Latching

Slides based on Database Management Systems 3rd ed, Ramakrishnan and Gehrke CS634 Class 16 Naïve solution Ignore tree structure,

Monitors; Software Transactional Memory

Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and

What's new in MySQL 5.5? Performance/Scale Unleashed

Mikael Ronström Senior MySQL Architect The preceding is intended to outline our general product direction. It is intended for

Panu Silvasti Page 1

Multicore support in databases Outline Building blocks of a storage manager How do existing storage managers scale? Optimizing Shore database for multicore processors Building

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Topics COS 318: Operating Systems File Performance and Reliability File buffer cache Disk failure and recovery tools Consistent updates Transactions and logging File Buffer Cache for Performance What

Concurrent Data Structures Concurrent Algorithms 2016

Tudor David (based on slides by Vasileios Trigonakis) 11.2016 Data Structures (DSs) Constructs for efficiently storing and retrieving

HydraVM: Mohamed M. Saad Mohamed Mohamedin, and Binoy Ravindran. Hot Topics in Parallelism (HotPar '12), Berkeley, CA

Motivation & Objectives Background Architecture Program Reconstruction Implementation

Software Transactional Memory Should Not Be Obstruction-Free

Robert Ennals IRC-TR-06-052

Massimiliano Ghilardi

7th European Lisp Symposium May 5-6, 2014 IRCAM, Paris, France High performance concurrency in Common Lisp hybrid transactional memory with STMX Beautiful and fast concurrency

MULTI-THREADED QUERIES

15-721 Project 3 Final Presentation Wendong Li (wendongl) Lu Zhang (lzhang3) Rui Wang (ruiw1) Project Objective Intra-operator parallelism Use multiple threads in a single executor

No compromises: distributed transactions with consistency, availability, and performance

Aleksandar Dragojević, Dushyanth Narayanan, Edmund B. Nightingale, Matthew Renzelmann, Alex Shamis, Anirudh Badam,