Improving STM Performance with Transactional Structs
Ryan Yates and Michael L. Scott, University of Rochester
IFL
This work was funded in part by the National Science Foundation under CCR and CCF grants, and by support from the IBM Canada Centres for Advanced Studies.
Outline
- Haskell STM
- TStruct
- Performance results
- Future work
What is Transactional Memory?
Transactional memory is the joining of two ideas:
- The ability to express what should be atomic without saying how.
- An implementation that uses speculation to optimistically execute and try again if needed.
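As a minimal sketch of the first idea, the following uses the standard `stm` package to state that a transfer should be atomic without saying how that is achieved. The `transfer` function and the two balances are hypothetical names chosen for illustration; they do not come from the talk.

```haskell
import Control.Concurrent.STM

-- Hypothetical example: move n units between two balances atomically.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  balance <- readTVar from
  -- check retries the transaction until the condition holds.
  check (balance >= n)
  writeTVar from (balance - n)
  modifyTVar' to (+ n)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  balances <- atomically ((,) <$> readTVar a <*> readTVar b)
  print balances
```

The caller only wraps the action in `atomically`; speculation, validation, and retry are left to the runtime.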
Existing Haskell STM Implementation
- In STM execution, reads and writes to TVars are tracked in a transactional record (TRec).
- Execution continues under the assumption that there have been no conflicts.
- A conflict is where two threads access the same location and at least one of the accesses is a write.
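The conflict definition can be written down as a small predicate. This is only a sketch under assumptions: locations are modelled as `Int`s and the `Accesses` record is a hypothetical stand-in for what a TRec tracks, not the runtime's actual representation.

```haskell
import qualified Data.Set as Set

-- Hypothetical model: the locations a transaction has read and written.
data Accesses = Accesses
  { reads_  :: Set.Set Int
  , writes_ :: Set.Set Int
  }

-- Two access sets conflict when they share a location and at least one
-- of the two accesses to it is a write.
conflicts :: Accesses -> Accesses -> Bool
conflicts a b =
  overlap (writes_ a) (writes_ b)
    || overlap (writes_ a) (reads_ b)
    || overlap (reads_ a) (writes_ b)
  where
    overlap x y = not (Set.null (Set.intersection x y))

main :: IO ()
main = do
  let t1 = Accesses (Set.fromList [1, 2]) (Set.fromList [3])
      t2 = Accesses (Set.fromList [3]) Set.empty
      t3 = Accesses (Set.fromList [1]) Set.empty
  print (conflicts t1 t2) -- t2 reads location 3, which t1 writes
  print (conflicts t1 t3) -- both only read location 1
```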
Existing Haskell STM Implementation
At the end of the transaction:
- The RTS validates that reads still match the values in the TVars.
- It then commits by performing the writes atomically.
- If validation fails, the transaction starts over.
Similar to OSTM [Fraser, 2004]:
- No global bottlenecks.
- Read-only transactions do not acquire locks.
- OSTM is non-blocking; GHC's STM can livelock.
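The validate-then-commit step can be illustrated with a pure, single-threaded model. This is a sketch under stated assumptions, not the RTS implementation: the heap is a `Map` from hypothetical integer locations to values, and locking, version numbers, and watch lists are left out entirely.

```haskell
import qualified Data.Map.Strict as Map

type Loc  = Int
type Heap = Map.Map Loc Int

-- One transactional-record entry: the value first seen at a location and,
-- for writes, the value to store at commit.
data Entry = Entry
  { entryLoc :: Loc
  , expected :: Int
  , newValue :: Maybe Int
  }

-- Validation: every logged location must still hold the value that was seen.
validate :: Heap -> [Entry] -> Bool
validate heap = all ok
  where
    ok e = Map.lookup (entryLoc e) heap == Just (expected e)

-- Commit: apply all writes if validation succeeds; otherwise signal that the
-- transaction has to start over.
commit :: Heap -> [Entry] -> Maybe Heap
commit heap trec
  | validate heap trec = Just (foldr apply heap trec)
  | otherwise          = Nothing
  where
    apply e h = maybe h (\v -> Map.insert (entryLoc e) v h) (newValue e)

main :: IO ()
main = do
  let heap = Map.fromList [(0, 10), (1, 20)]
      trec = [Entry 0 10 Nothing, Entry 1 20 (Just 25)]
  print (commit heap trec)
  -- A concurrent update to location 0 invalidates the read of 10.
  print (commit (Map.insert 0 11 heap) trec)
```

In this model a `Nothing` result corresponds to "start over"; the real commit additionally has to make the write-back appear atomic to other threads.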
Motivation
Haskell STM concurrent data structures suffer bloat from indirection.
[Figure: memory layout of a TVar-based Node (key, value, color, parent, left, right), with each TVar (version, watch-list, value) pointing to a separately boxed value (Int#).]
Our Work: TStruct
- Removes indirection required by TVars.
- Mutable unboxed values paired with mutable pointer values.
- Increases locality of data structure nodes.
- Maintains properties:
  - Commit parallels TVar commit.
  - No global bottleneck.
  - Read-only transactions do not acquire locks.
- Avoids conflating lock and value.
- Flexible transactional variable granularity.
Data Structures
- Red-Black Tree
- Skip List
- Cuckoo Hash Table
- Hashed Array Mapped Trie (HAMT)
Red-Black Tree
- Rebalancing
- TVar
  - Extra indirection
  - No mutable unboxed values (Color field)
- TStruct
  - Node initialization
  - Nil node and indirection
  - Accesses at constant offsets
Extra Indirection

    -- TVar
    data Node k v
      = Node
          { _key    :: !k
          , _value  :: !v
          , _color  :: TVar Color
          , _parent :: TVar (Node k v)
          , _left   :: TVar (Node k v)
          , _right  :: TVar (Node k v)
          }
      | Nil
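To make the cost of this layout concrete, here is a sketch of a lookup over a simplified TVar-based node. The node type below drops the color and parent fields, and `lookupNode` and `leaf` are hypothetical helpers written for this illustration; only the `stm` package is assumed.

```haskell
import Control.Concurrent.STM

-- Simplified, hypothetical variant of the TVar-based node (no color or
-- parent fields): every child access goes through a TVar.
data Node k v
  = Node !k !v (TVar (Node k v)) (TVar (Node k v))
  | Nil

-- Each step down the tree is one transactional read of a child TVar.
lookupNode :: Ord k => k -> Node k v -> STM (Maybe v)
lookupNode _ Nil = pure Nothing
lookupNode k (Node key val l r) = case compare k key of
  EQ -> pure (Just val)
  LT -> readTVar l >>= lookupNode k
  GT -> readTVar r >>= lookupNode k

-- Build a node with no children.
leaf :: k -> v -> STM (Node k v)
leaf k v = Node k v <$> newTVar Nil <*> newTVar Nil

main :: IO ()
main = do
  found <- atomically $ do
    l <- leaf 1 "one"
    r <- leaf 3 "three"
    root <- Node (2 :: Int) "two" <$> newTVar l <*> newTVar r
    (,) <$> lookupNode 3 root <*> lookupNode 4 root
  print found
```

Every `readTVar` here follows a pointer from the node to a TVar and then from the TVar to the child, which is the indirection that TStruct is meant to remove.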
Avoiding Sum Type Indirection

    -- TStruct with sum type
    data Node k v
      = Node { _tstruct :: TStruct# RealWorld (Node k v) }
      | Nil

    -- TStruct without sum type
    data Node k v = Node { _tstruct :: TStruct# RealWorld Any }
Skip List
- No rebalancing needed
- Random number source
- TVar
  - Pure node with TArray of next pointers
  - Extra indirection
- TStruct
  - TStruct containing both values and next pointers
  - Node initialization
Node Initialization
- When a new node is made in a transaction, no other thread can see it until the transaction has committed.
- We take advantage of this and access these nodes non-transactionally.
- In the skip list, this happens on insertion: the new node is created and its next pointers are written to match the previous node at each level.
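TStruct primitives are not available in stock GHC, so the following sketch illustrates the same idea with ordinary TVars and a single-level sorted list standing in for one level of a skip list. The `List` type, `insert`, and `toList` are hypothetical names for this example. The new node's next pointer is supplied when its TVar is created, and the node only becomes reachable once the predecessor is updated and the transaction commits.

```haskell
import Control.Concurrent.STM

-- Hypothetical single-level list standing in for one level of a skip list.
data List = Cons Int (TVar List) | End

-- Insert a key in front of the first node whose key is not smaller.
insert :: Int -> TVar List -> STM ()
insert k ptr = do
  cur <- readTVar ptr
  case cur of
    Cons k' next | k' < k -> insert k next
    _ -> do
      fresh <- newTVar cur          -- next pointer copied from the predecessor
      writeTVar ptr (Cons k fresh)  -- publish the new node

-- Read the keys in list order.
toList :: TVar List -> STM [Int]
toList ptr = do
  cur <- readTVar ptr
  case cur of
    End         -> pure []
    Cons k next -> (k :) <$> toList next

main :: IO ()
main = do
  hd <- newTVarIO End
  mapM_ (atomically . flip insert hd) [5, 1, 3]
  atomically (toList hd) >>= print
```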
Cuckoo Hash Table
- Only insert needs to do significant work.
- TVar: TArray of immutable buckets
- TStruct: immutable array of TStruct buckets
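Cuckoo Hash Table
[Figure: memory layouts of the two tables. TVar version: a TArray of TVars (version, watch-list, value), each pointing to an immutable bucket Array (size, entry...). TStruct version: an immutable Array (size, entry...) of TStruct buckets (lock, lock-count, version, watch-list, count, key1...keyn, value1...valuen).]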
Hashed Array Mapped Trie (HAMT)
- No rebalancing
- TVar
  - Extra indirection
- TStruct
  - Node initialization
  - Immutable fields
  - Node tags
Immutable fields
- Fields that are immutable can safely be read non-transactionally.
- No bookkeeping needed!
- Two primitive read functions:
  - Transactional: readtstruct#, implemented in Cmm and C.
  - Non-transactional: readtstructnt#, implemented in the code generator.
- Things can go wrong in very unexpected ways!
Node tags

    -- TVar
    data Node a
      = Nodes (TVar (WordArray a))
      | Leaf Hash a
      | Leaves Hash (SizedArray a)

    data WordArray a = WordArray Bitmap (Array (Node a))
    data SizedArray a = SizedArray Size (Array a)
Node tags

    -- TStruct
    data Node a
      = WordArray Size Bitmap (Array (Node a))
      | SizedArray Size Hash (Array a)

    data Node a = Node { _tstruct :: TStruct# RealWorld Any }
HAMT Nodes
[Figure: TVar-based HAMT layout. An inner Node (tag=nodes, tvar) points to a TVar (version, watch-list, value), which points to a WordArray (bitmap, array), which points to a SizedArray (size, entry...). A leaf Node (tag=leaves, hash, array) points to a SizedArray (size, entry...).]
HAMT Nodes
[Figure: TStruct-based HAMT layout. A root TVar (version, watch-list, value) points to a WordArray TStruct (lock, lock-count, version, watch-list, tag=0, size, bitmap, entry...); a SizedArray TStruct holds (lock, lock-count, version, watch-list, tag=1, size, hash, entry...).]
Code Example
Example from lookup in the TVar-based HAMT. Pattern matching ensures we do not handle a leaf as a node.

    lookuptvar ... = do
      arr <- readTVar ...
      case wordarraylookup i arr of
        Nothing            -> ...
        Just (Nodes ns)    -> ...
        Just (Leaves h la) -> ...
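The slides do not show how the bitmap lookup itself works. The sketch below shows the usual HAMT technique of indexing a dense child array through a population count, under assumptions of my own: a 64-bit bitmap, 6 hash bits per level, and a plain list in place of the array. `WordArray`, `wordArrayLookup`, and `levelIndex` are hypothetical definitions for illustration and may differ from the authors' code.

```haskell
import Data.Bits (bit, popCount, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word64)

-- Hypothetical stand-in for WordArray: a bitmap plus a dense list of the
-- children that are present.
data WordArray a = WordArray Word64 [a]

-- Slot i is occupied iff bit i is set; its position in the dense list is the
-- number of set bits below i.
wordArrayLookup :: Int -> WordArray a -> Maybe a
wordArrayLookup i (WordArray bitmap entries)
  | testBit bitmap i = Just (entries !! popCount (bitmap .&. (bit i - 1)))
  | otherwise        = Nothing

-- Select the 6-bit slice of the hash used at a given trie level.
levelIndex :: Int -> Word64 -> Int
levelIndex level hash = fromIntegral ((hash `shiftR` (6 * level)) .&. 63)

main :: IO ()
main = do
  -- Slots 0, 3, and 5 are occupied.
  let arr = WordArray (bit 0 .|. bit 3 .|. bit 5) ["a", "b", "c"]
  print (map (`wordArrayLookup` arr) [0, 3, 4, 5])
  print (levelIndex 0 327, levelIndex 1 327)
```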
Code Example
In the TStruct-based HAMT lookup we lose safety.
- No bounds check in readtstructwordnt#.
- Nodes can be confused with leaves.

    readtagnt (WordArray arr#) = STM $ \s1# ->
      case readtstructwordnt# arr# 0# s1# of
        (# s2#, w# #) -> (# s2#, W# w# #)

    lookuptstruct ... arr = do
      t <- readtagnt arr
      case t of
        0 -> do ...
                readindicesnt arr
        ... -> do ...
                  readhashnt arr
                  ...
Benchmarks
- Machine: Intel Xeon E v3, two-socket, 36-core, 72-thread.
- Tests: data structure with concurrent inserts (5%), deletes (5%), and lookups (90%), measuring throughput at steady state.
- The structure initially has 50,000 entries in a key space of 100,000 keys.
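A workload with this operation mix could be generated along the following lines. This is a sketch only: the `Op` type, the small linear congruential generator, and the `workload` function are assumptions made for the example and are not the benchmark harness used in the talk.

```haskell
data Op = Insert Int | Delete Int | Lookup Int
  deriving (Show, Eq)

-- Hypothetical linear congruential generator, used only to keep the sketch
-- free of external packages.
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- Map a percentile in [0, 100) and a key to an operation:
-- 5% inserts, 5% deletes, 90% lookups.
chooseOp :: Int -> Int -> Op
chooseOp pct key
  | pct < 5   = Insert key
  | pct < 10  = Delete key
  | otherwise = Lookup key

-- Generate n operations over a key space of 100,000 keys.
workload :: Int -> Int -> [Op]
workload seed n = take n (go seed)
  where
    go s =
      let s1  = nextSeed s
          s2  = nextSeed s1
          pct = (s1 `div` 65536) `mod` 100
          key = (s2 `div` 256) `mod` 100000
      in chooseOp pct key : go s2

main :: IO ()
main = mapM_ print (workload 42 5)
```

Each benchmark thread would draw from such a stream with its own seed and apply the operations to the shared structure.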
[Figure: TVar implementations on the Intel Xeon E v3 two-socket, 36-core machine. Operations per second versus threads for RBTree, SkipList, Cuckoo, and HAMT.]
[Figure: HAMT on the same machine. Operations per second versus threads for TVar, TStruct, and CTrie.]
[Figure: Cuckoo Hash on the same machine. Operations per second versus threads for TVar and TStruct.]
[Figure: Skip List on the same machine. Operations per second versus threads for TVar and TStruct.]
[Figure: Red-Black Tree on the same machine. Operations per second versus threads for TVar and TStruct.]
Future Work
- Continue to improve performance and understand what factors contribute to good performance.
- Recover safety for TStruct features:
  - Node initialization
  - Node tagging
  - Accesses at constant offsets
- Other data structures
Thanks!
Haskell STM Metadata Structure
[Figure: a Node (key, value, parent, left, right, color) whose fields point to TVars (value, watch); each TVar's watch field points to a Watch Queue (thread, next, prev); a TRec (prev, index) holds entries of the form (tvar, old, new).]
Haskell Before TStruct
[Figure: Nodes (key, value, parent, left, right, color) linked to each other through intermediate TVars (value, watch).]
Haskell with TStruct
[Figure: Nodes (lock, watch, key, value, color, parent, left, right) linked to each other directly.]
Haskell STM commit

    bool commit(trec* trec) {
      if (validate(trec)) {
        if (read_check(trec)) {
          update(trec);
          return true;
        }
      }
      return false;
    }
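Haskell STM commit

    bool validate(trec* trec) {
      for (e in trec) {
        if (is_write(e)) {
          // Lock each written TVar and check that it is unchanged.
          if (!lock(e) || e->value != e->tvar->value) {
            release_locks(trec);
            return false;
          }
        } else {
          // Record the version seen for each read.
          e->version = e->tvar->version;
        }
      }
      return true;
    }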
Haskell STM commit

    bool read_check(trec* trec) {
      for (e in trec) {
        if (is_read(e)) {
          // A read is stale if either the value or the version changed.
          if (e->value != e->tvar->value ||
              e->version != e->tvar->version) {
            release_locks(trec);
            return false;
          }
        }
      }
      return true;
    }
Haskell STM commit

    void update(trec* trec) {
      for (e in trec) {
        if (is_write(e)) {
          e->tvar->version++;
          e->tvar->value = e->new_value;
        }
      }
    }
References
[Fraser, 2004] Fraser, K. (2004). Practical lock-freedom. PhD thesis, University of Cambridge Computer Laboratory.
An Update on Haskell H/STM 1
An Update on Haskell H/STM 1 Ryan Yates and Michael L. Scott University of Rochester TRANSACT 10, 6-15-2015 1 This work was funded in part by the National Science Foundation under grants CCR-0963759, CCF-1116055,
More informationImproving STM Performance with Transactional Structs
Improving STM Performance with Transactional Structs Abstract Ryan Yates Computer Science Department University of Rochester Rochester, NY, USA ryates@cs.rochester.edu Software transactional memory (STM)
More informationImproving STM Performance with Transactional Structs
Improving STM Performance with Transactional Structs Ryan Yates Computer Science Department University of Rochester Rochester, NY, USA ryates@cs.rochester.edu Michael L. Scott Computer Science Department
More informationLeveraging Hardware TM in Haskell
Abstract Ryan Yates Houghton College Houghton, NY, USA ryan.yates@houghton.edu Transactional memory (TM) is heavily used for synchronization in the Haskell programming language, but its performance has
More informationCOMP3151/9151 Foundations of Concurrency Lecture 8
1 COMP3151/9151 Foundations of Concurrency Lecture 8 Transactional Memory Liam O Connor CSE, UNSW (and data61) 8 Sept 2017 2 The Problem with Locks Problem Write a procedure to transfer money from one
More informationConflict Detection and Validation Strategies for Software Transactional Memory
Conflict Detection and Validation Strategies for Software Transactional Memory Michael F. Spear, Virendra J. Marathe, William N. Scherer III, and Michael L. Scott University of Rochester www.cs.rochester.edu/research/synchronization/
More informationRevisiting Software Transactional Memory in Haskell 1
Revisiting Software Transactional Memory in Haskell Matthew Le Rochester Institute of Technology ml995@cs.rit.edu Ryan Yates University of Rochester ryates@cs.rochester.edu Matthew Fluet Rochester Institute
More informationA Hybrid TM for Haskell
A Hybrid TM for Haskell Ryan Yates Michael L. Scott Computer Science Department, University of Rochester {ryates,scott}@cs.rochester.edu Abstract Much of the success of Haskell s Software Transactional
More informationAgenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?
Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based
More informationA Practical Scalable Distributed B-Tree
A Practical Scalable Distributed B-Tree CS 848 Paper Presentation Marcos K. Aguilera, Wojciech Golab, Mehul A. Shah PVLDB 08 March 8, 2010 Presenter: Evguenia (Elmi) Eflov Presentation Outline 1 Background
More informationSplit-Ordered Lists: Lock-Free Extensible Hash Tables. Pierre LaBorde
1 Split-Ordered Lists: Lock-Free Extensible Hash Tables Pierre LaBorde Nir Shavit 2 Tel-Aviv University, Israel Ph.D. from Hebrew University Professor at School of Computer Science at Tel-Aviv University
More informationCSE 230. Concurrency: STM. Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart
CSE 230 Concurrency: STM Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart The Grand Challenge How to properly use multi-cores? Need new programming models! Parallelism vs Concurrency
More informationComposable Shared Memory Transactions Lecture 20-2
Composable Shared Memory Transactions Lecture 20-2 April 3, 2008 This was actually the 21st lecture of the class, but I messed up the naming of subsequent notes files so I ll just call this one 20-2. This
More informationLecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013
Lecture 20: Transactional Memory Parallel Computer Architecture and Programming Slide credit Many of the slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford University) Raising
More informationAtomicity via Source-to-Source Translation
Atomicity via Source-to-Source Translation Benjamin Hindman Dan Grossman University of Washington 22 October 2006 Atomic An easier-to-use and harder-to-implement primitive void deposit(int x){ synchronized(this){
More informationLowering the Overhead of Nonblocking Software Transactional Memory
Lowering the Overhead of Nonblocking Software Transactional Memory Virendra J. Marathe Michael F. Spear Christopher Heriot Athul Acharya David Eisenstat William N. Scherer III Michael L. Scott Background
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 31 October 2012 Lecture 6 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability
More informationMulticore programming in Haskell. Simon Marlow Microsoft Research
Multicore programming in Haskell Simon Marlow Microsoft Research A concurrent web server server :: Socket -> IO () server sock = forever (do acc
More informationTackling Concurrency With STM. Mark Volkmann 10/22/09
Tackling Concurrency With Mark Volkmann mark@ociweb.com 10/22/09 Two Flavors of Concurrency Divide and conquer divide data into subsets and process it by running the same code on each subset concurrently
More informationTackling Concurrency With STM
Tackling Concurrency With Mark Volkmann mark@ociweb.com 10/22/09 Two Flavors of Concurrency Divide and conquer divide data into subsets and process it by running the same code on each subset concurrently
More informationUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence
More informationSILT: A Memory-Efficient, High- Performance Key-Value Store
SILT: A Memory-Efficient, High- Performance Key-Value Store SOSP 11 Presented by Fan Ni March, 2016 SILT is Small Index Large Tables which is a memory efficient high performance key value store system
More informationA Skiplist-based Concurrent Priority Queue with Minimal Memory Contention
A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention Jonatan Lindén and Bengt Jonsson Uppsala University, Sweden December 18, 2013 Jonatan Lindén 1 Contributions Motivation: Improve
More informationDesign Tradeoffs in Modern Software Transactional Memory Systems
Design Tradeoffs in Modern Software al Memory Systems Virendra J. Marathe, William N. Scherer III, and Michael L. Scott Department of Computer Science University of Rochester Rochester, NY 14627-226 {vmarathe,
More informationLocks and Threads and Monads OOo My. Stephan Bergmann StarOffice/OpenOffice.org Sun Microsystems
Locks and Threads and Monads OOo My Stephan Bergmann StarOffice/OpenOffice.org Sun Microsystems Locks and Threads and Monads OOo My 1 - Tomorrow's hardware... 2 -...and today's software 3 - Stateful vs.
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 17 November 2017 Lecture 7 Linearizability Lock-free progress properties Hashtables and skip-lists Queues Reducing contention Explicit
More informationCost of Concurrency in Hybrid Transactional Memory. Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University)
Cost of Concurrency in Hybrid Transactional Memory Trevor Brown (University of Toronto) Srivatsan Ravi (Purdue University) 1 Transactional Memory: a history Hardware TM Software TM Hybrid TM 1993 1995-today
More informationTRANSACTION MEMORY. Presented by Hussain Sattuwala Ramya Somuri
TRANSACTION MEMORY Presented by Hussain Sattuwala Ramya Somuri AGENDA Issues with Lock Free Synchronization Transaction Memory Hardware Transaction Memory Software Transaction Memory Conclusion 1 ISSUES
More informationunreadtvar: Extending Haskell Software Transactional Memory for Performance
unreadtvar: Extending Haskell Software Transactional Memory for Performance Nehir Sonmez, Cristian Perfumo, Srdjan Stipic, Adrian Cristal, Osman S. Unsal, and Mateo Valero Barcelona Supercomputing Center,
More informationSynchronising Threads
Synchronising Threads David Chisnall March 1, 2011 First Rule for Maintainable Concurrent Code No data may be both mutable and aliased Harder Problems Data is shared and mutable Access to it must be protected
More informationLecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations
Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,
More informationImplementierungstechniken für Hauptspeicherdatenbanksysteme: The Bw-Tree
Implementierungstechniken für Hauptspeicherdatenbanksysteme: The Bw-Tree Josef Schmeißer January 9, 218 Abstract The Bw-Tree as presented by Levandoski et al. was designed to accommodate the emergence
More informationHardware Transactional Memory on Haswell
Hardware Transactional Memory on Haswell Viktor Leis Technische Universität München 1 / 15 Introduction transactional memory is a very elegant programming model transaction { transaction { a = a 10; c
More information! Part I: Introduction. ! Part II: Obstruction-free STMs. ! DSTM: an obstruction-free STM design. ! FSTM: a lock-free STM design
genda Designing Transactional Memory ystems Part II: Obstruction-free TMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch! Part I: Introduction! Part II: Obstruction-free TMs! DTM: an obstruction-free
More informationIntroduction to Locks. Intrinsic Locks
CMSC 433 Programming Language Technologies and Paradigms Spring 2013 Introduction to Locks Intrinsic Locks Atomic-looking operations Resources created for sequential code make certain assumptions, a large
More informationTransactional Memory. Yaohua Li and Siming Chen. Yaohua Li and Siming Chen Transactional Memory 1 / 41
Transactional Memory Yaohua Li and Siming Chen Yaohua Li and Siming Chen Transactional Memory 1 / 41 Background Processor hits physical limit on transistor density Cannot simply put more transistors to
More information!!"!#"$%& Atomic Blocks! Atomic blocks! 3 primitives: atomically, retry, orelse!
cs242! Kathleen Fisher!! Multi-cores are coming!! - For 50 years, hardware designers delivered 40-50% increases per year in sequential program speed.! - Around 2004, this pattern failed because power and
More informationLecture 21 Concurrency Control Part 1
CMSC 461, Database Management Systems Spring 2018 Lecture 21 Concurrency Control Part 1 These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used from
More informationChapter 15 : Concurrency Control
Chapter 15 : Concurrency Control What is concurrency? Multiple 'pieces of code' accessing the same data at the same time Key issue in multi-processor systems (i.e. most computers today) Key issue for parallel
More informationFine-grained synchronization & lock-free programming
Lecture 17: Fine-grained synchronization & lock-free programming Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Tunes Minnie the Moocher Robbie Williams (Swings Both Ways)
More informationThe Google File System
October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single
More informationSpeculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution
Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution Ravi Rajwar and Jim Goodman University of Wisconsin-Madison International Symposium on Microarchitecture, Dec. 2001 Funding
More informationImproving the Practicality of Transactional Memory
Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter
More informationConcurrency Control. R &G - Chapter 19
Concurrency Control R &G - Chapter 19 Smile, it is the key that fits the lock of everybody's heart. Anthony J. D'Angelo, The College Blue Book Review DBMSs support concurrency, crash recovery with: ACID
More informationBw-Tree. Josef Schmeißer. January 9, Josef Schmeißer Bw-Tree January 9, / 25
Bw-Tree Josef Schmeißer January 9, 2018 Josef Schmeißer Bw-Tree January 9, 2018 1 / 25 Table of contents 1 Fundamentals 2 Tree Structure 3 Evaluation 4 Further Reading Josef Schmeißer Bw-Tree January 9,
More informationTransactional Memory. Lecture 18: Parallel Computer Architecture and Programming CMU /15-618, Spring 2017
Lecture 18: Transactional Memory Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2017 Credit: many slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford
More informationConcurrency Control CHAPTER 17 SINA MERAJI
Concurrency Control CHAPTER 17 SINA MERAJI Announcement Sign up for final project presentations here: https://docs.google.com/spreadsheets/d/1gspkvcdn4an3j3jgtvduaqm _x4yzsh_jxhegk38-n3k/edit#gid=0 Deadline
More informationLecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation
Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end
More informationTransactional Memory. Lecture 19: Parallel Computer Architecture and Programming CMU /15-618, Spring 2015
Lecture 19: Transactional Memory Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of the slides in today s talk are borrowed from Professor Christos Kozyrakis
More informationLinked Lists: The Role of Locking. Erez Petrank Technion
Linked Lists: The Role of Locking Erez Petrank Technion Why Data Structures? Concurrent Data Structures are building blocks Used as libraries Construction principles apply broadly This Lecture Designing
More informationCSE 451: Operating Systems Winter Lecture 7 Synchronization. Steve Gribble. Synchronization. Threads cooperate in multithreaded programs
CSE 451: Operating Systems Winter 2005 Lecture 7 Synchronization Steve Gribble Synchronization Threads cooperate in multithreaded programs to share resources, access shared data structures e.g., threads
More informationChapter 13 : Concurrency Control
Chapter 13 : Concurrency Control Chapter 13: Concurrency Control Lock-Based Protocols Timestamp-Based Protocols Validation-Based Protocols Multiple Granularity Multiversion Schemes Insert and Delete Operations
More informationTransactional Memory
Transactional Memory Architectural Support for Practical Parallel Programming The TCC Research Group Computer Systems Lab Stanford University http://tcc.stanford.edu TCC Overview - January 2007 The Era
More informationIntroduction to the HAMT: Opportunity for Tcl Tcl Conference Don Porter Tcl/Tk Release Manager
Introduction to the HAMT: Opportunity for Tcl 2017 Tcl Conference Don Porter Tcl/Tk Release Manager Hash Maps in Tcl Dictionaries Array variables Name lookups (commands, vars, etc.) Much much more Most
More informationImplementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University
Implementing and Evaluating Nested Parallel Transactions in STM Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Introduction // Parallelize the outer loop for(i=0;i
More informationMemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing
MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing Bin Fan (CMU), Dave Andersen (CMU), Michael Kaminsky (Intel Labs) NSDI 2013 http://www.pdl.cmu.edu/ 1 Goal: Improve Memcached 1. Reduce space overhead
More informationDistributed Transaction Management 2003
Distributed Transaction Management 2003 Jyrki Nummenmaa http://www.cs.uta.fi/~dtm jyrki@cs.uta.fi General information We will view this from the course web page. Motivation We will pick up some motivating
More informationChí Cao Minh 28 May 2008
Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor
More informationTeleportation as a Strategy for Improving Concurrent Skiplist Performance. Frances Steen
Teleportation as a Strategy for Improving Concurrent Skiplist Performance by Frances Steen Submitted to the Department of Computer Science in partial fulfillment of the requirements for the degree of Bachelor
More informationPERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES
PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES Anish Athalye and Patrick Long Mentors: Austin Clements and Stephen Tu 3 rd annual MIT PRIMES Conference Sequential
More informationFast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas Outline Synchronization Methods Priority Queues Concurrent Priority Queues Lock-Free Algorithm: Problems
More informationNePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems
NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems Haris Volos 1, Adam Welc 2, Ali-Reza Adl-Tabatabai 2, Tatiana Shpeisman 2, Xinmin Tian 2, and Ravi Narayanaswamy
More informationBlurred Persistence in Transactional Persistent Memory
Blurred Persistence in Transactional Persistent Memory Youyou Lu, Jiwu Shu, Long Sun Tsinghua University Overview Problem: high performance overhead in ensuring storage consistency of persistent memory
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 21 November 2014
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 21 November 2014 Lecture 7 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability
More informationFine-grained synchronization & lock-free data structures
Lecture 19: Fine-grained synchronization & lock-free data structures Parallel Computer Architecture and Programming Redo Exam statistics Example: a sorted linked list struct Node { int value; Node* next;
More informationLinearizability of Persistent Memory Objects
Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Workshop on the Theory of Transactional
More informationSimon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010
Simon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010 Geeks Practitioners 1,000,000 10,000 100 1 The quick death 1yr 5yr 10yr 15yr Geeks Practitioners 1,000,000 10,000 100 The slow
More informationNon-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei
Non-blocking Array-based Algorithms for Stacks and Queues Niloufar Shafiei Outline Introduction Concurrent stacks and queues Contributions New algorithms New algorithms using bounded counter values Correctness
More informationCSE 451: Operating Systems Winter Lecture 7 Synchronization. Hank Levy 412 Sieg Hall
CSE 451: Operating Systems Winter 2003 Lecture 7 Synchronization Hank Levy Levy@cs.washington.edu 412 Sieg Hall Synchronization Threads cooperate in multithreaded programs to share resources, access shared
More informationIMPORTANT: Circle the last two letters of your class account:
Spring 2011 University of California, Berkeley College of Engineering Computer Science Division EECS MIDTERM I CS 186 Introduction to Database Systems Prof. Michael J. Franklin NAME: STUDENT ID: IMPORTANT:
More informationPerformance Improvement via Always-Abort HTM
1 Performance Improvement via Always-Abort HTM Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing
More informationProgrammazione di sistemi multicore
Programmazione di sistemi multicore A.A. 2015-2016 LECTURE 14 IRENE FINOCCHI http://wwwusers.di.uniroma1.it/~finocchi/ Programming with locks and critical sections MORE BAD INTERLEAVINGS GUIDELINES FOR
More informationRocksDB Key-Value Store Optimized For Flash
RocksDB Key-Value Store Optimized For Flash Siying Dong Software Engineer, Database Engineering Team @ Facebook April 20, 2016 Agenda 1 What is RocksDB? 2 RocksDB Design 3 Other Features What is RocksDB?
More informationLow Overhead Concurrency Control for Partitioned Main Memory Databases
Low Overhead Concurrency Control for Partitioned Main Memory Databases Evan Jones, Daniel Abadi, Samuel Madden, June 2010, SIGMOD CS 848 May, 2016 Michael Abebe Background Motivations Database partitioning
More information! A lock is a mechanism to control concurrent access to a data item! Data items can be locked in two modes :
Lock-Based Protocols Concurrency Control! A lock is a mechanism to control concurrent access to a data item! Data items can be locked in two modes : 1 exclusive (X) mode Data item can be both read as well
More informationImplementing Symmetric Multiprocessing in LispWorks
Implementing Symmetric Multiprocessing in LispWorks Making a multithreaded application more multithreaded Martin Simmons, LispWorks Ltd Copyright 2009 LispWorks Ltd Outline Introduction Changes in LispWorks
More informationCache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency
Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed
More informationHeckaton. SQL Server's Memory Optimized OLTP Engine
Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability
More informationA Comparison of Relativistic and Reader-Writer Locking Approaches to Shared Data Access
A Comparison of Relativistic and Reader-Writer Locking Approaches to Shared Data Access Philip W. Howard, Josh Triplett, and Jonathan Walpole Portland State University Abstract. This paper explores the
More information1 RCU. 2 Improving spinlock performance. 3 Kernel interface for sleeping locks. 4 Deadlock. 5 Transactions. 6 Scalable interface design
Overview of Monday s and today s lectures Outline Locks create serial code - Serial code gets no speedup from multiprocessors Test-and-set spinlock has additional disadvantages - Lots of traffic over memory
More informationTom Hart, University of Toronto Paul E. McKenney, IBM Beaverton Angela Demke Brown, University of Toronto
Making Lockless Synchronization Fast: Performance Implications of Memory Reclamation Tom Hart, University of Toronto Paul E. McKenney, IBM Beaverton Angela Demke Brown, University of Toronto Outline Motivation
More informationA Skip List for Multicore
A Skip List for Multicore Ian Dick University of Sydney Alan Fekete University of Sydney Vincent Gramoli University of Sydney Abstract In this paper, we introduce the Rotating skip list, the fastest concurrent
More informationLogTM: Log-Based Transactional Memory
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet
More informationDalí: A Periodically Persistent Hash Map
Dalí: A Periodically Persistent Hash Map Faisal Nawab* 1, Joseph Izraelevitz* 2, Terence Kelly*, Charles B. Morrey III*, Dhruva R. Chakrabarti*, and Michael L. Scott 2 1 Department of Computer Science
Advances in Programming Languages
Advances in Programming Languages APL5: Further language concurrency mechanisms David Aspinall (including slides by Ian Stark) School of Informatics The University of Edinburgh Tuesday 5th October
COMP3151/9151 Foundations of Concurrency Lecture 8
COMP3151/9151 Foundations of Concurrency Lecture 8 Liam O'Connor CSE, UNSW (and data61) 8 Sept 2017 Shared Data Consider the Readers and Writers problem from Lecture 6: Problem We have a large data
Comparing the Performance of Concurrent Linked-List Implementations in Haskell
Comparing the Performance of Concurrent Linked-List Implementations in Haskell Martin Sulzmann IT University of Copenhagen, Denmark martin.sulzmann@gmail.com Edmund S. L. Lam National University of Singapore,
Fall 2015 COMP Operating Systems. Lab 06
Fall 2015 COMP 3511 Operating Systems Lab 06 Outline Monitor Deadlocks Logical vs. Physical Address Space Segmentation Example of segmentation scheme Paging Example of paging scheme Paging-Segmentation
Transaction Management: Concurrency Control, part 2
Transaction Management: Concurrency Control, part 2 CS634 Class 16 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke Locking for B+ Trees Naïve solution Ignore tree structure,
Locking for B+ Trees. Transaction Management: Concurrency Control, part 2. Locking for B+ Trees (contd.) Locking vs. Latching
Locking for B+ Trees Transaction Management: Concurrency Control, part 2 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke CS634 Class 16 Naïve solution Ignore tree structure,
Monitors; Software Transactional Memory
Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and
What's new in MySQL 5.5? Performance/Scale Unleashed
What's new in MySQL 5.5? Performance/Scale Unleashed Mikael Ronström Senior MySQL Architect The preceding is intended to outline our general product direction. It is intended for
Panu Silvasti Page 1
Multicore support in databases Panu Silvasti Page 1 Outline Building blocks of a storage manager How do existing storage managers scale? Optimizing Shore database for multicore processors Page 2 Building
Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability
Topics COS 318: Operating Systems File Performance and Reliability File buffer cache Disk failure and recovery tools Consistent updates Transactions and logging 2 File Buffer Cache for Performance What
Concurrent Data Structures Concurrent Algorithms 2016
Concurrent Data Structures Concurrent Algorithms 2016 Tudor David (based on slides by Vasileios Trigonakis) Tudor David 11.2016 1 Data Structures (DSs) Constructs for efficiently storing and retrieving
HydraVM: Mohamed M. Saad, Mohamed Mohamedin, and Binoy Ravindran. Hot Topics in Parallelism (HotPar '12), Berkeley, CA
HydraVM: Mohamed M. Saad, Mohamed Mohamedin, and Binoy Ravindran Hot Topics in Parallelism (HotPar '12), Berkeley, CA Motivation & Objectives Background Architecture Program Reconstruction Implementation
Software Transactional Memory Should Not Be Obstruction-Free
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals IRC-TR-06-052 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL
Massimiliano Ghilardi
7th European Lisp Symposium Massimiliano Ghilardi May 5-6, 2014 IRCAM, Paris, France High performance concurrency in Common Lisp hybrid transactional memory with STMX Beautiful and fast concurrency
MULTI-THREADED QUERIES
15-721 Project 3 Final Presentation MULTI-THREADED QUERIES Wendong Li (wendongl) Lu Zhang (lzhang3) Rui Wang (ruiw1) Project Objective Intra-operator parallelism Use multiple threads in a single executor
No compromises: distributed transactions with consistency, availability, and performance
No compromises: distributed transactions with consistency, availability, and performance Aleksandar Dragojević, Dushyanth Narayanan, Edmund B. Nightingale, Matthew Renzelmann, Alex Shamis, Anirudh Badam,