Performance Evaluation of Adaptivity in STM. Mathias Payer and Thomas R. Gross Department of Computer Science, ETH Zürich


2 Motivation
- STM systems rely on many assumptions
- Often contradicting for different programs
- Statically tuned to a baseline
- Use self-optimizing systems
- Adapt to different workloads
- What parameters can be adapted?
- How to measure effectiveness?
ISPASS'11 / Mathias Payer / ETH Zürich

3 Outline
- Introduction
- STM System
- STM Baseline
- Adaptive Parameters
- Evaluation
- Related work
- Conclusion

4 Introduction
- Software Transactional Memory (STM) applies transactions to memory
- (Optimistic) concurrency control mechanism
- Alternative to lock-based synchronization
- Multiple concurrent threads run transactions
- Concurrent memory modifications

5 Introduction
- Concurrent transactions modify memory without synchronization
- Transaction is verified after completion
- Conflicts are detected and resolved
- Changes committed for conflict-free transactions
- Modifications only visible after commit

6 Introduction
withdraw { tmp = balance; tmp = tmp - 1; balance = tmp; }
deposit { tmp = balance; tmp = tmp + 1; balance = tmp; }
Transaction timeline: TX starts; balance in read-set; balance in write-set; conflict detection, data committed
- What happens when balance is accessed concurrently?
- Either locking or STM is needed to ensure a correct end balance
- STM system decides which tx is executed first
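
The withdraw/deposit example can be sketched as a tiny optimistic STM in Python. This is an illustrative sketch only, not the AdaptSTM implementation: the names VersionedCell and atomically are hypothetical, it uses write-back buffering, and a single global commit lock stands in for the per-word locks of a real STM.

```python
import threading

class VersionedCell:
    """A shared memory word plus a version counter (hypothetical helper)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()  # assumption: one global commit lock keeps the sketch short

def atomically(body):
    """Run body(read, write) repeatedly until it commits without a conflict."""
    while True:
        read_set = {}   # cell -> version observed at first read
        write_set = {}  # cell -> buffered new value (write-back)

        def read(cell):
            if cell in write_set:              # a transaction sees its own writes
                return write_set[cell]
            read_set.setdefault(cell, cell.version)
            return cell.value

        def write(cell, value):
            write_set[cell] = value            # invisible to others until commit

        result = body(read, write)
        with _commit_lock:
            # Conflict detection: every cell read must be unchanged since it was read.
            if all(cell.version == seen for cell, seen in read_set.items()):
                for cell, value in write_set.items():
                    cell.value = value
                    cell.version += 1
                return result
        # Validation failed: drop the buffers and re-execute the transaction.

balance = VersionedCell(100)

def deposit(read, write):
    write(balance, read(balance) + 1)

def withdraw(read, write):
    write(balance, read(balance) - 1)
```

Calling atomically(deposit) and atomically(withdraw) from several threads leaves the balance consistent, because a transaction that read a stale balance fails validation and is retried.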

7 STM Baseline
Many efficient STM implementations agree on important design decisions:
- Word-based locking
- Global locking / version table
- Eager locking
- (Almost) no contention management
- Simple write-set and read-set implementations

8 STM Baseline
[Figure: two transactions share a combined global write lock / version array; each transaction has its own write list / buffer, lock list, read list / buffer, write hash, and read hash]
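
How word-based locking maps memory addresses onto a global lock / version table can be sketched as follows. The function name lock_index, the 8-byte word size, and the extra_shift parameter are assumptions made for illustration; the slides do not show the actual mapping.

```python
WORD_SHIFT = 3  # assumption: 8-byte words, so the low 3 address bits are dropped

def lock_index(address, table_bits, extra_shift=0):
    """Map an address to a slot in a table of 2**table_bits lock/version entries.

    A larger extra_shift makes neighbouring words share one entry
    (better locality, more overlocking).
    """
    return (address >> (WORD_SHIFT + extra_shift)) & ((1 << table_bits) - 1)

# Adjacent words get adjacent entries when no extra shift is applied.
print(lock_index(0x1000, 16), lock_index(0x1008, 16))
# With one extra shift both words are covered by the same entry.
print(lock_index(0x1000, 16, 1), lock_index(0x1008, 16, 1))
```

Two unrelated addresses that differ only above the masked bits also land on the same entry, which is the overlocking cost of a small table.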

9 Adaptive STM Parameters
Global adaptivity
- Synchronization needed
- Optimizes to global optimum
- Averages over all concurrent transactions
(Thread-) local adaptivity
- No synchronization needed
- Limits adaptable parameters
- Best parameters for each thread/transaction

10 Adaptive STM Parameters
Different adaptive parameters measured:
- Size of global locking/version-table *G
- Size of local hash-tables *L
- Write strategy *L
- Locality tuning for hash-functions *L
- Contention management *L
*L: local, *G: global

11 Adaptive Hash-Table
Global hash-table: trade-off between overlocking and locality
- Global strategy: coordinate lock collisions and overlocking between threads
- Adapt size based on global information
Local hash-table: trade-off between reset cost and # hash-collisions
- Local strategy: sample moving average of unique write locations
- Adapt size based on trend
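
The local strategy can be sketched as an exponential moving average that drives a resize decision. The smoothing factor, the load thresholds, and the size limits below are assumed values chosen for illustration; the slides do not give the actual constants.

```python
def update_average(avg, sample, alpha=0.25):
    """Exponential moving average of unique write locations per transaction."""
    return (1 - alpha) * avg + alpha * sample

def next_table_size(current_size, avg_unique_writes,
                    low=0.125, high=0.5, min_size=16, max_size=1 << 16):
    """Return the write-hash size to use for the next transaction.

    Grow when the table fills up (too many hash collisions),
    shrink when it is mostly empty (resetting it costs too much).
    """
    load = avg_unique_writes / current_size
    if load > high and current_size < max_size:
        return current_size * 2
    if load < low and current_size > min_size:
        return current_size // 2
    return current_size

avg = update_average(8.0, 16)       # a transaction with 16 unique writes was sampled
print(avg, next_table_size(64, avg))
```

Because the average and the size live in thread-local state, this decision needs no synchronization with other threads.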

12 Adaptive Write Strategy
- Different costs depending on strategy
- Write-back: cheap abort, expensive commit
- Write-through: expensive abort, cheap commit
- Adapt strategy to per-thread workload
- Measure abort rate
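
A per-thread choice between the two strategies based on the measured abort rate might look like the sketch below. The function name and the 20% threshold are assumptions; only the cost trade-off itself comes from the slide.

```python
def choose_write_strategy(commits, aborts, threshold=0.2):
    """Pick a write strategy from this thread's recent commit/abort counts.

    Many aborts  -> write-back    (aborting only discards the write buffer).
    Few aborts   -> write-through (committing needs no copy-back).
    """
    total = commits + aborts
    if total == 0:
        return "write-back"   # assumption: start from the non-adaptive default
    abort_rate = aborts / total
    return "write-back" if abort_rate > threshold else "write-through"

print(choose_write_strategy(commits=90, aborts=10))
print(choose_write_strategy(commits=50, aborts=50))
```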

13 Adaptive Locality Tuning
- Different applications have different data access patterns
- No optimal hash function for all data accesses
- Measure number of hash collisions for thread-local hash tables
- Cycle through different hash functions
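
Cycling through hash functions when collisions become frequent can be sketched as follows. The two hash functions, the HashSelector class, and the collision threshold are hypothetical and stand in for whatever functions the real system rotates through.

```python
def low_bits(addr, bits):
    """Use the low word-address bits directly."""
    return (addr >> 3) & ((1 << bits) - 1)

def folded(addr, bits):
    """Fold higher address bits into the index to break up regular strides."""
    word = addr >> 3
    return (word ^ (word >> bits)) & ((1 << bits) - 1)

def count_collisions(addresses, hash_fn, bits):
    """Number of distinct addresses that share a slot with an earlier one."""
    unique = set(addresses)
    return len(unique) - len({hash_fn(a, bits) for a in unique})

class HashSelector:
    """Thread-local selector that moves to the next function on too many collisions."""
    def __init__(self, functions, max_collisions):
        self.functions = functions
        self.max_collisions = max_collisions
        self.index = 0

    @property
    def current(self):
        return self.functions[self.index]

    def report(self, collisions):
        if collisions > self.max_collisions:
            self.index = (self.index + 1) % len(self.functions)
        return self.current

# A strided access pattern maps every address to slot 0 under low_bits.
addresses = [i * 128 for i in range(8)]
selector = HashSelector([low_bits, folded], max_collisions=4)
selector.report(count_collisions(addresses, selector.current, 4))
print(selector.current.__name__)
```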

14 Adaptive Contention Management
- No single strategy works in all environments
- Measure contention and implement an adaptive back-off strategy
- Wait and retry
- Abort later
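
One common way to build an adaptive back-off is a randomized window that grows on aborts and shrinks on commits. The sketch below uses that technique as a stand-in; the class name and window limits are assumptions, and the slides do not state which back-off policy is actually used.

```python
import random

class AdaptiveBackoff:
    """Thread-local back-off whose window tracks the observed contention."""
    def __init__(self, min_window=1, max_window=1024, seed=None):
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self._rng = random.Random(seed)

    def on_abort(self):
        """Contention observed: widen the window and return how long to wait."""
        self.window = min(self.window * 2, self.max_window)
        return self._rng.randrange(self.window)   # wait units before retrying

    def on_commit(self):
        """Transaction succeeded: narrow the window again."""
        self.window = max(self.window // 2, self.min_window)

backoff = AdaptiveBackoff(seed=1)
delays = [backoff.on_abort() for _ in range(3)]
print(backoff.window)
```

The random delay keeps two conflicting threads from retrying in lockstep.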

15 Local Adaptive STM Parameters (for local hash-table)
[Figure: axis "# writes vs. hash-table space" with three regions: enlarge write-hash, no change, shrink write-hash]

16 Local Adaptive STM Parameters (for local hash-table)
[Figure: axis "# hash collisions" with two regions: no change, change hash-function]

17 Local Adaptive STM Parameters (for local hash-table)
[Figure: combined decision diagram. Axis "# writes vs. hash-table space": enlarge write-hash, no change, shrink write-hash. Axis "# hash collisions": with many collisions the actions become enlarge write-hash & change hash-function, change hash-function, shrink write-hash & change hash-function]

18 AdaptSTM
Adaptive STM system built on presented features
- Statically tuned competitive baseline
- Static global hash function and hash table
- Mature and stable implementation
- Different local adaptive parameters
- Write-set hash function and size of hash table
- Write-through and write-back write strategy
- Adaptive contention management

19 Evaluation
- Benchmark: STAMP configuration (increased workload for kmeans)
- AdaptSTM version 0.5.1
- Intel 4-core Xeon E5520 CPU, 12GB RAM
- 64bit Ubuntu 9.04

20 Evaluation: Global Hash-Table
[Figure: runtime in seconds for kmeans and Genome with 4 threads, plotted against the number of shifts for global hash-table sizes from 2^16 to 2^26]

21 Evaluation: Global Adaptivity
- Global optimizations have limited potential
- Small optimization potential
- High synchronization cost
- Reasonable baseline outperforms global optimization

22 Evaluation: Local Adaptivity
Different configurations:
- nawb: no adaptivity, use write-back
- awbt: adaptivity, adjust write-through / write-back
- awwh: awbt plus an adaptive hash-table for the write-set
- awhh: awwh plus different hash functions
- aall: all adaptive parameters plus Bloom filter for write-entries
Adaptation system starts with best 'average' parameters, improves from there
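
The Bloom filter for write-entries in the aall configuration is not described further on the slides. A plausible use, sketched here under that assumption, is to let a read skip the write-set lookup when the address was definitely never written. The class name, the 64-bit filter width, and the two bit positions are hypothetical.

```python
class WriteFilter:
    """Tiny Bloom filter over written addresses, kept in a single integer."""
    def __init__(self, bits=64):
        self.bits = bits
        self.mask = 0

    def _positions(self, addr):
        word = addr >> 3                      # assumption: 8-byte words
        return (word % self.bits, (word // self.bits) % self.bits)

    def add(self, addr):
        """Record a write to addr."""
        for p in self._positions(addr):
            self.mask |= 1 << p

    def may_contain(self, addr):
        """False means addr was never written; True may be a false positive."""
        return all((self.mask >> p) & 1 for p in self._positions(addr))

    def reset(self):
        """Clearing the filter at transaction start is a single assignment."""
        self.mask = 0

written = WriteFilter()
written.add(0x1000)
print(written.may_contain(0x1000), written.may_contain(0x1008))
```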

23 Evaluation: Local Adaptivity
[Figure: speedup relative to the non-adaptive configuration versus number of threads for kmeans and Labyrinth; series awbt, awwh, awhh, aall]
- awbt: adaptive, write-back/-through
- awwh: adaptive, write-back/-through, write-hash
- awhh: adaptive, write-back/-through, write-hash, hash-function
- aall: adaptive, write-back/-through, write-hash, hash-function, Bloom filter

24 Evaluation: Local Adaptivity
[Figure: speedup relative to the non-adaptive configuration versus number of threads for Genome and Vacation; series awbt, awwh, awhh, aall]

25 Evaluation: Local Adaptivity
- No single optimization works for all benchmarks
- Combination of all options leads to best performance
- Impressive speed-ups for individual benchmarks compared to the globally optimized case

26 Related Work
- TL2 (Dice et al.): baseline STM system
- Different related work on static tuning of global parameters (Harris, Dice, Ennals, Felber)
- Crucial for efficient baseline
- TinySTM (Felber et al.): adapts size and hash function of global locking table
- ASTM (Marathe et al.): adapts lazy-eager locking strategies and different meta-formats

27 Conclusions
- Adaptivity in STM is important for good performance
- Speedups up to 1% possible
- Global optimizations are limited
- Low potential, high synchronization cost
- Local optimizations tune thread-local parameters
- High correlation with workload

28 Questions?
Contact:
Source:

29 Evaluation: Global Hash-Table
[Figure: runtime in seconds versus number of shifts (global hash-table sizes from 2^16 upward) for Bayes, Genome, Vacation, and kmeans, each with 4 threads]

30 Evaluation: Global Hash-Table
[Figure: runtime in seconds versus number of shifts (global hash-table sizes from 2^16 to 2^26) for Labyrinth, Intruder, SSCA2, and YADA, each with 4 threads]

31 STM Comparison
[Figure: relative runtime versus number of threads for Genome, Vacation, Labyrinth, and Intruder; series astm, tl2, and tstm variants]

32 Evaluation: Local Adaptivity
[Figure: speedup relative to the non-adaptive configuration versus number of threads for Bayes and SSCA2; series awbt, awwh, awhh, aall]

33 Evaluation: Local Adaptivity
[Figure: speedup relative to the non-adaptive configuration versus number of threads for YADA; series awbt, awwh, awhh, aall]


More information

Performance Comparison of Various STM Concurrency Control Protocols Using Synchrobench

Performance Comparison of Various STM Concurrency Control Protocols Using Synchrobench Performance Comparison of Various STM Concurrency Control Protocols Using Synchrobench Ajay Singh Dr. Sathya Peri Anila Kumari Monika G. February 24, 2017 STM vs Synchrobench IIT Hyderabad February 24,

More information

Lecture 12 Transactional Memory

Lecture 12 Transactional Memory CSCI-UA.0480-010 Special Topics: Multicore Programming Lecture 12 Transactional Memory Christopher Mitchell, Ph.D. cmitchell@cs.nyu.edu http://z80.me Database Background Databases have successfully exploited

More information

Dependence-Aware Transactional Memory for Increased Concurrency. Hany E. Ramadan, Christopher J. Rossbach, Emmett Witchel University of Texas, Austin

Dependence-Aware Transactional Memory for Increased Concurrency. Hany E. Ramadan, Christopher J. Rossbach, Emmett Witchel University of Texas, Austin Dependence-Aware Transactional Memory for Increased Concurrency Hany E. Ramadan, Christopher J. Rossbach, Emmett Witchel University of Texas, Austin Concurrency Conundrum Challenge: CMP ubiquity Parallel

More information

RETCON: Transactional Repair Without Replay

RETCON: Transactional Repair Without Replay University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science 11-2-2009 RETCON: Transactional Repair Without Replay Colin lundell University of Pennsylvania,

More information

Boosting Timestamp-based Transactional Memory by Exploiting Hardware Cycle Counters

Boosting Timestamp-based Transactional Memory by Exploiting Hardware Cycle Counters Boosting Timestamp-based Transactional Memory by Exploiting Hardware Cycle Counters Wenjia Ruan, Yujie Liu, and Michael Spear Lehigh University {wer1, yul1, spear}@cse.lehigh.edu Abstract Time-based transactional

More information

An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees

An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees Chi Cao Minh, Martin Trautmann, JaeWoong Chung, Austen McDonald, Nathan Bronson, Jared Casper, Christos Kozyrakis, Kunle

More information

SmartMD: A High Performance Deduplication Engine with Mixed Pages

SmartMD: A High Performance Deduplication Engine with Mixed Pages SmartMD: A High Performance Deduplication Engine with Mixed Pages Fan Guo 1, Yongkun Li 1, Yinlong Xu 1, Song Jiang 2, John C. S. Lui 3 1 University of Science and Technology of China 2 University of Texas,

More information

HTM in the wild. Konrad Lai June 2015

HTM in the wild. Konrad Lai June 2015 HTM in the wild Konrad Lai June 2015 Industrial Considerations for HTM Provide a clear benefit to customers Improve performance & scalability Ease programmability going forward Improve something common

More information

TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions

TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions Yige Hu, Zhiting Zhu, Ian Neal, Youngjin Kwon, Tianyu Chen, Vijay Chidambaram, Emmett Witchel The University of Texas at Austin

More information

Performance Improvement via Always-Abort HTM

Performance Improvement via Always-Abort HTM 1 Performance Improvement via Always-Abort HTM Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing

More information

Extracting Parallelism from Legacy Sequential Code Using Software Transactional Memory

Extracting Parallelism from Legacy Sequential Code Using Software Transactional Memory Extracting Parallelism from Legacy Sequential Code Using Software Transactional Memory Mohamed M. Saad Preliminary Examination Proposal submitted to the Faculty of the Virginia Polytechnic Institute and

More information

Speculative Parallelization of Sequential Loops On Multicores

Speculative Parallelization of Sequential Loops On Multicores Speculative Parallelization of Sequential Loops On Multicores Chen Tian Min Feng Vijay Nagarajan Rajiv Gupta Department of Computer Science and Engineering University of California at Riverside, CA, U.S.A

More information

Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures

Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures 2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures

More information

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 20: Transactional Memory. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 20: Transactional Memory Parallel Computer Architecture and Programming Slide credit Many of the slides in today s talk are borrowed from Professor Christos Kozyrakis (Stanford University) Raising

More information

Hierarchical PLABs, CLABs, TLABs in Hotspot

Hierarchical PLABs, CLABs, TLABs in Hotspot Hierarchical s, CLABs, s in Hotspot Christoph M. Kirsch ck@cs.uni-salzburg.at Hannes Payer hpayer@cs.uni-salzburg.at Harald Röck hroeck@cs.uni-salzburg.at Abstract Thread-local allocation buffers (s) are

More information

Atomicity via Source-to-Source Translation

Atomicity via Source-to-Source Translation Atomicity via Source-to-Source Translation Benjamin Hindman Dan Grossman University of Washington 22 October 2006 Atomic An easier-to-use and harder-to-implement primitive void deposit(int x){ synchronized(this){

More information

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment J Inf Process Syst, Vol.14, No.2, pp.499~509, April 2018 https://doi.org/10.3745/jips.01.0026 ISSN 1976-913X (Print) ISSN 2092-805X (Electronic) Efficient Hybrid Transactional Memory Scheme using Near-optimal

More information

A Dynamic Instrumentation Approach to Software Transactional Memory. Marek Olszewski

A Dynamic Instrumentation Approach to Software Transactional Memory. Marek Olszewski A Dynamic Instrumentation Approach to Software Transactional Memory by Marek Olszewski A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Graduate Department

More information

SQL Server Administration 10987: Performance Tuning and Optimizing SQL Databases. Upcoming Dates. Course Description.

SQL Server Administration 10987: Performance Tuning and Optimizing SQL Databases. Upcoming Dates. Course Description. SQL Server Administration 10987: Performance Tuning and Optimizing SQL Databases Learn the high level architectural overview of SQL Server 2016 and explore SQL Server execution model, waits and queues

More information

Low Overhead Concurrency Control for Partitioned Main Memory Databases

Low Overhead Concurrency Control for Partitioned Main Memory Databases Low Overhead Concurrency Control for Partitioned Main Memory Databases Evan Jones, Daniel Abadi, Samuel Madden, June 2010, SIGMOD CS 848 May, 2016 Michael Abebe Background Motivations Database partitioning

More information

Transactional Memory. Concurrency unlocked Programming. Bingsheng Wang TM Operating Systems

Transactional Memory. Concurrency unlocked Programming. Bingsheng Wang TM Operating Systems Concurrency unlocked Programming Bingsheng Wang TM Operating Systems 1 Outline Background Motivation Database Transaction Transactional Memory History Transactional Memory Example Mechanisms Software Transactional

More information

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions Transactions Main issues: Concurrency control Recovery from failures 2 Distributed Transactions

More information

Evaluation of AMD s Advanced Synchronization Facility Within a Complete Transactional Memory Stack

Evaluation of AMD s Advanced Synchronization Facility Within a Complete Transactional Memory Stack Evaluation of AMD s Advanced Synchronization Facility Within a Complete Transactional Memory Stack Dave Christie Jae-Woong Chung Stephan Diestelhorst Michael Hohmuth Martin Pohlack Advanced Micro Devices,

More information

A Hardware/Software Approach for Alleviating Scalability Bottlenecks in Transactional Memory Applications

A Hardware/Software Approach for Alleviating Scalability Bottlenecks in Transactional Memory Applications A Hardware/Software Approach for Alleviating Scalability Bottlenecks in Transactional Memory Applications by Geoffrey Wyman Blake A dissertation submitted in partial fulfillment of the requirements for

More information

Outline. Database Tuning. Ideal Transaction. Concurrency Tuning Goals. Concurrency Tuning. Nikolaus Augsten. Lock Tuning. Unit 8 WS 2013/2014

Outline. Database Tuning. Ideal Transaction. Concurrency Tuning Goals. Concurrency Tuning. Nikolaus Augsten. Lock Tuning. Unit 8 WS 2013/2014 Outline Database Tuning Nikolaus Augsten University of Salzburg Department of Computer Science Database Group 1 Unit 8 WS 2013/2014 Adapted from Database Tuning by Dennis Shasha and Philippe Bonnet. Nikolaus

More information

Cigarette-Smokers Problem with STM

Cigarette-Smokers Problem with STM Rup Kamal, Ryan Saptarshi Ray, Utpal Kumar Ray & Parama Bhaumik Department of Information Technology, Jadavpur University Kolkata, India Abstract - The past few years have marked the start of a historic

More information

Atomic Shelters: Coping with Multi-core Fallout

Atomic Shelters: Coping with Multi-core Fallout Atomic : Coping with Multi-core Fallout Zachary Ryan Anderson David Gay Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2010-39 http://www.eecs.berkeley.edu/pubs/techrpts/2010/eecs-2010-39.html

More information

On Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions

On Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions On Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions Ahmed Hassan Preliminary Examination Proposal submitted to the Faculty of the Virginia Polytechnic

More information

HARP: Adaptive Abort Recurrence Prediction for Hardware Transactional Memory

HARP: Adaptive Abort Recurrence Prediction for Hardware Transactional Memory RP: daptive bort Recurrence Prediction for ardware Transactional Memory drià rmejach nurag Negi drián ristal Osman Unsal Per Stenstrom Tim arris Barcelona Supercomputing enter halmers University of Technology

More information

Building Consistent Transactions with Inconsistent Replication

Building Consistent Transactions with Inconsistent Replication DB Reading Group Fall 2015 slides by Dana Van Aken Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports

More information

TrC-MC: Decentralized Software Transactional Memory for Multi-Multicore Computers

TrC-MC: Decentralized Software Transactional Memory for Multi-Multicore Computers TrC-MC: Decentralized Software Transactional Memory for Multi-Multicore Computers Kinson Chan, Cho-Li Wang The University of Hong Kong {kchan, clwang}@cs.hku.hk Abstract To achieve single-lock atomicity

More information

Database Management and Tuning

Database Management and Tuning Database Management and Tuning Concurrency Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 8 May 10, 2012 Acknowledgements: The slides are provided by Nikolaus

More information

Improving the Practicality of Transactional Memory

Improving the Practicality of Transactional Memory Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter

More information