Lecture 8: Eager Transactional Memory. Topics: implementation details of eager TM, various TM pathologies

Eager Overview
Topics:
- Logs
- Log optimization
- Conflict examples
- Handling deadlocks
- Sticky scenarios
- Aborts/commits/parallelism
[Figure: four nodes, each with a processor (P), a cache (C) with R/W bits, and a directory (Dir), connected by a scalable non-broadcast interconnect]

Eager Implementation (Based Primarily on LogTM)
- A write is made permanent immediately (we do not wait until the end of the transaction).
- We can't lose the old value (in case this transaction is aborted); hence, before the write, we copy the old value into a log. The log is some space in virtual memory; the log itself may be in cache, so this is not too expensive.
- This is eager versioning.

Versioning
- Every overflowed write first requires a read and a write to log the old value; the log is maintained in virtual memory and will likely be found in cache.
- Aborts are uncommon, typically occurring only when the contention manager kicks in on a potential deadlock; on an abort, the logs are walked through in reverse order to restore the old values.
- If a block is already marked as being logged (wr-set), the next write by that transaction can avoid the re-log.
- Log writes can be placed in a write buffer to reduce contention for L1 cache ports.
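A minimal sketch of the logging scheme described above, assuming a toy memory modeled as a Python dict. The class name `EagerTx` and its methods are hypothetical, and the sketch logs the first write to every address rather than modeling cache overflow or a write buffer.

```python
class EagerTx:
    """Toy eager-versioning transaction: in-place writes plus an undo log."""

    def __init__(self, memory):
        self.memory = memory    # shared dict: address -> value (assumed pre-populated)
        self.undo_log = []      # (address, old_value) pairs, in program order
        self.write_set = set()  # addresses this transaction has already logged

    def write(self, addr, value):
        if addr not in self.write_set:
            # First write to this address: read the old value and log it.
            self.undo_log.append((addr, self.memory[addr]))
            self.write_set.add(addr)
        # The new value goes to memory immediately (eager versioning).
        self.memory[addr] = value

    def commit(self):
        # Commit is cheap: the new values are already in place.
        self.undo_log.clear()
        self.write_set.clear()

    def abort(self):
        # Walk the log in reverse order, restoring the old values.
        for addr, old_value in reversed(self.undo_log):
            self.memory[addr] = old_value
        self.undo_log.clear()
        self.write_set.clear()


memory = {"X": 1, "Y": 2}
tx = EagerTx(memory)
tx.write("X", 10)
tx.write("X", 11)   # already in the write set, so no second log entry
tx.write("Y", 20)
logged_entries = len(tx.undo_log)
tx.abort()          # memory is restored to its pre-transaction contents
```

The asymmetry is the point of the design: commit only discards the log, while abort pays for the restore, which fits the assumption that aborts are uncommon.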

Conflict Detection and Resolution
- Since Transaction-A's writes are made permanent right away, it is possible that another Transaction-B's rd/wr miss is redirected to Tr-A.
- At this point, we detect a conflict (neither transaction has reached its end; hence, eager conflict detection): two transactions are handling the same cache line and at least one of them does a write.
- One solution is requester stalls: Tr-A sends a NACK to Tr-B; Tr-B waits and retries; hopefully, Tr-A has committed by then and can hand off the latest cache line to Tr-B, so neither transaction needs to abort.
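The conflict rule and the requester-stalls response can be sketched as follows. This is an illustration only: `Tx`, `conflicts`, and `request` are hypothetical names, and real hardware would track read/write sets with per-line bits or signatures rather than Python sets.

```python
class Tx:
    def __init__(self, name):
        self.name = name
        self.read_set = set()
        self.write_set = set()
        self.active = True

    def commit(self):
        self.read_set.clear()
        self.write_set.clear()
        self.active = False


def conflicts(holder, addr, is_write):
    """Same line touched by both transactions and at least one access is a write."""
    if not holder.active:
        return False
    return addr in holder.write_set or (is_write and addr in holder.read_set)


def request(holder, requester, addr, is_write):
    """Return "NACK" if the holder conflicts, else record the access and return "ACK"."""
    if conflicts(holder, addr, is_write):
        return "NACK"
    (requester.write_set if is_write else requester.read_set).add(addr)
    return "ACK"


tr_a, tr_b = Tx("A"), Tx("B")
tr_a.write_set.add("X")                       # Tr-A has written X
first_try = request(tr_a, tr_b, "X", False)   # Tr-B's read miss reaches Tr-A
tr_a.commit()                                 # Tr-A finishes
second_try = request(tr_a, tr_b, "X", False)  # Tr-B retries
```

In this run the first attempt is NACKed and the retry succeeds after Tr-A commits, so neither transaction aborts.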

Deadlocks
- Requester stalls can lead to deadlocks: each transaction is waiting for the other to finish.

  Tr-A        Tr-B
  write X     write Y
  read Y      read X

- We need a separate (hw/sw) contention manager to detect such deadlocks and force one of the transactions to abort.
- Alternatively, every transaction maintains an age, and a young transaction aborts and restarts if it is keeping an older transaction waiting and itself receives a NACK from an older transaction.
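The age-based rule can be sketched on the two-transaction example above. The names `AgedTx`, `nack`, and the `possible_cycle` flag are illustrative, and the sketch assumes every transaction has a unique timestamp, with a smaller value meaning an older transaction.

```python
class AgedTx:
    def __init__(self, name, timestamp):
        self.name = name
        self.timestamp = timestamp    # smaller timestamp = older transaction
        self.possible_cycle = False   # set once we have stalled an older transaction
        self.aborted = False


def nack(sender, receiver):
    """`sender` refuses `receiver`'s request and applies the age rule."""
    if receiver.timestamp < sender.timestamp:
        # The sender is keeping an older transaction waiting.
        sender.possible_cycle = True
    elif receiver.possible_cycle:
        # A younger receiver that is itself stalling an older transaction
        # has now been NACKed by an older one: it aborts to break the cycle.
        receiver.aborted = True


tr_a = AgedTx("A", timestamp=1)   # older
tr_b = AgedTx("B", timestamp=2)   # younger
# Tr-A wrote X and Tr-B wrote Y; now each reads the other's block.
nack(tr_b, tr_a)   # Tr-A reads Y: Tr-B NACKs the older Tr-A
nack(tr_a, tr_b)   # Tr-B reads X: Tr-A NACKs the younger Tr-B
```

Only the younger Tr-B ends up aborted, so the older transaction always makes progress and the cycle is broken without a global deadlock detector.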

Block Replacement
- If a block in a transaction's rd/wr-set is evicted, the data is written back to memory if necessary, but the directory continues to maintain a sticky pointer to that node (subsequent requests have to confirm that the transaction has committed before proceeding).
- The sticky pointers are lazily removed over time (commits continue to be fast).
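A simplified sketch of sticky pointers, assuming the directory can ask whether a node's transaction is still running. The `Directory` class, its method names, and the `active_nodes` argument are hypothetical stand-ins for the messages a real protocol would exchange with the node.

```python
class Directory:
    def __init__(self):
        self.sticky = {}   # address -> id of the node whose transaction evicted the block

    def evict_transactional_block(self, addr, node_id):
        # Data goes back to memory, but the directory keeps pointing at the node.
        self.sticky[addr] = node_id

    def request(self, addr, active_nodes):
        """`active_nodes` is the set of node ids whose transaction is still running."""
        owner = self.sticky.get(addr)
        if owner is not None and owner in active_nodes:
            return "NACK"             # the transaction has not committed yet
        self.sticky.pop(addr, None)   # lazy cleanup of a stale sticky pointer
        return "GRANT"


directory = Directory()
directory.evict_transactional_block("X", node_id=0)
while_running = directory.request("X", active_nodes={0})
after_commit = directory.request("X", active_nodes=set())
```

Note that commit itself never touches the directory here; the stale pointer is only cleared when a later request trips over it, which is what keeps commits fast.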

Paper on TM Pathologies
- LL: lazy versioning, lazy conflict detection, committing transaction wins conflicts
- EL: lazy versioning, eager conflict detection, requester succeeds and others abort
- EE: eager versioning, eager conflict detection, requester stalls

Pathology 1: Friendly Fire
- Two conflicting transactions keep aborting each other.
- Exponential back-off can be used to handle the resulting livelock.
- Fixable by doing requester stalls?
- VM (version management): any; CD (conflict detection): eager; CR (conflict resolution): requester wins
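A minimal sketch of randomized exponential back-off, one common way to break this kind of livelock. The function names, the `base` and `cap` parameters, and the cycle units are assumptions chosen for illustration.

```python
import random


def backoff_window(attempt, base=1, cap=1024):
    """Upper bound on the wait (in arbitrary cycles) after `attempt` aborts."""
    return min(cap, base * (2 ** attempt))


def backoff_delay(attempt, rng=random):
    """Pick a random delay in [0, window] so the two rivals restart at different times."""
    return rng.randint(0, backoff_window(attempt))


windows = [backoff_window(n) for n in range(5)]   # [1, 2, 4, 8, 16]
```

The randomization matters as much as the doubling: if both transactions waited the same deterministic time, they would collide again on restart.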

Pathology 2: Starving Writer
- A writer has to wait for the reader to finish, but if more readers keep showing up, the writer is starved (note that the directory allows new readers to proceed by just adding them to the list of sharers).
- VM: any; CD: eager; CR: requester stalls
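The starvation condition can be illustrated with a toy timeline, assuming discrete time steps and readers that each hold the block for a fixed interval. The function `writer_start_time` and the interval encoding are hypothetical.

```python
def writer_start_time(reader_intervals, horizon):
    """First time step in [0, horizon) at which no reader holds the block, else None.

    Each reader holds the block for the half-open interval [start, end).
    """
    for t in range(horizon):
        if not any(start <= t < end for start, end in reader_intervals):
            return t
    return None


one_reader = writer_start_time([(0, 3)], horizon=10)
overlapping = writer_start_time([(0, 3), (2, 5), (4, 7), (6, 10)], horizon=10)
```

With a single reader the writer proceeds at step 3; with readers that keep overlapping, the sharer list is never empty within the horizon and the result is `None`.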

Pathology 3: Serialized Commit
- If there's a single commit token, transaction commit is serialized.
- VM: lazy; CD: lazy; CR: any
- There are ways to alleviate this problem.

Pathology 4: Futile Stall
- A transaction is stalling on another transaction that ultimately aborts and takes a while to reinstate old values.
- VM: any; CD: eager; CR: requester stalls

Pathology 5: Starving Elder
- Small successful transactions can keep aborting a large transaction.
- VM: lazy; CD: lazy; CR: committer wins
- The large transaction can eventually grab the token and not release it until after it commits.

Pathology 6: Restart Convoy
- A number of similar (conflicting) transactions execute together: one wins and the others all abort; shortly afterwards, these transactions all return and repeat the process.
- VM: lazy; CD: lazy; CR: committer wins

Pathology 7: Dueling Upgrades
- If two transactions both read the same object and then both decide to write it, a deadlock is created.
- VM: eager; CD: eager; CR: requester stalls
- Exacerbated by the Futile Stall pathology.
- Solution?

Four Extensions
- Predictor: predict if the read will soon be followed by a write and acquire write permissions aggressively.
- Hybrid: if a transaction believes it is a Starving Writer, it can force other readers to abort; for everything else, use requester stalls.
- Timestamp: in the EL case, requester wins only if it is the older transaction (handles the Friendly Fire pathology).
- Backoff: in the LL case, aborting transactions invoke exponential back-off to prevent convoy formation.
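A small sketch of why the Timestamp extension rules out Friendly Fire. The function names are hypothetical, a smaller timestamp is assumed to mean an older transaction, and what a younger requester does after losing (here, it simply waits) is an assumption, since the slide only states when the requester wins.

```python
def resolve_el_conflict(requester_ts, holder_ts, use_timestamps):
    """Return which side gives way; smaller timestamp = older transaction."""
    if not use_timestamps:
        return "holder aborts"     # baseline EL: the requester always wins
    if requester_ts < holder_ts:
        return "holder aborts"     # an older requester still wins
    return "requester waits"       # assumption: a younger requester backs off


def friendly_fire_possible(ts_a, ts_b, use_timestamps):
    """Both transactions can abort each other only if each wins as a requester."""
    a_wins = resolve_el_conflict(ts_a, ts_b, use_timestamps) == "holder aborts"
    b_wins = resolve_el_conflict(ts_b, ts_a, use_timestamps) == "holder aborts"
    return a_wins and b_wins


baseline = friendly_fire_possible(1, 2, use_timestamps=False)
with_timestamps = friendly_fire_possible(1, 2, use_timestamps=True)
```

Under the baseline policy both transactions can abort each other, whereas with timestamps only the older one can win, so mutual aborts between the pair cannot repeat indefinitely.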
