Efficient Persist Barriers for Multicores

Size: px
Start display at page:

Download "Efficient Persist Barriers for Multicores"

Transcription

1 fficient Persist Barriers for Multicores Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas

2 Summary fficient persist barrier Used to implement persistency models Persistency = when stores become durable (Consistency = when stores become visible) valuated Buffered poch Persistency Bulk Strict Persistency 2

3 Access Granularity Persistent Memory Access Latency 3

4 Persistent Memory Access Granularity Secondary Storage Cache DRAM Access Latency 3

5 Persistent Memory Access Granularity PCM STT-MRAM 3D Xpoint Secondary Storage Cache DRAM NVRAM Access Latency 3

6 Persistent Memory Access Granularity PCM STT-MRAM 3D Xpoint Secondary Storage Cache DRAM NVRAM Access Latency Fast, fine grained persistence. 3

7 Persistent Memory Advantages: Unify memory and storage Access to persistent data through processor load/store interface Challenge: Maintaining consistency of data structures in memory 4

8 Consistency Challenge Core Cache DRAM Software Controlled Secondary Storage 5

9 Consistency Challenge Core Core Cache Cache Hardware Controlled DRAM NVRAM Software Controlled Secondary Storage Secondary Storage 5

10 Linked List - Naïve Cache Pseudo-code HAD 1. Create Update Pointer 3. Update Head Pointer HAD 0 1 NVRAM 6

11 Linked List - Naïve Cache Pseudo-code HAD 1. Create Update Pointer 3. Update Head Pointer HAD 0 1 NVRAM 6

12 Linked List - Naïve Cache Pseudo-code HAD 1. Create Update Pointer 3. Update Head Pointer HAD NVRAM 6

13 Linked List - Naïve Cache Pseudo-code HAD 1. Create Update Pointer 3. Update Head Pointer HAD NVRAM 6

14 Linked List - Naïve Cache Pseudo-code 1. Create System Crash! 2. Update Pointer 3. Update Head Pointer HAD NVRAM 6

15 Linked List - Failsafe Cache Pseudo-code HAD 1. Create 2. Update Pointer Persist Barrier 4. Update Head Pointer HAD 0 1 NVRAM 7

16 Linked List - Failsafe Cache Pseudo-code HAD 1. Create Update Pointer 3. Persist Barrier 4. Update Head Pointer HAD 0 1 NVRAM 7

17 Linked List - Failsafe Cache Pseudo-code HAD 1. Create Update Pointer 3. Persist Barrier 4. Update Head Pointer HAD NVRAM 7

18 Linked List - Failsafe Cache Pseudo-code HAD 1. Create Update Pointer 3. Persist Barrier 4. Update Head Pointer HAD NVRAM 7

19 Linked List - Failsafe Cache Pseudo-code HAD 1. Create Update Pointer 3. Persist Barrier 4. Update Head Pointer HAD NVRAM 7

20 Linked List - Failsafe Pseudo-code 1. Create 2. Update Pointer poch A 3. Persist Barrier 4. Update Head Pointer poch B 8

21 Linked List - Failsafe Pseudo-code 1. Create 2. Update Pointer 3. Persist Barrier Programmer Introduced 4. Update Head Pointer 8

22 Linked List - Failsafe Pseudo-code 1. Create 2. Update Pointer 3. Persist Barrier Programmer Introduced 4. Update Head Pointer Divide program execution into epochs through programmer inserted persist barriers = poch Persistence 8

23 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

24 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

25 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

26 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

27 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

28 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e * Pelley et. al., Memory Persistency, in ISCA

29 poch Persistence* St a St b St c St a Persist Barrier St d St e St d Persist Barrier St p St q St d Persistence a poch 1 b c b a a c d poch 2 e poch 3 d p q d d e Persist operations happen in the critical path of execution. * Pelley et. al., Memory Persistency, in ISCA

30 Buffered poch Persistence* through Lazy Barrier (LB) Implementation of poch Persistence Durability lags visibility To allow performing persist operations out of critical path * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

31 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

32 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

33 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

34 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e Cache Line viction * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

35 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

36 Buffered poch Persistence* through Lazy Barrier (LB) poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e Conflicts bring persist operations back in the critical path. * Pelley et. al., Memory Persistency, in ISCA * Condit et. al., Better I/O through byte-addressable, persistent memory, in SOSP

37 Intra-thread Conflict poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e 12

38 Intra-thread Conflict poch 1 poch 2 poch 3 a b c a d e d p q d Conflicting request Persistence b a c d e 12

39 Proactive Flush (PF) Persist Triggers in Lazy Barrier Passive Trigger: cache line eviction Reactive Trigger: flush on (intra/inter)-thread conflicts 13

40 Proactive Flush (PF) Persist Triggers in Lazy Barrier Passive Trigger: cache line eviction Reactive Trigger: flush on (intra/inter)-thread conflicts Persist Trigger with Proactive Flush Proactive Trigger: proactively flush on epoch completion 13

41 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e 14

42 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e 14

43 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e 14

44 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e 14

45 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e Proactive Flush 14

46 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e Proactive Flush 14

47 Proactive Flush (PF) poch 1 poch 2 poch 3 a b c a d e d p q d Persistence b a c d e Proactive Flush Reduces the probability of encountering conflicts. 14

48 Inter-thread Conflict Thread T 0 W A poch 00 W B W W F Persistence Z A B F Thread T 1 R X R Y W Z R P R B R Q W poch 10 poch 11 15

49 Inter-thread Conflict Thread T 0 W A poch 00 W B W W F Persistence Z A B F Thread T 1 R X R Y W Z R P R B R Q W poch 10 poch 11 15

50 Inter-thread Conflict Thread T 0 W A poch 00 W B W W F Persistence Z A B F Thread T 1 R X R Y W Z R P R B R Q W poch 10 poch 11 15

51 Inter-thread Conflict Thread T 0 W A poch 00 W B W W F Persistence Z A B F Thread T 1 R X R Y W Z R P R B R Q W poch 10 poch 11 15

52 Inter-thread Conflict Thread T 0 W A poch 00 W B W W F Persistence Z A B F Thread T 1 R X R Y W Z R P R B R Q W poch 10 poch 11 15

53 Inter-thread Dependency Lazy barrier Tracking (IDT) No tracking of inter-thread dependencies Need to enforce dependencies online 16

54 Inter-thread Dependency Lazy barrier Tracking (IDT) No tracking of inter-thread dependencies Need to enforce dependencies online Inter-thread Dependency Tracking Add inter-thread dependence tracking registers Track dependencies to enforce offline 16

55 Inter-thread Dependency Tracking (IDT) Thread T 0 poch 00 Source IDT Table Dependent W A W B W W F Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch

56 Inter-thread Dependency Tracking (IDT) Thread T 0 poch 00 Source IDT Table Dependent W A W B W W F Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch

57 Inter-thread Dependency Tracking (IDT) Thread T 0 W A poch 00 W B W W F IDT Table Source Dependent poch 00 poch 11 Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch

58 Inter-thread Dependency Tracking (IDT) Thread T 0 W A poch 00 W B W W F IDT Table Source Dependent poch 00 poch 11 Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch

59 Inter-thread Dependency Tracking (IDT) Thread T 0 W A poch 00 W B W W F IDT Table Source Dependent poch 00 poch 11 Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch

60 Inter-thread Dependency Tracking (IDT) Thread T 0 W A poch 00 W B W W F IDT Table Source Dependent poch 00 poch 11 Persistence Z A B F Thread R X R Y poch W Z R P R B T 1 R Q W poch Reduces the latency of conflicting requests. 17

61 valuation LB LB+IDT LB+PF LB++ Lazy barrier Lazy barrier with inter-thread dependence tracking (IDT) Lazy barrier with proactive flush (PF) Lazy barrier with both IDT and PF Persistency Models Persist Barrier Designs Buffered poch Persistency (BP) maintaining in-memory persistent data structures Bulk Strict Persistency (BSP) = BP + atomicity provide stronger persistency model (strict persistency) similar to doing sequential consistency in bulk mode* * Ceze et. al., BulkSC: Bulk enforcement of sequential consistency, in ISCA

62 System Configuration We evaluate proposed design using GM5 full-system simulation mode 32 Core CMP with 32x1MB LLC cache banks and 4 memory controllers More details on implementation of persist barrier for such a system are in the paper. 19

63 BP Transaction Throughput Higher is Better 20

64 BP Transaction Throughput 3% Higher is Better 20

65 BP Transaction Throughput 15% Higher is Better 20

66 BP Transaction Throughput 22% Higher is Better 20

67 BSP xecution Time Lower is Better 21

68 BSP xecution Time 15% Lower is Better 21

69 BSP xecution Time 20% Lower is Better 21

70 Conclusion We propose an efficient implementation of a persist barrier primitive Buffered implementation, to move persists out of critical path We highlight how conflicts bring them back into critical path We propose and implement two optimizations Proactive Flush: Reduce the percentage of conflicting epochs Inter-thread Dependence Tracking: Reduce the penalty of inter-thread conflicts We demonstrate the efficacy by implementing two persistence models, namely BP and BSP efficiently 22

71 fficient Persist Barriers for Multicores Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas

Architectural Support for Atomic Durability in Non-Volatile Memory

Architectural Support for Atomic Durability in Non-Volatile Memory Architectural Support for Atomic Durability in Non-Volatile Memory Arpit Joshi, Vijay Nagarajan, Stratis Viglas, Marcelo Cintra NVMW 2018 Summary Non-Volatile Memory (NVM) - on the memory bus enables in-memory

More information

DHTM: Durable Hardware Transactional Memory

DHTM: Durable Hardware Transactional Memory DHTM: Durable Hardware Transactional Memory Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas ISCA 2018 is here!2 is here!2 Systems LLC!3 Systems - Non-volatility over the memory bus - Load/Store

More information

Instant Recovery for Main-Memory Databases

Instant Recovery for Main-Memory Databases Instant Recovery for Main-Memory Databases Ismail Oukid*, Wolfgang Lehner*, Thomas Kissinger*, Peter Bumbulis, and Thomas Willhalm + *TU Dresden SAP SE + Intel GmbH CIDR 2015, Asilomar, California, USA,

More information

Hardware Support for NVM Programming

Hardware Support for NVM Programming Hardware Support for NVM Programming 1 Outline Ordering Transactions Write endurance 2 Volatile Memory Ordering Write-back caching Improves performance Reorders writes to DRAM STORE A STORE B CPU CPU B

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin Lee Electrical Engineering Stanford University Stanford EE382 2 December 2009 Benjamin Lee 1 :: PCM :: 2 Dec 09 Memory Scaling density,

More information

Blurred Persistence in Transactional Persistent Memory

Blurred Persistence in Transactional Persistent Memory Blurred Persistence in Transactional Persistent Memory Youyou Lu, Jiwu Shu, Long Sun Tsinghua University Overview Problem: high performance overhead in ensuring storage consistency of persistent memory

More information

3. Memory Persistency Goals. 2. Background

3. Memory Persistency Goals. 2. Background Memory Persistency Steven Pelley Peter M. Chen Thomas F. Wenisch University of Michigan {spelley,pmchen,twenisch}@umich.edu Abstract Emerging nonvolatile memory technologies (NVRAM) promise the performance

More information

Empirical Study of Redo and Undo Logging in Persistent Memory

Empirical Study of Redo and Undo Logging in Persistent Memory Empirical Study of Redo and Undo Logging in Persistent Memory Hu Wan, Youyou Lu, Yuanchao Xu, and Jiwu Shu College of Information Engineering, Capital Normal University, Beijing, China Department of Computer

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,

More information

Log-Free Concurrent Data Structures

Log-Free Concurrent Data Structures Log-Free Concurrent Data Structures Abstract Tudor David IBM Research Zurich udo@zurich.ibm.com Rachid Guerraoui EPFL rachid.guerraoui@epfl.ch Non-volatile RAM (NVRAM) makes it possible for data structures

More information

Operating System Supports for SCM as Main Memory Systems (Focusing on ibuddy)

Operating System Supports for SCM as Main Memory Systems (Focusing on ibuddy) 2011 NVRAMOS Operating System Supports for SCM as Main Memory Systems (Focusing on ibuddy) 2011. 4. 19 Jongmoo Choi http://embedded.dankook.ac.kr/~choijm Contents Overview Motivation Observations Proposal:

More information

SoftWrAP: A Lightweight Framework for Transactional Support of Storage Class Memory

SoftWrAP: A Lightweight Framework for Transactional Support of Storage Class Memory SoftWrAP: A Lightweight Framework for Transactional Support of Storage Class Memory Ellis Giles Rice University Houston, Texas erg@rice.edu Kshitij Doshi Intel Corp. Portland, OR kshitij.a.doshi@intel.com

More information

Mnemosyne Lightweight Persistent Memory

Mnemosyne Lightweight Persistent Memory Mnemosyne Lightweight Persistent Memory Haris Volos Andres Jaan Tack, Michael M. Swift University of Wisconsin Madison Executive Summary Storage-Class Memory (SCM) enables memory-like storage Persistent

More information

An Analysis of Persistent Memory Use with WHISPER

An Analysis of Persistent Memory Use with WHISPER An Analysis of Persistent Memory Use with WHISPER Sanketh Nalli, Swapnil Haria, Michael M. Swift, Mark D. Hill, Haris Volos*, Kimberly Keeton* University of Wisconsin- Madison & *Hewlett- Packard Labs

More information

An Analysis of Persistent Memory Use with WHISPER

An Analysis of Persistent Memory Use with WHISPER An Analysis of Persistent Memory Use with WHISPER Sanketh Nalli, Swapnil Haria, Michael M. Swift, Mark D. Hill, Haris Volos*, Kimberly Keeton* University of Wisconsin- Madison & *Hewlett- Packard Labs

More information

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant Big and Fast Anti-Caching in OLTP Systems Justin DeBrabant Online Transaction Processing transaction-oriented small footprint write-intensive 2 A bit of history 3 OLTP Through the Years relational model

More information

NV-Tree Reducing Consistency Cost for NVM-based Single Level Systems

NV-Tree Reducing Consistency Cost for NVM-based Single Level Systems NV-Tree Reducing Consistency Cost for NVM-based Single Level Systems Jun Yang 1, Qingsong Wei 1, Cheng Chen 1, Chundong Wang 1, Khai Leong Yong 1 and Bingsheng He 2 1 Data Storage Institute, A-STAR, Singapore

More information

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho Why memory hierarchy? L1 cache design Sangyeun Cho Computer Science Department Memory hierarchy Memory hierarchy goals Smaller Faster More expensive per byte CPU Regs L1 cache L2 cache SRAM SRAM To provide

More information

Failure-atomic Synchronization-free Regions

Failure-atomic Synchronization-free Regions Failure-atomic Synchronization-free Regions Vaibhav Gogte, Stephan Diestelhorst $, William Wang $, Satish Narayanasamy, Peter M. Chen, Thomas F. Wenisch NVMW 2018, San Diego, CA 03/13/2018 $ Promise of

More information

The Bulk Multicore Architecture for Programmability

The Bulk Multicore Architecture for Programmability The Bulk Multicore Architecture for Programmability Department of Computer Science University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu Acknowledgments Key contributors: Luis Ceze Calin

More information

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory JOY ARULRAJ JUSTIN LEVANDOSKI UMAR FAROOQ MINHAS PER-AKE LARSON Microsoft Research NON-VOLATILE MEMORY [NVM] PERFORMANCE DRAM VOLATILE

More information

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review COS 318: Operating Systems NSF, Snapshot, Dedup and Review Topics! NFS! Case Study: NetApp File System! Deduplication storage system! Course review 2 Network File System! Sun introduced NFS v2 in early

More information

New Abstractions for Fast Non-Volatile Storage

New Abstractions for Fast Non-Volatile Storage New Abstractions for Fast Non-Volatile Storage Joel Coburn, Adrian Caulfield, Laura Grupp, Ameen Akel, Steven Swanson Non-volatile Systems Laboratory Department of Computer Science and Engineering University

More information

Persistence Parallelism Optimization: A Holistic Approach from Memory Bus to RDMA Network

Persistence Parallelism Optimization: A Holistic Approach from Memory Bus to RDMA Network Persistence Parallelism Optimization: A Holistic Approach from Memory Bus to RDMA Network Xing Hu Matheus Ogleari Jishen Zhao Shuangchen Li Abanti Basak Yuan Xie University of California, Santa Barbara,

More information

Loose-Ordering Consistency for Persistent Memory

Loose-Ordering Consistency for Persistent Memory Loose-Ordering Consistency for Persistent Memory Youyou Lu, Jiwu Shu, Long Sun and Onur Mutlu Department of Computer Science and Technology, Tsinghua University, Beijing, China State Key Laboratory of

More information

System Software for Persistent Memory

System Software for Persistent Memory System Software for Persistent Memory Subramanya R Dulloor, Sanjay Kumar, Anil Keshavamurthy, Philip Lantz, Dheeraj Reddy, Rajesh Sankaran and Jeff Jackson 72131715 Neo Kim phoenixise@gmail.com Contents

More information

Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique

Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique Mohammad Alshboul, James Tuck, and Yan Solihin Email: maalshbo@ncsu.edu ARPERS Research Group Introduction Future

More information

Relaxed Memory Consistency

Relaxed Memory Consistency Relaxed Memory Consistency Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

OpF-STM : Optimized Persistent Overhead Fail-safety software transactional memory in Non-volatile memory

OpF-STM : Optimized Persistent Overhead Fail-safety software transactional memory in Non-volatile memory OpF-STM : Optimized Persistent Overhead Fail-safety software transactional memory in Non-volatile memory Jihyun Kim Hanyang University Seoul, Korea wlgus3018@hanyang.ac.kr Youjip Won Hanyang University

More information

Linearizability of Persistent Memory Objects

Linearizability of Persistent Memory Objects Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Workshop on the Theory of Transactional

More information

CS 2410 Mid term (fall 2018)

CS 2410 Mid term (fall 2018) CS 2410 Mid term (fall 2018) Name: Question 1 (6+6+3=15 points): Consider two machines, the first being a 5-stage operating at 1ns clock and the second is a 12-stage operating at 0.7ns clock. Due to data

More information

Dalí: A Periodically Persistent Hash Map

Dalí: A Periodically Persistent Hash Map Dalí: A Periodically Persistent Hash Map Faisal Nawab* 1, Joseph Izraelevitz* 2, Terence Kelly*, Charles B. Morrey III*, Dhruva R. Chakrabarti*, and Michael L. Scott 2 1 Department of Computer Science

More information

Linearizability of Persistent Memory Objects

Linearizability of Persistent Memory Objects Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Compiler-Driven Performance Workshop,

More information

JOURNALING techniques have been widely used in modern

JOURNALING techniques have been widely used in modern IEEE TRANSACTIONS ON COMPUTERS, VOL. XX, NO. X, XXXX 2018 1 Optimizing File Systems with a Write-efficient Journaling Scheme on Non-volatile Memory Xiaoyi Zhang, Dan Feng, Member, IEEE, Yu Hua, Senior

More information

Flushing the Cache Termination of Caching Interactions with the I/O Manager The Read-Ahead Module Lazy-Write Functionality

Flushing the Cache Termination of Caching Interactions with the I/O Manager The Read-Ahead Module Lazy-Write Functionality In this chapter: 326 Chapter Flushing the Cache 327 Parameters: 328 Chapter 8: The NT Cache Manager III The Cache Manager then asks the Virtual Memory Manager to flush the section object (representing

More information

WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems

WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems Se Kwon Lee K. Hyun Lim 1, Hyunsub Song, Beomseok Nam, Sam H. Noh UNIST 1 Hongik University Persistent Memory (PM) Persistent memory

More information

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of lazy TM 1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion

More information

Computer Architecture: Multithreading (I) Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Multithreading (I) Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Multithreading (I) Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 9: Multithreading

More information

Agenda. System Performance Scaling of IBM POWER6 TM Based Servers

Agenda. System Performance Scaling of IBM POWER6 TM Based Servers System Performance Scaling of IBM POWER6 TM Based Servers Jeff Stuecheli Hot Chips 19 August 2007 Agenda Historical background POWER6 TM chip components Interconnect topology Cache Coherence strategies

More information

Hardware Undo+Redo Logging. Matheus Ogleari Ethan Miller Jishen Zhao CRSS Retreat 2018 May 16, 2018

Hardware Undo+Redo Logging. Matheus Ogleari Ethan Miller Jishen Zhao   CRSS Retreat 2018 May 16, 2018 Hardware Undo+Redo Logging Matheus Ogleari Ethan Miller Jishen Zhao https://users.soe.ucsc.edu/~mogleari/ CRSS Retreat 2018 May 16, 2018 Typical Memory and Storage Hierarchy: Memory Fast access to working

More information

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS Samira Khan MEMORY IN TODAY S SYSTEM Processor DRAM Memory Storage DRAM is critical for performance

More information

Declarative query processing in imperative managed runtimes

Declarative query processing in imperative managed runtimes Declarative query processing in imperative managed runtimes Stratis D. Viglas Google, US & School of Informatics, University of Edinburgh, UK sviglas@google.com Active/HardBD 2017 Multi-tier applications

More information

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University 740: Computer Architecture Memory Consistency Prof. Onur Mutlu Carnegie Mellon University Readings: Memory Consistency Required Lamport, How to Make a Multiprocessor Computer That Correctly Executes Multiprocess

More information

Closing the Performance Gap Between Volatile and Persistent K-V Stores

Closing the Performance Gap Between Volatile and Persistent K-V Stores Closing the Performance Gap Between Volatile and Persistent K-V Stores Yihe Huang, Harvard University Matej Pavlovic, EPFL Virendra Marathe, Oracle Labs Margo Seltzer, Oracle Labs Tim Harris, Oracle Labs

More information

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?

Agenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based? Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based

More information

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 18-742 Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 Past Due: Review Assignments Was Due: Tuesday, October 9, 11:59pm. Sohi

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Year & Semester : III/VI Section : CSE-1 & CSE-2 Subject Code : CS2354 Subject Name : Advanced Computer Architecture Degree & Branch : B.E C.S.E. UNIT-1 1.

More information

SCALING HARDWARE AND SOFTWARE

SCALING HARDWARE AND SOFTWARE SCALING HARDWARE AND SOFTWARE FOR THOUSAND-CORE SYSTEMS Daniel Sanchez Electrical Engineering Stanford University Multicore Scalability 1.E+06 10 6 1.E+05 10 5 1.E+04 10 4 1.E+03 10 3 1.E+02 10 2 1.E+01

More information

Multiprocessor Synchronization

Multiprocessor Synchronization Multiprocessor Systems Memory Consistency In addition, read Doeppner, 5.1 and 5.2 (Much material in this section has been freely borrowed from Gernot Heiser at UNSW and from Kevin Elphinstone) MP Memory

More information

CBM: A Cooperative Buffer Management for SSD

CBM: A Cooperative Buffer Management for SSD 3 th International Conference on Massive Storage Systems and Technology (MSST 4) : A Cooperative Buffer Management for SSD Qingsong Wei, Cheng Chen, Jun Yang Data Storage Institute, A-STAR, Singapore June

More information

Redo Log Removal Mechanism for NVRAM Log Buffer

Redo Log Removal Mechanism for NVRAM Log Buffer Redo Log Removal Mechanism for NVRAM Log Buffer 2009/10/20 Kyoung-Gu Woo, Heegyu Jin Advanced Software Research Laboratories SAIT, Samsung Electronics Contents Background Basic Philosophy Proof of Correctness

More information

Database Workload. from additional misses in this already memory-intensive databases? interference could be a problem) Key question:

Database Workload. from additional misses in this already memory-intensive databases? interference could be a problem) Key question: Database Workload + Low throughput (0.8 IPC on an 8-wide superscalar. 1/4 of SPEC) + Naturally threaded (and widely used) application - Already high cache miss rates on a single-threaded machine (destructive

More information

COS 318: Operating Systems. Virtual Memory and Address Translation

COS 318: Operating Systems. Virtual Memory and Address Translation COS 318: Operating Systems Virtual Memory and Address Translation Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Today s Topics

More information

Beyond ILP. Hemanth M Bharathan Balaji. Hemanth M & Bharathan Balaji

Beyond ILP. Hemanth M Bharathan Balaji. Hemanth M & Bharathan Balaji Beyond ILP Hemanth M Bharathan Balaji Multiscalar Processors Gurindar S Sohi Scott E Breach T N Vijaykumar Control Flow Graph (CFG) Each node is a basic block in graph CFG divided into a collection of

More information

Soft Updates Made Simple and Fast on Non-volatile Memory

Soft Updates Made Simple and Fast on Non-volatile Memory Soft Updates Made Simple and Fast on Non-volatile Memory Mingkai Dong and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University https://www.usenix.org/conference/atc7/technical-sessions/presentation/dong

More information

Rethink the Sync 황인중, 강윤지, 곽현호. Embedded Software Lab. Embedded Software Lab.

Rethink the Sync 황인중, 강윤지, 곽현호. Embedded Software Lab. Embedded Software Lab. 1 Rethink the Sync 황인중, 강윤지, 곽현호 Authors 2 USENIX Symposium on Operating System Design and Implementation (OSDI 06) System Structure Overview 3 User Level Application Layer Kernel Level Virtual File System

More information

Leave the Cache Hierarchy Operation as It Is: A New Persistent Memory Accelerating Approach

Leave the Cache Hierarchy Operation as It Is: A New Persistent Memory Accelerating Approach Leave the Cache Hierarchy Operation as It Is: A New Persistent Memory Accelerating Approach Chun-Hao Lai Jishen Zhao Chia-Lin Yang National Taiwan University, University of California, Santa Cruz, Academia

More information

Parallel Programming Principle and Practice. Lecture 9 Introduction to GPGPUs and CUDA Programming Model

Parallel Programming Principle and Practice. Lecture 9 Introduction to GPGPUs and CUDA Programming Model Parallel Programming Principle and Practice Lecture 9 Introduction to GPGPUs and CUDA Programming Model Outline Introduction to GPGPUs and Cuda Programming Model The Cuda Thread Hierarchy / Memory Hierarchy

More information

Benchmarking the Memory Hierarchy of Modern GPUs

Benchmarking the Memory Hierarchy of Modern GPUs 1 of 30 Benchmarking the Memory Hierarchy of Modern GPUs In 11th IFIP International Conference on Network and Parallel Computing Xinxin Mei, Kaiyong Zhao, Chengjian Liu, Xiaowen Chu CS Department, Hong

More information

Loose-Ordering Consistency for Persistent Memory

Loose-Ordering Consistency for Persistent Memory Loose-Ordering Consistency for Persistent Memory Youyou Lu 1, Jiwu Shu 1, Long Sun 1, Onur Mutlu 2 1 Tsinghua University 2 Carnegie Mellon University Summary Problem: Strict write ordering required for

More information

Spring Prof. Hyesoon Kim

Spring Prof. Hyesoon Kim Spring 2011 Prof. Hyesoon Kim 2 Warp is the basic unit of execution A group of threads (e.g. 32 threads for the Tesla GPU architecture) Warp Execution Inst 1 Inst 2 Inst 3 Sources ready T T T T One warp

More information

Automatic OpenCL Optimization for Locality and Parallelism Management

Automatic OpenCL Optimization for Locality and Parallelism Management Automatic OpenCL Optimization for Locality and Parallelism Management Xing Zhou, Swapnil Ghike In collaboration with: Jean-Pierre Giacalone, Bob Kuhn and Yang Ni (Intel) Maria Garzaran and David Padua

More information

BulkCommit: Scalable and Fast Commit of Atomic Blocks in a Lazy Multiprocessor Environment

BulkCommit: Scalable and Fast Commit of Atomic Blocks in a Lazy Multiprocessor Environment BulkCommit: Scalable and Fast Commit of Atomic Blocks in a Lazy Multiprocessor Environment, Benjamin Sahelices Josep Torrellas, Depei Qian University of Illinois University of Illinois http://iacoma.cs.uiuc.edu/

More information

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28 Transactional Memory or How to do multiple things at once Benjamin Engel Transactional Memory 1 / 28 Transactional Memory: Architectural Support for Lock-Free Data Structures M. Herlihy, J. Eliot, and

More information

Energy Aware Persistence: Reducing Energy Overheads of Memory-based Persistence in NVMs

Energy Aware Persistence: Reducing Energy Overheads of Memory-based Persistence in NVMs Energy Aware Persistence: Reducing Energy Overheads of Memory-based Persistence in NVMs Sudarsun Kannan College of Computing, Georgia Tech sudarsun@gatech.edu Moinuddin Qureshi School of ECE, Georgia Tech

More information

Evaluating the Portability of UPC to the Cell Broadband Engine

Evaluating the Portability of UPC to the Cell Broadband Engine Evaluating the Portability of UPC to the Cell Broadband Engine Dipl. Inform. Ruben Niederhagen JSC Cell Meeting CHAIR FOR OPERATING SYSTEMS Outline Introduction UPC Cell UPC on Cell Mapping Compiler and

More information

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme FUT3040BU Storage at Memory Speed: Finally, Nonvolatile Memory Is Here Rajesh Venkatasubramanian, VMware, Inc Richard A Brunner, VMware, Inc #VMworld #FUT3040BU Disclaimer This presentation may contain

More information

NVWAL: Exploiting NVRAM in Write-Ahead Logging

NVWAL: Exploiting NVRAM in Write-Ahead Logging NVWAL: Exploiting NVRAM in Write-Ahead Logging Wook-Hee Kim, Jinwoong Kim, Woongki Baek, Beomseok Nam, Ulsan National Institute of Science and Technology (UNIST) {okie90,jwkim,wbaek,bsnam}@unist.ac.kr

More information

Accessing NVM Locally and over RDMA Challenges and Opportunities

Accessing NVM Locally and over RDMA Challenges and Opportunities Accessing NVM Locally and over RDMA Challenges and Opportunities Wendy Elsasser Megan Grodowitz William Wang MSST - May 2018 Emerging NVM A wide variety of technologies with varied characteristics Address

More information

BUFFER HASH KV TABLE

BUFFER HASH KV TABLE BUFFER HASH KV TABLE CHEAP AND LARGE CAMS FOR HIGH PERFORMANCE DATA-INTENSIVE NETWORKED SYSTEMS PAPER BY ASHOK ANAND, CHITRA MUTHUKRISHNAN, STEVEN KAPPES, ADITYA AKELLA AND SUMAN NATH PRESENTED BY PRAMOD

More information

TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions

TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions Yige Hu, Zhiting Zhu, Ian Neal, Youngjin Kwon, Tianyu Chen, Vijay Chidambaram, Emmett Witchel The University of Texas at Austin

More information

High Performance Transactions in Deuteronomy

High Performance Transactions in Deuteronomy High Performance Transactions in Deuteronomy Justin Levandoski, David Lomet, Sudipta Sengupta, Ryan Stutsman, and Rui Wang Microsoft Research Overview Deuteronomy: componentized DB stack Separates transaction,

More information

Outline. Exploiting Program Parallelism. The Hydra Approach. Data Speculation Support for a Chip Multiprocessor (Hydra CMP) HYDRA

Outline. Exploiting Program Parallelism. The Hydra Approach. Data Speculation Support for a Chip Multiprocessor (Hydra CMP) HYDRA CS 258 Parallel Computer Architecture Data Speculation Support for a Chip Multiprocessor (Hydra CMP) Lance Hammond, Mark Willey and Kunle Olukotun Presented: May 7 th, 2008 Ankit Jain Outline The Hydra

More information

McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime

McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime B. Saha, A-R. Adl- Tabatabai, R. Hudson, C.C. Minh, B. Hertzberg PPoPP 2006 Introductory TM Sales Pitch Two legs

More information

Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory

Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory Pengfei Zuo, Yu Hua, and Jie Wu, Huazhong University of Science and Technology https://www.usenix.org/conference/osdi18/presentation/zuo

More information

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types Chapter 5 Multiprocessor Cache Coherence Thread-Level Parallelism 1: read 2: read 3: write??? 1 4 From ILP to TLP Memory System is Coherent If... ILP became inefficient in terms of Power consumption Silicon

More information

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance 6.823, L11--1 Cache Performance and Memory Management: From Absolute Addresses to Demand Paging Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Cache Performance 6.823,

More information

RISC-V Support for Persistent Memory Systems

RISC-V Support for Persistent Memory Systems RISC-V Support for Persistent Memory Systems Matheus Ogleari, Storage Architecture Engineer June 26, 2018 7/6/2018 Persistent Memory State-of-the-art, hybrid memory + storage properties Supported by hardware

More information

arxiv: v1 [cs.db] 19 Jan 2019

arxiv: v1 [cs.db] 19 Jan 2019 Guaranteeing Recoverability via Partially Constrained Transaction Logs Huan Zhou, Jinwei Guo, Huiqi Hu, Weining Qian, Xuan Zhou, Aoying Zhou School of Data Science & Engineering East China Normal University

More information

Page 1. Multilevel Memories (Improving performance using a little cash )

Page 1. Multilevel Memories (Improving performance using a little cash ) Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency

More information

Topics. " Start using a write-ahead log on disk " Log all updates Commit

Topics.  Start using a write-ahead log on disk  Log all updates Commit Topics COS 318: Operating Systems Journaling and LFS Copy on Write and Write Anywhere (NetApp WAFL) File Systems Reliability and Performance (Contd.) Jaswinder Pal Singh Computer Science epartment Princeton

More information

TUNING CUDA APPLICATIONS FOR MAXWELL

TUNING CUDA APPLICATIONS FOR MAXWELL TUNING CUDA APPLICATIONS FOR MAXWELL DA-07173-001_v6.5 August 2014 Application Note TABLE OF CONTENTS Chapter 1. Maxwell Tuning Guide... 1 1.1. NVIDIA Maxwell Compute Architecture... 1 1.2. CUDA Best Practices...2

More information

JANUARY 20, 2016, SAN JOSE, CA. Microsoft. Microsoft SQL Hekaton Towards Large Scale Use of PM for In-memory Databases

JANUARY 20, 2016, SAN JOSE, CA. Microsoft. Microsoft SQL Hekaton Towards Large Scale Use of PM for In-memory Databases JANUARY 20, 2016, SAN JOSE, CA PRESENTATION Cristian TITLE Diaconu GOES HERE Microsoft Microsoft SQL Hekaton Towards Large Scale Use of PM for In-memory Databases What this talk is about Trying to answer

More information

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

More information

Portland State University ECE 588/688. Graphics Processors

Portland State University ECE 588/688. Graphics Processors Portland State University ECE 588/688 Graphics Processors Copyright by Alaa Alameldeen 2018 Why Graphics Processors? Graphics programs have different characteristics from general purpose programs Highly

More information

Tessellation OS. Present and Future. Juan A. Colmenares, Sarah Bird, Andrew Wang, Krste Asanovid, John Kubiatowicz [Parallel Computing Laboratory]

Tessellation OS. Present and Future. Juan A. Colmenares, Sarah Bird, Andrew Wang, Krste Asanovid, John Kubiatowicz [Parallel Computing Laboratory] Tessellation OS Present and Future Juan A. Colmenares, Sarah Bird, Andrew Wang, Krste Asanovid, John Kubiatowicz [Parallel Computing Laboratory] Steven Hofmeyr, John Shalf [Lawrence Berkeley National Laboratory]

More information

Lecture 2: CUDA Programming

Lecture 2: CUDA Programming CS 515 Programming Language and Compilers I Lecture 2: CUDA Programming Zheng (Eddy) Zhang Rutgers University Fall 2017, 9/12/2017 Review: Programming in CUDA Let s look at a sequential program in C first:

More information

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS DANIEL SANCHEZ MIT CSAIL IAP MEETING MAY 21, 2013 Research Agenda Lack of technology progress Moore s Law still alive Power

More information

Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise)

Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise) Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise) Application vs. Pure Workloads Benchmarks that reproduce application workloads Assist in

More information

Better I/O Through Byte-Addressable, Persistent Memory

Better I/O Through Byte-Addressable, Persistent Memory Better I/O Through Byte-Addressable, Persistent Memory Jeremy Condit Edmund B. Nightingale Christopher Frost Engin Ipek Benjamin Lee Doug Burger Derrick Coetzee Microsoft Research UCLA ABSTRACT Modern

More information

ThyNVM. Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems

ThyNVM. Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems ThyNVM Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and Onur Mutlu TWO- LEVEL STORAGE MODEL MEMORY CPU STORAGE

More information

EECS 570 Final Exam - SOLUTIONS Winter 2015

EECS 570 Final Exam - SOLUTIONS Winter 2015 EECS 570 Final Exam - SOLUTIONS Winter 2015 Name: unique name: Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: # Points 1 / 21 2 / 32

More information

Distributed Shared Persistent Memory

Distributed Shared Persistent Memory Distributed Shared Persistent Memory (SoCC 17) Yizhou Shan, Yiying Zhang Persistent Memory (PM/NVM) Byte Addressable Persistent CPU Cache Low Latency Capacity Cost effective PM DRAM 2 Many PM Work, but

More information

Evaluating Phase Change Memory for Enterprise Storage Systems

Evaluating Phase Change Memory for Enterprise Storage Systems Hyojun Kim Evaluating Phase Change Memory for Enterprise Storage Systems IBM Almaden Research Micron provided a prototype SSD built with 45 nm 1 Gbit Phase Change Memory Measurement study Performance Characteris?cs

More information

Lecture 4: Directory Protocols and TM. Topics: corner cases in directory protocols, lazy TM

Lecture 4: Directory Protocols and TM. Topics: corner cases in directory protocols, lazy TM Lecture 4: Directory Protocols and TM Topics: corner cases in directory protocols, lazy TM 1 Handling Reads When the home receives a read request, it looks up memory (speculative read) and directory in

More information

Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Active thread Idle thread

Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Active thread Idle thread Intra-Warp Compaction Techniques Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Goal Active thread Idle thread Compaction Compact threads in a warp to coalesce (and eliminate)

More information

EE382 Processor Design. Processor Issues for MP

EE382 Processor Design. Processor Issues for MP EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors, Part I EE 382 Processor Design Winter 98/99 Michael Flynn 1 Processor Issues for MP Initialization Interrupts Virtual Memory TLB Coherency

More information

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section

More information

Lab 4 File System. CS140 February 27, Slides adapted from previous quarters

Lab 4 File System. CS140 February 27, Slides adapted from previous quarters Lab 4 File System CS140 February 27, 2015 Slides adapted from previous quarters Logistics Lab 3 was due at noon today Lab 4 is due Friday, March 13 Overview Motivation Suggested Order of Implementation

More information

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) ) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All

More information