ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors
|
|
- Sherman Robinson
- 6 years ago
- Views:
Transcription
1 ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors Milos Prvulovic, Zheng Zhang*, Josep Torrellas University of Illinois at Urbana-Champaign *Hewlett-Packard Laboratories
2 Motivation Availability & Reliability increasingly important Frequency, Feature Size Errors Complexity, Verification Cost Errors Multiprocessors Errors Global software-only recovery too slow Can hardware help? Prvulovic et al. ReVive: Cost Effective Rollback Recovery 2
3 Motivation Cost vs. Performance vs. Availability Low Cost Simple changes to a few key components Low Performance Overhead Handle frequent operations in hardware High Availability Fast recovery from a wide class of errors Prvulovic et al. ReVive: Cost Effective Rollback Recovery 3
4 Contribution: New Scheme Low Cost HW changes only to directory controllers Memory overhead only 12.5% (with 7+1 parity) Low Performance Overhead Only 6% performance overhead on average High Availability Recovery from: system-wide transients, loss of one node Availability better than % (assuming 1 error/ day) Prvulovic et al. ReVive: Cost Effective Rollback Recovery 4
5 Overview of ReVive Entire main memory protected by distributed parity Like RAID-5, but in memory Periodically establish s a checkpoint c Main memory is the checkpoint state Write-back kdirty data from caches, save processor r context t Save overwritten data to enable restoring checkpoint When program execution modifies memory for 1st time Prvulovic et al. ReVive: Cost Effective Rollback Recovery 5
6 Distributed N+1 Parity Node 0 Node 1 Node N Parity Group Distributed to minimize contention Parity Data Data... Allocation Granularity: page Update Granularity: cache line Prvulovic et al. ReVive: Cost Effective Rollback Recovery 6
7 Distributed Parity Update in HW WB Line X XOR Dir Mem Par Dir Dir Par Ack Rd Wr Rd Wr Home of Line X Mem Mem XOR Home of parity for Line X Prvulovic et al. ReVive: Cost Effective Rollback Recovery 7
8 ReVive: Checkpoint Creation Timeline Checkpoint (<1ms for 2MB L2) Execute (100 ms) P0 <5μs <1μs ~400μs/MB ~20μs P1 P2 Time P3 Timer Interrupt Save CPU Context Write-Back Sync Prvulovic et al. ReVive: Cost Effective Rollback Recovery 8
9 Logging in HW RdExcl Line X Rd Line X Data Dir Wr Log Mem Note: Wr Log also updates the parity Home of Line X Prvulovic et al. ReVive: Cost Effective Rollback Recovery 9
10 Log Filtering Add L bit to directory entry of each line Clear all L bits on each checkpoint Set when logged Do not log if already set Not needed for correctness Can be only in directory cache Can be completely omitted Prvulovic et al. ReVive: Cost Effective Rollback Recovery 10
11 Classes of Recoverable Errors CPU Caches Dir Ctrl Revive Hardware Mem Ctrl DRAM Net Ctrl Module Interconnection Network Can recover from (Trans + perm) errors in 1 node Trans errors in N nodes Prvulovic et al. ReVive: Cost Effective Rollback Recovery 11
12 Permanent Node Loss: Recovery Unavailable (~840ms) Degraded P0 Execute Detection Repair Log Repair Data P1 100ms 80ms ~100ms ~490ms ~20s P2 Time P3 Checkpoint Bzzzt! HW Rollback Prvulovic et al. ReVive: Cost Effective Rollback Recovery 12
13 Evaluation Setup Splash-2 benchmarks 16 superscalar processors (6-issue at 1GHz) 16kB L1 cache, 512kB L2 cache 2-D torus network, virtual cut-through routing 100MHz DDR SDRAM Using 7+1 distributed parity Checkpoint interval: 10ms and infinite Prvulovic et al. ReVive: Cost Effective Rollback Recovery 13
14 Performance Overhead 25% 20% 15% 10% 5% L2 Misses/1000 Instructions 5 6 Ckp+Log Par 9 Cp10ms 7+1 CpInf 7+1 Cp10ms 1+1 CpInf 1+1 Exec. Time Increase 0% Barnes Cho holesky FFT FMM LU Ocean Rad adiosity Radix Ray aytrace Volrend Wat ater-n2 Wat ater-sp Average Tolerable 6% performance overhead Prvulovic et al. ReVive: Cost Effective Rollback Recovery 14
15 Worst-Case Recovery Time Millis seconds Reconstruct (Log) Reconstruct (Mem) Undo 0 Barnes Cholesky FFT FMM LU Ocean Radiosity Redo Work Radix Raytrace Volrend HW Repair Water-N2 Water-Sp Average Radix: 590ms + 180ms + 50ms = 820ms % 999% availability Prvulovic et al. ReVive: Cost Effective Rollback Recovery 15
16 Network Traffic Bytes/Instr PAR Ckp WB Exe WB RD/RDX 0 Barnes Cholesky FFT FMM LU Ocean Radiosity Radix Raytrace Volrend Water-N2 Water-Sp H Mean Prvulovic et al. ReVive: Cost Effective Rollback Recovery 16
17 Memory Traffic Bytes/Instr 2.0 PAR LOG 1.0 Ckp WB Exe WB RD/RDX 0 Barnes Cholesky FFT FMM LU Ocean Radiosity Radix Raytrace Volrend Water-N2 Water-Sp H Mean Prvulovic et al. ReVive: Cost Effective Rollback Recovery 17
18 Related Work Device- or problem-specific p schemes DIVA, Redundant Multithreading, Slipstream, ECC, etc. ReVive can handle errors that escape these schemes, improving overall availability at low additional cost Other system-recovery schemes Plank et al. - N+1 parity in software Masubuchi et al. - logging with bus-snooper SafetyNet Prvulovic et al. ReVive: Cost Effective Rollback Recovery 18
19 Related Work: SafetyNet Types of recoverable errors ReVive: Permanent (loss of a node)+transient SafetyNet: Transient; perm only w/ redundant devices HW modifications ReVive: Directory controller only SafetyNet: Memory, caches, coherence protocol Performance Overhead 6% with ReVive, negligible with SafetyNet Prvulovic et al. ReVive: Cost Effective Rollback Recovery 19
20 Conclusions Recovery from: system-wide transients, loss of 1 node Availability better than % Low performance overhead (6% on average) HW changes only to directory controllers Memory overhead 12.5% with 7+1 parity Overhead can be reduced by increasing parity groups Prvulovic et al. ReVive: Cost Effective Rollback Recovery 20
21 ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors Milos Prvulovic, Zheng Zhang, Josep Torrellas v c
22 Rollback Recovery in Multiprocessors Checkpoint Consistency Global, Local Coordinated or Local Uncoordinated Checkpoint Separation Full or Partial Partial can be with Logging, Renaming or Buffering Checkpoint Storage Safe External, Safe Internal or for a Specialized Error Class Prvulovic et al. ReVive: Cost Effective Rollback Recovery 22
23 Checkpoint Consistency Global Synchronization is fast enough on shared-memory machines All synchronize to make a single consistent checkpoint Local Coordinated Synchronize as needed for a set of consistent checkpoints Local Uncoordinated Do not synchronize Set of consistent checkpoints computed when recovering Prvulovic et al. ReVive: Cost Effective Rollback Recovery 23
24 Checkpoint Storage Safe External (e.g. RAID) Not fast enough Recovery data on redundancy protected-disk Safe Internal (e.g. DRAM) Recovery data in redundancy-protected memory Unsafe Internal Not general enough Recovery data not protected t dby redundancy d Assumes memory content survives errors Prvulovic et al. ReVive: Cost Effective Rollback Recovery 24
25 Checkpoint Separation Full Too much storage needed Checkpoint and working data sets do not intersect Partial with Buffering Commit atomicity, overhead Buffer non-checkpoint data, flush to commit Partial with Renaming Rename to avoid overwriting checkpoint data Partial with Logging Save overwritten checkpoint data in a log Complex HW or coarse grain Prvulovic et al. ReVive: Cost Effective Rollback Recovery 25
26 Log & Parity Update Races Error while log update in progress Must fully perform log update before starting overwrite Error while parity update in progress Assume a single node fails Can recover r either old or new content t Both result in consistent recovery (see paper) Long error detection latency Keep sufficient logs to recover far enough into the past Prvulovic et al. ReVive: Cost Effective Rollback Recovery 26
27 Availability vs Overhead If checkpoint interval too short Lost work and hardware self-check dominate recovery Fault-free execution performance suffers If checkpoint interval too long Low availability Find a good balance Checkpoint intervals of 100ms to 1s Prvulovic et al. ReVive: Cost Effective Rollback Recovery 27
28 Analysis Cache size vs. checkpoint interval 512kB caches with checkpoints every 10ms 5MB caches with checkpoints every 100ms Log size vs. checkpoint interval Log will grow in sub-linear proportion to interval size 10ms: <3MB per node, only two apps >128kB per node Parity overhead: 12.5% of system memory is parity Prvulovic et al. ReVive: Cost Effective Rollback Recovery 28
ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors
To appear in the Proceedings of the 29th Annual International Symposium on Computer Architecture (ISCA-29) May 2002 ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors
More informationLight64: Ligh support for data ra. Darko Marinov, Josep Torrellas. a.cs.uiuc.edu
: Ligh htweight hardware support for data ra ce detection ec during systematic testing Adrian Nistor, Darko Marinov, Josep Torrellas University of Illinois, Urbana Champaign http://iacoma a.cs.uiuc.edu
More informationThe Design Complexity of Program Undo Support in a General Purpose Processor. Radu Teodorescu and Josep Torrellas
The Design Complexity of Program Undo Support in a General Purpose Processor Radu Teodorescu and Josep Torrellas University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu Processor with program
More informationVulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically
Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically Abdullah Muzahid, Shanxiang Qi, and University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu MICRO December
More informationOutline. Failure Types
Outline Database Tuning Nikolaus Augsten University of Salzburg Department of Computer Science Database Group 1 Unit 10 WS 2013/2014 Adapted from Database Tuning by Dennis Shasha and Philippe Bonnet. Nikolaus
More informationSpeculative Synchronization: Applying Thread Level Speculation to Parallel Applications. University of Illinois
Speculative Synchronization: Applying Thread Level Speculation to Parallel Applications José éf. Martínez * and Josep Torrellas University of Illinois ASPLOS 2002 * Now at Cornell University Overview Allow
More informationJosé F. Martínez 1, Jose Renau 2 Michael C. Huang 3, Milos Prvulovic 2, and Josep Torrellas 2
CHERRY: CHECKPOINTED EARLY RESOURCE RECYCLING José F. Martínez 1, Jose Renau 2 Michael C. Huang 3, Milos Prvulovic 2, and Josep Torrellas 2 1 2 3 MOTIVATION Problem: Limited processor resources Goal: More
More informationECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Availability. Copyright 2010 Daniel J. Sorin Duke University
Advanced Computer Architecture II (Parallel Computer Architecture) Availability Copyright 2010 Daniel J. Sorin Duke University Definition and Motivation Outline General Principles of Available System Design
More informationFAULT TOLERANT SYSTEMS
FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 17 - Checkpointing II Chapter 6 - Checkpointing Part.17.1 Coordinated Checkpointing Uncoordinated checkpointing may lead
More informationCS5460: Operating Systems Lecture 20: File System Reliability
CS5460: Operating Systems Lecture 20: File System Reliability File System Optimizations Modern Historic Technique Disk buffer cache Aggregated disk I/O Prefetching Disk head scheduling Disk interleaving
More informationShared Memory Multiprocessors. Symmetric Shared Memory Architecture (SMP) Cache Coherence. Cache Coherence Mechanism. Interconnection Network
Shared Memory Multis Processor Processor Processor i Processor n Symmetric Shared Memory Architecture (SMP) cache cache cache cache Interconnection Network Main Memory I/O System Cache Coherence Cache
More informationh Coherence Controllers
High-Throughput h Coherence Controllers Anthony-Trung Nguyen Microprocessor Research Labs Intel Corporation 9/30/03 Motivations Coherence Controller (CC) throughput is bottleneck of scalable systems. CCs
More informationPageForge: A Near-Memory Content- Aware Page-Merging Architecture
PageForge: A Near-Memory Content- Aware Page-Merging Architecture Dimitrios Skarlatos, Nam Sung Kim, and Josep Torrellas University of Illinois at Urbana-Champaign MICRO-50 @ Boston Motivation: Server
More informationSnatch: Opportunistically Reassigning Power Allocation between Processor and Memory in 3D Stacks
Snatch: Opportunistically Reassigning Power Allocation between and in 3D Stacks Dimitrios Skarlatos, Renji Thomas, Aditya Agrawal, Shibin Qin, Robert Pilawa, Ulya Karpuzcu, Radu Teodorescu, Nam Sung Kim,
More informationA Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures
A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures Ricardo Fernández-Pascual, José M. García, Manuel E. Acacio Dpto. Ingeniería y Tecnología de Computadores Universidad de Murcia SPAIN
More informationPage 1 FAULT TOLERANT SYSTEMS. Coordinated Checkpointing. Time-Based Synchronization. A Coordinated Checkpointing Algorithm
FAULT TOLERANT SYSTEMS Coordinated http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Chapter 6 II Uncoordinated checkpointing may lead to domino effect or to livelock Example: l P wants to take a
More informationThe Common Case Transactional Behavior of Multithreaded Programs
The Common Case Transactional Behavior of Multithreaded Programs JaeWoong Chung Hassan Chafi,, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, Christos Kozyrakis, Kunle Olukotun Computer Systems Lab
More informationDBS related failures. DBS related failure model. Introduction. Fault tolerance
16 Logging and Recovery in Database systems 16.1 Introduction: Fail safe systems 16.1.1 Failure Types and failure model 16.1.2 DBS related failures 16.2 DBS Logging and Recovery principles 16.2.1 The Redo
More informationLect. 6: Directory Coherence Protocol
Lect. 6: Directory Coherence Protocol Snooping coherence Global state of a memory line is the collection of its state in all caches, and there is no summary state anywhere All cache controllers monitor
More informationFAULT TOLERANT SYSTEMS
FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 18 Chapter 7 Case Studies Part.18.1 Introduction Illustrate practical use of methods described previously Highlight fault-tolerance
More informationSpeculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution
Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution Ravi Rajwar and Jim Goodman University of Wisconsin-Madison International Symposium on Microarchitecture, Dec. 2001 Funding
More informationCurrent Topics in OS Research. So, what s hot?
Current Topics in OS Research COMP7840 OSDI Current OS Research 0 So, what s hot? Operating systems have been around for a long time in many forms for different types of devices It is normally general
More informationSoftware-Controlled Multithreading Using Informing Memory Operations
Software-Controlled Multithreading Using Informing Memory Operations Todd C. Mowry Computer Science Department University Sherwyn R. Ramkissoon Department of Electrical & Computer Engineering University
More informationFlexBulk: Intelligently Forming Atomic Blocks in Blocked-Execution Multiprocessors to Minimize Squashes
FlexBulk: Intelligently Forming Atomic Blocks in Blocked-Execution Multiprocessors to Minimize Squashes Rishi Agarwal and Josep Torrellas University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu
More informationComputer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 13:
More informationThe Google File System
October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single
More informationFlexible Cache Error Protection using an ECC FIFO
Flexible Cache Error Protection using an ECC FIFO Doe Hyun Yoon and Mattan Erez Dept Electrical and Computer Engineering The University of Texas at Austin 1 ECC FIFO Goal: to reduce on-chip ECC overhead
More informationPrototyping Architectural Support for Program Rollback Using FPGAs
Prototyping Architectural Support for Program Rollback Using FPGAs Radu Teodorescu and Josep Torrellas http://iacoma.cs.uiuc.edu University of Illinois at Urbana-Champaign Motivation Problem: Software
More informationAppendix D: Storage Systems
Appendix D: Storage Systems Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Storage Systems : Disks Used for long term storage of files temporarily store parts of pgm
More informationMemory Mapped ECC Low-Cost Error Protection for Last Level Caches. Doe Hyun Yoon Mattan Erez
Memory Mapped ECC Low-Cost Error Protection for Last Level Caches Doe Hyun Yoon Mattan Erez 1-Slide Summary Reliability issues in caches Increasing soft error rate (SER) Cost increases with error protection
More informationCOSC 6385 Computer Architecture - Memory Hierarchies (III)
COSC 6385 Computer Architecture - Memory Hierarchies (III) Edgar Gabriel Spring 2014 Memory Technology Performance metrics Latency problems handled through caches Bandwidth main concern for main memory
More informationSnoop-Based Multiprocessor Design III: Case Studies
Snoop-Based Multiprocessor Design III: Case Studies Todd C. Mowry CS 41 March, Case Studies of Bus-based Machines SGI Challenge, with Powerpath SUN Enterprise, with Gigaplane Take very different positions
More informationVirtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili
Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All
More informationStorage. Hwansoo Han
Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics
More informationChapter 11: File System Implementation. Objectives
Chapter 11: File System Implementation Objectives To describe the details of implementing local file systems and directory structures To describe the implementation of remote file systems To discuss block
More informationReEnact: Using Thread-Level Speculation Mechanisms to Debug Data Races in Multithreaded Codes
Appears in the Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA-30), June 2003. ReEnact: Using Thread-Level Speculation Mechanisms to Debug Data Races in Multithreaded
More informationDistributed Systems
15-440 Distributed Systems 11 - Fault Tolerance, Logging and Recovery Tuesday, Oct 2 nd, 2018 Logistics Updates P1 Part A checkpoint Part A due: Saturday 10/6 (6-week drop deadline 10/8) *Please WORK hard
More informationScalable Cache Coherence
arallel Computing Scalable Cache Coherence Hwansoo Han Hierarchical Cache Coherence Hierarchies in cache organization Multiple levels of caches on a processor Large scale multiprocessors with hierarchy
More informationLecture 18: Reliable Storage
CS 422/522 Design & Implementation of Operating Systems Lecture 18: Reliable Storage Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions of
More informationDBT Tool. DBT Framework
Thread-Safe Dynamic Binary Translation using Transactional Memory JaeWoong Chung,, Michael Dalton, Hari Kannan, Christos Kozyrakis Computer Systems Laboratory Stanford University http://csl.stanford.edu
More informationCache Coherence. CMU : Parallel Computer Architecture and Programming (Spring 2012)
Cache Coherence CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Shared memory multi-processor Processors read and write to shared variables - More precisely: processors issues
More informationAdvanced Memory Management
Advanced Memory Management Main Points Applications of memory management What can we do with ability to trap on memory references to individual pages? File systems and persistent storage Goals Abstractions
More informationLecture 25: Multiprocessors. Today s topics: Snooping-based cache coherence protocol Directory-based cache coherence protocol Synchronization
Lecture 25: Multiprocessors Today s topics: Snooping-based cache coherence protocol Directory-based cache coherence protocol Synchronization 1 Snooping-Based Protocols Three states for a block: invalid,
More informationDeterministic Replay and Data Race Detection for Multithreaded Programs
Deterministic Replay and Data Race Detection for Multithreaded Programs Dongyoon Lee Computer Science Department - 1 - The Shift to Multicore Systems 100+ cores Desktop/Server 8+ cores Smartphones 2+ cores
More informationCarnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Today s Class. Faloutsos/Pavlo CMU /615
Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#23: Crash Recovery Part 1 (R&G ch. 18) Last Class Basic Timestamp Ordering Optimistic Concurrency
More informationScalable Cache Coherence. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
Scalable Cache Coherence Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hierarchical Cache Coherence Hierarchies in cache organization Multiple levels
More informationDatabase Systems. November 2, 2011 Lecture #7. topobo (mit)
Database Systems November 2, 2011 Lecture #7 1 topobo (mit) 1 Announcement Assignment #2 due today Assignment #3 out today & due on 11/16. Midterm exam in class next week. Cover Chapters 1, 2,
More informationCOS 318: Operating Systems. NSF, Snapshot, Dedup and Review
COS 318: Operating Systems NSF, Snapshot, Dedup and Review Topics! NFS! Case Study: NetApp File System! Deduplication storage system! Course review 2 Network File System! Sun introduced NFS v2 in early
More informationI/O CANNOT BE IGNORED
LECTURE 13 I/O I/O CANNOT BE IGNORED Assume a program requires 100 seconds, 90 seconds for main memory, 10 seconds for I/O. Assume main memory access improves by ~10% per year and I/O remains the same.
More informationLogTM: Log-Based Transactional Memory
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet
More informationLecture 21: Logging Schemes /645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo
Lecture 21: Logging Schemes 15-445/645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo Crash Recovery Recovery algorithms are techniques to ensure database consistency, transaction
More informationDepartment of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.
Department of Computer Science, Institute for System Architecture, Operating Systems Group Real-Time Systems '08 / '09 Hardware Marcus Völp Outlook Hardware is Source of Unpredictability Caches Pipeline
More informationLast Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications
Last Class Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Basic Timestamp Ordering Optimistic Concurrency Control Multi-Version Concurrency Control C. Faloutsos A. Pavlo Lecture#23:
More informationCOSC 6385 Computer Architecture. Storage Systems
COSC 6385 Computer Architecture Storage Systems Spring 2012 I/O problem Current processor performance: e.g. Pentium 4 3 GHz ~ 6GFLOPS Memory Bandwidth: 133 MHz * 4 * 64Bit ~ 4.26 GB/s Current network performance:
More informationLecture 2: Snooping and Directory Protocols. Topics: Snooping wrap-up and directory implementations
Lecture 2: Snooping and Directory Protocols Topics: Snooping wrap-up and directory implementations 1 Split Transaction Bus So far, we have assumed that a coherence operation (request, snoops, responses,
More informationARIES (& Logging) April 2-4, 2018
ARIES (& Logging) April 2-4, 2018 1 What does it mean for a transaction to be committed? 2 If commit returns successfully, the transaction is recorded completely (atomicity) left the database in a stable
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University Storage and Other I/O Topics I/O Performance Measures Types and Characteristics of I/O Devices Buses Interfacing I/O Devices
More informationRecoverability. Kathleen Durant PhD CS3200
Recoverability Kathleen Durant PhD CS3200 1 Recovery Manager Recovery manager ensures the ACID principles of atomicity and durability Atomicity: either all actions in a transaction are done or none are
More informationDatabase Management System
Database Management System Lecture 10 Recovery * Some materials adapted from R. Ramakrishnan, J. Gehrke and Shawn Bowers Basic Database Architecture Database Management System 2 Recovery Which ACID properties
More informationComputer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine
More informationSwizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems
1 Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems Ronald Dreslinski, Korey Sewell, Thomas Manville, Sudhir Satpathy, Nathaniel Pinckney, Geoff Blake, Michael Cieslak, Reetuparna
More informationComputer Science 146. Computer Architecture
Computer Science 46 Computer Architecture Spring 24 Harvard University Instructor: Prof dbrooks@eecsharvardedu Lecture 22: More I/O Computer Science 46 Lecture Outline HW5 and Project Questions? Storage
More informationNON-SPECULATIVE LOAD LOAD REORDERING IN TSO 1
NON-SPECULATIVE LOAD LOAD REORDERING IN TSO 1 Alberto Ros Universidad de Murcia October 17th, 2017 1 A. Ros, T. E. Carlson, M. Alipour, and S. Kaxiras, "Non-Speculative Load-Load Reordering in TSO". ISCA,
More informationCache Performance, System Performance, and Off-Chip Bandwidth... Pick any Two
Cache Performance, System Performance, and Off-Chip Bandwidth... Pick any Two Bushra Ahsan and Mohamed Zahran Dept. of Electrical Engineering City University of New York ahsan bushra@yahoo.com mzahran@ccny.cuny.edu
More informationZettabyte Reliability with Flexible End-to-end Data Integrity
Zettabyte Reliability with Flexible End-to-end Data Integrity Yupu Zhang, Daniel Myers, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau University of Wisconsin - Madison 5/9/2013 1 Data Corruption Imperfect
More informationDHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI
DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Computer Science and Engineering CS6302- DATABASE MANAGEMENT SYSTEMS Anna University 2 & 16 Mark Questions & Answers Year / Semester: II / III
More informationParallel SimOS: Scalability and Performance for Large System Simulation
Parallel SimOS: Scalability and Performance for Large System Simulation Ph.D. Oral Defense Robert E. Lantz Computer Systems Laboratory Stanford University 1 Overview This work develops methods to simulate
More informationIntroduction to cache memories
Course on: Advanced Computer Architectures Introduction to cache memories Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Summary Summary Main goal Spatial and temporal
More informationNOTES W2006 CPS610 DBMS II. Prof. Anastase Mastoras. Ryerson University
NOTES W2006 CPS610 DBMS II Prof. Anastase Mastoras Ryerson University Recovery Transaction: - a logical unit of work. (text). It is a collection of operations that performs a single logical function in
More informationStorage Systems. Storage Systems
Storage Systems Storage Systems We already know about four levels of storage: Registers Cache Memory Disk But we've been a little vague on how these devices are interconnected In this unit, we study Input/output
More informationLecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks
Lecture: Transactional Memory, Networks Topics: TM implementations, on-chip networks 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock
More informationMaking the Fast Case Common and the Uncommon Case Simple in Unbounded Transactional Memory
Making the Fast Case Common and the Uncommon Case Simple in Unbounded Transactional Memory Colin Blundell (University of Pennsylvania) Joe Devietti (University of Pennsylvania) E Christopher Lewis (VMware,
More informationI/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University
I/O, Disks, and RAID Yi Shi Fall 2017 Xi an Jiaotong University Goals for Today Disks How does a computer system permanently store data? RAID How to make storage both efficient and reliable? 2 What does
More informationP2FS: supporting atomic writes for reliable file system design in PCM storage
LETTER IEICE Electronics Express, Vol.11, No.13, 1 6 P2FS: supporting atomic writes for reliable file system design in PCM storage Eunji Lee 1, Kern Koh 2, and Hyokyung Bahn 2a) 1 Department of Software,
More informationLecture 13: Consistency Models. Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models
Lecture 13: Consistency Models Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models 1 Coherence Vs. Consistency Recall that coherence guarantees
More informationUltra Low-Cost Defect Protection for Microprocessor Pipelines
Ultra Low-Cost Defect Protection for Microprocessor Pipelines Smitha Shyam Kypros Constantinides Sujay Phadke Valeria Bertacco Todd Austin Advanced Computer Architecture Lab University of Michigan Key
More informationECE 669 Parallel Computer Architecture
ECE 669 Parallel Computer Architecture Lecture 9 Workload Evaluation Outline Evaluation of applications is important Simulation of sample data sets provides important information Working sets indicate
More informationOverview of Transaction Management
Overview of Transaction Management Chapter 16 Comp 521 Files and Databases Fall 2010 1 Database Transactions A transaction is the DBMS s abstract view of a user program: a sequence of database commands;
More informationSystem Challenges and Opportunities for Transactional Memory
System Challenges and Opportunities for Transactional Memory JaeWoong Chung Computer System Lab Stanford University My thesis is about Computer system design that help leveraging hardware parallelism Transactional
More informationFile System Consistency. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
File System Consistency Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Crash Consistency File system may perform several disk writes to complete
More informationThe Google File System (GFS)
1 The Google File System (GFS) CS60002: Distributed Systems Antonio Bruto da Costa Ph.D. Student, Formal Methods Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur 2 Design constraints
More informationATLAS: A Chip-Multiprocessor. with Transactional Memory Support
ATLAS: A Chip-Multiprocessor with Transactional Memory Support Njuguna Njoroge, Jared Casper, Sewook Wee, Yuriy Teslyar, Daxia Ge, Christos Kozyrakis, and Kunle Olukotun Transactional Coherence and Consistency
More informationApplication-Transparent Checkpoint/Restart for MPI Programs over InfiniBand
Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand Qi Gao, Weikuan Yu, Wei Huang, Dhabaleswar K. Panda Network-Based Computing Laboratory Department of Computer Science & Engineering
More informationIO System. CP-226: Computer Architecture. Lecture 25 (24 April 2013) CADSL
IO System Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationGoldibear and the 3 Locks. Programming With Locks Is Tricky. More Lock Madness. And To Make It Worse. Transactional Memory: The Big Idea
Programming With Locks s Tricky Multicore processors are the way of the foreseeable future thread-level parallelism anointed as parallelism model of choice Just one problem Writing lock-based multi-threaded
More informationCOS 318: Operating Systems. Virtual Memory and Address Translation
COS 318: Operating Systems Virtual Memory and Address Translation Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Today s Topics
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Goal A Distributed Transaction We want a transaction that involves multiple nodes Review of transactions and their properties
More informationAccelerating Multi-core Processor Design Space Evaluation Using Automatic Multi-threaded Workload Synthesis
Accelerating Multi-core Processor Design Space Evaluation Using Automatic Multi-threaded Workload Synthesis Clay Hughes & Tao Li Department of Electrical and Computer Engineering University of Florida
More informationFile System Consistency
File System Consistency Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3052: Introduction to Operating Systems, Fall 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationDatacenter replication solution with quasardb
Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION
More informationPhoenix: Detecting and Recovering from Permanent Processor Design Bugs with Programmable Hardware
Phoenix: Detecting and Recovering from Permanent Processor Design Bugs with Programmable Hardware Smruti R. Sarangi Abhishek Tiwari Josep Torrellas University of Illinois at Urbana-Champaign Can a Processor
More informationDistributed File Systems II
Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation
More informationSpeculative Locks. Dept. of Computer Science
Speculative Locks José éf. Martínez and djosep Torrellas Dept. of Computer Science University it of Illinois i at Urbana-Champaign Motivation Lock granularity a trade-off: Fine grain greater concurrency
More informationRECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E)
RECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E) 2 LECTURE OUTLINE Failures Recoverable schedules Transaction logs Recovery procedure 3 PURPOSE OF DATABASE RECOVERY To bring the database into the most
More informationMulticast Snooping: A Multicast Address Network. A New Coherence Method Using. With sponsorship and/or participation from. Mark Hill & David Wood
Multicast Snooping: A New Coherence Method Using A Multicast Address Ender Bilir, Ross Dickson, Ying Hu, Manoj Plakal, Daniel Sorin, Mark Hill & David Wood Computer Sciences Department University of Wisconsin
More informationHandout 3 Multiprocessor and thread level parallelism
Handout 3 Multiprocessor and thread level parallelism Outline Review MP Motivation SISD v SIMD (SIMT) v MIMD Centralized vs Distributed Memory MESI and Directory Cache Coherency Synchronization and Relaxed
More informationAdministrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review
Administrivia CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Homework #4 due Thursday answers posted soon after Exam #2 on Thursday, April 24 on memory hierarchy (Unit 4) and
More informationMicroarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors. Moinuddin K. Qureshi Onur Mutlu Yale N.
Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors Moinuddin K. Qureshi Onur Mutlu Yale N. Patt High Performance Systems Group Department of Electrical
More informationLecture 8: Transactional Memory TCC. Topics: lazy implementation (TCC)
Lecture 8: Transactional Memory TCC Topics: lazy implementation (TCC) 1 Other Issues Nesting: when one transaction calls another flat nesting: collapse all nested transactions into one large transaction
More information