ARCHITECTURE DESIGN FOR SOFT ERRORS
Shubu Mukherjee
AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO
ELSEVIER
Morgan Kaufmann Publishers is an imprint of Elsevier
Foreword
Preface

1 Introduction
- Overview
- Evidence of Soft Errors
- Types of Soft Errors
- Cost-Effective Solutions to Mitigate the Impact of Soft Errors
- Faults
- Errors
- Metrics
- Dependability Models
- Reliability
- Availability
- Miscellaneous Models
- Permanent Faults in Complementary Metal Oxide Semiconductor Technology
- Metal Failure Modes
- Gate Oxide Failure Modes
- Radiation-Induced Transient Faults in CMOS Transistors
- The Alpha Particle
- The Neutron
- Interaction of Alpha Particles and Neutrons with Silicon Crystals
- Architectural Fault Models for Alpha Particle and Neutron Strikes
- Silent Data Corruption and Detected Unrecoverable Error
- Basic Definitions: SDC and DUE
- SDC and DUE Budgets
- 1.10 Soft Error Scaling Trends
- SRAM and Latch Scaling Trends
- DRAM Scaling Trends
- Summary
- Historical Anecdote
- References

2 Device- and Circuit-Level Modeling, Measurement, and Mitigation
- Overview
- Modeling Circuit-Level SERs
- Impact of Alpha Particle or Neutron on Circuit Elements
- Critical Charge (Qcrit)
- Timing Vulnerability Factor
- Masking Effects in Combinatorial Logic Gates
- Vulnerability of Clock Circuits
- Measurement
- Field Data Collection
- Accelerated Alpha Particle Tests
- Accelerated Neutron Tests
- Mitigation Techniques
- Device Enhancements
- Circuit Enhancements
- Summary
- Historical Anecdote
- References

3 Architectural Vulnerability Analysis
- Overview
- AVF Basics
- Does a Bit Matter?
- SDC and DUE Equations
- Bit-Level SDC and DUE FIT Equations
- Chip-Level SDC and DUE FIT Equations
- False DUE AVF
- Case Study: False DUE from Lockstepped Checkers
- Process-Kill versus System-Kill DUE AVF
- ACE Principles
- Types of ACE and Un-ACE Bits
- Point-of-Strike Model versus Propagated Fault Model
- 3.6 Microarchitectural Un-ACE Bits
- Idle or Invalid State
- Misspeculated State
- Predictor Structures
- Ex-ACE State
- 3.7 Architectural Un-ACE Bits
- NOP Instructions
- Performance-Enhancing Operations
- Predicated False Instructions
- Dynamically Dead Instructions
- Logical Masking
- AVF Equations for a Hardware Structure
- Computing AVF with Little's Law
- Implications of Little's Law for AVF Computation
- Computing AVF with a Performance Model
- Limitations of AVF Analysis with Performance Models
- ACE Analysis Using the Point-of-Strike Fault Model
- AVF Results from an Itanium 2 Performance Model
- ACE Analysis Using the Propagated Fault Model
- Summary
- Historical Anecdote
- References

4 Advanced Architectural Vulnerability Analysis
- Overview
- Lifetime Analysis of RAM Arrays
- Basic Idea of Lifetime Analysis
- Accounting for Structural Differences in Lifetime Analysis
- Impact of Working Set Size for Lifetime Analysis
- Granularity of Lifetime Analysis
- Computing the DUE AVF
- Lifetime Analysis of CAM Arrays
- Handling False-Positive Matches in a CAM Array
- Handling False-Negative Matches in a CAM Array
- Effect of Cooldown in Lifetime Analysis
- AVF Results for Cache, Data Translation Buffer, and Store Buffer
- Unknown Components
- RAM Arrays
- CAM Arrays
- DUE AVF
- Computing AVFs Using SFI into an RTL Model
- Comparison of Fault Injection and ACE Analyses
- Random Sampling in SFI
- Determining if an Injected Fault Will Result in an Error
- Case Study of SFI
- The Illinois SFI Study
- SFI Methodology
- Transient Faults in Pipeline State
- Transient Faults in Logic Blocks
- 4.8 Summary
- Historical Anecdote
- References

5 Error Coding Techniques
- Overview
- Fault Detection and ECC for State Bits
- Basics of Error Coding
- Error Detection Using Parity Codes
- Single-Error Correction Codes
- Single-Error Correct Double-Error Detect Code
- Double-Error Correct Triple-Error Detect Code
- Cyclic Redundancy Check
- Error Detection Codes for Execution Units
- AN Codes
- Residue Codes
- Parity Prediction Circuits
- Implementation Overhead of Error Detection and Correction Codes
- Number of Logic Levels
- Overhead in Area
- Scrubbing Analysis
- DUE FIT from Temporal Double-Bit Error with No Scrubbing
- DUE Rate from Temporal Double-Bit Error with Fixed-Interval Scrubbing
- Detecting False Errors
- Sources of False DUE Events in a Microprocessor Pipeline
- Mechanism to Propagate Error Information
- Distinguishing False Errors from True Errors
- Hardware Assertions
- Machine Check Architecture
- Informing the OS of an Error
- Recording Information about the Error
- Isolating the Error
- Summary
- Historical Anecdote
- References

6 Fault Detection via Redundant Execution
- Overview
- Sphere of Replication
- Components of the Sphere of Replication
- The Size of Sphere of Replication
- Output Comparison and Input Replication
- 6.3 Fault Detection via Cycle-by-Cycle Lockstepping
- Advantages of Lockstepping
- Disadvantages of Lockstepping
- Lockstepping in the Stratus ftServer
- Lockstepping in the Hewlett-Packard NonStop Himalaya Architecture
- Lockstepping in the IBM Z-series Processors
- Fault Detection via RMT
- RMT in the Marathon Endurance Server
- RMT in the Hewlett-Packard NonStop Advanced Architecture
- RMT Within a Single-Processor Core
- A Simultaneous Multithreaded Processor
- Design Space for SMT in a Single Core
- Output Comparison in an SRT Processor
- Input Replication in an SRT Processor
- Input Replication of Cached Load Data
- Two Techniques to Enhance Performance of an SRT Processor
- Performance Evaluation of an SRT Processor
- Alternate Single-Core RMT Implementation
- RMT in a Multicore Architecture
- DIVA: RMT Using Specialized Checker Processor
- RMT Enhancements
- Relaxed Input Replication
- Relaxed Output Comparison
- Partial RMT
- Summary
- Historical Anecdote
- References

7 Hardware Error Recovery
- Overview
- Classification of Hardware Error Recovery Schemes
- Reboot
- Forward Error Recovery
- Backward Error Recovery
- Forward Error Recovery
- Fail-Over Systems
- DMR with Recovery
- Triple Modular Redundancy
- Pair-and-Spare
- Backward Error Recovery with Fault Detection Before Register Commit
- Fujitsu SPARC64 V: Parity with Retry
- IBM Z-Series: Lockstepping with Retry
- Simultaneous and Redundantly Threaded Processor with Recovery
- Chip-Level Redundantly Threaded Processor with Recovery (CRTR)
- Exposure Reduction via Pipeline Squash
- Fault Screening with Pipeline Squash and Re-execution
- Backward Error Recovery with Fault Detection before Memory Commit
- Incremental Checkpointing Using a History Buffer
- Periodic Checkpointing with Fingerprinting
- Backward Error Recovery with Fault Detection before I/O Commit
- LVQ-Based Recovery in an SRT Processor
- ReVive: Backward Error Recovery Using Global Checkpoints
- SafetyNet: Backward Error Recovery Using Local Checkpoints
- Backward Error Recovery with Fault Detection after I/O Commit
- Summary
- Historical Anecdote
- References

8 Software Detection and Recovery
- Overview
- Fault Detection Using SIS
- Fault Detection Using Software RMT
- Error Detection by Duplicated Instructions
- Software-Implemented Fault Tolerance
- Configurable Transient Fault Detection via Dynamic Binary Translation
- Fault Detection Using Hybrid RMT
- CRAFT: A Hybrid RMT Implementation
- CRAFT Evaluation
- Fault Detection Using RVMs
- Application-Level Recovery
- Forward Error Recovery Using Software RMT and AN Codes for Fault Detection
- Log-Based Backward Error Recovery in Database Systems
- Checkpoint-Based Backward Error Recovery for Shared-Memory Programs
- OS-Level and VMM-Level Recoveries
- Summary
- References

Index
Transient Fault Detection and Reducing Transient Error Rate. Jose Lugo-Martinez CSE 240C: Advanced Microarchitecture Prof.
Transient Fault Detection and Reducing Transient Error Rate Jose Lugo-Martinez CSE 240C: Advanced Microarchitecture Prof. Steven Swanson Outline Motivation What are transient faults? Hardware Fault Detection
More informationReliable Architectures
6.823, L24-1 Reliable Architectures Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 6.823, L24-2 Strike Changes State of a Single Bit 10 6.823, L24-3 Impact
More informationECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Availability. Copyright 2010 Daniel J. Sorin Duke University
Advanced Computer Architecture II (Parallel Computer Architecture) Availability Copyright 2010 Daniel J. Sorin Duke University Definition and Motivation Outline General Principles of Available System Design
More informationUsing Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance
Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance Outline Introduction and Motivation Software-centric Fault Detection Process-Level Redundancy Experimental Results
More informationComputers as Components Principles of Embedded Computing System Design
Computers as Components Principles of Embedded Computing System Design Third Edition Marilyn Wolf ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY
More informationComputer Architecture A Quantitative Approach
Computer Architecture A Quantitative Approach Third Edition John L. Hennessy Stanford University David A. Patterson University of California at Berkeley With Contributions by David Goldberg Xerox Palo
More informationFAULT TOLERANT SYSTEMS
FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 18 Chapter 7 Case Studies Part.18.1 Introduction Illustrate practical use of methods described previously Highlight fault-tolerance
More informationComputer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 13:
More informationEmbedded Systems Architecture
Embedded Systems Architecture A Comprehensive Guide for Engineers and Programmers By Tammy Noergaard ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE
More informationAR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors
AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors Computer Sciences Department University of Wisconsin Madison http://www.cs.wisc.edu/~ericro/ericro.html ericro@cs.wisc.edu High-Performance
More informationIlan Beer. IBM Haifa Research Lab 27 Oct IBM Corporation
Ilan Beer IBM Haifa Research Lab 27 Oct. 2008 As the semiconductors industry progresses deeply into the sub-micron technology, vulnerability of chips to soft errors is growing In high reliability systems,
More informationM (~ Computer Organization and Design ELSEVIER. David A. Patterson. John L. Hennessy. University of California, Berkeley. Stanford University
T H I R D EDITION REVISED Computer Organization and Design THE HARDWARE/SOFTWARE INTERFACE David A. Patterson University of California, Berkeley John L. Hennessy Stanford University With contributions
More informationFPGAs: Instant Access
FPGAs: Instant Access Clive"Max"Maxfield AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO % ELSEVIER Newnes is an imprint of Elsevier Newnes Contents
More informationFault Tolerant Computing. Prof. David August/Prof. David Walker. Without the Transistor. Transistors.
Fault Tolerant Computing Prof. David August/Prof. David Walker 2 3 Without the Transistor Transistors 4 http://www.ominous-valve.com/vtsc.html 1 Basic MOSFET Transistor Semiconductors Pure semiconductors
More informationJeremy W. Sheaffer 1 David P. Luebke 2 Kevin Skadron 1. University of Virginia Computer Science 2. NVIDIA Research
A Hardware Redundancy and Recovery Mechanism for Reliable Scientific Computation on Graphics Processors Jeremy W. Sheaffer 1 David P. Luebke 2 Kevin Skadron 1 1 University of Virginia Computer Science
More informationTechniques to Reduce the Soft Error Rate of a High-Performance Microprocessor
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor Abstract Transient faults due to neutron and alpha particle strikes pose a significant obstacle to increasing processor transistor
More informationWITH the continuous decrease of CMOS feature size and
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 5, MAY 2012 777 IVF: Characterizing the Vulnerability of Microprocessor Structures to Intermittent Faults Songjun Pan, Student
More informationProgramming 8-bit PIC Microcontrollers in С
Programming 8-bit PIC Microcontrollers in С with Interactive Hardware Simulation Martin P. Bates älllllltlilisft &Щ*лЛ AMSTERDAM BOSTON HEIDELBERG LONDON ^^Ш NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationFAULT TOLERANT SYSTEMS
FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 5 Processor-Level Techniques & Byzantine Failures Chapter 2 Hardware Fault Tolerance Part.5.1 Processor-Level Techniques
More informationAn Introduction to Parallel Programming
F 'C 3 R'"'C,_,. HO!.-IJJ () An Introduction to Parallel Programming Peter S. Pacheco University of San Francisco ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationPROBABILITY THAT A FAULT WILL CAUSE A DECLARED ERROR. THE FIRST
REDUCING THE SOFT-ERROR RATE OF A HIGH-PERFORMANCE MICROPROCESSOR UNLIKE TRADITIONAL APPROACHES, WHICH FOCUS ON DETECTING AND RECOVERING FROM FAULTS, THE TECHNIQUES INTRODUCED HERE REDUCE THE PROBABILITY
More informationECE 574 Cluster Computing Lecture 19
ECE 574 Cluster Computing Lecture 19 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 10 November 2015 Announcements Projects HW extended 1 MPI Review MPI is *not* shared memory
More informationInformation Modeling and Relational Databases
Information Modeling and Relational Databases Second Edition Terry Halpin Neumont University Tony Morgan Neumont University AMSTERDAM» BOSTON. HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationDigital System Design with SystemVerilog
Digital System Design with SystemVerilog Mark Zwolinski AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Capetown Sydney Tokyo
More informationComputing Architectural Vulnerability Factors for Address-Based Structures
Computing Architectural Vulnerability Factors for Address-Based Structures Arijit Biswas 1, Paul Racunas 1, Razvan Cheveresan 2, Joel Emer 3, Shubhendu S. Mukherjee 1 and Ram Rangan 4 1 FACT Group, Intel
More informationABSTRACT. Reducing the Soft Error Rates of a High-Performance Microprocessor Using Front-End Throttling
ABSTRACT Title of Thesis: Reducing the Soft Error Rates of a High-Performance Microprocessor Using Front-End Throttling Smitha M Kalappurakkal, Master of Science, 2006 Thesis directed by: Professor Manoj
More informationRedundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992
Redundancy in fault tolerant computing D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 1 Redundancy Fault tolerance computing is based on redundancy HARDWARE REDUNDANCY Physical
More informationOutline. Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication. Outline
Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication Khanh N. Dang and Xuan-Tu Tran Email: khanh.n.dang@vnu.edu.vn VNU Key Laboratory for Smart Integrated Systems
More informationSystem Assurance. Beyond Detecting. Vulnerabilities. Djenana Campara. Nikolai Mansourov
System Assurance Beyond Detecting Vulnerabilities Nikolai Mansourov Djenana Campara ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SYDNEY TOKYO Morgan Kaufmann
More informationDetailed Design and Evaluation of Redundant Multithreading Alternatives*
Detailed Design and Evaluation of Redundant Multithreading Alternatives* Shubhendu S. Mukherjee VSSAD Massachusetts Microprocessor Design Center Intel Corporation 334 South Street, SHR1-T25 Shrewsbury,
More informationDesign and Evaluation of Hybrid Fault-Detection Systems
Design and Evaluation of Hybrid Fault-Detection Systems George A. Reis Jonathan Chang Neil Vachharajani Ram Rangan David I. August Departments of Electrical Engineering and Computer Science Princeton University
More informationManaged. Code Rootkits. Hooking. into Runtime. Environments. Erez Metula ELSEVIER. Syngress is an imprint of Elsevier SYNGRESS
Managed Code Rootkits Hooking into Runtime Environments Erez Metula ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEWYORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Syngress is an imprint
More informationCalculating Architectural Vulnerability Factors for Spatial Multi-bit Transient Faults
Calculating Architectural Vulnerability Factors for Spatial Multi-bit Transient Faults Mark Wilkening, Vilas Sridharan, Si Li, Fritz Previlon, Sudhanva Gurumurthi and David R. Kaeli ECE Department, Northeastern
More informationDESIGN AND ANALYSIS OF TRANSIENT FAULT TOLERANCE FOR MULTI CORE ARCHITECTURE
DESIGN AND ANALYSIS OF TRANSIENT FAULT TOLERANCE FOR MULTI CORE ARCHITECTURE DivyaRani 1 1pg scholar, ECE Department, SNS college of technology, Tamil Nadu, India -----------------------------------------------------------------------------------------------------------------------------------------------
More informationMulticore Soft Error Rate Stabilization Using Adaptive Dual Modular Redundancy
Multicore Soft Error Rate Stabilization Using Adaptive Dual Modular Redundancy Ramakrishna Vadlamani, Jia Zhao, Wayne Burleson and Russell Tessier Department of Electrical and Computer Engineering University
More informationEngineering Real- Time Applications with Wild Magic
3D GAME ENGINE ARCHITECTURE Engineering Real- Time Applications with Wild Magic DAVID H. EBERLY Geometric Tools, Inc. AMSTERDAM BOSTON HEIDELRERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE
More informationTransient Fault Detection via Simultaneous Multithreading
Transient Fault Detection via Simultaneous Multithreading Steven K. Reinhardt EECS Department University of Michigan, Ann Arbor 1301 Beal Avenue Ann Arbor, MI 48109-2122 stever@eecs.umich.edu Shubhendu
More informationCoding for Penetration
Coding for Penetration Testers Building Better Tools Jason Andress Ryan Linn ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Syngress is
More informationFingerprinting: Hash-Based Error Detection in Microprocessors. Jared C. Smolens
CARNEGIE MELLON UNIVERSITY CARNEGIE INSTITUTE OF TECHNOLOGY DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
More informationAlgorithmic Graph Theory and Perfect Graphs
Algorithmic Graph Theory and Perfect Graphs Second Edition Martin Charles Golumbic Caesarea Rothschild Institute University of Haifa Haifa, Israel 2004 ELSEVIER.. Amsterdam - Boston - Heidelberg - London
More informationStructured Parallel Programming
Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationMPEG-l.MPEG-2, MPEG-4
The MPEG Handbook MPEG-l.MPEG-2, MPEG-4 Second edition John Watkinson PT ^PVTPR AMSTERDAM BOSTON HEIDELBERG LONDON. NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Focal Press is an
More informationEvaluating the Effects of Compiler Optimisations on AVF
Evaluating the Effects of Compiler Optimisations on AVF Timothy M. Jones, Michael F.P. O Boyle Member of HiPEAC, School of Informatics University of Edinburgh, UK {tjones1,mob}@inf.ed.ac.uk Oğuz Ergin
More informationMicroarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors. Moinuddin K. Qureshi Onur Mutlu Yale N.
Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors Moinuddin K. Qureshi Onur Mutlu Yale N. Patt High Performance Systems Group Department of Electrical
More informationPOWER4 Systems: Design for Reliability. Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX
Systems: Design for Reliability Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX Microprocessor 2-way SMP system on a chip > 1 GHz processor frequency >1GHz Core Shared L2 >1GHz Core
More informationExamining the Impact of ACE interference on Multi-Bit AVF Estimates
Examining the Impact of ACE interference on Multi-Bit AVF Estimates Fritz Previlon, Mark Wilkening, Vilas Sridharan, Sudhanva Gurumurthi and David R. Kaeli ECE Department, Northeastern University, Boston,
More informationStructured Parallel Programming Patterns for Efficient Computation
Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationArea-Efficient Error Protection for Caches
Area-Efficient Error Protection for Caches Soontae Kim Department of Computer Science and Engineering University of South Florida, FL 33620 sookim@cse.usf.edu Abstract Due to increasing concern about various
More informationChip, Heal Thyself. The BulletProof Project
Chip, Heal Thyself Todd Austin Advanced Computer Architecture Lab University of Michigan With Prof. Valeria Bertacco, Prof. Scott Mahlke Kypros Constantinides, Smitha Shyam Mojtaba Mehrara, Mona Attariyan,
More informationLecture 22: Fault Tolerance
Lecture 22: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA 03, Wisconsin A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures, HPCA 07, Spain Error
More informationSurvey of Error and Fault Detection Mechanisms
Survey of Error and Fault Detection Mechanisms Ikhwan Lee ikhwan@mail.utexas.edu Michael Sullivan mbsullivan@mail.utexas.edu Evgeni Krimer krimer@utexas.edu Dong Wan Kim wannikim@utexas.edu Mehmet Basoglu
More informationComputer Animation. Algorithms and Techniques. z< MORGAN KAUFMANN PUBLISHERS. Rick Parent Ohio State University AN IMPRINT OF ELSEVIER SCIENCE
Computer Animation Algorithms and Techniques Rick Parent Ohio State University z< MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF ELSEVIER SCIENCE AMSTERDAM BOSTON LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationThe Designer's Guide to VHDL Second Edition
The Designer's Guide to VHDL Second Edition Peter J. Ashenden EDA CONSULTANT, ASHENDEN DESIGNS PTY. VISITING RESEARCH FELLOW, ADELAIDE UNIVERSITY Cl MORGAN KAUFMANN PUBLISHERS An Imprint of Elsevier SAN
More informationThe Pennsylvania State University The Graduate School College of Engineering REDUNDANCY AND PARALLELISM TRADEOFFS FOR
The Pennsylvania State University The Graduate School College of Engineering REDUNDANCY AND PARALLELISM TRADEOFFS FOR RELIABLE, HIGH-PERFORMANCE ARCHITECTURES A Thesis in Computer Science and Engineering
More informationEVALUATING OVERHEADS OF MULTIBIT SOFT-ERROR PROTECTION
[3B2-9] mmi2013040010.3d 11/7/013 17:9 Page 2... EVALUATING OVERHEADS OF MULTIBIT SOFT-ERROR PROTECTION IN THE PROCESSOR CORE... THE SVALINN FRAMEWORK PROVIDES COMPREHENSIVE ANALYSIS OF MULTIBIT ERROR
More informationUtilizing Dynamically Coupled Cores to Form a Resilient Chip Multiprocessor
Utilizing Dynamically Coupled Cores to Form a Resilient Chip Multiprocessor Christopher LaFrieda Engin İpek José F.Martínez Rajit Manohar Computer Systems Laboratory Cornell University Ithaca, NY 14853
More informationSlicK: Slice-based Locality Exploitation for Efficient Redundant Multithreading
SlicK: Slice-based Locality Exploitation for Efficient Redundant Multithreading Angshuman Parashar Sudhanva Gurumurthi Anand Sivasubramaniam Dept. of Computer Science and Engineering Dept. of Computer
More informationMemory technology and optimizations ( 2.3) Main Memory
Memory technology and optimizations ( 2.3) 47 Main Memory Performance of Main Memory: Latency: affects Cache Miss Penalty» Access Time: time between request and word arrival» Cycle Time: minimum time between
More informationOn the Characterization of Data Cache Vulnerability in High-Performance Embedded Microprocessors
On the Characterization of Data Cache Vulnerability in High-Performance Embedded Microprocessors Shuai Wang, Jie Hu, and Sotirios G. Ziavras Department of Electrical and Computer Engineering New Jersey
More informationLow Power Cache Design. Angel Chen Joe Gambino
Low Power Cache Design Angel Chen Joe Gambino Agenda Why is low power important? How does cache contribute to the power consumption of a processor? What are some design challenges for low power caches?
More informationUltra Low-Cost Defect Protection for Microprocessor Pipelines
Ultra Low-Cost Defect Protection for Microprocessor Pipelines Smitha Shyam Kypros Constantinides Sujay Phadke Valeria Bertacco Todd Austin Advanced Computer Architecture Lab University of Michigan Key
More informationA Low Cost Checker for Matrix Multiplication
A Low Cost Checker for Matrix Multiplication Lisbôa, C. A., Erigson, M. I., and Carro, L. Instituto de Informática, Universidade Federal do Rio Grande do Sul calisboa@inf.ufrgs.br, mierigson@terra.com.br,
More informationTECHNOLOGY scaling has driven the computer industry
516 IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, VOL. 4, NO. 3, SEPTEMBER 2004 Timing Vulnerability Factors of Sequentials Norbert Seifert, Senior Member, IEEE, and Nelson Tam, Member, IEEE Abstract
More informationEliminating Single Points of Failure in Software Based Redundancy
Eliminating Single Points of Failure in Software Based Redundancy Peter Ulbrich, Martin Hoffmann, Rüdiger Kapitza, Daniel Lohmann, Reiner Schmid and Wolfgang Schröder-Preikschat EDCC May 9, 2012 SYSTEM
More informationReal-Time Systems and Programming Languages
Real-Time Systems and Programming Languages Ada, Real-Time Java and C/Real-Time POSIX Fourth Edition Alan Burns and Andy Wellings University of York * ADDISON-WESLEY An imprint of Pearson Education Harlow,
More informationMSP430 Microcontroller Basics
MSP430 Microcontroller Basics John H. Davies AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Newnes is an imprint of Elsevier N WPIGS Contents Preface
More informationApplication Programming
Multicore Application Programming For Windows, Linux, and Oracle Solaris Darryl Gove AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris
More informationImproving the Fault Tolerance of a Computer System with Space-Time Triple Modular Redundancy
Improving the Fault Tolerance of a Computer System with Space-Time Triple Modular Redundancy Wei Chen, Rui Gong, Fang Liu, Kui Dai, Zhiying Wang School of Computer, National University of Defense Technology,
More informationRobust System Design with MPSoCs Unique Opportunities
Robust System Design with MPSoCs Unique Opportunities Subhasish Mitra Robust Systems Group Departments of Electrical Eng. & Computer Sc. Stanford University Email: subh@stanford.edu Acknowledgment: Stanford
More informationComputer Architecture!
Informatics 3 Computer Architecture! Dr. Vijay Nagarajan and Prof. Nigel Topham! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors
More informationSelf-Repair for Robust System Design. Yanjing Li Intel Labs Stanford University
Self-Repair for Robust System Design Yanjing Li Intel Labs Stanford University 1 Hardware Failures: Major Concern Permanent: our focus Temporary 2 Tolerating Permanent Hardware Failures Detection Diagnosis
More informationChapter 8. Coping with Physical Failures, Soft Errors, and Reliability Issues. System-on-Chip EE141 Test Architectures Ch. 8 Physical Failures - P.
Chapter 8 Coping with Physical Failures, Soft Errors, and Reliability Issues System-on-Chip EE141 Test Architectures Ch. 8 Physical Failures - P. 1 1 What is this chapter about? Gives an Overview of and
More informationUltra Depedable VLSI by Collaboration of Formal Verifications and Architectural Technologies
Ultra Depedable VLSI by Collaboration of Formal Verifications and Architectural Technologies CREST-DVLSI - Fundamental Technologies for Dependable VLSI Systems - Masahiro Fujita Shuichi Sakai Masahiro
More informationRedundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992
Redundancy in fault tolerant computing D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 1 Redundancy Fault tolerance computing is based on redundancy HARDWARE REDUNDANCY Physical
More informationMoving to the Cloud. Developing Apps in. the New World of Cloud Computing. Dinkar Sitaram. Geetha Manjunath. David R. Deily ELSEVIER.
Moving to the Cloud Developing Apps in the New World of Cloud Computing Dinkar Sitaram Geetha Manjunath Technical Editor David R. Deily AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO
More informationChecker Processors. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India
Advanced Department of Computer Science Indian Institute of Technology New Delhi, India Outline Introduction Advanced 1 Introduction 2 Checker Pipeline Checking Mechanism 3 Advanced Core Checker L1 Failure
More informationCoding for Penetration Testers Building Better Tools
Coding for Penetration Testers Building Better Tools Second Edition Jason Andress Ryan Linn Clara Hartwell, Technical Editor ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO
More informationThe Essential Guide to Video Processing
The Essential Guide to Video Processing Second Edition EDITOR Al Bovik Department of Electrical and Computer Engineering The University of Texas at Austin Austin, Texas AMSTERDAM BOSTON HEIDELBERG LONDON
More informationDATABASE SYSTEM CONCEPTS
DATABASE SYSTEM CONCEPTS HENRY F. KORTH ABRAHAM SILBERSCHATZ University of Texas at Austin McGraw-Hill, Inc. New York St. Louis San Francisco Auckland Bogota Caracas Lisbon London Madrid Mexico Milan Montreal
More informationREPAS: Reliable Execution for Parallel ApplicationS in Tiled-CMPs
REPAS: Reliable Execution for Parallel ApplicationS in Tiled-CMPs Daniel Sánchez, Juan L. Aragón and José M. García Departamento de Ingeniería y Tecnología de Computadores Universidad de Murcia, 30071
More informationFoundations of Multidimensional and Metric Data Structures
Foundations of Multidimensional and Metric Data Structures Hanan Samet University of Maryland, College Park ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE
More informationA Microarchitectural Analysis of Soft Error Propagation in a Production-Level Embedded Microprocessor
A Microarchitectural Analysis of Soft Error Propagation in a Production-Level Embedded Microprocessor Jason Blome 1, Scott Mahlke 1, Daryl Bradley 2 and Krisztián Flautner 2 1 Advanced Computer Architecture
More informationFine-Grain Redundancy Techniques for High- Reliable SRAM FPGA`S in Space Environment: A Brief Survey
Fine-Grain Redundancy Techniques for High- Reliable SRAM FPGA`S in Space Environment: A Brief Survey T.Srinivas Reddy 1, J.Santosh 2, J.Prabhakar 3 Assistant Professor, Department of ECE, MREC, Hyderabad,
More informationModern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems
Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems Peter Barry Patrick Crowley ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE