Fine-Grain Redundancy Techniques for High- Reliable SRAM FPGA`S in Space Environment: A Brief Survey

Similar documents
ALMA Memo No Effects of Radiation on the ALMA Correlator

Initial Single-Event Effects Testing and Mitigation in the Xilinx Virtex II-Pro FPGA

Dynamic Partial Reconfiguration of FPGA for SEU Mitigation and Area Efficiency

Fast SEU Detection and Correction in LUT Configuration Bits of SRAM-based FPGAs

Space: The Final Frontier FPGAs for Space and Harsh Environments

SpaceWire IP Cores for High Data Rate and Fault Tolerant Networking

High temperature / radiation hardened capable ARM Cortex -M0 microcontrollers

Single Event Upset Mitigation Techniques for SRAM-based FPGAs

Multiple Event Upsets Aware FPGAs Using Protected Schemes

Clearspeed Embedded Apps and Architecture for Space

SAN FRANCISCO, CA, USA. Ediz Cetin & Oliver Diessel University of New South Wales

Improving FPGA Design Robustness with Partial TMR

Single Event Latchup Power Switch Cell Characterisation

DESIGN AND ANALYSIS OF SOFTWARE FAULTTOLERANT TECHNIQUES FOR SOFTCORE PROCESSORS IN RELIABLE SRAM-BASED FPGA

Improving the Fault Tolerance of a Computer System with Space-Time Triple Modular Redundancy

THE SMART BACKPLANE LOWERING THE COST OF SPACECRAFT AVIONICS BY IMPROVING THE RADIATION TOLERANCE OF COTS ELECTRONIC SYSTEMS Daniel Gleeson, MSc.

Reliability Improvement in Reconfigurable FPGAs

Configurable Fault Tolerant Processor (CFTP) for Space Based Applications

A Fault-Tolerant Alternative to Lockstep Triple Modular Redundancy

Hamming FSM with Xilinx Blind Scrubbing - Trick or Treat

NEPP Independent Single Event Upset Testing of the Microsemi RTG4: Preliminary Data

PROGRAMMABLE MODULE WHICH USES FIRMWARE TO REALISE AN ASTRIUM PATENTED COSMIC RANDOM NUMBER GENERATOR FOR GENERATING SECURE CRYPTOGRAPHIC KEYS

FPGAs operating in a radiation environment: lessons learned from FPGAs in space

Scrubbing Optimization via Availability Prediction (SOAP) for Reconfigurable Space Computing

Leso Martin, Musil Tomáš

Design of Hardened Embedded Systems on Multi-FPGA Platforms

Radiation Hardened System Design with Mitigation and Detection in FPGA

Single Event Effects in SRAM based FPGA for space applications Analysis and Mitigation

Outline of Presentation Field Programmable Gate Arrays (FPGAs(

Fault-tolerant system design using novel majority voters of 5-modular redundancy configuration

16th Microelectronics Workshop Oct 22-24, 2003 (MEWS-16) Tsukuba Space Center JAXA

Dynamic Reconfigurable Computing Architecture for Aerospace Applications

RADIATION TOLERANT MANY-CORE COMPUTING SYSTEM FOR AEROSPACE APPLICATIONS. Clinton Francis Gauer

Progress in Autonomous Fault Recovery of Field Programmable Gate Arrays

SEE Tolerant Self-Calibrating Simple Fractional-N PLL

Outline. Trusted Design in FPGAs. FPGA Architectures CLB CLB. CLB Wiring

New Logic Module for secured FPGA based system

LSN 6 Programmable Logic Devices

39 Mitigation of Radiation Effects in SRAM-based FPGAs for Space Applications

Comparison of SET-Resistant Approaches for Memory-Based Architectures

DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX

VORAGO TECHNOLOGIES. 6 th Interplanetary CubeSat Workshop Cambridge, May, 2017

Progress in Autonomous Fault Recovery of Field Programmable Gate Arrays

LA-UR- Title: Author(s): Intended for: Approved for public release; distribution is unlimited.

Implementation of single bit Error detection and Correction using Embedded hamming scheme

Hardware Design with VHDL PLDs I ECE 443. FPGAs can be configured at least once, many are reprogrammable.

AUTONOMOUS RECONFIGURATION OF IP CORE UNITS USING BLRB ALGORITHM

Analysis of Soft Error Mitigation Techniques for Register Files in IBM Cu-08 90nm Technology

MicroCore Labs. MCL51 Application Note. Lockstep. Quad Modular Redundant System

Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

A Low-Cost SEE Mitigation Solution for Soft-Processors Embedded in Systems On Programmable Chips

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

Enabling Testability of Fault-Tolerant Circuits by Means of IDDQ-Checkable Voters

ARCHITECTURE DESIGN FOR SOFT ERRORS

Processor Building Blocks for Space Applications

Reliability of Programmable Input/Output Pins in the Presence of Configuration Upsets

Ensuring Power Supply Sequencing in Spaceflight Systems

Results of Radiation Test of the Cathode Front-end Board for CMS Endcap Muon Chambers

A Network-on-Chip for Radiation Tolerant, Multi-core FPGA Systems

OPTIMIZING DYNAMIC LOGIC REALIZATIONS FOR PARTIAL RECONFIGURATION OF FIELD PROGRAMMABLE GATE ARRAYS

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007

Title Laser testing and analysis of SEE in DDR3 memory components

Using COTS at the LHC: Building on Space Experience

RTG4 PLL SEE Test Results July 10, 2017 Revised March 29, 2018 Revised July 31, 2018

Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL

SPECIAL ISSUE ENERGY, ENVIRONMENT, AND ENGINEERING SECTION: RECENT ADVANCES IN BIG DATA ANALYSIS (ABDA) ISSN:

Exploiting Unused Spare Columns to Improve Memory ECC

About using FPGAs in radiation environments

Hardware Implementation of a Fault-Tolerant Hopfield Neural Network on FPGAs

A New Fault Injection Approach to Study the Impact of Bitflips in the Configuration of SRAM-Based FPGAs

CMP annual meeting, January 23 rd, 2014

Area Efficient Scan Chain Based Multiple Error Recovery For TMR Systems

ATMEL ATF280E Rad Hard SRAM Based FPGA. Bernard BANCELIN ATMEL Nantes SAS, Aerospace Business Unit

Simulation of Hamming Coding and Decoding for Microcontroller Radiation Hardening Rehab I. Abdul Rahman, Mazhar B. Tayel

DESIGN AND IMPLEMENTATION OF 8X8 DRAM MEMORY ARRAY USING 45nm TECHNOLOGY

Reducing Rad-Hard Memories for FPGA Configuration Storage on Space-bound Payloads

Reconfigurable, Radiation Tolerant S-Band Transponder for Small Satellite Applications

Advanced technologies for transient faults detection and compensation

HASP Student Payload Application for 2012

CMPE 415 Programmable Logic Devices FPGA Technology I

EXTREMELY LOW-POWER AI HARDWARE ENABLED BY CRYSTALLINE OXIDE SEMICONDUCTORS

New Reliable Reconfigurable FPGA Architecture for Safety and Mission Critical Applications LOGO

Programmable Logic Devices FPGA Architectures II CMPE 415. Overview This set of notes introduces many of the features available in the FPGAs of today.

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs)

Decentralized Run-Time Recovery Mechanism for Transient and Permanent Hardware Faults for Space-borne FPGA-based Computing Systems

Programmable Logic Devices Introduction CMPE 415. Programmable Logic Devices

Error Correction Using Extended Orthogonal Latin Square Codes

RECONFIGURABLE FAULT TOLERANCE FOR SPACE SYSTEMS

Outline. EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) FPGA Overview. Why FPGAs?

ECE 636. Reconfigurable Computing. Lecture 2. Field Programmable Gate Arrays I

Testability Optimizations for A Time Multiplexed CPLD Implemented on Structured ASIC Technology

DATAPATH ARCHITECTURE FOR RELIABLE COMPUTING IN NANO-SCALE TECHNOLOGY

FPGA Programming Technology

An Integrated ECC and BISR Scheme for Error Correction in Memory

Evaluating a radiation monitor for mixed-field environments based on SRAM technology

SEU Mitigation Design Techniques for the XQR4000XL Author: Phil Brinkley, Avnet and Carl Carmichael

International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May-2013 ISSN

Study on FPGA SEU Mitigation for Readout Electronics of DAMPE BGO Calorimeter

Building High Reliability into Microsemi Designs with Synplify FPGA Tools

An Energy-Efficient Scan Chain Architecture to Reliable Test of VLSI Chips

Transcription:

Fine-Grain Redundancy Techniques for High- Reliable SRAM FPGA`S in Space Environment: A Brief Survey T.Srinivas Reddy 1, J.Santosh 2, J.Prabhakar 3 Assistant Professor, Department of ECE, MREC, Hyderabad, India 1 Assistant Professor, Department of ECE, MREC, Hyderabad, India 2 Assistant Professor, Department of ECE, NMREC, Hyderabad, India 3 ABSTRACT: SRAM based reprogrammable FPGA with high-flexibility combined with high-performance have become increasingly important for use in space applications. With the advances in technology, the device size decreasing below nm, FPGAs used in space environment are more susceptible to radiation. The radiation effects can cause Single Event Upset (SEU) which are soft-errors and non-destructive. This can appear as transient pulses in logic or support circuitry, or as bit flips in memory cells or registers of SRAM cells and respectively change the function of logic elements within FPGA. There are various methodologies proposed in the literature that would reduce the effects and the mask them for proper device operation. In the paper we study the various redundancy techniques that have been proposed to improve reliability of the system designed with SRAM FPGA`s. KEYWORDS: SEE, SEU, TMR, CGTMR, AND FGTMR. I.INTRODUCTION Over the past few years there has been a strong tendency to replace hardened electronics for space by commercial-offthe-shelf (COTS) components, fabricated in a mainstream commercial CMOS technology [1]. A Field Programmable Gate Array (FPGA) is a semiconductor logic device with reprogrammability. Reprogrammability in FPGAs makes it possible to realize a logic function after the manufacturing process. With satellite lifetimes increased far beyond 10 years, much longer than the validity of previous standards, reprogrammability of system becomes a stringent requirement. With little software solutions possible, FPGA are the only possible solution. A strong expectation is that scaled technologies should inherently be more radiation tolerant. In order make a FPGA fault tolerant various methodologies have been proposed in the literature. Of these redundancy techniques play vital role in conjunction with reconfiguration techniques improve the reliability of the system drastically in space environment. In the section II of the paper we explore the various technologies of FPGA and in the process we focus on the suitability of SRAM-based FPGA for space applications, then radiation effects on SRAM FPGA is discussed in section III. In the section IV we study about mitigation of the system errors by common test and mitigation techniques. Section V then focuses on SEU mitigation through various redundancy techniques proposed in the literature. The change to fine-grain approach from coarse-gain is discussed in Section VI and the paper concludes with Section VII. II.FPGA TECHNOLOGIES Field Programmable Gate Arrays (FPGAs) with in-field reprogrammability, low non-recurring engineering costs (NRE), and with relative short design cycle are becoming a key component of digital system. Recently, there has been great interest in using FPGAs within spacecraft with a mixed level of success. FPGA`s are classified based upon different process technologies to build the memory cells used to program the device. Of these Anti-fused and SRAMbased FPGA are widely used for implementation of digital systems. Copyright to IJAREEIE www.ijareeie.com 14047

Antifused FPGA: Antifuse cells are non-volatile and only one-time programmable (OTP). Instead of breaking a metal connection by passing current through (like fuse-technology), a link is grown to make a connection. Antifuse-based devices are very good for high reliability applications, because of their unlimited data-retention time. However due to large programming transistors on the device, design changes are not possible and FPGAs cannot be reprogrammed. SRAM-based FPGA: SRAM programming technology uses static memory cells or SRAM cells for programming. In SRAM-based devices, static memory cells, such as the one shown in Fig.1, provide configurability for FPGAs. SRAM cells are also used in interconnection or implementing logic functions. In interconnects, SRAM cells are used to select lines to multiplexers in order to connect logic components in an appropriate way. In order to implement logic functions, SRAM-based FPGAs use Look Up Tables(LUTs). Fig.1: Program Bit in SRAM cell SRAM-based FPGAs are widely used in different applications due to their reprogrammability. SRAM bits can be configured by user defined configurations on power up. This can be done an indefinite number of times. Unlike other programming technologies, SRAM-based cells use the standard CMOS processing technology. As a result, the latest available CMOS technology can also be applied to SRAM-based FPGAs and, therefore, benefit from the increased integration, the higher speeds and the lower dynamic power consumption Advantage of SRAM FPGAs SRAM-based FPGAs can be created using a standard CMOS process which means they are available at the forefront of each new technology node. Antifuse-based FPGAs require extra processing steps during the manufacturing process. Therefore, SRAM-based FPGAs are more suitable for reconfigurable computing applications, where the device function is dynamically changed according to the need of the design. III. CHALLENGES FOR FPGAs IN SPACE ENIVORNMENT Space environment [2] is different from terrestrial systems. Space environment consists of electrons, protons and heavy ions which frequently effect the operation of electronic devices in space. The radiation environment and its effects on electronic systems [3],[4] is shown in the Fig.2. The flow as shown in the Fig.2 is from the radiation environment (top) to effects that these interactions lead to (bottom).the bottom layer shows the three effect categories: total ionizing dose, displacement damage, and single event effects. Copyright to IJAREEIE www.ijareeie.com 14048

Total Ionizing Dose The term, total ionizing dose, implies the dose that is deposited in the electronics through ionization effects only. All types of electronics are susceptible to ionization but the charge generated inside the semiconductormaterial can quickly be collected and removed without ill-effect. In general, TID effects are mitigated through proper use of shielding materials. Displacement Damage The second area of the cumulative effects of radiation is displacement effects. In which if sufficiently high energy supplied to the atom, it can overcome the binding energy of the atom in the crystalline lattice of the material. If this occurs, the atom is "displaced" from its normal position to various end locations. Unless the end location is an exact duplicate of the former position, the regular order of the crystalline lattice is disturbed. In general, displacement effects are also mitigated through proper use of shielding materials. Single Event Effects (SEE) Single Event Effects (SEEs) are effects caused by a single, energetic particle radiation on electronic circuit, which causes transient errors and it can take on many forms. There are three main subclasses of SEEs which are: Single Event upsets (SEUs), Single Event Functional Interrupt (SEFIs), and Single Event Transients (SETs). Single Event Upsets (SEUs) are soft errors, and non-destructive. They normally appear as transient pulses in logic or support circuitry, or as bit-flips in memory cells or registers. Several types of hard errors, potentially destructive, can appear: Single Event Latchup (SEL) results in a high operating current, above devicespecifications, and must be cleared by a power reset. When impacting ions induce voltage pulses on combinatorial circuitry in a device, these effects are known as Single Event Transients (SETs). The concern for FPGAs is the effect of the radiation on SRAM cells. If an ion strike of sufficient energy occurs near one of these transistors, the bit value stored in the cell can change or flip.the major effect of the upset is configuration bit-stream bit flipping which results in the change of the functionality of the SRAM FPGAs. IV. SEU MITIGATION TECHNIQUES Various Techniques are employed in order to reduce the effects of radiation on SRAM FPGA`S. These mitigation techniques are broadly classified into two categories, Reconfiguration-based techniques, Redundancy-based techniques. Reconfiguration-based Techniques Reconfiguration, also known as Scrubbing, is to write the original configuration to the memory periodically, so single event upsets (SEU) due to radiation are corrected.scrubbing techniques are classified into two types, Blind Scrubbing and Read-back Scrubbing. Blind-Scrubbing Blind Scrubbing is a periodical scrubbing in which the reconfiguration of memory occurs irrespective of occurrence of any upsets in the system. Due to periodically stopping the system operation and reconfiguring it may degrade the system performance if the minimum scrubbing cycle duration is not achieved. Read-back Scrubbing Read-back Scrubbing makes use of a fault-detector which refreshes the configuration memory contents only when a sensitive bit upset or a certain number of non-sensitive upsets are detected. It involves both reading and reloading the configuration memory contents. Limitations of Reconfiguration-based Techniques For critical applications, use of only reconfiguration techniques to cope with the system failures may not be the viable choice.[5] Scrubbing, reloading the configuration memory, results in temporary termination of system operation can be fatal for critical applications because scrubbing may lead to a temporary system termination. Apart from scrubbing, the redundancy-based techniques, which detect, mask and correct errors, are also essential to cope with high single event upset rates. [10] Therefore, the focus of the paper will be on Redundancy-based Techniques which help in the reliable system operation even in the presence of SEUs. Copyright to IJAREEIE www.ijareeie.com 14049

V. REDUNDANCY-BASED TECHNIQUES Redundancy techniques are widely used to build reliable systems that continue to operate satisfactorily in the presence of faults occurring in the components. Redundancy can take many forms such as hardware, software, time, and spatial redundancies. However, hardware redundancy techniques, use of additional hardware components, are used extensively for mitigating the SEU effects in SRAM FPGAs. TRIPLE MODULAR REDUNDANCY (TMR) [6] : TMR is a widely used redundancy technique for mitigation of SEUs in digital circuits. TMR scheme uses three identical logic blocks performing the same task, the output of each block are compared through a voter and the output is generated as shown in the Fig.3 given below. Fig.3 TMR with Majority Voter TMR technique is generally used in ASIC to protect memory elements from the radiation effects. Similarly, FPGA configuration memory is protected using the technique otherwise any modification due to SEUs may affect entire FPGA device. For ground based complex systems, TMR might be able to mask and correct single failures. However, when a SEU hits the voter, it may not function anymore, hence TMR is implemented by triplicating the voter. With the advances in device technology the device scaling reducing below nm, there is an increase in high upset rates in SRAM FPGA`s as shown in the Table 1. For high upset rates simple replication of the entire system i.e modular redundancy may not be sufficient for the reliable system performance. Table1. Frequency of Single Event Upset VI. COARSE-GRAIN VS FINE-GRAIN The granularity of a fault-tolerance mechanism determines how the system is divided into modules for the sake of the applying technique. The conventional TMR is named as Coarse Grain TMR (CGTMR). CGTMR is a view to the fault tolerance issue, which does not cope with the fine grain parts of the system. In other words, a designis triplicated and when an upset occurs, it is not clear where exactly in the design is affected. This is specially an obstacle when more than one copy is affected in CGTMR which is often in case of multiple SETs. A coarse grain view to redundancy might not be sufficient for scenarios with high rates of SETs, SEUs and MBUs. Fine-Grain Redundancy If the system would be able to detect failures and upsets locally, it can be able to deal with high failure rates. These challenges lead to a more localized view regarding the application of fault tolerance methods. It is categorized as Fine- Grain Redundancy Techniques. Copyright to IJAREEIE www.ijareeie.com 14050

The basic advantage of FGTMR is that Homogenous architecture like FPGAs, TMR can be applied to fine-grain homogenous parts like LUT,CLB,etc., of the design instead of whole design. The level of granularity can be determined based upon the requirements of the system. Reliability-Probability Model for Granularity [8] A simple reliability-probability modelling has been done to analyse the effect of fine granularities on TMR technique which is shown in the Fig.4 indicates that reliability probability of different fine grains is in better condition than coarse grain. Fig.4 Reliability of Coarse-Grain VsFGTMR with different number of grains(m) VII. CONCLUSION The paper presented the different mitigation techniques employed for reliable operation of SRAM FPGA`s in space environment. The paper indicates that scrubbing cannot be the only method to mitigate SEU. However, a combination of reconfiguration and redundancy techniques is the ideal way to deal with SEU effects in SRAM FPGA`s. The reliability of SRAM FPGA`s can be improved still further by change in the granularity from coarse to fine-grain. REFERENCES [01] ROOSTA, R.: A Comparison of Radiation-Hard and Radiation-Tolerant FPGAsfor Space Applications. NASA Electronic Parts and Packaging Program,NASA and Jet Propulsion Laboratory, December 2004. [02] RUHL, K.: An introduction to Space Weather. Computer Sceintists, March 2010. [03] RICHARD H. MAURER, MARTIN E. FRAEMAN,M. N. M. andd. R. ROTH:Harsh Environments: Space Radiation Environment, Effects, and Mitigation. John Hopkins Apl Technical Digest,28(1), 2008. [04] NASA/GSFC Radiation Effects and Analysis home page. [05] BERG, M., C. POIVEY, D. PETRICK, D. ESPINOSA, A. LESEA, K. LABEL, M. FRIENDLICH, H. KIM and A. PHAN: Effectiveness of Internal Versus External SEU Scrubbing Mitigation Strategies in a Xilinx FPGA: Design, Test, and AnalysisNuclear Science, IEEE Transactions on, 55(4):2259 2266, aug.2008. [06] LYONS, R. E. and W. VANDERKULK: The Use of Triple-Modular Redundancy to Improve Computer Reliability. IBM Journal of Research and Development,6(2):200 209, April, 1962. [07] Anthony J.Yu, Guy G.Lemieux: FPGA Defect Tolerance: Impact of Granularity [08] NIKNAHAD, M., O. SANDER and J. BECKER: FGTMR - Fine Grain Redundancy Method for Reconfigurable Architectures under high Failure Rates. NASNIT 2011. Copyright to IJAREEIE www.ijareeie.com 14051