Enabling Increased Safety with Fault Robustness in Microcontroller Applications

Similar documents
Understanding SW Test Libraries (STL) for safetyrelated integrated circuits and the value of white-box SIL2(3) ASILB(D) YOGITECH faultrobust STL

Fault-robust microcontrollers for automotive applications

Growth outside Cell Phone Applications

Safety and Reliability of Software-Controlled Systems Part 14: Fault mitigation

DEPENDABLE PROCESSOR DESIGN

Functional Safety Design Packages for STM32 & STM8 MCUs

Using an innovative SoC-level FMEA methodology to design in compliance with IEC61508

New ARMv8-R technology for real-time control in safetyrelated

Is This What the Future Will Look Like?

What functional safety module designers need from IC developers

Functional Safety and Safety Standards: Challenges and Comparison of Solutions AA309

A tool based estimation computation method of MCU random failure rate &functional safety metrics

Introduction CHAPTER IN THIS CHAPTER

Using Zynq-7000 SoC IEC Artifacts to Achieve ISO Compliance

Freescale, the Freescale logo, AltiVec, C-5, CodeTEST, CodeWarrior, ColdFire, C-Ware, t he Energy Efficient Solutions logo, mobilegt, PowerQUICC,

Report. Certificate Z Rev. 00. SIMATIC Safety System

So you think developing an SoC needs to be complex or expensive? Think again

FUNCTIONAL SAFETY FOR INDUSTRIAL AUTOMATION

Riccardo Mariani, Intel Fellow, IOTG SEG, Chief Functional Safety Technologist

Virtual Hardware ECU How to Significantly Increase Your Testing Throughput!

88 Dugald Campbell. Making Industrial Systems Safer Meeting the IEC standards

Aldebaran Microcontroller SoC for Mobile Robot (Low Power MCU Core Technology)

Report. Certificate M6A SIMATIC S7 Distributed Safety

BUILDING FUNCTIONAL SAFETY PRODUCTS WITH WIND RIVER VXWORKS RTOS

GE Intelligent Platforms PAC8000 RTU

ASYNC Rik van de Wiel COO Handshake Solutions

Alexandre Esper, Geoffrey Nelissen, Vincent Nélis, Eduardo Tovar

Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development

Advanced IP solutions enabling the autonomous driving revolution

Functional Safety Processes and SIL Requirements

A Model-Based Reference Workflow for the Development of Safety-Related Software

Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.

ID 025C: An Introduction to the OSEK Operating System

Failure Diagnosis and Prognosis for Automotive Systems. Tom Fuhrman General Motors R&D IFIP Workshop June 25-27, 2010

Solving Today s Interface Challenge With Ultra-Low-Density FPGA Bridging Solutions

Compute solutions for mass deployment of autonomy

Failure Modes, Effects and Diagnostic Analysis

Design and Implementation of on-chip Safety Controller in terms of the Standard IEC 61508

Diagnosis in the Time-Triggered Architecture

Welcome to the overview of ACS880 functional safety, FSO-11 Safety functions module.

Fending Off Cyber Attacks Hardening ECUs by Fuzz Testing

The exida. IEC Functional Safety and. IEC Cybersecurity. Certification Programs

ARM instruction sets and CPUs for wide-ranging applications

Power Ac-Dc Power Supplies & Dc-Dc Converters

Whose fault is it? Advanced techniques for optimizing ISO fault analysis

Technical Report Reliability Analyses

New developments about PL and SIL. Present harmonised versions, background and changes.

Chapter 2. Literature Survey. 2.1 Remote access technologies

Hardware safety integrity (HSI) in IEC 61508/ IEC 61511

WHITE PAPER. 10 Reasons to Use Static Analysis for Embedded Software Development

GridEye FOR A SAFE, EFFICIENT AND CONTROLLED POWER GRID

Fault-Injection testing and code coverage measurement using Virtual Prototypes on the context of the ISO standard

AUTOBEST: A microkernel-based system (not only) for automotive applications. Marc Bommert, Alexander Züpke, Robert Kaiser.

Cypress PSoC 4 Microcontrollers

Smart Payments. Generating a seamless experience in a digital world.

Failure Modes, Effects and Diagnostic Analysis

Certified Automotive Software Tester Sample Exam Paper Syllabus Version 2.0

The Future of Smart Cards: Bigger, Faster and More Secure

ISO26262 This Changes Everything!

End-to-end Safety, Security and Reliability Keys for a successful I4.0 Migration

DesignWare IP for IoT SoC Designs

Report. Certificate M6A SIMATIC Safety System

Isolation of Cores. Reduce costs of mixed-critical systems by using a divide-and-conquer startegy on core level

Multicore platform towards automotive safety challenges

Designing Next Generation Test Systems An In-Depth Developers Guide

ARM processors driving automotive innovation

A Developer's Guide to Security on Cortex-M based MCUs

European Conference on Nanoelectronics and Embedded Systems for Electric Mobility

2 Control Equipment for General Applications

Safety and Security for Automotive using Microkernel Technology

Introduction to Safety PLCs GuardLogix & CIP Safety

Multicore for safety-critical embedded systems: challenges andmarch opportunities 15, / 28

The Value of Certified Consumables Yield Numbers

FUNCTIONAL SAFETY CERTIFICATE

TIRIAS RESEARCH. Lowering Barriers to Entry for ASICs. Why ASICs? Silicon Business Models

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007

Integration of Mixed Criticality Systems on MultiCores: Limitations, Challenges and Way ahead for Avionics

Siemens Safety Integrated Take a safe step into the future

Roller Shutter Door Kits. Diversity in profiles

Smart Inrush Current Limiter Enables Higher Efficiency In AC-DC Converters

Report. Certificate Z AC500-S

Error Detection by Code Coverage Analysis without Instrumenting the Code

SIMPLIFYING THE CAR. Helix chassis. Helix chassis. Helix chassis WIND RIVER HELIX CHASSIS WIND RIVER HELIX DRIVE WIND RIVER HELIX CARSYNC

Actionable Standards Accelerate the adoption of sustainable WASH technologies

ODX Process from the Perspective of an Automotive Supplier. Dietmar Natterer, Thomas Ströbele, Dr.-Ing. Franz Krauss ZF Friedrichshafen AG

Chapter 5. Introduction ARM Cortex series

byneuron A BUILDING MANAGEMENT SYSTEM CONTROLS YOUR BUILDING BUT WHO CONTROLS YOUR MANAGEMENT SYSTEM?

Report. Certificate Z SIMATIC S7 F/FH Systems

FMEDA-Based Fault Injection and Data Analysis in Compliance with ISO SPEAKER. Dept. of Electrical Engineering, National Taipei University

The evolution of the cookbook

Industrial Embedded Systems - Design for Harsh Environment - Dr. Alexander Walsch

Scalable series-stacked power delivery architectures for improved efficiency and reduced supply current

-- Smart Grid Communication --

MGE Galaxy /30/40/60/80/100/120 kva. The recommended power protection for all critical applications.

MGE Galaxy /50/60/80/100/130 kva. The recommended power protection for all critical applications

EXPERIENCES FROM MODEL BASED DEVELOPMENT OF DRIVE-BY-WIRE CONTROL SYSTEMS

FlexRay The Hardware View

AVS: A Test Suite for Automatically Generated Code

Addressing Verification Bottlenecks of Fully Synthesized Processor Cores using Equivalence Checkers

The HITRUST CSF. A Revolutionary Way to Protect Electronic Health Information

Transcription:

Enabling Increased Safety with Fault Robustness in Microcontroller Applications Wayne Lyons ARM 110 Fulbourn Road Cambridge CB1 9NJ, England Abstract All safety-critical or high-reliability applications are looking for solutions and methodologies to address fault robustness. While there are a number of fault robustness implementations, the ARM Cortex -M3 processor is the first in the market to enable an optional observation port to interface to Yogitech's faultrobust-cpu (frcpu). This presentation examines the features and benefits of this approach, highlights examples in markets such as the medical and consumer segments, and shows how, after performing a failure-mode and effect analysis (FMEA), the Cortex-M3 processor and Yogitech's frcpu have been certified to Safety Integrity Level 3 (SIL3) of the IEC61508 norm. 1.0 Introduction As a processor design company ARM has been involved in the embedded microcontroller market for many years. This involvement was initially through implementations which used the ARM7TDMI T processor as the basis for the microcontroller and more recently using the ARM Cortex-M3 processor, designed to address next generation microcontroller requirements. In this time, the perception of safety systems has moved far beyond that of a convenient optional extra to becoming a mandated requirement. Research suggests that at least 88% of car buyers believe safety features should be provided as standard fit [1]. Through silicon partners such as Texas Instruments, ARM is now implemented in around two-thirds of today s electronic braking systems. This provides ARM with useful insight into the future requirements of future safety systems. In addition to regulatory requirements, continued adoption of deep submicron technologies results in a new population of faults and failure modes, making such safety functions critical in terms of their integrity.[2]

2.0 Evolving Standards IEC 61508 is a generic standard to define the "Functional safety of electrical, electronic, programmable electronic safety related systems [3]. As a safety norm IEC61508 has given rise to the emergence of specific standards in various application domains, such as process machinery, train control, medical equipment control [4]. It has also led to the development (ongoing) intended to cover safety in the automobile: ISO WD 26262. Figure 1. Safety Standards Within the IEC 61508 different levels of safety integrity are prescribed to outline what is expected for a particular platform. The relevant Safety Integrity Levels (SIL) and Safe Failure Fraction (SFF) are prescribed as follows: SIL1: SFF > 60%, fault model mostly oriented to static (permanent) faults SIL2: SFF > 90%, fault model includes transients SIL3: SFF > 99%, fault model includes transients These margins effectively prescribe the reliability safety levels expected for a particular platform. While standards such as ISO 26262 for automobile will prescribe market specific levels in more detail, it is likely that the growing number of safety critical vehicle features such as stability control and braking are an example of applications which are expected to reach the highest level in these guidelines, namely SIL 3 (comparable to ASILD of ISO 26262). Many of the CPU architectures used today in Microcontrollers date back as far as the 1980 s and while this provides benefits in terms of toolchain maturity, it means that the core was developed without any knowledge of safety standards such as the IEC61508 norm. Both the ARM Cortex-M3 and ARM Cortex-R4 processors have been developed with knowledge of this specification, and targeted for deeply embedded applications such as automotive. This means that, while they still possess toolchain compatibility with older ARM designs, their memory and CPU interfaces were designed to improve compatibility with fault tolerant requirements. Figure 2: Basic Diagnostic Architecture: Single channel with diagnostics

3.0 Safety Methodologies The recommended techniques to address this within a microcontroller platform are prescribed in the following diagram where the diagnostic channel consists of many elements to check consistency of the logic subsystem: Figure 3. Recommended Techniques for high Diagnostic Coverage A common approach to tackle this challenge is to extend traditional architectures to implement the specific safety functions and to guarantee the required safety integrity level of the MCU using a black box approach. In other words, individual MCU components such as the CPU are either replicated or continuously tested via their external interfaces Figure 4: Black Box Technique : Dual Core Lock Step

This concept can be found in several safety systems today and has proven very popular since the approach is relatively straightforward; the CPU is duplicated so that one CPU can be used to check the results generated by the CPU used in the logic sub-system. However, some form of architectural diversity is usually necessary to avoid the potential of a commoncause fault affecting both CPU s which cannot be detected by the compare unit. As a result, the dual-cpu lock-step architecture requires additional measures to reach SIL3. Such measures are outlined in IEC 61508 and include the use of an external watchdog, diverse synthesis for the CPU checker and specific layout requirements. In addition, IEC 61508 also requires a minimum diagnostic coverage for each CPU of at least 60%, therefore software routines for CPU initialization and periodic diagnostic tests are also needed. [2] In contrast, a white box approach seeks to understand the operation of the logic subsystem, through a detailed failure modes and effects analysis (FMEA) providing an asymmetric monitoring approach which inherently avoids many common cause failures. Another aspect of this approach is that, as systems become more complex, it can scale to multiple core applications without needing to duplicate the logic of each CPU. One company that has pioneered significant developments in this White Box approach is Yogitech, with their faultrobust portfolio. The Yogitech faultrobust portfolio extends from Memory check and onchip bus components such as frmem and frbus through to their frcpu component, optimized for a specific CPU [2]. Figure 5: White Box Architecture 4.0 Fault Robust and the ARM Cortex-M3 Fault Observation Interface The development of Yogitech s frcpu IP begins with a detailed FMEA analysis of the target CPU using their frfmea process included in frmethodology. frmethodology is approved by TÜV SÜD, a well known leading international service organization that performs safety testing and IEC certification. According to the frmethodology, the CPU is partitioned into sensitive zones covering the logic cones related to all CPU registers. For these sensitive zones, the main effect of a fault is then evaluated at an observation point. Failure modes are defined for permanent, transient and IEC defined faults. Each failure mode is then weighed with statistics, such as the gate count and connectivity of the input logic cone. Diagnostic coverage and a safe/dangerous failure fraction associated with each failure mode are then calculated. The target diagnostic coverage for each sensitive zone is selected to provide an SFF = 99% in an optimized silicon footprint. As a result, the frcpu takes up only a fraction of the original CPU size. From an architectural point of view, frcpu is composed of sub units which monitor the CPU and model its behavior on an abstract level. It uses a set of independent blocks tailored to check and detect deviations of the CPU behavior from this model [2]. As such, frcpu is not provided as a replication of the CPU, hence it is architecturally and functionally diverse, which helps to address the IEC 61508 requirement.

For the frcpu to deliver the required coverage level with minimum gate count, it is essential to provide it with the correct visibility of the activities going on inside the CPU. To achieve this, ARM worked in close collaboration with Yogitech to identify the requirements and implement the necessary interface. The result of this collaboration was the definition of a Fault Observation interface known as the faultrobust Diagnostic interface (frdi), which totals almost 150 signals and was made public with the release 2 of the Cortex-M3 processor during 2008. Figure 6: ARM Cortex-M3 faultrobust Diagnostic Interface (frdi), Block Diagram The observation interface includes information on the register bank ports, the instruction in the execute stage of the pipeline and certain debug information. 5.0 Discussion One challenge of the white box approach compared to dual-core lock step can be development timescale, since the frcpu block first needs to be based on a detailed understanding and FMEA of the CPU core. To address this for the ARM Cortex-M3 processor, the frcpu_armcm3 block is delivered pre-certified by TÜV SÜD. The certification procedure provided the following information: the size of the frcpu on 150nm technology equated to 0.32mm 2, compared to the total size of around 1.6mm 2 for the second core, the compare unit and the additional measures needed for dual core solution, demonstrating that this technique can successfully be implemented in a smaller footprint than traditional black box designs. The resulting platform required a memory footprint of less than 2kbytes for the start-up procedure and run time tests and the resulting SFF/Diagnostic coverage reached 99.2%. The coverage figure was verified using Yogitech s frmethodology and confirmed through fault injection and simulation. It is clear that traditional dual-core lock step architectures, as a proven solution, will remain popular for some time to come. However, as an increasing number of cost sensitive applications require fault tolerance, and the need to provide tolerance in larger, multicore designs becomes mandatory, the methodology defined by Yogitech can offer several benefits in terms of gate count, power consumption and common cause failures. Moreover, as system on chip designs and the safety requirements these architectures become more stringent it is likely that a combination of techniques will be considered for future designs. One example of this is the following configuration described in Figure 7 which is designed to provide fail-functional performance by keeping the second channel in a hot standby mode

Figure 7: One -out of- two diagnostic channel with 2 nd channel in hot standby 6.0 Conclusion Microelectronics provides increasingly active control of systems in applications ranging from process control or healthcare through to vehicle safety and driver assistance. While safety and reliability standards continue to evolve, the economic reality is that any resulting platform must meet the appropriate level at the right price. For system on chip designers this can mean an arduous process of validation and certification as the chip is being developed. The combination of Yogitech s frcpu_armcm3 together with the faultrobust Diagnostic Interface provided by ARM Cortex-M3 CPU provides designers with a pre-certified starting point for one of the key elements in the chip design. 7.0 References [1] Automobil Elektronik, Bosch/Spiegel-Insitut: Jan 18 th 2004. Published at http://www.automobil-produktion.de/themen/02794/index.php [2] Dr. M. Baumeister, Dipl.-Ing. P. Fuhrmann, Philips Research Laboratories, Aachen; Dr. R. Mariani, Yogitech Spa, Pisa/I.: A single channel, fail-safe microcontroller to simplify SIL3 safety architectures in automotive applications. Baden Baden, VDI Berichte Nr. 2000, 2007 [3] Wikipedia Reference: IEC61508. http://en.wikipedia.org/wiki/iec_61508. Last modified 10 th October 2008. [4] SiemensVDO: Functional Safety in Automotive Electronics Introduction to Future ISO26262. RTCA/EUROCAE SC205/WG71 plenary meeting, Toulouse, 2006-09-12