Methods and Tools for Embedded Distributed System Timing and Safety Analysis. Steve Vestal Honeywell Labs

Similar documents
Investigation of System Timing Concerns in Embedded Systems: Tool-based Analysis of AADL Models

Hierarchical Composition and Abstraction In Architecture Models

Pattern-Based Analysis of an Embedded Real-Time System Architecture

An Information Model for High-Integrity Real Time Systems

Flow Latency Analysis with the Architecture Analysis and Design Language (AADL)

Alexandre Esper, Geoffrey Nelissen, Vincent Nélis, Eduardo Tovar

Communication in Avionics

Impact of Runtime Architectures on Control System Stability

SAE AS5643 and IEEE1394 Deliver Flexible Deterministic Solution for Aerospace and Defense Applications

CS4514 Real-Time Systems and Modeling

Automatic Generation of Static Fault Trees from AADL Models

Model-Based Embedded System Engineering & Analysis of Performance-Critical Systems

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

A Modeling Framework for Schedulability Analysis of Distributed Avionics Systems. Pujie Han MARS/VPT Thessaloniki, 20 April 2018

Understanding and Using the Controller Area Network Communication Protocol

Architecture Description Languages. Peter H. Feiler 1, Bruce Lewis 2, Steve Vestal 3 and Ed Colbert 4

Deriving safety requirements according to ISO for complex systems: How to avoid getting lost?

Model-based Architectural Verification & Validation

A System Dependability Modeling Framework Using AADL and GSPNs

Unit 2: High-Level Synthesis

Distributed IMA with TTEthernet

AADL Fault Modeling and Analysis Within an ARP4761 Safety Assessment

Analysis of AADL Models Using Real-Time Calculus With Applications to Wireless Architectures

Precedence Graphs Revisited (Again)

Architecture Modeling and Analysis for Embedded Systems

Organic Self-organizing Bus-based Communication Systems

Ensuring Schedulability of Spacecraft Flight Software

The Design and Implementation of a Low-Latency On-Chip Network

Probabilistic Worst-Case Response-Time Analysis for the Controller Area Network

A Multi-Modal Composability Framework for Cyber-Physical Systems

Reaching for the sky with certified and safe solutions for the aerospace market

Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Error Model Annex Revision

Time-Triggered Ethernet

2. Introduction to Software for Embedded Systems

8. Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Resource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems

Chapter 9. Design for Testability

DTU IMM. MSc Thesis. Analysis and Optimization of TTEthernet-based Safety Critical Embedded Systems. Radoslav Hristov Todorov s080990

Communication Networks for the Next-Generation Vehicles

SAE AADL Error Model Annex: Discussion Items

Scheduling Algorithm and Analysis

Test and Evaluation of Autonomous Systems in a Model Based Engineering Context

AEROSPACE STANDARD. SAE Architecture Analysis and Design Language (AADL) Annex Volume 3: Annex E: Error Model Annex RATIONALE

Platform modeling and allocation

Software architecture in ASPICE and Even-André Karlsson

SAE AADL Error Model Annex: An Overview

What are Embedded Systems? Lecture 1 Introduction to Embedded Systems & Software

FPGA-Based Embedded Systems for Testing and Rapid Prototyping

AirTight: A Resilient Wireless Communication Protocol for Mixed- Criticality Systems

Green Hills Software, Inc.

SWE 760 Lecture 1: Introduction to Analysis & Design of Real-Time Embedded Systems

Subsystem Hazard Analysis (SSHA)

Concurrent activities in daily life. Real world exposed programs. Scheduling of programs. Tasks in engine system. Engine system

Analytical Architecture Fault Models

Dependability Modeling Based on AADL Description (Architecture Analysis and Design Language)

Real-Time (Paradigms) (47)

Introduction to Real-Time Communications. Real-Time and Embedded Systems (M) Lecture 15

CIS 890: High-Assurance Systems

Analysis and Design Language (AADL) for Quantitative System Reliability and Availability Modeling

AADL v2.1 errata AADL meeting Sept 2014

Embedded Software Engineering

Simulation of AADL models with software-in-the-loop execution

Complexity-Reducing Design Patterns for Cyber-Physical Systems. DARPA META Project. AADL Standards Meeting January 2011 Steven P.

Schedulability Analysis of AADL Models

AADL performance analysis with Cheddar : a review

COMPASS: FORMAL METHODS FOR SYSTEM-SOFTWARE CO-ENGINEERING

Time Triggered and Event Triggered; Off-line Scheduling

Evaluation of numerical bus systems used in rocket engine test facilities

Reducing Uncertainty in Architectural Decisions with AADL

Software Architecture. Lecture 4

Real-Time Component Software. slide credits: H. Kopetz, P. Puschner

Chapter 13: I/O Systems

Basic Concepts of Reliability

Introduction to Real-time Systems. Advanced Operating Systems (M) Lecture 2

OSATE Analysis Support

Components & Characteristics of an Embedded System Embedded Operating System Application Areas of Embedded d Systems. Embedded System Components

Multimedia Systems 2011/2012

Model-Based Safety Approach for Early Validation of Integrated and Modular Avionics Architectures

Automatic Selection of Feasibility Tests With the Use of AADL Design Patterns

Traditional Approaches to Modeling

Real-Time Architectures 2003/2004. Resource Reservation. Description. Resource reservation. Reinder J. Bril

1. Introduction. Real-Time Systems. Initial intuition /1. Outline. 2016/17 UniPD / T. Vardanega 26/02/2017. Real-Time Systems 1

Implementing Scheduling Algorithms. Real-Time and Embedded Systems (M) Lecture 9

Constructing and Verifying Cyber Physical Systems

System-level co-modeling AADL and Simulink specifications using Polychrony (and Syndex)

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 7, NO. 1, FEBRUARY

Chapter 12: I/O Systems

Chapter 13: I/O Systems

Chapter 12: I/O Systems. Operating System Concepts Essentials 8 th Edition

System Design and Methodology/ Embedded Systems Design (Modeling and Design of Embedded Systems)

New ARMv8-R technology for real-time control in safetyrelated

Cross Clock-Domain TDM Virtual Circuits for Networks on Chips

Lecture 9: Load Balancing & Resource Allocation

Part I: Preliminaries 24

PROBABILISTIC SCHEDULING MICHAEL ROITZSCH

Functional Safety and Safety Standards: Challenges and Comparison of Solutions AA309

Mixed Critical Architecture Requirements (MCAR)

Scheduling Complexity and Scheduling Strategies Scheduling Strategies in TTP Plan

AADL : about code generation

Transcription:

Methods and Tools for Embedded Distributed System Timing and Safety Analysis Steve Vestal Honeywell Labs Steve.Vestal@Honeywell.com 5 April 2006

Outline Preliminary Comments Timing and Resource Utilization Safety and RAM 2 AADL Symposium Dec 2005

Tools Solve Problems Tools are a means to an end. This talk is mostly about problems and methods few details on specific AADL tools This talk addresses only timing & safety. 3 AADL Symposium Dec 2005

Compositional Specification and Modeling Develop and use compositional modeling technologies: - Specify models for individual components - Automatically generate system models from the specification of how components are assembled to form an overall architecture. Do this in a way that provides: - Reusable component models - Easily reconfigured architectures - Structured traceability between specifications, models, analyses - More understandable models and more tractable analysis decomposition abstraction mixed fidelity Select and integrate a set of modeling methods suited to the product family architectures and requirements. 4 AADL Symposium Dec 2005

Cost and Schedule Benefits total project savings 50%, re-target savings 90% 8000 7000 6000 Man Hours 5000 4000 3000 2000 Traditional Approach 1000 0 Using MetaH 6-DOF Review 3-DOF Translate RT- 6DOF AMCOM SED studies show NRE cost/schedule reduction for first delivery NRE cost/schedule reduction for upgrades Test 6DOF Transform RT- Missile Build Debug Debug Re-target Current MetaH This talk is on methods and tools that determine resource requirements. Efficiency of allocation and scheduling Degree of redundancy needed for safety and availability Biggest savings may be in reduced recurring cost Recurring cost exceeds NRE over product lifetime (except space) Easier upgrade means faster technology refresh & upgrade Reduced size/weight/power ripples through rest of vehicle 5 AADL Symposium Dec 2005

Comments about Efficient Design Timing requirements are derived, they are specified by application engineers not end users. - Avoid unnecessarily tight requirements. - Use a process that makes it easy to reassess and change. - Avoid local over-sampling to solve global scheduling problems. - Manage uncertainty and variability in WCET budgets. Redundancy is a design decision made to satisfy safety and reliability requirements. - Dominant system failure scenarios can be difficult to intuit - Avoid over-engineering subsystems whose failure rates are of low significance. - Avoid over-engineering subsystems whose failures are mitigated elsewhere. - Lack of coverage can be more important than lack of resources. - Pick the best redundancy management pattern for each application, and mix patterns in the overall architecture. Use parametric sensitivity analysis. Carefully engineer the 20% of the design that makes 80% of the difference. 6 AADL Symposium Dec 2005

Classical Allocation & Scheduling Problem Given - A hardware architecture - A software architecture - Scheduling assumptions and timing constraints Produce - An assignment of software to hardware components - A schedule of software activities on hardware components such that timing constraints are satisfied. 7 AADL Symposium Dec 2005

Something Closer to the Real Problem Given - Physical sensor and actuator locations - Functions (algorithms, control/data flows, I/Os) - Constraints on Size Power Cabling Availability Weight Placement Integrity Timing Produce - A co-designed software + hardware architecture that satisfies the constraints. 8 AADL Symposium Dec 2005

Evaluations Evaluations of various methods and tools have been carried out over the past few years using one or more of the following workloads. Air transport aircraft IMA (simplified production workload) Globally time-triggered 6 processors, 1 multi-drop bus 105 threads, 51 message sources Military helicopter MMS (first release, partial) Globally time-triggered 14 dual processors, 14 bus bridges, 2 multi-drop buses 306 threads, 979 [source, destination] connections Air transport aircraft IMA (preliminary, partial, estimated) Globally asynchronous processors, precedence-constrained switched network 26 processors, 12 switches 1402 threads, 2644 [source, destination] connections Regional aircraft IMA (production workload) Globally time-triggered 49 processors, 2 multi-drop busses 244 processes (TBD threads), 3179 [source, destination] connections Is the tool applicable to real-world design details? tractable for systems of real-world size? 9 AADL Symposium Dec 2005

A Few Comments This talk focuses on periodic workloads. Efficient level A partitioned periodic+incremental+aperiodic workloads are flying today. Watch out for timing anomalies and non-robust behaviors. Analytic modeling lays a good foundation for understanding and bounding these. Analytic modeling provides parametric sensitivity data. Every organization tailors its own development environment. Appropriate models must be selected & integrated. Can t avoid filling gaps with custom scripts & tooling. Architecture specification is one among many work products, but it plays a central role. A hub or integrating specification. Traceability between requirements, functions, and architecture is important. Traceability between specification, models, analyses, and implementation is important. How are you going to specify your architecture? Everybody uses MathWorks stuff. What about the other 60-95% of the code? How well do you automate system integration and verification? Adoption and use of a standard architecture interchange format involves both the system integrator and the subsystem suppliers. 10 AADL Symposium Dec 2005

Outline Preliminary Comments Timing and Resource Utilization Safety and RAM 11 AADL Symposium Dec 2005

Compositional Global Scheduling and Analysis We assume single-resource scheduling technology is available for each individual resource in the system. The global scheduling problem is to assign local scheduling parameters (e.g. release times, deadlines) in a way that achieves desired global scheduling properties. The global schedulability analysis problem is to combine local schedulability analysis results to produce end-to-end timing analysis. This depends on the temporal interfaces between components and subsystems: - Globally Time-Triggered - Precedence-Constrained Sequencing - Asynchronously Sampled Internal Interfaces Many systems combine more than one. 12 AADL Symposium Dec 2005

Time-Triggered Model Globally Time-Triggered completion time variability due to queuing and service time variability T 1 T 2 T 3 Globally synchronized clocks are maintained on every resource. Release times and deadlines are statically scheduled within the hyperperiod. A hyperperiod schedule of these events is loaded as a table into every resource and used at run-time to trigger dispatches, transmits, etc. 13 AADL Symposium Dec 2005

Precedence-Constrained Sequencing A subjob J i,k+1;j is released at the completion of subjob J i,k;j J1,1 completion and release time variability due to queuing and service time variability J1,2 J1,3 earliest release latest release earliest completion latest completion Run-time release signals are sent from the resource hosting the completing subjob to the resource hosting the successor subjob. Release and completion time jitter grow at each step. Traffic regulators are sometimes used to obtain tighter bounds and minimize anomalous scheduling effects. 14 AADL Symposium Dec 2005

Asynchronously Sampled Internal Interfaces plant (continuous) Each resource (subsystem) performs independent periodic sampling. Periodic events on different resources are unsynchronized. They have unknown phase offsets and are subject to relative drift. Each task asynchronously samples the outputs of the preceding tasks at dispatch and sets the values of its output buffers at completion. 15 AADL Symposium Dec 2005

Some Pros and Cons Criteria Time Triggered Asynchronous Sequenced deterministic, observable, anomalyfree ease of fan-in/fan-out aperiodics, dynamic adaptability loss-less How tractable and efficient are scheduling and analysis? 16 AADL Symposium Dec 2005

A Practical Example Locally time-triggered Precedence-constrained sequencing Asynchronously sampled internal interface. ARINC 664 Network ARINC 653 RTOS ARINC 653 clocks are not globally synchronized. 17 AADL Symposium Dec 2005

Time-Triggered Decomposition Scheduling R i L Ci R ij L Xij D i T i thread message T D = i i,0 D F ( L ci, L xij ) i, t + 1 = 2 Decomposition scheduling iteratively computes a sequence of intermediate process deadline/ message release time points in a way that balances loading, using laxity results from individual resource scheduling and schedulability analysis algorithm. 18 AADL Symposium Dec 2005

Decomposition Scheduling Results Messages from same process over same bus merged One period deadline on each process/message pair (pre-period deadlines on arbitrary length chains not implemented yet) Workload (decomposition scheduler model): - 1322 messages on 8 time-triggered busses - 1402 processes on 26 processors Model generated from specification in about 45 seconds Model scheduled and analyzed in about 10 seconds 19 AADL Symposium Dec 2005

Precedence-Constrained Sequencing There are several heuristics for priority assignment, e.g. - Ultimate deadline (final deadline) - Effective deadline (final deadline minus remaining service time) - Proportional deadline (final deadline apportioned by service times) - Normalized proportional deadline (normalized by resource utilizations) Approaches to worst-case latency analysis - Extended busy period (aka extended time demand) - Network calculus 20 AADL Symposium Dec 2005

Precedence Constrained Sequencing Results Prototype tools - Multiplexed multicast routing - extended busy period latency, jitter, queue size analysis tool - network calculus latency, jitter, queue size analysis tool Only connections from the same partition were multiplexed. Evaluated using regional aircraft and synthesized clock synchronization workloads cascaded star and fully interconnected switch synthesized networks Routed, multiplexed, scheduled, analyzed in about 40 seconds. Binary search for breakdown workload for 5 switch 100mbs network Search took 14 iterations, ~ 30 minutes Breakdown workload ~1600 processes, ~26000 [source, destination] connections 21 AADL Symposium Dec 2005

Age Scheduling Across Asynchronous Interfaces Definition: The Age of an output signal is the time elapsed since the input on which it is based was sampled. age of output for sample #1 age of output for sample #2 T output signal age L sample #1 output #1 Theorem: The age of an output signal is bounded by ( T t + Li ) Developed uni-processor efficiency bounds and age scheduling algorithms. 22 AADL Symposium Dec 2005 i Ψ φ Global problem expressed as a system of non-linear constraints, Ct U At A ρ φ A t Ψ sample #2 output #2 sample #3 output #3 ρ t t Ψ φ

Age Scheduling Results Connections were merged (multiplexed) if - They had the same route between the same processors - The connected processes had the same periods - 2644 merged to 610 Processes at same rates on same IO modules were merged - 1472 merged to 203 Workload (AMPL model): - 1425 variables (one for each process/processor and message/bus pair) - 1872 constraints (one for each resource and one for each signal) Model generated from specification in about 45 seconds Feasible solution found by CONOPT in about 45 seconds 23 AADL Symposium Dec 2005

Outline Preliminary Comments Timing and Resource Utilization Safety and RAM 24 AADL Symposium Dec 2005

Example: Integrated Safety, FMEA, Reliability System Capture hazards Subsystem Capture risk mitigation architecture E.g. SAE ARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment Component Capture FMEA model Error Model features permit checking for consistency and completeness between these various declarations. 25 AADL Symposium Dec 2005

Error Model Type fault and repair events error model Basic features Fail_Stop, Fail_Babbling : error event; Error_Free: initial error state; Stopped, Babbling: error state; internal error states, system hazards No_Data, Bad_Data : in out error propagation; end Basic; external failure modes/effects, mishaps 26 AADL Symposium Dec 2005

Error Model Implementation error model implementation Basic.Nominal transitions Error_Free -[Fail_Stop, in No_Data]-> Stopped; Error_Free -[Fail_Babbling, in Bad_Data]-> Babbling; Stopped [ out No_Data ]-> Stopped; Babbling [ out Bad_Data ]-> Babbling; properties Occurrence => poisson 10E-4 applies to Fail_Stop; Occurrence => poisson 10E-6 applies to Babbling; end Basic.Nominal; An error model is a type of stochastic automaton. 27 AADL Symposium Dec 2005

Hierarchical Modeling A subsystem of components may have an explicitly associated error model. The user may declare whether a subsystem error model 1. has a state determined by a user-specified function of the error states of the components (e.g. to model internal redundancy) 2. is an abstract error model to be substituted for the composition of the component models (e.g. to improve tractability of analysis) The annex supports abstraction and mixed fidelity modeling. 28 AADL Symposium Dec 2005

Model Generators Stochastic Concurrent Automata and Markov Chains General (cyclic) component error models Prototype generator (joint evaluation with UIUC) Fault Trees Cycle-free component error models Prototype generator evaluated 29 AADL Symposium Dec 2005

MetaH Markov Prototype MetaH Front-End MetaH HW/SW Binder Concurrent Stochastic Automata Generator Markov Process Generator SURE (NASA) UltraSAN (UIUC) Pr(death state) 30 AADL Symposium Dec 2005

What Should be Done OSATE HW/SW Binder Concurrent Stochastic Automata Generator MÖBIUS (UIUC) Pr(failure scenario) parametric analyses More tractable analysis More useful design feed-back 31 AADL Symposium Dec 2005

Translating Error Models to Fault Trees OUT_data_corrupted error_free loss_of_availability, loss_of_function loss_of_integrity, data_corrupted =» loss_of_function data_corrupted propagate loss_of_function propagate data_corrupted IN_data_corrupted OBJ_loss_of_integrity One fault tree template is constructed for each error state. Restricted to cycle-free error models (excluding propagate) for fault tree generation (cyclic models may require dynamic analysis, e.g. Markovian). 32 AADL Symposium Dec 2005

Notional Generated Fault Tree ADIRS_loss_of_function ADIRS_L...... TAT_R TAT_L ADIRS_R Fwd_TAT_L Fwd_L_RDC HULL Switch L CCR A Switch L_CPM_1 33 AADL Symposium Dec 2005

Fault Tree Results Generated two fault trees per function: loss of availability loss of integrity (Multi-function analyses are possible but were not specified or performed.) Fault tree sizes ranged from about 1000 to 15000 gates. Full set generated from specification in about 30 seconds. Largest tree required about 10 minutes to analyze. 34 AADL Symposium Dec 2005