Deterministic Ethernet as Reliable Communication Infrastructure for Distributed Dependable Systems DREAM Seminar UC Berkeley, January 21 st, 2014 Wilfried Steiner, Corporate Scientist wilfried.steiner@tttech.com Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 1
MOTIVATION Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 2
What They Have in Common Boeing 787 NASA Orion Reliable Networks Are a key element of a dependable system Audi A8 Airbus A380 Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 3
Trend towards Advanced Driver Assistant Systems (ADAS) Copyright TTTech Automotive GmbH. All rights reserved. 2/17/2014 / Page 4
Sensors Laserscanner Long Range Radar Long-Range-Radar (LRR 4) Video Video Camera Top view Camera Top View Middle-Range-Radar (MRR) Ultra Sonic Ultraschall Laser Scanner Mid Range Radar Predicitive Map Data Car2x Connectivity Copyright TTTech Automotive GmbH. All rights reserved. 2/17/2014 / Page 5
Actuators Necessary Actuators for Automated Driving Electronic Stability Control Hold management system Decelleration management Powertrain Coordination Shift-by-Wire Electric Power Steering Steering Column Accelerator Pedal Drive Select Steering Damper Shift-by-wire Copyright TTTech Automotive GmbH. All rights reserved. 2/17/2014 / Page 6
NETWORK BECOMES MORE AND MORE IMPORTANT Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 7
Automotive Need of a Reliable Communication Infrastructure http://www.ieee802.org/1/files/public/docs2013/new-tsn-jochim-goals-of-802-1tsn-0713-v01.pdf 2/17/2014 / Page 8
General Industrial Trend towards Converged Networks Closed World Communication Performance guarantees: real-time, dependability, safety Standards: ARINC 664, ARINC 429, TTP, MOST, FlexRay, CAN, LIN, Applications: Flight control, powertrain, chassis, passive and active safety,.. Validation & verification: Certification, formal analysis,... Open World Communication No performance guarantees: best efforts Standards: Ethernet, TCP/IP, UDP, FTP, Telnet, SSH,... Applications: Multi-media, audio, video, phones, PDAs, internet, web, Validation & verification: No certification, test, simulation,... High cost Low cost We see a market requirement to use the same physical network for data flows from both worlds. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 9
The Motivation for Ethernet Ethernet hardware is low cost. Ethernet is a well-established open-world standard and very scaleable. The OSI reference model gives a well-structured classification of concepts that can be built on top of Ethernet. Existing tools can be leveraged as cost-efficient diagnosis tools. Standard protocols like SNMP can be leveraged for maintenance and configuration. Engineers learn about Ethernet at school. Ethernet means to use well-established technology, but needs real-time and dependability improvements. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 10
DETERMINISTIC ETHERNET Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 11
Ethernet = Asynchronous Communication NIC NIC NIC SWITCH X SWITCH X NIC NIC NIC X NIC NIC Asynchronous Communication Transmission Points in Time are not predictable Transmission Latency and Jitter accumulate Number of Hops has a significant impact Usually solved by High Wire-Speeds & Low Utilization and/or Priorities Problem of ``Indeterminism remains SWITCH NIC NIC NIC Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 12
Towards Determinism: Synchronization of the distributed local clocks In an ensemble of clocks, the precision is defined as the maximum distance between any two synchronized nonfaulty clocks at any point in real time. Late Clock Perfect Clock Early Clock Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 13
Single-Master Clock Synchronization Eth Time Master IN 1 Eth Enabler for Synchronous Communication: Synchronized Global Time 1588 1588 Communication Schedule Eth Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 14
Time-Triggered Operation Synchronized time and a communication schedule allows to realize the time-triggered communication paradigm. Node A send receive receive t 1 t 2 t 3 Node B receive send receive t 1 t 2 t 3 Node C receive receive send t 1 Slot t 2 t 3 time Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 15
Synchronous Communication (TT) NIC NIC NIC SWITCH SWITCH NIC NIC NIC X NIC NIC Synchronous Communication SWITCH NIC X Exactly one order of messages M i (in contrast to PERM(M i ) in async. comm) NIC NIC Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 16
Example: 1,000 Frames (Industrial-Sized) 2 1 Dataflow Links are enumerated on the x-axis 3 4 6 5 1 2 X Time-Triggered Only Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 17
Deterministic (TT)Ethernet Traffic Classes thernet provides several traffic classes in parallel: time-triggered, rate-constrained, and best-effort Time-Triggered: dispatch messages according a predefined communication schedule Rate-Constrained: enforce minimum duration between two frames of the same stream Best-Effort: standard Ethernet communication paradigm no temporal guarantees are given Layer 3-7 Application Time-Triggered Extension Ethernet IEEE 802.3 thernet 40 msec 40 msec 40 msec TT1 TT2 RC RC BE TT1 BE RC TT2 BE TT1 RC BE RC TT2 TT1 BE BE RC RC 30 msec 30 msec 30 msec 30 msec TT1 TIME Longest Communication Cycle in this Example: LCM(30,40) = 120msec Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 18
thernet Dataflow: Rate-Constrained Traffic Rate-Constrained Traffic (RC) Switch/Router Receiver Sender min. duration min. duration min. duration Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 19
Mixed Traffic on Ethernet RC Accumulated Jitter 00:10 2 00:01 1 4a 3a 2a 1a thernet Switch Time Triggered 00:11 00:02 Rate Constrained 4b 3b 2b 1b Best Effort Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 20
Mixed Traffic on Ethernet RC Accumulated Jitter 00:10 TT is dispatched according synchronized time 00:01 4a thernet Switch Time Triggered 00:11 00:02 Rate Constrained 3a 3b TT is forwarded according synchronized time 2 1a TT has lowest latency and lowest jitter 1b 1 4b 2b Best Effort 2a RC frame delivery is guaranteed, but potentially has high latency and jitter RC potentially queue-up in switch memory Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 21
Mixed Traffic on Ethernet BE Buffer Overflow thernet Switch Time Triggered 4 3 2 1 Rate Constrained Best Effort 3 2 1 Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 22
Mixed Traffic on Ethernet BE Buffer Overflow thernet Switch Time Triggered Rate-constrained frame delivery (standard Ethernet traffic) is guaranteed! Rate Constrained 3 2 1 Best Effort 4 2 1 3 Best-effort frame delivery (standard Ethernet traffic) is NOT guaranteed! Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 23
Converged Network Example TT BE TT BE TT BE t 3ms cycle 3ms cycle 3ms cycle Sender 1 Switch/Router Dataflow Integration - Time-Triggered (TT) - Rate-Constrained (RC) - Standard Ethernet (BE) Receiver Sender 2 TT TT RC BE TT TT BE BE TT RC TT TT BE t TT BE BE TT RC TT BE 3ms cycle 3ms cycle 3ms cycle 2ms cycle 2ms cycle 2ms cycle t 2ms cycle 2ms cycle 2ms cycle 2ms cycle 6ms Cluster Cycle thernet Switches are non-preemptive store-and-forward switches using priorities Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 24
Example: 1,000 Frames (Industrial-Sized) 2 1 Dataflow Links are enumerated on the x-axis 3 4 6 5 1 2 RC/BE frames are also integrated during TT phases. RC TT X RC TT RC TT RC TT Time-Triggered Only Time-Triggered + Event-Triggered Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 25
Example: 1,000 Frames (Industrial-Sized) 2 1 Dataflow Links are enumerated on the x-axis 3 4 6 5 1 2 RC/BE frames are also integrated during TT phases. RC TT X RC TT RC TT RC TT Time-Triggered Only Page 26 Time-Triggered + Event-Triggered
Single-Master Clock Synchronization Eth Time Master IN 1 Eth Enabler for Synchronous Communication: Synchronized Global Time 1588 1588 Communication Schedule Eth Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 27
Fault-Tolerant Clock Synchronization Time Master IN 1 IN 1 Eth IN 1 IN 1 Time Master Time Master IN 1 IN 1 Eth 1588 Fault-tolerant synchronization services are needed for establishing a safe global time base Eth 1588 Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 28
Need to cover complex failure modes, e.g. Babbling ECU Faulty ECU starts to send faulty messages. Without traffic policing functions, the switch forwards all faulty frames. At some threshold of faulty frames, the switch starts loosing correct frames from other, independent applications. Copyright TTTech Computertechnik AG. All rights reserved. Page 29
NATIVE ETHERNET BECOMES MORE DETERMINISTIC Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 30
Native Ethernet becomes more deterministic IEEE 802.1 is standardizing general architectures for local area networks (LANs) and metropolitan area architectures (MANs). Together with IEEE 802.3 they are the main working groups working standards for Ethernet switches. Efficient utilization of the communication bandwidth and plug-and-play capabilities are topmost requirements in IEEE 802.1. With AVB, IEEE 802.1 moved into the area of real-time communication. With TSN, IEEE 802.1 moves into the area of dependable communication. Upcoming mainstream IT equipment aims to provide real-time and dependable communication features (to a significant higher degree than today). Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 31
AVB Audio/Video Bridging 802.1AS Timing and Synchronization for Time-Sensitive Applications in Bridged Local Area Networks: a protocol and technique to synchronize local clocks in the network to each other. 802.1Qat Stream Reservation Protocol (SRP): a protocol that allows applications to dynamically reserve bandwidth in the network. 802.1Qav Forwarding and Queuing Enhancements for Time-Sensitive Streams: an enhancement over strict priority based forwarding and queueing mechanisms that establishes fairness properties for lower priority traffic in the network. 802.1BA: definition of profiles for AVB systems. AVB is incorporated in the IEEE 802.1 standards documents since 2011. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 32
Native Ethernet improves its Reliability IEEE 802.1Q Physical Topology Spanning Tree / Virtual LANs Ring Topology Redundant Networks Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 33
TSN Time-Sensitive Networks 802.1ASbt Timing and Synchronization: Enhancements and Performance Improvements 802.1Qbv Enhancements for Scheduled Traffic: a basic form of timetriggered communication 802.1Qbu Frame Preemption: a mechanism that allows to preempt a frame in transmit to intersperse another frame. 802.1Qca Path Control and Reservation: protocols and mechanisms to set up and manage the redundant communication paths in the network. 802.1CB Frame Replication and Elimination for Reliability: to eliminate redundant copies of frames transmitted over the redundant paths setup in 802.1Qca. 802.1Qcc enhancements and improvements for stream reservation Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 34
Background: Industrial Need for FT Clock-Sync http://www.ieee802.org/1/files/public/docs2013/new-tsn-jochim-goals-of-802-1tsn-0713-v01.pdf 2/17/2014 / Page 35
TSN Time-Sensitive Networks 802.1ASbt Timing and Synchronization: Enhancements and Performance Improvements 802.1Qbv Enhancements for Scheduled Traffic: a basic form of timetriggered communication 802.1Qbu Frame Preemption: a mechanism that allows to preempt a frame in transmit to intersperse another frame. 802.1Qca Path Control and Reservation: protocols and mechanisms to set up and manage the redundant communication paths in the network. 802.1CB Frame Replication and Elimination for Reliability: to eliminate redundant copies of frames transmitted over the redundant paths setup in 802.1Qca. 802.1Qcc enhancements and improvements for stream reservation Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 36
IEEE 802.1AS Clock Synchronization Failure and re-election The clock synchronization protocol is a classical master-slave protocol. The master is called the grandmaster. When the grandmaster fails, then a new grandmaster is elected. Issues with this mechanism have been reported by industry. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 37
802.1ASbt Clock Synchronization Proposals for Improvements http://www.ieee802.org/1/files/public/docs2013/asbt-spada-kim-fault-tolerant-grand-master-proposal-0513-v1.pdf 2/17/2014 / Page 38
SAE AS6802 Fault-Tolerant Clock Synchronization thernet Executable Formal Specification Using symbolic and bounded model checkers sal-smc and sal-bmc Focus on Interoperation of Synchronization Services (Startup, Restart, Clique Detection, Clique Resolution, abstract Clock Synchronization) Verification of Lower-Level Synchronization Functions Permanence Function (sal-inf-bmc + k-induction) Compression Function (sal-inf-bmc + k-induction) Formal Verification of Clock Synchronization Algorithm First time by means of Model Checking (sal-inf-bmc + k-induction) Re-use of the Formal Models to prove: Layered clock-rate correction algorithm (sal-inf-bmc + k-induction) Layered clock-diagnosis algorithm (sal-inf-bmc + k-induction) Verification and minor corrections of the Sparse Timebase Concept Distributed computations without explicit coordination (PVS) Work has mostly been done in the context of the Marie Curie CoMMiCS project FP7 (FP7/2007-2013) project no. 236701 CoMMiCS Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 39
SAE AS6802 Fault-Tolerant Clock Synchronization thernet Executable Formal Specification Using symbolic and bounded model checkers sal-smc and sal-bmc Focus on Interoperation of Synchronization Services (Startup, Restart, Clique Detection, Clique Resolution, abstract Clock Synchronization) Verification of Lower-Level Synchronization Functions Permanence Function (sal-inf-bmc + k-induction) Compression Function (sal-inf-bmc + k-induction) Formal Verification of Clock Synchronization Algorithm First time by means of Model Checking (sal-inf-bmc + k-induction) Re-use of the Formal Models to prove: Layered clock-rate correction algorithm (sal-inf-bmc + k-induction) Layered clock-diagnosis algorithm (sal-inf-bmc + k-induction) Verification and minor corrections of the Sparse Timebase Concept Distributed computations without explicit coordination (PVS) Work has mostly been done in the context of the Marie Curie CoMMiCS project FP7 (FP7/2007-2013) project no. 236701 CoMMiCS Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 40
Interface Design i Fault-Tolerant Clock Generator IEEE 802.1 SAE AS6802 Fault-Tolerant Clock Consumers Architecture Design is Interface Design [Kopetz] Red Interface specifies the behavior of the FT Clock Generator as observed by the connecting bridges of the IEEE 802.1 network. Internal behavior of the FT Clock Generator may (and most likely will) be much more complex than as observed at the interface. Blue Interface specifies the behavior of the FT Clock Generator as observed by the FT Clock Consumers. 2/17/2014 / Page 41
Interface Design ii Fault-Tolerant Clock Generation IEEE 802.1 SAE AS6802 Fault-Tolerant Clock Consumers The red interface is different from the blue interfaces, because there is additional behavior introduced by the IEEE 802.1 network connecting the FT Clock Generator to the FT Clock Consumers. Both, red and blue, interfaces need to be specified to enable the usage of a faulttolerant timebase. 2/17/2014 / Page 42
SUMMARY AND OUTLOOK Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 43
Summary Our daily life more and more depends on dependable systems. The interconnection and networking of these systems is a main aspect. We see a cross-industry trend towards the use of Ethernet in time-critical, safety-related, and also safety-critical systems, for example in the automotive industry. Plain Ethernet as of today does not provide all functions to allow building distributed dependable systems. Hence, Ethernet variants have been developed, for example thernet (standardized as SAE AS6802) that defines a time-triggered paradigm and mixed time-triggered event-triggered paradigm for Ethernet. Currently, the IEEE is improving native Ethernet with real-time and dependability functions. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 44
Summary In particular, IEEE 802.3 develops and maintains the Ethernet PHY (e.g., the Reduced Twisted Pair Gigabit Ethernet RTGBE for automotive use) and MAC standards, IEEE 802.1 develops and maintains bridging (aka switching) standards. With AVB, the IEEE has moved Ethernet into the real-time applications domain. With TSN, the IEEE currently moves Ethernet into the hard real-time applications domain and improves Ethernet s robustness. With the growing competences in the IEEE standards, products built on these standards increase their market potential. Well-defined interfaces allow to re-use existing fault-tolerant clocksynchronization protocols. Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 45
Outlook: Networks are everywhere In the European-funded ACROSS project we investigated Network-on-Chip technologies and developed a Time- Triggered Network-on-Chip (TTNoC) http://www.across-project.eu/ Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 46
Outlook: Networks are everywhere In the recently started and European-funded DREAMS project we are taking a holistic view on distributed dependable system as a system-of-systems. Thereby, we research networks on different levels: on-chip, on-board (e.g., on a PCB), within a box, and local area network. In particular we are interested on emerging benefits that come with the realization of the time-triggered paradigm on all these different hierarchical levels. http://www.dreams-project.eu/ Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 47
Recent Book on Time-Triggered Technology Recent Book on Real-Time Systems Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 48
E n s u r i n g R e l i a b l e N e t w o r k s w w w. t t t e c h. c o m Copyright TTTech Computertechnik AG. All rights reserved. 2/17/2014 / Page 49