Designing Predictable Real-Time and Embedded Systems

Juniorprofessor Dr. Jian-Jia Chen
Karlsruhe Institute of Technology (KIT), Germany
Feb. 27-29, 2012 at TU-Berlin, Berlin, Germany
KIT: University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association
www.kit.edu

About Me
Education
- 1996-2001: B.S. Chemistry, National Taiwan University, Taiwan
- 2001-2006: Ph.D. Computer Science and Information Engineering, National Taiwan University, Taiwan
Working Experience
- 2007: Compulsory Civil Service in Taiwan
- 2008 - April 2010: Postdoc at ETH Zurich, Switzerland
- May 2010 - Present: Juniorprofessor at Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Research topics
- Low-Power and Energy-Efficient Considerations
- Timing Predictability of MPSoCs (multiprocessor systems on chip)
- Energy-Harvesting Systems
- Reliability System Designs
- Design Automation
Chair of Micro Hardware Technologies for Automation
- Leader: Jian-Jia Chen
- 2 staff members (one will join in Feb. 2012), 1 Doctoral Stipend

Predictability Due to Resource Sharing
[Figure: Multi Core CPU 1 and Multi Core CPU 2 connected to a shared Main Memory; the animation steps are listed below.]
- A task is executed on a core. It incurs cache misses and accesses the main memory.
- A task is executed on another core. It incurs cache misses, and its memory access is blocked.
- A task is executed on Core 3. It incurs cache misses, and its memory access is blocked.
- The memory access for core 1 finishes.

Simple Problems for Shared Resources
- Input:
  - Structure of tasks on the core: tasks modeled as sequential / time-triggered superblocks
  - Arbitration policy on the shared resource: static arbiter (TDMA), dynamic arbiter (FCFS, RR), or adaptive arbiter
- The timing behavior of the architecture is predictable
  - Resource access time is bounded and tight once access is granted
  - Only timing interference is considered
  - Spatial interference, such as cache replacement policy, is analyzed beforehand
- Output:
  - What is the worst-case response time?
  - How do we determine schedulability under timing interference?

RTC Event Models
- Given: traces of event arrivals in the time domain
- RTC models event arrivals by arrival curves in the interval domain
[Figure: an event trace over time (0 to 14) and the resulting workload curves over interval length (0 to 12), marking the maximum, possible, and minimum numbers of events in 3 time units.]
Ernesto Wandeler, Lothar Thiele, Marcel Verhoef, Paul Lieverse: System Architecture Evaluation Using Modular Performance Analysis - A Case Study. Software Tools for Technology Transfer (STTT), Springer, Vol. 8, No. 6, pages 649-667, October 2006.
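
The step from a trace in the time domain to bounds in the interval domain can be sketched in a few lines of Python. This is a minimal illustration, not the RTC toolbox: it assumes integer time, half-open windows [t, t + delta), and a hypothetical trace; the function name `window_bounds` and the sample data are made up for this example. Bounds taken from one finite trace hold only for that trace, not for every behavior of the system.

```python
def window_bounds(events, horizon, delta):
    """Return (min, max) number of events in any window [t, t + delta).

    events  : integer event timestamps (hypothetical trace)
    horizon : length of the observed trace, windows must fit inside [0, horizon)
    delta   : window length in time units
    """
    counts = [
        sum(1 for e in events if t <= e < t + delta)
        for t in range(horizon - delta + 1)
    ]
    return min(counts), max(counts)


# Hypothetical trace observed over 14 time units.
trace = [0, 1, 4, 5, 6, 10, 12]
lower, upper = window_bounds(trace, horizon=14, delta=3)
print(lower, upper)  # prints: 0 3
```

Repeating the call for every delta of interest gives one point of the lower and upper curve per window length.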

Graphical Interpretation
[Figure: a GPC block with arrival curves [α^l, α^u] and service curves [β^l, β^u] on its inputs and outputs, next to a plot of workload over time (0 to 12) that marks the maximum buffer B and the maximum response time D.]
$$D = \sup_{t \ge 0}\{\inf\{\tau \ge 0 : R(t) \le R'(t + \tau)\}\} = \sup_{\Delta \ge 0}\{\inf\{\tau \ge 0 : \alpha^u(\Delta) \le \beta^l(\Delta + \tau)\}\}$$
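
The curve-based expression for D, and the buffer bound B read off the same plot as the largest vertical gap between the two curves, can be evaluated numerically when the curves are sampled. The sketch below assumes curves sampled at integer interval lengths 0, 1, 2, ... and stored as lists; the function names and the sample curves are hypothetical. The result is only meaningful if the sampled horizon is long enough to contain the supremum.

```python
def max_delay(alpha_u, beta_l):
    """Largest horizontal distance: for each interval length d, the smallest
    tau with beta_l[d + tau] >= alpha_u[d], maximized over d."""
    worst = 0
    for d, demand in enumerate(alpha_u):
        tau = next(
            (t for t in range(len(beta_l) - d) if beta_l[d + t] >= demand),
            None,
        )
        if tau is None:
            raise ValueError("service curve horizon too short for this demand")
        worst = max(worst, tau)
    return worst


def max_backlog(alpha_u, beta_l):
    """Largest vertical distance between the upper arrival curve and the
    lower service curve over the common sampled range."""
    return max(a - b for a, b in zip(alpha_u, beta_l))


# Hypothetical sampled curves (index = interval length).
alpha_u = [0, 2, 2, 3, 3, 4, 4]
beta_l = [0, 0, 1, 2, 3, 4, 5, 6, 7, 8]
print(max_delay(alpha_u, beta_l), max_backlog(alpha_u, beta_l))  # prints: 2 2
```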

Outline

State-of-the-Art Multicore Systems
- Architecture with shared resources
  - shared memory, communication peripherals, I/O peripherals
- Blocking access to the shared resource
  - one request at a time is served
  - stalling due to contention
- Possible approaches to reduce the contention
  - structure of tasks on the cores
  - arbitration policy on the shared resource (static/dynamic)

Task Model
- A task is a sequence of super-blocks
- Each super-block j of task i is defined by
  - upper / lower bounds $e^{\max}_{i,j}$ / $e^{\min}_{i,j}$ on its execution time
  - upper / lower bounds $\mu^{\max}_{i,j}$ / $\mu^{\min}_{i,j}$ on its accesses to a shared resource
- Static analysis in two phases:
  - (1) assume resource accesses require no additional time, to obtain the execution time $e_{i,j}$
  - (2) consider only the number of shared resource accesses
- The communication delay c depends on the resource
- Tasks execute periodically
- Accesses to shared resources can happen at any time
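
The four bounds per super-block map naturally onto a small record type. The following Python sketch is illustrative only: the class name `Superblock`, the helper `isolated_bounds`, and the sample numbers are hypothetical. The helper assumes the task runs in isolation, so every access takes exactly the communication delay `access_time` and never waits for another core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Superblock:
    exec_min: int  # lower bound on execution time, accesses excluded
    exec_max: int  # upper bound on execution time, accesses excluded
    mu_min: int    # lower bound on shared-resource accesses
    mu_max: int    # upper bound on shared-resource accesses


def isolated_bounds(superblocks, access_time):
    """Lower/upper bound on the time for one pass over all super-blocks,
    assuming no contention (each access costs exactly access_time)."""
    lower = sum(s.exec_min + s.mu_min * access_time for s in superblocks)
    upper = sum(s.exec_max + s.mu_max * access_time for s in superblocks)
    return lower, upper


# Hypothetical task with two super-blocks and an access time of 1 unit.
task = [Superblock(2, 3, 1, 2), Superblock(4, 5, 0, 3)]
print(isolated_bounds(task, access_time=1))  # prints: (7, 13)
```

Any contention on the shared resource adds to the upper bound; quantifying that addition is the subject of the analysis on the following slides.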

Worst-Case Execution Time and Resource Accesses
[Figure: static WCET analysis tool flow, starting from the Executable Binary Program. Labeled components: Control Flow Graph (CFG) Reconstruction; Static Analysis (Value Analyzer, Loop Analysis and Unfolding, Loop Bounds); Micro architecture Analysis (Micro Architecture Abstraction, Pipeline Analyzer, Timing Information); Worst Case Path Analysis (Path Analysis, ILP Generator, ILP Solver); WCET Visualization and Analysis Evaluation.]

Task Model (cont.)
- Multiple tasks execute in a time wheel
- Periodic sequence of statically scheduled tasks
- The gap g between two tasks is variable

Deriving Interfering (Arrival) Curves
- Find all well-defined time windows
- Count the number of events that can happen
[Figure: two timelines of super-blocks $s_{1,1}, \dots, s_{6,1}$ within one period, followed by the gap g. Super-blocks 1, 3, 4, and 6 are annotated with $\mu^{\max}$ / $\mu^{\min}$; super-blocks 2 and 5 with $exec^{\max}$ / $exec^{\min}$. Legend: super-blocks considered for gap computation; super-blocks considered for time window computation; relevant time window. Annotation: maximize/minimize to compute the gap.]
- 1 superblock: $\hat{t} = \langle \mu^{\max}_{1,1},\ 0 \rangle$
- 2 superblocks: $\hat{t} = \langle \mu^{\max}_{1,1} + \mu^{\max}_{2,1},\ \mu^{\max}_{1,1} C \rangle$
- 4 superblocks: $\hat{t} = \langle \sum_{k=1}^{4} \mu^{\max}_{k,1},\ \sum_{k=1}^{3} \mu^{\max}_{k,1} C + exec^{\min}_{2,1} \rangle$
- 7 superblocks: $\hat{t} = \langle \sum_{k=1}^{7} \mu^{\max}_{k,1},\ \sum_{k=1}^{6} \mu^{\max}_{k,1} C + exec^{\min}_{2,1} + exec^{\min}_{5,1} + g \rangle$
- 5 superblocks (starting at the third): $\hat{t} = \langle \sum_{k=3}^{7} \mu^{\max}_{k,1},\ \sum_{k=3}^{6} \mu^{\max}_{k,1} C + exec^{\min}_{5,1} + g \rangle$

Resulting Interfering Curves
- Interference of each processing element
- $\bar{\alpha}_i^l(\Delta)$ is the interference of PE $i$ onto resource $l$, assuming PE $i$ executes in isolation
$$\hat{\alpha}_i(\Delta) = \begin{cases} \bar{\alpha}_i(\Delta) & 0 \le \Delta \le p_i \\ \max\{\bar{\alpha}_i(\Delta),\ \bar{\alpha}_i(\Delta - p_i) + \sum_j \mu^{\max}_{i,j}\} & p_i < \Delta \le 2 p_i \\ \bar{\alpha}_i(\Delta - k p_i) + k \sum_j \mu^{\max}_{i,j} & \text{otherwise} \end{cases}$$

Dynamic Arbitration
- Interference from other elements as arrival curve
- Access profile from the superblock under analysis

Analysis for Dynamic Arbitration
- Use dynamic programming

TDMA on the Shared Resource
- Independence between tasks
- Single source of interference
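
Because the TDMA schedule is the only source of interference, the delay of a single access depends only on when the request arrives relative to the core's own slot. The Python sketch below brute-forces that delay under simplifying assumptions that are not taken from the slides: integer time, one slot per cycle occupying offsets [0, slot_len), and a non-preemptive access that must fit entirely inside the slot. All names and numbers are hypothetical.

```python
def tdma_access_delay(arrival, cycle, slot_len, access_time):
    """Time from the request at `arrival` until the access completes."""
    if not 0 < access_time <= slot_len <= cycle:
        raise ValueError("need 0 < access_time <= slot_len <= cycle")
    t = arrival
    # Wait until the access can start and still finish inside the slot.
    while t % cycle > slot_len - access_time:
        t += 1
    return t + access_time - arrival


def worst_case_tdma_delay(cycle, slot_len, access_time):
    """Maximum delay over all integer arrival offsets within one cycle."""
    return max(
        tdma_access_delay(a, cycle, slot_len, access_time)
        for a in range(cycle)
    )


# Hypothetical configuration: cycle of 10, own slot of 4, access takes 2.
print(worst_case_tdma_delay(cycle=10, slot_len=4, access_time=2))  # prints: 9
```

In this configuration the worst case is a request that arrives one time unit too late to fit into the current slot and has to wait for the next cycle, regardless of what the other cores do.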

Superblocks with Phases
- Tasks are structured as sequences of superblocks
  - fixed order of execution
  - upper bound on execution and communication
  - phases may be present (acquisition/execution/replication)

Access Models
[Figure: three timelines of super-blocks $s_{1,1}, s_{2,1}, s_{3,1}$ repeating with period $W_1$, starting at $\rho_{1,1}$ and again at $\rho_{1,1} + W_1$.]
- General Model: each super-block is a single A/E/R block
- Dedicated Model: each super-block consists of separate A, E, and R phases
- Hybrid Model: each super-block consists of an A phase, an A/E/R phase, and an R phase

Adaptive Arbitration
[Figure: PU1, PU2, and PU3 connected to a shared resource. Super-block $s_{1,1}$ has separate A, E, and R phases ($\mu^{\max,a}_{1,1}$, $exec^{\max}_{1,1}$, $\mu^{\max,r}_{1,1}$); super-block $s_{1,2}$ is a single A/E/R block ($\mu^{\max}_{1,2}$, $exec^{\max}_{1,2}$); super-block $s_{1,3}$ has A, A/E/R, and R phases ($\mu^{\max,a}_{1,3}$, $\mu^{\max,e}_{1,3}$, $\mu^{\max,r}_{1,3}$, $exec^{\max}_{1,3}$). The arbiter timeline shows a static segment and a dynamic segment divided into minislots.]
- Use dynamic programming
- The algorithm is very complicated

Our Status

Where Are We?
- Safe worst-case response time analysis under the following assumptions:
  - sequential superblock executions
  - known resource arbitration
  - known task assignment
  - synchronized release of the first superblocks
  - VERY pessimistic interference, obtained by summing the curves of the interfering cores
- Tightness
  - The analysis is safe, but it does not consider the interference precisely.
  - The more cores the system has, the more pessimistic the resulting interference curve is.
  - We believe our results are not very close to the actual worst cases; however, they are safe.
- Comparison of TDMA and Round-Robin/FIFO
  - TDMA provides some resource reservation, but in the worst case the reservation is WASTED; however, it is easier to analyze tightly.
  - Round-Robin and FIFO are more difficult to analyze tightly.
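
The summation that makes the interference bound pessimistic is a pointwise addition of the per-core curves. A minimal Python sketch, assuming curves sampled at integer interval lengths and stored as lists (the function name and the sample curves are hypothetical):

```python
def summed_interference(curves):
    """Pointwise sum of the interference curves of all interfering cores,
    over the interval lengths that every curve covers."""
    length = min(len(c) for c in curves)
    return [sum(c[d] for c in curves) for d in range(length)]


# Hypothetical upper interference curves of two other cores.
core2 = [0, 1, 2, 2, 3]
core3 = [0, 2, 2, 3, 3]
print(summed_interference([core2, core3]))  # prints: [0, 3, 4, 5, 6]
```

Each per-core curve is derived as if that core ran in isolation, so the sum assumes that every core issues its maximum number of accesses in the same window. With more cores, that coincidence becomes less and less representative of the actual worst case.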

On-going works
- Dynamic servers may be helpful
  - They provide resource reservations that guarantee the service a core may get.
  - Their reservation is not based on TDMA, but the worst-case behavior may be like TDMA.
  - Good candidates: Constant Bandwidth Server (CBS) and Total Bandwidth Server (TBS)
- Optimizing the configuration of resource arbiters
  - For example: task assignment, slot assignments in TDMA, slot assignments in adaptive arbiters, etc.
  - So far, we are not able to provide any good or reasonable approaches.
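
As background for the TBS candidate: a Total Bandwidth Server with bandwidth U_s gives the k-th request, released at r_k with cost C_k, the deadline d_k = max(r_k, d_{k-1}) + C_k / U_s, with d_0 = 0, and the request is then scheduled by EDF. The sketch below only computes these deadlines. How such a server would be mapped onto shared-resource arbitration is not specified on the slide, and the request values are hypothetical.

```python
def tbs_deadlines(requests, bandwidth):
    """Deadlines assigned by a Total Bandwidth Server.

    requests  : list of (release_time, cost) pairs in arrival order
    bandwidth : server bandwidth U_s, with 0 < U_s <= 1
    """
    if not 0 < bandwidth <= 1:
        raise ValueError("bandwidth must be in (0, 1]")
    deadlines = []
    previous = 0.0
    for release, cost in requests:
        previous = max(release, previous) + cost / bandwidth
        deadlines.append(previous)
    return deadlines


# Hypothetical requests served with a quarter of the resource bandwidth.
print(tbs_deadlines([(0, 1), (2, 2), (20, 1)], bandwidth=0.25))
# prints: [4.0, 12.0, 24.0]
```

The second request inherits the first deadline as its starting point because it arrives before that deadline has passed; the third arrives after the server has been idle, so its deadline is counted from its own release.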

Conclusion
- Resource sharing in multicore systems is an important issue in terms of
  - predictability
  - efficiency
- Resource sharing is not an easy problem
  - Static arbitration policies: elimination of interference
  - Dynamic arbitration policies: approximation of interference

Other On-Going Works
- Power Management and Analysis
  - Power management for multiple voltage-island multicore architectures
  - Worst-case thermal analysis for multicore systems, 3D architectures, and schedulability analysis
  - Power management for smart grid and data centers
- Multicore Resource Sharing
  - Combining analysis by using abstract-level modeling and fine-grained model checking
  - Extending to more general event models
  - Adopting resource reservation servers in resource arbitration and timing analysis
- Schedulability for real-time systems and real-time database systems
  - Working on resource augmentation analysis and design (best paper candidate in RTSS 2011)

Thank you!