Precision Timed (PRET) Machines

Similar documents
Computing Needs Time

The Internet of Important Things

Timing Analysis of Embedded Software for Families of Microarchitectures

Synthesis of Distributed Real- Time Embedded Software

PREcision Timed (PRET) Architecture

Temporal Semantics in Concurrent and Distributed Software

Portable Real-Time Code from PTIDES Models

Design and Analysis of Time-Critical Systems Timing Predictability and Analyzability + Case Studies: PTARM and Kalray MPPA-256

Predictable Programming on a Precision Timed Architecture

Reconciling Repeatable Timing with Pipelining and Memory Hierarchy

Embedded Tutorial CPS Foundations

Single-Path Programming on a Chip-Multiprocessor System

C Code Generation from the Giotto Model of Computation to the PRET Architecture

Predictable Processors for Mixed-Criticality Systems and Precision-Timed I/O

PRET Compilation. Expose timing constructs. Hide machine dependent details. 1 Ptolemy II (using directors) and Modelyze (using embedded dominspecific

Precision Timed Machines

PRET-C: A New Language for Programming Precision Timed Architectures (extended abstract)

Precision Timed Infrastructure: Design Challenges

Design and Analysis of Time-Critical Systems Introduction

PTIDES: A Discrete-Event-Based Programming Model for Distributed Embedded Systems

SIC: Provably Timing-Predictable Strictly In-Order Pipelined Processor Core

A Single-Path Chip-Multiprocessor System

Computer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics

Worst Case Analysis of DRAM Latency in Multi-Requestor Systems. Zheng Pei Wu Yogen Krish Rodolfo Pellizzoni

FlexPRET: A Processor Platform for Mixed-Criticality Systems

The Case for the Precision Timed (PRET) Machine

Fun with a Deadline Instruction

Copyright 2012, Elsevier Inc. All rights reserved.

RTL Coding General Concepts

Introduction to Embedded Systems

Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Timing analysis and timing predictability

Abstract PRET Machines

Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

LECTURE 5: MEMORY HIERARCHY DESIGN

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture!

Introducing Embedded Systems: A Cyber- Physical Systems Approach

Mastering The Behavior of Multi-Core Systems to Match Avionics Requirements

Memory Controllers for Real-Time Embedded Systems. Benny Akesson Czech Technical University in Prague

Written Exam / Tentamen

Embedded Systems. 7. System Components

Predictable Timing of Cyber-Physical Systems Future Research Challenges

Using a Model Checker to Determine Worst-case Execution Time

FlexPRET: A Processor Platform for Mixed-Criticality Systems

Embedded processors. Timo Töyry Department of Computer Science and Engineering Aalto University, School of Science timo.toyry(at)aalto.

Memory Systems and Compiler Support for MPSoC Architectures. Mahmut Kandemir and Nikil Dutt. Cap. 9

Towards Architectural Support for Bandwidth Management in Mixed-Critical Embedded Systems

A Template for Predictability Definitions with Supporting Evidence

Embedded Systems: Hardware Components (part I) Todor Stefanov

Combining Induction, Deduction and Structure for Synthesis

Chapter 1. Computer Abstractions and Technology. Lesson 2: Understanding Performance

Module 18: "TLP on Chip: HT/SMT and CMP" Lecture 39: "Simultaneous Multithreading and Chip-multiprocessing" TLP on Chip: HT/SMT and CMP SMT

Beyond Embedded Systems: Integrating Computation, Networking, and Physical Dynamics

XPU A Programmable FPGA Accelerator for Diverse Workloads

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory

Out-of-Order Parallel Simulation of SystemC Models. G. Liu, T. Schmidt, R. Dömer (CECS) A. Dingankar, D. Kirkpatrick (Intel Corp.)

The Power Wall. Why Aren t Modern CPUs Faster? What Happened in the Late 1990 s?

Towards Lab Based MOOCs: Embedded Systems, Robotics, and Beyond

Energy-Efficient RISC-V Processors in 28nm FDSOI

A Time-Triggered View. Peter PUSCHNER

Computer Architecture

Deterministic Execution of Ptides Programs

BERKELEY PAR LAB. RAMP Gold Wrap. Krste Asanovic. RAMP Wrap Stanford, CA August 25, 2010

Introduction to Embedded Systems

Administrivia. Mini project is graded. 1 st place: Justin (75.45) 2 nd place: Liia (74.67) 3 rd place: Michael (74.49)

Introduction to Embedded Systems

Memory. Lecture 22 CS301

A Process Model suitable for defining and programming MpSoCs

CS 152 Computer Architecture and Engineering

Towards a Time-predictable Dual-Issue Microprocessor: The Patmos Approach

6.9. Communicating to the Outside World: Cluster Networking

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis

Quantitative Verification and Synthesis of Systems

Computer Systems Architecture

Graphical System Design. David Fuller LabVIEW R&D Section Manager

Lecture 1: Introduction

Introduction to Embedded Systems

Computer Architecture

Computer Architecture!

ESE532: System-on-a-Chip Architecture. Today. Message. Real Time. Real-Time Tasks. Real-Time Guarantees. Real Time Demands Challenges

Page 1. Multilevel Memories (Improving performance using a little cash )

SimXMD Co-Debugging Software and Hardware in FPGA Embedded Systems

Copyright 2012, Elsevier Inc. All rights reserved.

Multithreading: Exploiting Thread-Level Parallelism within a Processor

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS

A hardware operating system kernel for multi-processor systems

Predictable multithreading of embedded applications using PRET-C

Future Directions. Edward A. Lee. Berkeley, CA May 12, A New Computational Platform: Ubiquitous Networked Embedded Systems. actuate.

Computer Systems Architecture I. CSE 560M Lecture 18 Guest Lecturer: Shakir James

-- the Timing Problem & Possible Solutions

Sireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern

CS Computer Architecture

Adapted from David Patterson s slides on graduate computer architecture

Computer Architecture

ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University

System Design and Methodology/ Embedded Systems Design (Modeling and Design of Embedded Systems)

Transcription:

Precision Timed (PRET) Machines Edward A. Lee Robert S. Pepper Distinguished Professor UC Berkeley BWRC Open House, Berkeley, CA February, 2012 Key Collaborators on work shown here: Steven Edwards Jeff Jensen Sungjun Kim Isaac Liu Slobodan Matic Hiren Patel Jan Reinke Sanjit Seshia Mike Zimmer Jia Zou

A Story The Boeing 777 was Boeing s first fly-by-wire aircraft, controlled by software. It is deployed, appears to be reliable, and is succeeding in the marketplace. Therefore, it must be a success. However Boeing was forced to purchase and store an advance supply of the microprocessors that will run the software, sufficient to last for the estimated 50 year production run of the aircraft and another many years of maintenance. Why? Lee, Berkeley 2

Lesson from this example: Apparently, the software does not specify the behavior that has been validated and certified! Unfortunately, this problem is very common, even with less safety-critical, certification-intensive applications. Validation is done on complete system implementations, not on software. Lee, Berkeley 3

A Key Challenge: Timing is not Part of Software Semantics Correct execution of a program in C, C#, Java, Haskell, OCaml, etc. has nothing to do with how long it takes to do anything. All our computation and networking abstractions are built on this premise. Programmers have to step outside the programming abstractions to specify timing behavior. Lee, Berkeley 4

Execution-time analysis, by itself, does not solve the problem! Analyzing software for timing behavior requires: Our goal is to reduce the problem so that this is the only hard part. Paths through the program (undecidable) Detailed model of microarchitecture Detailed model of the memory system Complete knowledge of execution context Many constraints on preemption/concurrency Lots of time and effort And the result is valid only for that exact hardware and software! Fundamentally, the ISA of the processor has failed to provide an adequate abstraction. Wilhelm, et al. (2008). "The worst-case execution-time problem - overview of methods and survey of tools." ACM TECS 7(3): p1-53. Lee, Berkeley 5

PRET Machines PREcision-Timed processors = PRET Predictable, REpeatable Timing = PRET Performance with REpeatable Timing = PRET // Perform the convolution. for (int i=0; i<10; i++) { x[i] = a[i]*b[j-i]; // Notify listeners. notify(x[i]); } + = PRET Computing With time Lee, Berkeley 6

Dual Approach Rethink the ISA l Timing has to be a correctness property not a performance property. Implementation has to allow for multiple realizations and efficient realizations of the ISA l Repeatable execution times l Repeatable memory access times Lee, Berkeley 7

Example of one sort of mechanism we would like: tryin (500ms) { // Code block } catch { panic(); } If the code block takes longer than 500ms to run, then the panic() procedure will be invoked. But then we would like to verify that panic() is never invoked! jmp_buf buf; if (!setjmp(buf) ){ set_time r1, 500ms exception_on_expire r1, 0 // Code block deactivate_exception 0 } else { panic(); } exception_handler_0 () { } longjmp(buf) Pseudocode showing the mechanism in a mix of C and assembly. Lee, Berkeley 8

Extending an ISA with Timing Semantics [V1] Best effort: set_time r1, 1s // Code block delay_until r1 [V2] Late miss detec5on set_time r1, 1s // Code block branch_expired r1, <target> delay_until r1 [V3] Immediate miss detec5on set_time r1, 1s exception_on_expire r1, 1 // Code block deactivate_exception 1 delay_until r1 [V4] Exact execu5on: set_time r1, 1s // Code block MTFD r1 Lee, Berkeley 9

To provide timing guarantees, we need implementations that deliver repeatable timing Fortunately, electronics technology delivers highly reliable and precise timing but the overlaying software abstractions discard it. Chip architects heavily exploit the lack of temporal semantics. // Perform the convolution. for (int i=0; i<10; i++) { x[i] = a[i]*b[j-i]; // Notify listeners. notify(x[i]); } Lee, Berkeley 10

To deliver repeatable timing, we have to rethink the microarchitecture Challenges: l Pipelining l Memory hierarchy l I/O (DMA, interrupts) l Power management (clock and voltage scaling) l On-chip communication l Resource sharing (e.g. in multicore) Lee, Berkeley 11

Our Current PRET Architecture PTArm, a soft core on a Xilinx Virtex 5 FPGA Note inverted memory compared to multicore! Fast, close memory is shared, slow remote memory is private! Hardware Hardware Hardware thread Hardware thread thread thread registers scratch pad memory memory memory memory I/O devices Interleaved pipeline with one set of registers per thread SRAM scratchpad shared among threads DRAM main memory, separate banks per thread On a Virtex 6, we can fit 55 cores, for a total of 330 concurrent threads with perfectly controllable timing. Lee, Berkeley 12

Multicore PRET In today s multicore architectures, one thread can disrupt the timing of another thread even if they are running on different cores and are not communicating! Our preliminary work shows that control over timing enables conflict-free routing of messages in a network on chip, making it possible to have non-interfering programs on a multicore PRET. Lee, Berkeley 13

Status of the PRET project Results: l PTArm implemented on Xilinx Virtex 5 FPGA. l UNISIM simulator of the PTArm facilitates experimentation. l DRAM controller with repeatable timing and DMA support. l PRET-like utilities implemented on COTS Arm. Much still to be done: l Realize MTFD, interrupt I/O, compiler toolchain, scratchpad management, etc. Lee, Berkeley 14

A Key Next Step: Parametric PRET Architectures set_time r1, 1s // Code block MTFD r1 ISA that admits a variety of implementations: Variable clock rates and energy profiles Variable number of cycles per instruction Latency of memory access varying by address Varying sizes of memory regions A given program may meet deadlines on only some realizations of the same parametric PRET ISA. Lee, Berkeley 15

Realizing the MTFD instruction on a parametric PRET machine set_time r1, 1s // Code block MTFD r1 The goal is to make software that will run correctly on a variety of implementations of the ISA, and that correctness can be checked for each implementation. Lee, Berkeley 16

PRET Publications http://chess.eecs.berkeley.edu/pret/ S. Edwards and E. A. Lee, "The Case for the Precision Timed (PRET) Machine," in the Wild and Crazy Ideas Track of DAC, June 2007. B. Lickly, I. Liu, S. Kim, H. D. Patel, S. A. Edwards and E. A. Lee, Predictable programming on a precision timed architecture, CASES 2008. S. Edwards, S. Kim, E. A. Lee, I. Liu, H. Patel and M. Schoeberl, A Disruptive Computer Design Idea: Architectures with Repeatable Timing, ICCD 2009. D. Bui, H. Patel, and E. Lee, Deploying hard real-time control software on chip-multiprocessors, RTCSA 2010. Bui, E. A. Lee, I. Liu, H. D. Patel and J. Reineke, Temporal Isolation on Multiprocessing Architectures, DAC 2011. J. Reineke, I. Liu, H. D. Patel, S. Kim, E. A. Lee, PRET DRAM Controller: Bank Privatization for Predictability and Temporal Isolation (to appear), CODES +ISSS, Taiwan, October, 2011. S. Bensalem, K. Goossens, C. M. Kirsch, R. Obermaisser, E. A. Lee, J. Sifakis, Time-Predictable and Composable Architectures for Dependable Embedded Systems, Tutorial Abstract (to appear), EMSOFT, Taiwan, October, 2011 Lee, Berkeley 17

Conclusions Overview References: Bui, et al. Temporal Isolation on Multiprocessing Architectures, DAC 2011 Lee. Computing needs time. CACM, 52(5):70 79, 2009 Edwards & Lee, The case for Precision Timed (PRET) Machines, DAC 2007 Today, timing behavior is a property only of realizations of software systems. Tomorrow, timing behavior will be a semantic property of programs and models. Raffaello Sanzio da Urbino The Athens School Lee, Berkeley 18