Embedded Systems. 7. System Components

Embedded Systems. 7. System Components. Lothar Thiele

Contents of Course
1. Embedded Systems Introduction
2. Software Introduction
3. Real-Time Models
4. Periodic/Aperiodic Tasks
5. Resource Sharing
6. Real-Time OS
7. System Components
8. Communication
9. Low Power Design
10. Models
11. Architecture Synthesis
12. Model Based Design
Group labels on the slide: Software and Programming; Processing and Communication; Hardware

Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop").
[Figure: hardware-in-the-loop diagram; recoverable labels: "this course", "embedded system", "actuators"]

Topics
- System Specialization
- Application Specific Instruction Sets
  - Micro Controller
  - Digital Signal Processors and VLIW
- Programmable Hardware
- ASICs
- System-on-Chip

Implementation Alternatives
- General-purpose processors
- Application-specific instruction set processors (ASIPs)
  - Microcontroller
  - DSPs (digital signal processors)
- Programmable hardware
  - FPGA (field-programmable gate arrays)
- Application-specific integrated circuits (ASICs)
Along this list, performance and power efficiency increase toward ASICs, while flexibility increases toward general-purpose processors.

Energy Efficiency
[Figure; source: Hugo De Man, IMEC, Philips, 2007]

General-purpose Processors
- High performance
  - Highly optimized circuits and technology
  - Use of parallelism
    - superscalar: dynamic scheduling of instructions
    - super-pipelining: instruction pipelining, branch prediction, speculation
  - Complex memory hierarchy
- Not suited for real-time applications
  - Execution times are highly unpredictable because of intensive resource sharing and dynamic decisions
- Properties
  - Good average performance for a large application mix
  - High power consumption

General-purpose Processors: Multicore Processors
- Potential of providing higher execution performance by exploiting parallelism
- Especially useful in high-performance embedded systems, e.g. autonomous driving
- Disadvantages and problems for embedded systems:
  - Increased interference on shared resources such as buses and shared caches
  - Increased timing uncertainty
  - Often, there is limited parallelism in embedded applications

Multicore Examples
[Figures: one processor with 48 cores, one with 4 cores]

Multicore Examples
- Intel Xeon Phi (5 billion transistors, 22 nm technology, 350 mm² area)
- Oracle Sparc T5

Embedded Multicore Example
Recent development: specialize multicore processors towards real-time processing and low power consumption.
Target domains: [figure]

System Specialization
The main difference between general-purpose, highest-volume microprocessors and embedded systems is specialization.
- Specialization should respect flexibility
  - application-domain-specific systems shall cover a class of applications
  - some flexibility is required to account for late changes and debugging
- System analysis is required
  - identification of application properties which can be used for specialization
  - quantification of individual specialization effects

Architecture Specialization Techniques
[Figure: a simple system design classification with three levels: system design, system component design, logic design. Recoverable labels: DSP, subsystems, processors, buses, data paths, coprocessors, interfaces, micro controllers, conf. HW (FPGA), functions, memory, blocks; under logic design: logic cells, switch elements, memory cells]

Example: Code-size Efficiency
RISC (Reduced Instruction Set Computer) machines are designed for run-time efficiency, not for code-size efficiency.
Compression techniques, key idea: [Figure showing a (de)compressor]

Example: Multimedia Instructions
Multimedia instructions exploit the fact that many registers, adders, etc. are quite wide (32/64 bit), whereas most multimedia data types are narrow (e.g. 8 bits per color, 16 bits per audio sample per channel).
2 to 8 values can be stored per register and added.
[Figure: element-wise addition of two registers; 4 additions per instruction; carry disabled at word boundaries]
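The packed addition described above can be imitated in portable C. This is only a sketch of the idea, not the instruction of any particular processor: the function name `packed_add_u8x4`, the choice of four 8-bit lanes in a 32-bit word, and wrap-around (rather than saturating) arithmetic are all assumptions made for illustration.

```c
#include <stdint.h>

/* Add four unsigned 8-bit lanes packed into two 32-bit words.
   Each lane wraps modulo 256; no carry crosses a lane boundary. */
uint32_t packed_add_u8x4(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu;  /* low 7 bits of every lane */
    const uint32_t msb  = 0x80808080u;  /* top bit of every lane */

    /* Adding only the low 7 bits lets a carry reach bit 7 of its own
       lane but never spill into the neighbouring lane. */
    uint32_t sum = (a & low7) + (b & low7);

    /* XOR adds the top bits without generating a carry out of the lane. */
    return sum ^ ((a ^ b) & msb);
}
```

With hardware support, the masking disappears: the adder itself suppresses the carry at the lane boundaries, so the four additions cost a single instruction.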

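Example: Heterogeneous registers
Example (ADSP 210x):
[Figure: memories D and P; address registers A0, A1, A2, ...; address generation unit (AGU); ALU (+, -, ...) with registers AX, AY, AF and result register AR; multiplier (*) and adder (+, -) with registers MX, MY, MF and result register MR]
Different functionality of registers AR, AX, AY, AF, MX, MY, MF, MR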

Example: Multiple memory banks or memories
[Figure: same ADSP 210x data path as on the previous slide, with the two memories D and P]
Simplifies parallel fetches

Example: Address generation units
Example (ADSP 210x):
- Data memory can only be accessed with an address contained in register file A, but the address update can be done in parallel with the operation in the main data path (takes effectively 0 time).
- Register file A contains several precomputed addresses A[i].
- There is another register file M that contains modification values M[j].
- Possible updates:
  - M[j] := immediate
  - A[i] := A[i] ± M[j]
  - A[i] := A[i] ± 1
  - A[i] := A[i] ± immediate
  - A[i] := immediate
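A small software model may help to read the update rules above. It is a sketch under stated assumptions: the type `agu_t`, the function `agu_fetch_post_modify`, and the register count of 4 are hypothetical and do not describe the actual ADSP 210x register files. On real hardware the address update happens in parallel with the data path operation; here it is simply the statement after the fetch.

```c
#include <stdint.h>

enum { AGU_REGS = 4 };  /* register count is an assumption for this sketch */

typedef struct {
    uint16_t A[AGU_REGS];  /* precomputed addresses A[i] */
    int16_t  M[AGU_REGS];  /* modification values M[j] */
} agu_t;

/* Fetch mem[A[i]], then apply the update A[i] := A[i] + M[j]. */
int16_t agu_fetch_post_modify(agu_t *agu, const int16_t *mem, int i, int j)
{
    int16_t value = mem[agu->A[i]];
    agu->A[i] = (uint16_t)(agu->A[i] + agu->M[j]);
    return value;
}
```

A negative value in M[j] covers the "minus" case of A[i] := A[i] ± M[j], so one function models both directions.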

Example: Modulo addressing
Modulo addressing: Am++ is defined as Am := (Am+1) mod n (implements a ring or circular buffer in memory).
[Figure: signal x over time t with a sliding window of n values ending at t1; x[t] is the value accessed at time t]
Memory at time t1:   .., x[t1-1], x[t1], x[t1-n+1], x[t1-n+2], ..
Memory at time t1+1: .., x[t1-1], x[t1], x[t1+1], x[t1-n+2], ..
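The rule Am := (Am+1) mod n can be sketched in C as a ring buffer that always holds the last n samples. The names `ring_t`, `ring_push`, `ring_get` and the window length of 4 are assumptions chosen for this illustration; a DSP with modulo addressing performs the wrap-around in its address generation unit instead of with an explicit `%`.

```c
#include <stddef.h>

enum { RING_N = 4 };  /* window length n; 4 is an arbitrary choice */

typedef struct {
    float  x[RING_N];  /* the last RING_N samples, in ring order */
    size_t am;         /* write index, plays the role of register Am */
} ring_t;

/* Overwrite the oldest sample, then Am := (Am + 1) mod n. */
void ring_push(ring_t *r, float sample)
{
    r->x[r->am] = sample;
    r->am = (r->am + 1) % RING_N;
}

/* Return the sample pushed k steps before the newest one (0 <= k < RING_N). */
float ring_get(const ring_t *r, size_t k)
{
    return r->x[(r->am + RING_N - 1 - k) % RING_N];
}
```

After six pushes into a window of four, only the last four samples remain: the fifth and sixth samples have overwritten the first and second, exactly as x[t1+1] replaces x[t1-n+1] in the memory picture.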

Control Dominated Systems
- Reactive systems with event-driven behavior
- Underlying semantics of the system description ("input model of computation"): typically (coupled) Finite State Machines or Petri Nets
[Figure: coupled state machines; recoverable labels: I/O signals, output signals]

Microcontroller
- control-dominant applications
  - supports process scheduling and synchronization
  - preemption (interrupt), context switch
  - short latency times
- low power consumption
- peripheral units often integrated
- suited for real-time applications
[Figure: SIECO51 (Siemens), 8051 core; label: Major System Components]

Microcontroller as a System-on-Chip
- complete system
- timers
- I²C bus and parallel/serial interfaces for communication
- A/D converter
- watchdog (SW activity timeout): safety
- on-chip memory (volatile/non-volatile)
- interrupt controller
[Figure: MSP430 RISC processor (Texas Instruments)]

Data Dominated Systems
- Streaming-oriented systems with mostly periodic behavior
- Underlying semantics of the input description: e.g. flow graphs ("input model of computation")
[Figure: flow graph of functions f1, f2, f3 connected through buffers; B: buffer]
- Application examples: signal processing, control engineering

Very Long Instruction Word (VLIW)
- Key idea: possible parallelism is detected by the compiler, not by hardware at run-time (which is inefficient).
- VLIW: parallel operations (instructions) are encoded in one long word (instruction packet), each instruction controlling one functional unit.
E.g.: [figure]

Explicit Parallelism Instruction Computers
The TMS320C62xx VLIW processor as an example of EPIC. Each instruction is 32 bits wide (bits 31 to 0); bit 0 of an instruction indicates whether the following instruction executes in the same cycle.

  Instruction  A  B  C  D  E  F  G
  Bit 0        0  1  1  0  1  1  0

  Cycle  Instructions
  1      A
  2      B C D
  3      E F G
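The cycle table on this slide follows from the bit-0 values 0 1 1 0 1 1 0 of instructions A to G: a 1 chains the following instruction into the same cycle, a 0 ends the group. The sketch below reproduces that grouping in C. The function name `assign_cycles` and the array representation are assumptions for illustration, and the model is simplified: it ignores fetch-packet boundaries and functional-unit constraints of the real processor.

```c
#include <stddef.h>

/* For each instruction k, store the 1-based cycle in which it issues.
   p[k] == 1 means instruction k+1 issues in the same cycle as k;
   p[k] == 0 means instruction k+1 starts a new cycle.
   Returns the number of cycles used (0 for an empty sequence). */
size_t assign_cycles(const int *p, size_t count, size_t *cycle)
{
    size_t current = 1;
    for (size_t k = 0; k < count; k++) {
        cycle[k] = current;
        if (p[k] == 0)
            current++;
    }
    return count ? cycle[count - 1] : 0;
}
```

Applied to the seven bits from the slide, the function places A in cycle 1, B, C and D in cycle 2, and E, F and G in cycle 3. The grouping is fixed by the compiler when it sets the bits, so the hardware needs no dependency checking to find it.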

MAC (multiply & accumulate)

    sum = 0.0;
    for (i=0; i<n; i++)
        sum = sum + a[i]*b[i];

Zero-overhead loop (repeat next instruction N times) and MAC instruction:

    LDF   0, R0
    LDF   0, R1
    RPTS  N
    MPYF3 *(AR0)++, *(AR1)++, R0
    ADDF3 R0, R1, R1

TMS320C3x Assembler (Texas Instruments)
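For comparison with the assembler listing, the same loop can be written as a complete C function whose body is one multiply-accumulate per iteration with post-incremented pointers, in the spirit of *(AR0)++ and *(AR1)++. The function name `mac` and the use of `float` are assumptions for this sketch; whether a compiler maps it to a zero-overhead loop depends on the target and is not claimed here.

```c
#include <stddef.h>

/* Dot product of a[0..n-1] and b[0..n-1].
   Each iteration is one multiply-accumulate; both pointers are
   post-incremented, as the address registers are in the assembler. */
float mac(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    while (n--)
        sum += *a++ * *b++;
    return sum;
}
```

On a general-purpose processor each iteration also pays for the loop counter update and the branch. The RPTS instruction removes exactly that overhead, which is why the DSP version spends its cycles on the multiply and the add alone.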

Digital Signal Processor
- optimized for data-flow applications
- suited for simple control flow
- parallel hardware units (VLIW)
- specialized instruction set
- high data throughput
- zero-overhead loops
- specialized memory
- suited for real-time applications
[Label: Major System Components]

Example: Infineon
Processor core for car mirrors
[Figure; source: Infineon]

Example: NXP TriMedia VLIW
[Figure: block diagram; recoverable labels: VLIW, MIPS]

FPGA Basic Structure
- Logic units
- I/O units
- Connections

FPGA Classification
- Granularity of logic units: gate, tables, memory, functional blocks (ALU, control, data path, processor)
- Communication network: crossbar, hierarchical mesh, tree
- Reconfiguration: fixed at production time, once at design time, dynamic during run-time

Floor-plan of Virtex-II FPGAs
[Figure]

Virtex Logic Cell
[Figure; source: Xilinx Inc., Virtex-II Pro Platform FPGAs: Functional Description, Sept. 2002, www.xilinx.com]

Example: Virtex-6
Combination of flexibility (CLBs), integration, and performance (heterogeneity of hard-IP blocks)
[Figure labels: clock distribution; logic (CLB); interfaces (PCI, high speed); memory (RAM); DSP slice; fast communication]

Xilinx Virtex UltraScale
[Figures; label: Virtex-6 CLB slice]

Application-Specific Circuits (ASICs)
- Custom-designed circuits are necessary if ultimate speed or energy efficiency is the goal and large numbers can be sold.
- The approach suffers from long design times, lack of flexibility (changing standards) and high costs (e.g. mask costs in the millions of dollars).

System-on-Chip
Samsung Galaxy Note II
- Exynos 4412 System on a Chip (SoC)
- ARM Cortex-A9 processing core
- 32 nanometer: transistor gate width
- Four processing cores

Configurable System-on-Chip
Example: Altera's SoC FPGA integrates a dual-core ARM Cortex-A9 processor system with a low-power FPGA fabric.

Trend: multiprocessor systems-on-a-chip (MPSoCs)
[Figures; source: http://www.mpsoc-forum.org/2007/slides/hattori.pdf]
