Sri Vidya College of Engineering and Technology. EC6703 Embedded and Real Time Systems Unit IV Page 1.

Similar documents
Hardware Accelerators

Co-synthesis and Accelerator based Embedded System Design

EE382V: System-on-a-Chip (SoC) Design

8. Networks. Why networked embedded systems General network architecture. Networks. Internet-enabled embedded systems Sensor networks

CprE 488 Embedded Systems Design. Lecture 4 Interfacing Technologies

Link Layer and Ethernet

Link Layer and Ethernet

Today. Last Time. Motivation. CAN Bus. More about CAN. What is CAN?

Links Reading: Chapter 2. Goals of Todayʼs Lecture. Message, Segment, Packet, and Frame

EC EMBEDDED AND REAL TIME SYSTEMS

Outline: Connecting Many Computers

Lecture 5 The Data Link Layer. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

High Level View. EE 122: Ethernet and Random Access protocols. Medium Access Protocols

DISTRIBUTED EMBEDDED ARCHITECTURES

EE 122: Ethernet and

Medium Access Protocols

Reminder: Datalink Functions Computer Networking. Datalink Architectures

Lecture 6 The Data Link Layer. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

CS 43: Computer Networks Switches and LANs. Kevin Webb Swarthmore College December 5, 2017

ECE 551 System on Chip Design

Topics. Link Layer Services (more) Link Layer Services LECTURE 5 MULTIPLE ACCESS AND LOCAL AREA NETWORKS. flow control: error detection:

The Controller Area Network (CAN) Interface

Embedded Systems: Hardware Components (part II) Todor Stefanov

Lecture 6. Data Link Layer (cont d) Data Link Layer 1-1

UNIT I [INTRODUCTION TO EMBEDDED COMPUTING AND ARM PROCESSORS] PART A

Lecture 9 The Data Link Layer part II. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

Summary of MAC protocols

Lecture 9: Bridging. CSE 123: Computer Networks Alex C. Snoeren

Media Access Control (MAC) Sub-layer and Ethernet

Controller area network

Networked Control Systems for Manufacturing: Parameterization, Differentiation, Evaluation, and Application. Ling Wang

Links. CS125 - mylinks 1 1/22/14

The Link Layer II: Ethernet

Used to build distributed embedded systems Processing g elements (PEs) are connected by a network The application is distributed over the PEs

Raspberry Pi - I/O Interfaces

Adaptors Communicating. Link Layer: Introduction. Parity Checking. Error Detection. Multiple Access Links and Protocols

«Real Time Embedded systems» Multi Masters Systems

Buses. Maurizio Palesi. Maurizio Palesi 1

Data and Computer Communications. Chapter 11 Local Area Network

Interconnection Structures. Patrick Happ Raul Queiroz Feitosa

ELEC 5260/6260/6266 Embedded Computing Systems

EECS150 - Digital Design Lecture 15 - Project Description, Part 5

SCSI is often the best choice of bus for high-specification systems. It has many advantages over IDE, these include:

Course Introduction. Purpose. Objectives. Content. Learning Time

Redes de Computadores. Medium Access Control

Network Models. Presentation by Dr.S.Radha HOD / ECE SSN College of Engineering

CS 416: Operating Systems Design April 11, 2011

CHAPTER 4 MARIE: An Introduction to a Simple Computer

Embedded Applications. COMP595EA Lecture03 Hardware Architecture

Systems. Roland Kammerer. 10. November Institute of Computer Engineering Vienna University of Technology. Communication Protocols for Embedded

Lecture: Communication in PLC

CMPE 150/L : Introduction to Computer Networks. Chen Qian Computer Engineering UCSC Baskin Engineering Lecture 18

Links. Error Detection. Link Layer. Multiple access protocols. Nodes Links Frame. Shared channel Problem: collisions How nodes share a channel

Applied Networks & Security

ES623 Networked Embedded Systems

COS 140: Foundations of Computer Science

Communication (III) Kai Huang

Getting Connected (Chapter 2 Part 4) Networking CS 3470, Section 1 Sarah Diesburg

Communication in Avionics

Applied Networks & Security

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies

LAN PROTOCOLS. Beulah A AP/CSE

Additional Slides (informative)

Level 1: Physical Level 2: Data link Level 3: Network Level 4: Transport

Chapter 2 Network Models 2.1

Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.

The Link Layer and LANs: Ethernet and Swiches

SEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20 th Oct,2010

Chapter Topics Part 1. Network Definitions. Behind the Scenes: Networking and Security

CSE 461: Multiple Access Networks. This Lecture

Frames, Packets and Topologies

COS 140: Foundations of Computer Science

Parallel Programming Multicore systems

Link Layer and LANs 안상현서울시립대학교컴퓨터 통계학과.

CMSC 611: Advanced Computer Architecture

Computer Networks Principles LAN - Ethernet

Silberschatz and Galvin Chapter 15

Mapping applications into MPSoC

Embedded Systems. 7. System Components

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Adaptors Communicating. Link Layer: Introduction. Parity Checking. Error Detection. Multiple Access Links and Protocols

Multiprocessing and Scalability. A.R. Hurson Computer Science and Engineering The Pennsylvania State University

19: Networking. Networking Hardware. Mark Handley

CS 856 Latency in Communication Systems

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures

SoC Platforms and CPU Cores

Computer Networks. Today. Principles of datalink layer services Multiple access links Adresavimas, ARP LANs Wireless LANs VU MIF CS 1/48 2/48

ECE 1160/2160 Embedded Systems Design. Midterm Review. Wei Gao. ECE 1160/2160 Embedded Systems Design

Keystone Architecture Inter-core Data Exchange

Computers as Components Principles of Embedded Computing System Design

Data Link Layer: Overview, operations

Chapter 16 Networking

Module 15: Network Structures

Computer and Network Security

Local Area Networks. Ethernet LAN

LAN Protocols. Required reading: Forouzan 13.1 to 13.5 Garcia 6.7, 6.8. CSE 3213, Fall 2015 Instructor: N. Vlajic

CS 43: Computer Networks Media Access. Kevin Webb Swarthmore College November 30, 2017

FYS Data acquisition & control. Introduction. Spring 2018 Lecture #1. Reading: RWI (Real World Instrumentation) Chapter 1.

ECE4110 Internetwork Programming. Introduction and Overview

The Link Layer and LANs. Chapter 6: Link layer and LANs

Transcription:

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 1

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 2

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 3

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 4

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 5

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 6

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 7

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 8

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 9

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 10

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 11

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 12

Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 13

Why distributed? Higher performance at lower cost. Physically distributed activities---time constants may not allow transmission to central site. Improved debugging---use one CPU in network to debug others. May buy subsystems that have embedded processors. 1

Network abstractions International Standards Organization (ISO) developed the Open Systems Interconnection (OSI) model to describe networks: 7-layer model. Provides a standard way to classify network components and operations. 2

OSI model application presentation session transport network data link physical end-use interface data format application dialog control connections end-to-end service reliable data transport mechanical, electrical 3

OSI layers Physical: connectors, bit formats, etc. Data link: error detection and control across a single link (single hop). Network: end-to-end multi-hop data communication. Transport: provides connections; may optimize network resources. 4

OSI layers, cont d. Session: services for end-user applications: data grouping, checkpointing, etc. Presentation: data formats, transformation services. Application: interface between network and end-user programs. 5

Vehicles as networks 1/3 of cost of car/airplane is electronics/avionics. Dozens of microprocessors are used throughout the vehicle. Network applications: Vehicle control. Instrumentation. Communication. Passenger entertainment systems. 6

CAN bus First used in 1991. Serial bus, 1 Mb/sec up to 40 m. Synchronous bus. Logic 0 dominates logic 1 on bus. Arbitrated with CSMA/AMP: Arbitration on message priority. 7

CAN data frame 11 bit destination address. RTR bit determines read/write from/to destination. Any node can detect bus error, interrupt packet for retransmission. 8

CAN controller Controller implements physical and data link layers. No network layer needed---bus provides end-to-end connections. 9

Other vehicle busses FlexRay is next generation: Time triggered protocol. 10 Mb/s. Local Interconnect Network (LIN) connects devices in a small area (e.g., door). Passenger entertainment networks: Bluetooth. Media Oriented Systems Transport (MOST). 10

Automobile network Engine provides power to drive the wheels via the transmission. Transmission adjusts gearing based on operating characteristics. ABS controls brakes. 11

Freescale MPC5676R Designed for cars and power trains. Two Power cores with vector units. Time processing units can sense and generate waveforms, offloading CPUs. 12

Avionics Aviation electronics must be certified. Early systems used line replaceable units which operated separately. Boeing 777 uses a bus-based system with core processor modules connected by SAFEbus. Federated network has networks for functions. Boeing 787 does not fix the mapping from applications to network units. 13

I 2 C bus Designed for low-cost, medium data rate applications. Characteristics: serial; multiple-master; fixed-priority arbitration. Several microcontrollers come with built-in I 2 C controllers. 14

I 2 C physical layer SDL SCL master 1 master 2 data line clock line slave 1 slave 2 15

I 2 C data format SCL...... SDL... start MSB ack 16

I 2 C electrical interface Open collector interface: + SDL + SCL 17

I 2 C signaling Sender pulls down bus for 0. Sender listens to bus---if it tried to send a 1 and heard a 0, someone else is simultaneously transmitting. Transmissions occur in 8-bit bytes. 18

I 2 C data link layer Every device has an address (7 bits in standard, 10 bits in extension). Bit 8 of address signals read or write. General call address allows broadcast. 19

I 2 C bus arbitration Sender listens while sending address. When sender hears a conflict, if its address is higher, it stops signaling. Low-priority senders relinquish control early enough in clock cycle to allow bit to be transmitted reliably. 20

I 2 C transmissions multi-byte write S adrs 0 data data P read from slave S adrs 1 data P write, then read S adrs 0 data S adrs 1 data P 21

Ethernet Dominant non-telephone LAN. Versions: 10 Mb/s, 100 Mb/s, 1 Gb/s Goal: reliable communication over an unreliable medium. 22

Ethernet topology Bus-based system, several possible physical layers: A B C 23

CSMA/CD Carrier sense multiple access with collision detection: sense collisions; exponentially back off in time; retransmit. 24

Exponential back-off times time 25

Ethernet packet format preamble start frame source adrs dest adrs data length padding CRC payload 26

Ethernet performance Quality-of-service tends to non-linearly decrease at high load levels. Can t guarantee real-time deadlines. However, may provide very good service at proper load levels. 27

Fieldbus Used for industrial control and instrumentation---factories, etc. H1 standard based on 31.25 MB/s twisted pair medium. High Speed Ethernet (HSE) standard based on 100 Mb/s Ethernet. 28

MPSOCS and Shared Memory Multiprocessors

Networks and multiprocessors Single-chip multiprocessors. Accelerators and heterogeneous multiprocessors. 30

Shared memory multiprocessors Multi-core systems are symmetric multiprocessors with identical PEs. Many embedded multiprocessors are heterogeneous: Multiple programmable CPU types. Hardwired accelerators. Heterogeneous multiprocessors use less energy, are less expensive. 31

TI TMS320DM816x DaVinci ARM Cortex A8 plus C674x VLIW DSP. HD video coprocessor accelerates H.264, MPEG-4, etc. HD video processing susystem provides additional video processing. Graphics unit performs 3D graphics. 32

Accelerated systems Use additional computational unit dedicated to some functions? Hardwired logic. Extra CPU. Hardware/software co-design: joint design of hardware and software architectures. 33

Accelerated system architecture CPU request data accelerator result data memory I/O 34

Accelerator vs. co-processor A co-processor executes instructions. Instructions are dispatched by the CPU. An accelerator appears as a device on the bus. The accelerator is controlled by registers. 35

Accelerator implementations Application-specific integrated circuit. Field-programmable gate array (FPGA). Standard component. Example: graphics processor. 36

Xilinx Zynq-7000 Dual-CPU ARM MPCore augmented with an FPGA fabric. AMBA bus connects to CPUs and FPGA fabric. 37

System design tasks Design a heterogeneous multiprocessor architecture. Processing element (PE): CPU, accelerator, etc. Program the system. 38

Accelerated system design First, determine that the system really needs to be accelerated. How much faster is the accelerator on the core function? How much data transfer overhead? Design the accelerator itself. Design CPU interface to accelerator. 39

Accelerated system platforms Several off-the-shelf boards are available for acceleration in PCs: FPGA-based core; PC bus interface. 40

Accelerator/CPU interface Accelerator registers provide control registers for CPU. Data registers can be used for small data objects. Accelerator may include special-purpose read/write logic. Especially valuable for large data transfers. 41

System integration and debugging Try to debug the CPU/accelerator interface separately from the accelerator core. Build scaffolding to test the accelerator. Hardware/software co-simulation can be useful. 42

Caching problems Main memory provides the primary data transfer mechanism to the accelerator. Programs must ensure that caching does not invalidate main memory data. CPU reads location S. Accelerator writes location S. CPU writes location S. 43

Synchronization As with cache, main memory writes to shared memory may cause invalidation: CPU reads S. Accelerator writes S. CPU reads S. 44

Multiprocessor performance analysis Effects of parallelism (and lack of it): Processes. CPU and bus. Multiple processors. 45

Accelerator speedup Critical parameter is speedup: how much faster is the system with the accelerator? Must take into account: Accelerator execution time. Data transfer time. Synchronization with the master CPU. 46

Accelerator execution time Total accelerator execution time: t accel = t in + t x + t out Data input Data output Accelerated computation 47

Accelerator speedup Assume loop is executed n times. Compare accelerated system to nonaccelerated system: S = n(t CPU - t accel ) = n[t CPU - (t in + t x + t out )] Execution time on CPU 48

Single- vs. multi-threaded One critical factor is available parallelism: single-threaded/blocking: CPU waits for accelerator; multithreaded/non-blocking: CPU continues to execute along with accelerator. To multithread, CPU must have useful work to do. But software must also support multithreading. 49

Total execution time Single-threaded: Multi-threaded: P1 P1 P2 A1 P2 A1 P3 P3 P4 P4 50

Execution time analysis Single-threaded: Count execution time of all component processes. Multi-threaded: Find longest path through execution. 51

Sources of parallelism Overlap I/O and accelerator computation. Perform operations in batches, read in second batch of data while computing on first batch. Find other work to do on the CPU. May reschedule operations to move work after accelerator initiation. 52

Data input/output times Bus transactions include: flushing register/cache values to main memory; time required for CPU to set up transaction; overhead of data transfers by bus packets, handshaking, etc. 53

Scheduling and allocation Must: schedule operations in time; allocate computations to processing elements. Scheduling and allocation interact, but separating them helps. Alternatively allocate, then schedule. 54

Example: scheduling and allocation P1 P2 d1 d2 M1 M2 P3 Task graph Hardware platform 55

First design Allocate P1, P2 -> M1; P3 -> M2. M1 P1 P1C P2 P2C M2 P3 time 56

Second design Allocate P1 -> M1; P2, P3 -> M2: M1 P1 P1C M2 P2 P3 time 57

Example: adjusting messages to reduce delay Task graph: Network: 3 execution time 3 allocation P1 d1 d2 P2 M1 M2 M3 4 P3 Transmission time = 4 58

Initial schedule M1 P1 M2 P2 M3 P3 network d1 d2 Time = 15 0 5 10 15 20 time 59

New design Modify P3: reads one packet of d1, one packet of d2 computes partial result continues to next packet 60

New schedule M1 P1 M2 P2 M3 P3 P3 P3 P3 network d1 d2 d1 d2 d1 d2 d1 d2 Time = 12 0 5 10 15 20 time 61

Buffering and performance Buffering may sequentialize operations. Next process must wait for data to enter buffer before it can continue. Buffer policy (queue, RAM) affects available parallelism. 62

Buffers and latency Three processes separated by buffers: B1 A B2 B B3 C 63

Buffers and latency schedules A[0] A[1] B[0] B[1] C[0] C[1] Must wait for all of A before getting any B A[0] B[0] C[0] A[1] B[1] C[1] 64