VIII. DSP Processors. Digital Signal Processing 8 December 24, 2009

DSP-8 (DSP Processors), Dr. Ravi Billa

2007 Syllabus:
Introduction to programmable DSPs: Multiplier and Multiplier-Accumulator (MAC), Modified bus structures and memory access schemes in DSPs, Multiple access memory, Multiport memory, VLSI architecture, Pipelining, Special addressing modes, On-chip peripherals.
Architecture of TMS320C5X: Introduction, Bus structure, Central Arithmetic Logic Unit, Auxiliary register, Index register, Auxiliary register compare register, Block move address register, Parallel Logic Unit, Memory mapped registers, Program controller, Some flags in the status registers, On-chip registers, On-chip peripherals.

Contents:
8.1 DSP Processors Market
8.2 DSP Processors Features
8.3 Multiply-and-Accumulate
8.4 Interrupts - handling incoming signal values
8.5 Fixed- and Floating-point
8.6 Real-time FIR filter example

We shall use DSP to mean Digital Signal Processor(s) and sometimes even refer to them as DSP processors, also as programmable digital signal processors (PDSPs). This includes
1. General purpose DSPs (such as the TMS320s of TI and the DSP563s of Motorola and others)
2. Special purpose DSPs tailored to specific applications like FFT

However, the so-called programmability of DSPs mentioned above pales in comparison to that of general purpose CPUs. In what follows, for the most part, we contrast general purpose CPUs (whose strength is in general purpose programmability) with DSPs (whose strength is in high-throughput, hardwired number crunching). For a convergence of both kinds of processor refer to the article on Intel's Larrabee in IEEE Spectrum, January.

8.1 DSP Processors Market

Market
1. Small, low-power, relatively weak DSPs: mass-produced, inexpensive consumer products (toys, automobiles)
2. More capable fixed-point processors: cell phones, digital answering machines, modems
3. Strongest, often floating-point, DSPs: image and video processing, server applications

8.2 DSP Processors Features

Features
1. DSP-specific instructions (e.g., MAC)
2. Special address registers
3. Zero-overhead loops
4. Multiple memory buses and banks
5. Instruction pipelines
6. Fast interrupt servicing (fast context switch)
7. Specialized IO ports
8. Special addressing modes (e.g., bit reversal)

8.3 Multiply-and-Accumulate

DSP algorithms are characterized by intensive number crunching that may exceed the capabilities of a general purpose CPU. Because its arithmetic instructions are specifically tailored to DSP needs, a DSP processor can be much faster for specific tasks. The most common special purpose task in digital signal processing is the multiply-and-accumulate (MAC) operation, illustrated by the FIR filter

$$y(n) = \sum_{j=0}^{M} b_j \, x(n-j) = \sum_{j=0}^{M} b_j \, x_{n-j} = b_0 x(n) + b_1 x(n-1) + \dots + b_M x(n-M)$$

Such a repeated MAC operation occurs in other situations as well. Further, the operands b and x need not have the same index j. Letting b and x have independent indices j and k, the following loop accomplishes the MAC operation:

    Loop:
        update j, update k
        a ← a + b_j x_k
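The FIR sum can be written out as an explicit MAC loop. The following is a minimal sketch in Python rather than DSP assembly; the function name `fir_filter` and the convention that x(n) = 0 for n < 0 are assumptions made for illustration, not part of the original notes.

```python
def fir_filter(b, x):
    """Direct-form FIR: y(n) = sum over j of b[j] * x(n - j).

    Assumption for this sketch: samples before the start of x are zero.
    """
    M = len(b) - 1
    y = []
    for n in range(len(x)):
        a = 0                      # accumulator, cleared for each output sample
        for j in range(M + 1):     # coefficient index j
            k = n - j              # data index k, kept separate from j
            if k >= 0:
                a = a + b[j] * x[k]   # one multiply-and-accumulate
        y.append(a)
    return y


if __name__ == "__main__":
    print(fir_filter([1, 2, 3], [1, 0, 0, 1]))
```

Each pass of the inner loop is one MAC; the rest of this section is about making that single line as cheap as possible in hardware.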

[Aside: The operation a + (b * c), using floating-point numbers, may be done with two roundings (once when b and c are multiplied and a second time when the product is added to a), or with just one rounding (where the entire expression a + (b * c) is evaluated in one step). The latter is called a fused multiply-add (FMA) or fused multiply-accumulate (FMAC) and is included in IEEE Std. 754.]

Loop overhead

A general purpose CPU would implement the above sum-of-products operation in a fixed-length loop such as

    For i = 0 to M {statements}

that involves considerable overhead apart from the statements. This overhead consists of:

Loop overhead (general purpose CPU)
1. Provide a CPU register to store the loop index
2. Initialize the index register
3. After each pass, increment the loop index and check it for termination
4. (If a CPU register is not available: provide a memory location for the index, retrieve it, increment it, check it and store it back in memory)
5. Except for the last pass, jump back to the top of the loop

DSP processors provide a zero-overhead hardware mechanism (a REPEAT or DO) that can repeat an instruction or set of instructions a prescribed number of times. Because the hardware supports this repetition structure, no clock cycles are wasted on branching or on incrementing and checking the loop index. The number of loop iterations is necessarily limited, since the count is held in a hardware register of fixed width. If loop nesting is allowed, not all loops may be zero-overhead.

Enhancing the CPU architecture

Inside the loop

How would a general purpose CPU carry out the computations inside the loop? Assume that {b} and {x} are stored as arrays in memory. Assume that the CPU has pointer registers j and k that can be directly updated and used to retrieve data from memory, two arithmetic registers b and x that can be used as operands of arithmetic operations, a double-length register p to receive the product, and an accumulator a for summing the products.
The instruction sequence for one pass through the loop on a general purpose CPU looks like this:

Inside the loop
     1. update j
     2. update k
     3. b ← b_j
     4. x ← x_k
     5. fetch (multiply) instruction
     6. decode (multiply) instruction
     7. execute (multiply) instruction (p ← b_j x_k)
     8. fetch (add) instruction
     9. decode (add) instruction
    10. execute (add) instruction (a ← a + p)

Assume that each of the lines above takes one unit of time, call it an instruction time or clock cycle (multiplication easily takes several units of time, but we assume it is the same as the rest). The sequence then takes 10 units of time to complete.

We could add a multiply-and-add (call it MAC) instruction to the instruction set of the CPU (that is, we augment the CPU with the appropriate hardware). This would merge the last 6 lines (lines 5 through 10) in the above segment into just 3 lines, and there would then be 7 lines taking 7 units of time, as shown below:

Inside the loop
    1. update j
    2. update k
    3. b ← b_j
    4. x ← x_k
    5. fetch MAC instruction
    6. decode MAC instruction
    7. execute MAC instruction (a ← a + b_j x_k)

A DSP can perform a MAC operation in a single unit of time. Many use this feature as the definition of a DSP. We describe below how this is accomplished.

Update pointers simultaneously

Since the two pointers are independent, we add two address updating units to the processor hardware. Because these two updates can be done in parallel we show them as one line in the sequence, which now takes 6 units of time:

Inside the loop
    1. update j AND update k
    2. b ← b_j
    3. x ← x_k
    4. fetch MAC instruction
    5. decode MAC instruction
    6. execute MAC (a ← a + b_j x_k)

Memory architecture

Load registers b and x simultaneously

Since b_j and x_k are completely independent, we can make provision to read them simultaneously from memory into the appropriate registers. In the standard CPU there is just one bus connection to the memory; connecting two buses to the same (one) memory does not help; and the so-called dual-port memories are expensive and slow. In a radical departure from the memory architecture of the standard CPU, the DSP can define multiple memory banks, each served by its own bus.
Now b_j and x_k can be loaded simultaneously from memory into the registers b and x, shown in the sequence below by listing the two operations on the same line. The sequence now takes 5 units of time:

Inside the loop
    1. update j AND update k
    2. b ← b_j AND x ← x_k
    3. fetch MAC instruction
    4. decode MAC instruction
    5. execute MAC (a ← a + b_j x_k)

We next turn to the last three lines (fetch, decode and execute) in the sequence.

Caches

Standard CPUs use instruction caches to speed up execution. Caching makes run time unpredictable, because it depends on the state of the caches when the operation starts. DSPs, however, are designed for real-time use, where predicting exact timing may be critical. Caches are therefore usually avoided in DSPs because they complicate the calculation of program execution time.

Harvard architecture

Now we consider fetching one instruction while previous ones are still being decoded or executed. There can be a clash when an instruction is fetched from memory at the same time that data related to a prior instruction is being transferred to or from memory. The solution is to use separate memory banks and separate buses. Previously we used different memory banks for different categories of data, but now we are talking about a memory bank for instructions versus a memory bank for data. The memory banks have independent address spaces and are called program memory and data memory, resulting in the Harvard architecture. The CPU can fetch the next instruction and simultaneously load or store a memory word. Standard computers use the same memory space for program and data; this is called the von Neumann architecture (also known as the Princeton architecture). Most DSPs abide by the Harvard architecture in order to be able to overlap instruction fetches with data transfers. The idea of overlapping brings us to pipelining.
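The step counts reached so far (10, 7, 6 and 5 units of time) can be tallied in a short sketch. This is an illustrative model only: each tuple stands for the operations that share one unit of time, and the dictionary keys and the helper `time_units` are names invented here.

```python
# Each list is one pass through the loop; each tuple is one unit of time,
# holding the operations that are carried out in parallel during that unit.
SEQUENCES = {
    "general purpose CPU": [
        ("update j",), ("update k",), ("b <- b_j",), ("x <- x_k",),
        ("fetch MUL",), ("decode MUL",), ("execute MUL",),
        ("fetch ADD",), ("decode ADD",), ("execute ADD",),
    ],
    "with MAC instruction": [
        ("update j",), ("update k",), ("b <- b_j",), ("x <- x_k",),
        ("fetch MAC",), ("decode MAC",), ("execute MAC",),
    ],
    "with parallel pointer updates": [
        ("update j", "update k"), ("b <- b_j",), ("x <- x_k",),
        ("fetch MAC",), ("decode MAC",), ("execute MAC",),
    ],
    "with separate memory banks": [
        ("update j", "update k"), ("b <- b_j", "x <- x_k"),
        ("fetch MAC",), ("decode MAC",), ("execute MAC",),
    ],
}


def time_units(sequence):
    """One unit of time per group of parallel operations."""
    return len(sequence)


if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(f"{name}: {time_units(seq)} units")
```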
[The availability of the modern cache system has substantially alleviated the problem of the von Neumann bottleneck. Most modern computers labeled Harvard architecture allow the contents of the instruction memory to be accessed as though they were data; these are called modified Harvard architecture and are used in niche applications like DSP (TI's TMS320, Analog Devices' Blackfin) and microcontrollers (Atmel AVR, ZiLOG's Z8Encore!).]

To sum up, so far our efforts to enhance the DSP processor's speed have introduced the following concepts:
1. Special instruction (MAC) added to the instruction set: CPU augmentation
2. Address registers updated in parallel: CPU augmentation
3. Data registers loaded from memory in parallel: memory banks
4. Instruction fetched in parallel with execution of previous instructions: Harvard architecture (separate program and data memories) and pipelining

Pipelining allows the parallel execution of any operations that logically can be performed in parallel. These operations need not be the 5 lines listed above, so let us generalize the 5 lines and call them A, B, C, D and E. Each one takes one clock cycle and together they make up one pass through the loop described above. In a given clock cycle the pipeline contains 5 different passes, each one being in just one of the 5 states A through E. Thus any one pass would consist of these 5 operations and take 5 clock cycles to complete. The number of overlappable operations of which one pass is composed is called the depth of the pipeline. Here we have a depth-5 pipeline. Typical depths are 4 or 5. Some DSP processors have pipeline depths as high as 11.

The operation of a depth-5 pipeline is shown below. Time (clock cycles) runs from left to right. The height corresponds to distinct hardware units (stages). There are a total of 6 product terms being added to the accumulator, each term taking 5 clock cycles. In {A1 through E1} the first product term is added to the accumulator, and this is completed in clock cycle #5. The addition of the second product term, {A2 through E2}, is completed in clock cycle #6, etc. The complete sum is available in 10 cycles. Without the pipeline the summation would take 6 * 5 = 30 cycles.

                     Clock cycles
    Depth    1   2   3   4   5   6   7   8   9   10
      1      A1  A2  A3  A4  A5  A6
      2          B1  B2  B3  B4  B5  B6
      3              C1  C2  C3  C4  C5  C6
      4                  D1  D2  D3  D4  D5  D6
      5                      E1  E2  E3  E4  E5  E6

There is pipeline overhead: at the left there are 4 clock cycles during which the pipeline is filling, while at the right there are a further 4 cycles during which the pipeline is emptying. (In this specific example the pipeline is full for 2 cycles.) For large enough loops the overhead is negligible; thus the pipeline allows the DSP processor to perform one multiply-and-add per clock cycle on average. In general, asymptotically, the processor takes only a single clock cycle per instruction.

Note that in this treatment we are dividing one pass through the loop into five overlappable parts labeled A through E. In other contexts an instruction cycle is divided into perhaps five overlappable parts. Note also that in the above diagram time runs from left to right, while depth corresponds to distinct hardware units (in this case five). One pass through the loop runs diagonally down from left to right (for example A1, B1, C1, D1, E1).

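The pipeline timing can be checked with a small model. The sketch below assumes an ideal pipeline with one cycle per stage and no stalls; the function name `pipeline_schedule` and the stage labelling are illustrative choices, not taken from any particular processor.

```python
def pipeline_schedule(num_passes, depth):
    """Map each clock cycle to the stage labels active in it.

    Assumes an ideal pipeline: every stage takes one cycle and a new
    pass enters stage A on every cycle until all passes have started.
    """
    stages = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:depth]
    total_cycles = num_passes + depth - 1
    schedule = {}
    for cycle in range(1, total_cycles + 1):
        active = []
        for s, name in enumerate(stages):
            p = cycle - s              # pass occupying stage s in this cycle
            if 1 <= p <= num_passes:
                active.append(f"{name}{p}")
        schedule[cycle] = active
    return schedule


if __name__ == "__main__":
    sched = pipeline_schedule(num_passes=6, depth=5)
    for cycle, active in sched.items():
        print(cycle, " ".join(active))
    full = sum(1 for active in sched.values() if len(active) == 5)
    print("total cycles:", len(sched), "cycles with pipeline full:", full)
```

For N passes through a depth-d pipeline the model gives N + d - 1 cycles, against N * d cycles without overlap, so the cost per pass approaches one cycle as N grows.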
8.4 Interrupts - handling incoming signal values

A context switch occurs when a processor stops what it has been doing and starts doing something else, resulting in a need to save or change the contents of registers, pointers, counters, flags etc. A hardware mechanism called an interrupt forces a context switch to a predefined routine called the interrupt handler for the event in question.

One major difference between a DSP processor and other types of CPU is the speed of the context switch. A general purpose CPU may have a latency (the time from when an interrupt is requested to the point where the interrupt handler begins to execute) of several tens of clock cycles to perform a context switch, while DSPs have the ability to do a low-latency (perhaps even zero-overhead) interrupt.

The most important reason for a fast context switch is the need to capture incoming signal values (which are interrupt-based). These signal values are either processed immediately or stored in a buffer for later processing. The DSP fast interrupt is usually accomplished by saving only a small portion of the context and by having hardware assistance for this procedure.

Signal values are input to and output from the DSP through ports. Serial ports are typically used for low-rate signals; bits are moved in or out one bit per clock cycle through an internal shift register. Parallel ports (typically 8 or 16 bits) are faster but require more pins on the DSP chip itself. Further speed-up of data transfer is made possible through DMA (Direct Memory Access) channels.

8.5 Fixed- and Floating-point

DSP tasks involve intensive number crunching. By integer data we shall mean that the decimal point is at the extreme right end and may therefore be ignored. By real numbers we mean that the decimal point may be somewhere other than at the extreme right but is always fixed; this allows the data to contain a fractional part. We shall use fixed-point to cover both integers and real numbers. Such data have a limited range. By floating-point data we shall mean that the position of the decimal point is not fixed (it floats) and is adjusted to suit the other data it must interact with, the operations it goes through, and the results.

Early DSP processors offered only an integer mode of data and arithmetic. Even today such fixed-point DSPs flourish (or are forced on DSP developers) due to the realities of cost, speed, size, and power consumption. The DSP community has developed intricate numeric methods to use fixed-point devices. Today there are floating-point DSPs, but these still tend to be much more expensive, more power hungry and physically larger than their fixed-point counterparts.

Fixed-point
- Lower cost
- Smaller size
- Lower power consumption
- Suited to embedding a DSP into a small package, or where power is limited, or where price is critical
- Good match for A/D and D/A (typically unsigned or 2's-complement integer devices)

Floating-point
- More expensive
- Larger size
- Higher power consumption

Fixed-point DSPs

What is the consequence of having to prefer fixed-point DSPs over floating-point DSPs (in some applications) for the reasons mentioned above? The price is extended development time. After the required algorithms have been simulated on computers with floating-point capabilities, floating-point operations must then be carefully converted to integer ones. This involves
1. Rounding
2. Rescaling (due to limited dynamic range) at various points
3. Underflow and overflow handling
4. Placement of rescalings at optimal points (to ensure maximum signal-to-quantization-noise ratio)
5. Matching of the precise details of the processor's arithmetic with other systems for interoperability or with a standard for conformity

These tasks may require extensive simulation.

The most common fixed-point representation is 16-bit two's complement (24- or 32-bit registers also exist). In fixed-point DSPs this must accommodate both integers and real numbers. A real number is represented by multiplying it by a large number and rounding; consider the coefficient a of the IIR filter y(n) − a y(n−1) = x(n), 0 ≤ a ≤ 1. On an 8-bit fixed-point processor 1 becomes 256, and the range for a is 0 ≤ a ≤ 256.

With 16-bit operands, bit growth (or increase of precision) means that their sum can require 17 bits and their product 32 bits.

Addition and multiplication
- Regular CPUs: the user must check (an overflow flag is set or an exception is triggered) and decide what to do with the product.
- Fixed-point DSPs: the user must explicitly handle the increase in precision: (1) use an accumulator longer than the largest product, or (2) scaling as part of the MAC instruction, built into the pipeline, or (3) saturation arithmetic.
- Floating-point DSPs: the hardware automatically discards the least significant bits.

With a DSP using fixed-point arithmetic, if the product is 32 bits the accumulator could be 40 bits long; the 8 extra guard bits allow up to 2^8 = 256 MACs to be performed without any fear of overflow. At the end of the loop a single check and possible discard can be done. A second possibility is to provide an optional scaling operation as part of the MAC instruction itself (basically a right-shift of the product before the addition, built into the pipeline). Thirdly, saturation arithmetic is the last resort: whenever an overflow (or underflow) occurs, the result is replaced by the largest (or smallest) representable value of the appropriate sign. The error introduced is smaller than that caused by straight overflow (i.e., roll-over of the register).

With fixed-point processors, the filter coefficients should not simply be rounded; rather, the best integer coefficients should be determined using an optimization procedure.

Floating-point DSPs avoid many of the above problems. There is an IEEE floating-point standard (IEEE Std. 754). Floating-point DSPs do not usually have instructions for division, powers, square root, trig functions etc.
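The fixed-point ideas above (scaling a real coefficient to an integer, roll-over versus saturation, and accumulator guard bits) can be tried out numerically. This is a minimal sketch in Python, whose integers are unbounded, so word lengths are imposed explicitly; the helper names `to_fixed`, `wrap` and `saturate` are invented for this illustration and do not correspond to any DSP instruction.

```python
def to_fixed(value, frac_bits):
    """Represent a real number as an integer: scale by 2**frac_bits and round."""
    return int(round(value * (1 << frac_bits)))


def wrap(value, bits):
    """Two's-complement roll-over into a signed word of the given width."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def saturate(value, bits):
    """Clamp to the largest or smallest value a signed word can hold."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))


if __name__ == "__main__":
    # A coefficient of 0.5 with 8 fractional bits (so that 1 becomes 256).
    print(to_fixed(0.5, 8))

    # Adding two 16-bit values whose true sum needs 17 bits.
    true_sum = 30000 + 10000
    print(wrap(true_sum, 16), saturate(true_sum, 16))

    # Worst-case product of two 16-bit operands, accumulated 256 times,
    # checked against the range of a 40-bit signed accumulator.
    worst_product = (-32768) * (-32768)
    print(256 * worst_product <= (1 << 39) - 1)
```

In the addition example the rolled-over result is off by 65536 while the saturated result is off by 7233, which is the sense in which saturation introduces the smaller error.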


More information

UNIT-II. Part-2: CENTRAL PROCESSING UNIT

UNIT-II. Part-2: CENTRAL PROCESSING UNIT Page1 UNIT-II Part-2: CENTRAL PROCESSING UNIT Stack Organization Instruction Formats Addressing Modes Data Transfer And Manipulation Program Control Reduced Instruction Set Computer (RISC) Introduction:

More information

EE 4980 Modern Electronic Systems. Processor Advanced

EE 4980 Modern Electronic Systems. Processor Advanced EE 4980 Modern Electronic Systems Processor Advanced Architecture General Purpose Processor User Programmable Intended to run end user selected programs Application Independent PowerPoint, Chrome, Twitter,

More information

An introduction to Digital Signal Processors (DSP) Using the C55xx family

An introduction to Digital Signal Processors (DSP) Using the C55xx family An introduction to Digital Signal Processors (DSP) Using the C55xx family Group status (~2 minutes each) 5 groups stand up What processor(s) you are using Wireless? If so, what technologies/chips are you

More information

REAL TIME DIGITAL SIGNAL PROCESSING

REAL TIME DIGITAL SIGNAL PROCESSING REAL TIME DIGITAL SIGNAL PROCESSING UTN-FRBA 2010 Introduction Why Digital? A brief comparison with analog. Advantages Flexibility. Easily modifiable and upgradeable. Reproducibility. Don t depend on components

More information
