CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.


Part 5: Processors
Our goals:
- understand the basics of processors and the CPU
- understand the architecture of MARIE, a model computer
- take a close look at the instruction set architecture
- know how to do assembly programming with the MARIE architecture

Part 5: Processors
Overview:
- Introduction to processors and the CPU
- Introduction to the architecture of MARIE
- A close look at the instruction set architecture
- Assembly language and programming paradigm

A close look at the instruction set
- Instruction addressing mode
- Instruction-level pipeline

Instruction addressing mode

Instruction addressing mode
Addressing modes specify where an operand is located.
Purpose of different addressing modes: to be able to reference as many locations of memory as possible.
- Immediate
- Direct
- Indirect
- Register
- Register indirect
- Indexed
- Stack

Instruction addressing mode
All computer architectures provide more than one addressing mode.
The CPU determines which addressing mode is used in a particular instruction in one of two ways:
(1) different opcodes use different addressing modes, or
(2) one or more bits in the instruction format are used as a mode field.
The effective address (EA) produced by an addressing mode is usually a main memory address or a register address.
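As a rough sketch of option (2), the Python fragment below decodes a hypothetical 16-bit instruction word with a 4-bit opcode, a 2-bit mode field, and a 10-bit address field; the field widths and mode encodings are made up for illustration and do not describe any particular machine.

```python
# Minimal sketch: decoding a hypothetical 16-bit instruction word.
# Assumed layout: [15..12] opcode | [11..10] mode | [9..0] address
MODES = {0b00: "immediate", 0b01: "direct", 0b10: "indirect", 0b11: "register"}

def decode(word: int):
    opcode = (word >> 12) & 0xF     # 4-bit opcode
    mode   = (word >> 10) & 0x3     # 2-bit mode field selects the addressing mode
    addr   = word & 0x3FF           # 10-bit address / operand field
    return opcode, MODES[mode], addr

# Example: opcode 0001, mode 01 (direct), address field 42
print(decode(0b0001_01_0000101010))   # -> (1, 'direct', 42)
```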

Instruction addressing mode: immediate addressing
- Operand is part of the instruction
- Operand = address field
- e.g. ADD 5: add 5 to the contents of the accumulator; 5 is the operand
- No memory reference to fetch data
- Fast

Instruction addressing mode: direct addressing
- Address field contains the address of the operand
- Effective address (EA) = address field (A)
- e.g. ADD A: add the contents of memory cell A to the accumulator; look in memory at address A for the operand
- Single memory reference to access data
- No additional calculations to work out the effective address
- Limited address space

[Diagram: direct addressing. The instruction holds an opcode and address A; A points directly to the operand in memory.]

Instruction addressing mode: indirect addressing
- Memory cell pointed to by address field A contains the address of (a pointer to) the operand
- EA = (A)
- Look in A, find address (A), and look there for the operand
- e.g. ADD (A): add the contents of the cell pointed to by the contents of A to the accumulator

Instruction addressing mode: indirect addressing (continued)
- Large address space
- May be nested, multilevel, or cascaded, e.g. EA = (((A)))
- Multiple memory accesses to find the operand, hence slower
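As a small worked example (with made-up memory contents): suppose the address field A = 100, memory word 100 contains 200, and memory word 200 contains 7. With one level of indirection, EA = (100) = 200 and the operand fetched is 7; with two levels, EA = ((100)) = 7, so the operand would be fetched from word 7. Each extra level of indirection costs one more memory access.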

[Diagram: indirect addressing. The instruction holds an opcode and address A; memory cell A holds a pointer to the operand, which is fetched with a second memory access.]

Instruction addressing mode: register addressing
- Operand is held in the register named in the address field
- EA = R
- Limited number of registers
- Very small address field needed
- Shorter instructions
- Faster instruction fetch

Instruction addressing mode: register addressing (continued)
- No memory access
- Very fast execution
- Very limited address space
- Using multiple registers helps performance
- Requires good assembly programming or compiler writing
- Similar to direct addressing, but the address field names a register

[Diagram: register addressing. The instruction holds an opcode and register address R; the operand is in register R.]

Instruction addressing mode: register indirect addressing
- Similar to indirect addressing
- EA = (R)
- Operand is in the memory cell pointed to by the contents of register R
- One fewer memory access than indirect addressing

[Diagram: register indirect addressing. The instruction holds an opcode and register address R; register R holds a pointer to the operand in memory.]

Instruction addressing mode: indexed addressing
- Indexed addressing uses a register (implicitly or explicitly) as an offset, which is added to the address in the operand to determine the effective address of the data
- EA = A + (R)
- The instruction holds two values:
  A = base value
  R = register that holds the offset
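For example (with made-up values): if the address field A = 1000 is the base address of an array and register R contains 3, the effective address is EA = 1000 + 3 = 1003, i.e. the fourth element of the array. Stepping R from 0 upward walks through consecutive elements without changing the instruction itself.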

[Diagram: indexed addressing. The instruction holds an opcode, register R, and address A; the contents of R are added to A to form a pointer to the operand in memory.]

Instruction addressing mode: stack addressing
- Operand is (implicitly) on top of the stack
- e.g. ADD: pop the top two items from the stack and add them

Comparison of different addressing modes

Mode                Algorithm            Principal advantage     Principal disadvantage
Immediate           Operand = A          No memory reference     Limited operand magnitude
Direct              EA = A               Simple                  Limited address space
Indirect            EA = (A)             Large address space     Multiple memory references
Register            EA = R               No memory reference     Limited address space
Register indirect   EA = (R)             Large address space     Extra memory reference
Indexed             EA = A + (R)         Flexibility             Complexity
Stack               EA = top of stack    No memory reference     Limited applicability

A:  contents of an address field in the instruction
R:  contents of an address field that refers to a register
EA: actual (effective) address of the operand
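To make the algorithms in the table concrete, here is a minimal Python sketch that resolves an operand under each mode; the memory contents, register file, and function name are made-up values for illustration only.

```python
# Minimal sketch: resolving an operand under each addressing mode.
# Memory, registers, and stack below are made-up values for illustration.
memory    = {20: 40, 30: 50, 40: 60, 50: 70}   # word address -> contents
registers = {1: 20}                             # register number -> contents
stack     = [99]                                # top of stack is the last element

def operand(mode: str, a: int = 0, r: int = 0) -> int:
    if mode == "immediate":          # operand = A
        return a
    if mode == "direct":             # EA = A
        return memory[a]
    if mode == "indirect":           # EA = (A)
        return memory[memory[a]]
    if mode == "register":           # EA = R
        return registers[r]
    if mode == "register indirect":  # EA = (R)
        return memory[registers[r]]
    if mode == "indexed":            # EA = A + (R)
        return memory[a + registers[r]]
    if mode == "stack":              # EA = top of stack
        return stack[-1]
    raise ValueError(mode)

print(operand("immediate", a=20))      # 20
print(operand("direct", a=20))         # 40
print(operand("indirect", a=20))       # 60  (word 20 -> 40, word 40 -> 60)
print(operand("indexed", a=20, r=1))   # 60  (EA = 20 + (R1) = 40)
```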

Addressing mode: an example
Given the following memory values:
- Word 20 contains 40
- Word 30 contains 50
- Word 40 contains 60
- Word 50 contains 70
What values do the following instructions load into the accumulator?
- Load immediate 20
- Load direct 20
- Load indirect 20
- Load immediate 30
- Load direct 30
- Load indirect 30
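Working through these with the definitions above: Load immediate 20 loads 20 itself; Load direct 20 loads the contents of word 20, i.e. 40; Load indirect 20 uses word 20 as a pointer (it contains 40) and loads the contents of word 40, i.e. 60. Likewise, Load immediate 30 loads 30, Load direct 30 loads 50, and Load indirect 30 follows the pointer in word 30 to word 50 and loads 70.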

Addressing mode: an example
For the instruction shown, what value is loaded into the accumulator for each addressing mode?

Instruction-level pipeline

Instruction-level pipeline
Some CPUs divide the fetch-decode-execute cycle into smaller steps. These smaller steps can often be executed in parallel to increase performance. Such parallel execution is called instruction-level pipelining. This term is sometimes abbreviated ILP in the literature.

Instruction-level pipeline
Suppose a fetch-decode-execute cycle were broken into the following smaller steps:
1. Fetch instruction.
2. Decode opcode.
3. Calculate effective address of operands.
4. Fetch operands.
5. Execute instruction.
6. Store result.
Suppose we have a six-stage pipeline: S1 fetches the instruction, S2 decodes it, S3 determines the address of the operands, S4 fetches them, S5 executes the instruction, and S6 stores the result.

Instruction-level pipeline
For every clock cycle, one small step is carried out, and the stages are overlapped:
S1. Fetch instruction.
S2. Decode opcode.
S3. Calculate effective address of operands.
S4. Fetch operands.
S5. Execute.
S6. Store result.
Effect: an N-stage pipeline can operate on N instructions simultaneously.
Each stage takes one clock cycle, so once the pipeline is full, one instruction completes every clock cycle.
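As an illustration of the overlap (not part of the original slides), three instructions I1-I3 flow through the six stages like this; once the pipeline is full, a new instruction finishes in every cycle:

Cycle:  1    2    3    4    5    6    7    8
I1:     S1   S2   S3   S4   S5   S6
I2:          S1   S2   S3   S4   S5   S6
I3:               S1   S2   S3   S4   S5   S6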

Instruction-level pipeline
The theoretical speedup offered by a pipeline can be determined as follows.
Let t_p be the time per stage. Each instruction represents a task, T, in the pipeline.
The first task (instruction) requires k × t_p time to complete in a k-stage pipeline.
The remaining (n − 1) tasks emerge from the pipeline one per cycle, so the total time to complete them is (n − 1) × t_p.
Thus, to complete n tasks using a k-stage pipeline requires:
(k × t_p) + (n − 1) × t_p = (k + n − 1) × t_p

Instruction-level pipeline
Without a pipeline, each task takes k × t_p, so n tasks take n × k × t_p. If we take the time required to complete n tasks without a pipeline and divide it by the time it takes to complete n tasks using a pipeline, we find:
Speedup S = (n × k × t_p) / ((k + n − 1) × t_p) = (n × k) / (k + n − 1)
If we take the limit as n approaches infinity, (k + n − 1) approaches n, which results in a theoretical speedup of:
S = k, the number of pipeline stages.
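A quick numerical check of these formulas, as a small Python sketch (the stage count, stage time, and task counts are arbitrary example values):

```python
# Pipeline timing: total time and speedup for a k-stage pipeline.
def pipelined_time(n: int, k: int, t_p: float) -> float:
    """Time to complete n tasks on a k-stage pipeline: (k + n - 1) * t_p."""
    return (k + n - 1) * t_p

def speedup(n: int, k: int) -> float:
    """Speedup over the non-pipelined case: n*k*t_p / ((k + n - 1)*t_p)."""
    return (n * k) / (k + n - 1)

k, t_p = 6, 1.0                      # six stages, one time unit per stage
for n in (10, 100, 10_000):
    print(n, pipelined_time(n, k, t_p), round(speedup(n, k), 3))
# As n grows, the speedup approaches k = 6.
```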

Why understand the pipeline?
The pipeline is transparent to the assembly language programmer.
Disadvantage: a programmer who does not understand the pipeline can produce inefficient code.
Why? The hardware automatically stalls the pipeline if operands are not yet available, i.e. if the next instruction depends on the result of the previous instruction.

Example of instruction stalls
Assume we need to perform addition and subtraction operations, with operands and results in registers A through E.
The second instruction stalls to wait for operand C: instruction K+1 needs the result of instruction K before it can continue, which forces instruction K+1 to wait until instruction K completes.
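The slide's code listing is not reproduced in this transcription. The sketch below is an illustrative stand-in: it assumes a sequence such as C = A + B followed by D = C − E, and a deliberately simplified timing model in which an instruction cannot issue until every register it reads is available.

```python
# Simplified model of data-dependency stalls in an instruction pipeline.
# Each instruction is (destination register, source registers).
# Assumption: instructions issue one per cycle, but must wait until every
# register they read has been produced; results become available a few
# cycles after issue (result_latency), standing in for the later stages.

def count_stalls(program, result_latency=3):
    finish = {}                      # register -> cycle its value is ready
    cycle, stalls = 0, 0
    for dest, sources in program:
        ready = max((finish.get(s, 0) for s in sources), default=0)
        if ready > cycle:            # operand not yet available: stall
            stalls += ready - cycle
            cycle = ready
        cycle += 1                   # issue the instruction
        finish[dest] = cycle + result_latency
    return stalls

# Hypothetical dependent sequence: C = A + B, then D = C - E
dependent = [("C", ("A", "B")), ("D", ("C", "E"))]
print(count_stalls(dependent))       # 3: the second instruction waits for C
```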

How to achieve maximum speed
The program must be written to accommodate the instruction pipeline. To minimize stalls:
- Avoid introducing unnecessary branches and subroutine calls
- Avoid invoking a co-processor, e.g. calling an instruction that takes a long time such as floating-point arithmetic
- Avoid external storage

Example of avoiding stalls
Stalls are eliminated by rearranging the instructions in listing (a) into listing (b), as sketched below.
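The listings (a) and (b) are likewise not reproduced in this transcription. As an illustration, reusing the hypothetical count_stalls sketch above, placing independent instructions between the producer of C and its consumer removes the stall:

```python
# Reordered (hypothetical) sequence: independent work is inserted between
# producing C and consuming it, so the consumer finds C ready.
reordered = [("C", ("A", "B")),      # C = A + B
             ("F", ("A", "E")),      # independent: F = A - E
             ("G", ("B", "E")),      # independent: G = B + E
             ("H", ("A", "B")),      # independent: H = A - B
             ("D", ("C", "E"))]      # D = C - E now finds C ready
print(count_stalls(reordered))       # 0: no cycles lost to waiting
```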

Thank you for your attendance. Any questions?