CSCE 5610: Computer Architecture

Similar documents
Chapter 2A Instructions: Language of the Computer

Lecture 4: Instruction Set Architecture

Instruction Set Architecture. "Speaking with the computer"

CSIS1120A. 10. Instruction Set & Addressing Mode. CSIS1120A 10. Instruction Set & Addressing Mode 1

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

Computer Architecture. MIPS Instruction Set Architecture

Instruction Set Architecture

Instruction Set Principles and Examples. Appendix B

CS4617 Computer Architecture

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

CSE Lecture In Class Example Handout

Computer Architecture. The Language of the Machine

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

ISA and RISCV. CASS 2018 Lavanya Ramapantulu

55:132/22C:160, HPCA Spring 2011

CS/COE1541: Introduction to Computer Architecture

EE 361 University of Hawaii Fall

COSC 6385 Computer Architecture. Instruction Set Architectures

CSCI 402: Computer Architectures. Instructions: Language of the Computer (3) Fengguang Song Department of Computer & Information Science IUPUI.

Course Administration

Chapter 2. Instructions: Language of the Computer. Adapted by Paulo Lopes

Announcements HW1 is due on this Friday (Sept 12th) Appendix A is very helpful to HW1. Check out system calls

Computer Architecture

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

Chapter 1. Computer Abstractions and Technology. Lesson 3: Understanding Performance

Computer Organization MIPS ISA

ELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

ECE 486/586. Computer Architecture. Lecture # 7

CS3350B Computer Architecture MIPS Instruction Representation

PHBREAK: RISC ISP architecture the MIPS ISP. you read: text Chapter 3 CS 350

Lectures 3-4: MIPS instructions

Instruction Set Design

ECE 486/586. Computer Architecture. Lecture # 6

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

CPU Architecture and Instruction Sets Chapter 1

CENG3420 Lecture 03 Review

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

Instruction Set Architectures

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

ECE 486/586. Computer Architecture. Lecture # 8

Lecture 3 Machine Language. Instructions: Instruction Execution cycle. Speaking computer before voice recognition interfaces

COMPUTER ORGANIZATION AND DESIGN

ECE 30 Introduction to Computer Engineering

Instruction Set Architecture (ISA)

Lecture Topics. Branch Condition Options. Branch Conditions ECE 486/586. Computer Architecture. Lecture # 8. Instruction Set Principles.

Subroutines. int main() { int i, j; i = 5; j = celtokel(i); i = j; return 0;}

Instructions: Language of the Computer

Procedure Calling. Procedure Calling. Register Usage. 25 September CSE2021 Computer Organization

Chapter 2. Instruction Set Principles and Examples. In-Cheol Park Dept. of EE, KAIST

Math 230 Assembly Programming (AKA Computer Organization) Spring MIPS Intro

CHAPTER 5 A Closer Look at Instruction Set Architectures

Instruction Sets Ch 9-10

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

Instruction Sets Ch 9-10

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Reminder: tutorials start next week!

Control Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary

Control Instructions

EEC 581 Computer Architecture Lecture 1 Review MIPS

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Today s topics. MIPS operations and operands. MIPS arithmetic. CS/COE1541: Introduction to Computer Architecture. A Review of MIPS ISA.

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

ECE468 Computer Organization & Architecture. MIPS Instruction Set Architecture

CS3350B Computer Architecture

CSE 141 Computer Architecture Spring Lecture 3 Instruction Set Architecute. Course Schedule. Announcements

Reduced Instruction Set Computer (RISC)

Instruction Set. Instruction Sets Ch Instruction Representation. Machine Instruction. Instruction Set Design (5) Operation types

Assembly labs start this week. Don t forget to submit your code at the end of your lab section. Download MARS4_5.jar to your lab PC or laptop.

MIPS%Assembly% E155%

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

EN164: Design of Computing Systems Topic 03: Instruction Set Architecture Design

CHAPTER 2: INSTRUCTION SET PRINCIPLES. Prepared by Mdm Rohaya binti Abu Hassan

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

CS/ECE/752 Chapter 2 Instruction Sets Instructor: Prof. Wood

CS3350B Computer Architecture MIPS Introduction

Computer Architecture

Computer Organization

Computer Architecture. Chapter 2-2. Instructions: Language of the Computer

Instruction-set Design Issues: what is the ML instruction format(s) ML instruction Opcode Dest. Operand Source Operand 1...

Instruction Set Architectures Part I: From C to MIPS. Readings:

Forecast. Instructions (354 Review) Basics. Basics. Instruction set architecture (ISA) is its vocabulary. Instructions are the words of a computer

EC-801 Advanced Computer Architecture

CSE Lecture In Class Example Handout

Chapter 2. Instructions: Language of the Computer. HW#1: 1.3 all, 1.4 all, 1.6.1, , , , , and Due date: one week.

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

The Single Cycle Processor

All instructions have 3 operands Operand order is fixed (destination first)

Review of instruction set architectures

The MIPS Instruction Set Architecture

2.7 Supporting Procedures in hardware. Why procedures or functions? Procedure calls

Instructions: MIPS arithmetic. MIPS arithmetic. Chapter 3 : MIPS Downloaded from:

Reduced Instruction Set Computer (RISC)

CHAPTER 5 A Closer Look at Instruction Set Architectures

Computer Organization MIPS Architecture. Department of Computer Science Missouri University of Science & Technology

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

MIPS Instruction Set

Transcription:

HW #1 1.3, 1.5, 1.9, 1.12 Due: Sept 12, 2018 Review: Execution time of a program Arithmetic Average, Weighted Arithmetic Average Geometric Mean Benchmarks, kernels and synthetic benchmarks Computing CPI for branch instructions CPI with cache misses Amdahl s law MTTF and MTTR CSCE 5610 Sept. 06, 2018 1 Instruction Set Architecture mostly from Appendix A How should we design an instruction set? Difference between Assembly language and Machine language (Instruction set) Instruction Format How many bits per instruction How many bits for opcode What and how many operands to include Book classifies the ISA into 4 types (see page A-4) Stack Accumulator Register-Memory Load/store (register-register) CSCE 5610 Sept. 06, 2018 2 1

Let us see how these ISA s differ with a simple example Convert C = A+B into these instruction sets Stack Accumulator Register-Memory Load/Store (Reg-Reg) PUSH A LOAD A LOAD R1, A LOAD R1, A PUSH B ADD B ADD R3, R1, B LOAD R2, B ADD STORE C STORE C, R3 ADD R3, R1, R2 POP C STORE C, R3 Which is better? We can try to compare code sizes Let us look at problem A.9 from Text on page A-49 CSCE 5610 Sept. 06, 2018 3 Stack Accumulator Register-Memory Load/Store (Reg-Reg) PUSH A (8+64) LOAD A (8+64) LOAD R1, A (8+6+64) LOAD R1, A (8+6+64) PUSH B (8+64) ADD B (8+64) ADD R3, R1, B (8+6+6+64) LOAD R2, B (8+6+64) ADD (8) STORE C (8+64) STORE C, R3 (8+64+6) ADD R3, R1, R2 (8+6+6+6) POP C (8+64) STORE C, R3 (8+64+6) 224 bits 216 bits 240 bits 260 bits But implementing Stack instructions is very difficult using pipelines Accumulator may require multiple loads Consider C = A+B+A*D Load A Add B Store Temp Load A Mult D Add Temp Store C Load R1, A Load R2, B Add R3, R1, R2 Load R4, D Mult R5, R1, R4 Add R6, R3, R5 Store C, R6 CSCE 5610 Sept. 06, 2018 4 2

See page A6 for a discussion of Pros and cons on these different ISAs CSCE 5610 Sept. 06, 2018 5 Note in most systems, we specify an address for each byte. So a long integer (8 bytes) takes up 8 addresses, integer (32 bytes) uses 4 addresses Another issue to consider is the difference between Big-endian and Little-endian 61 bits * * * Byte 0 ---------- Byte 7 Byte 7 ---------- Byte 0 Address modes How do we locate data in memory? So far we assumed that we needed 64 bits to specify a memory address. Can we specify all the 64 bits in the instruction? We can reduce the number of bits needed in an instruction with different address modes But before talking about address modes Let us first talk about how data is mapped to memory Consider the following structure CSCE 5610 Sept. 06, 2018 6 3

struct foo { char a; int b; bool c; double d; short e; float f; double g; char * cptr; float *fptr; int x; }; How is this represented in memory if we are using 32 bit words? Note we need to align certain items on appropriate byte boundaries 0 1 2 3 word 0 a * * * word 1 b b b b word 2 c * * * word 3 d d d d word 4 d d d d word 5 e e * * word 6 f f f f word 7 g g g g word 8 g g g g word 9 cptr cpt cptr cptr word 10 fptr fptr fptr fptr word 11 x x x x How does it look like if we are using 64 bit words? We can rearrange the elements of the struct to save some bytes CSCE 5610 Sept. 06, 2018 7 Address modes (page A-9) how do we specify memory address in a instruction Register, immediate -- not really address, data is in a register ADD R i, R j, R k Register indirect address is in a register Load R i, (R j) Immediate data is in the instruction itself ADDI R i, Rj, #4 Displacement, Indexed address is the sum of the displacement and the value in a registers: LOAD R i, 100(R j) Absolute (or memory direct) address specified in the instruction LOAD R i, 10000 Memory Indirect address in the instruction points to memory location that actually contains the address: LOAD R i, (10000) Auto increment and decrement How to specify which mode is being used? CSCE 5610 Sept. 06, 2018 8 4

CSCE 5610 Sept. 06, 2018 9 Separate field for mode OPCODE Mode Operand-1 Mode Operand-2 Not good to have too many modes compiler analysis becomes more complex need more bits some modes make no sense with some instructions OR the address mode can be implied by the operation ADD-Immediate R i, R j, #value Load-Auto-inc R i, disp (R j) We allow addresses with load/store (and branch instructions) And we limit the address mode to Register + displacement only Load R i, disp (R j ) Store R i, disp (R j) CSCE 5610 Sept. 06, 2018 10 5

9/6/18 This shows how frequently different address modes have been used for different programs compiled. Figure A.7 Summary of use of memory addressing modes (including immediates). These major addressing modes account for all but a few percent (0% 3%) of the memory accesses. Register modes, which are not counted, account for one-half of the operand references, while memory addressing modes (including immediate) account for the other half. Of Note that the last 3 modes are more commonly used course, the compiler affects what addressing modes are used; see Section A.8. The memory indirect mode on the VAX can use displacement, autoincrement, or autodecrement to form the initial memory address; in these programs, almost all the memory indirect references use displacement mode as the base. Displacement mode includes all displacement lengths (8, 16, and 32 bits). The PC-relative addressing modes, used almost exclusively for branches, are not included. Only the addressing modes with an average frequency of over 1% are CSCE 5610 Sept. 06,shown. 2018 11 8 2019 Elsevier Inc. All rights reserved. -- note we can make displacement = 0 (will be register indirect mode) or use R0 so the register value is always zero (will be absolute address) Load Ri, 0 (Rj) Store Ri, disp (R0) How large should the displacement field be? Need displacement for immediate values, address displacement and branches See the figure on page A-11 It shows that you rarely need more than 16 bits of displacement for addresses CSCE 5610 Sept. 06, 2018 12 6

9/6/18 For immediate values ADD R1, R2, #4 Other types of operations that need addresses Branch instructions PC relative mode Compare and Branch BEQ Ri, Rj, Target-Address Jump J Target-Address JAL function-address : Similar to Load Ri, disp (Rj) : New format no registers : J disp or JR (Ri) How many bits for displacement in Jump? CSCE 5610 Sept. 06, 2018 13 See the figure on page A-18 This shows how far do you normally jump (relative to current position) In most cases we try to fit an instruction in 32 or 64 bits CSCE 5610 Sept. 06, 2018 14 7

Other decisions to make How many registers? 0, 1 or infinite 32 or 64 if we use 64, we need 6 bits to specify registers as operands Separate Integer and Floating point registers 64 of each kind? implied by operations (Integer-Add means use integer registers) Special purpose registers (PC, Status register, Stack-pointer (?)) What operations should we support? Arithmetic integer and floating point Add, Subtract, Multiply and Divide Logic (bit, byte or word) AND, OR, NOT, Compare: EQ, NE, GE, LT,. Conditional and unconditional branches Delayed branch? CSCE 5610 Sept. 06, 2018 15 Delayed Branches Even if the condition is satisfied, execute the next instruction before branching. BEQ R1, R2, addr1 Add R3, R4, R5.. addr1: Mult R3, R4, R5 Consider what happens if R1 equals R2 Normally, if R1=R2, the Add is not executed. But, in delayed branches, add will be executed whether the branch is taken or not. What is the advantage? The main advantage is in pipelines -- by the time branch is decoded, the next instruction is already fetched. In delayed branches, we will not discard the next instruction even if branch is taken CSCE 5610 Sept. 06, 2018 16 8

Switch (i) { case 0 :. case 1 :.. } 6-bit opcode Source Index Reg 16-bit address Switch R s R I Displacement What about cases like Switch (i) { case 0 :. case 5 :. case 13. } Rs contains i; (R I +displacement) contains the starting address of a table of addresses representing the branch addresses for each case clause. CSCE 5610 Sept. 06, 2018 17 Function calls/ Subroutine calls. what are the necessary actions that must take place? At call At return Save PC (return address) Pass parameters, Save registers By caller or by callee? Use the saved return address to return Restore registers -- again by caller or callee? How about an instruction like store multiple registers and load multiple to help with save/restore registers? What are the choices in implementation of function call? CSCE 5610 Sept. 06, 2018 18 9

MIPS style: JAL jump and link PC is saved in R31 ($ra or return address register) By convention you can pass some parameters through registers R4-R7 ($a0-$a3 or arguments registers) Results can also be returned through registers R2-R3 ($v0-$v1) Callee Saves registers R16-R23 ($s or save registers) Caller Saves registers R8-R15 ($t or temporary registers) Using Stack (PDP 11) Store PC on stack Store all arguments on stack Store the number of arguments on stack Possibly save registers on stack Is the stack in memory or a dedicated hardware? CSCE 5610 Sept. 06, 2018 19 WP 0 9 10 Global Registers 0 15 16 16 WP+16 25 26 10 31 15 16 WP+21 25 WP+32 Register Windows -- Sun SPARC 26 31 1026 15 16 25 2 26 31 CSCE 5610 Sept. 06, 2018 20 10