October 24. Five Execution Steps

Similar documents
Design of the MIPS Processor

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

Processor: Multi- Cycle Datapath & Control

Control & Execution. Finite State Machines for Control. MIPS Execution. Comp 411. L14 Control & Execution 1

Design of the MIPS Processor (contd)

The Processor: Datapath & Control

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

Multicycle Approach. Designing MIPS Processor

ECE369. Chapter 5 ECE369

Chapter 5: The Processor: Datapath and Control

Lecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón

Chapter 4 The Processor (Part 2)

CPE 335. Basic MIPS Architecture Part II

LECTURE 6. Multi-Cycle Datapath and Control

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control

Topic #6. Processor Design

EECE 417 Computer Systems Architecture

Chapter 5 Solutions: For More Practice

Processor (multi-cycle)

Points available Your marks Total 100

Unsigned Binary Integers

Unsigned Binary Integers

2) Using the same instruction set for the TinyProc2, convert the following hex values to assembly language: x0f

Computer Science 141 Computing Hardware

CSE 2021 COMPUTER ORGANIZATION

CSE 2021 COMPUTER ORGANIZATION

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Chapter 1. Computer Abstractions and Technology. Lesson 3: Understanding Performance

Alternative to single cycle. Drawbacks of single cycle implementation. Multiple cycle implementation. Instruction fetch

RISC Processor Design

Format. 10 multiple choice 8 points each. 1 short answer 20 points. Same basic principals as the midterm

These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.

Topic Notes: MIPS Instruction Set Architecture

CS/COE 0447 Example Problems for Exam 2 Fall 2010

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

Systems Architecture I

Lets Build a Processor

Chapter 2A Instructions: Language of the Computer

ECE232: Hardware Organization and Design

Homework 4 - Solutions (Floating point representation, Performance, Recursion and Stacks) Maximum points: 80 points

2B 52 AB CA 3E A1 +29 A B C. CS120 Fall 2018 Final Prep and super secret quiz 9

Multiple Cycle Data Path

CENG3420 Lecture 03 Review

MIPS Assembly Programming

CSCI 402: Computer Architectures. Instructions: Language of the Computer (3) Fengguang Song Department of Computer & Information Science IUPUI.

Computer Organisation CS303

ECE 30 Introduction to Computer Engineering

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming

Single vs. Multi-cycle Implementation

CSE 141 Computer Architecture Summer Session Lecture 3 ALU Part 2 Single Cycle CPU Part 1. Pramod V. Argade

ENE 334 Microprocessors

RISC Architecture: Multi-Cycle Implementation

1 5. Addressing Modes COMP2611 Fall 2015 Instruction: Language of the Computer

ECE331: Hardware Organization and Design

Computer Architecture

EN164: Design of Computing Systems Topic 03: Instruction Set Architecture Design

MIPS PROJECT INSTRUCTION SET and FORMAT

ECE/CS 552: Introduction to Computer Architecture ASSIGNMENT #1 Due Date: At the beginning of lecture, September 22 nd, 2010

CC 311- Computer Architecture. The Processor - Control

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

TDT4255 Computer Design. Lecture 4. Magnus Jahre. TDT4255 Computer Design

Review: Abstract Implementation View

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 3 2/8/2018

Midterm I October 6, 1999 CS152 Computer Architecture and Engineering

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

RISC Architecture: Multi-Cycle Implementation

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 5 2/22/2018

Note- E~ S. \3 \S U\e. ~ ~s ~. 4. \\ o~ (fw' \i<.t. (~e., 3\0)

Review: MIPS Organization

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

MIPS ISA and MIPS Assembly. CS301 Prof. Szajda

Lecture 5: The Processor

Lecture 5: Procedure Calls

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

Cache Organizations for Multi-cores

Instructions: MIPS ISA. Chapter 2 Instructions: Language of the Computer 1

Mips Code Examples Peter Rounce

Processor design - MIPS

Computer Science and Engineering 331. Midterm Examination #1. Fall Name: Solutions S.S.#:

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

RISC Design: Multi-Cycle Implementation

Thomas Polzer Institut für Technische Informatik

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

University of California at Santa Barbara. ECE 154A Introduction to Computer Architecture. Quiz #1. October 30 th, Name (Last, First)

CSSE232 Computer Architecture I. Mul5cycle Datapath

CSCI 402: Computer Architectures. Arithmetic for Computers (4) Fengguang Song Department of Computer & Information Science IUPUI.

Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle

CSE Lecture In Class Example Handout

The MIPS Instruction Set Architecture

Grading: 3 pts each part. If answer is correct but uses more instructions, 1 pt off. Wrong answer 3pts off.

Control Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary

Control Instructions

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

CSCE 5610: Computer Architecture

Transcription:

October 24 Programming problems? Read Section 6.1 for November 5 How instructions execute Test Preview Ask Questions! 10/24/2001 Comp 120 Fall 2001 1 Five Execution Steps Instruction Fetch Instruction Decode and Register Fetch Execution, Memory Address Computation, or Branch Completion Memory Access or R-type instruction completion Memory Read Completion INSTRUCTIONS TAKE FROM 3-5 CYCLES! A FSM looks at the op-code to determine how many... 10/24/2001 Comp 120 Fall 2001 2 1

Step 1: Instruction Fetch Use PC to get instruction and put it in the Instruction Register. Increment the PC by 4 and put the result back in the PC. Can be described succinctly using RTL "Register-Transfer Language" IR = Memory[PC]; IR is Instruction Register PC = PC + 4; What is the advantage of updating the PC now? 10/24/2001 Comp 120 Fall 2001 3 Step 2: Instruction Decode and Register Fetch Read registers rs and rt in case we need them Compute the branch address in case the instruction is a branch RTL: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(ir[15-0]) << 2); We aren't setting any control lines based on the instruction type (we are busy "decoding" it in our control logic) 10/24/2001 Comp 120 Fall 2001 4 2

Step 3 (instruction dependent) ALU is performing one of three functions, based on instruction type Memory Reference: ALUOut = A + sign-extend(ir[15-0]); R-type: ALUOut = A op B; Branch: if (A==B) PC = ALUOut; 10/24/2001 Comp 120 Fall 2001 5 Step 4 (R-type or memory-access) Loads and stores access memory MDR = Memory[ALUOut]; MDR is Memory Data Register or Memory[ALUOut] = B; R-type instructions finish Reg[IR[15-11]] = ALUOut; 10/24/2001 Comp 120 Fall 2001 6 3

Step 5 Memory Read Completion Reg[IR[20-16]]= MDR; 10/24/2001 Comp 120 Fall 2001 7 Summary: Step name Instruction fetch Instruction decode/register fetch Action for R-type instructions Action for memory-reference Action for instructions branches IR = Memory[PC] PC = PC + 4 A = Reg [IR[25-21]] B = Reg [IR[20-16]] ALUOut = PC + (sign-extend (IR[15-0]) << 2) Action for jumps Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] II computation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2) jump completion Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut] completion ALUOut or Store: Memory [ALUOut] = B Memory read completion Load: Reg[IR[20-16]] = MDR 10/24/2001 Comp 120 Fall 2001 8 4

Test Preview You should be able to answer all of these 10/24/2001 Comp 120 Fall 2001 9 1a-1b 1. Consider the hex number 0x0000A120 a.[4] What bit pattern (base 2 number) does it represent? 0000 0000 0000 0000 1010 0001 0010 0000 b. [4] If you complement it (that is one s complement), what hex number do you get? 0xFF FF 5E DF 10/24/2001 Comp 120 Fall 2001 10 5

1c-1d Consider the hex number 0x0000A120 c. [4] If you negate it, what hex number do you get? 0xFF FF 5E E0 d. [4] What decimal value does it represent when interpreted as a signed integer? 10x16 3 + 1x16 2 +2*16 = 40960 + 256 + 32 = 41248 10/24/2001 Comp 120 Fall 2001 11 2a-2b 2. Consider the 32 bit binary number 1100 0000 0000 0000 0000 0000 0000 0000 a. [4] What is its hex representation? 0xC0000000 b. [4] Interpret it as an unsigned integer; what is its valueindecimal? 2^31 + 2^30 = 3G = 3221225472 10/24/2001 Comp 120 Fall 2001 12 6

2c-2d Consider the 32 bit binary number 1100 0000 0000 0000 0000 0000 0000 0000 c. [4] Interpret it as a signed integer; what is its value in decimal? 2 31 +2 30 = -1073741824 d. [4] Interpret it as an IEEE floating point single precision number (sign, 8-bit exponent with bias=127, 23-bit mantissa); what is its value in decimal? (-1) sign *1.0*2 (128-127) = 2.0 2 7= 128 1100 0000 0000 0000 0000 0000 0000 0000 10/24/2001 Comp 120 Fall 2001 13 3. Using 8-bit twos-complement binary numbers (that is, numbers only 8 bits long to save you work) a. [8] Show how to add 23 and 17. 23=00010111, 17=00010001, ~17=11101110, -17 = 11101111, 00010111 + 11101111 = 00000110 3 b. [8] Show how to subtract 15 from 20 by negating and adding. 20=00010100, 15=00001111, ~15=11110000, -15=11110001, 00010100+11110001 = 00000101 10/24/2001 Comp 120 Fall 2001 14 7

4 4. The branch instructions bne (branch-not-equal) and beq (branch-equal) use PC-relative addressing. a. [8] Assuming the condition is true, show how to determine the byte address of the next instruction to be executed, given the PC value and 16 bits of the immediate address field of the instruction. M = immediate field 16 bits interpreted as a signed integer. PC = PC+4+4*M b. [4] What characteristics of the branches in real programs justify the choice of PC-relative addressing? branches tend to be to nearby addresses, this is called locality 10/24/2001 Comp 120 Fall 2001 15 5 5. [8] Suppose the target address L1 for a beq instruction is 500k bytes away. This is too far away to represent in the 16 bit immediate address of the beq instruction. What instruction sequence should we use to accomplish the goal of branching to L1 on equal compare without using any registers? (I m interested in the idea here, not the exact instructions). 500K is slightly smaller than 2^19 (=512K). change the beq to a bne that skips the next instruction on true. Make the next instruction a jump which has a 25 bit address field. e.g., beq $t0, $t1, L1 becomes bne $t0, $t1, Next j L1 Next:... 10/24/2001 Comp 120 Fall 2001 16 8

6a 6. When we access an array of integers, we have to multiply the index by 4 to get the byte offset. A certain program has an average CPI of 2.5 with 10% of the cycles devoted to converting array indices to addresses with an instruction sequence like: mul $t0, $t1, 4 % convert index to a byte offset add $t0, $t0, $a0 % add offset to the base address % of the array Assume the above sequence takes 10 cycles. a. [4] Show how to implement this address calculation by replacing the multiply with adds. add $t0, $t1, $t1 add $t0, $t0, $t0 add $t0, $t0, $a0 10/24/2001 Comp 120 Fall 2001 17 6b...average CPI of 2.5 with 10% of the cycles devoted to converting... b. [4] Assuming the add sequence takes 3 cycles, what is the speedup for new program over using the multiply sequence? 10 Cycles takes 10%, so: 10/0.1 = 100 cycles total. (10 in conversion, 90 in others) New C=93. (3 in conversion, 90 in others) Speedup = 100/93 = 1.075 Or using Amdahl s Law: 1 ------------------- 0.9 + 0.1*3/10 10/24/2001 Comp 120 Fall 2001 18 9

6c-6d...average CPI of 2.5 with 10% of the cycles devoted to converting... c. [4] Show how to implement this address calculation by replacing the multiply with a shift-left-logical. sll $t0, $t1, 2 add $t0, $t0, $a0 $t1 = b 31 b 30 b 29 b 28 b 1 b 0 --> $t0 = b 29 b 28 b 1 b 0 00 $t0=$t1*4ifb 31 b 30 =00 d. [4] Assuming the shift/add sequence takes 2 cycles, what is the average CPI for the new program? 10/0.1 = 100 cycles total. CPI=2.5 so I = 40. (CPI=C/I) New I=40, (Still 2 in conversion, 38 in others) New C=92. (2 in conversion, 90 in others) New CPI=92/40 = 2.3. 10/24/2001 Comp 120 Fall 2001 19 7 7. [8] Suppose we want to design a 1-bit subtracter like our 1-bit adder. The inputs are A, B, and Borrow-In. The outputs are A-B and Borrow-out. Make a truth table for outputs A-B and Borrow-out. A B BI A-B BO 0 0 0 0 0 0 0 1 1 1 0 1 0 1 1 0 1 1 0 1 1 0 0 1 0 1 0 1 0 0 1 1 0 0 0 1 1 1 1 1 e.g....0 1 A 0 1... -...0 0 B 1 0... 10/24/2001 Comp 120 Fall 2001 20 10

8 8. [8] Consider 3 functions, F1, F2, and F3. F1 calls F2 and F2 calls F3, which is a leaf function. Function F1 uses and overwrites registers t0-7 and s0- s3. Function F2 uses and overwrites registers t0-3 and s0-1. Function F3 uses and overwrites registers t0-t4. At the beginning of each function we save some registers on the stack and at the end of each function we restore these registers. For each function, specify the minimal set of registers that must be saved and restored using the MIPS register saving conventions. MIPS convention: no need to save t registers. F1 must save s0, s1, s2, s3, RA. F2 must save s0, s1, RA. F3 doesn t have to save anything. Only 1 if you forgot RA. Why RA? F1 uses jr $ra at the end. But it changes $ra when it calls F2 by jal F2. 10/24/2001 Comp 120 Fall 2001 21 Assignment Email to me (gb@cs.unc.edu) by Friday 5PM. at least 2 questions on material we ve covered that you would like to review OR a statement that you don t have any questions 10/24/2001 Comp 120 Fall 2001 22 11