RISC Processor Design


RISC Processor Design: Single Cycle Implementation - MIPS
Virendra Singh, Indian Institute of Science, Bangalore (virendra@computer.org)
Lecture 13, SE-273: Processor Design (SE-273@SERC), Feb 07, 2011

Courtesy: Prof. Vishwani Agrawal

MIPS Instructions

MIPS Registers and Memory

Example Addressing Modes
1. Immediate addressing (addi): op | rs | rt | immediate
2. Register addressing (add): op | rs | rt | rd | ... | funct; the operand is in a register
3. Base addressing (lw, sw): op | rs | rt | address; the operand (byte, halfword, or word) is in memory at register + address
4. PC-relative addressing (beq, bne): op | rs | rt | address; the target word is in memory at PC + address
5. Pseudodirect addressing (j): op | address; the target word is in memory at the address combined with PC
(2004 Morgan Kaufmann Publishers)

MIPS Instruction Set: MIPS Instruction Subset
Arithmetic and logical instructions: add, sub, or, and, slt
Memory reference instructions: lw, sw
Branch and jump instructions: beq, j

Overview of MIPS
Simple instructions, all 32 bits wide.
Very structured, no unnecessary baggage.
Only three instruction formats:
  R: op | rs | rt | rd | shamt | funct
  I: op | rs | rt | 16-bit address
  J: op | 26-bit address
Rely on the compiler to achieve performance.
(2004 Morgan Kaufmann Publishers)

Where Does It All Begin?
In a register called the program counter (PC).
PC contains the memory address of the next instruction to be executed.
In the beginning, PC contains the address of the memory location where the program begins.

Where is the Program?
[Figure: the program counter (a register in the processor) holds the start address of the program's machine code in memory.]

How Does It Run?
1. Start: PC has the memory address where the program begins.
2. Fetch the instruction word from the memory address in PC and increment PC (PC ← PC + 4) for the next instruction.
3. Decode and execute the instruction.
4. Program complete? If no, go back to step 2. If yes, stop.

Datapath and Control
Datapath: memory, registers, adders, ALU, and communication buses. Each step (fetch, decode, execute) requires communication (data transfer) paths between memory, registers, and ALU.
Control: the datapath for each step is set up by control signals that set dataflow directions on the communication buses and select ALU and memory functions. Control signals are generated by a control unit consisting of one or more finite-state machines.

Datapath for Instruction Fetch
[Figure: PC supplies the address to the instruction memory; an adder computes PC + 4 and feeds it back to PC; the instruction word goes to the control unit and registers.]

Register File: A Datapath Component
[Figure: a file of 32 registers. Inputs: two 5-bit read register numbers (reg 1, reg 2), a 5-bit write register number, 32-bit write data, and RegWrite from control. Outputs: 32-bit reg 1 data and reg 2 data.]

Multi-Operation ALU
Operation select (3 bits, from control)   ALU function
  000                                     AND
  001                                     OR
  010                                     Add
  110                                     Subtract
  111                                     Set on less than
ALU outputs: result, zero, overflow.
zero = 1 when all bits of result are 0.
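The operation-select table can be exercised with a small behavioral model. This is only a sketch: the function name `alu`, the 32-bit masking, and the signed comparison for set-on-less-than are assumptions made for illustration, and the overflow output is deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def alu(op, a, b):
    """Return (result, zero) for the 3-bit operation select code `op`."""
    a &= MASK32
    b &= MASK32
    if op == 0b000:
        result = a & b                                   # AND
    elif op == 0b001:
        result = a | b                                   # OR
    elif op == 0b010:
        result = (a + b) & MASK32                        # Add
    elif op == 0b110:
        result = (a - b) & MASK32                        # Subtract
    elif op == 0b111:
        result = int(to_signed(a) < to_signed(b))        # Set on less than
    else:
        raise ValueError(f"unsupported operation select: {op:03b}")
    return result, int(result == 0)                      # zero = 1 when result is all 0s

print(alu(0b110, 5, 5))
```

Subtracting equal operands raises the zero output, which is what the branch datapath later relies on.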

R-Type Instructions
Also known as arithmetic-logical instructions: add, sub, slt.
Example: add $t0, $s1, $s2
Machine instruction word:
  000000  10001  10010  01000  00000  100000
  opcode  $s1    $s2    $t0    shamt  function
Read two registers, write one register.
The opcode and function code go to the control unit, which generates RegWrite and the ALU operation code.
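The field packing of the add example can be checked mechanically. The sketch below assumes the standard MIPS register numbers ($t0 = 8, $s1 = 17, $s2 = 18) and function codes; the helper name `encode_r_type` is hypothetical.

```python
# Standard MIPS register numbers and R-type function codes (assumed here).
REGS = {"$t0": 8, "$t1": 9, "$s1": 17, "$s2": 18}
FUNCT = {"add": 0b100000, "sub": 0b100010, "and": 0b100100,
         "or": 0b100101, "slt": 0b101010}

def encode_r_type(op, rd, rs, rt):
    """Pack opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5)=0 | funct(6)."""
    return (REGS[rs] << 21) | (REGS[rt] << 16) | (REGS[rd] << 11) | FUNCT[op]

word = encode_r_type("add", "$t0", "$s1", "$s2")
print(f"{word:032b}")   # the six fields of the example, concatenated
print(f"{word:#010x}")
```

Note that the assembly operand order (rd, rs, rt) differs from the field order in the machine word (rs, rt, rd).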

Datapath for R-Type Instruction
[Figure: for add $t0, $s1, $s2 (000000 10001 10010 01000 00000 100000), the read register numbers 10001 ($s1) and 10010 ($s2) and the write register number 01000 ($t0) go to the register file. The two 32-bit register outputs feed the ALU, with operation select = add from control. The 32-bit ALU result returns to the register file as write data, with RegWrite from control activated.]

Load and Store Instructions
I-type instructions:
  lw $t0, 1200($t1)    # incr. in bytes
    100011  01001  01000  0000 0100 1011 0000
    opcode  $t1    $t0    1200
  sw $t0, 1200($t1)    # incr. in bytes
    101011  01001  01000  0000 0100 1011 0000
    opcode  $t1    $t0    1200
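Going the other way, the I-type fields can be pulled back out of the two example words. A minimal sketch; the helper names `sign_extend16` and `decode_i_type` are made up for illustration.

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of value to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode_i_type(word):
    """Split a 32-bit I-type word into opcode, rs, rt and the signed immediate."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": sign_extend16(word),
    }

lw_word = 0b100011_01001_01000_0000010010110000   # lw $t0, 1200($t1)
sw_word = 0b101011_01001_01000_0000010010110000   # sw $t0, 1200($t1)
print(decode_i_type(lw_word))
print(decode_i_type(sw_word))
```

Only the opcode field differs between the two words; the base register, data register, and byte offset are identical.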

Datapath for lw Instruction
[Figure: for lw $t0, 1200($t1) (100011 01001 01000 0000 0100 1011 0000), read register number 01001 ($t1) and write register number 01000 ($t0) go to the register file. The 16-bit offset 0000 0100 1011 0000 is sign extended to 32 bits. The ALU (operation select = add) computes the data memory address. With MemRead activated, the memory data is written to $t0, with RegWrite from control activated.]

Datapath for sw Instruction
[Figure: for sw $t0, 1200($t1) (101011 01001 01000 0000 0100 1011 0000), read register numbers 01001 ($t1) and 01000 ($t0) go to the register file. The 16-bit offset is sign extended to 32 bits. The ALU (operation select = add) computes the data memory address. With MemWrite activated, the $t0 data is written to memory.]

Branch Instruction (I-Type)
beq $s1, $s2, 25    # if $s1 = $s2, advance PC through 25 instructions
  000100  10001  10010  0000 0000 0001 1001
  opcode  $s1    $s2    25 (16 bits)
Note: Can branch within ±2^15 words from the current instruction address in PC.
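The branch target is PC + 4 plus the sign-extended 16-bit word offset shifted left by 2. A short sketch of that arithmetic; the function names and the sample PC value 0x1000 are assumptions for illustration.

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of value to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, imm16):
    """Target byte address of a taken beq located at address pc."""
    offset_bytes = sign_extend16(imm16) << 2      # word offset -> byte offset
    return (pc + 4 + offset_bytes) & 0xFFFFFFFF

# beq $s1, $s2, 25 placed at the (hypothetical) address 0x1000
print(hex(branch_target(0x1000, 25)))
```

Because the offset counts words and is signed, the 16-bit field reaches about 2^15 instructions in either direction.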

Datapath for beq Instruction
[Figure: for beq $s1, $s2, 25 (000100 10001 10010 0000 0000 0001 1001), read register numbers 10001 ($s1) and 10010 ($s2) go to the register file, and the ALU subtracts the two register values (operation select = subtract); its zero output goes to the branch control logic. In parallel, the 16-bit offset is sign extended to 32 bits, shifted left 2, and added to PC+4 (from the instruction fetch datapath) to form the 32-bit branch target.]

J-Type Instruction
j 2500    # jump to instruction 2,500
  000010  0000 0000 0000 0010 0111 0001 00
  opcode  2,500 (26 bits)
32-bit jump address:
  0000 0000 0000 0000 0010 0111 0001 0000
  Bits 28-31 come from PC+4; the 26-bit field is shifted left 2 to fill the remaining bits.
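The jump address construction can be written out directly. A sketch only: `jump_target` is a hypothetical helper, and PC = 0 is an assumed starting address so that the top four bits are zero as in the example.

```python
def jump_target(pc, addr26):
    """Concatenate bits 28-31 of PC+4 with the 26-bit field shifted left 2."""
    upper = (pc + 4) & 0xF0000000
    lower = (addr26 & 0x03FFFFFF) << 2
    return upper | lower

target = jump_target(0x00000000, 2500)
print(f"{target:032b}")
print(target)   # byte address of instruction 2,500
```

Instruction 2,500 sits at byte address 10,000 because each instruction occupies 4 bytes.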

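Datapath for Jump Instruction
[Figure: instruction fetch datapath extended for branch and jump. One mux, controlled by Branch, selects between PC+4 and the branch address. A second mux, controlled by Jump, selects between that result and the jump address, which is formed from the 26-bit instruction field shifted left 2 (28 bits) and the four upper bits of PC+4. The opcode (bits 26-31) goes to control; the instruction word goes to control and registers.]

Combined Datapaths
[Figure: the combined single-cycle datapath: PC, instruction memory, register file, sign extend, shift left 2, ALU with ALU control, data memory, and the PC+4 and branch adders. Instruction fields: bits 26-31 (opcode) to CONTROL, 21-25 and 16-20 to the read register inputs, 11-15 to the RegDst mux, 0-15 to sign extend, 0-5 to ALU control, 0-25 to the jump shift left 2. Control signals shown: RegDst, Jump, Branch, MemtoReg, MemRead, MemWrite.]

Control
[Figure: instruction bits 26-31 (opcode) feed the control logic, which outputs RegDst, Jump, Branch, MemRead, MemtoReg, MemWrite, ALUSrc, RegWrite, and the 2-bit ALUOp. ALUOp and instruction bits 0-5 (funct) feed the ALU control, whose output goes to the ALU.]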

Control Logic: Truth Table
Inputs: instr. opcode bits 31-26. Outputs: control signals.

Instr   Opcode    Reg  Jump  ALU  Memto  Reg    Mem   Mem    Branch  ALU  ALU
type    (31-26)   Dst        Src  Reg    Write  Read  Write          Op1  Op2
R       000000    1    0     0    0      1      0     0      0       1    0
lw      100011    0    0     1    1      1      1     0      0       0    0
sw      101011    X    0     1    X      0      0     1      0       0    0
beq     000100    X    0     0    X      0      0     0      1       0    1
j       000010    X    1     X    X      X      X     X      X       X    X
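The truth table maps directly onto a lookup keyed by opcode. The sketch below copies the rows as given on the slide and represents a don't-care (X) as `None`; the names `SIGNALS`, `TRUTH_TABLE`, and `control` are illustrative, not part of the lecture.

```python
SIGNALS = ("RegDst", "Jump", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp2")

# One string per row of the truth table, in SIGNALS order; "X" is a don't-care.
TRUTH_TABLE = {
    0b000000: "1000100010",   # R-type
    0b100011: "0011110000",   # lw
    0b101011: "X01X001000",   # sw
    0b000100: "X00X000101",   # beq
    0b000010: "X1XXXXXXXX",   # j
}

def control(opcode):
    """Return the control signals for an opcode; None marks a don't-care."""
    row = TRUTH_TABLE[opcode]
    return {name: (None if bit == "X" else int(bit))
            for name, bit in zip(SIGNALS, row)}

print(control(0b100011))   # lw
```

A hardware implementation would be free to pick either value for each `None` entry when minimizing the logic.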

How Long Does It Take?
Assume control logic is fast and does not affect the critical timing. Major time delay components are ALU, memory read/write, and register read/write.
Arithmetic-type (R-type):
  Fetch (memory read)     2ns
  Register read           1ns
  ALU operation           2ns
  Register write          1ns
  Total                   6ns

Time for lw and sw (I-Types)
ALU (R-type): 6ns
Load word (I-type):
  Fetch (memory read)     2ns
  Register read           1ns
  ALU operation           2ns
  Get data (mem. read)    2ns
  Register write          1ns
  Total                   8ns
Store word (no register write): 7ns

Time for beq (I-Type)
ALU (R-type): 6ns
Load word (I-type): 8ns
Store word (I-type): 7ns
Branch on equal (I-type):
  Fetch (memory read)     2ns
  Register read           1ns
  ALU operation           2ns
  Total                   5ns

Time for Jump (J-Type)
ALU (R-type): 6ns
Load word (I-type): 8ns
Store word (I-type): 7ns
Branch on equal (I-type): 5ns
Jump (J-type):
  Fetch (memory read)     2ns
  Total                   2ns

How Fast Can the Clock Be?
If every instruction is executed in one clock cycle, then the clock period must be at least 8ns to perform the longest instruction, i.e., lw.
This is a single cycle machine. It is slower because many instructions take less than 8ns but are still allowed that much time.
Method of speeding up: use a multicycle datapath.
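The per-instruction latencies and the resulting single-cycle clock period follow from summing the component delays. A small sketch using the delays assumed in the lecture (2ns memory access, 1ns register read or write, 2ns ALU); the dictionary names are illustrative.

```python
# Component delays in ns, as assumed in the lecture.
DELAY_NS = {"fetch": 2, "reg_read": 1, "alu": 2, "mem": 2, "reg_write": 1}

# Hardware steps each instruction class passes through in the single-cycle design.
STEPS = {
    "R-type": ["fetch", "reg_read", "alu", "reg_write"],
    "lw":     ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw":     ["fetch", "reg_read", "alu", "mem"],
    "beq":    ["fetch", "reg_read", "alu"],
    "j":      ["fetch"],
}

latency = {instr: sum(DELAY_NS[s] for s in steps) for instr, steps in STEPS.items()}
clock_period = max(latency.values())   # the slowest instruction sets the clock

print(latency)
print("single-cycle clock period:", clock_period, "ns")
```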

A Single Cycle Example
[Figure: a 32-bit adder built from a chain of 32 1-bit full adders, with inputs a31...a0 and b31...b0, carry-in 0, and outputs c32 and s31...s0.]
Delay of 1-bit full adder = 1ns
Clock period = 32ns
Time of adding words ~ 32ns
Time of adding bytes ~ 32ns

A Multicycle Implementation
[Figure: a serial adder built from a single 1-bit full adder. Shift registers supply a31...a0 and b31...b0 one bit per cycle, a flip-flop (FF, initialized to 0) holds the carry between cycles, and the sum bits s31...s0 are shifted into a result register, with final carry c32.]
Delay of 1-bit full adder = 1ns
Clock period = 1ns
Time of adding words ~ 32ns
Time of adding bytes ~ 8ns
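The serial adder idea, one full adder reused once per clock cycle with the carry held in a flip-flop, can be simulated bit by bit. A sketch with a hypothetical function name `serial_add`; it returns the number of 1ns cycles used so the byte and word cases can be compared.

```python
def serial_add(a, b, width):
    """Add two width-bit numbers one bit per cycle; return (sum, carry_out, cycles)."""
    carry = 0                      # carry flip-flop, initialized to 0
    result = 0
    for i in range(width):         # one clock cycle per bit position
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        sum_bit = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
        result |= sum_bit << i
    return result, carry, width

print(serial_add(200, 100, 8))            # byte add: 8 cycles
print(serial_add(0xFFFFFFFF, 1, 32))      # word add: 32 cycles
```

With a 1ns clock, the byte add finishes in 8ns, while the single-cycle adder always pays for the full 32-bit carry chain.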

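Multicycle Datapath
[Figure: PC, a single memory (address and data ports), instruction register (IR), memory data register (MDR), register file, A and B registers, the constant 4, ALU, and ALUOut register, connected by one-cycle data transfer paths (registers are needed to hold data between cycles).]

Multicycle Datapath Requirements
Only one ALU, since it can be reused.
Single memory for instructions and data.
Five registers added:
  Instruction register (IR)
  Memory data register (MDR)
  Three ALU registers: A and B for inputs, and ALUOut for output

Multicycle Datapath
[Figure: the multicycle datapath with control signals. PC is written under "PCWrite etc." from a mux controlled by PCSource. IorD selects the memory address; MemRead and MemWrite control the memory. IR is loaded under IRWrite: bits 26-31 go to the control FSM, 21-25 and 16-20 to the register file read ports, 11-15 and 16-20 to the RegDst mux, 0-15 to sign extend and shift left 2, 0-5 to ALU control, and 0-25 to a shift left 2 that is combined with PC bits 28-31. MemtoReg selects the register write data, and RegWrite enables the write. ALUSrcA and ALUSrcB select the ALU inputs (including the constant 4), ALUOp drives the ALU control, and the result is held in ALUOut.]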

3 to 5 Cycles for an Instruction
Step 1, Instruction fetch (all types):
  IR ← Memory[PC]; PC ← PC+4
Step 2, Instr. decode / reg. fetch (all types):
  A ← Reg(IR[21-25]); B ← Reg(IR[16-20])
  ALUOut ← PC + (sign extend IR[0-15]) << 2
Step 3, Execution, addr. comp., branch & jump completion:
  R-type:      ALUOut ← A op B
  Mem. ref.:   ALUOut ← A + sign extend (IR[0-15])
  Branch type: If (A == B) then PC ← ALUOut
  J-type:      PC ← PC[28-31] || (IR[0-25] << 2)
Step 4, Mem. access or R-type completion:
  R-type:      Reg(IR[11-15]) ← ALUOut
  Mem. ref.:   MDR ← M[ALUOut] (lw) or M[ALUOut] ← B (sw)
Step 5, Memory read completion:
  Mem. ref.:   Reg(IR[16-20]) ← MDR (lw)
Cycles: R-type 4, Mem. ref. 4 or 5, Branch type 3, J-type 3.

Cycle 1 of 5: Instruction Fetch (IF)
Read instruction into IR, IR ← M[PC]. Control signals used:
  IorD = 0        select PC
  MemRead = 1     read memory
  IRWrite = 1     write IR
Increment PC, PC ← PC + 4. Control signals used:
  ALUSrcA = 0     select PC into ALU
  ALUSrcB = 01    select constant 4
  ALUOp = 00      ALU adds
  PCSource = 00   select ALU output
  PCWrite = 1     write PC

Cycle 2 of 5: Instruction Decode (ID)
Instruction fields by bit position:
  R: opcode (31-26) | reg 1 (25-21) | reg 2 (20-16) | reg 3 (15-11) | shamt (10-6) | fncode (5-0)
  I: opcode (31-26) | reg 1 (25-21) | reg 2 (20-16) | word address increment (15-0)
  J: opcode (31-26) | word address jump (25-0)
Control unit decodes the instruction.
Datapath prepares for execution:
  R and I types: reg 1 → A reg, reg 2 → B reg. No control signals needed.
  Branch type: compute branch address in ALUOut. Control signals used:
    ALUSrcA = 0     select PC into ALU
    ALUSrcB = 11    instr. bits 0-15, shifted left 2, into ALU
    ALUOp = 00      ALU adds

Cycle 3 of 5: Execute (EX)
R type: execute function on reg A and reg B, result in ALUOut. Control signals used:
  ALUSrcA = 1     A reg into ALU
  ALUSrcB = 00    B reg into ALU
  ALUOp = 10      instr. bits 0-5 control ALU
I type, lw or sw: compute memory address in ALUOut ← A reg + sign extend IR[0-15]. Control signals used:
  ALUSrcA = 1     A reg into ALU
  ALUSrcB = 10    instr. bits 0-15 into ALU
  ALUOp = 00      ALU adds

Cycle 3 of 5: Execute (EX), continued
I type, beq: subtract reg A and reg B, write ALUOut to PC. Control signals used:
  ALUSrcA = 1     A reg into ALU
  ALUSrcB = 00    B reg into ALU
  ALUOp = 01      ALU subtracts
  If zero = 1: PCSource = 01 (ALUOut to PC), PCWriteCond = 1 (write PC)
  Instruction complete, go to IF.
J type: write jump address to PC ← IR[0-25] shifted left 2 and four leading bits of PC. Control signals used:
  PCSource = 10
  PCWrite = 1     write PC
  Instruction complete, go to IF.

Cycle 4 of 5: Reg Write/Memory
R type: write destination register from ALUOut. Control signals used:
  RegDst = 1      instr. bits 11-15 specify reg.
  MemtoReg = 0    ALUOut into reg.
  RegWrite = 1    write register
  Instruction complete, go to IF.
I type, lw: read M[ALUOut] into MDR. Control signals used:
  IorD = 1        select ALUOut into mem. addr.
  MemRead = 1     read memory to MDR
I type, sw: write M[ALUOut] from B reg. Control signals used:
  IorD = 1        select ALUOut into mem. addr.
  MemWrite = 1    write memory
  Instruction complete, go to IF.

Cycle 5 of 5: Reg Write
I type, lw: write MDR to Reg[IR(16-20)]. Control signals used:
  RegDst = 0      instr. bits 16-20 are write reg
  MemtoReg = 1    MDR to reg file write input
  RegWrite = 1    write register
  Instruction complete, go to IF.
For an alternative method of designing datapath, see N. Tredennick, Microprocessor Logic Design, the Flowchart Method, Digital Press, 1987.

1-bit Control Signals
Signal name    Value = 0                         Value = 1
RegDst         Write reg. # = bits 16-20         Write reg. # = bits 11-15
RegWrite       No action                         Write reg. ← Write data
ALUSrcA        First ALU operand ← PC            First ALU operand ← Reg. A
MemRead        No action                         Mem. data output ← M[Addr.]
MemWrite       No action                         M[Addr.] ← Mem. data input
MemtoReg       Reg. file write in ← ALUOut       Reg. file write in ← MDR
IorD           Mem. addr. ← PC                   Mem. addr. ← ALUOut
IRWrite        No action                         IR ← Mem. data output
PCWrite        No action                         PC is written
PCWriteCond    No action                         PC is written if zero(alu) = 1
[Figure: zero(alu), PCWriteCond, and PCWrite are combined by logic gates to produce "PCWrite etc."]

2-bit Control Signals
Signal name   Value   Action
ALUOp         00      ALU performs add
              01      ALU performs subtract
              10      Funct. field (bits 0-5 of IR) determines ALU operation
ALUSrcB       00      Second input of ALU ← B reg.
              01      Second input of ALU ← 4 (constant)
              10      Second input of ALU ← bits 0-15 of IR, sign ext. to 32b
              11      Second input of ALU ← bits 0-15 of IR, sign ext. and left shift 2 bits
PCSource      00      ALU output (PC+4) sent to PC
              01      ALUOut (branch target addr.) sent to PC
              10      Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC

Control: Finite State Machine
Start
Clock cycle 1: State 0, instruction fetch
Clock cycle 2: State 1, instruction decode and register fetch
Clock cycles 3-5, one of:
  FSM-M: memory access instr.
  FSM-R: R-type instr.
  FSM-B: branch instr.
  FSM-J: jump instr.

State 0: Instruction Fetch (CC1)
[Figure: multicycle datapath with the signals active in state 0: IorD=0, MemRead=1, IRWrite=1, ALUSrcA=0, ALUSrcB=01, ALUOp=00 (ALU adds), PCSource=00, PCWrite etc.=1.]

State 0 Control FSM Outputs
Start
State 0, instruction fetch:
  MemRead = 1, ALUSrcA = 0, IorD = 0, IRWrite = 1, ALUSrcB = 01, ALUOp = 00, PCWrite = 1, PCSource = 00
State 1, instruction decode / register fetch / branch addr.:
  Outputs?

State 1: Instr. Decode/Reg. Fetch/Branch Address (CC2)
[Figure: multicycle datapath with the signals active in state 1: ALUSrcA=0, ALUSrcB=11, ALUOp=00 (ALU adds).]

State 1 Control FSM Outputs
Start
State 0, instruction fetch (IF):
  MemRead = 1, ALUSrcA = 0, IorD = 0, IRWrite = 1, ALUSrcB = 01, ALUOp = 00, PCWrite = 1, PCSource = 00
State 1, instruction decode (ID) / register fetch / branch addr.:
  ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
Next, by opcode:
  Opcode = lw, sw: FSM-M
  Opcode = R-type: FSM-R
  Opcode = BEQ:    FSM-B
  Opcode = J-type: FSM-J

State 1 (Opcode = lw) → FSM-M (CC3-5)
[Figure: multicycle datapath with the signals active for lw. CC3: ALUSrcA=1, ALUSrcB=10, ALUOp=00 (ALU adds). CC4: IorD=1, MemRead=1. CC5: RegDst=0, MemtoReg=1, RegWrite=1.]

State 1 (Opcode = sw) → FSM-M (CC3-4)
[Figure: multicycle datapath with the signals active for sw. CC3: ALUSrcA=1, ALUSrcB=10, ALUOp=00 (ALU adds). CC4: IorD=1, MemWrite=1.]

FSM-M (Memory Access)
From state 1, opcode = lw or sw:
  Compute memory address: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  Opcode = lw:
    Read memory data: MemRead = 1, IorD = 1
    Write register: RegWrite = 1, MemtoReg = 1, RegDst = 0
  Opcode = sw:
    Write memory: MemWrite = 1, IorD = 1
To state 0 (instr. fetch)

State 1 (Opcode = R-type) → FSM-R (CC3-4)
[Figure: multicycle datapath with the signals active for an R-type instruction. CC3: ALUSrcA=1, ALUSrcB=00, ALUOp=10 (funct. code selects the ALU operation). CC4: RegDst=1 (bits 11-15), MemtoReg=0, RegWrite=1.]

FSM-R (R-type Instruction)
From state 1, opcode = R-type:
  ALU operation: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  Write register: RegWrite = 1, MemtoReg = 0, RegDst = 1
To state 0 (instr. fetch)

State 1 (Opcode = beq) → FSM-B (CC3)
[Figure: multicycle datapath with the signals active for beq. CC3: ALUSrcA=1, ALUSrcB=00, ALUOp=01 (ALU subtracts); if zero, PCSource=01 and PCWrite etc.=1.]

Write PC on zero
[Figure: gate logic that combines zero (= 1) and PCWriteCond (= 1) with PCWrite to give PCWrite etc. = 1.]
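The PC write-enable logic can be sketched as a Boolean function. This assumes the usual gate arrangement for this datapath, in which zero is ANDed with PCWriteCond and the result is ORed with PCWrite; the slide text itself only shows the signal names, so treat the exact gating as an assumption.

```python
def pc_write_enable(pc_write, pc_write_cond, zero):
    """Assumed gating: PCWrite OR (PCWriteCond AND zero)."""
    return int(bool(pc_write) or (bool(pc_write_cond) and bool(zero)))

print(pc_write_enable(pc_write=0, pc_write_cond=1, zero=1))   # beq taken
print(pc_write_enable(pc_write=0, pc_write_cond=1, zero=0))   # beq not taken
print(pc_write_enable(pc_write=1, pc_write_cond=0, zero=0))   # fetch or jump
```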

FSM-B (Branch)
From state 1, opcode = beq:
  Write PC on branch condition: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
  Branch condition: if A - B = 0, then zero = 1
To state 0 (instr. fetch)

State 1 (Opcode = j) → FSM-J (CC3)
[Figure: multicycle datapath with the signals active for j. CC3: PCSource=10, PCWrite etc.=1.]

Write PC
[Figure: the same gate logic with PCWrite = 1, giving PCWrite etc. = 1 regardless of zero and PCWriteCond.]

FSM-J (Jump)
From state 1, opcode = jump:
  Write jump addr. in PC: PCWrite = 1, PCSource = 10
To state 0 (instr. fetch)

Control FSM
State 0: Instr. fetch / adv. PC (start)
State 1: Instr. decode / reg. fetch / branch addr.
State 2: Compute memory addr. (lw or sw)
State 3: Read memory data (lw)
State 4: Write register
State 5: Write memory data (sw)
State 6: ALU operation (R)
State 7: Write register
State 8: Write PC on branch condition (B)
State 9: Write jump addr. to PC (J)

Control FSM (Controller)
[Figure: a combinational logic block with 6 inputs (opcode) and the present state, producing 16 control outputs and the next state. Four flip-flops (FF), with Reset and Clock, hold the state.]

Designing the Control FSM
Encode states; need 4 bits for 10 states, e.g., state 0 is 0000, state 1 is 0001, and so on.
Write a truth table for combinational logic:
  Opcode   Present state   Control signals    Next state
  000000   0000            0001000110000100   0001
  ...      ...             ...                ...
Synthesize a logic circuit from the truth table.
Connect four flip-flops between the next state outputs and present state inputs.
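The next-state part of the truth table can be prototyped in software before synthesis. The sketch below assumes the ten-state numbering of the control FSM (0 fetch, 1 decode, 2-5 memory access, 6-7 R-type, 8 branch, 9 jump) and uses mnemonic strings instead of 6-bit opcodes; both choices, and the function names, are illustrative assumptions.

```python
def next_state(state, op):
    """Next state of the control FSM for instruction class op."""
    if state == 0:
        return 1                                    # fetch -> decode
    if state == 1:                                  # dispatch on the opcode
        return {"lw": 2, "sw": 2, "R": 6, "beq": 8, "j": 9}[op]
    if state == 2:
        return 3 if op == "lw" else 5               # read memory or write memory
    if state == 3:
        return 4                                    # lw: write register
    if state == 6:
        return 7                                    # R-type: write register
    return 0                                        # states 4, 5, 7, 8, 9 -> fetch

def cycles(op):
    """Count clock cycles from fetch until the FSM returns to state 0."""
    state, count = 0, 0
    while True:
        state = next_state(state, op)
        count += 1
        if state == 0:
            return count

print({op: cycles(op) for op in ("lw", "sw", "R", "beq", "j")})
```

The cycle counts that fall out of the state graph match the 3 to 5 cycles per instruction listed for the multicycle datapath.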

Block Diagram of a Processor
[Figure: the controller (control FSM), with inputs Opcode (6 bits), Overflow, zero, Reset, and Clock, drives the datapath (PC, register file, registers, ALU) with MemRead, MemWrite, PCWriteCond, PCWrite, IRWrite, IorD, MemtoReg, RegWrite, RegDst, PCSource (2 bits), ALUSrcB (2 bits), ALUSrcA, and ALUOp (2 bits). The ALU control combines ALUOp with funct. [0,5] into a 3-bit ALU operation. The datapath connects to memory through Mem. addr., Mem. write data, and Mem. data out.]

Exceptions or Interrupts
Conditions under which the processor may produce an incorrect result or may hang:
  Illegal or undefined opcode.
  Arithmetic overflow, divide by zero, etc.
  Out of bounds memory address.
EPC: 32-bit register that holds the affected instruction address.
Cause: 32-bit register that holds an encoded exception type. For example:
  0 for undefined instruction
  1 for arithmetic overflow

Implementing Exceptions
[Figure: datapath additions for exceptions. Overflow goes to the control FSM. A mux selects 0 or 1 into the 32-bit Cause register (CauseWrite=1). The ALU, with ALUSrcA=0, ALUSrcB=01, and ALUOp=01 (subtract), supplies the value written to EPC (EPCWrite=1). PCSource=11 selects the address 8000 0180 (hex) into PC, with PCWrite etc.=1.]

How Long Does It Take? Again
Assume control logic is fast and does not affect the critical timing. Major time components are ALU, memory read/write, and register read/write.
Time for hardware operations, suppose:
  Memory read or write    2ns
  Register read           1ns
  ALU operation           2ns
  Register write          1ns

Single-Cycle Datapath
  R-type                     6ns
  Load word (I-type)         8ns
  Store word (I-type)        7ns
  Branch on equal (I-type)   5ns
  Jump (J-type)              2ns
Clock cycle time = 8ns
Each instruction takes one cycle.

Multicycle Datapath
Clock cycle time is determined by the longest operation, ALU or memory:
Clock cycle time = 2ns
Cycles per instruction (CPI):
  lw       5 (10ns)
  sw       4 (8ns)
  R-type   4 (8ns)
  beq      3 (6ns)
  j        3 (6ns)

CPI of a Computer
$$\text{CPI} = \frac{\sum_k (\text{instructions of type } k) \times \text{CPI}_k}{\sum_k (\text{instructions of type } k)}$$
where CPI_k = cycles for an instruction of type k.
Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs.

Example
Consider a program containing:
  loads        25%
  stores       10%
  branches     11%
  jumps         2%
  arithmetic   52%
CPI = 0.25 × 5 + 0.10 × 4 + 0.11 × 3 + 0.02 × 3 + 0.52 × 4 = 4.12 for the multicycle datapath
CPI = 1.00 for the single-cycle datapath

Multicycle vs. Single-Cycle
Performance ratio = Single cycle time / Multicycle time
  = (CPI × cycle time) for single-cycle / (CPI × cycle time) for multicycle
  = (1.00 × 8ns) / (4.12 × 2ns)
  = 0.97
Single cycle is faster in this case, but remember, the performance ratio depends on the instruction mix.
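The CPI and performance-ratio arithmetic can be reproduced in a few lines. The instruction mix, cycle counts, and clock periods below are the ones given in the example; the variable names are illustrative.

```python
# Instruction mix (fraction of executed instructions) and multicycle cycle counts.
MIX = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "R-type": 0.52}
CYCLES = {"lw": 5, "sw": 4, "beq": 3, "j": 3, "R-type": 4}

# Weighted average; the fractions sum to 1, so no further normalization is needed.
cpi_multi = sum(MIX[k] * CYCLES[k] for k in MIX)
cpi_single = 1.00

time_single = cpi_single * 8      # ns per instruction, 8ns clock
time_multi = cpi_multi * 2        # ns per instruction, 2ns clock
ratio = time_single / time_multi

print(f"multicycle CPI = {cpi_multi:.2f}")
print(f"performance ratio = {ratio:.2f}")
```

Changing the fractions in `MIX`, for example to a branch-heavy program, shifts the ratio, which is the point of the closing remark about instruction mix.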

Thank You