RISC Design: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in CP-226: Computer Architecture Lecture 8 (11 Feb 2013)
PC 4 Add Instr. mem. Combined Datapaths RegDst 0-15 0-5 0-25 Shift Jump left 2 opcode 26-31 21-25 16-20 11-15 Sign ext. CONTROL 1 mux 0 Reg. File Shift left 2 Branch 1 mux 0 Cont. zero 11 Feb 2013 Computer Architecture@MNIT 2 1 mux 0 MemWrite MemRead Data mem. 0 mux 1 MemtoReg 0 mux 1
Control Logic Instruction bits 26-31 opcode Instruction bits 0-5 funct. Control Logic Op RegDst Jump Branch MemRead MemtoReg MemWrite Src RegWrite Control 11 Feb 2013 Computer Architecture@MNIT 3 2 to
Time for Jump (J-Type) (R-type) 6ns Load word (I-type) 8ns Store word (I-type) 7ns Branch on equal (I-type) 5ns Jump (J-type) Fetch (memory read) 2ns Total 2ns 11 Feb 2013 Computer Architecture@MNIT 4
How Fast Can the Clock Be? If every instruction is executed in one clock cycle, then: Clock period must be at least 8ns to perform the longest instruction, i.e., lw. This is a single cycle machine. It is slower because many instructions take less than 8ns but are still allowed that much time. Method of speeding up: Use multicycle datapath. 11 Feb 2013 Computer Architecture@MNIT 5
A Single Cycle Example a31... a2 a1 a0 b31... b2 b1 b0 0 Delay of 1-bit full adder = 1ns Clock period 32ns 1-b full adder 1-b full adder 1-b full adder 1-b full adder Time of adding words ~ 32ns Time of adding bytes ~ 32ns c32 s31... s2 s1 s0 11 Feb 2013 Computer Architecture@MNIT 6
A Multicycle Implementation Shift Shift a31... a2 a1 a0 b31... b2 b1 b0 Delay of 1-bit full adder = 1ns Clock period 1ns Time of adding words ~ 32ns Time of adding bytes ~ 8ns c32 1-b full adder FF Initialize to 0 s31... s2 s1 s0 Shift 11 Feb 2013 Computer Architecture@MNIT 7
Multicycle Datapath PC Addr. Data Memory Instr. reg. (IR) Mem. Data (MDR) Register file A Reg. B Reg. 4 Out Reg. One-cycle data transfer paths (need registers to hold data) 11 Feb 2013 Computer Architecture@MNIT 8
Multicycle Datapath Requirements Only one, since it can be reused. Single memory for instructions and data. Five registers added: Instruction register (IR) Memory data register (MDR) Three registers, A and B for inputs and Out for output 11 Feb 2013 Computer Architecture@MNIT 9
Multicycle Datapath PCSource IorD out MUX in1 PCWrite etc. PC in2 control Data Addr. Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 11-15 21-25 Register file RegDst RegWrite B Reg. A Reg. SrcA SrcB Shift left 2 28-31 4 Out Reg. MemRead MemWrite MemtoReg 0-5 0-15 Sign extend Shift left 2 control Op 11 Feb 2013 Computer Architecture@MNIT 10
3 to 5 Cycles for an Instruction Step R-type Mem. Ref. Branch type J-type (4 cycles) (4 or 5 cycles) (3 cycles) (3 cycles) Instruction fetch IR Memory[PC]; PC PC+4 Instr. decode/ Reg. fetch Execution, addr. Comp., branch & jump completion Mem. Access or R-type completion Memory read completion Out A op B Reg(IR[11-15]) Out A Reg(IR[21-25]); B Reg(IR[16-20]) Out PC + (sign extend IR[0-15]) << 2 Out A+sign extend (IR[0-15]) MDR M[out] or M[Out] B Reg(IR[16-20]) MDR If (A= =B) then PC Out PC PC[28-31] (IR[0-25]<<2) 11 Feb 2013 Computer Architecture@MNIT 11
Cycle 1 of 5: Instruction Fetch (IF) Read instruction into IR, M[PC] IR Control signals used:» IorD = 0 select PC» MemRead = 1 read memory» IRWrite = 1 write IR Increment PC, PC + 4 PC Control signals used:» SrcA = 0 select PC into» SrcB = 01 select constant 4» Op = 00 adds» PCSource = 00 select output» PCWrite = 1 write PC 11 Feb 2013 Computer Architecture@MNIT 12
Cycle 2 of 5: Instruction Decode (ID) R I 31-26 25-21 20-16 15-11 10-6 5-0 opcode reg 1 reg 2 reg 3 shamt fncode opcode reg 1 reg 2 word address increment J opcode word address jump Control unit decodes instruction Datapath prepares for execution R and I types, reg 1 A reg, reg 2 B reg» No control signals needed Branch type, compute branch address in Out» SrcA = 0 select PC into» SrcB = 11 Instr. Bits 0-15 shift 2 into» Op = 00 adds 11 Feb 2013 Computer Architecture@MNIT 13
Cycle 3 of 5: Execute (EX) R type: execute function on reg A and reg B, result in Out Control signals used:» SrcA = 1 A reg into» srcb = 00 B reg into» Op = 10 instr. Bits 0-5 control I type, lw or sw: compute memory address in Out A reg + sign extend IR[0-15] Control signals used:» SrcA = 1 A reg into» SrcB = 10 Instr. Bits 0-15 into» Op = 00 adds 11 Feb 2013 Computer Architecture@MNIT 14
Cycle 3 of 5: Execute (EX) I type, beq: subtract reg A and reg B, write Out to PC Control signals used:» SrcA = 1 A reg into» srcb = 00 B reg into» Op = 01 subtracts» If zero = 1, PCSource = 01 Out to PC» If zero = 1, PCwriteCond = 1 write PC» Instruction complete, go to IF J type: write jump address to PC IR[0-25] shift 2 and four leading bits of PC Control signals used:» PCSource = 10» PCWrite = 1 write PC» Instruction complete, go to IF 11 Feb 2013 Computer Architecture@MNIT 15
Cycle 4 of 5: Reg Write/Memory R type, write destination register from Out Control signals used:» RegDst = 1 Instr. Bits 11-15 specify reg.» MemtoReg = 0 Out into reg.» RegWrite = 1 write register» Instruction complete, go to IF I type, lw: read M[Out] into MDR Control signals used:» IorD = 1 select Out into mem adr.» MemRead = 1 read memory to MDR I type, sw: write M[Out] from B reg Control signals used:» IorD = 1 select Out into mem adr.» MemWrite = 1 write memory» Instruction complete, go to IF 11 Feb 2013 Computer Architecture@MNIT 16
Cycle 5 of 5: Reg Write I type, lw: write MDR to reg[ir(16-20)] Control signals used:» RegDst = 0 instr. Bits 16-20 are write reg» MemtoReg = 1 MDR to reg file write input» RegWrite = 1 read memory to MDR» Instruction complete, go to IF For an alternative method of designing datapath, see N. Tredennick, Microprocessor Logic Design, the Flowchart Method, Digital Press, 1987. 11 Feb 2013 Computer Architecture@MNIT 17
1-bit Control Signals Signal name Value = 0 Value =1 RegDst Write reg. # = bit 16-20 Write reg. # = bit 11-15 RegWrite No action Write reg. Write data SrcA First Operand PC First Operand Reg. A MemRead No action Mem.Data Output M[Addr.] MemWrite No action M[Addr.] Mem. Data Input MemtoReg Reg.File Write In Out Reg.File Write In MDR IorD Mem. Addr. PC Mem. Addr. Out IRWrite No action IR Mem.Data Output PCWrite No action PC is written PCWriteCond No action PC is written if zero()=1 zero() PCWriteCond PCWrite PCWrite etc. 11 Feb 2013 Computer Architecture@MNIT 18
2-bit Control Signals Signal name Value Action Op SrcB PCSource 00 performs add 01 performs subtract 10 Funct. field (0-5 bits of IR ) determines operation 00 Second input of B reg. 01 Second input of 4 (constant) 10 Second input of 0-15 bits of IR sign ext. to 32b 11 Second input of 0-15 bits of IR sign ext. and left shift 2 bits 00 output (PC +4) sent to PC 01 Out (branch target addr.) sent to PC 10 Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC 11 Feb 2013 Computer Architecture@MNIT 19
Control: Finite State Machine Start Clock cycle 1 Clock cycle 2 State 1 State 0 Instruction fetch Instruction decode and register fetch Clock cycles 3-5 FSM-M Memory access instr. FSM-R FSM-B FSM-J R-type instr. Branch instr. Jump instr. 11 Feb 2013 Computer Architecture@MNIT 20
State 0: Instruction Fetch (CC1) PCSource=00 PC IorD=0 out MUX in1 PCWrite etc.=1 in2 control Data Addr. Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite =1 0-25 16-20 Mem. Data (MDR) 21-25 Register file RegDst RegWrite B Reg. A Reg. SrcA=0 Shift left 2 28-31 SrcB=01 4 Add Out Reg. MemRead = 1 MemWrite MemtoReg Sign extend Shift left 2 11 Feb 2013 Computer Architecture@MNIT 21 0-5 0-15 control Op =00
State 0 Control FSM Outputs State0 Instruction fetch Start MemRead =1 SrcA = 0 IorD = 0 IRWrite = 1 SrcB = 01 Op = 00 PCWrite = 1 PCSource = 00 State 1 Instruction decode/ Register fetch/ Branch addr. Outputs? 11 Feb 2013 Computer Architecture@MNIT 22
State 1: Instr. Decode/Reg. Fetch/ Branch Address (CC2) PCSource IorD out MUX in1 PCWrite etc. PC in2 control Data Addr. Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 21-25 Register file RegDst RegWrite B Reg. A Reg. SrcA=0 Shift left 2 28-31 SrcB=11 4 Add Out Reg. MemRead MemWrite MemtoReg 0-5 0-15 Sign extend Shift left 2 control Op = 00 11 Feb 2013 Computer Architecture@MNIT 23
State 1 Control FSM Outputs State0 Instruction fetch (IF) Start MemRead =1 SrcA = 0 IorD = 0 IRWrite = 1 SrcB = 01 Op = 00 PCWrite = 1 PCSource = 00 State 1 Instruction decode (ID) / Register fetch / Branch addr. SrcA = 0 SrcB = 11 Op = 00 Opcode = BEQ Opcode = J-type FSM-M FSM-R FSM-B FSM-J 11 Feb 2013 Computer Architecture@MNIT 24
State 1 (Opcode = lw) FSM-M (CC3-5) CC4 PCSource MUX in1 PCWrite etc. PC IorD=1 out in2 control Addr. Data Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 21-25 Register file CC5 RegWrite=1 RegDst=0 B Reg. A Reg. SrcA=1 Shift left 2 28-31 SrcB=10 4 CC3 Add Out Reg. MemRead=1 MemWrite MemtoReg=1 0-5 0-15 Sign extend Shift left 2 control Op = 00 11 Feb 2013 Computer Architecture@MNIT 25
State 1 (Opcode= sw) FSM-M (CC3-4) CC4 PCSource PC IorD=1 out MUX in1 PCWrite etc. in2 control Addr. Data Memory 26-31 to Control FSM CC4 Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 21-25 Register file RegWrite RegDst=0 B Reg. A Reg. SrcA=1 Shift left 2 28-31 SrcB=10 4 CC3 Add Out Reg. MemRead MemWrite=1 MemtoReg 0-5 0-15 Sign extend Shift left 2 control Op = 00 11 Feb 2013 Computer Architecture@MNIT 26
FSM-M (Memory Access) From state 1 Opcode = lw or sw Read Memory data MemRead = 1 IorD = 1 Opcode = lw Compute mem addrress SrcA =1 SrcB = 10 Op = 00 Write register RegWrite = 1 MemtoReg = 1 RegDst = 0 Opcode = sw Write memory MemWrite = 1 IorD = 1 To state 0 (Instr. Fetch) 11 Feb 2013 Computer Architecture@MNIT 27
State 1(Opcode=R-type) FSM-R (CC3-4) PCSource IorD out MUX in1 PCWrite etc. PC in2 control Addr. Data Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 21-25 Register file RegWrite A Reg. 11-15 CC3 RegDst=0 CC4 B Reg. SrcA=1 Shift left 2 28-31 SrcB=00 4 Out Reg. funct. code MemRead MemWrite MemtoReg=0 Sign extend Shift left 2 11 Feb 2013 Computer Architecture@MNIT 28 0-5 0-15 control Op = 10
FSM-R (R-type Instruction) From state 1 Opcode = R-type operation SrcA =1 SrcB = 00 Op = 10 Write register RegWrite = 1 MemtoReg = 0 RegDst = 1 To state 0 (Instr. Fetch) 11 Feb 2013 Computer Architecture@MNIT 29
State 1 (Opcode = beq ) FSM-B (CC3) IorD out MUX in1 PCWrite etc.=1 PC in2 control Addr. Data Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 21-25 16-20 Mem. Data (MDR) 11-15 0-25 Register file RegDst RegWrite A Reg. CC3 B Reg. If(zero) SrcA=1 Shift left 2 28-31 SrcB=00 4 zero Out Reg. subtract PCSource 01 MemRead MemWrite MemtoReg Sign extend Shift left 2 11 Feb 2013 Computer Architecture@MNIT 30 0-5 0-15 control Op = 01
Write PC on zero zero=1 PCWriteCond=1 PCWrite PCWrite etc.=1 11 Feb 2013 Computer Architecture@MNIT 31
FSM-B (Branch) From state 1 Opcode = beq Write PC on branch condition SrcA =1 SrcB = 00 Op = 01 PCWriteCond=1 PCSource=01 Branch condition: If A B=0 zero = 1 To state 0 (Instr. Fetch) 11 Feb 2013 Computer Architecture@MNIT 32
State 1 (Opcode = j) FSM-J (CC3) IorD out MUX in1 PCWrite etc. PC in2 Dat a control Addr. MemRead MemWrite 26-31 to Control MemoryFSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) MemtoReg 11-15 0-5 21-25 0-15 Register file RegDst CC3 RegWrite Sign extend A Reg. B Reg. SrcA SrcB Shift left 2 Shift left 2 28-31 4 zero control PCSource 10 Out Reg. Op 11 Feb 2013 Computer Architecture@MNIT 33
Write PC zero PCWriteCond PCWrite=1 PCWrite etc.=1 11 Feb 2013 Computer Architecture@MNIT 34
FSM-J (Jump) From state 1 Opcode = jump Write jump addr. In PC PCWrite=1 PCSource=10 To state 0 (Instr. Fetch) 11 Feb 2013 Computer Architecture@MNIT 35
Control FSM 3 State 0 2 Instr. fetch/ adv. PC Start 1 Instr. decode/reg. fetch/branch addr. R B J Read memory data Write register lw 4 5 Compute memory addr. sw Write memory data 6 7 operation Write register Write PC on branch condition 8 9 Write jump addr. to PC 11 Feb 2013 Computer Architecture@MNIT 36
Control FSM (Controller) 6 inputs (opcode) Combinational logic 16 control outputs Present state Next state Reset Clock FF FF FF FF 11 Feb 2013 Computer Architecture@MNIT 37
Designing the Control FSM Encode states; need 4 bits for 10 states, e.g., State 0 is 0000, state 1 is 0001, and so on. Write a truth table for combinational logic: Opcode Present state Control signals Next state 000000 0000 0001000110000100 0001................ Synthesize a logic circuit from the truth table. Connect four flip-flops between the next state outputs and present state inputs. 11 Feb 2013 Computer Architecture@MNIT 38
Block Diagram of a Processor MemWrite MemRead Controller (Control FSM) Op 2-bits control Opcode 6-bits zero Overflow SrcA SrcB 2-bits PCSource 2-bits RegDst RegWrite MemtoReg IorD IRWrite PCWrite PCWriteCond funct. [0,5] Op 3-bits Reset Clock Mem. Addr. Datapath (PC, register file, registers, ) Mem. write data Mem. data out 11 Feb 2013 Computer Architecture@MNIT 39
Exceptions or Interrupts Conditions under which the processor may produce incorrect result or may hang. Illegal or undefined opcode. Arithmetic overflow, divide by zero, etc. Out of bounds memory address. EPC: 32-bit register holds the affected instruction address. Cause: 32-bit register holds an encoded exception type. For example, 0 for undefined instruction 1 for arithmetic overflow 11 Feb 2013 Computer Architecture@MNIT 40
Implementing Exceptions PCWrite etc.=1 PC MUX out in1 in2 control 26-31 to Control FSM Instr. reg. (IR) 0 1 CauseWrite=1 Cause 32-bit register 8000 0180(hex) SrcA=0 SrcB=01 4 control Subtract PCSource 11 EPCWrite=1 EPC Overflow to Control FSM Op =01 11 Feb 2013 Computer Architecture@MNIT 41
How Long Does It Take? Again Assume control logic is fast and does not affect the critical timing. Major time components are, memory read/write, and register read/write. Time for hardware operations, suppose Memory read or write Register read operation Register write 2ns 1ns 2ns 1ns 11 Feb 2013 Computer Architecture@MNIT 42
Single-Cycle Datapath R-type 6ns Load word (I-type) 8ns Store word (I-type) 7ns Branch on equal (I-type) 5ns Jump (J-type) 2ns Clock cycle time = 8ns Each instruction takes one cycle 11 Feb 2013 Computer Architecture@MNIT 43
Multicycle Datapath Clock cycle time is determined by the longest operation, or memory: Clock cycle time = 2ns Cycles per instruction (CPI): lw 5 (10ns) sw 4 (8ns) R-type 4 (8ns) beq 3 (6ns) j 3 (6ns) 11 Feb 2013 Computer Architecture@MNIT 44
CPI of a Computer k (Instructions of type k) CPI k CPI = k (instructions of type k) where CPI k = Cycles for instruction of type k Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs. 11 Feb 2013 Computer Architecture@MNIT 45
Example Consider a program containing: loads 25% stores 10% branches 11% jumps 2% Arithmetic 52% CPI= 0.25 5 + 0.10 4 + 0.11 3 + 0.02 3 + 0.52 4 = 4.12 for multicycle datapath CPI= 1.00 for single-cycle datapath 11 Feb 2013 Computer Architecture@MNIT 46
Multicycle vs. Single-Cycle Performance ratio = Single cycle time / Multicycle time (CPI cycle time) for single-cycle = (CPI cycle time) for multicycle 1.00 8ns = = 0.97 4.12 2ns Single cycle is faster in this case, but remember, performance ratio depends on the instruction mix. 11 Feb 2013 Computer Architecture@MNIT 47
Thank You 11 Feb 2013 Computer Architecture@MNIT 48