CENG 342 Lecture 6: Datapath Bei Yu byu@cse.cuhk.edu.hk CENG342 L6. Spring 27
The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified to contain only: memory-reference instructions: lw, sw arithmetic-logical instructions: add, addu, sub, subu, and, or, xor, nor, slt, sltu arithmetic-logical immediate instructions: addi, addiu, andi, ori, xori, slti, sltiu control flow instructions: beq, j q Generic implementation: use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC) decode the instruction (and read registers) execute the instruction Exec Fetch PC = PC+4 Decode CENG342 L6.2 Spring 27
Abstract Implementation View q Two types of functional units: elements that operate on data values (combinational) elements that contain state (sequential) PC Register Reg r File Reg r Reg r Data Data Data Data q Single cycle operation q Split memory (Harvard) model - one memory for instructions and one for data CENG342 L6.3 Spring 27
Fetching s q Fetching instructions involves reading the instruction from the updating the PC value to be the address of the next (sequential) instruction clock Exec Fetch PC = PC+4 Decode PC 4 PC is updated every clock cycle, so it does not need an explicit write control signal is read every clock cycle, so it doesn t need an explicit read control signal CENG342 L6.4 Spring 27
Decoding s q Decoding instructions involves sending the fetched instruction s opcode and function field bits to the control unit Fetch PC = PC+4 Control Unit Exec Decode r Register r 2 Data File Write r Data 2 reading two values from the Register File - Register File addresses are contained in the instruction CENG342 L6.5 Spring 27
ing Registers Just in Case q Note that both RegFile read ports are active for all instructions during the Decode cycle using the rs and rt instruction field addresses Since haven t decoded the instruction yet, don t know what the instruction is! Just in case the instruction uses values from the RegFile do work ahead by reading the two source operands Which instructions do make use of the RegFile values? CENG342 L6.6 Spring 27
EX: q All instructions (except j) use the after reading the registers. Please analyze memoryreference, arithmetic, and control flow instructions. CENG342 L6.7 Spring 27
Executing R Format Operations q R format operations (add, sub, slt, and, or) R-type: 3 25 2 5 5 op rs rt rd shamt funct perform operation (op and funct) on values in rs and rt store the result back into the Register File (into location rd) RegWrite control Exec Fetch PC = PC+4 Decode r Register r 2 Data File Write r Data 2 overflow zero Note that Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File CENG342 L6.8 Spring 27
Consider the slt q Remember the R format instruction slt slt $t, $s, $s # if $s < $s # then $t = # else $t = Where does the (or ) come from to store into $t in the Register File at the end of the execute cycle? RegWrite control r Register r 2 Data File Write r Data 2 overflow zero CENG342 L6.9 Spring 27
Executing Load and Store Operations q Load and store operations have to 3 25 2 5 I-Type: op rs rt address offset compute a memory address by adding the base register (in rs) to the 6-bit signed offset field in the instruction - base register was read from the Register File during decode - offset value in the low order 6 bits of the instruction must be sign extended to create a 32-bit signed value store value, read from the Register File during decode, must be written to the Data load value, read from the Data, must be stored in the Register File CENG342 L6. Spring 27
Executing Load and Store Operations, con t RegWrite control MemWrite r Register r 2 Data File Write r Data 2 overflow zero Data Data Sign 6 Extend 32 Mem CENG342 L6. Spring 27
Executing Branch Operations q Branch operations have to 3 25 2 5 I-Type: op rs rt address offset compare the operands read from the Register File during decode (rs and rt values) for equality (zero output) compute the branch target address by adding the updated PC to the sign extended6-bit signed offset field in the instruction - base register is the updated PC - offset value in the low order 6 bits of the instruction must be sign extended to create a 32-bit signed value and then shifted left 2 bits to turn it into a word address CENG342 L6.2 Spring 27
Executing Branch Operations, con t 4 Shift left 2 Branch target address control PC r Register r 2 Data File Write r Data 2 zero (to branch control logic) Sign 6 Extend 32 CENG342 L6.3 Spring 27
Executing Jump Operations q Jump operations have to 3 25 J-Type: op jump target address replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits 4 4 PC 26 Shift left 2 28 Jump address CENG342 L6.4 Spring 27
Creating a Single Datapath from the Parts q Assemble the datapath elements, add control lines as needed, and design the control path q Fetch, decode and execute each instruction in one clock cycle single cycle design no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., why we have a separate and Data ) to share datapath elements between two different instruction classes will need multiplexors at the input of the shared elements with control lines to do the selection q Cycle time is determined by length of the longest path CENG342 L6.5 Spring 27
Fetch, R, and Access Portions PC 4 RegWrite r Register r 2 Data File Write r Data 2 control ovf zero MemWrite Data Data Sign 6 Extend 32 Mem CENG342 L6.6 Spring 27
Multiplexor Insertion PC 4 RegWrite r Register r 2 Data File Write r Data 2 Src control ovf zero MemWrite Data Data MemtoReg Sign 6 Extend 32 Mem CENG342 L6.7 Spring 27
Clock Distribution System Clock clock cycle RegWrite MemWrite PC 4 r Register r 2 Data File Write r Data 2 Src control ovf zero Data Data MemtoReg Sign 6 Extend 32 Mem CENG342 L6.8 Spring 27
ing the Branch Portion PC 4 Shift left 2 RegWrite r Register r 2 Data File Write r Data 2 Src control ovf zero PCSrc MemWrite Data Data MemtoReg Sign 6 Extend 32 Mem CENG342 L6.9 Spring 27
Our Simple Control Structure q We wait for everything to settle down might not produce right answer right away and RegFile reads are combinational (as are, adders, muxes, shifter, signextender) Use write signals along with the clock edge to determine when to write to the sequential elements (to the PC, to the Register File and to the Data ) q The clock cycle time is determined by the logic delay through the longest path We are ignoring some details like register setup and hold times CENG342 L6.2 Spring 27
Summary: ing the Control q Selecting the operations to perform (, Register File and read/write) q Controlling the flow of data (multiplexor inputs) q Information comes from the 32 bits of the instruction q Observations op field always in bits 3-26 addr of two registers to be R-type: 3 25 2 5 5 op rs rt rd shamt funct 3 25 2 5 I-Type: op rs rt address offset read are always specified by the rs and rt fields (bits 25-2 and 2-6) base register for lw and sw always in rs (bits 25-2) addr. of register to be written is in one of two places in rt (bits 2-6) for lw; in rd (bits 5-) for R-type instructions offset for beq, lw, and sw always in bits 5- CENG342 L6.2 Spring 27
(Almost) Complete Single Cycle Datapath 4 Shift left 2 PCSrc RegDst RegWrite Src MemWrite MemtoReg PC Instr[3-] Instr[25-2] Instr[2-6] Instr[5 -] r Register r 2 Data File Write r Data 2 ovf zero Data Data Instr[5-] Sign 6 Extend 32 control Mem Instr[5-] Op CENG342 L6.22 Spring 27
Control q 's operation based on instruction type and function code control input Function and or xor nor add subtract set on less than q Notice that we are using different encodings than in the book CENG342 L6.23 Spring 27
EX: Control, Con t q Controlling the uses of multiple decoding levels main control unit generates the Op bits control unit generates control bits Instr op funct Op action control lw xxxxxx add sw xxxxxx add beq xxxxxx subtract add add subt subtract and and or or xor xor nor nor slt slt CENG342 L6.24 Spring 27
Control Truth Table Our m control input F5 F4 F3 F2 F F Op Op control 3 control 2 control control X X X X X X X X X X X X X X X X X X X X X X X X X X /subt Mux control q Four, 6-input truth tables CENG342 L6.25 Spring 27
Control Logic q From the truth table can design the Control logic Instr[3] Instr[2] Instr[] Instr[] Op Op control 3 control 2 control control CENG342 L6.26 Spring 27
(Almost) Complete Datapath with Control Unit 4 Op Instr[3-26] Control Unit Branch Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] RegWrite r Register r 2 Data File Write r Data 2 ovf zero Data Data Instr[5-] Sign 6 Extend 32 control Instr[5-] CENG342 L6.27 Spring 27
R-type Data/Control Flow 4 Op Instr[3-26] Control Unit Branch Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] Instr[5-] RegWrite r Register r 2 Data File Write r Data 2 Sign 6 Extend 32 ovf zero control Data Data Instr[5-] CENG342 L6.28 Spring 27
Store Word Data/Control Flow 4 Op Instr[3-26] Control Unit Branch Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] RegWrite ovf r Register r 2 Data File Write r Data 2 zero Data Data Instr[5-] Sign 6 Extend 32 control Instr[5-] CENG342 L6.29 Spring 27
Load Word Data/Control Flow 4 Op Instr[3-26] Control Unit Branch Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] RegWrite r Register r 2 Data File Write r Data 2 ovf zero Data Data Instr[5-] Sign 6 Extend 32 control Instr[5-] CENG342 L6.3 Spring 27
Branch Data/Control Flow 4 Op Instr[3-26] Control Unit Branch Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] Instr[5-] RegWrite r Register r 2 Data File Write r Data 2 Sign 6 Extend 32 ovf zero control Data Data Instr[5-] CENG342 L6.3 Spring 27
Main Control Unit Instr RegDst Src MemReg RegWr MemRd MemWr Branch Op R-type lw sw beq X X X X CENG342 L6.32 Spring 27
Control Unit Logic q From the truth table can design the Main Control logic Instr[3] Instr[3] Instr[29] Instr[28] Instr[27] Instr[26] R-type lw sw beq RegDst Src MemtoReg RegWrite Mem MemWrite Branch Op Op CENG342 L6.33 Spring 27
Review: Handling Jump Operations q Jump operation have to replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits 3 J-Type: op jump target address 4 4 PC 26 Shift left 2 28 Jump address CENG342 L6.34 Spring 27
ing the Jump Operation 4 Instr[25-] 26 Op Instr[3-26] Shift left 2 Control Unit 28 Branch 32 PC+4[3-28] Jump Src Shift left 2 PCSrc Mem MemtoReg MemWrite PC Instr[3-] RegDst Instr[25-2] Instr[2-6] Instr[5 -] RegWrite ovf r Register r 2 Data File Write r Data 2 zero Data Data Instr[5-] Sign 6 Extend 32 control Instr[5-] CENG342 L6.35 Spring 27
EX: Main Control Unit of j Instr RegDst Src MemReg RegWr MemRd MemWr Branch Op Jump R-type lw sw beq j X X X X X X X X XX CENG342 L6.36 Spring 27
Single Cycle Implementation Cycle Time q Unfortunately, though simple, the single cycle approach is not used because it is very slow q Clock cycle must have the same length for every instruction q What is the longest path (slowest instruction)? CENG342 L6.37 Spring 27
EX: Critical Paths q Calculate cycle time assuming negligible delays (for muxes, control unit, sign extend, PC access, shift left 2, wires) except: and Data (4 ns) and adders (2 ns) Register File access (reads or writes) ( ns) Instr. I Mem Reg Rd Op D Mem Reg Wr Total R- type load store beq jump 4 2 8 4 2 4 2 4 2 4 4 2 7 4 4 CENG342 L6.38 Spring 27
Single Cycle Disadvantages & Advantages q Uses the clock cycle inefficiently the clock cycle must be timed to accommodate the slowest instr especially problematic for more complex instructions like floating point multiply Clk Cycle Cycle 2 lw sw Waste q May be wasteful of area since some functional units (e.g., adders) must be duplicated since they can not be shared during a clock cycle but q It is simple and easy to understand CENG342 L6.39 Spring 27