Chapter 5: The Processor: Datapath and Control

Similar documents
Systems Architecture I

Topic #6. Processor Design

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control

ECE369. Chapter 5 ECE369

Lecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón

CPE 335. Basic MIPS Architecture Part II

Processor: Multi- Cycle Datapath & Control

The Processor: Datapath & Control

CSE 2021 COMPUTER ORGANIZATION

LECTURE 6. Multi-Cycle Datapath and Control

RISC Processor Design

CC 311- Computer Architecture. The Processor - Control

CSE 2021 COMPUTER ORGANIZATION

Lets Build a Processor

Multiple Cycle Data Path

Systems Architecture

Processor (I) - datapath & control. Hwansoo Han

Major CPU Design Steps

ENE 334 Microprocessors

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

Chapter 5 Solutions: For More Practice

Lecture 5: The Processor

RISC Architecture: Multi-Cycle Implementation

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

Design of the MIPS Processor

Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

EECE 417 Computer Systems Architecture

RISC Design: Multi-Cycle Implementation

RISC Architecture: Multi-Cycle Implementation

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Design of the MIPS Processor (contd)

Note- E~ S. \3 \S U\e. ~ ~s ~. 4. \\ o~ (fw' \i<.t. (~e., 3\0)

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Alternative to single cycle. Drawbacks of single cycle implementation. Multiple cycle implementation. Instruction fetch

Multicycle Approach. Designing MIPS Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

Points available Your marks Total 100

Chapter 4 The Processor (Part 2)

Chapter 4. The Processor

Chapter 4. The Processor

ALUOut. Registers A. I + D Memory IR. combinatorial block. combinatorial block. combinatorial block MDR

The Processor: Datapath & Control

LECTURE 5. Single-Cycle Datapath and Control

CPE 335 Computer Organization. Basic MIPS Architecture Part I

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

The MIPS Processor Datapath

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

Mapping Control to Hardware

CPU Organization (Design)

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Inf2C - Computer Systems Lecture Processor Design Single Cycle

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

ECE260: Fundamentals of Computer Engineering

Processor (multi-cycle)

CSE 2021: Computer Organization Fall 2010 Solution to Assignment # 3: Multicycle Implementation

Computer Science 141 Computing Hardware

Single Cycle Datapath

TDT4255 Computer Design. Lecture 4. Magnus Jahre. TDT4255 Computer Design

ECE 3056: Architecture, Concurrency and Energy of Computation. Single and Multi-Cycle Datapaths: Practice Problems

CENG 3420 Lecture 06: Datapath

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

ECE232: Hardware Organization and Design

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

Using a Hardware Description Language to Design and Simulate a Processor 5.8

Single vs. Multi-cycle Implementation

Review: Abstract Implementation View

Single Cycle Data Path

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

Single Cycle Datapath

Processor Implementation in VHDL. University of Ulster at Jordanstown University of Applied Sciences, Augsburg

Chapter 4. The Processor Designing the datapath

CS Computer Architecture Spring Week 10: Chapter

Ch 5: Designing a Single Cycle Datapath

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.

Multi-cycle Approach. Single cycle CPU. Multi-cycle CPU. Requires state elements to hold intermediate values. one clock cycle or instruction

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Initial Representation Finite State Diagram. Logic Representation Logic Equations

Digital Design & Computer Architecture (E85) D. Money Harris Fall 2007

CSEN 601: Computer System Architecture Summer 2014

Chapter 4. The Processor

CSE Computer Architecture I Fall 2009 Lecture 13 In Class Notes and Problems October 6, 2009

Chapter 4 The Processor 1. Chapter 4A. The Processor

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W10-M

Control Unit for Multiple Cycle Implementation

COMPUTER ORGANIZATION AND DESIGN

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

CPU Design Steps. EECC550 - Shaaban

ECE 30, Lab #8 Spring 2014

Multicycle conclusion

Transcription:

Chapter 5: The Processor: Datapath and Control

Overview Logic Design Conventions Building a Datapath and Control Unit Different Implementations of MIPS instruction set A simple implementation of a processor Multicycle Implementation Exceptions Computer Architecture CS 35-2 2

Review Logic Design Conventions Almost ready to move into chapter 5 and start building a processor First, let s review Boolean Logic and build the ALU To support -add for MIPS instruction set and use 32 of them operation a b 32 32 ALU 32 result Computer Architecture CS 35-2 3

Review: Boolean Logic (summary) Boolean operations a AND b True only when a is true and b is true a OR b True when either a is true or b is true, or both are true NOT a True when a is false, and vice versa Computer Architecture CS 35-2 4

Review: Boolean Logic (continued) Boolean expressions Constructed by combining together Boolean operations Example: (a AND b) OR ((NOT b) AND (NOT a)) Truth tables capture the output/value of a Boolean expression A column for each input plus the output A row for each combination of input values Computer Architecture CS 35-2 5

Boolean Logic (continued) Example: (a AND b) OR ((NOT b) and (NOT a)) INPUT OUTPUT a b a b a.b b.a (a.b) + (b.a) value Computer Architecture CS 35-2 6

Gates Gates Hardware devices built from transistors to mimic Boolean logic AND gate Two input lines, one output line a b AND gate c Outputs a when both inputs are Computer Architecture CS 35-2 7

Gates (continued) OR gate Two input lines, one output line a b OR gate c Outputs a when either input is NOT gate One input line, one output line a NOT gate Outputs a when input is and vice versa Computer Architecture CS 35-2 8

Gates (continued) Abstraction in hardware design Map hardware devices to Boolean logic Design more complex devices in terms of logic, not electronics Computer Architecture CS 35-2 9

Building Computer Circuits A circuit is a collection of logic gates: Transforms a set of binary inputs into a set of binary outputs Values of the outputs depend only on the current values of the inputs Combinational circuits have no cycles in them (no outputs feed back into their own inputs) Computer Architecture CS 35-2

Circuit Diagram (Summary) a a + b c Input b b Output (a + b).b NOT((a + b).b) d Every output in a circuit diagram can be represented as a Boolean Expression Computer Architecture CS 35-2

A Circuit Construction Algorithm ALU Sum-of-products algorithm is one way to design circuits: Truth table Boolean expression gate layout Computer Architecture CS 35-2 2

A Circuit Construction Algorithm (continued) Sum-of-products algorithm Truth table captures every input/output possible for circuit Repeat process for each output line Build a Boolean expression using AND and NOT for each of the output line Combine together all the expressions with ORs Build circuit from whole Boolean expression Computer Architecture CS 35-2 3

Construction of Addition Circuit -ADD Algorithm 3 + 4 Carry Bit c i 3 a i + 4 b i 27 s i Computer Architecture CS 35-2 4

An Addition Circuit add $s, $s2, $s3 Addition circuit Adds two unsigned binary integers, setting output bits and an overflow Built from -bit adders (-ADD) Starting with rightmost bits, each pair produces A value for that order A carry bit for next place to the left Computer Architecture CS 35-2 5

An Addition Circuit (continued) -ADD truth table Input One bit from each input integer: a i, b i One carry bit (always zero for rightmost bit): c i Output One bit for output place value: s i One carry bit: c i+ Computer Architecture CS 35-2 6

-ADD Circuit Sub-expression construction (AND & NOT Gates) Inputs outputs a i b i c i s i c i+ a. b. c a. b. c a. b. c a. b. c Computer Architecture CS 35-2 7

-ADD Circuit Sub-expression construction (AND & NOT Gates) OUTPUT: s i ( a. b. c) + ( a. b. c ) + ( a. b. c ) + ( a. b. c ) Construct the circuit diagram for the sub-expression Computer Architecture CS 35-2 8

-ADD Circuit Sub-expression construction (AND & NOT Gates) Inputs outputs a i b i c i s i c i+ a. b. c a. b. c a. b. c a. b. c Computer Architecture CS 35-2 9

-ADD Circuit Sub-expression construction (AND & NOT Gates) OUTPUT: c i+ ( a. b. c) + ( a. b. c ) + ( a. b. c ) + ( a. b. c ) Construct the circuit diagram for the sub-expression Computer Architecture CS 35-2 2

An Addition Circuit (continued) Building the full adder Put rightmost bits into -ADD, with zero for the input carry Send -ADD s output value to output, and put its carry value as input to -ADD for next bits to left Repeat process for all bits Computer Architecture CS 35-2 2

Control Circuits Do not perform computations Choose order of operations or select among data values Major types of controls circuits Multiplexors Select one of inputs to send to output Decoders Sends a on one output line, based on what input line indicates Computer Architecture CS 35-2 22

Control Circuits (continued) Multiplexor form 2 N regular input lines N selector input lines output line Multiplexor purpose Given a code number for some input, selects that input to pass along to its output Used to choose the right input value to send to a circuit (ALU, Registers..) Computer Architecture CS 35-2 23

Practice Problem: Consider a logic function with three inputs: A, B, and C. Output D is true if at least one input is true Output E is true if exactly two inputs are true Output F is true only if all three inputs are true Show the truth table for these three functions. Show the Boolean equations for these three functions. Show an implementation consisting of AND, OR and NOT gates. Computer Architecture CS 35-2 24

The Processor: Datapath and Control Intro Recall CPU Performance equation: CPU time in program = Instruction Count x CPI x Clock cycle time Best Performance Optimal (minimum) CPU time The ISA and Compiler determine the Instruction Count The Processor determines: Clock cycles Per Instruction (CPI) Seconds per clock cycle (Clock cycle time) Build a processor that exploits the MIPS ISA Computer Architecture CS 35-2 25

The Processor: Datapath and Control Basic MIPS Implementation Let s look at a subset of core MIPS ISA: The memory-reference instructions: load word (lw), store word (sw) The arithmetic-logical instructions: add, sub, and, or, slt The control flow instructions: branch equal (beq)and jump (j) Create Datapath and design the Control for the three instruction classes Computer Architecture CS 35-2 26

The Processor: Datapath and Control Basic MIPS Implementation flow Let s examine the steps involved to implement the instruction classes. Generic Steps [assume the memory unit stores instructions and supply the instruction given an address] use the program counter (PC) to supply instruction address get the instruction from memory read registers use the instruction to decide exactly what to do Computer Architecture CS 35-2 27

The Processor: Datapath and Control Basic MIPS Implementation flow 2. Use ALU after reading the register(s) The memory-reference instructions: address calculation The arithmetic-logical instructions: execute operation Branch instruction: Comparison 3. Now Complete Implementation based on instruction class The memory-reference instructions: access memory? The arithmetic-logical instructions: access register? Branch: access PC? Computer Architecture CS 35-2 28

Abstract High-level View Basic (our subset) MIPS Implementation flow 4 Add Add Data PC Address Instruction Instruction memory Register # Registers ALU Address Register # Register # Data memory Data What is wrong with this flow? Computer Architecture CS 35-2 29

Logic Design Conventions MIPS Implementation Two types of logic elements: elements that operate on data values (Combinational) No memory output state depends only on current input state ALU: and-gate, or-gate and not-gate are combinational elements that contain state (State Elements) Store the state of the bit Output depends on the stored bit State-elements (flip-flops, latches) Instruction, Data Memories, Registers Computer Architecture CS 35-2 3

State Element INPUT OUTPUT Data value (to update SE) Clock State Element (SE) Previous Data Value data value in earlier clock cycle When to write data value (required input) A State Element has a minimum of two inputs and one output Computer Architecture CS 35-2 3

State Elements Clocking Methodology Defines when signals can be read and when they can be written If a signal is written at the same time it is read Value of read: Old Value New Value or Mix of Old & New Values Clocking methodology avoids the scenario Computer Architecture CS 35-2 32

Clocks Synchronous Logic When should an element that contains state be updated? Falling edge Clock period Rising edge Computer Architecture CS 35-2 33

State Elements Synchronous Digital Systems Clocks Free running signal with a fixed cycle time Decide when an element that contains state should be updated Falling edge Clock period Rising edge Clock Period: High and Low Computer Architecture CS 35-2 34

Sequential Logic Design Logic Circuit representation: n Block of State Elements Combinational Logic State Element Signals written into state elements must be valid (Stable) State element Combinational logic State element 2 Clock cycle After Clock edge: SE- output changes Input to Combinational Logic (Stable signal) -- SE-2 Require Long Clock Period Restriction to ensure output of combinational logic is stable (lower bound) Computer Architecture CS 35-2 35

Clocking Methodology Edge-triggered clocking Edge-triggered Clocking Methodology Either Rising edge or Falling Edge of clock is active All State Changes occur on the active clock edge Sampling of signals is almost instantaneous Choice of an active clock edge depends on implementation State element Combinational logic State element 2 Clock cycle Rising edge Falling edge All signals from SE - to SE -2 propagate in one clock cycle Computer Architecture CS 35-2 36

Sequential Logic How does a Sequential Logic circuit operate?. When the clock issues a pulse Each SE examines outputs of combinational logic, changes affected / state to opposite state (takes a few nanoseconds) and transforms to a stable state 2. The change in state of some of the stored bits triggers changes in some combinational logic and subsequently changes in other SE (unstable state ). Takes few nanoseconds for every output of the combinational logic block to reach a stable state 3. When all outputs of combinational logic blocks are stable, the clock issues another pulse Repeat steps and 2. Require Long Clock Period Restriction to ensure output of combinational logic is stable Computer Architecture CS 35-2 37

Building a Datapath Store and Access Instructions Instruction address Instruction memory Instruction PC Add Sum a. Instruction memory b. Program counter c. Adder Datapath Elements to store and Access Instructions Instruction memory is SE (but no write function) Combinational logic PC SE Computer Architecture CS 35-2 38

Building a Datapath R-Format ALU Operations Register numbers Data 5 5 5 Read register Read register 2 Write register Write Data Registers Read data Read data 2 Data 4 ALU operation ALU Zero ALU result RegWrite a. Registers b. ALU Units needed to implement R-format ALU operations Register File contains 2 5 registers (read/write) SE Computer Architecture CS 35-2 39

Building a datapath lw and sw operations M e m W r ite A d d re s s W r ite d a ta R e a d d a ta D a ta m e m o r y 6 S ig n 3 2 e x te n d M e m R e a d a. D a ta m e m o r y u n it b. S ig n - e x te n s io n u n it Additional Units needed to implement lw and sw Memory Unit is a SE: Inputs: Address & Write Data Output: Read Result Sign Extension Unit: Converts a 6-bit input to 32-bit output Computer Architecture CS 35-2 4

Building Datapath Use Multiplexors to combine: the components P C S rc A d d M ux 4 A d d A L U re s u lt S h ift le ft 2 P C R e a d a d d re s s In s tru c tio n In s tru c tio n m e m o ry R e a d re g is te r R e a d re g is te r 2 R e g is te rs W rite re g is te r W rite d a ta R e g W rite R e a d d a ta R e a d d a ta 2 A L U S rc M ux 4 A L U A L U o p e ra tio n Z e ro A L U re s u lt A d d re s s W rite d a ta M e m W rite R e a d d a ta D a ta m e m o ry M e m to R e g M ux 6 S ig n 3 2 ex te n d M e m R e a d Computer Architecture CS 35-2 4

A Simple Implementation Scheme ALU Control Let s look at a subset of core MIPS ISA: The memory-reference instructions: load word (lw), store word (sw) The arithmetic-logical instructions: add, sub, and, or, set on less than (slt) The control flow instructions: branch equal (beq)and jump (j) Computer Architecture CS 35-2 42

A Simple Implementation Scheme ALU Control ALU has 4 control Inputs (2 4 combinations) Examine Subset (6 combinations) ALU Control Input Function AND OR add subtract set on less than NOR Computer Architecture CS 35-2 43

A Simple Implementation Scheme ALU Control Selecting the operations to perform (ALU, read/write, etc.) Controlling the flow of data (multiplexor inputs) Information comes from the 32 bits of the instruction What should the ALU do with this instruction? add $8, $7, $8 Instruction Format: op rs rt rd shamt funct Compute AND, OR, subtract, add or slt depending on 6-bit funct Computer Architecture CS 35-2 44

ALU Control What should the ALU do with this instruction? lw $, ($2) 35 2 op rs rt 6 bit offset Compute the memory address.. But how? Computer Architecture CS 35-2 45

ALU Control Need a mapping for hardware to compute the 4-bit ALU control input signals Instruction Type: opcode lw, sw beq arithmetic ALUop (is set to) Function Code for Arithmetic.. Computer Architecture CS 35-2 46

ALU Control Bit Settings Instruction opcode ALUop Instruction Operation Funct Field Desired ALU action ALU control input lw load word add sw store word add Branch equal branch equal subtract R-type add add R-type subtract subtract R-type AND and R-type OR or R-type set on less than set on less than ~ don t care bit Computer Architecture CS 35-2 47

ALU Control Bit Settings Control Bit Settings depends on: ALUop bits Function field codes for R-type instruction Instruction opcode ALUop Funct Field ALU control input lw sw Branch equal R-type R-type R-type R-type R-type Computer Architecture CS 35-2 48

ALU Control Logic Truth Table (the 3-ALU control bits) ALUop Funct Field Operation ALUop ALUop F5 F4 F3 F2 F F (Output) ALU Control function has 3-distinct outputs: operation2 operation operation Computer Architecture CS 35-2 49

ALU Control Logic Mapping of ALU Control Function to Gates ALUOp ALUOp ALUOp ALU control block F3 Operation2 F (5 ) F2 F F Operation Operation Operation What is the mapping for the ALU control functions? Computer Architecture CS 35-2 5

Mapping of ALU Control Functions ALUOp ALUOp ALUOp ALU control block F3 Operation2 F (5 ) F2 F F Operation Operation Operation ALUop ALUop ALUop F5 Funct Field F4 F3 F2 F F Operation (Output) ALU Control Input AND Function OR add subtract set on less than Computer Architecture CS 35-2 5

Designing the Main Control Unit Datapath Let s examine MIPS Instruction Fields & Control Lines: R-Type Instruction Field op rs rt rd shamt funct Bit Position 3:26 25:2 2:6 5: :6 5: # of bits 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits Observations: The op field in bits 3:26 always set to Source Registers: Identified in bits 25:2 & bits 2:6 resp. Destination Register: Identified in bits 5: The shamt field is never used (ignore) The funct field is coded as per ALU design Computer Architecture CS 35-2 52

Designing the Main Control Unit Datapath Load word, store word and branch-on-equal: I-Type Instruction Field op rs rt address Bit Position 3:26 25:2 2:6 5: # of bits 6 bits 5 bits 5 bits 6 bits Observations: The op field - same position as R-type format The base register (rs) and rt fields - same positions as R-type format The address field in bits 5 - Computer Architecture CS 35-2 53

Designing the Main Control Unit Observations The Op field is always in bits 3 26 Labeled as op5, Op4, Op3, Op2, Op, Op Every instruction reads register specified by rs field Every instruction, except load word, reads the register specified by rt field The load word writes to the rt field The base register for load and store word is always specified by rs field The destination register is one of two places: For load word, it is rt field (bits 2-6) For R-Type instruction it is in rd field (bits 5 -) Thus a multiplexor needs to be added to the datapath to select the correct field for the write register input Computer Architecture CS 35-2 54

Designing the Main Control Unit Effects of control signals Fig 5.7 shows the datapath with 7 control lines Signal Name RegDst RegWrite ALUSrc PCSrc MemRead MemWrite MemtoReg None Send PC+4 to PC None None Effect if Destination number of write register is in rt field Send rt register to ALU Send ALU result to register file Destination write register is in rd field Effect if Store Write data input into Destination register Send sign-extended lower 6 bits of instruction to ALU Send branch target to PC Read contents of addressed memory word Store Write data into addressed memory word Send word to register file Computer Architecture CS 35-2 55

Designing the Main Control Unit Control unit computes settings of control lines Add M ux. Instruction Memory 2. Register Read 3. ALU Operation 4. Data Memory 5. Register Write PC 4 R ead address Instruction [3 26] Instruction [25 2] R ead register Read Instruction [2 6] data R ead register 2 Zero Instruction [3 ] ALU M Read ALU W rite data 2 result Instruction ux Instruction [5 ] register M m em ory ux W rite data Registers Instruction [5 ] C ontrol 5 RegD st Branch M em R ead M em toreg ALUO p MemW rite ALUSrc RegW rite 6 Sign 32 extend Shift left 2 ALU control Add ALU result Address W rite data PCSrc 2 3 4 R ead data D ata m em ory M ux Instruction [5 ] Memto- Reg Mem Mem Instruction RegDst ALUSrc Reg Write Read Write Branch ALUOp ALUp R-format lw sw beq Computer Architecture CS 35-2 56

Designing the Main Control Unit Effects of control signals Fig 5.7 shows two additional control line control lines ALUOp and ALUOp (ALUOp) The states of all control lines (except PCSrc) are determined by Op code field The state of PCSrc is determined by AND gate with inputs: Zero output of ALU and Branch control line The Branch control line is asserted if Op field equals (beq) Computer Architecture CS 35-2 57

Designing the Main Control Unit Control unit computes settings of control lines Instruction RegDst ALUSrc Memto- Reg Reg Write Mem Read Mem Write Branch ALUOp ALUOp R-format lw sw beq R-format Instructions: add, sub, slt Source Register fields ~ rs and rt Destination register field ~ rd add $t, $t2, $s Computer Architecture CS 35-2 58

The datapath in operation Signal Name Effect if Effect if ALUop ALUo ALUop F 5 Funct Field F 4 F 3 F 2 F F Operation (Output) RegDst RegWrite ALUSrc PCSrc MemRead Destination number of write register is in rt field None Send rt register to ALU Send PC+4 to PC None Destination write register is in rd field Store Write data input into Destination register Send sign-extended lower 6 bits of instruction to ALU Send branch target to PC Read contents of addressed memory word MemWrite None Store Write data into addressed memory word MemtoReg Send ALU result to register file Send word to register file Instruction RegDst ALUSrc Memto- Reg RegW rite MemR ead Mem Write Branch ALUOp ALUOp R-format lw sw beq Computer Architecture CS 35-2 59

Computer Architecture CS 35-2 6 The datapath in operation Truth Table Op ALUOp AlUOp Branch MemWrite MemRead RegWrite MemtoReg ALUSrc RegDst Outputs Op Op2 Op3 Op4 Op5 Inputs beq sw lw R-format Signal Name Input/ Output

The datapath in operation Input/ Output Inputs Outpu ts Signal Name Op5 Op4 Op3 Op2 Op Op RegDst ALUSrc Memto Reg RegWri te MemR ead MemW rite Branch AlUOp ALUOp R- format lw sw beq Signal Name RegDst RegWrite ALUSrc PCSrc MemRead MemWrite MemtoReg Instruc tion R- format lw sw Destination number of write register is in rt field None Send rt register to ALU Send PC+4 to PC None None Send ALU result to register file Reg Dst Effect if AL USr c Me mt o- Re g Re g W rit e Destination write register is in rd field Store Write data input into Destination register Send sign-extended lower 6 bits of instruction to ALU Send branch target to PC Read contents of addressed memory word Store Write data into addressed memory word Send word to register file M e m Re ad Me m Wr ite Effect if Br an ch AL UO p AL UO p beq Computer Architecture CS 35-2 6

Single-cycle implementation Let s summarize Instruction Class R-type Load word Store word Branch Major Functional Units used by Instruction class/single Clock cycle Instruction Memory Fetch Instruction Fetch Instruction Fetch Instruction Fetch Instruction Register Read Access Register Access Register Access Register Access Register ALU Operation ALU ALU ALU ALU Data Memory Memory access Memory access Register Write Register access Register access Other Units: PC, Control, Multiplexors Computer Architecture CS 35-2 62

Single-cycle implementation Let s summarize Assume Critical Machine Operation Times: Memory Unit ~ 2 ps, ALU and adders ~ ps, Register file (read/write) ~ 5 ps. Estimate clock cycle for the machine Instruction Class Major Functional Units used by Instruction Operation Times Period Instruction Memory Register Read ALU Operation Data Memory Register Write R-type 2 5 5 4 ps Load word 2 5 2 5 6 ps Store word 2 5 2 55 ps Branch 2 5 35 ps Single Clock cycle determined by longest instruction period = 6 ps Computer Architecture CS 35-2 63

A Multicycle Implementation Datapath Basic Idea ( Divide and Conquer ) Break MIPS Instructions into independent and manageable steps Balance workload across steps Each step takes one clock cycle Re-use functional units in different steps Pack as much work into each step At most one ALU operation, one register file access or one memory access per step Computer Architecture CS 35-2 64

A Multicycle Datapath Implementation Approach A single memory replaces the instruction memory and data memory associated with single-cycle approach A single ALU performs the addition operations (thereby eliminating the two adders for single-cycle operation) A 32-bit Instruction Register (IR) is added to store instruction fetched from memory A 32-bit Memory Data Register (MDR) is added to hold data fetched from memory Two 32-bit registers, A and B are added to store values of source registers rs and rt respectively A 32-bit register, ALUOut, is added to temporarily hold the results of the ALU Use FSM to determine control signals (instead of instructions) Computer Architecture CS 35-2 65

A Multicycle Datapath Implementation Actions of the bit Control Signals Signal Name RegDst RegWrite ALUSrcA MemRead MemWrite MemtoReg IorD IRWrite PCWrite PCWriteCond Destination number of write register is in rt field None First ALU Operand = PC None None Send ALUOut result to register file Memory address is PC None None None Effect if Effect if Destination write register is in rd field Store Write data input into Destination register First ALU Operand = A register Read contents of addressed memory word Store Write data into addressed memory word Send MDR to register file Memory address is ALUOut Store memory word into the IR Store a new value into the PC Store a new value into PC if ALU result is Computer Architecture CS 35-2 66

A Multicycle Datapath Implementation Actions of the 2 bit Control Signals Signal Name ALUOp ALUSrcB PCSource Value Effect if ALU performs an ADD operation ALU performs a subtract operation ALU performs operation specified by funct field of the IR Second ALU Operand is the B register Second ALU Operand is the constant 4 Second ALU Operand is sign-extended lower 6 bits of the IR Second ALU Operand is sign-extended lower 6 bits of the IR shifted left 2 places New PC value comes from ALU (PC+4) New PC value comes from branch target address in ALUOut New PC value is jump target address Computer Architecture CS 35-2 67

A Multicycle Datapath Implementation Approach. Instruction Memory 2. Register Read 3. ALU Operation 4. Data Memory 5. Register Write PC Instruction Read Address [25 2] register M ux Memory MemData Write data &4 Instruction [2 6] Instruction [5 ] Instruction register Instruction [5 ] Memory data register Instruction [5 ] M ux M ux Read register 2 Registers Write register Write data 6 Sign 32 extend Read data Read data 2 2 & 5 Shift left 2 A M ux B 4 M ux 2 3 Zero ALU ALU result 3 ALUOut Computer Architecture CS 35-2 68

Breaking the Instructions Example: add $t, $t, $t2 The add instruction changes value of a register (Reg). Register specified by bits 5: of Instruction. Instruction specified by the PC. New value is the sum ( op ) of two registers. Registers specified by bits 25:2 and 2:6 of the instruction Reg[Memory[PC][5:]] <= Reg[Memory[PC][25:2]] op Reg[Memory[PC][2:6]] Computer Architecture CS 35-2 69

Breaking the Instructions Example: add $t, $t, $t2 Reg[Memory[PC][5:]] <= Reg[Memory[PC][25:2]] op Reg[Memory[PC][2:6]] Could break down to: IR <= Memory[PC] A <= Reg[IR[25:2]] B <= Reg[IR[2:6]] ALUOut <= A op B Reg[IR[5:]] <= ALUOut Computer Architecture CS 35-2 7

A Multicycle Datapath Implementation Five Execution steps. Instruction Fetch 2. Instruction Decode and Register Fetch 3. Execution, Memory Address Computation, or Branch Completion 4. Memory Access or R-type instruction completion 5. Memory Read Completion Computer Architecture CS 35-2 7

. Instruction Fetch Use PC to get instruction and put it in the Instruction Register. Increment the PC by 4 and put the result back in the PC. IR <= Memory[PC]; PC <= PC + 4; Why update the PC now? Signal Name Set to Set to Signal Settings: ALUSrcA MemRead First ALU Operand = PC Read contents of addressed memory word IorD Memory address is PC IRWrite Store memory word into the IR PCWrite Store a new value into the PC ALUSrcB Second ALU Operand is the constant 4 ALUOp New PC value comes from ALU (PC+4) Computer Architecture CS 35-2 72

2. Instruction Decode and Register Fetch Read registers rs and rt in case we need them Compute the branch address in case the instruction is a branch A <= Reg[IR[25:2]]; B <= Reg[IR[2:6]]; ALUOut <= PC + (sign-extend(ir[5:]) << 2); What is the signal setting for step 2? Computer Architecture CS 35-2 73

3. Execution, Memory Address Computation, or Branch Completion ALU performs one of three functions, based on instruction type Memory Reference: ALUOut <= A + sign-extend(ir[5:]); R-type: ALUOut <= A op B; Branch: if (A==B) PC <= ALUOut; Determine the signal settings for each instruction type Computer Architecture CS 35-2 74

4. Memory Access or R-type instruction completion Loads and stores access memory MDR <= Memory[ALUOut]; or Memory[ALUOut] <= B; R-type instructions finish Reg[IR[5:]] <= ALUOut; Computer Architecture CS 35-2 75

Memory Read Completion Reg[IR[2:6]] <= MDR; Only the load word instruction executes this step Computer Architecture CS 35-2 76

A Multicycle Datapath Implementation Let s summarize Step Name Action for R- type Instructions Action for memoryreference instruction Action for branches Instruction fetch IR <= Memory[PC] PC <= PC + 4 Instruction decode/register fetch A <= Reg [IR[25:2]] B <= Reg [IR[2:6]] ALUOut <= PC + (sign-extend (IR[5:]) << 2) Execution, address computation, branch/jump completion ALUOut <= A op B ALUOut <= A + signextend (IR[5:]) If (A == B) PC <= ALUOut Memory access or R-type completion Reg [IR[5:]] <= ALUOut Load: MDR <= Memory[ALUOut] or Store: Memory [ALUOut] <= B Memory read completion Load: Reg[IR[2:6]] <= MDR Computer Architecture CS 35-2 77

Multicycle Datapath Example How many cycles will it take to execute the following MIPS code lw $s2, ($s3) lw $s3, 4($s3) beq $s2, $s3, Label add $s5, $s2, $s3 sw $s5, 8($s3) #assume not Computer Architecture CS 35-2 78

Finite State Machine (FSM) Overview Subsystems characterized by: Set of well defined states (States:, 2, 3,4, 5) State function determined typically by Current state Input values Output function determined typically by Current state Input values Computer Architecture CS 35-2 79