ENG3380 Computer Organization and Architecture. MIPS: Data Path Design, Part 3. School of Engineering


ENG3380 Computer Organization and Architecture
MIPS: Data Path Design, Part 3
Winter. S. Areibi, School of Engineering, University of Guelph

Introduction: Topics
Building a Complete Data Path for MIPS
- Multi Cycle Computer
- Datapath Design
- Design of the Control Unit
- Advantages & Disadvantages
- Summary

Single Cycle Implementation: Cycle Time
Unfortunately, though simple, the single cycle approach is not used because it is very slow.
- The clock cycle must have the same length for every instruction.
- What is the longest (slowest) path (slowest instruction)?

With thanks to W. Stallings, Hamacher, J. Hennessy, M. J. Irwin for lecture slide contents. Many slides adapted from the PPT slides accompanying the textbook and CSE course.

References
I. Computer Organization and Architecture: Designing for Performance, by William Stallings, Pearson.
II. Computer Organization and Design: The Hardware/Software Interface, by D. Patterson and J. Hennessy, Morgan Kaufmann.
III. Computer Organization and Architecture: Themes and Variations, by Alan Clements, Cengage Learning.

Review: Single Cycle Data and Control Path
[Figure: single-cycle datapath and control path]

Single Cycle Disadvantages & Advantages
Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction.
- Especially problematic for more complex instructions like floating point multiply.
[Figure: single-cycle timing; lw fills its clock cycle, sw finishes early and the rest of its cycle is wasted]
May be wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle,
but it is simple and easy to understand.

Our Multicycle Approach
Break up the instructions into steps where each step takes a clock cycle, while trying to
- balance the amount of work to be done in each step
- use only one major functional unit per clock cycle
At the end of a clock cycle, store values needed in a later clock cycle by the current instruction in a state element (an internal register not visible to the programmer):
- IR: Instruction Register
- MDR: Memory Data Register
- A and B: Register File read data registers
- ALUOut: ALU output register
All (except IR) hold data only between a pair of adjacent clock cycles (so they don't need a write control signal).
Data used by subsequent instructions are stored in programmer-visible state elements (i.e., Register File, PC, or Memory).

The Multicycle Datapath: High Level View
Registers have to be added after every major functional unit to hold the output value until it is used in a subsequent clock cycle.
[Figure: MIPS multicycle data path, high level view]

Multicycle Implementation Overview: Clocking the Multicycle Datapath
Each instruction step takes 1 clock cycle.
- Therefore, an instruction takes more than 1 clock cycle to complete (fetch, decode, execute, ...).
Not every instruction takes the same number of clock cycles to complete.
Multicycle implementations allow
- faster clock rates
- different instructions to take a different number of clock cycles
- functional units to be used more than once per instruction as long as they are used on different clock cycles; as a result we only need one memory and only one ALU/adder
[Figure: system clock and clock cycle over the multicycle datapath]

The Multicycle Datapath: High Level View
Multiplexors have to be added since we are using a single memory and a single ALU.
[Figure: multicycle datapath with multiplexors]

The Multicycle Datapath with more Support
Although this datapath supports normal incrementing of the PC, a few more connections and a multiplexor will be needed for branches and jumps.
[Figure: multicycle datapath showing the additions versus the single-clock datapath]

The Complete Multicycle Data with Control
The PC is written both unconditionally and conditionally.
- During a normal increment and for jumps, the PC is written unconditionally.
- If the instruction is a conditional branch, the incremented PC is replaced with the value in ALUOut only if the two registers are equal.
[Figure: complete multicycle datapath with control signals]

The MC Datapath with Support for Branch/Jump
The value written into the PC comes from one of three sources:
1. The output of the ALU, which is the value PC + 4 during instruction fetch. This value should be stored directly into the PC.
2. The register ALUOut, which is where we will store the address of the branch target after it is computed.
3. The lower 26 bits of the instruction register (IR) shifted left by two and concatenated with the upper 4 bits of the incremented PC, which is the source when the instruction is a jump.
[Figure: multicycle datapath with branch and jump support]

The Instruction & Instruction Register
R-type: op [31-26], rs [25-21], rt [20-16], rd [15-11], shamt [10-6], funct [5-0]
I-type: op [31-26], rs [25-21], rt [20-16], address offset [15-0]
J-type: op [31-26], jump target address [25-0]
- Instr[31-26] goes to the control unit for the opcode
- Instr[25-0] is used for jump
- Instr[15-0] is used for branch
[Figure: instruction register outputs feeding the datapath]

Five Instruction Steps
1. Instruction Fetch
2. Instruction Decode and Register Fetch
3. R-type Instruction Execution, Memory Read/Write Address Computation, Branch Completion, or Jump Completion
4. Memory Read Access, Memory Write Completion, or R-type Instruction Completion
5. Memory Read Completion (Write Back)
INSTRUCTIONS TAKE FROM 3 TO 5 CYCLES!

Review: Our ALU Control
Controlling the ALU uses multiple decoding levels:
- the main control unit generates the ALUOp bits
- the ALU control unit generates the ALU control bits

Instr   funct    ALUOp   action
lw      xxxxxx   00      add
sw      xxxxxx   00      add
beq     xxxxxx   01      subtract
add     100000   10      add
subt    100010   10      subtract
and     100100   10      and
or      100101   10      or
xor     100110   10      xor
nor     100111   10      nor
slt     101010   10      slt

Step 1: Instruction Fetch
Use PC to get the instruction from the memory and put it in the Instruction Register.
Increment the PC by 4 and put the result back in the PC.
Can be described succinctly using RTL ("Register-Transfer Language"):
IR = Memory[PC];
PC = PC + 4;
Can we figure out the values of the control signals?
What is the advantage of updating the PC now?

Our Multicycle Approach, con't
Reading from or writing to any of the internal registers, Register File, or the PC occurs (quickly) at the beginning (for read) or the end of a clock cycle (for write).
Reading from the Register File takes ~50% of a clock cycle since it has additional control and access overhead (but reading can be done in parallel with decode).
Had to add multiplexors in front of several of the functional unit input ports (e.g., Memory, ALU) because they are now shared by different clock cycles and/or do multiple jobs.
All operations occurring in one clock cycle occur in parallel.
- This limits us to one ALU operation, one Memory access, and one Register File access per clock cycle.

Datapath Activity During Instruction Fetch
[Figure: multicycle datapath with the instruction fetch paths highlighted]
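The two-level ALU control decode reviewed above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: it assumes the standard MIPS funct encodings and the usual ALUOp convention (00 for lw/sw, 01 for beq, 10 for R-type), and the function name `alu_action` and the returned strings are hypothetical labels.

```python
# Second-level ALU control decode (illustrative sketch).
# Assumption: standard MIPS funct codes; action names are labels only.
FUNCT_ACTION = {
    0b100000: "add",
    0b100010: "subtract",
    0b100100: "and",
    0b100101: "or",
    0b100110: "xor",
    0b100111: "nor",
    0b101010: "slt",
}


def alu_action(alu_op: int, funct: int = 0) -> str:
    """Return the ALU action for a 2-bit ALUOp and a 6-bit funct field."""
    if alu_op == 0b00:      # lw / sw: add base register and offset
        return "add"
    if alu_op == 0b01:      # beq: subtract to compare the two registers
        return "subtract"
    if alu_op == 0b10:      # R-type: the funct field selects the operation
        return FUNCT_ACTION[funct & 0x3F]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")


if __name__ == "__main__":
    print(alu_action(0b00))             # lw or sw
    print(alu_action(0b01))             # beq
    print(alu_action(0b10, 0b101010))   # slt
```

For lw, sw, and beq the funct bits are don't-cares, which is why the first two branches ignore the `funct` argument.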

[Figure: multicycle control FSM, Instr Fetch state]
[Figure: multicycle control FSM, Instr Fetch and Decode states]

Step 2: Instruction Decode and Register Fetch
We don't know what the instruction is yet, so we can only
- read registers rs and rt in case we need them
- compute the branch address in case the instruction is a branch
The RTL:
A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
Note we aren't setting any control lines based on the instruction, since we don't know what it is yet (the control logic is busy "decoding" the op code bits).

Step 3 (instruction dependent)
The ALU is performing one of four functions, based on instruction type.
1. Memory reference (lw and sw):
ALUOut = A + sign-extend(IR[15-0]);
2. R-type:
ALUOut = A op B;
3. Branch:
if (A==B) PC = ALUOut;
4. Jump:
PC = PC[31-28] || (IR[25-0] << 2);

Datapath Activity During Instruction Decode
[Figure: multicycle datapath with the decode and register fetch paths highlighted]

Datapath Activity During (Execute) lw & sw
ALUOut = A + sign-extend(IR[15-0]);
[Figure: multicycle datapath with the address computation paths highlighted]
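The address arithmetic in Steps 2 and 3 can be checked with a small sketch. It is only an illustration under stated assumptions: registers are modelled as Python integers masked to 32 bits, `pc` is the already-incremented PC (PC + 4 from Step 1), and the helper names `sign_extend16`, `branch_target`, `mem_address`, and `jump_target` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, ir: int) -> int:
    """Step 2: ALUOut = PC + (sign-extend(IR[15-0]) << 2), with pc already PC + 4."""
    return (pc + (sign_extend16(ir) << 2)) & MASK32


def mem_address(a: int, ir: int) -> int:
    """Step 3 for lw/sw: ALUOut = A + sign-extend(IR[15-0])."""
    return (a + sign_extend16(ir)) & MASK32


def jump_target(pc: int, ir: int) -> int:
    """Step 3 for j: PC = PC[31-28] || (IR[25-0] << 2)."""
    return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)


if __name__ == "__main__":
    # beq with offset -1, fetched from 0x00400000 (so pc is 0x00400004)
    print(hex(branch_target(0x00400004, 0x1000FFFF)))
    # lw with offset 8 and base register value 0x10010000
    print(hex(mem_address(0x10010000, 0x8C000008)))
    # j with a 26-bit target field of 0x0100000
    print(hex(jump_target(0x00400004, 0x08100000)))
```

Computing the branch target in Step 2 costs nothing extra because the ALU is otherwise idle during decode; the result is simply ignored when the instruction turns out not to be a branch.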

Datapath Activity During (Execute) R-type
R-type: op, rs, rt, rd, shamt, funct
ALUOut = A op B;
[Figure: multicycle datapath with the R-type execute paths highlighted]
[Figure: multicycle control FSM, fetch, decode, and the four execute states]

After state 1 the signals asserted depend on the class of instruction. Thus, the finite state machine has four arcs exiting state 1, corresponding to the four instruction classes:
- Memory reference (lw, sw)
- R-type
- Branch on equal
- Jump
This process of branching to different states depending on the instruction is called decoding.

Datapath Activity During (Execute) beq
I-type: op, rs, rt, address offset
if (A==B) PC = ALUOut;
[Figure: multicycle datapath with the branch completion paths highlighted]

Step 4: Memory Read or Write (also instruction dependent)
Memory reference:
MDR = Memory[ALUOut];   -- lw
or
Memory[ALUOut] = B;     -- sw
R-type instruction completion (write to Reg File):
Reg[IR[15-11]] = ALUOut;
Remember, the register write actually takes place at the end of the cycle, on the clock edge.

Datapath Activity During (Execute) j
J-type: op, jump target address
PC = PC[31-28] || (IR[25-0] << 2);
[Figure: multicycle datapath with the jump completion paths highlighted]

Datapath Activity During lw Memory Access
[Figure: multicycle datapath with the memory read paths highlighted]

Datapath Activity During sw Memory Access
[Figure: multicycle datapath with the memory write paths highlighted]

Step 5: Memory Read Completion (Write Back)
All we have left is the write back into the register file of the data just read from memory for the load (lw) instruction:
Reg[IR[20-16]] = MDR;
I-type: op, rs, rt, address offset
Write the load data, which was stored into MDR in the previous cycle, into the register file.
What about all the other instructions?

Datapath Activity During R-type Completion
[Figure: multicycle datapath with the register write paths highlighted]

Datapath Activity During lw Write Back
[Figure: multicycle datapath with the MDR to register file paths highlighted]

[Figure: multicycle control FSM with the Memory Access states added]
[Figure: multicycle control FSM with the Write Back states added]

RTL Summary (from 3 to 5 cycles)
Instruction Fetch and Decode are common to all instructions.

Step 1, Instr fetch (all instructions):
IR = Memory[PC]; PC = PC + 4;
Step 2, Decode (all instructions):
A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2);
Step 3, Execute:
- R-type: ALUOut = A op B;
- Mem Ref: ALUOut = A + sign-extend(IR[15-0]);
- Branch: if (A==B) PC = ALUOut;
- Jump: PC = PC[31-28] || (IR[25-0] << 2);
Step 4, Memory access:
- R-type: Reg[IR[15-11]] = ALUOut;
- Mem Ref: MDR = Memory[ALUOut]; or Memory[ALUOut] = B;
- Branch: X
- Jump: X
Step 5, Writeback:
- R-type: X
- Mem Ref: Reg[IR[20-16]] = MDR;
- Branch: X
- Jump: X
Cycles: R-type 4; Mem Ref 4/5; Branch 3; Jump 3.

Answering Simple Questions
How many cycles will it take to execute this code?

lw  $t2, 0($t3)
lw  $t3, 4($t3)
beq $t2, $t3, Label   #assume not
add $t5, $t2, $t3
sw  $t5, 8($t3)
Label: ...

5 + 5 + 3 + 4 + 4 = 21 cycles
What is going on during the 8th cycle of execution? The address for the second lw is being calculated.
In what cycle does the actual addition of $t2 and $t3 take place? The 16th cycle.
In what cycle is the branch target address calculated? The 12th cycle.
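The cycle-counting questions above can be worked mechanically. The sketch below is an illustration only: it assumes the per-class cycle counts from the RTL summary (lw 5, sw 4, R-type 4, beq 3, j 3), that each instruction runs to completion before the next one starts, and that the branch is not taken. The names `CYCLES` and `schedule` are hypothetical.

```python
# Clock cycles per instruction in the multicycle implementation
# (taken from the RTL summary: lw 5, sw 4, R-type 4, beq 3, j 3).
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3, "j": 3}


def schedule(program):
    """Return a list where entry i describes clock cycle i + 1 as
    (instruction index, mnemonic, step number within that instruction)."""
    timeline = []
    for index, op in enumerate(program):
        for step in range(1, CYCLES[op] + 1):
            timeline.append((index, op, step))
    return timeline


if __name__ == "__main__":
    # Two loads, a branch that is not taken, an add, and a store.
    program = ["lw", "lw", "beq", "add", "sw"]
    timeline = schedule(program)
    print("total cycles:", len(timeline))
    for cycle in (8, 12, 16):
        index, op, step = timeline[cycle - 1]
        print(f"cycle {cycle}: instruction {index + 1} ({op}), step {step}")
```

Step 3 of a lw is the address calculation, step 2 of the beq is where the branch target is computed, and step 3 of the add is the ALU operation itself.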

Control Unit Design
Recall that in the Single-Cycle datapath we used a set of (hardwired) truth tables that specified the setting of the control signals based on the instruction class.
For the Multi-Cycle datapath, the control is more complex!! Why?
- Because the instruction is executed in a series of steps.
- The control for the Multi-Cycle datapath must specify the signals to be set in any step and the next step in the sequence.
Possible implementations:
- Finite State Machine (FSM)
- Microprogrammed control

Finite State Machine Implementation
Use D-FFs or JK-FFs to realize the control unit.
- From the State Diagram obtain the State Table.
- Determine the number of FFs required.
- Use excitation tables to design the input logic for the FFs.
[Figure: combinational control logic with inputs Op5 to Op0 from Inst[31-26] and the current state, producing the datapath control outputs and the next state; the state register is driven by the system clock]

Multicycle Control
Multicycle datapath control signals are not determined solely by the bits in the instruction.
- E.g., the op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next.
We can use a finite state machine for control:
- a set of states (current state stored in the State Register)
- a next state function (determined by current state and the input)
- an output function (determined by current state) (Type of FSM?)
[Figure: combinational control logic with Inst Opcode and State Reg inputs, datapath control points and Next State outputs]
So we are using a Moore machine (datapath control signals based only on current state).

Algorithmic State Machine (ASM) Implementation
We can also use Algorithmic State Machines (ASM) to implement the Control Unit. Translate the FSM to an ASM and then:
- use one flip-flop per state, or
- use a Sequence Register and Decoder.
The ASM implementation based on VHDL can be realized using:
- Schematic Capture
- Structural VHDL
- Behavioral VHDL

Multicycle Datapath Finite State Machine
[Figure: complete multicycle control FSM with state assignment]
Total of 10 states.

The Complete Multicycle Data with Control
[Figure: complete multicycle datapath with control signals]
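The next-state function of such a Moore machine can be sketched as a small lookup. This is a hedged illustration rather than the slide's exact state table: the state numbering 0 to 9 follows the common textbook assignment (0 fetch, 1 decode, 2 memory address, 3 memory read, 4 write back, 5 memory write, 6 R-type execute, 7 R-type completion, 8 branch, 9 jump), which is an assumption here, and the names `next_state` and `trace` are hypothetical.

```python
# Next state after decode (state 1) depends only on the instruction class.
NEXT_AFTER_DECODE = {"lw": 2, "sw": 2, "rtype": 6, "beq": 8, "j": 9}


def next_state(state: int, op: str) -> int:
    """Next-state function: depends on the current state and the opcode class."""
    if state == 0:                      # fetch -> decode
        return 1
    if state == 1:                      # decode: four-way branch on the class
        return NEXT_AFTER_DECODE[op]
    if state == 2:                      # address computed: read or write memory
        if op == "lw":
            return 3
        if op == "sw":
            return 5
        raise ValueError(f"state 2 is only reachable for lw/sw, got {op}")
    if state == 3:                      # memory read -> write back
        return 4
    if state == 6:                      # R-type execute -> R-type completion
        return 7
    if state in (4, 5, 7, 8, 9):        # last step: return to fetch
        return 0
    raise ValueError(f"unknown state: {state}")


def trace(op: str) -> list:
    """List the states visited by one instruction, starting at fetch."""
    states = [0]
    while True:
        nxt = next_state(states[-1], op)
        if nxt == 0:
            return states
        states.append(nxt)


if __name__ == "__main__":
    for op in ("lw", "sw", "rtype", "beq", "j"):
        print(op, trace(op))
```

The length of each trace equals the cycle count of that instruction class, which ties the FSM back to the 3 to 5 cycle figures in the RTL summary.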

Datapath Control Outputs Truth Table
[Table: datapath control outputs (including PCWriteCond, MemRead, MemWrite, PCSource, ALUOp, ALUSrcA, ALUSrcB) as a function of the input values Current State[3-0]; X marks a don't-care entry]

Datapath Activity During Instruction Fetch
[Figure: multicycle datapath with the instruction fetch paths highlighted]

Multicycle Datapath FSM
[Figure: complete multicycle control FSM]
Total of 10 states.

Next State Truth Table
[Table: next state as a function of Current State[3-0] and Inst[31-26] (Op[5-0]) for R-type, jmp, beq, lw, and sw; any other opcode is illegal. All final states return to state 0.]

Recall: Simple Sequencer
[Figure: simple sequencer with Instruction Register, Mapping Logic, MUX, address register, Incrementer, and control signals (micro-operations) as outputs]

Simplifying the Control Unit Design
For an implementation of the full MIPS ISA, instructions can take anywhere from a few clock cycles to many more,
- resulting in finite state machines with hundreds to thousands of states with even more arcs (state sequences).
- Such state machine representations become impossibly complex.
Instead, we can represent the set of control signals that are asserted during a state as a low-level control instruction to be executed by the datapath: a microinstruction.
Executing the microinstruction is equivalent to asserting the control signals specified by the microinstruction.

Microprogramming Control Units
A microinstruction has to specify
- what control signals should be asserted
- what microinstruction should be executed next
Each microinstruction corresponds to one state in the FSM and is assigned a state number (or "address"). The next microinstruction is chosen in one of three ways:
1. Sequential behavior: increment the state (address) of the current microinstruction to get to the state (address) of the next.
2. Jump to the microinstruction that begins execution of the next MIPS instruction (state 0).
3. Branch to a microinstruction based on control unit input using dispatch tables (will discuss this later!!)
- need one for microinstructions following state 1
- need another for microinstructions following state 2
The set of microinstructions that define a MIPS assembly language instruction (macroinstruction) is its microroutine.

Microcode Implementation
[Figure: microcode implementation; control PLA outputs drive the datapath control signals, and an adder, microprogram counter, and address select logic (AddrCtl) sequence the microinstructions using Op5 to Op0 from Inst[31-26] (Opcode)]

Defining a Microinstruction Format
Format: the fields of the microinstruction and the control signals that are affected by each field.
- Control signals specified by a field usually have functions that are related.
- The format is chosen to simplify the representation and to make it difficult to write inconsistent microinstructions, i.e., ones that allow a given control signal to be set to two different values.
Make each field of the microinstruction responsible for specifying a nonoverlapping set of control signals.
- Signals that are never asserted simultaneously may share the same field.
- Seven fields for our simple machine: ALU control; SRC1; SRC2; Register control; Memory; PC write control; Sequencing.

Our Microinstruction Format
Each entry lists the field value, the signal setting, and a comment.

ALU control
- Add: ALUOp = 00. Cause ALU to add.
- Subt: ALUOp = 01. Cause ALU to subtract (compare op for beq).
- Func code: ALUOp = 10. Use IR function code to determine ALU control.
SRC1
- PC: ALUSrcA = 0. Use PC as top ALU input.
- A: ALUSrcA = 1. Use reg A as top ALU input.
SRC2
- B: ALUSrcB = 00. Use reg B as bottom ALU input.
- 4: ALUSrcB = 01. Use 4 as bottom ALU input.
- Extend: ALUSrcB = 10. Use sign ext output as bottom ALU input.
- Extshft: ALUSrcB = 11. Use shift-by-two output as bottom ALU input.
Register control
- Read: Read RegFile using rs and rt fields of IR as read addr's; put data into A and B.
- Write ALU: RegWrite, RegDst = 1, MemtoReg = 0. Write RegFile using rd field of IR as write addr and ALUOut as write data.
- Write MDR: RegWrite, RegDst = 0, MemtoReg = 1. Write RegFile using rt field of IR as write addr and MDR as write data.

Our Microinstruction Format, con't
Memory
- Read PC: MemRead, IorD = 0, IRWrite. Read memory using PC as addr; write result into IR (and MDR).
- Read ALU: MemRead, IorD = 1. Read memory using ALUOut as addr; write results into MDR.
- Write ALU: MemWrite, IorD = 1. Write memory using ALUOut as addr and B as write data.
PC write control
- ALU: PCSource = 00, PCWrite. Write PC with output of ALU.
- ALUOut-cond: PCSource = 01, PCWriteCond. If Zero output of ALU is true, write PC with the contents of ALUOut.
- Jump address: PCSource = 10, PCWrite. Write PC with jump address after shift-by-two.
Sequencing
- Seq: AddrCtl = 11. Choose next microinstruction sequentially.
- Fetch: AddrCtl = 00. Jump to the first microinstruction (i.e., Fetch) to begin a new instruction.
- Dispatch 1: AddrCtl = 01. Branch using PLA_1.
- Dispatch 2: AddrCtl = 10. Branch using PLA_2.

Creating the Microprogram
Fetch microinstruction:
Label (Addr): Fetch (0); ALU control: Add; SRC1: PC; SRC2: 4; Reg control: (none); Memory: Read PC; PC write control: ALU; Seq'ing: Seq
- Add, PC, 4: compute PC + 4
- Read PC: fetch instr into IR
- ALU: write ALU output into PC
- Seq: go to microinstruction 1
The Label field represents the state (address) of the microinstruction.
The Fetch microinstruction is assigned state (address) 0.

Multicycle Datapath FSM
[Figure: complete multicycle control FSM]

The Entire Microprogram
Addr | ALU control | SRC1 | SRC2     | Reg control | Memory    | PC write control | Seq'ing
0    | Add         | PC   | 4        |             | Read PC   | ALU              | Seq
1    | Add         | PC   | Ext shft | Read        |           |                  | Disp 1
2    | Add         | A    | Extend   |             |           |                  | Disp 2
3    |             |      |          |             | Read ALU  |                  | Seq
4    |             |      |          | Write MDR   |           |                  | Fetch
5    |             |      |          |             | Write ALU |                  | Fetch
6    | Func code   | A    | B        |             |           |                  | Seq
7    |             |      |          | Write ALU   |           |                  | Fetch
8    | Subt        | A    | B        |             |           | ALUOut-cond      | Fetch
9    |             |      |          |             |           | Jump address     | Fetch
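The sequencing field is what turns the microprogram into a running control unit. The sketch below models only that field, as an illustration under stated assumptions: microinstruction addresses 0 to 9 are laid out as in the microprogram above, the opcodes are the standard MIPS encodings, and the contents of the two dispatch tables (`DISPATCH_1`, `DISPATCH_2`) are assumed from the usual textbook design rather than taken from these slides. The names `next_address` and `run` are hypothetical.

```python
# Sequencing field of the microinstruction at each address 0..9.
SEQUENCING = [
    "Seq",         # 0: fetch
    "Dispatch 1",  # 1: decode, branch on opcode
    "Dispatch 2",  # 2: memory address computation, branch on lw vs sw
    "Seq",         # 3: lw memory read
    "Fetch",       # 4: lw write back
    "Fetch",       # 5: sw memory write
    "Seq",         # 6: R-type execute
    "Fetch",       # 7: R-type completion
    "Fetch",       # 8: beq
    "Fetch",       # 9: jump
]

# Assumed dispatch table contents, keyed by the standard MIPS opcodes.
OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0b000000, 0b000010, 0b000100, 0b100011, 0b101011
DISPATCH_1 = {OP_RTYPE: 6, OP_J: 9, OP_BEQ: 8, OP_LW: 2, OP_SW: 2}
DISPATCH_2 = {OP_LW: 3, OP_SW: 5}


def next_address(addr: int, opcode: int) -> int:
    """Select the next microinstruction address from the sequencing field."""
    seq = SEQUENCING[addr]
    if seq == "Seq":
        return addr + 1              # increment the microprogram counter
    if seq == "Fetch":
        return 0                     # start the next MIPS instruction
    if seq == "Dispatch 1":
        return DISPATCH_1[opcode]
    return DISPATCH_2[opcode]        # "Dispatch 2"


def run(opcode: int) -> list:
    """Addresses of the microroutine executed for one MIPS instruction."""
    addrs = [0]
    while SEQUENCING[addrs[-1]] != "Fetch":
        addrs.append(next_address(addrs[-1], opcode))
    return addrs


if __name__ == "__main__":
    for name, opcode in (("lw", OP_LW), ("sw", OP_SW), ("beq", OP_BEQ)):
        print(name, run(opcode))
```

Only addresses 1 and 2 need a dispatch table, which matches the earlier remark that one table follows state 1 and another follows state 2.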

Multicycle Datapath FSM
[Figure: complete multicycle control FSM]

Implementing Dispatches
[Figure: dispatch tables used after the decode and memory address microinstructions]

Multicycle Advantages & Disadvantages
Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step.
- Balance the amount of work to be done in each step.
- Restrict each step to use only one major functional unit.
Multicycle implementations allow
- faster clock rates
- different instructions to take a different number of clock cycles
- functional units to be used more than once per instruction as long as they are used on different clock cycles
but
- require additional internal state registers, muxes, and more complicated (FSM) control.

Control Path Design Alternatives
Finite state machine control:
- Initial representation: finite state diagram
- Sequencing control: explicit next state function
- Logic representation: logic equations
- Implementation technique: Programmable Logic Array (PLA)
Microprogrammed control:
- Initial representation: microprogram
- Sequencing control: microprogram counter + dispatch PLAs
- Implementation technique: microcode ROM/RAM
Microprogram representation advantages: easier to design, write, and debug.

Single Cycle vs. Multiple Cycle Timing
Single Cycle Implementation:
[Figure: Cycle 1 holds lw; Cycle 2 holds sw, with the remainder of the cycle wasted]
Multiple Cycle Implementation:
[Figure: Cycles 1 to 10; lw takes IFetch, Dec, Exec, Mem, WB; sw takes IFetch, Dec, Exec, Mem; an R-type instruction then starts with IFetch]
The multicycle clock is slower than 1/5th of the single cycle clock due to state register overhead.

Summary
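The timing trade-off can be made concrete with a short calculation. Every number below is hypothetical and chosen only for illustration: the instruction mix, the 5.0 ns single-cycle period, and the 1.1 ns multicycle period (slightly longer than one fifth of 5.0 ns, to reflect the state register overhead mentioned above). The cycle counts per class come from the RTL summary; the function names are assumptions.

```python
# Cycles per instruction class in the multicycle implementation.
CYCLES = {"lw": 5, "sw": 4, "rtype": 4, "beq": 3, "j": 3}

# Hypothetical instruction mix (fractions sum to 1.0).
MIX = {"lw": 0.25, "sw": 0.10, "rtype": 0.50, "beq": 0.10, "j": 0.05}

# Hypothetical clock periods in nanoseconds.
T_SINGLE_NS = 5.0   # single-cycle clock must fit the slowest instruction
T_MULTI_NS = 1.1    # a bit more than T_SINGLE_NS / 5 (register overhead)


def average_cpi(mix, cycles):
    """Weighted average of cycles per instruction over the mix."""
    return sum(fraction * cycles[op] for op, fraction in mix.items())


def time_per_instruction(cpi, period_ns):
    """Average execution time per instruction in nanoseconds."""
    return cpi * period_ns


multi_cpi = average_cpi(MIX, CYCLES)
single_time = time_per_instruction(1.0, T_SINGLE_NS)   # CPI is 1 by design
multi_time = time_per_instruction(multi_cpi, T_MULTI_NS)

if __name__ == "__main__":
    print(f"multicycle CPI:            {multi_cpi:.2f}")
    print(f"single-cycle time/instr:   {single_time:.2f} ns")
    print(f"multicycle time/instr:     {multi_time:.2f} ns")
```

With this particular mix the multicycle design comes out ahead, but the margin depends entirely on how many instructions finish in fewer than five cycles and on how much overhead the extra registers add to the clock period.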