CPE 335. Basic MIPS Architecture Part II


CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture 1

Multicycle Datapath Approach
- Let an instruction take more than one clock cycle to complete.
- Break up instructions into steps where
  - each step takes one cycle, while trying to balance the amount of work done in each step;
  - each cycle uses at most one major functional unit, unless units are used in parallel.
- Not every instruction takes the same number of clock cycles.
- In addition to allowing a faster clock rate, the multicycle approach lets a functional unit be used more than once per instruction, as long as it is used on different clock cycles. As a result:
  - only one memory is needed, but only one memory access is possible per cycle;
  - only one ALU/adder is needed, but only one ALU operation is possible per cycle.

Multicycle Datapath Approach, cont.
- At the end of a cycle, store values needed in a later cycle by the current instruction in internal registers (A, B, IR, and MDR). These registers are invisible to the programmer.
- All of these registers, except IR, hold data only between a pair of adjacent clock cycles, so they don't need a write control signal.
- Data used by subsequent instructions are stored in programmer-visible state (i.e., register file, PC, or memory).

[Figure: multicycle datapath. Elements shown: PC; Memory (Address, Write Data, Read Data for instructions or data); IR; MDR; Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2); A; B; ALU; ALUOut]

Internal registers:
- IR: Instruction Register
- MDR: Memory Data Register
- A, B: register-file read data registers
- ALUOut: ALU output register

Multicycle Datapath Approach, cont.
- Similar to single cycle, shared functional units should have multiplexers at their inputs.
- There is only one ALU/adder, which is used to update the PC, perform ALU operations, do the comparison for beq, compute memory addresses, and compute branch addresses.

Multicycle Datapath Approach - Control Signals
[Figure: multicycle datapath annotated with control signals]

The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with its control unit. Control input: Instr[31-26]. Control outputs: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst. Datapath elements: PC, Memory, IR, MDR, Register File, Sign Extend, Shift left 2, A, B, ALU, ALU control, ALUOut, and the jump-address path built from PC[31-28] and Instr[25-0] shifted left 2.]

Multicycle Machine: 1-bit Control Signals

RegDst
  Deasserted: The destination register number comes from the rt field.
  Asserted: The destination register number comes from the rd field.
RegWrite
  Deasserted: None.
  Asserted: Write is enabled to the selected destination register.
ALUSrcA
  Deasserted: The first ALU operand is the PC.
  Asserted: The first ALU operand is register A.
MemRead
  Deasserted: None.
  Asserted: The content of the addressed memory location is placed on Memory data out.
MemWrite
  Deasserted: None.
  Asserted: The memory location specified by the address is replaced by the value on the Write data input.
MemtoReg
  Deasserted: The value fed to the register file is from ALUOut.
  Asserted: The value fed to the register file is from memory (MDR).
IorD
  Deasserted: The PC is used as the address to the memory unit.
  Asserted: ALUOut is used to supply the address to the memory unit.
IRWrite
  Deasserted: None.
  Asserted: The output of memory is written into IR.
PCWrite
  Deasserted: None.
  Asserted: The PC is written; the source is controlled by PCSource.
PCWriteCond
  Deasserted: None.
  Asserted: The PC is written if the Zero output from the ALU is also active.

Multicycle Machine: 2-bit Control Signals

ALUOp
  00: The ALU performs an add operation.
  01: The ALU performs a subtract operation.
  10: The funct field of the instruction determines the ALU operation.
ALUSrcB
  00: The second input to the ALU comes from register B.
  01: The second input to the ALU is 4 (to increment the PC).
  10: The second input to the ALU is the sign-extended lower 16 bits of the IR.
  11: The second input to the ALU is the sign-extended lower 16 bits of the IR, shifted left by two bits.
PCSource
  00: The output of the ALU (PC + 4) is sent to the PC for writing.
  01: The contents of ALUOut (the branch address) are sent to the PC for writing.
  10: The jump address is sent to the PC for writing.

Breaking Instruction Execution into Clock Cycles

Cycle 1: IFetch | Cycle 2: Dec | Cycle 3: Exec | Cycle 4: Mem | Cycle 5: WB

1. IFetch: Instruction fetch and PC update (same for all instructions)
Operations
  1.1 Instruction fetch: IR <= Memory[PC]
  1.2 Update PC: PC <= PC + 4
Control signal values
  - IorD = 0, MemRead = 1, IRWrite = 1
  - ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCWrite = 1
  - PCSource = 00

Breaking Instruction Execution into Clock Cycles

2. Decode: Instruction decode and register fetch (same for all instructions)
We don't know the instruction yet, so perform only harmless operations.
Operations
  2.1 Read the two source registers rs and rt and place them in registers A and B, respectively:
      A <= Reg[IR[25:21]]
      B <= Reg[IR[20:16]]
  2.2 Compute the branch address:
      ALUOut <= PC + (sign-extend(IR[15:0]) << 2)
Control signal values
  - ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
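The decode step can be sketched in Python as follows. This is only an illustrative model, not part of the original slides: the names `sign_extend16` and `decode_step`, and the use of a Python list for the register file, are assumptions made for the example.

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of `value` to a signed Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_step(ir, pc, regs):
    """Model step 2: load A and B, and precompute the branch address.

    `pc` is the PC after step 1, i.e. it already holds PC + 4.
    `regs` is a hypothetical 32-entry list standing in for the register file.
    """
    rs = (ir >> 21) & 0x1F        # IR[25:21]
    rt = (ir >> 16) & 0x1F        # IR[20:16]
    offset = sign_extend16(ir)    # sign-extend(IR[15:0])
    return {
        "A": regs[rs],
        "B": regs[rt],
        # ALUOut <= PC + (sign-extend(IR[15:0]) << 2), kept to 32 bits
        "ALUOut": (pc + (offset << 2)) & 0xFFFFFFFF,
    }
```

The branch address is computed for every instruction because the ALU is otherwise idle in this cycle; if the instruction turns out not to be a branch, ALUOut is simply overwritten in step 3.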

Breaking Instruction Execution into Clock Cycles

3. Execution, memory address computation, or branch completion
The operation in this cycle depends on the instruction type.
Operations
  * If memory reference, compute the address:
      ALUOut <= A + sign-extend(IR[15:0])
      ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  * If arithmetic-logic instruction, perform the operation:
      ALUOut <= A op B
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10

Breaking Instruction Execution into Clock Cycles

3. Execution, memory address computation, or branch completion (continued)
The operation in this cycle depends on the instruction type.
Operations
  * If branch instruction:
      if (A == B) PC <= ALUOut
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
  * If jump instruction:
      PC <= {PC[31:28], IR[25:0], 2'b00}
      PCSource = 10, PCWrite = 1
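The two PC updates in this step can be illustrated with a short Python sketch. The function names `jump_target` and `branch_completion` are hypothetical and chosen only for this example; the bit manipulation follows the register-transfer statements above.

```python
def jump_target(pc, ir):
    """PC <= {PC[31:28], IR[25:0], 2'b00}.

    `pc` is the already-incremented PC; only its top four bits are kept.
    """
    return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)


def branch_completion(pc, a, b, alu_out):
    """if (A == B) PC <= ALUOut, otherwise the PC keeps PC + 4.

    The ALU subtracts B from A (ALUOp = 01); a zero result asserts Zero,
    which together with PCWriteCond enables the PC write.
    """
    zero = ((a - b) & 0xFFFFFFFF) == 0
    return alu_out if zero else pc
```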

Breaking Instruction Execution into Clock Cycles

4. Memory access or R-type completion
The operation in this cycle depends on the instruction type.
Operations
  * If load instruction, read the value from memory into MDR:
      MDR <= Memory[ALUOut]
      MemRead = 1, IorD = 1
  * If store instruction, store rt into memory:
      Memory[ALUOut] <= B
      MemWrite = 1, IorD = 1
  * If arithmetic-logical instruction, write the ALU result into rd:
      Reg[IR[15:11]] <= ALUOut
      MemtoReg = 0, RegDst = 1, RegWrite = 1

Breaking Instruction Execution into Clock Cycles

5. Memory read completion
Needed for the load instruction only.
Operations
  5.1 Store the loaded value in MDR into rt:
      Reg[IR[20:16]] <= MDR
      RegWrite = 1, MemtoReg = 1, RegDst = 0
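To see the five steps working together, the following Python sketch walks a single lw through them. It is a simplified model under stated assumptions: memory is a hypothetical dict from byte address to 32-bit word, the register file is a list, and `run_lw` is a name invented for the example. Control signals are not modeled, only the register transfers.

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of `value` to a signed Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_lw(pc, regs, memory):
    """Execute the lw stored at memory[pc]; return (new_pc, cycles_used)."""
    # Step 1 (IFetch): IR <= Memory[PC]; PC <= PC + 4
    ir = memory[pc]
    pc = pc + 4

    # Step 2 (Decode): A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]]
    # (the branch address also goes to ALUOut here, but lw overwrites it)
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]   # fetched but not used by lw

    # Step 3 (Exec): ALUOut <= A + sign-extend(IR[15:0])
    alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF

    # Step 4 (Mem): MDR <= Memory[ALUOut]
    mdr = memory[alu_out]

    # Step 5 (WB): Reg[IR[20:16]] <= MDR
    regs[(ir >> 16) & 0x1F] = mdr
    return pc, 5
```

Each intermediate value (`ir`, `a`, `alu_out`, `mdr`) corresponds to one of the internal registers that carry data from one clock cycle to the next.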

Breaking Instruction Execution into Clock Cycles

In this implementation, not all instructions take 5 cycles.

Instruction class: clock cycles required
  Load: 5
  Store: 4
  Branch: 3
  Arithmetic-logical: 4
  Jump: 3

Multicycle Performance

Compute the average CPI of the multicycle implementation for a SPECINT2000 program with the following instruction mix: 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU. Assume the CPI for each instruction class given in the previous table.

  CPI = Σ (CPI_i x IC_i) / IC
      = 0.25 x 5 + 0.10 x 4 + 0.11 x 3 + 0.02 x 3 + 0.52 x 4
      = 4.12

Compare this to CPI = 1 for the single-cycle implementation. Assume CC_M = (1/5) CC_S. Then

  Performance_M / Performance_S = (IC x 1 x CC_S) / (IC x 4.12 x (1/5) CC_S) = 1.21

Multicycle is also cost-effective in terms of hardware.
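The arithmetic above can be checked with a few lines of Python. The dictionary and function names are illustrative assumptions; the cycle counts and instruction mix are the ones given in the slides.

```python
# Cycles per instruction class, from the table of clock cycles required.
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "alu": 4}

# Instruction mix of the example program (fractions of the instruction count).
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "alu": 0.52}


def average_cpi(cycles, mix):
    """Weighted average: sum of CPI_i * (IC_i / IC) over all classes."""
    return sum(mix[cls] * cycles[cls] for cls in mix)


def multicycle_speedup(cpi_multi, clock_ratio=1 / 5, cpi_single=1.0):
    """Performance_M / Performance_S, assuming CC_M = clock_ratio * CC_S."""
    return cpi_single / (cpi_multi * clock_ratio)


cpi = average_cpi(CYCLES, MIX)
print(round(cpi, 2))                      # 4.12
print(round(multicycle_speedup(cpi), 2))  # 1.21
```

Changing `clock_ratio` shows how sensitive the result is to the clock assumption: at a ratio of about 0.243 (1 / 4.12) the two implementations break even for this mix.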

Multicycle Control Unit
- Multicycle datapath control signals are not determined solely by the bits in the instruction.
  - e.g., the opcode bits tell what operation the ALU should be doing, but not which instruction cycle is to be done next.
- Since the instruction is broken into multiple cycles, we need to know what we did in the previous cycle(s) in order to determine the current action.
- Must use a finite state machine (FSM) for control:
  - a set of states (current state stored in the State Register)
  - a next-state function (determined by the current state and the input)
  - an output function (determined by the current state and the input)

[Figure: FSM controller. Combinational control logic takes the instruction opcode and the State Register output as inputs, and produces the datapath control points and the next state.]

The States of the Control Unit
- 10 states are required in the FSM control.
- The sequence of states is determined by the five steps of execution and the instruction.
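A next-state function for such a 10-state controller might look like the Python sketch below. The state diagram itself did not survive in this transcript, so the state numbering (0 = fetch, 1 = decode, 2 = memory address, 3 = memory read, 4 = load write-back, 5 = memory write, 6 = execute, 7 = R-type completion, 8 = branch completion, 9 = jump completion) is an assumption that follows the usual textbook presentation. The opcode values are the standard MIPS encodings; the function names are hypothetical.

```python
# Standard MIPS opcodes for the instructions covered by this datapath.
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02


def next_state(state, opcode):
    """Return the next FSM state (assumed numbering, see the text above)."""
    if state == 0:                      # fetch -> decode
        return 1
    if state == 1:                      # decode: dispatch on the opcode
        return {LW: 2, SW: 2, RTYPE: 6, BEQ: 8, J: 9}[opcode]
    if state == 2:                      # address computed: read or write
        return 3 if opcode == LW else 5
    if state == 3:                      # memory read -> load write-back
        return 4
    if state == 6:                      # execute -> R-type completion
        return 7
    return 0                            # states 4, 5, 7, 8, 9 end the instruction


def cycles_for(opcode):
    """Count the states visited from fetch until control returns to fetch."""
    state, cycles = 0, 0
    while True:
        state = next_state(state, opcode)
        cycles += 1
        if state == 0:
            return cycles
```

Walking the FSM this way reproduces the cycle counts per instruction class: 5 for a load, 4 for a store or an arithmetic-logical instruction, and 3 for a branch or a jump.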

The Control Unit

1. Logic gates
  - inputs: present state + opcode, #bits = 10
  - outputs: control + next state, #bits = 20
  - truth table size = 2^10 rows x 20 columns

2. ROM
  - Can be used to implement the truth table above (2^10 x 20 bits = 20 Kbit).
  - Each location stores the control signal values and the next state.
  - Each location is addressed by the opcode and the current state value.
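The bit counts can be reproduced with a short calculation. The slide gives only the totals (10 input bits, 20 output bits); the breakdown below into 4 state bits plus 6 opcode bits, and 16 control bits plus 4 next-state bits, is an assumption derived from the 10 FSM states and the control signals listed earlier.

```python
NUM_STATES = 10
OPCODE_BITS = 6          # MIPS opcode field, Instr[31-26]

# Width in bits of each control signal from the 1-bit and 2-bit signal tables.
CONTROL_WIDTHS = {
    "PCWrite": 1, "PCWriteCond": 1, "IorD": 1, "MemRead": 1, "MemWrite": 1,
    "IRWrite": 1, "MemtoReg": 1, "PCSource": 2, "ALUOp": 2, "ALUSrcB": 2,
    "ALUSrcA": 1, "RegWrite": 1, "RegDst": 1,
}

state_bits = (NUM_STATES - 1).bit_length()            # bits to encode states 0..9
address_bits = state_bits + OPCODE_BITS               # ROM address width
word_bits = sum(CONTROL_WIDTHS.values()) + state_bits # control + next state
rom_bits = (2 ** address_bits) * word_bits

print(address_bits, word_bits, rom_bits // 1024)      # 10 20 20
```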

Micro-programmed Control Unit
- A ROM implementation is vulnerable to bugs and expensive, especially for a complex CPU. Its size increases as the number and complexity of instructions (states) increases.
- Use microprogramming:
  - The next state value may not be sequential.
  - Generate the next state outside the storage element.
  - Each state is a microinstruction, and the signals are specified symbolically.
  - Use labels for sequencing.

Sequencer
[Figure: microprogram sequencer]

Microprogram
- The microassembler converts the microcode into actual signal values.
- The sequencing field is used along with the opcode to determine the next state.

Multicycle Advantages & Disadvantages
- Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step.

[Timing diagram: lw occupies cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw occupies cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in cycle 10.]

- Multicycle implementations allow functional units to be used more than once per instruction, as long as they are used on different clock cycles.
but
- Requires additional internal state registers, more muxes, and more complicated (FSM) control.

Single Cycle vs. Multiple Cycle Timing

Single-cycle implementation:
[Timing diagram: lw occupies cycle 1 and sw occupies cycle 2; part of the sw cycle is wasted.]

Multiple-cycle implementation:
[Timing diagram: lw occupies cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw occupies cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in cycle 10.]

- The multicycle clock cycle is longer than 1/5 of the single-cycle clock cycle because of state register overhead.
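The timing comparison can be made concrete with a small Python sketch. It assumes the idealized case in which the multicycle clock cycle is exactly 1/5 of the single-cycle clock cycle; as noted above, the real multicycle clock is somewhat slower, so the multicycle figures here are optimistic. The names `CYCLES`, `single_cycle_time`, and `multicycle_time` are hypothetical.

```python
from fractions import Fraction

# Clock cycles per instruction in the multicycle implementation.
CYCLES = {"lw": 5, "sw": 4, "rtype": 4, "beq": 3, "j": 3}


def single_cycle_time(program, cc_s=1):
    """Every instruction takes one (long) clock cycle of length cc_s."""
    return len(program) * cc_s


def multicycle_time(program, cc_m):
    """Each instruction takes only as many short cycles as it needs."""
    return sum(CYCLES[instr] for instr in program) * cc_m


program = ["lw", "sw"]
cc_m = Fraction(1, 5)                        # idealized: CC_M = CC_S / 5
print(single_cycle_time(program))            # 2
print(multicycle_time(program, cc_m))        # 9/5
```

For the lw/sw pair, the multicycle design finishes in 9 short cycles (1.8 single-cycle periods under this assumption) because sw does not pay for the write-back step it never uses.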