Design of the MIPS Processor

Similar documents
Design of the MIPS Processor (contd)

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

Processor: Multi- Cycle Datapath & Control

CPE 335. Basic MIPS Architecture Part II

LECTURE 6. Multi-Cycle Datapath and Control

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

Lecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón

Chapter 5 Solutions: For More Practice

CSE 2021 COMPUTER ORGANIZATION

ECE369. Chapter 5 ECE369

Topic #6. Processor Design

CSE 2021 COMPUTER ORGANIZATION

The Processor: Datapath & Control

Processor (I) - datapath & control. Hwansoo Han

Chapter 5: The Processor: Datapath and Control

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

Systems Architecture I

Processor (multi-cycle)

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

Multiple Cycle Data Path

CC 311- Computer Architecture. The Processor - Control

Systems Architecture

RISC Processor Design

Alternative to single cycle. Drawbacks of single cycle implementation. Multiple cycle implementation. Instruction fetch

LECTURE 5. Single-Cycle Datapath and Control

ENE 334 Microprocessors

Computer Science 141 Computing Hardware

Chapter 4 The Processor (Part 2)

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle

Control & Execution. Finite State Machines for Control. MIPS Execution. Comp 411. L14 Control & Execution 1

EECE 417 Computer Systems Architecture

Chapter 4. The Processor. Computer Architecture and IC Design Lab

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control

CSEN 601: Computer System Architecture Summer 2014

Points available Your marks Total 100

Lets Build a Processor

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

ECE232: Hardware Organization and Design

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming

ALUOut. Registers A. I + D Memory IR. combinatorial block. combinatorial block. combinatorial block MDR

Multicycle Approach. Designing MIPS Processor

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

RISC Architecture: Multi-Cycle Implementation

RISC Architecture: Multi-Cycle Implementation

October 24. Five Execution Steps

Note- E~ S. \3 \S U\e. ~ ~s ~. 4. \\ o~ (fw' \i<.t. (~e., 3\0)

Laboratory 5 Processor Datapath

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

COMP2611: Computer Organization. The Pipelined Processor

Single vs. Multi-cycle Implementation

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Chapter 4. The Processor

Single Cycle Data Path

The MIPS Processor Datapath

Processor Design Pipelined Processor (II) Hung-Wei Tseng

RISC Design: Multi-Cycle Implementation

Multicycle conclusion

Implementing the Control. Simple Questions

ECE260: Fundamentals of Computer Engineering

CPE 335 Computer Organization. Basic MIPS Architecture Part I

Inf2C - Computer Systems Lecture Processor Design Single Cycle

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CSE Computer Architecture I Fall 2009 Lecture 13 In Class Notes and Problems October 6, 2009

CSE Lecture In Class Example Handout

Lecture 5: The Processor

Using a Hardware Description Language to Design and Simulate a Processor 5.8

Chapter 4. The Processor

ECS 154B Computer Architecture II Spring 2009

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.

ECE260: Fundamentals of Computer Engineering

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

Single Cycle CPU Design. Mehran Rezaei

The Processor: Datapath & Control

Major CPU Design Steps

Project Description EEC 483 Computer Organization, Spring 2019

Adding Support for jal to Single Cycle Datapath (For More Practice Exercise 5.20)

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

CPE 335 Computer Organization. Basic MIPS Pipelining Part I

MIPS-Lite Single-Cycle Control

CSSE232 Computer Architecture I. Mul5cycle Datapath

Processor Implementation in VHDL. University of Ulster at Jordanstown University of Applied Sciences, Augsburg

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

TDT4255 Computer Design. Lecture 4. Magnus Jahre. TDT4255 Computer Design

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

CS3350B Computer Architecture Quiz 3 March 15, 2018

ECE 3056: Architecture, Concurrency and Energy of Computation. Single and Multi-Cycle Datapaths: Practice Problems

COMP303 Computer Architecture Lecture 9. Single Cycle Control

These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

EE 457 Unit 6a. Basic Pipelining Techniques

Multi-cycle Approach. Single cycle CPU. Multi-cycle CPU. Requires state elements to hold intermediate values. one clock cycle or instruction

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

Transcription:

Design of the MIPS Processor We will study the design of a simple version of MIPS that can support the following instructions: I-type instructions LW, SW R-type instructions, like ADD, SUB Conditional branch instruction BEQ J-type branch instruction J The instruction formats 6-bit 5-bit 5-bit 5-bit 5-bit 5-bit LW op rs rt SW op rs rt immediate immediate ADD op rs rt rd 0 func SUB op rs rt rd 0 func BEQ op rs rt immediate J op address

ALU control ALU control (3-bit) 32 ALU result 32 ALU control input ALU function 000 AND 001 OR 010 add 110 sub 111 Set less than How to generate the ALU control input? The control unit first generates this from the opcode of the instruction.

A single-cycle MIPS We consider a simple version of MIPS that uses Harvard architecture. Harvard architecture uses separate memory for instruction and data. Instruction memory is read-only a programmer cannot write into the instruction memory. To read from the data memory, set Memory read =1 To write into the data memory, set Memory write =1

Instruction fetching Clock Each clock cycle fetches the instruction from the address specified by the PC, and increments PC by 4 at the same time.

Executing R-type instructions This is the instruction format for the R-type instructions.

Here are the steps in the execution of an R-type instruction: Read instruction Read source registers rs and rt ALU performs the desired operation Store result in the destination register rd. Q. Why should all these be completed in a single cycle?

Executing lw, sw instructions These are I-type instructions. op rs rt address Try to recognize the steps in the execution of lw and sw.

beq $at, $zero, L Design of the MIPS Processor (contd) First, revisit the datapath for add, sub, lw, sw. We will augment it to accommodate the beq and j instructions. Execution of branch instructions add $v1, $v0, $zero add $v1, $v1, $v1 j somewhere L: add $v1, $v0, $v0 Offset= 3x4=12 The offset must be added to the next PC to generate the target address for branch.

The modified version of MIPS The final datapath for single cycle MIPS. Find out which paths the signal follow for lw, sw, add and beq instructions

Executing R-type instructions The ALUop will be determined by the value of the opcode field and the function field of the instruction word

Executing LW instruction

Executing beq instruction The branch may

Control signal table This table summarizes what control signals are needed to execute an instruction. The set of control signals vary from one instruction to another. How to implement the control unit? Recall how to convert a truth table into a logical circuit! The control unit implements the above truth table.

Instruction Memory The Control Unit Control ALUsrc MemRead I [31-26, 15-0] MemWrite ALUop RegDst Regwrite All control signals are not shown here

1-cycle implementation is not used Why? Because the length of the clock cycle will always be determined by the slowest operation (lw, sw) even if the data memory is not used. Practical implementations use multiple cycles per instruction, which fixes some shortcomings of the 1-cycle implementation. Faster instructions (R-type) are not held back by the slower instructions (lw, sw) The clock cycle time can be decreased, i.e. faster clock can be used Eventually simplifies the implementation of pipelining, the universal speed-up technique. This requires some changes in the datapath

Multi-cycle implementation of MIPS First, revisit the 1- cycle version

The multi-cycle version Note that we have eliminated two adders, and used only one memory unit (so it is Princeton architecture) that contains both instructions and data. It is not essential to have a single memory unit, but it shows an alternative design of the datapath.

Intermediate registers are necessary In each cycle, a fraction of the instruction is executed Five stages of instruction execution Cycle 1. Instruction fetch and PC increment Cycle 2. Reading sources from the register file Cycle 3 Performing an ALU computation Cycle 4 Reading or writing (data) memory Cycle 5 Storing data back to the register file

Why intermediate registers? Sometimes we need the output of a functional unit in a later clock cycle during the execution of an instruction. (Example: The instruction word fetched in stage 1 determines the destination of the register write in stage 5. The ALU result for an address computation in stage 3 is needed as the memory address for lw or sw in stage 4.) These outputs must be stored in intermediate registers for future use. Otherwise they will be lost after the next clock cycle. (Instruction read in stage 1 is saved in Instruction register. Register file outputs from stage 2 are saved in registers A and B. The ALU output will be stored in a register ALUout. Any data fetched from memory in stage 4 is kept in the Memory data register MDR.)

The Five Cycles of MIPS (Instruction Fetch) IR:= Memory[PC] PC:= PC+4 (Instruction decode and Register fetch) A:= Reg[IR[25:21]], B:=Reg[IR[20:16]] ALUout := PC + sign-extend(ir[15:0]] (Execute Memory address Branch completion) Memory reference: ALUout:= A+ IR[15:0] R-type (ALU): ALUout:= A op B Branch: if A=B then PC := ALUout (Memory access R-type completion) LW: MDR:= Memory[ALUout] SW: Memory[ALUout]:= B R-type: Reg[IR[15:11]]:= ALUout (Writeback) LW: Reg[[20:16]]:= MDR

We will now study the implementation of a pipelined version of MIPS. We utilize the five stages of implementation for this purpose.

The PC is not shown here, but can easily be added. Also, the buffer between the stages is not shown The implementation of pipelining becomes simpler when you use separate instruction memory and data memory (We will explain it later). So we go back to our original Harvard architecture.

Pipelined MIPS Why pipelining? While a typical instruction takes 3-4 cycles (i.e. 3-4 CPI), a pipelined processor targets 1 CPI (and gets close to it). Pipelining in a laundromat -- Washer takes 30 minutes --Dryer takes 40 minutes -- Folding takes 20 minutes. How does the laundromat example help with speeding up MIPS?