RISC Architecture: Multi-Cycle Implementation

Similar documents
RISC Architecture: Multi-Cycle Implementation

RISC Design: Multi-Cycle Implementation

RISC Processor Design

ALUOut. Registers A. I + D Memory IR. combinatorial block. combinatorial block. combinatorial block MDR

CPE 335. Basic MIPS Architecture Part II

Processor: Multi- Cycle Datapath & Control

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control

CSE 2021 COMPUTER ORGANIZATION

LECTURE 6. Multi-Cycle Datapath and Control

Systems Architecture I

Points available Your marks Total 100

Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle

ENE 334 Microprocessors

CSE 2021 COMPUTER ORGANIZATION

Lecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón

ECE369. Chapter 5 ECE369

CC 311- Computer Architecture. The Processor - Control

Processor (multi-cycle)

Chapter 5: The Processor: Datapath and Control

Topic #6. Processor Design

Beyond Pipelining. CP-226: Computer Architecture. Lecture 23 (19 April 2013) CADSL

Multi-cycle Approach. Single cycle CPU. Multi-cycle CPU. Requires state elements to hold intermediate values. one clock cycle or instruction

Microprogramming. Microprogramming

Chapter 5 Solutions: For More Practice

The Processor: Datapath & Control

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

Multiple Cycle Data Path

Control Unit for Multiple Cycle Implementation

Lets Build a Processor

Single vs. Multi-cycle Implementation

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

Major CPU Design Steps

Alternative to single cycle. Drawbacks of single cycle implementation. Multiple cycle implementation. Instruction fetch

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions

Initial Representation Finite State Diagram. Logic Representation Logic Equations

Processor (I) - datapath & control. Hwansoo Han

Computer Science 141 Computing Hardware

EECE 417 Computer Systems Architecture

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

Design of the MIPS Processor (contd)

CSE 2021: Computer Organization Fall 2010 Solution to Assignment # 3: Multicycle Implementation

Systems Architecture

Design of the MIPS Processor

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming

Using a Hardware Description Language to Design and Simulate a Processor 5.8

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

ECE 3056: Architecture, Concurrency and Energy of Computation. Single and Multi-Cycle Datapaths: Practice Problems

Mapping Control to Hardware

Digital Design & Computer Architecture (E85) D. Money Harris Fall 2007

Chapter 4 The Processor (Part 2)

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Note- E~ S. \3 \S U\e. ~ ~s ~. 4. \\ o~ (fw' \i<.t. (~e., 3\0)

Pipelined Processor Design

Multicycle Approach. Designing MIPS Processor

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

Implementing the Control. Simple Questions

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W10-M

EE457. Note: Parts of the solutions are extracted from the solutions manual accompanying the text book.

Initial Representation Finite State Diagram Microprogram. Sequencing Control Explicit Next State Microprogram counter

Pipelined Processor Design

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

ECE 313 Computer Organization FINAL EXAM December 11, Multicycle Processor Design 30 Points

CS232 Final Exam May 5, 2001

Design of Digital Circuits Lecture 13: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring April 2017

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

Review: Abstract Implementation View

LECTURE 5. Single-Cycle Datapath and Control

CENG 3420 Lecture 06: Datapath

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Chapter 4. The Processor. Computer Architecture and IC Design Lab

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

Processor Implementation in VHDL. University of Ulster at Jordanstown University of Applied Sciences, Augsburg

ECE232: Hardware Organization and Design

Final Project: MIPS-like Microprocessor

Chapter 4. The Processor

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

CPE 335 Computer Organization. Basic MIPS Architecture Part I

ECE Exam I - Solutions February 19 th, :00 pm 4:25pm

Lecture 2: MIPS Processor Example

Introduction to Pipelined Datapath

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

Announcements. This week: no lab, no quiz, just midterm

ECE 30, Lab #8 Spring 2014

Very short answer questions. You must use 10 or fewer words. "True" and "False" are considered very short answers.

Chapter 4. The Processor

The overall datapath for RT, lw,sw beq instrucution

Microprogrammed Control Approach

Lecture 10: Simple Data Path

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

EE457 Lab 4 Part 4 Seven Questions From Previous Midterm Exams and Final Exams ee457_lab4_part4.fm 10/6/04

Lecture 5: The Processor

ECS 154B Computer Architecture II Spring 2009

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Single Cycle CPU Design. Mehran Rezaei

CSSE232 Computer Architecture I. Datapath

Transcription:

RISC Architecture: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in Computer Organization & Architecture Lecture 14 (19 April 2013)

Example Processor MIPS subset MIPS Instruc,on Subset v Arithme,c and Logical Instruc,ons Ø add, sub, or, and, slt v Memory reference Instruc,ons Ø lw, sw v Branch Ø beq, j 16 Apr 2013 Computer Architecture@IIT Mandi 2

Multicycle Datapath PCSource IorD out MUX in1 PCWrite etc. PC in2 control Data Addr. Memory 26-31 to Control FSM Instr. reg. (IR) IRWrite 0-25 16-20 Mem. Data (MDR) 11-15 21-25 Register file RegDst RegWrite B Reg. A Reg. ALUSrcA ALUSrcB Shift left 2 28-31 4 ALU ALUOut Reg. MemRead MemWrite MemtoReg 0-5 0-15 Sign extend Shift left 2 ALU control ALUOp 16 Apr 2013 Computer Architecture@IIT Mandi 3

3 to 5 Cycles for an Instruction Step R-type Mem. Ref. Branch type J-type (4 cycles) (4 or 5 cycles) (3 cycles) (3 cycles) Instruc(on fetch IR Memory[PC]; PC PC+4 Instr. decode/ Reg. fetch Execu(on, addr. Comp., branch & jump comple(on ALUOut A op B A Reg(IR[21-25]); B Reg(IR[16-20]) ALUOut PC + (sign extend IR[0-15]) << 2 ALUOut A+sign extend (IR[0-15]) If (A= =B) then PC ALUOut PC PC[28-3 1] (IR[0-25]<<2) Mem. Access or R- type comple(on Memory read comple(on Reg(IR[11-1 5]) ALUOut MDR M[ALUout] or M[ALUOut] B Reg(IR[16-20]) MDR 16 Apr 2013 Computer Architecture@IIT Mandi 4

Cycle 1 of 5: Instruction Fetch (IF) Read instruc,on into IR, M[PC] IR Control signals used:» IorD = 0 select PC» MemRead = 1 read memory» IRWrite = 1 write IR Increment PC, PC + 4 PC Control signals used:» ALUSrcA = 0 select PC into ALU» ALUSrcB = 01 select constant 4» ALUOp = 00 ALU adds» PCSource = 00 select ALU output» PCWrite = 1 write PC 16 Apr 2013 Computer Architecture@IIT Mandi 5

Cycle 2 of 5: Instruction Decode (ID) R I J 31-26 25-21 20-16 15-11 10-6 5-0 opcode reg 1 reg 2 reg 3 shamt fncode opcode reg 1 reg 2 word address increment opcode word address jump Control unit decodes instruc,on Datapath prepares for execu,on R and I types, reg 1 A reg, reg 2 B reg» No control signals needed Branch type, compute branch address in ALUOut» ALUSrcA = 0 select PC into ALU» ALUSrcB = 11 Instr. Bits 0-15 shi` 2 into ALU» ALUOp = 00 ALU adds 16 Apr 2013 Computer Architecture@IIT Mandi 6

Cycle 3 of 5: Execute (EX) R type: execute func,on on reg A and reg B, result in ALUOut Control signals used:» ALUSrcA = 1 A reg into ALU» ALUsrcB = 00 B reg into ALU» ALUOp = 10 instr. Bits 0-5 control ALU I type, lw or sw: compute memory address in ALUOut A reg + sign extend IR[0-15] Control signals used:» ALUSrcA = 1 A reg into ALU» ALUSrcB = 10 Instr. Bits 0-15 into ALU» ALUOp = 00 ALU adds 16 Apr 2013 Computer Architecture@IIT Mandi 7

Cycle 3 of 5: Execute (EX) I type, beq: subtract reg A and reg B, write ALUOut to PC Control signals used:» ALUSrcA = 1 A reg into ALU» ALUsrcB = 00 B reg into ALU» ALUOp = 01 ALU subtracts» If zero = 1, PCSource = 01 ALUOut to PC» If zero = 1, PCwriteCond = 1 write PC» Instruc(on complete, go to IF J type: write jump address to PC IR[0-25] shi` 2 and four leading bits of PC Control signals used:» PCSource = 10» PCWrite = 1 write PC» Instruc(on complete, go to IF 16 Apr 2013 Computer Architecture@IIT Mandi 8

Cycle 4 of 5: Reg Write/Memory R type, write des,na,on register from ALUOut Control signals used:» RegDst = 1 Instr. Bits 11-15 specify reg.» MemtoReg = 0 ALUOut into reg.» RegWrite = 1 write register» Instruc(on complete, go to IF I type, lw: read M[ALUOut] into MDR Control signals used:» IorD = 1 select ALUOut into mem adr.» MemRead = 1 read memory to MDR I type, sw: write M[ALUOut] from B reg Control signals used:» IorD = 1 select ALUOut into mem adr.» MemWrite = 1 write memory» Instruc(on complete, go to IF 16 Apr 2013 Computer Architecture@IIT Mandi 9

Cycle 5 of 5: Reg Write I type, lw: write MDR to reg[ir(16-20)] Control signals used:» RegDst = 0 instr. Bits 16-20 are write reg» MemtoReg = 1 MDR to reg file write input» RegWrite = 1 read memory to MDR» Instruc(on complete, go to IF For an alternative method of designing datapath, see N. Tredennick, Microprocessor Logic Design, the Flowchart Method, Digital Press, 1987. 16 Apr 2013 Computer Architecture@IIT Mandi 10

Control FSM 3 State 0 2 Instr. fetch/ adv. PC Start 1 Instr. decode/reg. fetch/branch addr. R B J Read memory data Write register lw 4 5 Compute memory addr. sw Write memory data 6 7 ALU operation Write register Write PC on branch condition 8 9 Write jump addr. to PC 16 Apr 2013 Computer Architecture@IIT Mandi 11

Control FSM (Controller) 6 inputs (opcode) Combinational logic 16 control outputs Present state Next state Reset Clock FF FF FF FF 16 Apr 2013 Computer Architecture@IIT Mandi 12

Designing the Control FSM Encode states; need 4 bits for 10 states, e.g., State 0 is 0000, state 1 is 0001, and so on. Write a truth table for combina,onal logic: Opcode Present state Control signals Next state 000000 0000 0001000110000100 0001................ Synthesize a logic circuit from the truth table. Connect four flip- flops between the next state outputs and present state inputs. 16 Apr 2013 Computer Architecture@IIT Mandi 13

Block Diagram of a Processor MemWrite MemRead Controller (Control FSM) ALUOp 2-bits ALU control Opcode 6-bits zero Overflow ALUSrcA ALUSrcB 2-bits PCSource 2-bits RegDst RegWrite MemtoReg IorD IRWrite PCWrite PCWriteCond funct. [0,5] ALUOp 3-bits Reset Clock Mem. Addr. Datapath (PC, register file, registers, ALU) Mem. write data Mem. data out 16 Apr 2013 Computer Architecture@IIT Mandi 14

Exceptions or Interrupts Condi,ons under which the processor may produce incorrect result or may hang. Illegal or undefined opcode. Arithme,c overflow, divide by zero, etc. Out of bounds memory address. EPC: 32- bit register holds the affected instruc,on address. Cause: 32- bit register holds an encoded excep,on type. For example, 0 for undefined instruc,on 1 for arithme,c overflow 16 Apr 2013 Computer Architecture@IIT Mandi 15

Implementing Exceptions PCWrite etc.=1 PC MUX out in1 in2 control 26-31 to Control FSM Instr. reg. (IR) 0 1 CauseWrite=1 Cause 32-bit register 8000 0180(hex) ALUSrcA=0 ALUSrcB=01 4 ALU ALU control Subtract PCSource 11 EPCWrite=1 EPC Overflow to Control FSM ALUOp =01 16 Apr 2013 Computer Architecture@IIT Mandi 16

How Long Does It Take? Again Assume control logic is fast and does not affect the cri,cal,ming. Major,me components are ALU, memory read/write, and register read/write. Time for hardware opera,ons, suppose Memory read or write 2ns Register read 1ns ALU opera,on 2ns Register write 1ns 16 Apr 2013 Computer Architecture@IIT Mandi 17

Single-Cycle Datapath R- type 6ns Load word (I- type) 8ns Store word (I- type) 7ns Branch on equal (I- type) 5ns Jump (J- type) 2ns Clock cycle,me = 8ns Each instruc,on takes one cycle 16 Apr 2013 Computer Architecture@IIT Mandi 18

Multicycle Datapath Clock cycle,me is determined by the longest opera,on, ALU or memory: Clock cycle,me = 2ns Cycles per instruc,on (CPI): lw 5 (10ns) sw 4 (8ns) R- type 4 (8ns) beq 3 (6ns) j 3 (6ns) 16 Apr 2013 Computer Architecture@IIT Mandi 19

CPI of a Computer k (Instructions of type k) CPI k CPI = k (instructions of type k) where CPI k = Cycles for instruction of type k Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs. 16 Apr 2013 Computer Architecture@IIT Mandi 20

Example Consider a program containing: loads 25% stores 10% branches 11% jumps 2% Arithme,c 52% CPI = 0.25 5 + 0.10 4 + 0.11 3 + 0.02 3 + 0.52 4 = 4.12 for mul,cycle datapath CPI = 1.00 for single- cycle datapath 16 Apr 2013 Computer Architecture@IIT Mandi 21

Multicycle vs. Single-Cycle Performance ratio = Single cycle time / Multicycle time (CPI cycle time) for single-cycle = (CPI cycle time) for multicycle 1.00 8ns = = 0.97 4.12 2ns Single cycle is faster in this case, but remember, performance ratio depends on the instruction mix. 16 Apr 2013 Computer Architecture@IIT Mandi 22

Traffic Flow 16 Apr 2013 Computer Architecture@IIT Mandi 23

ILP: Instruction Level Parallelism Single- cycle and mul,- cycle datapaths execute one instruc,on at a,me. How can we get beser performance? Answer: Execute mul,ple instruc,on at a,me: Pipelining Enhance a mul,- cycle datapath to fetch one instruc,on every cycle. Parallelism Fetch mul,ple instruc,ons every cycle. 16 Apr 2013 Computer Architecture@IIT Mandi 24

Thank You 16 Apr 2013 Computer Architecture@IIT Mandi 25