ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

ELEC 5200/6200 Computer Architecture and Design, Spring 2017. Lecture 4: Datapath and Control. Ujjwal Guin, Assistant Professor, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849. http://www.auburn.edu/~uzg5/ Adapted from Dr. Chen-Huan Chiang (Intel) and Prof. Vishwani D. Agrawal (Auburn University). [Adapted from Computer Organization and Design, Patterson & Hennessy, 2014] 2/6/2017 ELEC 5200-001/6200-001 Lecture 4

Von Neumann Kitchen [Figure: a processor (ALU, control, registers, PC) connected to memory (program, data), input, and output.]

Where Does It All Begin? In a register called the program counter (PC). The PC contains the memory address of the next instruction to be executed. In the beginning, the PC contains the address of the memory location where the program begins.

Where is the Program? [Figure: the program counter (a register in the processor) holds the start address of the program's machine code in memory.]

How Does It Run?
Start: the PC has the memory address where the program begins.
Fetch the instruction word from the memory address in the PC and increment the PC (PC + 4) to point to the next instruction.
Decode the instruction.
Execute the instruction.
Save the result in a register or memory.
Program complete? If no, go back to fetch. If yes, STOP.
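
The flowchart above can be sketched as a short loop. This is only an illustration under stated assumptions: the tuple-based instruction encoding, the `halt` marker, and the name `run` are hypothetical stand-ins for real 32-bit machine words and a real stop condition.

```python
def run(program, start=0):
    """Fetch-decode-execute loop over a toy program.

    `program` maps byte addresses to (op, *operands) tuples. This encoding
    is hypothetical and stands in for 32-bit machine words.
    """
    regs = [0] * 32          # register file
    pc = start               # PC starts at the address where the program begins
    while True:
        instr = program[pc]  # fetch the instruction at the address in the PC
        pc += 4              # increment the PC to point to the next instruction
        op, *args = instr    # decode
        if op == "halt":     # program complete? yes -> stop
            return regs, pc
        if op == "addi":     # execute and save the result in a register
            rt, rs, imm = args
            regs[rt] = regs[rs] + imm
        elif op == "add":
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt]
        else:
            raise ValueError(f"unknown op: {op}")


program = {
    0: ("addi", 8, 0, 5),    # register 8 = 5
    4: ("addi", 9, 0, 7),    # register 9 = 7
    8: ("add", 10, 8, 9),    # register 10 = register 8 + register 9
    12: ("halt",),
}
regs, final_pc = run(program)
```

The PC is incremented right after the fetch, before the instruction executes, which mirrors the order in the flowchart.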

Datapath and Control
Datapath: memory, registers, adders, ALU, and communication buses. Each step (fetch, decode, execute, save result) requires communication (data transfer) paths between memory, registers, and ALU.
Control: the datapath for each step is set up by control signals that set dataflow directions on the communication buses and select ALU and memory functions. Control signals are generated by a control unit consisting of one or more finite-state machines.

Single-Cycle Processor Simplified MIPS - Datapath

Abstract View of MIPS [Figure: PC, instruction memory, register file (three register numbers, data in and out), ALU, data memory, and an adder that computes PC + 4.]

Instruction Fetch. Read the instruction from Instruction Memory. Update the PC for the next instruction (PC + 4). [Figure: the PC drives the Instruction Memory read address; an adder adds 4 to the PC.]

Register File: A Datapath Component [Figure: register file with two 5-bit read register inputs (Read Reg 1, Read Reg 2), a 5-bit Write register input, a 32-bit Write Data input, two 32-bit read data outputs (Read Data 1, Read Data 2), and the RegWrite control signal.]

Instruction Decode
R-Type: 6-bit opcode and 6-bit funct go to the Control Unit; read two registers (rs and rt).
I-Type: 6-bit opcode goes to the Control Unit; read one register (rs).
J-Type?
[Figure: the instruction feeds the Control Unit and the Register File (Read Reg 1, Read Reg 2, Write Reg, Write Data, Read Data 1, Read Data 2).]

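Execute: R-Type
Instruction fields: bits 31-26 opcode, 25-21 rs, 20-16 rt, 15-11 rd, 10-6 shamt, 5-0 funct.
[Figure: the instruction selects Read Reg 1, Read Reg 2, and Write Reg in the Register File; Read Data 1 and Read Data 2 feed the ALU; the ALU result returns as Write Data. Control signals: RegWrite and ALU Operation. The ALU also produces zero.]
Why RegWrite?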
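
The R-type field boundaries listed above translate directly into shifts and masks. The following sketch assumes the instruction is held as a Python integer; the function name `decode_r_type` is hypothetical.

```python
def decode_r_type(instr):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31-26
        "rs":     (instr >> 21) & 0x1F,  # bits 25-21
        "rt":     (instr >> 16) & 0x1F,  # bits 20-16
        "rd":     (instr >> 11) & 0x1F,  # bits 15-11
        "shamt":  (instr >> 6) & 0x1F,   # bits 10-6
        "funct":  instr & 0x3F,          # bits 5-0
    }


# 0x01095020 encodes add with rd = 10, rs = 8, rt = 9 (funct 0x20).
fields = decode_r_type(0x01095020)
```

In hardware these are just wire selections from the instruction bus, so decoding the register numbers costs no logic delay.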

Execute: Load/Store
Instruction fields: bits 31-26 opcode, 25-21 rs, 20-16 rt, 15-0 16-bit address (offset).
lw $rt, offset($rs)
sw $rt, offset($rs)
[Figure: Register File, Sign-extend (16 to 32 bits), ALU, and Data Memory. Control signals: RegWrite, ALU operation, MemRead, MemWrite.]
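
The address used by lw and sw is the base register plus the sign-extended 16-bit offset. A minimal sketch of that calculation, with hypothetical helper names `sign_extend16` and `effective_address`:

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    """Base register value plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & MASK32
```

For example, an offset field of 0xFFFC is -4, so `effective_address(0x1000, 0xFFFC)` gives 0x0FFC.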

Execute: Branch
bne $t0, $t1, Label
beq $t0, $t1, Label
[Figure: one adder computes PC + 4; a second adder adds the sign-extended (16 to 32 bits) offset, shifted left by 2, to produce the branch target address. The ALU compares Read Data 1 and Read Data 2 and sends zero to the branch control logic.]

Execute: Jump. The jump operation updates the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits (converting the word address to a byte address). Instruction format: op, 26-bit address. [Figure: Instr[25-0] is shifted left by 2 (26 to 28 bits) and combined with the 4 MSBs of PC+4 to form the jump address.]
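
The jump address construction can be written out in a few lines. This sketch assumes the PC and the instruction are Python integers; the name `jump_target` is hypothetical.

```python
def jump_target(pc, instr):
    """Form the 32-bit jump address from PC + 4 and the 26-bit address field."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    addr26 = instr & 0x03FFFFFF          # lower 26 bits of the instruction
    low28 = addr26 << 2                  # word address -> byte address
    return (pc_plus_4 & 0xF0000000) | low28  # keep the 4 MSBs of PC + 4


# A jump (opcode 000010) with address field 0x10, fetched at 0x40000000.
target = jump_target(0x40000000, (0b000010 << 26) | 0x10)
```

Because only 28 bits come from the instruction, a jump can only reach targets that share the top 4 bits of PC + 4.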

Assembling Datapath
Assemble the datapath segments. Add control lines and multiplexors as needed.
Single-cycle design: fetch, decode, and execute each instruction all in one clock cycle.
No datapath resource can be used more than once per instruction, so resources must be duplicated if needed (e.g., separate Instruction Memory and Data Memory, several adders).
Multiplexors are needed at the input of shared elements, with control lines to do the selection.
Write signals control writing to the Register File and Data Memory.
Cycle time is determined by the length of the longest path.

Datapath (Except Jump) [Figure: single-cycle datapath with PC, Instruction Memory, Register File, ALU, Data Memory, Sign Extend (16 to 32 bits), Shift left 2, and two adders (PC + 4 and branch target). The Control Unit takes Instr[31-26] and drives RegWrite, ALUSrc, PCSrc, MemRead, MemWrite, MemtoReg, RegDst, and ALUOp; the ALU control takes Instr[5-0]. Register numbers come from Instr[25-21], Instr[20-16], and Instr[15-11]; the immediate comes from Instr[15-0].]

Datapath and Control (Except Jump) [Figure: the single-cycle datapath with its Control Unit. The Control Unit takes Instr[31-26] and drives RegDst, Branch, RegWrite, ALUSrc, MemRead, MemWrite, MemtoReg, and ALUOp; Branch and the ALU zero output determine PCSrc. The ALU control takes Instr[5-0].]

Arithmetic Logic Unit (ALU)
Operation select  ALU function
0000              AND
0001              OR
0010              Add
0110              Subtract
0111              Set on less than
1100              NOR
[Figure: ALU with a 4-bit operation select from control and outputs zero, overflow, and result.]
zero = 1 when all bits of result are 0.
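
A behavioral sketch of this ALU, using the operation select codes from the table. It models only the result and zero outputs (overflow is left out), and the names `alu` and `to_signed` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return (result, zero) for a 4-bit operation select and 32-bit inputs."""
    a &= MASK32
    b &= MASK32
    if op == 0b0000:
        result = a & b                                   # AND
    elif op == 0b0001:
        result = a | b                                   # OR
    elif op == 0b0010:
        result = (a + b) & MASK32                        # add
    elif op == 0b0110:
        result = (a - b) & MASK32                        # subtract
    elif op == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0  # set on less than
    elif op == 0b1100:
        result = ~(a | b) & MASK32                       # NOR
    else:
        raise ValueError(f"unsupported operation select: {op:04b}")
    return result, int(result == 0)  # zero = 1 when all result bits are 0
```

For example, `alu(0b0110, 5, 5)` returns `(0, 1)`: the subtraction result is 0, so zero is asserted, which is exactly what beq relies on.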

Building a 32-bit ALU [Figure]

1-Bit ALU: AND, OR, ADD, SUB, NOR [Figure]

ALU: slt. slt produces a 1 if rs < rt and 0 otherwise. Use subtraction: (a - b) < 0 implies a < b.
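
The subtraction trick amounts to taking the sign bit of a - b. A minimal sketch (the name `slt_via_subtract` is hypothetical); note that the sign bit alone gives the wrong answer when the subtraction overflows, so a complete ALU must also take overflow into account.

```python
def slt_via_subtract(a, b):
    """Return the sign bit of the 32-bit difference a - b.

    Equals (a < b) for signed operands as long as a - b does not overflow.
    """
    diff = (a - b) & 0xFFFFFFFF
    return (diff >> 31) & 1
```

For instance, `slt_via_subtract(3, 5)` is 1 and `slt_via_subtract(5, 3)` is 0.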

ALU: Branch [Figure]

ALU Control
ALU control lines  Function
0000               AND
0001               OR
0010               add
0110               subtract
0111               set on less than
1100               NOR

Single-Cycle Processor Simplified MIPS - Control

Datapath and Control (Except Jump)
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1

ALU Control
Load and store word instructions: the ALU computes the target memory address by addition (base address + displacement, i.e. base register + sign_ext(imm16)).
R-type instructions: the ALU performs one of the following 5 actions depending on the value of the 6-bit funct field: AND, OR, subtract, add, set on less than.
Branch: the ALU performs a subtraction, and the ZERO output is checked.
A 2-bit ALUop, derived from the opcode (Instr[31:26]), distinguishes these 3 types of instructions: lw/sw (00), beq (01), R-type (10). Note that the binary encoding (11) is not used.

Recall: ALU Control Inputs. 4 bits are required for the ALU control inputs, ALUctr. Remember this in ALU design? 0000 = and, 0001 = or, 0010 = add, 0110 = subtract, 0111 = slt, 1100 = NOR. [Figure: Main Control takes the 6-bit op and produces the 2-bit ALUop; ALU Control takes ALUop and the 6-bit funct and produces the 4-bit ALUctr, which goes to the ALU.]

What's in the box? [Figure: Main Control (6-bit op in, 2-bit ALUop out) feeding ALU Control (ALUop and 6-bit funct in, 4-bit ALUctr out, to the ALU).]

Opcode        ALUOp  Operation         Function code  Desired ALU action  ALU control input
LW            00     Load word         xxxxxx         add                 0010
SW            00     Store word        xxxxxx         add                 0010
Branch equal  01     Branch equal      xxxxxx         subtract            0110
R-type        10     Addition          100000         add                 0010
R-type        10     Subtraction       100010         subtract            0110
R-type        10     AND               100100         and                 0000
R-type        10     OR                100101         or                  0001
R-type        10     Set-on-less-than  101010         set-on-less-than    0111

ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  ALU control input
0       0       X   X   X   X   X   X   0010
X       1       X   X   X   X   X   X   0110
1       X       X   X   0   0   0   0   0010
1       X       X   X   0   0   1   0   0110
1       X       X   X   0   1   0   0   0000
1       X       X   X   0   1   0   1   0001
1       X       X   X   1   0   1   0   0111
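
The ALU control truth table, including its don't-care entries, can be sketched as a small function. The name `alu_control` is hypothetical; the behavior for function codes outside the table is an assumption (the sketch raises an error).

```python
def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp and 6-bit funct field to the 4-bit ALU control input."""
    if alu_op == 0b00:           # lw/sw: add, funct is don't-care
        return 0b0010
    if alu_op & 0b01:            # ALUOp0 = 1 (beq): subtract, funct is don't-care
        return 0b0110
    # ALUOp1 = 1 (R-type): only F3-F0 matter, F5 and F4 are don't-care.
    by_low_funct = {
        0b0000: 0b0010,  # add
        0b0010: 0b0110,  # subtract
        0b0100: 0b0000,  # AND
        0b0101: 0b0001,  # OR
        0b1010: 0b0111,  # set on less than
    }
    low = funct & 0b1111
    if low not in by_low_funct:
        raise ValueError(f"function code not in the table: {funct:06b}")
    return by_low_funct[low]
```

Splitting control into a main unit and this small ALU control keeps the main control independent of the funct field.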

Control Unit [Figure: the single-cycle datapath and control diagram. The Control Unit takes Instr[31-26] and drives RegDst, Branch, RegWrite, ALUSrc, MemRead, MemWrite, MemtoReg, and ALUOp; the ALU control takes Instr[5-0].]

R-Type Instructions: add $x, $y, $z
R-type format (bits 31-26, 25-21, 20-16, 15-11, 10-6, 5-0): op, rs, rt, rd, shamt, funct.
Instruction Fetch (IF): An instruction is fetched from the instruction memory and the PC is incremented.
Instruction Decode (ID): Two registers, $y and $z, are read from the register file.
Execution (EX): The ALU operates on the data read from the register file, using the function code (bits 5-0 of the instruction) to generate the ALU function.
Write Back (WB): The result from the ALU is written into the register file, using bits 15-11 of the instruction to select the destination register ($x).

R-Type Instructions: add $x, $y, $z [Figure: the single-cycle datapath and control diagram for this instruction.]

I-Type: Load, lw $x, offset($y)
I-type format (bits 31-26, 25-21, 20-16, 15-0): op, rs, rt, offset.
Instruction Fetch (IF): An instruction is fetched from the instruction memory and the PC is incremented.
Instruction Decode (ID): A register ($y) value is read from the register file.
Address Calculation (EX): The ALU computes the sum of the value read from the register file and the sign-extended lower 16 bits of the instruction (offset).
Memory Operation (MEM): The sum from the ALU is used as the address for the data memory.
Write Back (WB): The data from the memory unit is written into the register file; the register destination is given by bits 20-16 of the instruction ($x).

I-Type: Load, lw $x, offset($y) [Figure: the single-cycle datapath and control diagram for this instruction.]

I-Type: Branch, beq $x, $y, offset
I-type format (bits 31-26, 25-21, 20-16, 15-0): op, rs, rt, offset.
Instruction Fetch (IF): An instruction is fetched from the instruction memory and the PC is incremented.
Instruction Decode (ID): Two registers, $x and $y, are read from the register file.
Branch Address Calculation (EX): The ALU performs a subtract on the data values read from the register file. The value of PC + 4 is added to the sign-extended lower 16 bits of the instruction (offset), shifted left by 2; the result is the branch target address.
Branch Decision: The Zero result from the ALU is used to decide which adder result to store into the PC.
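
The branch steps can be sketched as a next-PC function. It applies the shift left by 2 shown in the datapath diagrams before adding the offset to PC + 4; the names `next_pc_beq` and `sign_extend16` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, imm16, rs_val, rt_val):
    """Return the next PC for beq given the two register values."""
    pc_plus_4 = (pc + 4) & MASK32                       # first adder
    target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32  # second adder
    zero = ((rs_val - rt_val) & MASK32) == 0            # ALU subtract, Zero output
    return target if zero else pc_plus_4                # PCSrc selection
```

For example, a taken branch at 0x100 with offset 3 goes to 0x104 + 12 = 0x110, while a not-taken branch simply continues at 0x104.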

I-Type: beq, beq $x, $y, offset [Figure: the single-cycle datapath and control diagram for this instruction.]

Control Signals
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
[Figure: the Control Unit takes opcode bits op[0] through op[5] as inputs and produces the outputs RegDst, ALUSrc, and so on through ALUOp1 and ALUOp0.]
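
Since the main control is purely combinational, it can be sketched as a lookup from opcode to signal values. The opcode values below are the standard MIPS encodings, which the slide does not list; the names `CONTROL`, `OPCODES`, and `main_control` are hypothetical, and `None` stands for a don't-care (X).

```python
X = None  # don't-care

CONTROL = {
    "R-format": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                     MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    "lw":       dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                     MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    "sw":       dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                     MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    "beq":      dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                     MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

# Standard MIPS opcodes (an addition, not taken from the slide).
OPCODES = {0b000000: "R-format", 0b100011: "lw", 0b101011: "sw", 0b000100: "beq"}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    return CONTROL[OPCODES[opcode]]
```

sw and beq leave RegDst and MemtoReg as don't-cares because RegWrite is 0 for both, so nothing is written to the register file.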

Adding jump hardware. Instruction format: op, 26-bit address. Note: the 26-bit address is a word address; it must be multiplied by 4 to obtain the byte address, i.e. shifted left by 2. PC[31:28] or PC+4[31:28]? [Figure: the low-order 26 bits of the jump instruction are shifted left by 2 and combined with 4 upper PC bits to form the new 32-bit PC.]

[Figure: the single-cycle datapath and control extended with jump. Instr[25-0] is shifted left by 2 (26 to 28 bits) and combined with the 4 bits labeled PC[31-28] to form the 32-bit jump address; a multiplexor controlled by the Jump signal selects it as the next PC. The Control Unit takes Instr[31-26] and drives Jump, Branch, RegWrite, RegDst, ALUSrc, MemRead, MemWrite, MemtoReg, and ALUOp.]

Limitations
Inefficient clocking: the clock cycle must be timed to accommodate the slowest instruction. This is problematic for more complex instructions like floating-point multiply. [Figure: clock timing over two cycles, lw in Cycle 1 and sw in Cycle 2, with the unused part of the sw cycle marked as waste.]
It may be wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle.
BUT it is simple and easy to understand, especially the design of the main control unit (combinational logic).

Critical Path (Load)
Critical Path = PC's Clk-to-Q + Instruction Memory's Access Time + Register File's Access Time + ALU to Perform a 32-bit Add + Data Memory Access Time + Setup Time for Register File Write + Clock Skew
Register file and ideal memory: the CLK input is a factor ONLY during write operations. During read operations, they behave as combinational logic: address valid => output valid after the access time (i.e. delay).
[Figure: the abstract MIPS datapath (PC, instruction memory, registers, ALU, data memory, PC + 4 adder) with Clk inputs on the PC, register file, and data memory.]

Cycle Time
Arithmetic & Logical:  IF  ID  EXE  WB
Load:                  IF  ID  EXE  MEM  WB   (critical path)
Store:                 IF  ID  EXE  MEM
Branch:                IF  ID  EXE
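
To see why load sets the single-cycle clock period, the per-class paths above can be summed. The stage delays below are assumed values chosen only for illustration; they do not come from the lecture.

```python
# Hypothetical stage delays in picoseconds (assumptions for illustration).
STAGE_DELAY_PS = {"IF": 200, "ID": 100, "EXE": 200, "MEM": 200, "WB": 100}

# Stages used by each instruction class, as listed above.
STAGES = {
    "Arithmetic & Logical": ["IF", "ID", "EXE", "WB"],
    "Load": ["IF", "ID", "EXE", "MEM", "WB"],
    "Store": ["IF", "ID", "EXE", "MEM"],
    "Branch": ["IF", "ID", "EXE"],
}

# Latency of each class is the sum of the delays of the stages it uses.
latency_ps = {
    name: sum(STAGE_DELAY_PS[s] for s in stages)
    for name, stages in STAGES.items()
}

# The single-cycle clock period must cover the slowest class.
slowest = max(latency_ps, key=latency_ps.get)
cycle_time_ps = latency_ps[slowest]
```

Under these assumed delays every instruction pays the full load latency, even a branch that needs far less time. That gap is the waste a multicycle design tries to recover.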

Multicycle Datapath Approach
Let an instruction take more than 1 clock cycle to complete.
Break up instructions into steps, where each step takes a cycle, while trying to balance the amount of work to be done in each step and restrict each cycle to use only one major functional unit.
Not every instruction takes the same number of clock cycles.
In addition to faster clock rates, multicycle allows functional units to be used more than once per instruction, as long as they are used in different clock cycles. Hence:
One memory, but only one memory access per cycle (recall the separate instruction and data memories in the single-cycle processor).
One ALU/adder, but only one ALU operation per cycle (recall one adder for PC+4 and one ALU/adder for others in the single-cycle processor).

Reducing Cycle Time. Cut the combinational dependency graph and insert a register/latch. Do the same work in two fast cycles, rather than one slow one. [Figure: one block of acyclic combinational logic between two storage elements is split into two blocks, (A) and (B), with a storage element inserted between them.]

Multicycle Datapath: Abstract View
At the end of a cycle, all data needed in subsequent clock cycles must be stored in an internal register (not visible to the programmer). All of these registers except IR hold data only between a pair of adjacent clock cycles, so no write control signal is needed for them.
Single memory unit, single ALU, and temporary registers after each major functional unit:
IR: Instruction Register
MDR: Memory Data Register
A, B: register file read data registers
ALUout: ALU output register
[Figure: PC, a single Memory (instructions or data), IR, MDR, Register File, A, B, ALU, and ALUout.]

Next: Pipelining https://www.youtube.com/watch?v=ijarlbd9r3 https://www.youtube.com/watch?v=anxgje6i3g8 https://www.youtube.com/watch?v=5lp4ebfpati