CPU Design Steps. EECC550 - Shaaban

Similar documents
Major CPU Design Steps

Designing a Multicycle Processor

CS 152 Computer Architecture and Engineering. Lecture 10: Designing a Multicycle Processor

CPU Organization (Design)

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

361 datapath.1. Computer Architecture EECS 361 Lecture 8: Designing a Single Cycle Datapath

CS152 Computer Architecture and Engineering. Lecture 8 Multicycle Design and Microcode John Lazzaro (

COMP303 Computer Architecture Lecture 9. Single Cycle Control

CS3350B Computer Architecture Winter Lecture 5.7: Single-Cycle CPU: Datapath Control (Part 2)

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath

Recap: A Single Cycle Datapath. CS 152 Computer Architecture and Engineering Lecture 8. Single-Cycle (Con t) Designing a Multicycle Processor

The Big Picture: Where are We Now? EEM 486: Computer Architecture. Lecture 3. Designing a Single Cycle Datapath

Lecture #17: CPU Design II Control

Initial Representation Finite State Diagram Microprogram. Sequencing Control Explicit Next State Microprogram counter

CS61C : Machine Structures

MIPS-Lite Single-Cycle Control

CpE 442. Designing a Multiple Cycle Controller

ECE468 Computer Organization and Architecture. Designing a Single Cycle Datapath

361 multipath..1. EECS 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor

361 control.1. EECS 361 Computer Architecture Lecture 9: Designing Single Cycle Control

CS 110 Computer Architecture Single-Cycle CPU Datapath & Control

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

The Processor: Datapath & Control

CS 61C: Great Ideas in Computer Architecture. MIPS CPU Datapath, Control Introduction

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions

Outline. EEL-4713 Computer Architecture Designing a Single Cycle Datapath

Midterm I October 6, 1999 CS152 Computer Architecture and Engineering

CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

Midterm I March 12, 2003 CS152 Computer Architecture and Engineering

Working on the Pipeline

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

CPU Organization Datapath Design:

CS 61C: Great Ideas in Computer Architecture Control and Pipelining

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath & Control Part 2

UC Berkeley CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 28: Single- Cycle CPU Datapath Control Part 1

CS61C : Machine Structures

ECE 361 Computer Architecture Lecture 11: Designing a Multiple Cycle Controller. Review of a Multiple Cycle Implementation

EEM 486: Computer Architecture. Lecture 3. Designing Single Cycle Control

Adding Support for jal to Single Cycle Datapath (For More Practice Exercise 5.20)

What do we have so far? Multi-Cycle Datapath (Textbook Version)

ECE468 Computer Organization and Architecture. Designing a Multiple Cycle Controller

Ch 5: Designing a Single Cycle Datapath

University of California College of Engineering Computer Science Division -EECS. CS 152 Midterm I

Single Cycle CPU Design. Mehran Rezaei

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

UC Berkeley CS61C : Machine Structures

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

CPU Organization Datapath Design:

ECE170 Computer Architecture. Single Cycle Control. Review: 3b: Add & Subtract. Review: 3e: Store Operations. Review: 3d: Load Operations

Midterm I March 1, 2001 CS152 Computer Architecture and Engineering

CSE 141 Computer Architecture Summer Session Lecture 3 ALU Part 2 Single Cycle CPU Part 1. Pramod V. Argade

Lecture 6 Datapath and Controller

Chapter 4. The Processor. Computer Architecture and IC Design Lab

CS359: Computer Architecture. The Processor (A) Yanyan Shen Department of Computer Science and Engineering

Review. N-bit adder-subtractor done using N 1- bit adders with XOR gates on input. Lecture #19 Designing a Single-Cycle CPU

Midterm I March 3, 1999 CS152 Computer Architecture and Engineering

ECE 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor

CPU Organization Datapath Design:

ECE4680 Computer Organization and Architecture. Designing a Pipeline Processor

Outline of today s lecture. EEL-4713 Computer Architecture Designing a Multiple-Cycle Processor. What s wrong with our CPI=1 processor?

CS61C : Machine Structures

Midterm I March 12, 2003 CS152 Computer Architecture and Engineering

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I ! Nasty new windows vulnerability!

CS 61C: Great Ideas in Computer Architecture Lecture 12: Single- Cycle CPU, Datapath & Control Part 2

Midterm I SOLUTIONS October 6, 1999 CS152 Computer Architecture and Engineering

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath Control Part 1

Initial Representation Finite State Diagram. Logic Representation Logic Equations

Processor (I) - datapath & control. Hwansoo Han

Systems Architecture

CS152 Computer Architecture and Engineering Lecture 13: Microprogramming and Exceptions. Review of a Multiple Cycle Implementation

CENG 3420 Lecture 06: Datapath

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

Instructor: Randy H. Katz hcp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #18. Warehouse Scale Computer

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 19 CPU Design: The Single-Cycle II & Control !

Lecture 12: Single-Cycle Control Unit. Spring 2018 Jason Tang

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

Review: Abstract Implementation View

Systems Architecture I

CS3350B Computer Architecture Quiz 3 March 15, 2018

CPSC614: Computer Architecture

ECE4680. Computer Organization and Architecture. Designing a Multiple Cycle Processor

Midterm Questions Overview

How to design a controller to produce signals to control the datapath

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

Topic #6. Processor Design

Chapter 4. The Processor

ECE468. Computer Organization and Architecture. Designing a Multiple Cycle Processor

Laboratory 5 Processor Datapath

CS 152 Computer Architecture and Engineering. Lecture 12: Multicycle Controller Design

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

Chapter 5: The Processor: Datapath and Control

CPE 335. Basic MIPS Architecture Part II

ECE 30, Lab #8 Spring 2014

Transcription:

CPU Design Steps 1. Analyze instruction set operations using independent RTN => datapath requirements. 2. Select set of datapath components & establish clock methodology. 3. Assemble datapath meeting the requirements. 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. 5. Assemble the control logic. #1 Lec # 5 Winter 2000 12-20-2000

CPU Design & Implantation Process Bottom-up Design: Assemble components in target technology to establish critical timing. Top-down Design: Specify component behavior from high-level requirements. Iterative refinement: Establish a partial solution, expand and improve. Instruction Set Architecture => processor datapath control Reg. File Mux ALU Reg Mem Decoder Sequencer Cells Gates #2 Lec # 5 Winter 2000 12-20-2000

Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle Inst Memory Adr <0:15> <11:15> <16:20> <21:25> Rs Rt Rd Imm16 Instruction<31:0> imm16 4 PC Ext Adder Adder npc_sel PC Mux Clk 00 RegDst RegWr busw Clk 1 imm16 Rd 0 Rt 5 5 Rs 5 Rw Ra Rb -bit Registers 16 Rt busa busb Extender Equal 0 Mux 1 ALUctr = ALU MemWr WrEn Adr Data In Data Memory Clk MemtoReg 0 Mux 1 ExtOp ALUSrc #3 Lec # 5 Winter 2000 12-20-2000

Drawback of Single Cycle Processor Long cycle time. All instructions must take as much time as the slowest: Cycle time for load is longer than needed for all other instructions. Real memory is not as well-behaved as idealized memory Cannot always complete data access in one (short) cycle. #4 Lec # 5 Winter 2000 12-20-2000

Abstract View of Single Cycle CPU op Main Control fun ALU control Next PC PC npc_sel Equal ExtOp ALUSrc ALUctr MemRd MemWr RegDst RegWr Instruction Fetch MemWr Register Fetch Ext ALU Mem Access Reg. Wrt Data Mem Result Store #5 Lec # 5 Winter 2000 12-20-2000

Single Cycle Instruction Timing Arithmetic & Logical PC Inst Memory Reg File mux ALU mux setup Load PC Inst Memory Reg File mux ALU Data Mem mux Critical Path Store PC Inst Memory Reg File mux ALU Data Mem Branch PC Inst Memory Reg File cmp mux setup #6 Lec # 5 Winter 2000 12-20-2000

Reducing Cycle Time: Multi-Cycle Design Cut combinational dependency graph by inserting registers / latches. The same work is done in two or more fast cycles, rather than one slow cycle. storage element storage element Acyclic Combinational Logic => Acyclic Combinational Logic (A) storage element storage element Acyclic Combinational Logic (B) storage element #7 Lec # 5 Winter 2000 12-20-2000

Clock Cycle Time & Critical Path Clk............ Critical path: the slowest path between any two storage devices Cycle time is a function of the critical path must be greater than: Clock-to-Q + Longest Path through the Combination Logic + Setup #8 Lec # 5 Winter 2000 12-20-2000

Instruction Processing Cycles Instruction Fetch Next Instruction Instruction Decode Obtain instruction from program storage Update program counter to address of next instruction Determine instruction type Obtain operands from registers }Common steps for all instructions Execute Compute result value or status Result Store Store result in register/memory if needed (usually called Write Back). #9 Lec # 5 Winter 2000 12-20-2000

Partitioning The Single Cycle Datapath Add registers between smallest steps Next PC PC Instruction Fetch npc_sel Operand Fetch ExtOp ALUSrc ALUctr MemRd MemWr RegDst RegWr MemWr Exec Mem Access Reg. File Data Mem Result Store #10 Lec # 5 Winter 2000 12-20-2000

npc_sel Next PC PC Example Multi-cycle Datapath Instruction Fetch IR Reg File Operand Fetch A B ExtOp ALUSrc ALUctr Ext ALU R MemRd MemWr Mem Access M MemToReg Data Mem RegDst RegWr Reg. File Registers added: IR: Instruction register A, B: Two registers to hold operands read from register file. R: or ALUOut, holds the output of the ALU M: or Memory data register (MDR) to hold data read from data memory Result Store Equal #11 Lec # 5 Winter 2000 12-20-2000

Operations In Each Cycle R-Type Logic Immediate Load Store Branch Instruction Fetch IR Mem[PC] IR Mem[PC] IR Mem[PC] IR Mem[PC] IR Mem[PC] Instruction Decode A R[rs] B R[rt] A R[rs] A R[rs] A R[rs] B R[rt] A R[rs] B R[rt] If Equal = 1 Execution R A + B R A OR ZeroExt[imm16] R A + SignEx(Im16) R A + SignEx(Im16) + (SignExt(imm16) x4) else Memory M Mem[R] Mem[R] B Write Back R[rd] R R[rt] R R[rd] M #12 Lec # 5 Winter 2000 12-20-2000

Finite State Machine (FSM) Control Model State specifies control points for Register Transfer. Transfer occurs upon exiting state (same falling edge). inputs (conditions) Next State Logic Control State State X Register Transfer Control Points Depends on Input Output Logic outputs (control points) #13 Lec # 5 Winter 2000 12-20-2000

Control Specification For Multi-cycle CPU Finite State Machine (FSM) IR MEM[PC] instruction fetch A R[rs] B R[rt] decode / operand fetch Execute R-type R A fun B ORi R A or ZX LW R A + SX SW R A + SX BEQ & ~Equal BEQ & Equal PC PC + SX 00 Memory M MEM[R] MEM[R] B To instruction fetch R[rd] R R[rt] R R[rt] M Write-back To instruction fetch To instruction fetch #14 Lec # 5 Winter 2000 12-20-2000

Traditional FSM Controller state op cond next state control points Truth or Transition Table Equal 6 11 next State control points datapath State op 4 State To datapath #15 Lec # 5 Winter 2000 12-20-2000

Traditional FSM Controller datapath + state diagram => control Translate RTN statements into control points. Assign states. Implement the controller. #16 Lec # 5 Winter 2000 12-20-2000

Mapping RTNs To Control Points Examples & State Assignments IR MEM[PC] instruction fetch 0000 imem_rd, IRen Execute ALUfun, Sen R-type R A fun B 0100 Aen, Ben ORi R A or ZX 0110 A R[rs] B R[rt] 0001 LW R A + SX 1000 SW decode / operand fetch R A + SX 1011 BEQ & ~Equal 0011 BEQ & Equal PC PC + SX 00 0010 Memory RegDst, RegWr, PCen M MEM[S] 1001 MEM[S] B 1100 To instruction fetch state 0000 R[rd] R 0101 R[rt] R 0111 R[rt] M 1010 Write-back To instruction fetch state 0000 To instruction fetch state 0000 #17 Lec # 5 Winter 2000 12-20-2000

BEQ R ORI LW SW Detailed Control Specification State Op field Eq Next IR PC Ops Exec Mem Write-Back en sel A B Ex Sr ALU S R W M M-R Wr Dst 0000??????? 0001 1 0001 BEQ 0 0011 1 1 0001 BEQ 1 0010 1 1 0001 R-type x 0100 1 1 0001 ori x 0110 1 1 0001 LW x 1000 1 1 0001 SW x 1011 1 1 0010 xxxxxx x 0000 1 1 0011 xxxxxx x 0000 1 0 0100 xxxxxx x 0101 0 1 fun 1 0101 xxxxxx x 0000 1 0 0 1 1 0110 xxxxxx x 0111 0 0 or 1 0111 xxxxxx x 0000 1 0 0 1 0 1000 xxxxxx x 1001 1 0 add 1 1001 xxxxxx x 1010 1 0 0 1010 xxxxxx x 0000 1 0 1 1 0 1011 xxxxxx x 1100 1 0 add 1 1100 xxxxxx x 0000 1 0 0 1 #18 Lec # 5 Winter 2000 12-20-2000

Alternative Multiple Cycle Datapath (In Textbook) Miminizes Hardware: 1 memory, 1 adder PCWr PC IorD 0 Mux 1 PCWrCond PCSrc BrWr Zero MemWr RAdr Ideal Memory WrAdr Din Dout IRWr Instruction Reg RegDst Rs Rt Rt 0 Mux 5 5 Rd 1 1 Mux 0 RegWr Ra Rb busa Reg File Rw buswbusb << 2 ALUSelA 4 0 Mux 1 0 1 2 3 1 Mux 0 ALU ALU Control Zero Target ALU Out Imm 16 ExtOp Extend MemtoReg ALUSelB ALUOp #19 Lec # 5 Winter 2000 12-20-2000

Alternative Multiple Cycle Datapath (In Textbook) Shared instruction/data memory unit A single ALU shared among instructions Shared units require additional or widened multiplexors Temporary registers to hold data between clock cycles of the instruction: Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut #20 Lec # 5 Winter 2000 12-20-2000

Operations In Each Cycle R-Type Logic Immediate Load Store Branch Instruction Fetch IR Mem[PC] IR Mem[PC] IR Mem[PC] IR Mem[PC] IR Mem[PC] Instruction Decode A R[rs] B R[rt] ALUout PC + (SignExt(imm16) x4) A R[rs] B R[rt] ALUout PC + (SignExt(imm16) x4) A R[rs] B R[rt] ALUout PC + (SignExt(imm16) x4) A R[rs] B R[rt] ALUout PC + (SignExt(imm16) x4) A B R[rs] R[rt] ALUout PC + (SignExt(imm16) x4) Execution ALUout A + B ALUout A OR ZeroExt[imm16] ALUout A + SignEx(Im16) ALUout A + SignEx(Im16) If Equal = 1 PC ALUout Memory M Mem[ALUout] Mem[ALUout] B Write Back R[rd] ALUout R[rt] ALUout R[rd] Mem #21 Lec # 5 Winter 2000 12-20-2000

High-Level View of Finite State Machine Control First steps are independent of the instruction class Then a series of sequences that depend on the instruction opcode Then the control returns to fetch a new instruction. Each box above represents one or several state. #22 Lec # 5 Winter 2000 12-20-2000

Instruction Fetch and Decode FSM States #23 Lec # 5 Winter 2000 12-20-2000

Load/Store Instructions FSM States #24 Lec # 5 Winter 2000 12-20-2000

R-Type Instructions FSM States #25 Lec # 5 Winter 2000 12-20-2000

Branch Instruction Single State Jump Instruction Single State #26 Lec # 5 Winter 2000 12-20-2000

#27 Lec # 5 Winter 2000 12-20-2000

Finite State Machine (FSM) Specification IR MEM[PC] PC PC + 4 0000 instruction fetch A R[rs] B R[rt] ALUout PC +SX 0001 decode Execute R-type ALUout A fun B 0100 ORi ALUout A op ZX 0110 LW ALUout A + SX 1000 SW ALUout A + SX 1011 BEQ If A = B then PC ALUout 0010 Memory R[rd] ALUout 0101 To instruction fetch R[rt] ALUout 0111 M MEM[ALUout] 1001 R[rt] M 1010 To instruction fetch MEM[ALUout] B 1100 To instruction fetch #28 Lec # 5 Winter 2000 12-20-2000 Write-back

MIPS Multi-cycle Datapath Performance Evaluation What is the average CPI? State diagram gives CPI for each instruction type Workload below gives frequency of each type Type CPI i for type Frequency CPI i x freqi i Arith/Logic 4 40% 1.6 Load 5 30% 1.5 Store 4 10% 0.4 branch 3 20% 0.6 Average CPI: 4.1 Better than CPI = 5 if all instructions took the same number of clock cycles (5). #29 Lec # 5 Winter 2000 12-20-2000