Instructor: Randy H. Katz hcp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #18. Warehouse Scale Computer

Similar documents
The Big Picture: Where are We Now? EEM 486: Computer Architecture. Lecture 3. Designing a Single Cycle Datapath

CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture. MIPS CPU Datapath, Control Introduction

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

361 datapath.1. Computer Architecture EECS 361 Lecture 8: Designing a Single Cycle Datapath

CS3350B Computer Architecture Winter Lecture 5.7: Single-Cycle CPU: Datapath Control (Part 2)

CS 110 Computer Architecture Single-Cycle CPU Datapath & Control

Review. N-bit adder-subtractor done using N 1- bit adders with XOR gates on input. Lecture #19 Designing a Single-Cycle CPU

ECE468 Computer Organization and Architecture. Designing a Single Cycle Datapath

Outline. EEL-4713 Computer Architecture Designing a Single Cycle Datapath

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 28: Single- Cycle CPU Datapath Control Part 1

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath

CS61C : Machine Structures

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 19 CPU Design: The Single-Cycle II & Control !

UC Berkeley CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath & Control Part 2

CS61C : Machine Structures

COMP303 Computer Architecture Lecture 9. Single Cycle Control

CS 61C: Great Ideas in Computer Architecture Lecture 12: Single- Cycle CPU, Datapath & Control Part 2

UC Berkeley CS61C : Machine Structures

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I ! Nasty new windows vulnerability!

Ch 5: Designing a Single Cycle Datapath

Lecture #17: CPU Design II Control

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single Cycle MIPS CPU

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath Control Part 1

EEM 486: Computer Architecture. Lecture 3. Designing Single Cycle Control

CS 61C: Great Ideas in Computer Architecture Control and Pipelining

CPU Organization (Design)

361 control.1. EECS 361 Computer Architecture Lecture 9: Designing Single Cycle Control

ECE170 Computer Architecture. Single Cycle Control. Review: 3b: Add & Subtract. Review: 3e: Store Operations. Review: 3d: Load Operations

CS61C : Machine Structures

CS359: Computer Architecture. The Processor (A) Yanyan Shen Department of Computer Science and Engineering

Working on the Pipeline

MIPS-Lite Single-Cycle Control

If you didn t do as well as you d hoped

How to design a controller to produce signals to control the datapath

CPU Design Steps. EECC550 - Shaaban

Lecture 6 Datapath and Controller

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

University of California College of Engineering Computer Science Division -EECS. CS 152 Midterm I

Major CPU Design Steps

Midterm I March 3, 1999 CS152 Computer Architecture and Engineering

CSE 141 Computer Architecture Summer Session Lecture 3 ALU Part 2 Single Cycle CPU Part 1. Pramod V. Argade

CS152 Computer Architecture and Engineering Lecture 10: Designing a Single Cycle Control. Recap: The MIPS Instruction Formats

Recap: The MIPS Subset ADD and subtract EEL Computer Architecture shamt funct add rd, rs, rt Single-Cycle Control Logic sub rd, rs, rt

Designing a Multicycle Processor

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

The Processor: Datapath & Control

Computer Science 61C Spring Friedland and Weaver. The MIPS Datapath

add rd, rs, rt Review: A Single Cycle Datapath We have everything Lecture Recap: Meaning of the Control Signals

361 multipath..1. EECS 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor

Midterm I October 6, 1999 CS152 Computer Architecture and Engineering

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath & Control Part 2. Clk

ECE 361 Computer Architecture Lecture 11: Designing a Multiple Cycle Controller. Review of a Multiple Cycle Implementation

CS 110 Computer Architecture Review Midterm II

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

Single Cycle CPU Design. Mehran Rezaei

CpE 442. Designing a Multiple Cycle Controller

Recap: A Single Cycle Datapath. CS 152 Computer Architecture and Engineering Lecture 8. Single-Cycle (Con t) Designing a Multicycle Processor

Midterm I March 12, 2003 CS152 Computer Architecture and Engineering

ECE468 Computer Organization and Architecture. Designing a Multiple Cycle Controller

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

ECE 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor

CS 152 Computer Architecture and Engineering. Lecture 10: Designing a Multicycle Processor

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

CS3350B Computer Architecture Quiz 3 March 15, 2018

ECE4680. Computer Organization and Architecture. Designing a Multiple Cycle Processor

ECE468. Computer Organization and Architecture. Designing a Multiple Cycle Processor

CENG 3420 Lecture 06: Datapath

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

ECE4680. Computer Organization and Architecture. Designing a Multiple Cycle Processor

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

Midterm I March 1, 2001 CS152 Computer Architecture and Engineering

Adding Support for jal to Single Cycle Datapath (For More Practice Exercise 5.20)

Lecture 12: Single-Cycle Control Unit. Spring 2018 Jason Tang

The MIPS Processor Datapath

Lecture 7 Pipelining. Peng Liu.

Processor (I) - datapath & control. Hwansoo Han

Guerrilla Session 3: MIPS CPU

Systems Architecture

ECE4680 Computer Organization and Architecture. Designing a Pipeline Processor

Chapter 4. The Processor

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

Final Exam Spring 2017

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

1048: Computer Organization

CS152 Computer Architecture and Engineering. Lecture 8 Multicycle Design and Microcode John Lazzaro (

Fundamentals of Computer Systems

Review: Abstract Implementation View

CPSC614: Computer Architecture

Outline of today s lecture. EEL-4713 Computer Architecture Designing a Multiple-Cycle Processor. What s wrong with our CPI=1 processor?

CPE 335 Computer Organization. Basic MIPS Architecture Part I

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Transcription:

/29/3 CS 6C: Great Ideas in Computer Architecture Building Blocks for Datapaths Instructor: Randy H. Katz hcp://inst.eecs.berkeley.edu/~cs6c/fa3 /27/3 Fall 23 - - Lecture #8 So5ware Parallel Requests Assigned to computer e.g., Search Katz Parallel Threads Assigned to core e.g., Lookup, Ads Parallel InstrucVons > instrucvon @ one Vme e.g., 5 pipelined instrucvons Parallel Data > data item @ one Vme e.g., Add of 4 pairs of words Hardware descripvons All gates @ one Vme Programming Languages You are Here! Harness Parallelism & Achieve High Performance Hardware Warehouse Scale Computer Core Memory Input/Output InstrucVon Unit(s) Cache Memory Core /27/3 Fall 23 - - Lecture #8 2 Computer (Cache) Core FuncVonal Unit(s) A +B A +B A 2 +B 2 A 3 +B 3 Smart Phone Today Logic Gates

/29/3 Machine Interpreta4on Levels of RepresentaVon/ InterpretaVon High Level Language Program (e.g., C) Compiler Assembly Language Program (e.g., MIPS) Assembler Machine Language Program (MIPS) Hardware Architecture DescripCon (e.g., block diagrams) Architecture Implementa4on Logic Circuit DescripCon (Circuit SchemaCc Diagrams) temp = v[k]; v[k] = v[k+]; v[k+] = temp; lw $t, ($2) lw $t, 4($2) sw $t, ($2) sw $t, 4($2) Anything can be represented as a number, i.e., data or instrucvons! /27/3 Fall 23 - - Lecture #8 3 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 4 2

/29/3 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 5 Last Time: SummaVon Circuit Register is used to hold up the transfer of data to adder Rough Vming High () Low () High () Low () High () Low () Time Square wave clock sets when things change Rounded Rectangle per clock means could be or Xi must be ready before clock edge due to adder delay /29/3 Fall 23 - - Lecture #8 6 3

/29/3 Register Internals n instances of a Flip- Flop Flip- flop name because the output flips and flops between and D is data input, Q is data output Also called D- type Flip- Flop /29/3 Fall 23 - - Lecture #8 7 Camera Analogy Timing Terms Want to take a portrait Vming right before and aper taking picture Set up?me don t move since about to take picture (open camera shucer) Hold?me need to hold svll aper shucer opens unvl camera shucer closes Time click to data Vme from open shucer unvl can see image on output (viewfinder) /29/3 Fall 23 - - Lecture #8 8 4

/29/3 Hardware Timing Terms Setup Time: when the input must be stable before the edge of the CLK Hold Time: when the input must be stable a5er the edge of the CLK CLK- to- Q Delay: how long it takes the output to change, measured from the edge of the CLK /29/3 Fall 23 - - Lecture #8 9 FSM Maximum Clock Frequency What is the maximum frequency of this circuit? Hint: Frequency = /Period Max Delay = Setup Time + CLK- to- Q Delay + CL Delay /29/3 Fall 23 - - Lecture #8 5

/29/3 Another Great (Theory) Idea: Finite State Machines (FSM) You may have seen FSMs in other classes (e.g., CS7) Same basic idea FuncVon can be represented with a state transivon diagram With combinavonal logic and registers, any FSM can be implemented in hardware /29/3 Fall 23 - - Lecture #8 Example: 3 Ones FSM FSM to detect the occurrence of 3 consecuvve s in the Input Draw the FSM Assume state transivons are controlled by the clock: On each clock cycle the machine checks the inputs and moves to a new state and produces a new output /29/3 Fall 23 - - Lecture #8 2 6

/29/3 Hardware ImplementaVon of FSM Register needed to hold a representavon of the machine s state. Unique bit pacern for each state. CombinaVonal logic circuit is used to implement a funcvon maps from present state (PS) and input to next state (NS) and output. The register is used to break the feedback path between Next State (NS) and Prior State (PS), controlled by the clock +! =! /29/3 Fall 23 - - Lecture #8 3 Hardware for FSM: CombinaVonal Logic Can look at its funcvonal specificavon, truth table form Truth table PS" Input" NS" Output" " " " " " " " " " " " " " " " " " " " " " " " " /29/3 Fall 23 - - Lecture #8 4 7

/29/3 Hardware for FSM: CombinaVonal Logic Truth table PS" Input" NS" Output" " " " " " " " " " " " " " " " " " " " " " " " " /29/3 Fall 23 - - Lecture #8 5 Hardware for FSM: CombinaVonal Logic AlternaVve Truth Table format: list only cases where value is a.then restate as logic equavons using PS, PS, Input Truth table PS" Input" NS" Output" " " " " " " " " " " " " " " " " " " " " " " " " /29/3 Fall 23 - - Lecture #8 6 8

/29/3 Truth table PS" " " " " " " Input" " " " " " " NS" " " " " " " Hardware for FSM: CombinaVonal Logic AlternaVve Truth Table format: list only cases where value is a.then restate as logic equavons using PS, PS, Input Output" " " " " " " NS bit is PS" Input" " " NS bit is PS" Input" " " Output is PS" Input" " " /29/3 Fall 23 - - Lecture #8 7 Truth table PS" " " " " " " Input" " " " " " " NS" " " " " " " Hardware for FSM: CombinaVonal Logic AlternaVve Truth Table format: list only cases where value is a.then restate as logic equavons using PS, PS, Input Output" " " " " " " NS = PS PS Input NS = ~PS ~PS Input NS = PS PS Input NS = ~PS PS Input Output= PS PS Input Output= PS ~PS Input NS bit is PS" Input" " " NS bit is PS" Input" " " Output is PS" Input" " " /29/3 Fall 23 - - Lecture #8 8 9

/29/3 Administrivia /29/3 Fall 23 - - Lecture #8 9 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 2

/29/3 Design Hierarchy system datapath control code registers multiplexer comparator state registers combinational logic register logic switching networks /27/3 Fall 23 - - Lecture #8 2 Conceptual MIPS Datapath /27/3 Fall 23 - - Lecture #8 22

/29/3 Data MulVplexer (e.g., 2- to- x n- bit- wide) mux /27/3 Fall 23 - - Lecture #8 23 N Instances of - bit- Wide Mux /27/3 Fall 23 - - Lecture #8 24 2

/29/3 How Do We Build a - bit- Wide Mux (in Logisim)? s /27/3 Fall 23 - - Lecture #8 25 4- to- MulVplexer How many rows in TT? /27/3 Fall 23 - - Lecture #8 3

/29/3 AlternaVve Hierarchical Approach (in Logisim) /27/3 Fall 23 - - Lecture #8 27 Logisim /27/3 Fall 23 - - Lecture #8 28 4

/29/3 Subcircuits Subcircuit: Logisim equivalent of procedure or method Every project is a hierarchy of subcircuits /27/3 Fall 23 - - Lecture #8 29 N- bit- wide Data MulVplexer (in Logisim + tunnel) mux /27/3 Fall 23 - - Lecture #8 3 5

/29/3 ArithmeVc and Logic Unit Most processors contain a special logic block called ArithmeVc and Logic Unit () We ll show you an easy one that does ADD, SUB, bitwise AND, bitwise OR /27/3 Fall 23 - - Lecture #8 3 Simple /27/3 Fall 23 - - Lecture #8 6

/29/3 Adder/Subtractor: One- bit adder Least Significant Bit /27/3 Fall 23 - - Lecture #8 33 Adder/Subtractor: One- bit adder (/2) /27/3 Fall 23 - - Lecture #8 34 7

/29/3 Adder/Subtractor: One- bit Adder (2/2) /27/3 Fall 23 - - Lecture #8 35 N x - bit Adders N- bit Adder Connect Carry Out i- to Carry in i: b + + + /27/3 Fall 23 - - Lecture #8 36 8

/29/3 Twos Complement Adder/Subtractor /27/3 Fall 23 - - Lecture #8 37 CriVcal Path When sewng clock period in synchronous systems, must allow for worst case Path through combinavonal logic that is worst case called crivcal path Can be esvmated by number of gate delays : Number of gates must go through in worst case Idea: Doesn t macer if speedup other paths if don t improve the crivcal path What might crivcal path of? /27/3 Fall 23 - - Lecture #8 38 9

/29/3 MulVplexer Design MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, Agenda /27/3 Fall 23 - - Lecture #8 39 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 4 2

/29/3 Processor Design Process Five steps to design a processor:. Analyze instrucvon set à Processor datapath requirements Control 2. Select set of datapath Memory components & establish Datapath clock methodology 3. Assemble datapath meevng the requirements 4. Analyze implementavon of each instrucvon to determine sewng of control points that effects the register transfer. 5. Assemble the control logic Formulate Logic EquaVons Design Circuits /27/3 Fall 23 - - Lecture #8 4 Input Output ADDU and SUBU addu rd,rs,rt subu rd,rs,rt OR Immediate: ori rt,rs,imm6 LOAD and STORE Word lw rt,rs,imm6 sw rt,rs,imm6 3 BRANCH: beq rs,rt,imm6 The MIPS- lite Subset 3 3 3 2 6 op rs rt rd shamt funct 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits 2 op rs rt immediate 6 bits 5 bits 5 bits 6 bits 2 op rs rt immediate 6 bits 5 bits 5 bits 6 bits 2 op rs rt immediate 6 bits 5 bits 5 bits 6 bits /27/3 Fall 23 - - Lecture #8 42 6 6 6 6 2

/29/3 Register Transfer Language (RTL) RTL gives the meaning of the instrucvons {op, rs, rt, rd, shamt, funct} MEM[ PC ]! {op, rs, rt, Imm6} MEM[ PC ]! All start by fetching the instrucvon Inst Register Transfers! ADDU R[rd] R[rs] + R[rt]; PC PC + 4! SUBU R[rd] R[rs] R[rt]; PC PC + 4! ORI R[rt] R[rs] zero_ext(imm6); PC PC + 4! LOAD R[rt] MEM[ R[rs] + sign_ext(imm6)]; PC PC + 4! STORE MEM[ R[rs] + sign_ext(imm6) ] R[rt]; PC PC + 4! BEQ if ( R[rs] == R[rt] ) then PC PC + 4 + (sign_ext(imm6) ) else PC PC + 4! /27/3 Fall 23 - - Lecture #8 43 Step : Requirements of the InstrucVon Set Memory (MEM) InstrucVons & data (will use one for each: really caches) Registers (R: x ) Read rs Read rt Write rt or rd PC Extender (sign/zero extend) Add/Sub/OR unit for operavon on register(s) or extended immediate Add 4 (+ maybe extended immediate) to PC Compare if registers equal? /27/3 Fall 23 - - Lecture #8 44 22

/29/3 Generic Steps of Datapath PC instrucvon memory rd rs rt registers Data memory mux +4 imm. InstrucVon Fetch 2. Decode/ Register Read 5. Register 3. Execute 4. Memory Write /27/3 Fall 23 - - Lecture #8 45 Step 2: Components of the Datapath CombinaVonal Elements State Elements + Clocking Methodology A B Building Blocks Adder Adder CarryIn Sum CarryOut A B Select MUX MulVplexer Y A B OP Result /27/3 Fall 23 - - Lecture #8 46 23

/29/3 Needs for MIPS- lite + Rest of MIPS AddiVon, subtracvon, logical OR, ==: ADDU R[rd] = R[rs] + R[rt];... SUBU R[rd] = R[rs] R[rt];... ORI R[rt] = R[rs] zero_ext(imm6)... BEQ if ( R[rs] == R[rt] )... Test to see if output == for any operavon gives == test. How? P&H also adds AND, Set Less Than ( if A < B, otherwise) from Appendix C, secvon C.5 /27/3 Fall 23 - - Lecture #8 47 Storage Element: Idealized Memory Write Enable Address Memory (idealized) Data In One input bus: Data In One output bus: Data Out Clk Memory word is found by: Address selects the word to put on Data Out Write Enable = : address selects the memory word to be wricen via the Data In bus Clock input (CLK) CLK input is a factor ONLY during write operavon During read operavon, behaves as a combinavonal logic block: Address valid Data Out valid aper access Vme DataOut /27/3 Fall 23 - - Lecture #8 48 24

/29/3 Storage Element: Register (Building Block) Similar to D Flip Flop except N- bit input and output Write Enable input Write Enable: Data In Write Enable Negated (or deasserted) (): Data Out will not change Asserted (): Data Out will become Data In on rising edge of clock N Data Out N /27/3 Fall 23 - - Lecture #8 49 Storage Element: Register File Register File consists of registers: Two - bit output busses: busa and busb One - bit input bus: busw Register is selected by: RW RA RB Write Enable 5 5 5 busw RA (number) selects the register to put on busa (data) RB (number) selects the register to put on busb (data) RW (number) selects the register to be wricen via busw (data) when Write Enable is Clock input () Clk input is a factor ONLY during write operavon During read operavon, behaves as a combinavonal logic block: RA or RB valid busa or busb valid aper access Vme. Clk x - bit Registers busa busb /27/3 Fall 23 - - Lecture #8 5 25

/29/3 Step 3: Assemble DataPath MeeVng Requirements Register Transfer Requirements Datapath Assembly InstrucVon Fetch Read Operands and Execute OperaVon Common RTL operavons Fetch the InstrucVon: mem[pc] Update the program counter: SequenVal Code: PC PC + 4 Branch and Jump: PC something else Address InstrucVon Memory InstrucVon Word /27/3 Fall 23 - - Lecture #8 5 PC Next Address Logic Step 3: Add & Subtract R[rd] = R[rs] op R[rt] (addu rd,rs,rt) Ra, Rb, and Rw come from instrucvon s Rs, Rt, and Rd fields 3 2 6 6 op rs rt rd shamt funct 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits ctr and RegWr: control logic aper decoding the instrucvon rd rs rt RegWr ctr 5 5 5 busa Rw Ra Rb busw Result x - bit Registers busb Already defined the register file & /27/3 Fall 23 - - Lecture #8 52

/29/3 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 53 Clk Clocking Methodology............ Storage elements clocked by same edge CriVcal path (longest path through logic) determines length of clock period Have to allow for Clock- to- Q and Setup Times too This lecture (and P&H secvons) 4.3-4.4 do whole instrucvon in clock cycle for pedagogic reasons Project 4 will do it in 2 clock cycles via simple pipelining Soon explain pipelining and use 5 clock cycles per instrucvon /27/3 Fall 23 - - Lecture #8 54 27

/29/3 Clk PC Old Value Rs, Rt, Rd, Op, Func ctr Register- Register Timing: One Complete Cycle Clk- to- Q New Value InstrucVon Memory Access Time Old Value New Value Delay through Control Logic Old Value New Value RegWr Old Value New Value Register File Access Time busa, B Old Value New Value Delay busw Old Value New Value RegWr Rd Rs Rt 5 5 5 ctr Setup Time Register Write Rw Ra Rb busa busw Occurs Here RegFile busb /27/3 Fall 23 - - Lecture #8 55 Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 56 28

/29/3 Clk PC Old Value Rs, Rt, Rd, Op, Func ctr Register- Register Timing: One Complete Cycle Clk- to- Q New Value InstrucVon Memory Access Time Old Value New Value Delay through Control Logic Old Value New Value RegWr Old Value New Value Register File Access Time busa, B Old Value New Value Delay busw Old Value New Value RegWr Rd Rs Rt 5 5 5 ctr Setup Time Register Write Rw Ra Rb busa busw Occurs Here RegFile busb /27/3 Fall 23 - - Lecture #8 57 Logical OperaVons with Immediate R[rt] = R[rs] op ZeroExt[imm6] 3 2 6 5 op rs rt immediate 3 6 bits 5 bits 5 bits 6 5 6 bits immediate 6 bits 6 bits But we re wri4ng to Rt register?? And immediate input?? RegWr Rd Rs Rt 5 5 5 ctr busw Rw Ra Rb busa RegFile busb /27/3 Fall 23 - - Lecture #8 58 29

/29/3 Logical OperaVons with Immediate R[rt] = R[rs] op ZeroExt[imm6] RegDst RegWr rd rs 5 5 Rw imm6 rt Ra Rb RegFile 6 rt 5 3 op rs rt immediate /27/3 Fall 23 - - Lecture #8 59 2 3 6 bits 5 bits 5 bits 6 5 6 bits ZeroExt 6 bits 2: mul?plexor ctr busa busb Src 6 immediate 6 bits Already defined - bit MUX; Zero Ext? Load OperaVons R[rt] = Mem[R[rs] + SignExt[imm6]] Example: lw rt,rs,imm6 3 What sign extending?? And where is Mem?? 2 6 op rs rt immediate 6 bits 5 bits 5 bits 6 bits RegDst rd rt RegWr rs 5 5 Rw imm6 Ra Rb RegFile 6 rt 5 ZeroExt busa busb ctr /27/3 Fall 23 - - Lecture #8 6 Src 3

/29/3 Load OperaVons R[rt] = Mem[R[rs] + SignExt[imm6]] Example: lw rt,rs,imm6 3 2 6 op rs rt immediate 6 bits 5 bits 5 bits 6 bits RegDst rd rt ctr MemtoReg MemWr RegWr rs rt 5 5 5 Rw Ra Rb busa busw RegFile busb? WrEn Adr imm6 Data In Data 6 Memory Src /27/3 ExtOp Fall 23 - - Lecture #8 6 Extender 3 RTL: The Add InstrucVon 2 op rs rt rd shamt funct add rd, rs, rt MEM[PC] Fetch the instrucvon from memory R[rd] = R[rs] + R[rt] The actual operavon 6 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits PC = PC + 4 Calculate the next instrucvon s address 6 /27/3 Fall 23 - - Lecture #8 62 3

/29/3 InstrucVon Fetch Unit at Beginning of Add Fetch the instrucvon from InstrucVon memory: InstrucVon = MEM[PC] Inst same for all instrucvons Memory InstrucVon<3:> 4 PC Ext Adder Adder npc_sel PC Mux Inst Address imm6 /27/3 Fall 23 - - Lecture #8 63 Single Cycle Datapath during Add 3 R[rd] = R[rs] + R[rt] 2 6 op rs rt rd shamt funct RegDst= rd RegWr= rs 5 5 busw npc_sel=+4 Rw imm6 rt Ra Rb RegFile 6 rt 5 ExtOp=x Extender busa busb instr fetch unit Src= 6 Rs Rt Rd Imm6 zero ctr=add MemtoReg= = Data In InstrucVon<3:> <2:25> <6:2> <:5> MemWr= WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 64

/29/3 InstrucVon Fetch Unit at End of Add PC = PC + 4 Same for all instrucvons except: Branch and Jump Inst Memory 4 PC Ext npc_sel=+4 Adder Adder PC Mux Inst Address imm6 /27/3 Fall 23 - - Lecture #8 65 Single Cycle Datapath during OR Immediate 3 2 op rs rt immediate R[rt] = R[rs] OR ZeroExt[Imm6] RegDst= Rd RegWr= busw npc_sel= Rs 5 5 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp= Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr= MemtoReg= Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 66 33

/29/3 Single Cycle Datapath during OR Immediate 3 op rs rt immediate R[rt] = R[rs] OR ZeroExt[Imm6] RegDst= Rd RegWr= Rs 5 5 busw npc_sel=+4 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp=zero 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr=or MemtoReg= /27/3 Fall 23 - - Lecture #8 67 Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr Data Memory <:5> Single Cycle Datapath during Load 3 op rs rt immediate R[rt] = Data Memory {R[rs] + SignExt[imm6]} RegDst= Rd RegWr= busw npc_sel= Rs 5 5 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp= 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr= MemtoReg= Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr Data Memory /27/3 Fall 23 - - Lecture #8 68 <:5> Student RouleCe 34

/29/3 Single Cycle Datapath during Load 3 op rs rt immediate R[rt] = Data Memory {R[rs] + SignExt[imm6]} RegDst= Rd RegWr= Rs 5 5 busw npc_sel=+4 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp=sign 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr=add MemtoReg= Src= Instruction<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 69 Single Cycle Datapath during Store 3 2 op rs rt immediate Data Memory {R[rs] + SignExt[imm6]} = R[rt] RegDst= Rd RegWr= busw npc_sel= Rs 5 5 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp= Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr= MemtoReg= Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 7 35

/29/3 Single Cycle Datapath during Store 3 op rs rt immediate Data Memory {R[rs] + SignExt[imm6]} = R[rt] RegDst=x Rd RegWr= Rs 5 5 busw npc_sel=+4 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp=sign 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr=add MemtoReg=x Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 7 Single Cycle Datapath during Branch 3 op rs rt immediate if (R[rs] - R[rt] == ) then Zero = ; else Zero = RegDst= Rd RegWr= busw npc_sel= Rs 5 5 Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp= 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr= MemtoReg= Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 72 36

/29/3 Single Cycle Datapath during Branch 3 op rs rt immediate if (R[rs] - R[rt] == ) then Zero = ; else Zero = RegDst=x Rd RegWr= Rs 5 5 busw npc_sel=br Rw imm6 Rt Ra Rb RegFile 6 Rt 5 ExtOp=x 2 Extender busa busb 6 instr fetch unit Rs Rt Rd Imm6 zero ctr=sub MemtoReg=x Src= InstrucVon<3:> <2:25> = Data In <6:2> MemWr= <:5> WrEn Adr <:5> Data Memory /27/3 Fall 23 - - Lecture #8 73 InstrucVon Fetch Unit at the End of Branch 3 2 op rs rt immediate if (Zero == ) then PC = PC + 4 + SignExt[imm6]*4 ; else PC = PC + 4 npc_sel Zero imm6 4 PC Ext Adder Adder MUX npc_sel ctrl Mux Inst Memory Adr PC 6 InstrucVon<3:> What is encoding of npc_sel? Direct MUX select? Branch inst. / not branch Let s pick 2nd opvon npc_sel zero? MUX x Fall 23 - - Lecture #8 Q: What logic gate? /27/3 74 37

/29/3 4 Summary: Datapath s Control Signals ExtOp: zero, sign src: regb; immed ctr: ADD, SUB, OR PC Ext Adder Adder Inst Address npc_sel Mux PC RegDst RegWr busw Rd Rs 5 5 Rw imm6 Rt Ra Rb RegFile 6 ExtOp Rt 5 MemWr: write memory MemtoReg: ; Mem RegDst: rt ; rd RegWr: write register busa busb ctr MemtoReg MemWr /27/3 Fall 23 - - Lecture #8 75 imm6 Extender Src Data In WrEn Adr Data Memory Agenda Timing and State Machines Datapath Elements: Mux + MIPS- lite Datapath CPU Timing MIPS- lite Control And, in Conclusion, /29/3 Fall 23 - - Lecture #8 76 38

/29/3 And. in Conclusion, Single- Cycle Processor Use muxes to select among input S input bits selects 2 S inputs Each input can be n- bits wide, independent of S Can implement muxes hierarchically ArithmeVc circuits are a kind of combinavonal logic Processor Control Datapath Memory Input Output Five steps to processor design:. Analyze instrucvon set à datapath requirements 2. Select set of datapath components & establish clock methodology 3. Assemble datapath meevng the requirements 4. Analyze implementavon of each instrucvon to determine sewng of control points that effects the register transfer. 5. Assemble the control logic Formulate Logic EquaVons Design Circuits /27/3 Fall 23 - - Lecture #8 77 39