Lecture 3: Single Cycle Microarchitecture. James C. Hoe Department of ECE Carnegie Mellon University


18-447 Lecture 3: Single-Cycle Microarchitecture
James C. Hoe, Department of ECE, Carnegie Mellon University
18-447-S18-L03-S1, James C. Hoe, CMU/ECE/CALCM, 2018

Housekeeping
Your goal today: a first try at implementing the RV32I ISA
Notices:
  Handout #4: HW, due 2/7
  Student survey on Canvas, past due
  Lab, Part A, due week of /29
  Lab, Part B, due week of 2/5
Readings:
  P&H Ch 4.~4.4
  finish P&H Ch 2 for next time

Instruction Processing FSM
[Figure: an FSM with input I, state S, next-state logic producing Next S, and output O]
An ISA describes an abstract FSM:
  state = program-visible state
  next-state logic = instruction execution
Nice ISAs have atomic instruction semantics: one state transition per instruction in the abstract FSM.
The implementation FSM can look wildly different.
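
The abstract-FSM view can be sketched in a few lines of Python. This is only an illustration, not the lecture's own code: the names ArchState and step are made up here, memory is modeled as a hypothetical sparse dictionary of 32-bit words, and only ADD is implemented.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF


@dataclass
class ArchState:
    """Program-visible state: PC, 32 general-purpose registers, memory."""
    pc: int = 0
    gpr: list = field(default_factory=lambda: [0] * 32)
    # Hypothetical sparse memory for this sketch: byte address -> 32-bit word.
    mem: dict = field(default_factory=dict)


def step(state: ArchState) -> None:
    """Next-state logic: one atomic state transition per instruction (ADD only)."""
    inst = state.mem.get(state.pc, 0)
    opcode = inst & 0x7F
    rd = (inst >> 7) & 0x1F
    funct3 = (inst >> 12) & 0x7
    rs1 = (inst >> 15) & 0x1F
    rs2 = (inst >> 20) & 0x1F
    funct7 = (inst >> 25) & 0x7F
    if opcode == 0b0110011 and funct3 == 0 and funct7 == 0:  # ADD
        if rd != 0:  # x0 always reads as zero in RISC-V
            state.gpr[rd] = (state.gpr[rs1] + state.gpr[rs2]) & MASK32
        state.pc = (state.pc + 4) & MASK32
    else:
        raise NotImplementedError(f"unsupported instruction {inst:#010x}")


s = ArchState()
s.gpr[1], s.gpr[2] = 5, 7
s.mem[0] = 0x002081B3  # ADD x3, x1, x2
step(s)  # afterwards s.gpr[3] == 12 and s.pc == 4
```

Each call to step is one transition of the abstract FSM; a real implementation is free to take a very different shape as long as the program-visible state evolves the same way.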

Program-Visible State (aka Architectural State)
[Figure: the program-visible state: the PC; the register file, with 5-bit read and write register selects, read and write data ports, and a write enable; and memory, with address and data ports. Based on original figure from P&H CO&D, copyright 2004 Elsevier. All rights reserved.]

Magic Memory and Register File
Combinational read: the output of the read port is a combinational function of the register file contents and the corresponding read select port.
Synchronous write: the selected register (or memory location) is updated on the positive clock edge when write enable is asserted. A write cannot affect the read output in between clock edges.
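
A minimal behavioral sketch of these two rules, assuming a hypothetical RegisterFile class (the method names set_write and posedge are invented for this illustration):

```python
class RegisterFile:
    """Idealized register file: combinational read, synchronous write."""

    def __init__(self):
        self._regs = [0] * 32
        self._pending = None  # write request sampled at the next clock edge

    def read(self, select):
        # Combinational read: a pure function of contents and select input.
        return self._regs[select]

    def set_write(self, select, data, enable):
        # Drive the write port; nothing changes until posedge() is called.
        self._pending = (select, data & 0xFFFFFFFF) if enable else None

    def posedge(self):
        # Synchronous write: state is updated only on the clock edge.
        if self._pending is not None:
            select, data = self._pending
            if select != 0:  # x0 is hardwired to zero in RISC-V
                self._regs[select] = data
            self._pending = None


rf = RegisterFile()
rf.set_write(5, 42, enable=True)
before = rf.read(5)  # still the old value: the write has not been clocked
rf.posedge()
after = rf.read(5)   # the new value is visible after the clock edge
```

Separating set_write from posedge is what lets a single-cycle datapath read a register and write the same register in one cycle without the new value racing through.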

Simplifying Characteristics of RISC (RISC = Reduced Instruction Set Computer)
Simple operations: 2-input, 1-output arithmetic and logical operations; few alternatives for accomplishing the same thing
Simple data movements: ALU ops are register-to-register, never memory; load-store architecture; a single addressing mode
Simple branches: limited varieties of branch conditions and targets; PC-offset
Simple instruction encoding: all instructions encoded in the same number of bits; simple, fixed formats

RISC Instruction Processing
5 generic steps:
  IF: instruction fetch
  ID: instruction decode and operand fetch
  EX: ALU/execute
  MEM: memory access (not required by non-mem instructions)
  WB: write back
[Figure: datapath outline (PC, instruction memory, register file, ALU, data memory) annotated with the IF, ID, EX, MEM, and WB steps. Based on original figure from P&H CO&D.]

Single-Cycle Datapath for RV32I ALU Instructions

Register-Register ALU Instructions
Assembly (e.g., register-register addition): ADD rd, rs1, rs2
Machine encoding:
  funct7 = 0000000 (7 bits) | rs2 (5 bits) | rs1 (5 bits) | funct3 = 000 (3 bits) | rd (5 bits) | opcode = 0110011 (7 bits)
Semantics:
  GPR[rd] ← GPR[rs1] + GPR[rs2]
  PC ← PC + 4
Exceptions: none (ignore carry and overflow)
Variations:
  Arithmetic: {ADD, SUB}
  Compare: {signed, unsigned} x {Set if Less Than}
  Logical: {AND, OR, XOR}
  Shift: {Left, Right Logical, Right Arithmetic}
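
The R-type variations differ only in funct3 and funct7, which a small sketch can make concrete. The function name alu_rtype is hypothetical; the funct3/funct7 values are the standard RV32I encodings.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu_rtype(funct3, funct7, a, b):
    """Compute the result of an RV32I register-register ALU instruction."""
    alt = funct7 == 0b0100000  # selects SUB over ADD and SRA over SRL
    shamt = b & 0x1F           # shifts use only the low 5 bits of rs2
    if funct3 == 0b000:
        result = a - b if alt else a + b               # SUB / ADD
    elif funct3 == 0b001:
        result = a << shamt                            # SLL
    elif funct3 == 0b010:
        result = int(to_signed(a) < to_signed(b))      # SLT
    elif funct3 == 0b011:
        result = int((a & MASK32) < (b & MASK32))      # SLTU
    elif funct3 == 0b100:
        result = a ^ b                                 # XOR
    elif funct3 == 0b101:
        result = (to_signed(a) >> shamt) if alt else ((a & MASK32) >> shamt)  # SRA / SRL
    elif funct3 == 0b110:
        result = a | b                                 # OR
    else:
        result = a & b                                 # AND
    return result & MASK32  # carry and overflow are ignored
```

For instance, the same operands 0xFFFFFFFF and 1 compare as "less than" under SLT (the first operand is -1) but not under SLTU.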

ADD rd rs1 rs2
[Figure: PC, instruction memory, and register file, with the IF, ID, EX, MEM, and WB steps realized as combinational state update logic]
if MEM[PC] == ADD rd rs1 rs2
  GPR[rd] ← GPR[rs1] + GPR[rs2]
  PC ← PC + 4

R-Type ALU Datapath
[Figure: the PC drives the instruction memory address and a +4 adder; instruction bits [19:15], [24:20], and [11:7] select the two read registers and the write register; funct3 and funct7 determine the ALU operation; the ALU result is written back to the register file. Based on original figure from P&H CO&D.]

Reg-Immediate ALU Instructions
Assembly (e.g., reg-immediate addition): ADDI rd, rs1, imm12
Machine encoding:
  imm[11:0] (12 bits) | rs1 (5 bits) | funct3 = 000 (3 bits) | rd (5 bits) | opcode = 0010011 (7 bits)
Semantics:
  GPR[rd] ← GPR[rs1] + sign-extend(imm)
  PC ← PC + 4
Exceptions: none (ignore carry and overflow)
Variations:
  Arithmetic: {ADDI}; RV32I has no SUBI, since ADDI with a negative immediate subtracts
  Compare: {signed, unsigned} x {Set if Less Than Imm}
  Logical: {ANDI, ORI, XORI}
  Shifts by unsigned imm[4:0]: {SLLI, SRLI, SRAI}
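
Sign extension of the 12-bit immediate is the one new datapath element here. A small sketch, with the helper names sign_extend and imm_itype chosen just for this example:

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def imm_itype(inst):
    """Extract the sign-extended I-type immediate, inst[31:20]."""
    return sign_extend(inst >> 20, 12)


def addi(rs1_value, inst):
    """GPR[rd] <- GPR[rs1] + sign-extend(imm), truncated to 32 bits."""
    return (rs1_value + imm_itype(inst)) & 0xFFFFFFFF


minus_one = imm_itype(0xFFF00093)  # ADDI x1, x0, -1
five = imm_itype(0x00500093)       # ADDI x1, x0, 5
```

Because the immediate is sign-extended, a 12-bit field covers the range -2048 to 2047.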

ADDI rd rs1 immediate12
[Figure: PC, instruction memory, register file, and adder, with the IF, ID, EX, MEM, and WB steps realized as combinational state update logic]
if MEM[PC] == ADDI rd rs1 immediate
  GPR[rd] ← GPR[rs1] + sign-extend(immediate)
  PC ← PC + 4

Datapath for R- and I-Type ALU Instructions
[Figure: the R-type ALU datapath extended with a 12-to-32-bit sign-extend unit fed by instruction bits [31:20] and a mux on the second ALU input controlled by ALUSrc = isItype; opcode, funct3, and funct7 determine the ALU operation. Based on original figure from P&H CO&D.]

Single-Cycle Datapath for Data Movement Instructions

Load Instructions
Assembly (e.g., load 4-byte word): LW rd, offset12(base)
Machine encoding:
  offset[11:0] (12 bits) | base, i.e., rs1 (5 bits) | funct3 = 010 (3 bits) | rd (5 bits) | opcode = 0000011 (7 bits)
Semantics:
  byte_address32 = sign-extend(offset12) + GPR[base]
  GPR[rd] ← MEM32[byte_address]
  PC ← PC + 4
Exceptions: none for now
Variations: LW, LH, LHU, LB, LBU
  e.g., LB :: GPR[rd] ← sign-extend(MEM8[byte_address])
        LBU :: GPR[rd] ← zero-extend(MEM8[byte_address])
Note: RV32I is byte-addressable, little-endian
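
The load variations differ in access width and in how the loaded value is extended to 32 bits. The sketch below models memory as a Python bytearray; the function name load_value and the kind strings are assumptions made for this example.

```python
def load_value(mem, byte_address, kind):
    """Return the 32-bit register value produced by LW/LH/LHU/LB/LBU.

    mem is a bytes-like object; kind is one of "W", "H", "HU", "B", "BU".
    """
    nbytes = {"W": 4, "H": 2, "HU": 2, "B": 1, "BU": 1}[kind]
    signed = kind in ("H", "B")  # LH and LB sign-extend; LHU and LBU zero-extend
    raw = bytes(mem[byte_address:byte_address + nbytes])
    # RV32I is little-endian: the lowest address holds the least significant byte.
    return int.from_bytes(raw, "little", signed=signed) & 0xFFFFFFFF


memory = bytearray([0x80, 0xFF, 0x34, 0x12])
lb = load_value(memory, 0, "B")    # 0xFFFFFF80
lbu = load_value(memory, 0, "BU")  # 0x00000080
lw = load_value(memory, 0, "W")    # 0x1234FF80
```

The same byte 0x80 loads as 0xFFFFFF80 with LB and as 0x00000080 with LBU, which is the distinction the LoadExtend control later selects.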

LW Datapath
[Figure: the ALU datapath with the ALU operation fixed to add, ALUSrc = isItype, a 12-to-32-bit sign-extend unit, and the ALU result driving the data memory address; the memory read data is written back to the register file. The IF, ID, EX, MEM, and WB steps are realized as combinational state update logic.]
if MEM[PC] == LW rd offset12(base)
  EA = sign-extend(offset) + GPR[base]
  GPR[rd] ← MEM[EA]
  PC ← PC + 4

Store Instructions
Assembly (e.g., store 4-byte word): SW rs2, offset12(base)
Machine encoding:
  offset[11:5] (7 bits) | rs2 (5 bits) | base (5 bits) | funct3 = 010 (3 bits) | offset[4:0] (5 bits) | opcode = 0100011 (7 bits)
Semantics:
  byte_address32 = sign-extend(offset12) + GPR[base]
  MEM32[byte_address] ← GPR[rs2]
  PC ← PC + 4
Exceptions: none for now
Variations: SW, SH, SB
  e.g., SB :: MEM8[byte_address] ← (GPR[rs2])[7:0]
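
Unlike loads, stores split the 12-bit offset across two instruction fields so that rs1 and rs2 stay in fixed positions. A sketch of reassembling the offset and performing the write follows; imm_stype and store_value are hypothetical helper names, and memory is again modeled as a bytearray.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def imm_stype(inst):
    """Reassemble the S-type offset from inst[31:25] and inst[11:7]."""
    raw = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
    return sign_extend(raw, 12)


def store_value(mem, byte_address, value, nbytes):
    """Write the low nbytes of value to mem in little-endian order (SW/SH/SB)."""
    low = value & ((1 << (8 * nbytes)) - 1)
    mem[byte_address:byte_address + nbytes] = low.to_bytes(nbytes, "little")


offset = imm_stype(0x0020A423)          # SW x2, 8(x1)
memory = bytearray(4)
store_value(memory, 1, 0x12345678, 1)   # SB writes only (GPR[rs2])[7:0]
```

After the SB above, only the byte at address 1 has changed, and it holds 0x78.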

SW Datapath
[Figure: the ALU datapath with the ALU operation fixed to add, ALUSrc = isStype, register write disabled, and the ALU result driving the data memory address; the second register read port supplies the memory write data. The IF, ID, EX, MEM, and WB steps are realized as combinational state update logic.]
if MEM[PC] == SW rs2 offset12(base)
  EA = sign-extend(offset) + GPR[base]
  MEM[EA] ← GPR[rs2]
  PC ← PC + 4

Load-Store Datapath
[Figure: combined load and store datapath. Control settings shown: register write enable = !isStore; ALU operation = add; ALUSrc asserted for isItype or isStype; memory write enable = isStore; memory read enable = isLoad; ImmExtend in {Itype, ItypeU, Stype}; LoadExtend in {W, H, HU, B, BU}. Based on original figure from P&H CO&D.]

Datapath for Non-Control-Flow Instructions
[Figure: the load-store datapath merged with the ALU datapath. Control settings shown: register write enable = !isStore; ALU operation from opcode, funct3, and funct7; ALUSrc asserted for isItype or isStype; memory write enable = isStore; memory read enable = isLoad; MemtoReg = isLoad; plus ImmExtend and LoadExtend. Based on original figure from P&H CO&D.]

Single-Cycle Datapath for Control Flow Instructions

Jump Instruction
Assembly: JAL rd, imm21 (note: implicit imm[0]=0)
Machine encoding (UJ-type):
  imm[20|10:1|11|19:12] (20 bits) | rd (5 bits) | opcode = 1101111 (7 bits)
Semantics:
  target = PC + sign-extend(imm21)
  GPR[rd] ← PC + 4
  PC ← target
How far can you jump?
Exceptions: misaligned target (4-byte)
Note: use JAL x0, label instead of BEQ x0, x0, label
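
The scrambled UJ-type immediate is easiest to follow in code. The sketch below reassembles it from the instruction bits; imm_ujtype is a hypothetical helper name, and the bit positions are those of the standard RV32I JAL encoding.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def imm_ujtype(inst):
    """Reassemble the 21-bit JAL offset; imm[0] is implicitly 0."""
    raw = (
        (((inst >> 31) & 0x1) << 20)     # imm[20]    from inst[31]
        | (((inst >> 12) & 0xFF) << 12)  # imm[19:12] from inst[19:12]
        | (((inst >> 20) & 0x1) << 11)   # imm[11]    from inst[20]
        | (((inst >> 21) & 0x3FF) << 1)  # imm[10:1]  from inst[30:21]
    )
    return sign_extend(raw, 21)


forward = imm_ujtype(0x008000EF)   # JAL x1, +8
backward = imm_ujtype(0xFFDFF06F)  # JAL x0, -4

# Reachable offsets: a signed 21-bit value whose lowest bit is always 0.
farthest_back = -(1 << 20)         # -1 MiB
farthest_forward = (1 << 20) - 2   # just under +1 MiB
```

This also answers the slide's question: a JAL can reach roughly 1 MiB in either direction from the current PC.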

Unconditional Jump Datapath
[Figure: the non-control-flow datapath extended with a PC adder that adds the sign-extended UJ immediate to the PC, a PCSrc mux (PCSrc = isJ) choosing between PC+4 and the jump target, and a PCtoReg mux (PCtoReg = isJ) steering PC+4 to the register write port. Memory read and write enables are 0; ALU operation, ImmExtend, MemtoReg, and LoadExtend are don't-cares (X). Based on original figure from P&H CO&D.]
if MEM[PC] == JAL rd, immediate20
  GPR[rd] = PC + 4
  PC = PC + sign-extend(imm21)
What about JALR?

(Conditional) Branch Instructions
Assembly (e.g., branch if equal): BEQ rs1, rs2, imm13 (note: implicit imm[0]=0)
Machine encoding:
  imm[12|10:5] (7 bits) | rs2 (5 bits) | rs1 (5 bits) | funct3 = 000 (3 bits) | imm[4:1|11] (5 bits) | opcode = 1100011 (7 bits)
Semantics:
  target = PC + sign-extend(imm13)
  if GPR[rs1] == GPR[rs2] then PC ← target
  else PC ← PC + 4
How far can you jump?
Exceptions: misaligned target (4-byte) if taken
Variations: BEQ, BNE, BLT, BGE, BLTU, BGEU
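
A sketch of the two pieces a branch needs: reassembling the SB-type offset and evaluating the branch condition. The names imm_sbtype and branch_taken are hypothetical; the funct3 values are the standard RV32I branch encodings.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def imm_sbtype(inst):
    """Reassemble the 13-bit branch offset; imm[0] is implicitly 0."""
    raw = (
        (((inst >> 31) & 0x1) << 12)    # imm[12]   from inst[31]
        | (((inst >> 7) & 0x1) << 11)   # imm[11]   from inst[7]
        | (((inst >> 25) & 0x3F) << 5)  # imm[10:5] from inst[30:25]
        | (((inst >> 8) & 0xF) << 1)    # imm[4:1]  from inst[11:8]
    )
    return sign_extend(raw, 13)


def branch_taken(funct3, a, b):
    """Evaluate the branch condition on two 32-bit register values."""
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    sa, sb = sign_extend(ua, 32), sign_extend(ub, 32)
    return {
        0b000: ua == ub,  # BEQ
        0b001: ua != ub,  # BNE
        0b100: sa < sb,   # BLT
        0b101: sa >= sb,  # BGE
        0b110: ua < ub,   # BLTU
        0b111: ua >= ub,  # BGEU
    }[funct3]


offset = imm_sbtype(0x00208463)  # BEQ x1, x2, +8
```

With a signed 13-bit offset whose lowest bit is 0, a branch reaches from -4096 to +4094 bytes relative to the PC.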

Conditional Branch Datapath
[Figure: the jump datapath extended for branches and JALR. The PCSrc mux selects among PC+4, the branch target from the PC adder (for JAL and taken branches), and the ALU result (for JALR). The ALU performs sub when the instruction is Bxx and produces bcond, which goes to the branch control logic. ALUSrc is asserted for isItype, isStype, or isJALR. ImmExtend = {Itype, ItypeU, Stype, SBtype, Utype, UJtype}. Based on original figure from P&H CO&D.]

Adding Control to Datapath
[Figure 4.7 from book, Copyright 2018 Elsevier Inc. All rights reserved.]

Datapath Control Generation
[Figure: MEM[PC] feeds the decode logic, which generates ALUSrc, RegWrite, MemtoReg, PCtoReg, MemRead, MemWrite, ALU Op, ImmExtend, LoadExtend, and PCSrc]

Single-Bit Control Signals
ALUSrc
  when de-asserted: 2nd ALU input from 2nd GPR read port
  when asserted: 2nd ALU input from immediate
  equation: (opcode != isRtype) && (opcode != isBtype)
RegWrite
  when de-asserted: GPR write disabled
  when asserted: GPR write enabled
  equation: (opcode != sw) && (opcode != bxx)
MemtoReg
  when de-asserted: steer ALU result to GPR write port
  when asserted: steer memory load to GPR write port
  equation: opcode == lw/h/b
PCtoReg
  when de-asserted: steer above result to GPR write port
  when asserted: steer PC+4 to GPR write port
  equation: (opcode == jal) || (opcode == jalr)
MemRead
  when de-asserted: memory read disabled
  when asserted: memory read port returns load value
  equation: opcode == lw/h/b
MemWrite
  when de-asserted: memory write disabled
  when asserted: memory write enabled
  equation: opcode == sw/h/b
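
These equations translate almost directly into a decode function. The sketch below is an illustration under assumptions: the function name control_signals and the opcode constant names are invented here, the opcode values are the standard RV32I major opcodes, and LUI and AUIPC are left out.

```python
# RV32I major opcodes, instruction bits [6:0].
OP, OP_IMM, LOAD, STORE = 0b0110011, 0b0010011, 0b0000011, 0b0100011
BRANCH, JAL, JALR = 0b1100011, 0b1101111, 0b1100111


def control_signals(opcode):
    """Single-bit control signals as Booleans, following the equations above."""
    return {
        "ALUSrc": opcode not in (OP, BRANCH),       # immediate unless R-type or branch
        "RegWrite": opcode not in (STORE, BRANCH),  # stores and branches write no GPR
        "MemtoReg": opcode == LOAD,                 # load value goes to the GPR write port
        "PCtoReg": opcode in (JAL, JALR),           # link register receives PC+4
        "MemRead": opcode == LOAD,
        "MemWrite": opcode == STORE,
    }


lw_signals = control_signals(LOAD)
sw_signals = control_signals(STORE)
```

Signals that are don't-cares for an instruction (for example ALUSrc on JAL) simply take whatever value the equation yields.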

Multi-Bit Control Signals
ALU Op
  options: ADD, SUB, AND, OR, XOR, NOR, LT, and Shift; bcond: EQ, NE, GE, LT
  equation: case opcode
    RTypeALU: according to funct3, funct7[5]
    ITypeALU: according to funct3 only (except shift)
    LW/SW/JALR: ADD
    Bxx: SUB and select bcond function
    default: pass through 2nd input
ImmExtend
  options: Itype, ItypeU, Stype, SBtype, Utype, UJtype
  equation: select based on instruction format type (may want to have separate extension units for primary ALU and PC-offset adder)
PCSrc
  options: PC+4, PCadder, ALU
  equation: case opcode
    JAL: PC + immediate
    JALR: GPR + immediate
    Bxx: taken ? (PC + immediate) : (PC + 4)
    default: PC + 4
LoadExtend
  options: W, H, HU, B, BU
  equation: case funct3 ....
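
The PCSrc case statement can be sketched as a next-PC function. The name next_pc and its parameters are assumptions for this example; clearing the lowest bit of the JALR target follows the RISC-V specification and is not spelled out on the slide.

```python
MASK32 = 0xFFFFFFFF
BRANCH, JAL, JALR = 0b1100011, 0b1101111, 0b1100111  # RV32I major opcodes


def next_pc(opcode, pc, imm, rs1_value=0, taken=False):
    """Select the next PC among PC+4, the PC adder, and the ALU result."""
    if opcode == JAL:
        target = pc + imm                 # PC adder
    elif opcode == JALR:
        target = (rs1_value + imm) & ~1   # ALU result, lowest bit cleared
    elif opcode == BRANCH:
        target = pc + imm if taken else pc + 4
    else:
        target = pc + 4                   # every other instruction
    return target & MASK32


jump = next_pc(JAL, 0x100, 8)
loop = next_pc(BRANCH, 0x100, -4, taken=True)
fall_through = next_pc(BRANCH, 0x100, -4, taken=False)
```

Here imm is the already sign-extended offset, so the same function serves UJ-type, SB-type, and I-type (JALR) immediates.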

Architecture vs Microarchitecture
Architectural Level
  a clock has an hour hand and a minute hand, ...
  a computer does ...????...
  You can read a clock without knowing how it works
Microarchitecture Level
  a particular clockwork has a certain set of gears arranged in a certain configuration
  a particular computer design has a certain datapath and a certain control logic
Realization Level
  machined alloy gears vs stamped sheet metal
  CMOS vs ECL vs vacuum tubes
(The three levels run from conceptual to physical.)
[Computer Architecture, Blaauw and Brooks, 1997]