Implementing RISC-V Interpreter in Hardware

Similar documents
Control Unit. Main Memory. control. status. address instructions. address data. Internal storage Datapath

6.004 Recitation Problems L13 RISC-V Interpreter

6.004 Recitation Problems L13 RISC-V Interpreter

6.004 Recitation Problems L11 RISC-V Interpreter

6.004 Recitation Problems L11 RISC-V Interpreter

Non-pipelined Multicycle processors

Multicycle processors. Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology

Non-Pipelined Processors

Multicycle processors. Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology

1 /15 2 /20 3 /20 4 /25 5 /20

1 /20 2 /18 3 /20 4 /18 5 /24

1 /18 2 /16 3 /18 4 /26 5 /22

1 /20 2 /18 3 /20 4 /18 5 /24

Non-Pipelined Processors

1 /18 2 /16 3 /18 4 /26 5 /22

Programmable Machines

Programmable Machines


RISC-V Assembly and Binary Notation

ece4750-tinyrv-isa.txt

Non-Pipelined Processors - 2

Virtual Memory and Interrupts

Interrupts/Exceptions/Faults

Computer Architecture

Lecture 3: Single Cycle Microarchitecture. James C. Hoe Department of ECE Carnegie Mellon University

Data Hazards in Pipelined Processors

Modeling Processors. Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

Modeling Processors. Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

Contributors to the course material

Lecture 2: RISC V Instruction Set Architecture. Housekeeping

ISA and RISCV. CASS 2018 Lavanya Ramapantulu

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

A General-Purpose Computer The von Neumann Model. Concocting an Instruction Set. Meaning of an Instruction. Anatomy of an Instruction

Concocting an Instruction Set

RTL Model of a Two-Stage MIPS Processor

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology

Constructive Computer Architecture. Combinational ALU. Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology

Concocting an Instruction Set

MIPS%Assembly% E155%

Lecture 2: RISC V Instruction Set Architecture. James C. Hoe Department of ECE Carnegie Mellon University

CS3350B Computer Architecture MIPS Instruction Representation

Concocting an Instruction Set

EE108B Lecture 3. MIPS Assembly Language II

Concocting an Instruction Set

Instructions: MIPS ISA. Chapter 2 Instructions: Language of the Computer 1

R-type Instructions. Experiment Introduction. 4.2 Instruction Set Architecture Types of Instructions

MIPS ISA. 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

Computer Architecture Experiment

Flow of Control -- Conditional branch instructions

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set

CS3350B Computer Architecture Quiz 3 March 15, 2018

TSK3000A - Generic Instructions

Computer Architecture. The Language of the Machine

Examples of branch instructions

Bypassing and EHRs. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. L23-1

Reduced Instruction Set Computer (RISC)

CENG3420 Lecture 03 Review

MIPS PROJECT INSTRUCTION SET and FORMAT

Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , , Appendix B

Processor. Han Wang CS3410, Spring 2012 Computer Science Cornell University. See P&H Chapter , 4.1 4

Reduced Instruction Set Computer (RISC)

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

The MIPS Instruction Set Architecture

Review: ISA. Review: Compiling Applications. Computer Architecture ELEC3441. Instruction Set Architecture (1) Computer Architecture: HW/SW Interface

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

Instructions: Language of the Computer

Design for a simplified DLX (SDLX) processor Rajat Moona

Review: MIPS Organization

Lecture 08: RISC-V Single-Cycle Implementa9on CSE 564 Computer Architecture Summer 2017

Computer Organization and Components

CS222: MIPS Instruction Set

Chapter 2A Instructions: Language of the Computer

Mark Redekopp, All rights reserved. EE 357 Unit 11 MIPS ISA

ICS 233 COMPUTER ARCHITECTURE. MIPS Processor Design Multicycle Implementation

Modeling Processors. Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

Laboratory Exercise 6 Pipelined Processors 0.0

Course Administration

6.175: Constructive Computer Architecture. Tutorial 4 Epochs and Scoreboards

Chapter 2. Instructions: Language of the Computer. HW#1: 1.3 all, 1.4 all, 1.6.1, , , , , and Due date: one week.

CENG 3420 Lecture 06: Datapath

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

ELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)

5/17/2012. Recap from Last Time. CSE 2021: Computer Organization. The RISC Philosophy. Levels of Programming. Stored Program Computers

Recap from Last Time. CSE 2021: Computer Organization. Levels of Programming. The RISC Philosophy 5/19/2011

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Mips Code Examples Peter Rounce

Unsigned Binary Integers

Unsigned Binary Integers

Topic Notes: MIPS Instruction Set Architecture

Computer Organization MIPS ISA

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

EEM 486: Computer Architecture. Lecture 2. MIPS Instruction Set Architecture

MIPS Memory Access Instructions

MIPS Instruction Reference

CS 4200/5200 Computer Architecture I

CS 351 Exam 2 Mon. 11/2/2015

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts

Transcription:

Implementing RISC-V Interpreter in Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October 16, 2018 MIT 6.004 Fall 2018 L11-1

Instruction interpreter Fetch the instruction at pc Decode the instruction Execute the instruction (compute the next state values for the register file and pc) Access the memory if the instruction is a Ld or St Update the register file and pc First, we will examine the major components: register file, ALU, memory, decoder, execution unit; Then we will put it all together to build single-cycle and multicycle nonpipelined implementations of RISV-V L11-2

Arithmetic-Logic Unit (ALU) ALU performs all the arithmetic and logical functions a b func - Add, Sub, And, Or,... ALU result function Word alu(word a, Word b, AluFunc func); Type of a, e.g., Bit#(32) typedef enum {Add, Sub, And, Or, Xor, Slt, Sltu, Sll, Sra, Srl} AluFunc deriving(bits, Eq); This what you implemented in Lab2 L11-3

ALU for comparison operators Like ALU but returns a Bool a b func - GT, LT, EQ, Zero,... ALU Br c function Bool alubr(word a, Word b, BrFunc func); typedef enum {Eq, Neq, Lt, Ltu, Ge, Geu} BrFunc deriving(bits, Eq); L11-4

ALU for Comparison operators function Bool alubr(word a, Word b, BrFunc brfunc); Bool brtaken = case(brfunc) Eq: (a == b); Neq: (a!= b); Lt: signedlt(a, b); Ltu: (a < b); Ge: signedge(a, b); Geu: (a >= b); AT: True; NT: False; endcase; return res; endfunction L11-5

Register File 2 Read ports + 1 Write port index data index data rd1 rd2 rf wr en data index Registers can be read or written any time, so the guards are always true (not shown) typedef Bit#(32) Word; typedef Bit#(5) RIndx; interface RFile2R1W; method Word rd1(rindx index) ; method Word rd2(rindx index) ; method Action wr (RIndx index, Word data); endinterface L11-6

Register File implementation module mkrfile2r1w(rfile2r1w); Vector#(32,Reg#(Word)) rfile <- replicatem(mkreg(0)); method Word rd1(rindx rindx) = rfile[rindx]; method Word rd2(rindx rindx) = rfile[rindx]; method Action wr(rindx rindx, Word data); if(rindx!=0) begin rfile[rindx] <= data; end endmethod Register 0 is hardwired to endmodule zero and cannot be written All three methods of the register file can be called simultaneously, and in that case the read methods read the value already in the register file {rd1, rd2} < wr L11-7

Magic Memory Model Clock WriteEnable Address WriteData MAGIC RAM ReadData Reads and writes behave as in a register file (not true for real SRAM or DRAM) Reads are combinational Write, if enabled, is performed at the rising clock edge However, unlike a register file, there is only one port, which is used either for reading or for wring In Lecture 12, we will consider more realistic memory systems L11-8

Magic Memory Interface op address store data load data en memreq magic memory Magic memory can be read or written any time, so the guards are always true (not shown) interface MagicMemory; method ActionValue#(Word) req(memreq r); endinterface typedef struct {MemOp op; Word addr; Word data;} MemReq deriving(bits, Eq); typedef enum {Ld, St} MemOp deriving(bits, Eq); let data <- m.req(memreq{op:ld, addr:a, data:?}); let dummy <- m.req(memreq{op:st, addr:a, data:v}); L11-9

Instruction Decoding An instruction can be executed only after each of its fields has been extracted Fields are needed to access the register file, compute address to access memory, supply the proper opcode to ALU, set the pc,... Some 32 bit values may not represent an instruction or may represent an instruction not supported by our implementation Many instructions differ only slightly from each other in both decoding and execution Unlike RISC-V, some instruction sets are extremely complicated to decode, e.g. Intel X86 L11-10

Decoding instructions An Example What RISC-V instruction is represented by these 32 bits? Reference manual specifies the fields as follows: opcode = 0110011 funct3 = 000 rd = 00011 rs1 = 00010 rs2 = 00001 rs2 rs1 => opcode Op, R-type encoding => ADD => x3 => x2 => x1 opcode 00000000000100010000000110110011 funct7 funct3 What is the meaning of executing this instruction? rf.wr(3, alu(rf.rd1(2), rf.rd2(1), Add)); pc<=pc+4; L11-11 rd

ALU Instructions Differ only in the ALU op to be performed Instruction Description Execution ADD rd, rs1, rs2 Add reg[rd] <= reg[rs1] + reg[rs2] SUB rd, rs1, rs2 Sub reg[rd] <= reg[rs1] reg[rs2] SLL rd, rs1, rs2 Shift Left Logical reg[rd] <= reg[rs1] << reg[rs2] SLT rd, rs1, rs2 Set if < (Signed) reg[rd] <= (reg[rs1] < s reg[rs2])? 1 : 0 SLTU rd, rs1, rs2 Sef it < (Unsigned) reg[rd] <= (reg[rs1] < u reg[rs2])? 1 : 0 XOR rd, rs1, rs2 Xor reg[rd] <= reg[rs1] ^ reg[rs2] SRL rd, rs1, rs2 Shift Right Logical reg[rd] <= reg[rs1] >> u reg[rs2] SRA rd, rs1, rs2 Shift Right Arithmetic reg[rd] <= reg[rs1] >> s reg[rs2] OR rd, rs1, rs2 Or reg[rd] <= reg[rs1] reg[rs2] AND rd, rs1, rs2 And reg[rd] <= reg[rs1] & reg[rs2] October 16, 2018 These instructions are grouped in a category called OP with fields (func, rd, rs1, rs2) where func is the function for the ALU MIT 6.004 Fall 2018 L11-12

ALU Instructions with one Immediate operand Instruction Description Execution ADDI rd, rs1, immi Add Immediate reg[rd] <= reg[rs1] + immi SLTI rd, rs1, immi SLTIU rd, rs1, immi Set if < Immediate (Signed) Sef it < Immediate (Unsigned) reg[rd] <= (reg[rs1] < s immi)? 1 : 0 reg[rd] <= (reg[rs1] < u immi)? 1 : 0 XORI rd, rs1, immi Xor Immediate reg[rd] <= reg[rs1] ^ immi ORI rd, rs1, immi Or Immediate reg[rd] <= reg[rs1] immi ANDI rd, rs1, immi And Immediate reg[rd] <= reg[rs1] & immi SLLI rd, rs1, immi SRLI rd, rs1, immi SRAI rd, rs1, immi Shift Left Logical Immediate Shift Right Logical Immediate Shift Right Arithmetic Immediate reg[rd] <= reg[rs1] << immi reg[rd] <= reg[rs1] >> u immi reg[rd] <= reg[rs1] >> s immi These instructions are grouped in a category called OPIMM with fields (func, rd, rs1, immi) where func is the function for the alu L11-13

Decoding can be complicated! imm[12] rs2 rs1 imm[4:1] opcode 11111110000000001001111011100011 imm[10:5] funct3 imm[11] What is this instruction? Reference manual specifies the fields as follows: opcode = 1100011 funct3 = 001 rs1 = 00001 rs2 = 00000 => opcode branch, B-type encoding => BNEQ => x1 => x0 imm[12:1] = {1, 1, 111111, 1110}; imm[0] = 0; imm[31:13] are sign extended => imm[31:0] = -4 bralu(rf.rd1(1), rf.rd2(0), Neq)? pc <= pc + (-4) : pc <= pc + 4 L11-14

Branch Instructions differ only in the alubr operation they perform Instruction Description Execution BEQ rs1, rs2, immb Branch = pc <= (reg[rs1] == reg[rs2])? pc + immb : pc + 4 BNE rs1, rs2, immb Branch!= pc <= (reg[rs1]!= reg[rs2])? pc + immb : pc + 4 BLT rs1, rs2, immb Branch < (Signed) BGE rs1, rs2, immb Branch (Signed) BLTU rs1, rs2, immb Branch < (Unsigned) BGEU rs1, rs2, immb Branch (Unsigned) pc <= (reg[rs1] < s reg[rs2])? pc + immb : pc + 4 pc <= (reg[rs1] s reg[rs2])? pc + immb : pc + 4 pc <= (reg[rs1] < u reg[rs2])? pc + immb : pc + 4 pc <= (reg[rs1] u reg[rs2])? pc + immb : pc + 4 These instructions are grouped in a category called BRANCH with fields (brfunc, rs1, rs2, immb) where brfunc is the function for alubr L11-15

Remaining Instructions Instruction Description Execution JAL rd, immj JALR rd, rs1, immi LUI rd, immu Jump and Link Jump and Link Register Load Upper Immediate reg[rd] <= pc + 4 pc <= pc + immj reg[rd] <= pc + 4 pc <= {(reg[rs1] + immu)[31:1], 1 b0} reg[rd] <= immu LW rd, rs1, immi Load Word reg[rd] <= mem[reg[rs1] + immi] SW rs1, rs2, imms Store Word mem[reg[rs1] + imms] <= reg[rs2] Each of these instructions is in a category by itself and needs to extract different fields from the instruction LW and SW need to access memory for execution and thus, are required to compute an effective memory address L11-16

Instruction decoder We need a function to extract the category and the various fields for each category from a 32-bit instruction Fields we have identified so far are: Instruction category: OP, OPIMM, BRANCH, JAL, JALR, LUI, LOAD, STORE, Unsupported Function for alu: alufunc Function for bralu: brfunc Register fields: rd, rs1, rs2 Immediate constants: immi(12), immb(12), immj(15), immu(20), imms(12) but each is used as a 32-bit value with proper sign extension Notice that no instruction has all the fields L11-17

Decoded Instruction Type typedef struct { IType AluFunc BrFunc Maybe#(RIndx) RIndx RIndx Word itype; alufunc; brfunc; dst; src1; src2; imm; Field names Type of the Field names Destination register 0 behaves like an Invalid destination If dst is invalid, register file update is not performed } DecodedInst deriving(bits, Eq); typedef enum {OP, OPIMM, BRANCH, LUI, JAL, JALR, LOAD, STORE, Unsupported} IType deriving(bits, Eq); typedef enum {Add, Sub, And, Or, Xor, Slt, Sltu, Sll, Sra, Srl} AluFunc deriving(bits, Eq); typedef enum {Eq, Neq, Lt, Ltu, Ge, Geu} BrFunc deriving(bits, Eq); function DecodedInst decode(bit#(32) inst); Lab 5 L11-18

Instruction interpreter Putting it all together module mkinterpreter(empty); Reg#(Word) pc <- mkreg(0); RFile2R1W rf <- mkrfile2r1w; MagicMemory imem <- mkmagicmemory; MagicMemory dmem <- mkmagicmemory; rule dointerpreter; let inst <- imem.req(memreq{op:ld, addr:pc, data:?}); let dinst = decode(inst); // dinst fields: itype, alufunc, brfunc, dst, src1, src2, imm let rval1 = rf.rd1(dinst.src1); let rval2 = rf.rd2(dinst.src2); let einst = execute(dinst, rval1, rval2, pc); // einst fields: itype, dst, data, addr, nextpc updatestate(einst, pc, rf, dmem); // returns an action to be performed endrule endmodule L11-19

Function execute function ExecInst execute( DecodedInst dinst, Word rval1, Word rval2, Word pc ); // extract from dinst: itype, alufunc, brfunc, imm // initialize einst and its fields: data, nextpc, addr to? case (itype) matches OP: begin data = alu(rval1, rval2, alufunc); nextpc = pc+4; end OPIMM: begin data = alu(rval1, imm, alufunc); nextpc = pc+4; end BRANCH: begin nextpc = alubr(rval1, rval2, brfunc)? pc+imm : pc+4; end LUI: begin data = imm; nextpc = pc+4; end JAL: begin data = pc+4; nextpc = pc+imm; end JALR: begin data = pc+4; nextpc = (rval1+imm) & ~1; end LOAD: begin addr = rval1+imm; nextpc = pc+4; end STORE: begin data = rval2; addr = rval1+imm; endcase endfunction nextpc = pc+4; end // assign to einst; To zero out the LSB L11-20

Function update function Action updatestate( ExecInst einst, Reg#(Word) pc, RFile2R1W rf, MagicMemory dmem); return (action //Extract fileds of einst: data, addr, dst; let data = einst.data; // memory access if (einst.itype == LOAD) begin This means we are passing the register, not its value, as a parameter data <- dmem.req(memreq{op:ld, addr:addr, data:?}); end else if (einst.itype == STORE) begin let dummy <- dmem.req(memreq{op:st, addr:addr, data:data}); end // register file write if (isvalid(dst)) rf.wr(frommaybe(?,dst), data); // pc update pc <= einst.nextpc; endaction); endfunction L11-21

Our interpreter is a Single-Cycle RISC-V Processor Register File 2 read & 1 write ports PC Decode Execute Inst Memory separate Instruction & Data memories Data Memory Datapath (arrows in this diagram) are derived automatically from the high-level rule-based Bluespec description L11-22

Take Home problem Decode the following instruction and describe its execution: 11111100000111111111000011101111 Refer to 6.004 ISA Reference Tables Available at [Course Webpage > Resources] Also, L9 Recitation Worksheet can be helpful October 16, 2018 MIT 6.004 Fall 2018 L11-23