A Model RISC Processor. DLX Architecture


General Features

Flat memory model with 32-bit address

Data types
  Integers (32-bit)
  Floating point
    Single precision (32-bit)
    Double precision (64-bit)

Register-register operation model

32 integer registers (32 bits wide)
  Named R0, R1, ..., R31
  Addressed as 00000 to 11111 in register address space
  Reg[R0] = 0 (constant)
  Other registers identical (no special-purpose registers)

32 FP registers (32 bits wide)
  Named F0, F1, ..., F31
  Satisfy IEEE 754 standard FP format
  A double-precision FP value is stored in a register pair (even, odd)

[Figure: block diagram showing integer registers R0 to R31, FP registers F0 to F31, ALU, FPU, data cache, and instruction cache]
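The register rules above can be sketched in Python. The class and function names are illustrative assumptions, and discarding writes to R0 is only one possible way to keep Reg[R0] constant; the slides do not say how the hardware enforces it.

```python
class RegisterFile:
    """32 integer registers, 32 bits wide; R0 always reads as 0."""

    def __init__(self):
        self._values = [0] * 32

    def read(self, number):
        return 0 if number == 0 else self._values[number]

    def write(self, number, value):
        # Assumption: a write to R0 is silently discarded.
        if number != 0:
            self._values[number] = value & 0xFFFFFFFF


def double_pair(even):
    """Return the (even, odd) FP register numbers that hold one double."""
    if even % 2 != 0 or not 0 <= even <= 30:
        raise ValueError("a double must start at an even register F0..F30")
    return even, even + 1


regs = RegisterFile()
regs.write(0, 99)            # has no effect
regs.write(5, 2**32 + 7)     # truncated to 32 bits
```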

Addressing Modes

  Mode               Example            Operation
  Register           ADD R3, R4, R5     Reg[R3] ← Reg[R4] + Reg[R5]
  Immediate          ADD R3, R4, #3     Reg[R3] ← Reg[R4] + 3
  Displacement       LW R3, 100(R1)     Reg[R3] ← Mem[100 + Reg[R1]]
  Register Deferred  LW R3, 0(R1)       Reg[R3] ← Mem[Reg[R1]]
  Absolute           LW R3, 100(R0)     Reg[R3] ← Mem[100]

The three memory addressing modes are all implemented using displacement addressing:

  Displacement       100(R1)            Reg[R3] ← Mem[100 + Reg[R1]]
  Register Deferred  0(R1)              Reg[R3] ← Mem[0 + Reg[R1]]
  Absolute           100(R0)            Reg[R3] ← Mem[100 + Reg[R0]]
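A minimal Python sketch of why one displacement calculation covers all three memory modes. The function name `effective_address` and the dictionary used as a register file are assumptions made for illustration.

```python
def effective_address(regs, base, displacement):
    """Displacement addressing: address = displacement + Reg[base].

    regs maps register number to value; R0 always reads as 0.
    """
    base_value = 0 if base == 0 else regs.get(base, 0)
    return (displacement + base_value) & 0xFFFFFFFF  # 32-bit flat address


regs = {1: 0x2000}
displacement_mode = effective_address(regs, base=1, displacement=100)  # 100(R1)
register_deferred = effective_address(regs, base=1, displacement=0)    # 0(R1)
absolute_mode = effective_address(regs, base=0, displacement=100)      # 100(R0)
```

Register deferred is the special case of a zero displacement, and absolute is the special case of base register R0.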

Data Transfer Instructions

Notation: ←n is an n-bit transfer, X_k is bit k of X (bit 0 is the most significant bit), (x)^n is x replicated n times, and ## is concatenation.

  Instruction        Name                 Operation
  LW R1, 30(R2)      Load Word            Reg[R1] ←32 Mem[30 + Reg[R2]]
  SW 30(R2), R1      Store Word           Mem[30 + Reg[R2]] ←32 Reg[R1]
  LB R1, 30(R2)      Load Byte            Reg[R1] ←32 (Mem[30 + Reg[R2]]_0)^24 ## Mem[30 + Reg[R2]]
  SB 30(R2), R1      Store Byte           Mem[30 + Reg[R2]] ←8 Reg[R1]_24..31
  LBU R1, 30(R2)     Load Byte Unsigned   Reg[R1] ←32 0^24 ## Mem[30 + Reg[R2]]
  LH R1, 30(R2)      Load Half Word       Reg[R1] ←32 (Mem[30 + Reg[R2]]_0)^16 ## Mem[30 + Reg[R2]]
  LF F1, 30(R2)      Load Float           Reg[F1] ←32 Mem[30 + Reg[R2]]
  SF 30(R2), F1      Store Float          Mem[30 + Reg[R2]] ←32 Reg[F1]
  MOVF F3, F1        Move Float           Reg[F3] ←32 Reg[F1]
  MOVD F2, F0        Move Double          Reg[F2],Reg[F3] ←64 Reg[F0],Reg[F1]
  MOVFP2I R2, F2     FP to INT            Reg[R2] ←32 Reg[F2]
  MOVI2FP F2, R2     INT to FP            Reg[F2] ←32 Reg[R2]
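The difference between LB and LBU is sign extension versus zero extension. A small Python sketch follows, in which the helper names and the dictionary standing in for byte-addressed memory are assumptions.

```python
def sign_extend(value, bits):
    """Replicate the top bit of a `bits`-wide value across a 32-bit word."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):                       # top bit set
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)  # fill upper bits with 1s
    return value


def load_byte(memory, address, signed=True):
    """LB when signed is True, LBU when signed is False."""
    byte = memory[address] & 0xFF
    return sign_extend(byte, 8) if signed else byte


memory = {0x30: 0x80}
lb_result = load_byte(memory, 0x30, signed=True)    # upper 24 bits copy bit 0x80
lbu_result = load_byte(memory, 0x30, signed=False)  # upper 24 bits are zero
```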

Arithmetic/Logic Instructions

  Instruction        Name                       Operation
  ADD R1, R2, R3     Add                        Reg[R1] ← Reg[R2] + Reg[R3]
  ADDI R1, R2, #3    Add Immediate              Reg[R1] ← Reg[R2] + 3
  SUB R1, R2, R3     Sub                        Reg[R1] ← Reg[R2] - Reg[R3]
  SUBI R1, R2, #3    Sub Immediate              Reg[R1] ← Reg[R2] - 3
  MULT R1, R2, R3    Multiply                   Reg[R1] ← Reg[R2] * Reg[R3]
  DIV R1, R2, R3     Divide                     Reg[R1] ← Reg[R2] / Reg[R3]
  AND R1, R2, R3     And                        Reg[R1] ← Reg[R2] AND Reg[R3]
  ANDI R1, R2, #3    And Immediate              Reg[R1] ← Reg[R2] AND 3
  OR R1, R2, R3      Or                         Reg[R1] ← Reg[R2] OR Reg[R3]
  ORI R1, R2, #3     Or Immediate               Reg[R1] ← Reg[R2] OR 3
  XOR R1, R2, R3     Exclusive Or               Reg[R1] ← Reg[R2] XOR Reg[R3]
  XORI R1, R2, #3    Exclusive Or Immediate     Reg[R1] ← Reg[R2] XOR 3
  LHI R1, #42        Load High                  Reg[R1] ← 42 ## 0^16
  SLT R1, R2, R3     Set Less Than              if Reg[R2] < Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
  SGT R1, R2, R3     Set Greater Than           if Reg[R2] > Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
  SLE R1, R2, R3     Set Less Than or Equal     if Reg[R2] ≤ Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
  SGE R1, R2, R3     Set Greater Than or Equal  if Reg[R2] ≥ Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
  SEQ R1, R2, R3     Set Equal                  if Reg[R2] = Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
  SNE R1, R2, R3     Set Not Equal              if Reg[R2] ≠ Reg[R3] then Reg[R1] ← 1 else Reg[R1] ← 0
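Two of the less familiar operations, Load High and the set-on-compare family, can be sketched in Python as follows. The function names are illustrative assumptions.

```python
def lhi(immediate):
    """Load High: the 16-bit immediate followed by sixteen 0 bits."""
    return (immediate & 0xFFFF) << 16


def set_if(condition):
    """Set instructions write 1 when the comparison holds, otherwise 0."""
    return 1 if condition else 0


high_word = lhi(42)          # 42 ## 0^16
slt_result = set_if(3 < 7)   # SLT with Reg[R2] = 3 and Reg[R3] = 7
sne_result = set_if(5 != 5)  # SNE with both operands equal to 5
```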

Floating Point Instructions

  Instruction        Name                        Operation
  ADDF F1, F2, F3    Add Float                   Reg[F1] ← Reg[F2] + Reg[F3]
  ADDD F0, F2, F4    Add Double                  Reg[F0],Reg[F1] ←64 Reg[F2],Reg[F3] + Reg[F4],Reg[F5]
  SUBF F1, F2, F3    Sub Float
  SUBD F0, F2, F4    Sub Double
  MULTF F1, F2, F3   Multiply Float
  MULTD F0, F2, F4   Multiply Double
  DIVF F1, F2, F3    Divide Float
  DIVD F0, F2, F4    Divide Double
  LTF F2, F3         Set Less Than               if Reg[F2] < Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  GTF F2, F3         Set Greater Than            if Reg[F2] > Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  LEF F2, F3         Set Less Than or Equal      if Reg[F2] ≤ Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  GEF F2, F3         Set Greater Than or Equal   if Reg[F2] ≥ Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  EQF F2, F3         Set Equal                   if Reg[F2] = Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  NEF F2, F3         Set Not Equal               if Reg[F2] ≠ Reg[F3] then StatFP ←1 1 else StatFP ←1 0
  LTD, GTD, LED, GED, EQD, NED                   Double precision comparisons

NOTE: Floating point numbers are represented as single or double precision numbers according to IEEE 754. The ALU functions for FP are not simple binary operations on the bits in the register.

Control Instructions

  Instruction        Name                     Operation
  J offset           Jump                     PC ← PC + offset (-2^25 ≤ offset ≤ 2^25 - 1)
  JAL offset         Jump and Link            Reg[R31] ← PC; PC ← PC + offset (-2^25 ≤ offset ≤ 2^25 - 1)
  JR R3              Jump Register            PC ← Reg[R3]
  JALR R2, offset    Jump and Link Register   Reg[R2] ← PC; PC ← PC + offset (-2^15 ≤ offset ≤ 2^15 - 1)
  BEQZ R4, offset    Branch equal zero        if Reg[R4] == 0 then PC ← PC + offset (-2^15 ≤ offset ≤ 2^15 - 1)
  BNEZ R4, offset    Branch not equal zero    if Reg[R4] != 0 then PC ← PC + offset (-2^15 ≤ offset ≤ 2^15 - 1)
  TRAP N             Software interrupt       Details not specified in Hennessy and Patterson

Note: Register NPC is updated (NPC ← PC + 4) when the branch instruction is loaded. Register PC is updated (PC ← NPC or PC ← PC + offset) at the end of instruction execution.
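The offset ranges above are those of signed two's-complement fields: 26 bits for J and JAL, 16 bits for the branches and JALR. A minimal Python check, with a hypothetical helper name:

```python
def offset_fits(offset, field_bits):
    """True if offset is representable as a signed field_bits-wide value."""
    low = -(1 << (field_bits - 1))
    high = (1 << (field_bits - 1)) - 1
    return low <= offset <= high


jump_ok = offset_fits(2**25 - 1, 26)     # largest forward jump offset
jump_too_far = offset_fits(2**25, 26)    # one past the limit
branch_ok = offset_fits(-2**15, 16)      # largest backward branch offset
```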

Programming in DLX Assembly Language

  for (i = 0; i < 256; i++) {
      a[i] = a[i] + b[i] - c[i] + d[i];
  }

Array addresses (hexadecimal):

  a[] = 000 to 3FF
  b[] = 400 to 7FF
  c[] = 800 to BFF
  d[] = C00 to FFF

  ADDI R1, R0, #0x400   ; 256 integers = 1024 bytes = 400h bytes
  LW   R2, -4(R1)       ; load word from a[] (400 - 4 = 3FC)
  LW   R3, 3FC(R1)      ; load word from b[] (400 + 3FC = 7FC)
  ADD  R4, R2, R3       ; add
  LW   R2, 7FC(R1)      ; load word from c[] (400 + 7FC = BFC)
  SUB  R4, R4, R2       ; sub
  LW   R2, BFC(R1)      ; load word from d[] (400 + BFC = FFC)
  ADD  R4, R4, R2       ; add
  SW   -4(R1), R4       ; store sum in a[]
  SUBI R1, R1, #4       ; i--
  BNEZ R1, -0x28        ; if R1 <> 0 jump back 10 instructions

The loop runs from the last element down to the first: R1 starts at 400h and decreases by 4 on each pass until it reaches 0.
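As a sanity check on the address arithmetic, the loop can be mirrored in Python. This is a sketch that uses a dictionary as word storage keyed by byte address; the function names and the test data are assumptions, not part of the original program.

```python
def run_loop(memory):
    """Mirror the DLX loop: R1 counts down from 0x400 to 0 in steps of 4."""
    r1 = 0x400
    while r1 != 0:                                   # BNEZ R1, ...
        a = memory[r1 - 4]                           # LW R2, -4(R1)
        b = memory[r1 + 0x3FC]                       # LW R3, 3FC(R1)
        c = memory[r1 + 0x7FC]                       # LW R2, 7FC(R1)
        d = memory[r1 + 0xBFC]                       # LW R2, BFC(R1)
        memory[r1 - 4] = (a + b - c + d) & 0xFFFFFFFF  # SW -4(R1), R4
        r1 -= 4                                      # SUBI R1, R1, #4
    return memory


def build_memory(a, b, c, d):
    """Place four 256-word arrays at 0x000, 0x400, 0x800 and 0xC00."""
    memory = {}
    for base, array in ((0x000, a), (0x400, b), (0x800, c), (0xC00, d)):
        for i, value in enumerate(array):
            memory[base + 4 * i] = value
    return memory


a = list(range(256))
b = [2 * i for i in range(256)]
c = [i + 5 for i in range(256)]
d = [7] * 256
memory = run_loop(build_memory(a, b, c, d))
result = [memory[4 * i] for i in range(256)]
expected = [a[i] + b[i] - c[i] + d[i] for i in range(256)]
```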

Implementation

General approach
  No central system bus
  Base hardware organization on an assembly line with uniform operations
  Separate memory for instructions and data

High level design
  Instructions move through 5 stages (left to right)
  First two stages are identical for all instructions
    FETCH and DECODE
  Last three stages operate according to the instruction
    EXECUTE (ALU instructions and address calculations)
    MEMORY ACCESS (Load/Store instructions)
    WRITE BACK (register update for Load and ALU instructions)

[Figure: five stages in sequence (Instruction Fetch, Instruction Decode, Execute, Data Access, Write Back), with Instruction Memory (address in, instruction out) and Data Memory (address in, data in/out)]

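RISC Performance

Compare VAX with MIPS 2000 (RISC CPU) on SPEC 89 results, at the same clock rate:

\[ \frac{IC_{MIPS}}{IC_{VAX}} \approx 2, \qquad \frac{CPI_{MIPS}}{CPI_{VAX}} \approx \frac{1}{6} \]

\[ S = \frac{CPI_{VAX} \times IC_{VAX} \times \tau}{CPI_{MIPS} \times IC_{MIPS} \times \tau} = 6 \times \frac{1}{2} = 3 \]

Ref: Hennessy-Patterson, Figure 2-30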
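The same arithmetic as a short Python sketch. Exact fractions are used so the result is not blurred by floating-point rounding; the function name and parameter names are assumptions.

```python
from fractions import Fraction


def speedup(ic_ratio, cpi_ratio):
    """Speedup of the new machine over the old one at equal clock rates.

    ic_ratio  = IC_new / IC_old
    cpi_ratio = CPI_new / CPI_old
    """
    return 1 / (Fraction(ic_ratio) * Fraction(cpi_ratio))


# MIPS executes about twice as many instructions at about 1/6 of the CPI.
mips_over_vax = speedup(ic_ratio=2, cpi_ratio=Fraction(1, 6))
```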

Instruction Formats

32-bit instructions (bits 0 to 31)

Three instruction formats
  J-type  Jump (unconditional branch) instructions
          Specifies branch offset
  R-type  Register-register ALU instructions
          Specifies destination register (rd) and two source registers (rs1, rs2)
  I-type  All other instructions
          Specifies destination register (rd), immediate, and source register (rs)

  Type   Bits 0-5   Bits 6-10   Bits 11-15   Bits 16-31
  R      opcode     rs1         rs2          rd (16-20), function (21-31)
  I      opcode     rs          rd           immediate
  J      opcode     offset (bits 6-31)

Field widths for R-type: 6, 5, 5, 5, 11 bits.

J Type Instruction Format

  Field   Opcode   Offset added to PC
  Bits    6        26

Encodes:
  Jump            PC ← PC + offset
  Jump and link   r31 ← PC; PC ← PC + offset
  Trap and return from exception
    Implementation unspecified in Hennessy and Patterson
    Two possible implementations for the Offset field:
      1. Lower 26 bits of the physical address of the Interrupt Service Routine
      2. Trap number = index into the Interrupt Vector Table

R Type Instruction

  Field   Opcode   rs1   rs2   rd   function
  Bits    6        5     5     5    11

Encodes:
  Register-register ALU operations   rd ← rs1 function rs2
  Function encodes the ALU operation: Add, Sub, ...

I Type Instruction

  Field   Opcode   rs   rd   Immediate
  Bits    6        5    5    16

Encodes:
  Loads                                      rd ← imm(rs)
  Stores                                     imm(rs) ← rd
  ALU operations with immediate operand      rd ← rs op immediate
  Conditional branch instructions            if rs eq/ne 0 then PC ← PC + imm (rd unused)
  Jump register                              PC ← rs
  Jump and link register                     rd ← PC; PC ← PC + immediate
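The three formats can be sketched as Python bit-packing routines. Two assumptions are made: bit 0 is the leftmost (most significant) bit of the word, which matches the left-to-right field order in the tables, and the opcode and function values used below are placeholders, since this document does not list the real DLX opcode numbers.

```python
def encode_r_type(opcode, rs1, rs2, rd, function):
    return (((opcode & 0x3F) << 26) | ((rs1 & 0x1F) << 21) |
            ((rs2 & 0x1F) << 16) | ((rd & 0x1F) << 11) | (function & 0x7FF))


def encode_i_type(opcode, rs, rd, immediate):
    return (((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) |
            ((rd & 0x1F) << 16) | (immediate & 0xFFFF))


def encode_j_type(opcode, offset):
    return ((opcode & 0x3F) << 26) | (offset & 0x3FFFFFF)


def field(word, first, last):
    """Extract bits first..last using the slide numbering (bit 0 = leftmost)."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


HYPOTHETICAL_ADDI = 0b001000   # placeholder opcode, not the real DLX value
HYPOTHETICAL_RR = 0b000000     # placeholder opcode for register-register ops
HYPOTHETICAL_ADD = 0b100000    # placeholder function code

addi_word = encode_i_type(HYPOTHETICAL_ADDI, rs=2, rd=1, immediate=5)
add_word = encode_r_type(HYPOTHETICAL_RR, rs1=2, rs2=3, rd=1,
                         function=HYPOTHETICAL_ADD)
```

With this numbering, `field(addi_word, 6, 10)` recovers rs and `field(add_word, 16, 20)` recovers rd.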

Implementation Details

Execution Stages by Instruction Type

Stages 1 and 2 are the same for all four types:
  Stage 1: Fetch instruction from memory
  Stage 2: Decode operation and operands

  Stage   ALU                        Store                      Load                        Branch
  3       Calculate ALU operation    Calculate memory address   Calculate memory address    Calculate branch condition;
                                                                                            calculate branch address
  4       (not used)                 Store data to memory;      Load data from memory       Update PC
                                     update PC
  5       Write result to register;  (not used)                 Write loaded data to        (not used)
          update PC                                             register; update PC

Temporary Registers for Implementation

  Register   Name                   Purpose
  IR         Instruction Register   Holds fetched instruction during execution
  PC         Program Counter        Memory address of next instruction
  NPC        Next Program Counter   Temporary update of PC (points to fall-through instruction)
  A, B, I    Operand buffers        Values read from data registers according to instruction
  ALUout     ALU output             Result of ALU operation
  LMD        Load Memory Data       Data loaded from memory
  Cond       Condition flag         Result of test for conditional branch

Example: Type I ALU Instruction

  Instruction   addi R1, R2, #5
  Operation     Reg[R1] ← Reg[R2] + 5

  Encoding
    Bits    0-5    6-10    11-15   16-31
    Value   addi   00010   00001   0000 0000 0000 0101
    Field   op     rs      rd      immediate

  Stage 1   IR ← Mem[PC]
            NPC ← PC + 4
  Stage 2   A ← Reg[IR_6-10]              /* A ← Reg[R2] */
            B ← Reg[IR_11-15]             /* B ← Reg[R1] */
            I ← (IR_16)^16 ## IR_16-31
  Stage 3   ALUout ← A + I
  Stage 4   (not used)
  Stage 5   Reg[IR_11-15] ← ALUout        /* Reg[R1] ← A + I */
            PC ← NPC
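The register transfers for this example can be traced in Python. This is a sketch only: the function names are assumptions, `npc` stands for the Next Program Counter temporary, and the register file is a plain list.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_addi(regs, pc, rs, rd, immediate):
    """Run one addi through the stages; returns the new PC."""
    npc = pc + 4                        # fetch: NPC <- PC + 4
    a = regs[rs]                        # decode: A <- Reg[rs]
    i = sign_extend_16(immediate)       # decode: I <- sign-extended immediate
    alu_out = (a + i) & 0xFFFFFFFF      # execute: ALUout <- A + I
    if rd != 0:                         # Reg[R0] stays 0
        regs[rd] = alu_out              # write back: Reg[rd] <- ALUout
    return npc                          # PC <- NPC


regs = [0] * 32
regs[2] = 10
new_pc = execute_addi(regs, pc=0x100, rs=2, rd=1, immediate=5)  # addi R1, R2, #5
```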

Example: Type R ALU Instruction

  Instruction   add R1, R2, R3
  Operation     Reg[R1] ← Reg[R2] + Reg[R3]

  Encoding
    Bits    0-5   6-10    11-15   16-20   21-31
    Value   R-R   00010   00011   00001   add
    Field   op    rs1     rs2     rd      funct

  Stage 1   IR ← Mem[PC]
            NPC ← PC + 4
  Stage 2   A ← Reg[IR_6-10]              /* A ← Reg[R2] */
            B ← Reg[IR_11-15]             /* B ← Reg[R3] */
            I ← (IR_16)^16 ## IR_16-31
  Stage 3   ALUout ← A + B
  Stage 4   (not used)
  Stage 5   Reg[IR_16-20] ← ALUout        /* Reg[R1] ← A + B */
            PC ← NPC

Example: Type I Store Instruction

  Instruction   SW 32(R1), R2
  Operation     Mem[32 + Reg[R1]] ← Reg[R2]

  Encoding
    Bits    0-5   6-10    11-15   16-31
    Value   SW    00001   00010   0000 0000 0010 0000
    Field   op    rs      rd      immediate

  Stage 1   IR ← Mem[PC]
            NPC ← PC + 4
  Stage 2   A ← Reg[IR_6-10]              /* A ← Reg[R1] */
            B ← Reg[IR_11-15]             /* B ← Reg[R2] */
            I ← (IR_16)^16 ## IR_16-31
  Stage 3   ALUout ← A + I
  Stage 4   Mem[ALUout] ← B               /* Mem[A+I] ← Reg[R2] */
            PC ← NPC
  Stage 5   (not used)

Example: Type I Load Instruction

  Instruction   LW R2, 32(R1)
  Operation     Reg[R2] ← Mem[32 + Reg[R1]]

  Encoding
    Bits    0-5   6-10    11-15   16-31
    Value   LW    00001   00010   0000 0000 0010 0000
    Field   op    rs      rd      immediate

  Stage 1   IR ← Mem[PC]
            NPC ← PC + 4
  Stage 2   A ← Reg[IR_6-10]              /* A ← Reg[R1] */
            B ← Reg[IR_11-15]             /* B ← Reg[R2] */
            I ← (IR_16)^16 ## IR_16-31
  Stage 3   ALUout ← A + I
  Stage 4   LMD ← Mem[ALUout]             /* LMD ← Mem[A+I] */
  Stage 5   Reg[IR_11-15] ← LMD           /* Reg[R2] ← Mem[A+I] */
            PC ← NPC
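The store and load examples differ only after the address calculation, which a short Python trace makes visible. The function names, the list used as a register file, and the dictionary used as data memory are assumptions for illustration.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def store_word(regs, memory, pc, rs, rd, immediate):
    """SW imm(rs), rd: four stages, no register write back."""
    npc = pc + 4                                   # fetch
    a, b = regs[rs], regs[rd]                      # decode
    alu_out = (a + sign_extend_16(immediate)) & 0xFFFFFFFF  # address
    memory[alu_out] = b                            # memory access
    return npc                                     # PC <- NPC


def load_word(regs, memory, pc, rs, rd, immediate):
    """LW rd, imm(rs): five stages, ending with the register write back."""
    npc = pc + 4                                   # fetch
    a = regs[rs]                                   # decode
    alu_out = (a + sign_extend_16(immediate)) & 0xFFFFFFFF  # address
    lmd = memory[alu_out]                          # memory access: LMD
    regs[rd] = lmd                                 # write back
    return npc                                     # PC <- NPC


regs = [0] * 32
regs[1], regs[2] = 0x1000, 0xABCD
memory = {}
pc = store_word(regs, memory, 0, rs=1, rd=2, immediate=32)   # SW 32(R1), R2
pc = load_word(regs, memory, pc, rs=1, rd=3, immediate=32)   # LW R3, 32(R1)
```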

Example: Type I Conditional Branch Instruction

  Instruction   beqz R1, 1024
  Operation     if (Reg[R1] == 0) PC ← NPC + 1024 else PC ← NPC

  Encoding
    Bits    0-5    6-10    11-15   16-31
    Value   beqz   00001   00000   0000 0100 0000 0000
    Field   op     rs      rd      immediate

  Stage 1   IR ← Mem[PC]
            NPC ← PC + 4
  Stage 2   A ← Reg[IR_6-10]              /* A ← Reg[R1] */
            B ← Reg[IR_11-15]             /* B ← Reg[R0] */
            I ← (IR_16)^16 ## IR_16-31
  Stage 3   ALUout ← NPC + I
            if (A == 0) cond = 1 else cond = 0
  Stage 4   if (cond == 1) PC ← ALUout else PC ← NPC
  Stage 5   (not used)
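A Python trace of the branch stages, as a sketch. The function names are assumptions, and `npc` stands for the Next Program Counter temporary, so the target is computed relative to the fall-through address.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_beqz(regs, pc, rs, immediate):
    """Run one beqz through its four stages; returns the new PC."""
    npc = pc + 4                          # fetch: fall-through address
    a = regs[rs]                          # decode: A <- Reg[rs]
    i = sign_extend_16(immediate)         # decode: I <- sign-extended immediate
    alu_out = (npc + i) & 0xFFFFFFFF      # execute: branch target
    cond = 1 if a == 0 else 0             # execute: branch condition
    return alu_out if cond == 1 else npc  # stage 4: select the new PC


regs = [0] * 32
taken_pc = execute_beqz(regs, pc=0x100, rs=1, immediate=1024)      # Reg[R1] == 0
regs[1] = 7
not_taken_pc = execute_beqz(regs, pc=0x100, rs=1, immediate=1024)  # Reg[R1] != 0
```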

DLX Drawing, Version 1

[Figure: DLX datapath, version 1]

A mux (multiplexer) chooses 1 output from N inputs.

Type I ALU Instruction (datapath figures, steps 1 to 4)

  addi r1, r2, #5        regs[r1] ← regs[r2] + 5

  Step 1   Instruction read from mem[PC]; PC + 4 computed
  Step 2   Reg[IR_6-10] and Reg[IR_11-15] read; immediate taken from IR_16-31
  Step 3   ALU computes A + I
  Step 4   A + I written to Reg[IR_11-15]

Type R ALU Instruction (datapath figures, steps 1 to 4)

  add r1, r2, r3         regs[r1] ← regs[r2] + regs[r3]

  Step 1   Instruction read from mem[PC]; PC + 4 computed
  Step 2   Reg[IR_6-10] and Reg[IR_11-15] read; immediate taken from IR_16-31
  Step 3   ALU computes A + B
  Step 4   A + B written to Reg[IR_16-20]

Type I Store Instruction (datapath figures, steps 1 to 4)

  sw 32(r1), r2          mem[32 + regs[r1]] ← regs[r2]

  Step 1   Instruction read from mem[PC]; PC + 4 computed
  Step 2   Reg[IR_6-10] and Reg[IR_11-15] read; immediate taken from IR_16-31
  Step 3   ALU computes the address A + I
  Step 4   B written to data memory at address A + I

Type I Load Instruction (datapath figures, steps 1 to 5)

  lw r2, 32(r1)          regs[r2] ← mem[32 + regs[r1]]

  Step 1   Instruction read from mem[PC]; PC + 4 computed
  Step 2   Reg[IR_6-10] and Reg[IR_11-15] read; immediate taken from IR_16-31
  Step 3   ALU computes the address A + I
  Step 4   mem[A+I] read from data memory
  Step 5   mem[A+I] written to Reg[IR_11-15]

Type I Branch Instruction (datapath figures, steps 1 to 4)

  beqz r1, 1024          if (regs[r1] == 0) PC ← NPC + I else PC ← NPC

  Step 1   Instruction read from mem[PC]; PC + 4 computed
  Step 2   Reg[IR_6-10] and Reg[IR_11-15] read; immediate taken from IR_16-31
  Step 3   ALU computes NPC + I; cond set from the test on A
  Step 4   PC selected between NPC and NPC + I according to cond

Performance

Instruction distribution for version 1, based on compilation of SPEC 92:

  Type i      ALU    Load   Store   Branch
  IC_i / IC   40%    25%    15%     20%
  CPI_i       4      5      4       4

\[ CPI = \sum_i CPI_i \times \frac{IC_i}{IC} = 4 \times 0.40 + 5 \times 0.25 + 4 \times 0.15 + 4 \times 0.20 = 4.25 \]
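The weighted average can be recomputed in a few lines of Python. The dictionary layout and variable names are assumptions; the figures are the ones in the table.

```python
# Each entry is (fraction of instruction count, cycles per instruction).
instruction_mix = {
    "ALU": (0.40, 4),
    "Load": (0.25, 5),
    "Store": (0.15, 4),
    "Branch": (0.20, 4),
}

total_fraction = sum(fraction for fraction, _ in instruction_mix.values())
average_cpi = sum(fraction * cpi for fraction, cpi in instruction_mix.values())

print(f"fractions sum to {total_fraction:.2f}, CPI = {average_cpi:.2f}")
```

Only loads use all five stages, so the average sits between 4 and 5, a quarter of the way up because loads make up 25% of the instruction count.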