Announcements. EE108B Lecture MIPS Assembly Language III. MIPS Machine Instruction Review: Instruction Format Summary

Similar documents
EE108B Lecture 3. MIPS Assembly Language II

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

MIPS%Assembly% E155%

CPU Performance. Christos Kozyrakis. Stanford University C. Kozyrakis EE108b Lecture 5

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

Computer Architecture. The Language of the Machine

Control Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary

Control Instructions

Computer Architecture Instruction Set Architecture part 2. Mehran Rezaei

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

MIPS Instruction Set

CS 61c: Great Ideas in Computer Architecture

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Chapter 2A Instructions: Language of the Computer

Reduced Instruction Set Computer (RISC)

Reduced Instruction Set Computer (RISC)

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 3 2/8/2018

Chapter 2. Instructions: Language of the Computer. Adapted by Paulo Lopes

ECE 2035 Programming HW/SW Systems Fall problems, 7 pages Exam Two 23 October 2013

CSCI 402: Computer Architectures. Instructions: Language of the Computer (3) Fengguang Song Department of Computer & Information Science IUPUI.

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set

Computer Architecture. MIPS Instruction Set Architecture

Mips Code Examples Peter Rounce

CENG3420 Lecture 03 Review

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

SPIM Instruction Set

MIPS R-format Instructions. Representing Instructions. Hexadecimal. R-format Example. MIPS I-format Example. MIPS I-format Instructions

MIPS ISA. 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support

Lecture 5: Procedure Calls

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

Mark Redekopp, All rights reserved. EE 357 Unit 11 MIPS ISA

MIPS Reference Guide

EEM 486: Computer Architecture. Lecture 2. MIPS Instruction Set Architecture

ECE 15B Computer Organization Spring 2010

5/17/2012. Recap from Last Time. CSE 2021: Computer Organization. The RISC Philosophy. Levels of Programming. Stored Program Computers

Recap from Last Time. CSE 2021: Computer Organization. Levels of Programming. The RISC Philosophy 5/19/2011

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

ECE232: Hardware Organization and Design

Computer Architecture

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

Thomas Polzer Institut für Technische Informatik

The MIPS Instruction Set Architecture

comp 180 Lecture 10 Outline of Lecture Procedure calls Saving and restoring registers Summary of MIPS instructions

MIPS Instruction Set Architecture (2)

ECE 2035 Programming HW/SW Systems Spring problems, 6 pages Exam Two 11 March Your Name (please print) total

EE108B Lecture 2 MIPS Assembly Language I. Christos Kozyrakis Stanford University

CS3350B Computer Architecture MIPS Instruction Representation

MIPS Assembly Language. Today s Lecture

Today s Lecture. MIPS Assembly Language. Review: What Must be Specified? Review: A Program. Review: MIPS Instruction Formats

Course Administration

Instructions: MIPS ISA. Chapter 2 Instructions: Language of the Computer 1

Instruction Set Architecture

MIPS Functions and the Runtime Stack

Instructions: Assembly Language

Architecture II. Computer Systems Laboratory Sungkyunkwan University

EEC 581 Computer Architecture Lecture 1 Review MIPS

Overview. Introduction to the MIPS ISA. MIPS ISA Overview. Overview (2)

Assembly Language Programming. CPSC 252 Computer Organization Ellen Walker, Hiram College

COMPUTER ORGANIZATION AND DESIGN

ENCM 369 Winter 2013: Reference Material for Midterm #2 page 1 of 5

MIPS Functions and Instruction Formats

ECE 2035 Programming HW/SW Systems Fall problems, 6 pages Exam Two 23 October Your Name (please print clearly) Signed.

Computer Architecture

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

EE 361 University of Hawaii Fall

Today s topics. MIPS operations and operands. MIPS arithmetic. CS/COE1541: Introduction to Computer Architecture. A Review of MIPS ISA.

Examples of branch instructions

Computer Organization MIPS ISA

We will study the MIPS assembly language as an exemplar of the concept.

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

CS61C Machine Structures. Lecture 12 - MIPS Procedures II & Logical Ops. 2/13/2006 John Wawrzynek. www-inst.eecs.berkeley.

Chapter 2. Instruction Set Architecture (ISA)

Programming the processor

Flow of Control -- Conditional branch instructions

Course Administration

ECE 2035 Programming HW/SW Systems Fall problems, 6 pages Exam Two 21 October 2016

Concocting an Instruction Set

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

MIPS Datapath. MIPS Registers (and the conventions associated with them) MIPS Instruction Types

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Chapter 2. Instructions:

CS61C : Machine Structures

ECE 2035 A Programming Hw/Sw Systems Fall problems, 8 pages Final Exam 8 December 2014

ECE468 Computer Organization & Architecture. MIPS Instruction Set Architecture

ECE 2035 Programming HW/SW Systems Fall problems, 6 pages Exam One 19 September 2012

ELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)

Lec 10: Assembler. Announcements

Instructions: MIPS arithmetic. MIPS arithmetic. Chapter 3 : MIPS Downloaded from:

Computer Organization MIPS Architecture. Department of Computer Science Missouri University of Science & Technology

Procedure Calling. Procedure Calling. Register Usage. 25 September CSE2021 Computer Organization

Chapter 3 MIPS Assembly Language. Ó1998 Morgan Kaufmann Publishers 1

ECE260: Fundamentals of Computer Engineering

Lecture 4: MIPS Instruction Set

ECE 2035 Programming Hw/Sw Systems Fall problems, 10 pages Final Exam 9 December 2013

ECE 2035 A Programming Hw/Sw Systems Spring problems, 8 pages Final Exam 29 April 2015

Q1: /30 Q2: /25 Q3: /45. Total: /100

Transcription:

Announcements EE108B Lecture MIPS Assembly Language III Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b PA1 available, due on Thursday 2/8 Work on you own (no groups) Homework #1 due on Tue 1/23, 5pm outside Gates 310 Lab #1 is available Due on Tuesday 1/30, midnight EE108B Review Session Friday: 2:15pm-3:05pm in Thornton 102 1 2 Review of MIPS Assembly Language II MIPS Machine Instruction Review: Instruction Format Summary Memory data transfer Data alignment Control transfer instructions Branches and jumps Machine language Encoding assembly instructions MIPS instruction formats R-format 3 4

Today s Menu I-Format Reading 2.5, 2.7, 2.9 I & J format Introduce a few more instructions Logical operations jal, jr For loops Switch statements Procedure calls 5 The immediate instruction format Uses different opcodes for each instruction Immediate field is signed (positive/negative constants) Used for loads and stores as well as other instructions with immediates (addi, lui, etc.) Also used for branches Bits 6 5 5 16 OP rs rt imm First Source Register Second Source Register or dest Immediate 6 I-Format Example Consider the addi instruction addi $t0, $t1, 1 # $t0 = $t1 + 1 Fill in each of the fields Bits 6 5 5 16 8 9 8 1 First Second Source Source Register Register Immediate Another I-Format Example Consider the while loop Loop: addu $t0, $s0, $s0 # $t0 = 2 * i addu $t0, $t0, $t0 # $t0 = 4 * i add $t1, $t0, $s3 # $t1 = &(A[i]) lw $t2, 0($t1) # $t2 = A[i] bne $t2, $s2, Exit # goto Exit if!= addu $s0, $s0, $s1 # i = i + j j Loop # goto Loop Exit: Pretend the first instruction is located at address 80000 001000 01001 01000 0000000000000001 7 8

I-Format Example (Incorrect) PC Relative Addressing Consider the bne instruction bne $t2, $s2, Exit # goto Exit if $t2!= $s2 Fill in each of the fields Bits 6 5 5 16 5 10 18 8 How can we improve our use of immediate addresses when branching? Since instructions are always 32 bits long and word addressing requires alignment, every address must be a multiple of 4 bytes Therefore, we actually branch to the address that is PC + 4 + 4 immediate First Second Source Source Register Register Immediate 000101 01010 10010 0000000000001000 This is not the optimum encoding 9 10 I-Format Example Branching Far Away Re-consider the bne instruction bne $t2, $s2, Exit # goto Exit if $t2!= $s2 Use PC-Relative addressing for the immediate Bits 6 5 5 16 5 10 18 2 First Second Source Source Register Register Immediate 000101 01010 10010 0000000000000010 If the target is greater than -2 15 to 2 15-1 words away, then the compiler inverts the condition and inserts an unconditional jump Consider the example where L1 is far away beq $s0, $s1, L1 # goto L1 if S$0=$s1 Can be rewritten as bne $s0, $s1, L2 j L1 L2: # Inverted # Unconditional jump Compiler must be careful not to cross 256 MB boundaries with jump instructions 11 12

I-Format Example: Load/Store J-Format Consider the lw instruction lw $t2, 0($t1) Fill in each of the fields Bits 6 5 5 16 35 9 10 0 First Second Source Source Register Register # $t2 = Mem[$t1] Immediate 001000 01001 01010 0000000000000000 The jump instruction format Different opcodes for each instruction Examples include j and jal instructions Absolute addressing since long jumps are common Based on word addressing (target 4) Pseudodirect addressing where 28 bits from target, and remaining 4 bits come from upper bits of PC Bits 6 26 OP target Jump Target Address Jump PC = PC 31..28 target 00 13 14 sum_pow2 Revised Machine and DisAssembly sum_pow2: # $a0 = b, $a1 = c addu $a0,$a0,$a1 # a = b + c, $a0 = a bltz $a0, Exceed # goto Exceed if $a0 < 0 slti $v0,$a0,8 # $v0 = a < 8 beq $v0,$zero, Exceed # goto Exceed if $v0 == 0 addiu $v1,$sp,8 # $v1 = pow2 address sll $v0,$a0,2 # $v0 = a*4 addu $v0,$v0,$v1 # $v0 = pow2 + a*4 lw $v0,0($v0) # $v0 = pow2[a] b Return # goto Return Exceed: addu $v0,zero,zero # $v0 = 0 Return: jr ra # return sum_pow2 sum_pow2: 0x400a98: 00 85 20 21 addu a0,a0,a1 0x400a9c: 04 80 00 08 bltz a0,0x400abc 0x400aa0: 28 82 00 06 slti v0,a0,6 0x400aa4: 10 40 00 06 beq v0,zero,0x400abc 0x400aa8: 27 a3 00 08 addiu v1,sp,8 0x400aac: 00 04 10 80 sll v0,a0,2 0x400ab0: 00 43 10 21 addu v0,v0,v1 0x400ab4: 8c 42 00 00 lw v0,0(v0) 0x400ab8: 10 00 00 01 j 0x400ac0 0x400abc: 00 00 10 21 addu v0,zero,zero 0x400ac0: 03 e0 00 08 jr ra 15 16

Addressing Modes Summary Addressing Modes Summary 1. Immediate addressing op rs rt Immediate Register addressing Operand is a register (e.g. ALU) Base/displacement addressing (ex. load/store) Operand is at the memory location that is the sum of a base register + a constant Immediate addressing (e.g. constants) Operand is a constant within the instruction itself PC-relative addressing (e.g. branch) Address is the sum of PC and constant in instruction (e.g. branch) Pseudo-direct addressing (e.g. jump) Target address is concatenation of field in instruction and the PC 2. Register addressing op rs rt rd... funct 3. Base addressing op rs rt Address Register 4. PC-relative addressing op rs rt Address PC 5. Pseudodirect addressing op Address + + Registers Register Byte Memory Halfword Word Memory Word Memory PC Word 17 18 Logical Operators Bitwise operators often useful for bit manipulation 31 22 6 0 9 bits 17 bits 6 bits 31 16 0 0 0 sll $t0, $t3, 9 # shift $t3 left by 9, store in $t0 srl $t0, $t0, 15 # shift $t0 right by 15 Always operate unsigned except for arithmetic shifts MIPS Logical Instructions Instruction Example Meaning Comments and and $1, $2, $3 $1 = $2 & $3 Logical AND or or $1, $2, $3 $1 = $2 $3 Logical OR xor xor $1, $2, $3 $1 = $2 $3 Logical XOR nor nor $1, $2, $3 $1 = ~($2 $3) Logical NOR and immediate andi $1, $2, 10 $1 = $2 & 10 Logical AND w. constant or immediate ori $1, $2, 10 $1 = $2 10 Logical OR w. constant xor immediate xori $1, $2, 10 $1 = ~$2 & ~10 Logical XOR w. constant shift left log sll $1, $2, 10 $1 = $2 << 10 Shift left by constant shift right log srl $1, $2, 10 $1 = $2 >> 10 Shift right by constant shift rt. Arith sra $1, $2, 10 $1 = $2 >> 10 Shift rt. (sign extend) shift left var sllv $1, $2, $3 $1 = $2 << $3 Shift left by variable shift right var srlv $1, $2, $3 $1 = $2 >> $3 Shift right by variable shift rt. arith srav $1, $2, $3 $1 = $2 >> $3 Shift right arith. var load upper imm lui $1, 40 $1 = 40 << 16 Places imm in upper 16b 19 20

Loading a 32 bit Constant For Loop Example MIPS only has 16 bits of immediate value Could load from memory but still have to generate memory address Use lui and ori to load 0xdeadbeef into $a0 lui $a0, 0xdead # $a0 = dead0000 ori $a0, $a0, 0xbeef # $a0 = deadbeef /* Compute x raised to nonnegative power p */ int ipowr_for(int x, unsigned p) int result; for (result = 1; p!= 0; p = p>>1) if (p & 0x1) result *= x; x = x*x; return result; Algorithm Exploit property that p = p 0 + 2p 1 + 4p 2 + 2 n 1 p n 1 Gives: x p = z 0 z 2 1 (z 2 2 ) 2 ( ((z n 12 ) 2 ) ) 2 z i = 1 when p i = 0 z i = x when p i = 1 times Complexity O(log p) Example 3 10 = 3 2 * 3 8 = 3 2 * ((3 2 ) 2 ) 2 21 22 While Loop Transformation For Loop Transformation While loop review while () Similar to while loop for (; ; ) goto test; loop: test: if () goto loop; ; goto test; loop: ; test: if () goto loop; 23 24

For Example Switch Statements int result; for (result = 1; p!= 0; p = p>>1) if (p & 0x1) result *= x; x = x*x; result = 1 p!= 0 if (p & 0x1) result *= x; x = x*x; p = p >> 1 result = 1; goto test; loop: if (p & 0x1) result *= x; x = x*x; p = p >> 1; test: if (p!= 0) goto loop; MIPS assembly is left as an exercise multiply covered in Chap 3 typedef enum ADD, MULT, MINUS, DIV, MOD, BAD op_type; char unparse_symbol(op_type op) switch (op) case ADD : return '+'; case MULT: return '*'; case MINUS: return '-'; case DIV: return '/'; case MOD: return '%'; case BAD: return '?'; Implementation Options: 1. Series of conditional branches Good if few cases Slow if many 2. Jump Table Lookup branch target Use the MIPS jr instruction to unconditionally jump to address stored in a register jr dest # Jump to $dest Avoids conditional branches Possible when cases are small integer constants C compiler e.g. gcc Picks one based on case structure 25 26 Jump Table Structure Switch Statement Example Switch Form switch(op) case val_0: 0 case val_1: 1 case val_n-1: 1 Approx. Translation jtab: target = JTab[op]; goto *target; Jump Table Targ0 Targ1 Targ2 Targn-1 Targ0: Targ1: Targn-1: Jump Targets Code Block 0 Code Block 1 Code Block 1 typedef enum ADD, MULT, MINUS, DIV, MOD, BAD op_type; char unparse_symbol(op_type op) switch (op) Setup: Enumerated Values ADD 0 MULT 1 MINUS 2 DIV 3 MOD 4 BAD 5 bltz $a0, Exit # if op < 0 goto Exit slti $t0, $a0, 6 # $t0 = 1 if op < 6 beq $t0, $zero, Exit # if op >= 6 goto Exit slli $t1, $a0, 2 # $t1 = 4 * op add $t2, $t1, $t4 # $t2 =&(JumpTable[op]) # assumes $t4=jumptable lw $t3, 0($t2) # $t3 = JumpTable[op] jr $t3 # jump to JumpTable[op] 27 28

Jump Table Targets Procedure Call and Return JumpTable:.word L0 #Op = 0.word L1 #Op = 1.word L2 #Op = 2.word L3 #Op = 3.word L4 #Op = 4.word L5 #Op = 5 Advantage of Jump Table Can do k-way branch L0: L1: L2: ori $v0, $zero, 43 # + j Exit ori $v0, $zero, 42 # * j Exit ori $v0, $zero, 45 # - j Exit L3: ori $v0, $zero, 47 # / j Exit L4: ori $v0, $zero, 37 # % j Exit L5: ori $v0, $zero, 63 #? Exit: # end of switch Procedures are required for structured programming Aka: functions, methods, subroutines, Implementing procedures in assembly requires several things to be done Memory space must be set aside for local variables Arguments must be passed in and return values passed out Execution must continue after the call Procedure Steps 1. Place parameters in a place where the procedure can access them 2. Transfer control to the procedure 3. Acquire the storage resources needed for the procedure 4. Perform the desired task 5. Place the result value in a place where the calling program can access it 6. Return control to the point of origin 29 30 Call and Return To jump to a procedure, use the jal or jalr instructions jal target # Jump and link to label jalr $dest # Jump and link to $dest Jump and link The program counter (PC) stores the address of the currently executing instruction The jump and link instructions stores the next instruction address in $ra before transferring control to the target/destination Therefore, the address stored in $ra is PC + 4 To return, use the jr instruction jr $ra 31 Stack-Based Languages Languages that Support Recursion e.g., C, Java, Pascal, Code must be Reentrant Multiple simultaneous instantiations of single procedure Need some place to store state of each instantiation Arguments, local variables, return pointer Stack Discipline State for given procedure needed for limited time From when called to when return Callee returns before caller does LIFO Stack Allocated in Frames State for single procedure instantiation 32

Nested Stacks The stack grows downward and shrinks upward A: CALL B B: CALL C C: RET RET A A B C A A B A B Stacks Data is pushed onto the stack to store it and popped from the stack when not longer needed MIPS does not support in hardware (use loads/stores) Procedure calling convention requires one Calling convention Common rules across procedures required Recent machines are set by software convention and earlier machines by hardware instructions Using Stacks Stacks can grow up or down Stack grows down in MIPS Entire stack frame is pushed and popped, rather than single elements 33 34 MIPS Storage Layout Procedure Activation Record (Frame) textbook Each procedure creates an activation record on the stack $sp = 7fffffff 16 $gp = 10008000 16 10000000 16 400000 16 Stack Dynamic data Static data Text/Code Reserved Stack and dynamic area grow towards one another to maximize storage use before collision First word of frame $fp Last word of frame $sp Saved arguments Saved return address Saved registers (if any) Local Arrays and Structures (if any) Higher addresses Stack grows downwards Lower addresses 35 36

Procedure Activation Record (Frame) SGI/GCC Compiler Each procedure creates an activation record on the stack At least 32 bytes by convention to allow for $a0-$a3, $ra, $fp and be double double word aligned (16 byte multiple) First word of frame $fp Arguments (>= 16 B) Local Arrays and Structures (if any) Saved return address Saved registers Higher addresses Stack grows downwards Register Assignments Use the following calling conventions Name Register number Usage $zero 0 the constant value 0 $v0-$v1 2-3 values for results and expression evaluation $a0-$a3 4-7 arguments $t0-$t7 8-15 temporaries $s0-$s7 16-23 saved $t8-$t9 24-25 more temporaries $gp 28 global pointer $sp 29 stack pointer $fp 30 frame pointer $ra 31 return address Last word of frame $sp Space for nested call args (>= 16 B) Lower addresses 37 38 Register Assignments (cont) Caller vs. Callee Saved Registers 0 $zero Zero constant (0) 1 $at Reserved for assembler 2 $v0 Expression results 3 $v1 4 $a0 Arguments 5 $a1 6 $a2 7 $a3 8 $t0 Caller saved... 15 $t7 16 $s0 Callee saved... 23 $s7 24 $t8 Temporary (cont d) 25 $t9 26 $k0 Reserved for OS kernel 27 $k1 28 $gp Pointer to global area 29 $sp Stack Pointer 30 $fp Frame Pointer 31 $ra Return address Preserved Not Preserved Saved registers ($s0-$s7) Temporary registers ($t0-$t9) Stack/frame pointer ($sp, $fp, $gp)argument registers ($a0-$a3) Return address ($ra) Return values ($v0-$v1) Preserved registers (Callee Save) Save register values on stack prior to use Restore registers before return Not preserved registers (Caller Save) Do what you please and expect callees to do likewise Should be saved by the caller if needed after procedure call 39 40

Call and Return Calling Convention Steps Caller Save caller-saved registers $a0-$a3, $t0-$t9 Load arguments in $a0-$a3, rest passed on stack Execute jal instruction Callee Setup 1. Allocate memory for new frame ($sp = $sp - frame) 2. Save callee-saved registers $s0-$s7, $fp, $ra 3. Set frame pointer ($fp = $sp + frame size - 4) Callee Return Place return value in $v0 and $v1 Restore any callee-saved registers Pop stack ($sp = $sp + frame size) Return by jr $ra Before call: Callee setup; step 1 Callee setup; step 2 Callee setup; step 3 FP SP FP SP FP SP FP SP ra old fp $s0-$s7 ra old fp $s0-$s7 First four arguments passed in registers Adjust SP Save registers as needed Adjust FP 41 42 Simple Example Factorial Example int foo(int num) return(bar(num + 1)); int bar(int num) return(num + 1); foo: bar: addiu $sp, $sp, -32 # push frame sw $ra, 28($sp) # Store $ra sw $fp, 24($sp) # Store $fp addiu $fp, $sp, 28 # Set new $fp addiu $a0, $a0, 1 # num + 1 jal bar # call bar lw $fp, 24($sp) # Load $fp lw $ra, 28($sp) # Load $ra addiu $sp, $sp, 32 # pop frame jr $ra # return addiu $v0, $a0, 1 # leaf procedure jr $ra # with no frame int fact(int n) if (n <= 1) return(1); else return(n*fact(n-1)); fact: slti $t0, $a0, 2 # a0 < 2 beq $t0, $zero, skip # goto skip ori $v0, $zero, 1 # Return 1 jr $ra # Return skip: addiu $sp, $sp, -32 # $sp down 32 sw $ra, 28($sp) # Save $ra sw $fp, 24($sp) # Save $fp addiu $fp, $sp, 28 # Set up $fp sw $a0, 32($sp) # Save n addui $a0, $a0, -1 # n 1 jal fact # Call recursive link: lw $a0, 32($sp) # Restore n mul $v0, $v0, $a0 # n * fact(n-1) lw $ra, 28($sp) # Load $ra lw $fp, 24($sp) # Load $fp addiu $sp, $sp, 32 # Pop stack jr $ra # Return 43 44

Running Factorial Example Factorial Again Tail Recursion main() fact(3); sp(main) fp(fact(3)) 64 60 56 sp(fact(3)) 32 fp(fact(2)) 28 24 sp(fact(2)) 0 3 main 92 2 link 60 1 main() frame fact(3) frame fact(2) frame Stack just before call of fact(1) int fact_t (int n, int val) if (n <= 1) return val; return fact_t(n-1, val*n); int fact(int n) return fact_t(n, 1); fact_t(n, val) return fact_t(, ) Form Directly return value returned by recursive call Consequence Can convert into loop 45 46 fact_t(n, val) start: val = ; n = ; goto start; Removing Tail Recursion Effect of Optimization Turn recursive chain into single procedure (leaf) Eliminate procedure call overhead No stack frame needed Constant space requirement Vs. linear for recursive version int fact_t (int n, int val) start: if (n <= 1) return val; val = val*n; n = n-1; goto start; Assembly Code for fact_t int fact_t (int n, int val) start: if (n <= 1) return val; val = val*n; n = n-1; goto start; fact_t: addui $v0, $a1, $zero # $v0 = $a1 start: slti $t0, $a0, 2 # n < 2 beq $t0, $zero, skip # if n >= 2 goto skip jr $ra # return skip: mult $v0, $v0, $a0 # val = val * n addiu $a0, $a0, -1 # n = n - 1 j start # goto start 47 48

Pseudoinstructions Assembler expands pseudoinstructions move $t0, $t1 # Copy $t1 to $t0 addu $t0, $zero, $t1 # Actual instruction Some pseudoinstructions need a temporary register Cannot use $t, $s, etc. since they may be in use The $at register is reserved for the assembler blt $t0, $t1, L1 # Goto L1 if $t0 < $t1 slt $at, $t0, $t1 # Set $at = 1 if $t0 < $t1 bne $at, $zero, L1 # Goto L1 if $at!= 0 49