CS153: Compilers Lecture 8: Compiling Calls

Similar documents
Memory Usage 0x7fffffff. stack. dynamic data. static data 0x Code Reserved 0x x A software convention

MIPS Procedure Calls. Lecture 6 CS301

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

See P&H 2.8 and 2.12, and A.5-6. Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University

Instruction Set Architectures (4)

Lectures 5. Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS

Calling Conventions. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 2.8 and 2.12

Lecture 5. Announcements: Today: Finish up functions in MIPS

Implementing Procedure Calls

Anne Bracy CS 3410 Computer Science Cornell University

CS 316: Procedure Calls/Pipelining

Mark Redekopp, All rights reserved. EE 352 Unit 6. Stack Frames Recursive Routines

Functions in MIPS. Functions in MIPS 1

Subroutines. int main() { int i, j; i = 5; j = celtokel(i); i = j; return 0;}

Anne Bracy CS 3410 Computer Science Cornell University

CSE Lecture In Class Example Handout

Anne Bracy CS 3410 Computer Science Cornell University

We can emit stack-machine-style code for expressions via recursion

Arguments and Return Values. EE 109 Unit 16 Stack Frames. Assembly & HLL s. Arguments and Return Values

ECE260: Fundamentals of Computer Engineering

Hakim Weatherspoon CS 3410 Computer Science Cornell University

MIPS Functions and the Runtime Stack

Code Generation. The Main Idea of Today s Lecture. We can emit stack-machine-style code for expressions via recursion. Lecture Outline.

Today. Putting it all together

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

CS 61C: Great Ideas in Computer Architecture More MIPS, MIPS Functions

Prof. Kavita Bala and Prof. Hakim Weatherspoon CS 3410, Spring 2014 Computer Science Cornell University. See P&H 2.8 and 2.12, and A.

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

Functions and Procedures

CS 61c: Great Ideas in Computer Architecture

Procedure Call and Return Procedure call

EN164: Design of Computing Systems Lecture 11: Processor / ISA 4

Function Calling Conventions 1 CS 64: Computer Organization and Design Logic Lecture #9

Course Administration

Lecture 9: Procedures & Functions. CS 540 George Mason University

CS 61C: Great Ideas in Computer Architecture Strings and Func.ons. Anything can be represented as a number, i.e., data or instruc\ons

Review of Activation Frames. FP of caller Y X Return value A B C

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

CS153: Compilers Lecture 7: Simple Code Generation

Computer Architecture Instruction Set Architecture part 2. Mehran Rezaei

EE 109 Unit 15 Subroutines and Stacks

CSE Lecture In Class Example Handout

CS64 Week 5 Lecture 1. Kyle Dewey

Instructor: Randy H. Katz hap://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #7. Warehouse Scale Computer

MIPS Functions and Instruction Formats

Lec 10: Assembler. Announcements

Lecture 7: Procedures and Program Execution Preview

Calling Conventions. See P&H 2.8 and Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

The Activation Record (AR)

CS 61C: Great Ideas in Computer Architecture (Machine Structures) More MIPS Machine Language

Compilers and computer architecture: A realistic compiler to MIPS

CS61C : Machine Structures

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

Stacks and Procedures

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

SPIM Procedure Calls

Code Generation. Lecture 19

Lecture 7: Procedures

UCB CS61C : Machine Structures

CS 110 Computer Architecture Lecture 6: More MIPS, MIPS Functions

Code Generation. Lecture 12

Lecture Outline. Topic 1: Basic Code Generation. Code Generation. Lecture 12. Topic 2: Code Generation for Objects. Simulating a Stack Machine

CA Compiler Construction

Compiling Code, Procedures and Stacks

Announcements. EE108B Lecture MIPS Assembly Language III. MIPS Machine Instruction Review: Instruction Format Summary

MIPS Assembly Language Programming

LECTURE 19. Subroutines and Parameter Passing

CS4410 : Spring 2013

MIPS Assembly Language Programming

MIPS Programming. A basic rule is: try to be mechanical (that is, don't be "tricky") when you translate high-level code into assembler code.

More C functions and Big Picture [MIPSc Notes]

MIPS Procedure Calls - Review

CS 61C: Great Ideas in Computer Architecture Func%ons and Numbers

ECE 250 / CS 250 Computer Architecture. Procedures

Procedure Calls Main Procedure. MIPS Calling Convention. MIPS-specific info. Procedure Calls. MIPS-specific info who cares? Chapter 2.7 Appendix A.

Topic 7: Activation Records

comp 180 Lecture 10 Outline of Lecture Procedure calls Saving and restoring registers Summary of MIPS instructions

Q1: /20 Q2: /30 Q3: /24 Q4: /26. Total: /100

Procedures and Stacks

Stacks and Procedures

Machine Programming 3: Procedures

CS61C : Machine Structures

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

Storage in Programs. largest. address. address

2/16/2018. Procedures, the basic idea. MIPS Procedure convention. Example: compute multiplication. Re-write it as a MIPS procedure

Run-time Environment

CS61C Machine Structures. Lecture 12 - MIPS Procedures II & Logical Ops. 2/13/2006 John Wawrzynek. www-inst.eecs.berkeley.

Control Instructions

CS61C : Machine Structures

Chapter 2A Instructions: Language of the Computer

CSc 520 Principles of Programming Languages. Questions. rocedures as Control Abstractions... 30: Procedures Introduction

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

Inequalities in MIPS (2/4) Inequalities in MIPS (1/4) Inequalities in MIPS (4/4) Inequalities in MIPS (3/4) UCB CS61C : Machine Structures

CS356: Discussion #6 Assembly Procedures and Arrays. Marco Paolieri

Stacks and Procedures

MIPS Datapath. MIPS Registers (and the conventions associated with them) MIPS Instruction Types

Lecture 5: Procedure Calls

4.2: MIPS function calls

CS 110 Computer Architecture MIPS Instruction Formats

CSC258: Computer Organization. Functions and the Compiler Tool Chain

Transcription:

CS153: Compilers Lecture 8: Compiling Calls Stephen Chong https://www.seas.harvard.edu/courses/cs153

Announcements Project 2 out Due Thu Oct 4 (7 days) Project 3 out Due Tuesday Oct 9 (12 days) Reminder: CS Nights Mondays 8pm-10pm, in MD119. Pizza provided! 2

Today Function calls Calling convention How to implement functions 3

Extending Fish Let s extend Fish with functions and local variables type exp =... Call of var * (exp list) type stmt =... Let of var*exp*stmt type func = { name : var, args : var list, body : stmt } type prog = func list One distinguished function (main) will be the entry point for the program 4

Call and Return Each procedure is just a Fish program beginning with a label (the function name) MIPS calling convention: To compile a call f(a,b,c,d) Move results of expressions a,b,c,d into registers $4 $7 jal f: moves return address into $31 The return address is address to continue execution after f has finished executing i.e., instruction immediately after the jal: $pc + 4 To return(e) Move result of e to $2 jr $31 (i.e., jump to the return address) 5

What Could Go Wrong? What if foo calls bar and bar calls baz? bar needs to save its return address $31 is a caller-save register Where do we save it? One option: each procedure has a (global) variable to hold the return address E.g., foo_return, bar_return, baz_return But what about recursive calls? E.g., foo calls bar, and bar calls foo, and foo calls bar,... Each invocation of a function needs its own return address! Each invocation of function also needs its own local variables, arguments, etc. 6

Stacks Key idea: associate a frame with each invocation of a procedure In the frame, store data belonging to the invocation Return address Arguments to invocation Local variables... Memory Frame for 1st invocation of foo Frame for 1st invocation of bar Higher addresses Lower addresses 7

Frame Allocation Memory Frames are allocated last-in-first-out i.e., as a stack For historic reasons, the stack of frames grows downwards Why? We use $29 (aka $sp) as the stack pointer Points to the top of the stack Use register $30 (aka $fp) as the frame pointer $fp $sp Frame for 1st invocation of foo Frame for 1st invocation of bar Frame for 2nd invocation of foo Frame for 2nd invocation of bar Higher addresses Lower addresses Points to start of current frame (the first word in current frame) 8

Frame Allocation Memory To allocate a frame with n bytes: Subtract n from $sp Set $fp to $sp + n - 4 i.e., $fp points to first word in this frame To deallocate a frame $fp $sp Frame for 1st invocation of foo Frame for 1st invocation of bar Frame for 2nd invocation of foo Higher addresses Restore $fp to previous value Frame for 2nd invocation of bar Lower addresses Add n to $sp 9

Calling Convention in More Detail To call f with arguments a1,..., an: 1. Save caller-save registers These are registers that the callee f is free to clobber, so if the caller wants to preserve their values, caller must save them Registers $8 $15, $24, $25 (aka $t0 $t9) are the general-purpose callersave registers 2. Move arguments Push extra arguments onto stack in reverse order Place 1st four arguments in $4 $7 (aka $a0 $a3) Set aside space on stack for 1st 4 arguments 3. Execute jal f: return address is placed in $31 (aka $ra) [code for function f executes, and returns to return address] 4. Pop arguments off stack; restore caller-save registers 10

What does the callee f do? Function prologue At beginning of called function During execution Function epilogue At end of called function 11

Function Prologue 1. Allocate frame: subtract frame size n from $sp n big enough for local vars, callee-save registers, etc. 2. Save any callee-save registers Registers the caller expects to be preserved Includes $fp, $ra, and $s0 $s7 ($16 $23). Don't need to save a register you don't clobber E.g., only need to save $ra if function makes a call 3. Set $fp to $sp + n - 4 12

During Execution Access variables relative to stack pointer (or frame pointer) must keep track of each var's offset Temporary values can be pushed on stack and then popped off. Push(r): subu $sp,$sp,4; sw r, 0($sp) Pop(r): lw r,0($sp); addu $sp,$sp,4 e.g., when compiling e1+e2, we can evaluate e1, push result on stack, evaluate e2, pop e1's value and then add the results. 13

Function Epilogue 1. Place result in $v0 ($2). 2. Restore callee-saved registers Includes caller's frame pointer and the return address 3. Deallocate frame: add frame size n to $sp 4. Return to caller jr $ra 14

Example (from SPIM docs) int fact(int n) { if (n < 1) return 1; else return n * fact(n-1); } int main() { } return fact(10)+42; 15

Function prologue Main main: subu $sp,$sp,32 # allocate frame sw $ra,20($sp) # save caller return address sw $fp,16($sp) # save caller frame pointer addiu $fp,$sp,28 # set up new frame pointer li $a0,10 # set up argument (10) jal fact # call fact addi $v0,v0,42 # add 42 to result lw $ra,20($sp) # restore return address lw $fp,16($sp) # restore frame pointer addiu $sp,$sp,32 # pop frame jr $ra # return to caller Function epilogue 16

Function prologue Main main: subu $sp,$sp,32 # allocate frame sw $ra,20($sp) # save caller return address sw $fp,16($sp) # save caller frame pointer addiu $fp,$sp,28 # set up new frame pointer li $a0,10 # set up Notes: argument (10) jal fact $sp # call is kept fact double-word aligned addi $v0,v0,42 MIPS # add calling 42 convention to resultis lw $ra,20($sp) minimum # restore frame return size is 24 address bytes lw $fp,16($sp) ($a0-$a3, # restore $ra, frame padded pointer to doubleword addiu $sp,$sp,32 # pop boundary) frame jr $ra main # return also needs to to caller store $fp, padded to double-word boundary So frame size for main is 32 bytes Function epilogue 17

Fact fact: subu $sp,$sp,32 # allocate frame sw $ra,20($sp) # save caller return address sw $fp,16($sp) # save caller frame pointer addiu $fp,$sp,28 # set up new frame pointer bgtz $a0,l2 # if n > 0 goto L2 li $v0,1 # set return value to 1 j L1 # goto epilogue L2: sw $a0,0($fp) # save n addi $a0,$a0,-1 # subtract 1 from n jal fact # call fact(n-1) lw $v1,0($fp) # load n mul $v0,$v0,$v1 # calculate n*fact(n-1) L1: lw $ra,20($sp) # restore ra lw $fp,16($sp) # restore frame pointer addiu $sp,$sp,32 # pop frame from stack jr $ra # return Function epilogue Function prologue 18

$fp $sp 0x100 0x0FC 0x0F8 0x0F4 0x0F0 0x0EC 0x0E8 0x0E4 0x0E0 0x0DC 0x0D8 0x0D4 0x0D0 0x0CC 0x0C8 0x0C4 0x0C0 0x0BC 0x0B8 0x0B4 saved argument 10 (filler to align to multiple of 8) main s return address main s frame pointer (space for $a3) (space for $a2) (space for $a1) (space for $a0) Frame for fact(10) 19

$fp $sp 0x100 0x0FC 0x0F8 0x0F4 0x0F0 0x0EC 0x0E8 0x0E4 0x0E0 0x0DC 0x0D8 0x0D4 0x0D0 0x0CC 0x0C8 0x0C4 0x0C0 0x0BC 0x0B8 0x0B4 saved argument 10 (filler to align to multiple of 8) main s return address main s frame pointer (space for $a3) (space for $a2) (space for $a1) (space for $a0) saved argument 9 (filler to align to multiple of 8) fact(10) s return address fact(10) s frame pointer (space for $a3) (space for $a2) (space for $a1) (space for $a0) Frame for fact(10) Frame for fact(9) 20

$fp $sp 0x100 0x0FC 0x0F8 0x0F4 0x0F0 0x0EC 0x0E8 0x0E4 0x0E0 0x0DC 0x0D8 0x0D4 0x0D0 0x0CC 0x0C8 0x0C4 0x0C0 0x0BC 0x0B8 0x0B4 saved argument 10 (filler to align to multiple of 8) main s return address main s frame pointer (space for $a3) (space for $a2) (space for $a1) (space for $a0) saved argument 9 (filler to align to multiple of 8) fact(10) s return address fact(10) s frame pointer (space for $a3) (space for $a2) (space for $a1) (space for $a0) Frame for fact(10) Frame for fact(9) 21

WTFrame Pointer? Frame pointers aren't necessary! Can calculate variable offsets relative to $sp Works unless values of unknown size are allocated on the stack (e.g., via alloca) Simplifies variadic functions i.e., variable number of arguments, such as printf Debuggers like having saved frame pointers around (can crawl up the stack). There are 2 conventions for the MIPS: GCC: uses frame pointer SGI: doesn't use frame pointer No frame pointer means fewer instructions for calls, but complicates code generation (due to pushes, variadic functions, etc.) 22

Compiling Functions Now we understand calling convention, how do we generator code for functions? Must generate prologue & epilogue Need to know how much space frame occupies Roughly c + 4*v where c is the constant overhead to save callee-save registers and v is number of local variables (including parameters) When translating the body, must know offset of each variable During compilation keep an environment that maps variables to offsets. Access variables relative to the frame pointer. When we encounter a return, move the result to $v0 and jump to epilogue Also keep epilogue's label in environment 23

Environments type varmap val empty_varmap : unit -> varmap val insert_var : varmap -> var -> int -> varmap val lookup_var : varmap -> var -> int type env = {epilogue : label, varmap : varmap} 24

What about temps? Three different options For Project 4, implement one of these options Option 1 When evaluating a compound expression e1 + e2 generate code to evaluate e1 and place it in $v0, then push $v0 on the stack generate code to evaluate e2 and place it in $v0 pop e1's value into a temporary register (e.g., $t0) add $t0 and $v0 and put the result in $v0 Bad news: lots of pushes and pops, so lots of overhead Good news: very simple! Don t need to figure out how many temps you need 25

Option 1 Example: 20 instructions (11 memory) a := (x + y) + (z + w) lw $v0, <xoff>($fp) # evaluate x push $v0 # push x's value lw $v0, <yoff>($fp) # evaluate y pop $v1 # pop x's value add $v0,$v1,$v0 # add x and y's values push $v0 # push value of x+y lw $v0, <zoff>($fp) # evaluate z push $v0 # push z's value lw $v0, <woff>($fp) # evaluate w pop $v1 # pop z's value add $v0,$v1,$v0 # add z and w's values pop $v1 # pop x+y add $v0,$v1,$v0 # add (x+y) and (z+w)'s values sw $v0,<aoff>($fp) # store result in a 26

Option 2 Eliminate nested expressions! Avoids the need to push every time we have a nested expression. Introduce new variables to hold intermediate results E.g., a := (x + y) + (z + w) might be translated to: t0 := x + y; t1 := z + w; a := t0 + t1; Treat temps the same as local variables i.e., allocate space for temps once in the prologue and deallocate the space once in the epilogue 27

Option 2 example: 20 instructions (9 memory) a := (x + y) + (z + w) t0 := x + y; t1 := z + w; lw $v0, <xoff>($fp) lw $v1, <yoff>($fp) add $v0, $v0, $v1 sw $v0, <t0off>($fp) lw $v0, <zoff>($fp) lw $v1, <woff>($fp) add $v0, $v0, $v1 sw $v0, <t1off>($fp) a := t0 + t1; lw $v0, <t0off>($fp) lw $v1, <t1off>($fp) add $v0, $v0, $v1 sw $v0, <aoff>($fp) 28

Option 2.5 Still doing lots of loads and stores for temps (and also for variables!) So another idea: use registers to hold intermediate values instead of variables For now: Assume an infinite number of registers Keep a distinction between temps and variables: variables require loading/storing, but temps do not 29

Option 2.5 Example a := (x + y) + (z + w) t0 := x; t1 := y; t2 := t0 + t1; t3 := z; t4 := w; t5 := t3 + t4; t6 := t2 + t5; a := t6; # load variable # load variable # add # load variable # load variable # add # add # store result 30

Option 2.5 Example: 8 instructions (5 memory) a := (x + y) + (z + w) t0 := x; t1 := y; t2 := t0 + t1; t3 := z; t4 := w; t5 := t3 + t4; t6 := t2 + t5; a := t6; lw $t0,<xoff>($fp) lw $t1,<yoff>($fp) add $t2,$t0,$t1 lw $t3,<zoff>($fp) lw $t4,<woff>($fp) add $t5,$t3,$t4 add $t6,$t2,$t5 sw $t6,<aoff>($fp) Note that each little statement translates directly to a single MIPS instruction 31

Using Temps a := (x + y) + (z + w) t0 := x; t1 := y; t2 := t0 + t1; t3 := z; t4 := w; t5 := t3 + t4; t6 := t2 + t5; a := t6; # t0 taken # t0, t1 taken # t2 taken # t2, t3 taken # t2, t3, t4 taken # t2, t5 taken # t6 taken # <none taken> We can reuse temps They form a stack discipline! Idea: Use a compile-time stack of registers instead of a run-time stack! 32

Using Temps a := (x + y) + (z + w) t0 := x; t1 := y; t0 := t0 + t1; t1 := z; t2 := w; t1 := t1 + t2; t0 := t0 + t1; a := t0; # t0 taken # t0, t1 taken # t0 taken # t0, t1 taken # t0, t1, t2 taken # t0, t1 taken # t0 taken # <empty> 33

Option 3 When the compile-time stack overflows (i.e., need more temps than we have registers): Generate code to spill (push) all of the temps. Reset the compile-time stack to <empty> When the compile-time stack underflows (i.e., we need temps that we spilled earlier): Generate code to pop all of the temps. Reset the compile-time stack to full. What's really happening is that we're caching the hot end of the run-time stack in registers Some architectures (e.g., SPARC, Itanium) can do the spilling/ restoring with 1 instruction. 34

Option 3 Compared to previous options Don t push and pop on stack when expressions are small Eliminates lots of memory access (and amortizes the cost of stack adjustment) But still far from optimal... Consider a+(b+(c+(d+ +(y+z) ))) versus ( ((((a+b)+c)+d)+ +y)+z If order of evaluation doesn't matter, then want to pick one that minimizes depth of stack (i.e., less likely to overflow.) 35