Code Generation. Lecture 12. CS 164, Fall 2004


The remote testing experiment. It works!

[Figure: scatter plot of Coverage (0 to 0.9) vs. Score (0 to 50) for PA1, with a polynomial trend line.]

Remote testing
In PA1, if you achieved the best coverage, you also got the best score!

From the cs164 newsgroup
> I need a small website made. I'm willing to pay for the work too.
> So... if anyone is interested e-mail me at [deleted]@berkeley.edu.

CS164's guide to creating a website:
1) Write a lexer
2) Write a lexer spec for some unknown, obscure language
3) Write a parser
4) Write the grammar for that obscure language
5) Write a code generator that generates HTML
6) ...
7) Profit! Now only you can maintain that website!

The moral
Essentially, the recommended strategy is:
- goal: no one can maintain your programs
- means: develop an obscure language for your programs
But if this is your goal, why a new language?
- tons of unmaintainable Java programs have been written, some even submitted as cs164 projects
- I am sure you can succeed with just Java, too.

A better road to profit:
- develop a language: it can be obscure, even horrible, but make sure it's horribly useful, too (ex.: Perl, C++, Visual Basic, LaTeX)
- then publish books on this language

Lecture Outline
- Stack machines
- The MIPS assembly language
- The x86 assembly language
- A simple source language
- Stack-machine implementation of the simple language

Stack Machines
A simple evaluation model:
- No variables or registers
- A stack of values for intermediate results

Example of a Stack Machine Program
Consider two instructions:
- push i : place the integer i on top of the stack
- add : pop two elements, add them, and put the result back on the stack
A program to compute 7 + 5:

  push 7
  push 5
  add

Stack Machine. Example
Executing the program: after push 7 the stack holds 7; after push 5 it holds 5, 7; after add it holds 12.
Each instruction:
- Takes its operands from the top of the stack
- Removes those operands from the stack
- Computes the required operation on them
- Pushes the result on the stack

Why Use a Stack Machine?
- Each operation takes operands from the same place and puts results in the same place
- This means a uniform compilation scheme
- And therefore a simpler compiler

Why Use a Stack Machine?
- The location of the operands is implicit: always on the top of the stack
- No need to specify operands explicitly
- No need to specify the location of the result
- Instruction "add", as opposed to "add r1, r2"
- Smaller encoding of instructions, more compact programs
- This is one reason why Java bytecodes use a stack evaluation model

Optimizing the Stack Machine
- The add instruction does 3 memory operations: two reads and one write to the stack
- The top of the stack is frequently accessed
- Idea: keep the top of the stack in a register (called the accumulator); register accesses are faster
- The add instruction is now acc ← acc + top_of_stack
- Only one memory operation!
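The two-instruction machine above is simple enough to simulate directly; here is a minimal Python sketch (the encoding of instructions as tuples is my own, not from the lecture):

```python
# Minimal stack-machine interpreter for the two instructions above.
def run(program):
    stack = []
    for instr in program:
        if instr[0] == "push":       # push i: place i on top of the stack
            stack.append(instr[1])
        elif instr[0] == "add":      # add: pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack

# The program from the slide: compute 7 + 5.
print(run([("push", 7), ("push", 5), ("add",)]))  # [12]
```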

Stack Machine with Accumulator
Invariants:
- The result of computing an expression is always in the accumulator
- For an operation op(e1, ..., en), push the accumulator on the stack after computing each of e1, ..., e(n-1)
- The result of en is in the accumulator before op
- After the operation, pop n-1 values
- After computing an expression, the stack is as before

Stack Machine with Accumulator. Example
Compute 7 + 5 using an accumulator:

  Code                        Acc   Stack
  acc ← 7                     7     <init>
  push acc                    7     7, <init>
  acc ← 5                     5     7, <init>
  acc ← acc + top_of_stack    12    7, <init>
  pop                         12    <init>

A Bigger Example: 3 + (7 + 5)

  Code                        Acc   Stack
  acc ← 3                     3     <init>
  push acc                    3     3, <init>
  acc ← 7                     7     3, <init>
  push acc                    7     7, 3, <init>
  acc ← 5                     5     7, 3, <init>
  acc ← acc + top_of_stack    12    7, 3, <init>
  pop                         12    3, <init>
  acc ← acc + top_of_stack    15    3, <init>
  pop                         15    <init>

Notes
- It is very important that the stack is preserved across the evaluation of a subexpression
- The stack before the evaluation of 7 + 5 is 3, <init>
- The stack after the evaluation of 7 + 5 is 3, <init>
- The first operand is on top of the stack

From Stack Machines to MIPS
- The compiler generates code for a stack machine with accumulator
- We want to run the resulting code on an x86 or MIPS processor (or simulator)
- We implement stack machine instructions using MIPS instructions and registers

MIPS assembly vs. x86 assembly
- In PA4 and PA5, you will generate x86 code, because we have no MIPS machines around and using a MIPS simulator is less exciting
- In this lecture, we will use MIPS assembly: it's somewhat more readable than x86 assembly (e.g. in x86, both store and load are called movl)
- Translation from MIPS to x86 is trivial; see the translation table in a few slides
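The trace for 3 + (7 + 5) can be replayed mechanically; a small Python sketch of the accumulator machine (instruction names are my own encoding):

```python
# Accumulator-machine simulation of 3 + (7 + 5), following the trace above.
def run(code):
    acc, stack = None, []            # empty stack plays the role of <init>
    for op, *arg in code:
        if op == "acc":              # acc <- i
            acc = arg[0]
        elif op == "push":           # push acc
            stack.append(acc)
        elif op == "addtop":         # acc <- acc + top_of_stack
            acc = acc + stack[-1]
        elif op == "pop":
            stack.pop()
    return acc, stack

code = [("acc", 3), ("push",), ("acc", 7), ("push",), ("acc", 5),
        ("addtop",), ("pop",), ("addtop",), ("pop",)]
print(run(code))   # (15, []) -- result in the accumulator, stack restored
```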

Simulating a Stack Machine
- The accumulator is kept in MIPS register $a0 (in x86, it's %eax)
- The stack is kept in memory
- The stack grows towards lower addresses (the standard convention on both MIPS and x86)
- The address of the next location on the stack is kept in MIPS register $sp (in x86, it's %esp)
- The top of the stack is at address $sp + 4

MIPS Assembly
MIPS architecture:
- Prototypical Reduced Instruction Set Computer (RISC) architecture
- Arithmetic operations use registers for operands and results
- Must use load and store instructions to use operands and results in memory
- 32 general-purpose registers (32 bits each)
- We will use $sp, $a0 and $t1 (a temporary register)

A Sample of MIPS Instructions
- lw reg1 offset(reg2) : load the 32-bit word at address reg2 + offset into reg1
- add reg1 reg2 reg3 : reg1 ← reg2 + reg3
- sw reg1 offset(reg2) : store the 32-bit word in reg1 at address reg2 + offset
- addiu reg1 reg2 imm : reg1 ← reg2 + imm ("u" means overflow is not checked)
- li reg imm : reg ← imm

x86 Assembly
x86 architecture:
- Complex Instruction Set Computer (CISC) architecture
- Arithmetic operations can use both registers and memory for operands and results
- So, you don't have to use separate load and store instructions to operate on values in memory
- CISC gives us more freedom in selecting instructions (hence, more powerful optimizations), but we'll use a simple RISC subset of x86, so translation from MIPS to x86 will be easy

x86 assembly
- x86 has two-operand instructions, e.g. ADD dest, src means dest := dest + src (in MIPS: dest := src1 + src2)
- An annoying fact to remember: different x86 assembly versions exist
- One important difference is the order of operands: the manuals assume ADD dest, src, but the gcc assembler we'll use takes the opposite order, ADD src, dest

Sample x86 Instructions (gcc order of operands)
- movl offset(reg2), reg1 : load the 32-bit word at address reg2 + offset into reg1
- add reg2, reg1 : reg1 ← reg1 + reg2
- movl reg1, offset(reg2) : store the 32-bit word in reg1 at address reg2 + offset
- add imm, reg1 : reg1 ← reg1 + imm (use this for MIPS addiu)
- movl imm, reg : reg ← imm

MIPS to x86 Translation

  MIPS                    x86
  lw reg1 offset(reg2)    movl offset(reg2), reg1
  add reg1 reg1 reg2      add reg2, reg1
  sw reg1 offset(reg2)    movl reg1, offset(reg2)
  addiu reg1 reg1 imm     add imm, reg1
  li reg imm              movl imm, reg

x86 vs. MIPS registers

  MIPS   x86
  $a0    %eax
  $sp    %esp
  $fp    %ebp
  $t     %ebx

MIPS Assembly. Example
The stack-machine code for 7 + 5 in MIPS:

  acc ← 7                     li $a0 7
  push acc                    sw $a0 0($sp)
                              addiu $sp $sp -4
  acc ← 5                     li $a0 5
  acc ← acc + top_of_stack    lw $t1 4($sp)
                              add $a0 $a0 $t1
  pop                         addiu $sp $sp 4

Some Useful Macros
We define the following abbreviations:

  push $t    =  sw $t 0($sp)
                addiu $sp $sp -4
  pop        =  addiu $sp $sp 4
  $t ← top   =  lw $t 4($sp)

We now generalize this to a simple language.

A Small Language
A language with integers and integer operations:

  P    → D; P | D
  D    → def id(ARGS) = E;
  ARGS → id, ARGS | id
  E    → int | id
       | if E1 = E2 then E3 else E4
       | E1 + E2 | E1 - E2
       | id(E1, ..., En)

A Small Language (Cont.)
- The first function definition f is the main routine
- Running the program on input i means computing f(i)
Program for computing the Fibonacci numbers:

  def fib(x) = if x = 1 then 0 else
               if x = 2 then 1 else
               fib(x - 1) + fib(x - 2)
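The translation table above is mechanical enough to script. A rough Python sketch of the rewriting (a helper of my own, not part of the course tooling; it handles only the five instructions in the table, in gcc operand order):

```python
import re

# Register mapping from the table above.
REGS = {"$a0": "%eax", "$sp": "%esp", "$fp": "%ebp", "$t1": "%ebx"}

def to_x86(instr):
    op, *args = instr.split()
    # Rewrite MIPS register names inside each operand, e.g. 4($sp) -> 4(%esp).
    args = [re.sub(r"\$\w+", lambda m: REGS.get(m.group(), m.group()), a)
            for a in args]
    if op == "lw":       # lw r1 off(r2)      -> movl off(r2), r1
        return f"movl {args[1]}, {args[0]}"
    if op == "sw":       # sw r1 off(r2)      -> movl r1, off(r2)
        return f"movl {args[0]}, {args[1]}"
    if op == "li":       # li r imm           -> movl imm, r
        return f"movl {args[1]}, {args[0]}"
    if op == "add":      # add r1 r1 r2       -> add r2, r1
        return f"add {args[2]}, {args[0]}"
    if op == "addiu":    # addiu r1 r1 imm    -> add imm, r1
        return f"add {args[2]}, {args[0]}"
    raise ValueError(f"unsupported instruction: {op}")

print(to_x86("lw $t1 4($sp)"))     # movl 4(%esp), %ebx
print(to_x86("addiu $sp $sp 4"))   # add 4, %esp
```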

Code Generation Strategy
For each expression e we generate MIPS code that:
- Computes the value of e in $a0
- Preserves $sp and the contents of the stack
We define a code generation function cgen(e) whose result is the code generated for e.

Code Generation for Constants
The code to evaluate a constant simply copies it into the accumulator:

  cgen(i) = li $a0 i

Note that this also preserves the stack, as required.

Code Generation for Add

  cgen(e1 + e2) =
    cgen(e1)
    push $a0
    cgen(e2)
    $t1 ← top
    add $a0 $t1 $a0
    pop

Possible optimization: put the result of e1 directly in register $t1?

Code Generation for Add. Wrong!
Optimization: put the result of e1 directly in $t1?

  cgen(e1 + e2) =
    cgen(e1)
    move $t1 $a0
    cgen(e2)
    add $a0 $t1 $a0

Try to generate code for 3 + (7 + 5): the code for the subexpression 7 + 5 overwrites $t1, destroying the saved value of 3.

Code Generation Notes
- The code for + is a template with holes for the code evaluating e1 and e2
- Stack-machine code generation is recursive: the code for e1 + e2 consists of the code for e1 and e2 glued together
- Code generation can be written as a recursive descent of the AST, at least for expressions

Code Generation for Sub
New instruction: sub reg1 reg2 reg3, which implements reg1 ← reg2 - reg3

  cgen(e1 - e2) =
    cgen(e1)
    push $a0
    cgen(e2)
    $t1 ← top
    sub $a0 $t1 $a0
    pop
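The recursive structure of cgen is easy to see in code. A Python sketch that emits the MIPS text above for constants, + and - (the tuple AST encoding is my own):

```python
# cgen as a recursive function over tuple ASTs:
# ("int", i), ("add", e1, e2), ("sub", e1, e2).
def cgen(e):
    op = e[0]
    if op == "int":
        return [f"li $a0 {e[1]}"]
    if op in ("add", "sub"):
        instr = "add" if op == "add" else "sub"
        return (cgen(e[1])
                + ["sw $a0 0($sp)", "addiu $sp $sp -4"]   # push $a0
                + cgen(e[2])
                + ["lw $t1 4($sp)",                        # $t1 <- top
                   f"{instr} $a0 $t1 $a0",
                   "addiu $sp $sp 4"])                     # pop
    raise ValueError(op)

# 3 + (7 + 5): the push/pop pair keeps 3 safe while 7 + 5 is evaluated.
for line in cgen(("add", ("int", 3), ("add", ("int", 7), ("int", 5)))):
    print(line)
```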

Code Generation for Conditional
We need flow-control instructions:
- New instruction: beq reg1 reg2 label — branch to label if reg1 = reg2 (x86: cmpl reg1 reg2; je label)
- New instruction: b label — unconditional jump to label (x86: jmp label)

Code Generation for If (Cont.)

  cgen(if e1 = e2 then e3 else e4) =
    cgen(e1)
    push $a0
    cgen(e2)
    $t1 ← top
    pop
    beq $a0 $t1 true_branch
  false_branch:
    cgen(e4)
    b end_if
  true_branch:
    cgen(e3)
  end_if:

The Activation Record
Code for function calls and function definitions depends on the layout of the activation record (AR). A very simple AR suffices for this language:
- The result is always in the accumulator, so there is no need to store the result in the AR
- The activation record holds the actual parameters: for f(x1, ..., xn), push xn, ..., x1 on the stack
- These are the only variables in this language

The Activation Record (Cont.)
- The stack discipline guarantees that on function exit $sp is the same as it was on function entry, so there is no need to save $sp
- We need the return address
- It's handy to have a pointer to the start of the current activation; this pointer lives in register $fp (frame pointer)
- The reason for the frame pointer will be clear shortly

The Activation Record (Summary)
For this language, an AR with the caller's frame pointer, the actual parameters, and the return address suffices. Consider a call to f(x,y); with the stack growing downward, the AR of f is:

  old FP     (higher addresses)
  y
  x          ← SP, FP

Code Generation for Function Call
The calling sequence is the instructions (of both caller and callee) that set up a function invocation.
New instruction: jal label — jump to label, saving the address of the next instruction in $ra (on x86, the return address is stored on the stack by the call label instruction)
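The conditional template can be sketched the same way as the + template, with one extra ingredient: fresh label names so that nested ifs don't collide. A Python sketch (the label counter and parameterization over cgen are my additions, not from the lecture):

```python
from itertools import count

_labels = count()   # fresh-label generator

def cgen_if(cgen, e1, e2, e3, e4):
    n = next(_labels)
    true_l, end_l = f"true_branch_{n}", f"end_if_{n}"
    return (cgen(e1)
            + ["sw $a0 0($sp)", "addiu $sp $sp -4"]   # push $a0
            + cgen(e2)
            + ["lw $t1 4($sp)", "addiu $sp $sp 4",    # $t1 <- top; pop
               f"beq $a0 $t1 {true_l}"]
            + cgen(e4)                                 # false branch
            + [f"b {end_l}", f"{true_l}:"]
            + cgen(e3)                                 # true branch
            + [f"{end_l}:"])

# Example: if 1 = 2 then 3 else 4, with a trivial cgen for integer leaves.
code = cgen_if(lambda i: [f"li $a0 {i}"], 1, 2, 3, 4)
print("\n".join(code))
```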

Code Generation for Function Call (Cont.)

  cgen(f(e1, ..., en)) =
    push $fp
    cgen(en)
    push $a0
    ...
    cgen(e1)
    push $a0
    jal f_entry

- The caller saves its value of the frame pointer
- Then it saves the actual parameters in reverse order
- The caller saves the return address in register $ra (via jal)
- The AR so far is 4*n + 4 bytes long

Code Generation for Function Definition
New instruction: jr reg — jump to the address in register reg

  cgen(def f(x1, ..., xn) = e) =
    move $fp $sp
    push $ra
    cgen(e)
    $ra ← top
    addiu $sp $sp z     ; z = 4*n + 8
    lw $fp 0($sp)
    jr $ra

Notes:
- The frame pointer points to the top, not the bottom, of the frame
- The callee pops the return address, the actual arguments, and the saved value of the frame pointer

Calling Sequence. Example for f(x,y)
[Diagram: four stack snapshots. Before the call, SP points to the next free slot below the caller's data. On entry, the caller has pushed old FP, y, and x, and FP has been set to SP. In the body, the return address RA has also been pushed, at the address FP points to. After the call, the stack is restored to its before-call state.]

Code Generation for Variables
- Variable references are the last construct
- The variables of a function are just its parameters; they are all in the AR, pushed by the caller
- Problem: because the stack grows when intermediate results are saved, the variables are not at a fixed offset from $sp

Code Generation for Variables (Cont.)
Solution: use a frame pointer
- It always points to the return address on the stack
- Since it does not move, it can be used to find the variables
- Let xi be the i-th (i = 1, ..., n) formal parameter of the function for which code is being generated

Code Generation for Variables (Cont.)
Example: for a function def f(x1, x2) = e, the activation record and frame pointer are set up so that x1 is at fp + 4 and x2 is at fp + 8.
Thus:

  cgen(xi) = lw $a0 z($fp)     ( z = 4*i )
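The two magic numbers in this layout (the offset of xi and the pop amount z) follow directly from the AR shape above; a tiny Python sketch makes the arithmetic explicit (helper names are mine):

```python
# AR offsets for def f(x1, ..., xn), per the layout above: FP points at the
# return address, argument xi sits at fp + 4*i, and on exit the callee pops
# the RA, the n arguments, and the saved old FP.
def arg_offset(i):
    """Offset of formal parameter xi from $fp (i = 1..n)."""
    return 4 * i

def callee_pop_bytes(n):
    """z in the callee's addiu: one word for RA, n for args, one for old FP."""
    return 4 * n + 8

# For f(x, y): x at fp+4, y at fp+8; the callee pops 16 bytes.
print(arg_offset(1), arg_offset(2), callee_pop_bytes(2))   # 4 8 16
```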

Summary
- The activation record must be designed together with the code generator
- Code generation can be done by recursive traversal of the AST
- We recommend you use a stack machine for your Decaf compiler (it's simple)
- See the PA4 starter kit for a large code generation example

Production compilers do different things:
- The emphasis is on keeping values (esp. the current stack frame) in registers
- Intermediate results are laid out in the AR, not pushed and popped from the stack

Review: Allocating Temporaries in the AR
The stack machine has activation records and intermediate results interleaved on the stack:

  AR | Intermediates | AR | Intermediates | ...

Review (Cont.)
- Advantage: very simple code generation
- Disadvantage: very slow code; storing/loading temporaries requires a store/load and a $sp adjustment

A Better Way
- Idea: keep temporaries in the AR
- The code generator must statically assign a location in the AR for each temporary

Example

  def fib(x) = if x = 1 then 0 else
               if x = 2 then 1 else
               fib(x - 1) + fib(x - 2)

- What intermediate values are placed on the stack?
- How many slots are needed in the AR to hold these values?

How Many Temporaries?
Let NT(e) = the number of temporaries needed to evaluate e.
NT(e1 + e2):
- Needs at least as many temporaries as NT(e1)
- Needs at least as many temporaries as NT(e2) + 1
- Space used for temporaries in e1 can be reused for temporaries in e2

The Equations

  NT(e1 + e2) = max(NT(e1), 1 + NT(e2))
  NT(e1 - e2) = max(NT(e1), 1 + NT(e2))
  NT(if e1 = e2 then e3 else e4) = max(NT(e1), 1 + NT(e2), NT(e3), NT(e4))
  NT(id(e1, ..., en)) = max(NT(e1), ..., NT(en))
  NT(int) = 0
  NT(id) = 0

Is this bottom-up or top-down? What is NT(code for fib)?

The Revised AR
For a function definition f(x1, ..., xn) = e, the AR has 2 + n + NT(e) elements:
- Return address
- Frame pointer
- n arguments
- NT(e) locations for intermediate results

Picture
The frame for f(x1, ..., xn), with the stack growing downward:

  Old FP        (higher addresses)
  xn
  ...
  x1
  RA            ← FP
  Temp 1
  ...
  Temp NT(e)    ← SP

Revised Code Generation
- Code generation must know how many temporaries are in use at each point
- Add a new argument to code generation: the position of the next available temporary
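The NT equations are a straightforward bottom-up recursion; a Python sketch that answers the slide's question for fib (the tuple AST encoding is mine):

```python
# NT(e) per the equations above, over tuple ASTs:
# ("int", i), ("id", x), ("add"/"sub", e1, e2),
# ("if", e1, e2, e3, e4), ("call", f, [args]).
def NT(e):
    op = e[0]
    if op in ("int", "id"):
        return 0
    if op in ("add", "sub"):
        return max(NT(e[1]), 1 + NT(e[2]))
    if op == "if":
        return max(NT(e[1]), 1 + NT(e[2]), NT(e[3]), NT(e[4]))
    if op == "call":
        return max((NT(a) for a in e[2]), default=0)
    raise ValueError(op)

# The body of fib from the slide.
x, one, two = ("id", "x"), ("int", 1), ("int", 2)
fib_body = ("if", x, one, ("int", 0),
            ("if", x, two, ("int", 1),
             ("add", ("call", "fib", [("sub", x, one)]),
                     ("call", "fib", [("sub", x, two)]))))
print(NT(fib_body))   # 2 -- two temporary slots in the AR suffice
```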

Code Generation for + (Original)

  cgen(e1 + e2) =
    cgen(e1)
    sw $a0 0($sp)
    addiu $sp $sp -4
    cgen(e2)
    lw $t1 4($sp)
    add $a0 $t1 $a0
    addiu $sp $sp 4

Code Generation for + (Revised)

  cgen(e1 + e2, nt) =
    cgen(e1, nt)
    sw $a0 -nt($fp)
    cgen(e2, nt + 4)
    lw $t1 -nt($fp)
    add $a0 $t1 $a0

Notes
- The temporary area is used like a small, fixed-size stack
- Exercise: write out cgen for the other constructs

Code Generation for Object-Oriented Languages

Object Layout
- OO implementation = stuff from the last lecture + more stuff
- OO slogan: if B is a subclass of A, then an object of class B can be used wherever an object of class A is expected
- This means that code in class A works unmodified for an object of class B
Two issues:
- How are objects represented in memory?
- How is dynamic dispatch implemented?

Object Layout Example

  class A {
    int a = 0;
    int d = 1;
    int f() { return a = a + d }
  }
  class B extends A {
    int b = 2;
    int f() { return a }            // override
    int g() { return a = a - b }
  }
  class C extends A {
    int c = 3;
    int h() { return a = a * c }
  }

Object Layout (Cont.)
- Attributes a and d are inherited by classes B and C
- All methods in all classes refer to a
- For A's methods to work correctly in A, B, and C objects, attribute a must be in the same place in each object

Object Layout (Cont.)
- An object is like a struct in C: the reference foo.field is an index into a foo struct at an offset corresponding to field
- Objects are laid out in contiguous memory
- Each attribute is stored at a fixed offset in the object
- When a method is invoked, the object is "this" and the fields are the object's attributes

Object Layout
The first 3 words of an object contain header information:

  Offset   Contents
  0        Class Tag
  4        Object Size
  8        Dispatch Ptr
  12       Attribute 1
  16       Attribute 2
  ...      ...

Object Layout (Cont.)
- The class tag is an integer identifying the class of the object
- The object size is an integer: the size of the object in words
- The dispatch ptr is a pointer to a table of methods (more later)
- Attributes are in the subsequent slots

Subclasses
Observation: given a layout for class A, a layout for a subclass B can be defined by extending the layout of A with additional slots for the additional attributes of B.
- This leaves the layout of A unchanged (B is an extension)
- Lay out in contiguous memory
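The subclass-extends-parent rule above can be sketched as an offset-assignment routine; a Python illustration (a toy model of mine, assuming a 3-word header and 4-byte attributes):

```python
# Attribute-offset assignment: a subclass copies its parent's layout, so
# inherited attributes keep their offsets; new attributes go after them.
HEADER_WORDS = 3   # class tag, object size, dispatch ptr

def layout(parent_layout, own_attrs):
    lay = dict(parent_layout)                  # inherited offsets unchanged
    base = HEADER_WORDS * 4 + 4 * len(lay)     # first free slot
    for i, attr in enumerate(own_attrs):
        lay[attr] = base + 4 * i
    return lay

A = layout({}, ["a", "d"])   # a at 12, d at 16
B = layout(A, ["b"])         # inherits a, d; b at 20
C = layout(A, ["c"])         # inherits a, d; c at 20
print(A, B, C)
```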

Layout Picture

  Offset   Class A   Class B   Class C
  0        Atag      Btag      Ctag
  4        5         6         6
  8        *         *         *
  12       a         a         a
  16       d         d         d
  20                 b         c

Subclasses (Cont.)
- The offset for an attribute is the same in a class and all of its subclasses
- Any method for an A1 can be used on a subclass A2
- Consider the layout for a chain of subclasses An < ... < A3 < A2 < A1:

  Header | A1 attrs | A2 attrs | A3 attrs | ...

  (an A1 object ends after the A1 attrs, an A2 object after the A2 attrs, and so on)
- What about multiple inheritance?

Dynamic Dispatch
Consider again our example:

  class A {
    int a = 0;
    int d = 1;
    int f() { return a = a + d }
  }
  class B extends A {
    int b = 2;
    int f() { return a }
    int g() { return a = a - b }
  }
  class C extends A {
    int c = 3;
    int h() { return a = a * c }
  }

Dynamic Dispatch Example
- e.g(): g refers to the method in B if e is a B
- e.f(): f refers to the method in A if e is an A or a C (inherited in the case of C); f refers to the method in B for a B object
- The implementation of methods and dynamic dispatch strongly resembles the implementation of attributes

Using Dispatch Tables The dispatch pointer in an object of class X points to the dispatch table for class X Every method f of class X is assigned an offset O f in the dispatch table at compile time Using Dispatch Tables (Cont.) Every method must know what object is this this is passed as the first argument to all methods To implement a dynamic dispatch e.f() we Evaluate e, obtaining an object x Find D by reading the dispatch-table field of x Call D[O f ](x) D is the dispatch table for x In the call, this is bound to x CS 164 Lecture 14 Fall 2004 79 CS 164 Lecture 14 Fall 2004 80