Lecture #31: Code Generation

[This lecture adopted in part from notes by R. Bodik]

Intermediate Languages and Machine Languages

From trees such as those output from project #2, we could produce machine language directly. However, it is often convenient to first generate some kind of intermediate language (IL): a high-level machine language for a virtual machine.

Advantages:

- Separates the problem of extracting the operational meaning (the dynamic semantics) of a program from the problem of producing good machine code from it, because it
  - gives a clean target for code generation from the AST;
  - by choosing the IL judiciously, we can make the conversion of IL to machine language easier than the direct conversion of AST to machine language.
- Helpful when we want to target several different architectures (e.g., gcc).
- Likewise, if we can use the same IL for multiple languages, we can re-use the IL-to-machine-language implementation (e.g., gcc, CIL from Microsoft's Common Language Infrastructure).

Stack Machines as Virtual Machines

A simple evaluation model: instead of registers, a stack of values for intermediate results. Examples: the Java Virtual Machine, the PostScript interpreter.

Each operation (1) pops its operands from the top of the stack, (2) computes the required operation on them, and (3) pushes the result on the stack.

A program to compute 7 + 5:

    push 7    # Push constant 7 on stack.
    push 5
    add       # Pop two (5 and 7) from stack, add, and push result.

Advantages:

- Uniform compilation scheme: each operation takes operands from the same place and puts results in the same place.
- Fewer explicit operands in instructions means smaller encoding of instructions and more compact programs.
- Meshes nicely with subroutine calling conventions that push arguments on the stack.
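
To make the evaluation model concrete, here is a minimal C++ sketch of a stack-machine interpreter for just these two instructions. It is not part of the original notes; the Opcode and Inst types are made up for illustration.

    #include <cstdio>
    #include <vector>

    // A hypothetical instruction: an opcode plus an optional literal operand.
    enum Opcode { PUSH, ADD };
    struct Inst { Opcode op; int value; };

    // Execute a program using a stack of intermediate results.
    int run (const std::vector<Inst>& program) {
        std::vector<int> stack;
        for (const Inst& inst : program) {
            switch (inst.op) {
            case PUSH:
                stack.push_back (inst.value);               // push literal operand
                break;
            case ADD: {
                int right = stack.back (); stack.pop_back ();  // pop operands
                int left = stack.back (); stack.pop_back ();
                stack.push_back (left + right);                // push result
                break;
            }
            }
        }
        return stack.back ();                               // value of the expression
    }

    int main () {
        std::vector<Inst> program = { {PUSH, 7}, {PUSH, 5}, {ADD, 0} };
        std::printf ("%d\n", run (program));                // prints 12
        return 0;
    }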

Stack Machine with Accumulator

The add instruction does three memory operations: two reads and one write of the stack. The top of the stack is frequently accessed.

Idea: keep the most recently computed value in a register (called the accumulator), since register accesses are faster.

For an operation op(e1, ..., en):

- compute each of e1, ..., en-1 into acc and then push it on the stack;
- compute en into the accumulator;
- perform the op computation, with the result in acc;
- pop e1, ..., en-1 off the stack.

The add instruction is now

    acc := acc + top_of_stack
    pop one item off the stack

and uses just one memory operation (popping just means adding a constant to the stack-pointer register).

After computing an expression, the stack is as it was before computing the operands.

Example: Full Computation of 7 + 5

    acc := 7
    push acc
    acc := 5
    acc := acc + top_of_stack
    pop stack
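
As a further illustration (not in the original notes), applying the same scheme to the nested expression 3 + (7 + 5) gives:

    acc := 3
    push acc
    acc := 7
    push acc
    acc := 5
    acc := acc + top_of_stack    # acc = 12
    pop stack
    acc := acc + top_of_stack    # acc = 15
    pop stack

After each subexpression is computed, the stack is exactly as it was beforehand.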

A Point of Order

Often more convenient to push operands in reverse order, so the rightmost operand is pushed first. This is a common convention for pushing function arguments, and is especially natural when the stack grows toward lower addresses. Also nice for non-commutative operations on architectures such as the ia32.

Example: compute x - y. We show assembly code on the right.

    acc := y                       movl  y, %eax
    push acc                       pushl %eax
    acc := x                       movl  x, %eax
    acc := acc - top_of_stack      subl  (%esp), %eax
    pop stack                      addl  $4, %esp

Translating from AST to Stack Machine (I)

First, it might be useful to have abstractions for our virtual machine and its operations:

    class Instruction;

    /** A virtual machine. */
    class VM {
    public:
        /** Add INST to our instruction sequence. */
        void emitinst (Instruction inst);
        ...
    };

    /** Represents machine instructions in a VM. */
    class Instruction {
        ...
    };

Translating from AST to Stack Machine (II)

A simple recursive pattern usually serves for expressions. At the top level, our trees might have an expression-code method:

    class AST {
    public:
        /** Generate code for me, leaving my value on the stack. */
        virtual void cgen (VM* machine);

        /** An appropriate VM instruction to use when my operands are on
         *  the stack. */
        virtual Instruction getinst ();
        ...
    };

Translating from AST to Stack Machine (III)

Implementations of cgen then obey this general comment, and each assumes that its children's do as well. E.g.,

    class Binop_AST : public AST {
        void cgen (VM* machine) {
            child (1)->cgen (machine);
            child (0)->cgen (machine);
            machine->emitinst (getinst ());
        }
        ...
    };

It is up to the implementation of VM to decide how the stack is represented: with all results in memory, or with the most recent in an accumulator. The code for cgen need not change (an example of separation of concerns, by the way).
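
To ground the recursion at the leaves, a literal can simply push its own value. The following is only a sketch under assumptions: the Instruction::pushinst factory and the value () accessor are hypothetical, not part of the interfaces shown above.

    class Int_Literal_AST : public AST {
        void cgen (VM* machine) {
            // Leave my value on the stack, as the general comment requires.
            // pushinst and value () are hypothetical helpers.
            machine->emitinst (Instruction::pushinst (value ()));
        }
    };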

Virtual Register Machines and Three-Address Code

Another common kind of virtual machine has an infinite supply of registers, each capable of holding a scalar value or address, in addition to ordinary memory. A common IL in this case is some form of three-address code, so called because the typical working instruction has the form

    target := operand1 op operand2

where there are two source addresses, one destination address, and an operation (op).

Often, we require that the operands in the full three-address form denote (virtual) registers or immediate (literal) values, similar to the usual RISC architecture.
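
For example (an illustration, not from the notes), the expression (a - b) * (c + 1) might lower to the sequence below, assuming a, b, and c already reside in virtual registers r1, r2, and r3:

    r4 := r1 - r2
    r5 := r3 + 1
    r6 := r4 * r5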

Three-Address Code, continued

A few other forms deal with memory and other kinds of operation:

    memory_operand   := register_or_immediate_operand
    register_operand := register_or_immediate_operand
    register_operand := memory_operand
    goto label
    if operand1 rel operand2 then goto label
    param operand                  ; Push parameter for call.
    call operand, # of parameters  ; Call; put return value in a
                                   ; specific dedicated register.

Here, rel stands for some kind of comparison. Memory operands might be labels of static locations, or indexed operands such as (in C-like notation) *(r1+4) or *(r1+r2).

Translating from AST into Three-Address Code

This time, we'll have the cgen routine return where it has put its result:

    class AST {
        /** Generate code to compute my value, returning the location
         *  of the result. */
        virtual Operand* cgen (VM* machine);
        ...
    };

where an Operand denotes some abstract place holding a value. Once again, we rely on our children to obey this general comment:

    class Binop_AST : public AST {
        Operand* cgen (VM* machine) {
            Operand* left = child (0)->cgen (machine);
            Operand* right = child (1)->cgen (machine);
            Operand* result = machine->allocateregister ();
            machine->emitinst (result, getinst (), left, right);
            return result;
        }
        ...
    };

emitinst now produces three-address instructions.
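
For instance (an illustration, not from the notes), for x + y each identifier is moved into a fresh register (as in the Id_AST case shown later), and the Binop_AST node then emits a single arithmetic instruction:

    r1 := x
    r2 := y
    r3 := r1 + r2

The Operand* returned denotes the register r3.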

A Larger Example

Consider a small language with integers and integer operations:

    P    : D ";" P | D
    D    : "def" id "(" ARGS ")" "=" E
    ARGS : id "," ARGS | id
    E    : int
         | id
         | "if" E1 "=" E2 "then" E3 "else" E4 "fi"
         | E1 "+" E2
         | E1 "-" E2
         | id "(" E1, ..., En ")"

- The first function definition, f, is the main routine.
- Running the program on input i means computing f(i).
- Assume a project-2-like AST.

Let's continue implementing cgen (+ and - already done).

Simple Cases: Literals and Sequences

Conversion of D ";" P:

    class StmtList_AST : public AST {
        Operand* cgen (VM* machine) {
            for (int i = 0; i < arity (); i += 1)
                child (i)->cgen (machine);
            return Operand::NoneOperand;
        }
        ...
    };

Conversion of integer literals:

    class Int_Literal_AST : public Typed_Leaf {
        Operand* cgen (VM* machine) {
            return machine->immediateoperand (inttokenvalue ());
        }
        ...
    };

NoneOperand is an Operand that contains None.

Identifiers

    class Id_AST : public AST {
        Operand* cgen (VM* machine) {
            Operand* result = machine->allocateregister ();
            machine->emitinst (MOVE, result,
                               getdecl ()->getmylocation (machine));
            return result;
        }
        ...
    };

That is, we assume that the declaration object that holds information about this occurrence of the identifier contains its location.

Calls

    class Call_AST : public AST {
        Operand* cgen (VM* machine) {
            AST* args = getarglist ();
            for (int i = args->arity () - 1; i >= 0; i -= 1)
                machine->emitinst (PARAM, args->child (i)->cgen (machine));
            Operand* callable = child (0)->cgen (machine);
            machine->emitinst (CALL, callable, args->arity ());
            return Operand::ReturnOperand;
        }
        ...
    };

ReturnOperand is the abstract location in which functions return their value.
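
So for a hypothetical call g(a, b) (an illustration, not from the notes), the arguments are evaluated and pushed right to left, giving roughly:

    r1 := b
    param r1
    r2 := a
    param r2
    call g, 2

with the result of the call left in ReturnOperand.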

Control Expressions: if (Strategy)

Control expressions generally involve jump and conditional-jump instructions. To translate

    if E1 = E2 then E3 else E4 fi

we might aim to produce something that realizes the following pseudocode:

        code to compute E1 into r1
        code to compute E2 into r2
        if r1 != r2 goto L1
        code to compute E3 into r3
        goto L2
    L1: code to compute E4 into r3
    L2:

where the ri denote virtual-machine registers.

Control Expressions: if (Code Generation)

    class IfExpr_AST : public AST {
        Operand* cgen (VM* machine) {
            Operand* left = child (0)->cgen (machine);
            Operand* right = child (1)->cgen (machine);
            Label* elselabel = machine->newlabel ();
            Label* donelabel = machine->newlabel ();
            machine->emitinst (IFNE, left, right, elselabel);
            Operand* result = machine->allocateregister ();
            machine->emitinst (MOVE, result, child (2)->cgen (machine));
            machine->emitinst (GOTO, donelabel);
            machine->placelabel (elselabel);
            machine->emitinst (MOVE, result, child (3)->cgen (machine));
            machine->placelabel (donelabel);
            return result;
        }
        ...
    };

newlabel creates a new, undefined instruction label. placelabel inserts a definition of the label in the code.
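
As an illustration (not in the original notes), for if x = 1 then 0 else 2 fi this method emits roughly:

        r1 := x
        if r1 != 1 then goto L1
        r2 := 0
        goto L2
    L1: r2 := 2
    L2:

The literals are immediate operands and generate no code of their own; the result is left in r2.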

Code Generation for def

    class DefNode : public AST {
        Operand* cgen (VM* machine) {
            machine->placelabel (getname ());
            machine->emitfunctionprologue ();
            Operand* result = getbody ()->cgen (machine);
            machine->emitinst (MOVE, Operand::ReturnOperand, result);
            machine->emitfunctionepilogue ();
            return Operand::NoneOperand;
        }
        ...
    };

where function prologues and epilogues are standard code sequences for entering and leaving functions, setting frame pointers, etc.
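
For concreteness, one plausible ia32 realization of the prologue and epilogue (an assumption, not from the notes; FRAMESIZE is a hypothetical constant covering locals and temporaries) is:

    pushl %ebp              # save caller's frame pointer
    movl  %esp, %ebp        # establish this frame's pointer
    subl  $FRAMESIZE, %esp  # reserve space for locals and temporaries
    ...                     # function body
    movl  %ebp, %esp        # discard locals
    popl  %ebp              # restore caller's frame pointer
    ret                     # return to caller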

A Sample Translation

Program for computing the Fibonacci numbers:

    def fib(x) = if x = 1 then 0
                 else if x = 2 then 1
                 else fib(x - 1) + fib(x - 2) fi fi

Possible code generated:

    fib: function prologue
         r1 := x
         if r1 != 1 then goto L1
         r2 := 0
         goto L2
    L1:  r3 := x
         if r3 != 2 then goto L3
         r4 := 1
         goto L4
    L3:  r5 := x
         r6 := r5 - 1
         param r6
         call fib, 1
         r7 := rret
         r8 := x
         r9 := r8 - 2
         param r9
         call fib, 1
         r10 := r7 + rret
         r4 := r10
    L4:  r2 := r4
    L2:  rret := r2
         function epilogue