Intermediate Code Generation (ICG)

Similar documents
The structure of a compiler

ssumptions trategy = Later passes have less semantic information = E.g., array indexing, boolean expressions, case statements

Compiler Design. Code Shaping. Hwansoo Han

Code Shape II Expressions & Assignment

Implementing Control Flow Constructs Comp 412

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

CS415 Compilers. Intermediate Represeation & Code Generation

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Run-time Environments. Lecture 13. Prof. Alex Aiken Original Slides (Modified by Prof. Vijay Ganesh) Lecture 13

Separate compilation. Topic 6: Runtime Environments p.1/21. CS 526 Topic 6: Runtime Environments The linkage convention

CS5363 Final Review. cs5363 1

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

(just another operator) Ambiguous scalars or aggregates go into memory

Goals of Program Optimization (1 of 2)

CMPT 379 Compilers. Anoop Sarkar. 11/13/07 1. TAC: Intermediate Representation. Language + Machine Independent TAC

Optimization Prof. James L. Frankel Harvard University

CS 432 Fall Mike Lam, Professor. Code Generation

Intermediate Representations

CS 314 Principles of Programming Languages. Lecture 13

CS 406/534 Compiler Construction Instruction Scheduling beyond Basic Block and Code Generation

G Programming Languages - Fall 2012

The Software Stack: From Assembly Language to Machine Code

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Intermediate Code Generation

Intermediate Code Generation

Run-time Environments

Run-time Environments

Intermediate Code Generation

Lexical Considerations

Lexical Considerations

CSc 453 Intermediate Code Generation

CSE 504: Compiler Design. Runtime Environments

Instruction Selection, II Tree-pattern matching

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

Homework I - Solution

Short Notes of CS201

CS 314 Principles of Programming Languages

CSE 401/M501 Compilers

1 Lexical Considerations

CS201 - Introduction to Programming Glossary By

SSA Based Mobile Code: Construction and Empirical Evaluation

Languages and Compiler Design II IR Code Optimization

Intermediate Representations

Intermediate Representations & Symbol Tables

Translation. From ASTs to IR trees

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Handling Assignment Comp 412

CS153: Compilers Lecture 15: Local Optimization

Code Generation. Lecture 12

Instruction Selection: Peephole Matching. Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.

The Procedure Abstraction

Intermediate Representation (IR)

Languages and Compiler Design II IR Code Generation I

Final CSE 131B Spring 2004

Code Generation & Parameter Passing

LECTURE 17. Expressions and Assignment

Type checking. Jianguo Lu. November 27, slides adapted from Sean Treichler and Alex Aiken s. Jianguo Lu November 27, / 39

Semantic Analysis and Type Checking

Agenda. CSE P 501 Compilers. Big Picture. Compiler Organization. Intermediate Representations. IR for Code Generation. CSE P 501 Au05 N-1

Concepts Introduced in Chapter 6

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

Code Generation. Dragon: Ch (Just part of it) Holub: Ch 6.

Compilers. Lecture 2 Overview. (original slides by Sam

Lecture 4: Instruction Set Design/Pipelining

CSC 467 Lecture 13-14: Semantic Analysis

Dynamic Dispatch and Duck Typing. L25: Modern Compiler Design

CS 406/534 Compiler Construction Putting It All Together

CSE 401 Final Exam. March 14, 2017 Happy π Day! (3/14) This exam is closed book, closed notes, closed electronics, closed neighbors, open mind,...

Code Generation. Lecture 19

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Field Analysis. Last time Exploit encapsulation to improve memory system performance

Types II. Hwansoo Han

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

UNIT-V. Symbol Table & Run-Time Environments Symbol Table

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

CIS 341 Midterm March 2, 2017 SOLUTIONS

Procedure and Object- Oriented Abstraction

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

Intermediate Representations

CS 426 Fall Machine Problem 4. Machine Problem 4. CS 426 Compiler Construction Fall Semester 2017

Lecture Outline. Topic 1: Basic Code Generation. Code Generation. Lecture 12. Topic 2: Code Generation for Objects. Simulating a Stack Machine

Semantic actions for declarations and expressions

Compilers and computer architecture: A realistic compiler to MIPS

Lecture 12: Compiler Backend

CSE 504: Compiler Design. Code Generation

CS 360 Programming Languages Interpreters

Introduction to Compiler

CODE GENERATION Monday, May 31, 2010

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

Instruction Selection: Preliminaries. Comp 412

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Topic 6 Basic Back-End Optimization

CS2210: Compiler Construction. Code Generation

Concepts Introduced in Chapter 6

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

Module 27 Switch-case statements and Run-time storage management

Transcription:

Intermediate Code Generation (ICG) Transform AST to lower-level intermediate representation Basic Goals: Separation of Concerns Generate efficient code sequences for individual operations Keep it fast and simple: leave most optimizations to later phases Provide clean, easy-to-optimize code IR forms the basis for code optimization and target code generation Mid-level vs. Low-level Model of Compilation Both models can use a machine-independent IR: AST machine-specific code sequences target code AST machine-independent code sequences machine-level IR target code Key difference: where does target instruction selection happen Topic 8: Intermediate Code Generation p.1/32

Intermediate code generation Overview Assumptions Strategy Goal: Translate AST to low-level machine-independent 3-address IR Intermediate language: RISC-like 3-address code Intermediate Code Generation (ICG) is independent of target ISA Storage layout has been pre-determined Infinite number of registers + Frame Pointer (FP) Q. What values can live in registers? 1. Simple bottom-up tree-walk on AST ILOC: Cooper and Torczon, Appendix A. 2. Translation uses only local info: current AST node + children 3. Good (local) code is important! = Later passes have less semantic information = E.g., array indexing, boolean expressions, case statements 4. We will discuss important special cases Topic 8: Intermediate Code Generation p.2/32

Code generation for expression trees Illustrates the tree-walk scheme assign a virtual register to each operator emit code in postorder traversal of expression tree Support routines Notes assume tree reflects precedence, associativity assume all operands are integers base() and offset() may emit code base() handles lexical scoping base( str ) looks up str in the symbol table and returns a virtual register that contains the base address for str offset( str ) looks up str in the symbol table and returns a virtual register that contains the offset of str from its base register new name() returns a new virtual register name Topic 8: Intermediate Code Generation p.3/32

Simple treewalk for expressions expr( node ) int result, t1, t2, t3; switch( type of node ) { case TIMES: t1 = expr( left child of node ); t2 = expr( right child of node ); result = new_name(); emit( mult, t1, t2, =>, result ); break; case PLUS: t1 = expr( left child of node ); t2 = expr( right child of node ); result = new_name(); emit( add, t1, t2, =>, result ); break; } case ID: t1 = base( node.val ); t2 = offset( node.val ); result = new_name(); emit( loadao, t1, t2, =>, res break; case NUM: result = new_name(); emit( loadi, node.val, =>, re break; return result; Minus & divide follow the same pattern Topic 8: Intermediate Code Generation p.4/32

Code generation Assume base for x and y is fp loadi offset of x => r1 loadao fp, r1 => r2 ; r2 x loadi 4 => r3 ; constant loadi offset of y => r4 loadao fp, r4 => r5 ; r5 y mult r3, r5 => r6 add r2, r6 => r7 + id * x num 4 id y Topic 8: Intermediate Code Generation p.5/32

Mixed type expressions Mixed type expressions E.g., x + 4 * 2.3e0 expression must have a clearly defined meaning typically convert to more general type complicated, machine dependent code Sample Conversion Table + int real double complex Typical Language Rule E.g., x + 4, where (Tx T 4 ): 1. T result f(+, Tx, T 4 ) 2. convert x to T result 3. convert 4 to T result 4. add converted values (yields T result ) int int real double complex real real real double complex double double double double complex complex complex complex complex complex Topic 8: Intermediate Code Generation p.6/32

Array references Example: A[i,j] Basic Strategy 1. Translate i (may be an expr) 2. Translate j (may be an expr) 3. Translate &A + [i, j] 4. Emit load Index Calculation assuming row-major order ) Let n i = high i low i + 1 Simple address expression (in two dimensions): base + ((i 1 low 1 ) n 2 + i 2 low 2 ) w Reordered address expression (in k dimensions): ((...(i 1 n 2 + i 2 )n 3 + i 3 )...)n k + i k ) w + base w ((...((low 1 n 2 ) + low 2 )n 3 + low 3 )...)n k + low k ) Topic 8: Intermediate Code Generation p.7/32

Optimizing the address calculation ((...(i 1 n 2 + i 2 )n 3 + i 3 )...)n k + i k ) w + base w ((...((low 1 n 2 ) + low 2 )n 3 + low 3 )...)n k + low k ) Constants Usually, all low i are constants Sometimes, all n i except n 1 (high-order dimension) are constant = final term is compile-time evaluable Expose common subexpressions refactor first term to create terms for each i r : i r n r+1 n r+2... n k w LICM: update r th term only when i r changes can remove much of the overhead Topic 8: Intermediate Code Generation p.8/32

Whole arrays as procedure parameters Three main challenges 1. Finding extents of all dimensions (including highest if checking bounds) 2. Passing non-contiguous section of larger array, e.g., Fortran 90: Formal Param: F(:, :) Actual Param (whole array): A(1:100, 1:100), Actual Param (array section): A(10:50:2,20:100:4) 3. Passing an array by value Language design choices C, C++, Java, Fortran 77: problem (1) is trivial and (2,3) don t exist Fortran 90, 95,... : problems (1) and (2) are non-trivial Passing whole arrays by value making a copy is extremely expensive = pass by reference, copy-on-write if value modified most languages (including call-by-value ones) pass arrays by reference Topic 8: Intermediate Code Generation p.9/32

Whole arrays by reference Finding extents pass a pointer to a dope vector as parameter: [l 1, u 1, s 1, l 2, u 2, s 1,...l k, u k, s k ] stuff in all the values in the calling sequence generate address polynomial in callee interprocedural optimizations can eliminate this: inlining procedure specialization (aka cloning) single caller Passing non-contiguous section of larger array Fortran 90 requires that section must have regular stride = dope vector with strides is sufficient Topic 8: Intermediate Code Generation p.10/32

Function calls in Expressions Key issue: Side Effects Evaluation order is important Example: func1(a) * globalx * func2(b) Register save/restore will preserve intermediate values Use standard calling sequence for each call set up the arguments generate the call and return sequence get the return value into a register Topic 8: Intermediate Code Generation p.11/32

Boolean & relational expressions Boolean expressions boolean not or-term or-term or-term or-term or and-term and-term Relational expressions rel-term rel-term rel-op expr expr rel-op < = = > expr... (rest of expr grammar) and-term and-term and value value value true false rel-term Topic 8: Intermediate Code Generation p.12/32

Short-circuiting Boolean Expressions What is short circuiting? Example Terms of a boolean expression can be evaluated until its value is established. Then, any remaining terms must not be evaluated. if (a && foo(b))... call to foo() should not be made if a is false. Basic Rules once value established, stop evaluating true or expr is true false and expr is false order of evaluation must be observed Note: If order of evaluation is unspecified, short-circuiting can be used as an optimization: reorder by cost and short-circuit Topic 8: Intermediate Code Generation p.13/32

Relations and Booleans using numerical values Numerical encoding assign a value to true, such as 1 (0x00000001) or -1 (0xFFFFFFFF) assign a value to false, such as 0 use hardware instructions and, or, not, xor Select values that work with the hardware (not 1 & 3) Example: b or c and not d 1 t1 not d 2 t2 c and t1 3 t3 b or t2 Example: if (a < b) 1 if (a < b) br l1 2 t1 false 3 br l2 4 l1: t1 true 5 l2: nop ; now use result Can represent relational as boolean! Integrates well into larger boolean expressions Topic 8: Intermediate Code Generation p.14/32

Relationals using control flow Encode using the program counter encode answer as a position in the code use conditional branches and hardware comparator along one path, relation holds; on other path, it does not Example: if (a < b) stmt 1 else stmt 2 Naïve code: After branch folding: 1 if (a < b) br lthen 2 br lelse 3 lthen: code for stmt1 4 br lafter 5 lelse: code for stmt2 6 br lafter 7 lafter: nop 1 if (a < b) br lthen 2 lelse: code for stmt2 3 br lafter 4 lthen: code for stmt1 5 lafter: nop Path lengths are balanced. Topic 8: Intermediate Code Generation p.15/32

Booleans using control flow Example: if (a<b or c<d and e<f) then stmt1 else stmt2 Naïve code: After branch folding: 1 if (a < b) br lthen 2 br l1 3 l1: if (c < d) br l2 4 br lelse 5 l2: if (e < f) br lthen 6 br lelse 7 lthen: stmt1 8 br lafter 9 lelse: stmt2 10 br lafter 11 lafter: nop 1 if (a < b) br lthen 2 if (c >= d) br lelse 3 if (e < f) br lthen 4 lelse: stmt2 5 br lafter 6 lthen: stmt1 7 lafter: nop It cleans up pretty well. Topic 8: Intermediate Code Generation p.16/32

Control Flow vs. Numerical Representations: Tradeoffs Hardware Issues Condition code (CC) registers: encode comparisons Conditional moves: use CC regs as boolean values Predicated instructions: us boolean values for condition execution (instead of control flow Tradeoffs Control flow works well when: Result is only used for branching Conditional moves and predicated execution are not available or code in branches is not appropriate for them Numerical representation works well when: Result must be materialized in a variable Result is used for branching but conditional moves or predicated execution are available and appropriate to use for code in branches Topic 8: Intermediate Code Generation p.17/32

Control-flow constructs Examples Loops Branches are common and expensive. Efficient inner loops are critical. if-then-else: see Boolean/relational expressions do, while or for loops switch statement Convert to a common representation do: evaluate iteration count first while and for: test and backward branch at bottom of loop = simple loop becomes a single basic block = backward branch: easy to predict Topic 8: Intermediate Code Generation p.18/32

Case statements Basic rules 1. evaluate the controlling expression 2. branch to the selected case 3. execute its code 4. branch to the following statement Main challenge: finding the right case Method When Cost linear search few cases O( cases ) binary search sparse O(log 2 ( cases )) jump table dense O(1), but with table lookup Topic 8: Intermediate Code Generation p.19/32

Options for Implementing Assignment of Primitive Objects 1 let x: Int <- foo() in 2 self.n <-... Assignment Option 1: Copy pointers 1. t0 = alloca pointer to IntObject; 2. t2 = Call foo() ;; t1 points to result object 3. *t0 = t2 ;; t0, t2 now point to same obj 4.... 16.... <self.n = t10> ;; both point to same obj Option 2: Copy values 1. t0 = alloca pointer to IntObject ; on stack 2. t1 = malloc IntObject ; on heap 3. *t0 = t1; 4. t2 = Call foo() ; foo() returns int 5. t4 = *t0 ;; t4: IntObject* 6. t4->val = t2 ;; **t0 now contains value of foo() 7.... 18.... <self.n->val = t12> Topic 8: Intermediate Code Generation p.20/32

Options for Implementing Assignment of Primitive Objects Assignment 1 let x: Int <- foo() + self.n in 2 let o: Object <- x in 3 o.type_name()... Option 1: Copy pointers 1. tx = alloca IntObject* ;; tx: IntObject** 2. to = alloca Object* ;; to: Object** 3. t1 = Call foo() ;; t1: IntObject* 4. t2 = selfptr->n->val ;; n: IntObject on heap?? t5 =... <eval t1->val + t2?? *tx = t5 ;; store result pointer to x?? *to = *tx ;; *to, *tx, t5 point to same obj?? t6 = *to ;; get Int object pointer??... <dispatch t6->type_name()> ;; uses Int as object Option 3: Box / unbox only where needed 1. tx = alloca int ;; tx: int* 2. to = alloca Object* ;; to: Object** 3. t2 = Call foo() ;; t2: int 4. t3 = self->n ;; n: int; fields are unboxed 5. t4 = t2 + t3 ;; t2: int 6. *tx = t2 ;; store int t4 to x 7. t5 = *tx ;; t5: int; value of x 8. t6 = malloc IntObject ;; t6: IntObject* 9. t6->val = t4 ;; int is now boxed 10.... <dispatch t6->type_name()> ;; uses boxed int as object Topic 8: Intermediate Code Generation p.21/32

Options for Assignment: How well can they be optimized? Option 1: Copy pointers Uniform code generator: no special cases for primitives Local objects on heap : many promoted to int registers; rest in heap Fields: likely to remain objects in heap Option 2: Copy values Special cases for primitives: assignment, copies Let objects on heap: all promoted to int registers Temp objects on heap: all promoted to int registers Fields: likely to remain objects in heap Option 3: Box/unbox only where needed Special cases for primitives: assignment, copies, method dispatch Let vars, temps: held in int registers (except when boxed) Fields: simple int fields in parent object Overhead of boxing/unboxing only at object operations This only applies to primitive objects Topic 8: Intermediate Code Generation p.22/32

Structures Structure Accesses p->x becomes loadai r p, offset(x) r 1 Structure Layout: Key Goals All structure fields have constant offsets, fixed at compile-time May need padding between fields = Why? May need padding at end of struct = Why? Structure Layout Example struct Small { char* p; int n; }; struct Big { char c; struct Small m; int i }; Assume SparcV9: Byte alignments = pointer: 8, int: 4, short: 2 Offsets? p :? n :? c :? m :? i :? sizeof(struct Small) =? sizeof(struct Big) =? Topic 8: Intermediate Code Generation p.23/32

Class with single-inheritance Key Operations and Terminology p.x p.m(a 1, a 2,...,a n ) C q q = (C q ) p Field access Method dispatch Downcast Terminology and Type-Checking Assumptions C p, C q : the static types of references p, q O p, O q : the dynamic types of the objects p, q refer to O p C p (in COOL notation), i.e., O p is lower in inheritance tree. x, M are valid members of C p = valid for O p For downcast, O p C q When is this checked? Topic 8: Intermediate Code Generation p.24/32

Class with single-inheritance: Code Generation Goals Functional Goals 1. Class layouts, run-time descriptors constructed at compile-time Note: Class-loading-time in a JVM is compile-time 2. Same code sequences must work for any O p C p 3. Separate compilation of classes = we know superclasses but not subclasses = code sequences, layouts, descriptors must be consistent for all classes Efficiency Goals 1. Small constant-time code sequences 2. Avoid hashing [class-name,method-name] func ptr 3. Minimize #indirection steps 4. Minimize #object allocations for temporaries 5. Two important optimizations: inlining, dynamic static dispatch Topic 8: Intermediate Code Generation p.25/32

Single-inheritance: Example Runtime Objects Class record: Method table: One per class One per class Object record: One per instance COOL Classes 1 class C1 (* inherits Object *) { x1: Int, y1: Int; M1(): Int } 2 class C2 inherits C1 { x2: Int; M2(): Int }; 3 class C3 inherits C2 { x3: Int; M1(): Int; M3(): Int }; Class Records for Example Class Records in C 1 struct ClassC1 { struct ClassObject* p; VTableC1 { $...$ }; int 1; } 2 struct ClassC2 { struct ClassC1* p; VTableC2 { $...$ }; int 2; } 3 struct ClassC3 { struct ClassC2* p; VTableC3 { $...$ }; int 3; } Topic 8: Intermediate Code Generation p.26/32

Single-inheritance: Example (continued) Method Tables for Example Method Tables in C 1 struct VTableC1 { <Object methods>; int()* M1_C1; }; 2 struct VTableC2 { <Object methods>; int()* M1_C1; int()* M2_C2; }; 3 struct VTableC3 { <Object methods>; int()* M1_C3; int()* M2_C2; int()* M3_C3; }; Object Records for Example Object Records in C 1 struct ObjObject { void* classptr; /* no fields */ }; 2 struct ObjC1 { struct ObjObject p1; int x1; int y1; }; 3 struct ObjC2 { struct ObjC1 p2; int x2; }; 4 struct ObjC3 { struct ObjC2 p3; int x3; }; Compare layouts of these object records: ObjObject: { classptr } ObjC1: { classptr; x1; y1 } ObjC2: { classptr; x1; y1; x2 } ObjC3: { classptr; x1; y1; x2; x3 } Topic 8: Intermediate Code Generation p.27/32

Single-inheritance: Example (continued) Code Sequence for Field Access (* r2: C2 <- new C3; r3: C3 <- new C3*) x: Int <- r2.x1 + r3.x1; Code Sequence for Method Dispatch (* r3: C1 <- new C3 *) x: Int <- r3.m1() Topic 8: Intermediate Code Generation p.28/32

Runtime Safety Checks Fundamental cost for safe languages: Java, ML, Modula, Ada,... Loads and Stores Initialize all pointer variables (including fields) to NULL Check (p!= 0) before every load/store using p optimize for locals! Downcasts Record class identifier in class object Before downcast C q q = (C q ) p: Check O p C q Array References Empirical evidence: These are by far the most expensive run-time checks Record size information just before array in memory Before array reference A[expr 0,..., expr n 1 ]: Check (lb i expr i ), (expr i ub i ), 0 i n 1 optimize! Topic 8: Intermediate Code Generation p.29/32

Separation of Concerns: Principles Fundamental Principles Read: The PL.8 Compiler, Auslander and Hopkins, CC82. Each compiler pass should address one goal and leave other concerns to other passes. Optimization passes should use a common, standardized IR. All code (user or compiler-generated) optimized uniformly Key Assumptions register allocator does a great job optimization phase does a great job simplifies optimizations simplifies translation little or no special case analysis global data-flow analysis is worthwhile Topic 8: Intermediate Code Generation p.30/32

Separation of Concerns: Optimizations and Examples Optimization Passes in PL.8 Compiler Dead Code Elimination (DCE) Constant Propagation (CONST) Strength reduction Reassociation Common Subexpression Elimination (CSE) Global Value Numbering (GVN) Loop Invariant Code Motion (LICM) Dead Store Elimination (DSE) Control flow simplification (Straightening) Trap Elimination Peephole optimizations Separation of Concerns: Examples ICG ignores common opts: DCE, CSE, LICM, straightening, peephole CSE and LICM ignore register allocation Instruction scheduling ignores register allocation Topic 8: Intermediate Code Generation p.31/32

Separation of Concerns: Tradeoffs Advantages Simple ICG: bottom-up, context-independent Opts. can ignore register constraints Each pass can be simpler = more reliable, perhaps faster Each optimization pass can be run multiple times. Sequences of passes can be run in different orders. Each pass gets used nearly every time = more reliable User-written and compiler-generated code optimized uniformly Disadvantages Requires robust optimization algorithms Requires strong register allocation Compilation time? Topic 8: Intermediate Code Generation p.32/32