Generating Code for Assignment Statements (back to work). Comp 412, Fall 2017. Chapters 4, 6 & 7 in EaC2e.


COMP 412, Fall 2017
Generating Code for Assignment Statements (back to work)

(Compiler pipeline: source code → Front End → IR → Optimizer → IR → Back End → target code)

Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Chapters 4, 6 & 7 in EaC2e

Review (from Lec. 18): Example, Building an Abstract Syntax Tree

  1   Goal   → Expr            $$ = $1;
  2   Expr   → Expr + Term     $$ = MakeAddNode($1,$3);
  3          | Expr - Term     $$ = MakeSubNode($1,$3);
  4          | Term            $$ = $1;
  5   Term   → Term * Factor   $$ = MakeMulNode($1,$3);
  6          | Term / Factor   $$ = MakeDivNode($1,$3);
  7          | Factor          $$ = $1;
  8   Factor → ( Expr )        $$ = $2;
  9          | number          $$ = MakeNumNode(token);
  10         | ident           $$ = MakeIdNode(token);

Assumptions: constructors exist for each node type, and the stack holds pointers to nodes. We call $$ = $1; a copy rule.

Review (from Lec. 18): Example, Building an Abstract Syntax Tree

  Stack                    Input            Action
  $                        id - num * id    shift
  $ id                     - num * id       reduce 10
  $ Factor                 - num * id       reduce 7
  $ Term                   - num * id       reduce 4
  $ Expr                   - num * id       shift
  $ Expr -                 num * id         shift
  $ Expr - num             * id             reduce 9
  $ Expr - Factor          * id             reduce 7
  $ Expr - Term            * id             shift
  $ Expr - Term *          id               shift
  $ Expr - Term * id                        reduce 10
  $ Expr - Term * Factor                    reduce 5
  $ Expr - Term                             reduce 3
  $ Expr                                    accept

Ad hoc SDT (AHSDT) works! It built the AST. Some reduce actions just copied values; others called constructors. Same tree as the earlier slide: the AST for x - 2 * y, with - at the root, id x on the left, and * of num 2 and id y on the right.

Review (from Lec. 18): Example, Emitting ILOC

  1   Goal   → Expr
  2   Expr   → Expr + Term     $$ = NextRegister(); Emit(add, $1, $3, $$);
  3          | Expr - Term     $$ = NextRegister(); Emit(sub, $1, $3, $$);
  4          | Term            $$ = $1;
  5   Term   → Term * Factor   $$ = NextRegister(); Emit(mult, $1, $3, $$);
  6          | Term / Factor   $$ = NextRegister(); Emit(div, $1, $3, $$);
  7          | Factor          $$ = $1;
  8   Factor → ( Expr )        $$ = $2;
  9          | number          $$ = NextRegister(); Emit(loadI, Value(lexeme), $$);
  10         | ident           $$ = NextRegister(); EmitLoad(ident, $$);

Assumptions:
- NextRegister() returns a virtual register name
- Emit() can format assembly code
- EmitLoad() handles addressability & gets a value into a register
Copy rules appear on the same productions as in the last example.

Review (from Lec. 18): Example, Emitting ILOC

  Stack                    Input            Action
  $                        id - num * id    shift
  $ id                     - num * id       reduce 10
  $ Factor                 - num * id       reduce 7
  $ Term                   - num * id       reduce 4
  $ Expr                   - num * id       shift
  $ Expr -                 num * id         shift
  $ Expr - num             * id             reduce 9
  $ Expr - Factor          * id             reduce 7
  $ Expr - Term            * id             shift
  $ Expr - Term *          id               shift
  $ Expr - Term * id                        reduce 10
  $ Expr - Term * Factor                    reduce 5
  $ Expr - Term                             reduce 3
  $ Expr                                    accept

Emitting ILOC produces simple, but clean, code. EmitLoad() hides all of the messy details of naming, addressability, and address modes.

  loadAI pr0, @x  => vr1
  loadI  2        => vr2
  loadAI pr0, @y  => vr3
  mult   vr2, vr3 => vr4
  sub    vr1, vr4 => vr5

What's Left in Expressions and Assignments?
Assume that we all understand symbol tables & storage mapping.
We hid a lot of detail in EmitLoad():
- How does all of that work?
What about type information?
- Every variable & value has a type
- Operators have a range of types over which they are defined
What about more operators? Assignment? Arrays, strings, structures?

If the language does not have a "declarations before executables" rule, then the compiler must either: (1) derive its knowledge of variables, types, and locations incrementally; or (2) make two passes over the code (e.g., build an AST and re-traverse it). To accomplish (1), the compiler must use a higher level of abstraction in the initial IR to avoid the need for specific addresses (e.g., represent variables with names and expand address computations at a later stage in compilation).

Handling Identifiers
Identifiers have some subtlety, which we hid in EmitLoad().
The compiler needs to decide which values can legally live in registers.
- A value is ambiguous if the compiler can access it via more than one name
  - Some pointer-based values
  - Some call-by-reference parameters
  - Most compilers treat array elements (e.g., A[i,j]) as ambiguous
- Ambiguous values must live in memory (load or store at each access)
- Any unambiguous value is a candidate for a register
(See pages 341-342 in EaC2e for discussion of ambiguity.)
The compiler needs to assign offsets to each variable kept in RAM.
- Order them by alignment constraints and assign the most constrained objects first (e.g., double-word, then word, then half-word, then byte)
- This reduces the padding needed to maintain alignment
(See pages 339-340 in EaC2e for discussion of alignment within a data area.)

Handling Identifiers
Identifiers have some subtlety, which we hid in EmitLoad().
For a scalar identifier x at some point p, the compiler needs to know:
- Which declaration of x pertains at p? And what properties does it declare?
- The various symbol tables resolve this issue
After storage assignment, x has either:
- A virtual register number, or
- A base + offset pair to compute an address

  static int x;
  int foo( int y ) {
    int x;
    x = y * 9;
    if (x > 1017)
      x = foo(x) + 17;
    return x;
  }
  x = foo(12);

EmitLoad() can either return the VR, or emit code to compute the address into a register, perform the load, and return the resulting VR. For aggregates, the address computation is more complex, as we shall see.

Extending the Grammar: Adding Other Operators
+, -, ×, and ÷ are not enough. We need to add operators to the grammar and generate code for them.
Defining the syntax:
- Additional productions
- Additional levels of precedence
- See, for example, Fig. 7.7 on p. 351 of EaC2e
Code generation follows the scheme for existing operators:
- Evaluate the operands, then perform the operation
- Complex operations (sin, log10, 2^x) might turn into library calls
- Handle assignment as an operator

Boolean & Relational Expressions (from Figure 7.7 on p. 351 in EaC2e)
For example, booleans, relationals, and unary operators:

  Boolean → Boolean ∨ AndTerm | AndTerm
  AndTerm → AndTerm ∧ RelExpr | RelExpr
  RelExpr → RelExpr < Expr | RelExpr ≤ Expr | RelExpr = Expr
          | RelExpr ≠ Expr | RelExpr ≥ Expr | RelExpr > Expr | Expr
  Expr    → Expr + Term | Expr - Term | Term
  Term    → Term × Value | Term ÷ Value | Value
  Value   → ¬ Factor | Factor
  Factor  → ( Expr ) | number | Reference

This grammar allows w < x < y < z with left associativity.
Reference derives a name, a subscripted name, a structure reference, a string reference, a function call, ...

Mixed-Type Expressions
So far, expressions have had one consistent type. What if the operands to an operation have different types?
Key observation: the language must define the behavior.
- It might simply be an error, detected during semantic elaboration; require the programmer to fix it (insert a cast?)
- It might require the compiler to insert an implicit conversion
Implicit conversions:
- The compiler uses a conversion table to determine the type of the result
- Convert the arguments to the result type & perform the operation
- Most languages have symmetric & rational conversion tables

Typical conversion table for addition (note the symmetry):

  +       | Integer | Real    | Double  | Complex
  --------+---------+---------+---------+--------
  Integer | Integer | Real    | Double  | Complex
  Real    | Real    | Real    | Double  | Complex
  Double  | Double  | Double  | Double  | Complex
  Complex | Complex | Complex | Complex | Complex

Assignment as an Operator: lhs ← rhs
Strategy:
- Evaluate rhs to a value (an rvalue)
- Evaluate lhs to a location (an lvalue)
  - lvalue is a register ⇒ move rhs into the register
  - lvalue is an address ⇒ store rhs into the memory location
- If the rvalue & lvalue have different types:
  - Evaluate the rvalue to its natural type
  - Convert that value to the type of *lvalue
Unambiguous scalars go into registers; ambiguous scalars or aggregates go into memory. Keeping ambiguous values in memory lets the hardware sort out the addresses.

Handling Assignment
What if the compiler cannot determine the type of the rhs?
- The issue is a property of the language & the specific program
- For type safety, the compiler must insert a run-time check
- Some languages & implementations ignore safety (a truly bad idea)
Add a tag field to the data items to hold type information, and explicitly check the tags at runtime. Code for assignment becomes more complex:

  evaluate rhs
  if type(lhs) ≠ rhs.tag
    then convert rhs to type(lhs) or throw an exception
  lhs ← rhs

The choice between conversion & a runtime exception depends on details of the language & type system. This is much more complex than static checking, and the costs occur at runtime rather than at compile time.

Handling Assignment
Compile-time type checking:
- The goal is to eliminate the need for both tags & runtime checks
- Determine, at compile time, the type of each subexpression
- Use a runtime check only if the compiler cannot determine types
Optimization strategy:
- If the compiler knows the type, move the check to compile time
- Unless tags are needed for garbage collection, eliminate them
- If a check is needed, try to overlap it with other computation
- Can design the language so all checks are static

Extending the Schemes
More complex cases for IDENTIFIER:
What about references to aggregate values? (array or structure elements)
- Need a layout for the aggregate
- Need a formula to find the specified element
- Will (often) generate runtime arithmetic to compute the element address
What about function calls in expressions? (future lecture)
- Generate the calling sequence & load the return value
- Severely limits the compiler's ability to reorder operations
What about a reference to a value passed in as a parameter?
- Many linkages pass the first several values in registers
- Call-by-value ⇒ just a local variable with a funny offset
- Call-by-reference ⇒ funny offset, extra indirection
Parameters are typically stored at a negative offset from the local data area. Code will have a pointer to that data area.

Expanding the Expression Grammar
The classic, left-recursive, expression grammar:

  1   Goal   → Expr
  2   Expr   → Expr + Term
  3          | Expr - Term
  4          | Term
  5   Term   → Term * Factor
  6          | Term / Factor
  7          | Factor
  8   Factor → ( Expr )
  9          | number
  10         | ident

The grammar is great for x - 2 * y, but not all programs operate over the domain of ident and number. More complex programs require more complex expressions: there is no mention of array references, structure references, or function calls.

Expanding the Expression Grammar
The classic, left-recursive, expression grammar, extended:

  1   Goal   → Expr
  2   Expr   → Expr + Term
  3          | Expr - Term
  4          | Term
  5   Term   → Term * Factor
  6          | Term / Factor
  7          | Factor
  8   Factor → ( Expr )
  9          | number
  10         | ident
  11         | ident [ Exprs ]     (array reference)
  12         | ident ( Exprs )     (function invocation)

References to aggregates & functions: first, the compiler must recognize expressions that refer to arrays, structures, and function calls; second, the compiler must have schemes to generate code to implement those abstractions.

Scheme to Generate Code for A[i,j]?
The compiler must generate the runtime address of the element A[i,j].
The compiler needs to know where A begins.
- A tie between compile-time knowledge and runtime behavior
- We will defer this discussion until we discuss procedure calls; assume that @A is the address of A's first element
The compiler must have a plan for the internal layout of A.
- The programming language usually dictates array element layout
- Three common choices: row-major order, column-major order, and indirection vectors
- Plus a formula for calculating the address of an array element
General scheme: compute the address, then issue a load (rvalue) or store (lvalue).
(Figure: A, the concept, a 2 x 4 array with elements 1,1 ... 1,4 and 2,1 ... 2,4.)

Array Layout: Row-Major Order
Lay out the array as a sequence of consecutive rows; the rightmost subscript varies fastest:
A[1,1], A[1,2], A[1,3], A[2,1], A[2,2], A[2,3]
Storage layout: 1,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4
Stride-one access: successive references in the loop are to successive locations in the level-one (L1) data cache. Stride-one access maximizes spatial reuse & the effectiveness of hardware prefetch units.

  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      A[i][j] = 0;

This is the layout of declared arrays in C (and most languages).

Array Layout: Column-Major Order
Lay out the array as a sequence of columns; the leftmost subscript varies fastest:
A[1,1], A[2,1], A[1,2], A[2,2], A[1,3], A[2,3]
Storage layout: 1,1 2,1 1,2 2,2 1,3 2,3 1,4 2,4
Stride-one access:

  for (j = 0; j < n; j++)
    for (i = 0; i < n; i++)
      A[i][j] = 0;

This is the layout of all arrays in FORTRAN.

Array Layout: Indirection Vectors
A vector of pointers to pointers to ... to values.
- Takes much more space; trades indirection for arithmetic
- Not amenable to analysis
Storage layout: A points to a vector of row pointers; each row 1,1 ... 1,4 and 2,1 ... 2,4 is a separate block.
Stride-one access: no reference pattern guarantees stride-one access in cache, unless the rows are contiguous.
This is the layout of arrays in Java, and of int **array; in C.

Computing an Address for an Array Element (return to high-school algebra)
The general scheme for an array reference: compute an address, then issue a load.
(Color code on the slide distinguishes invariant from varying terms.)

  A[i]: @A + (i - low) × sizeof(A[1])
  In general: base(A) + (i - low) × sizeof(A[1])

@A is the base address of A. Depending on how A is declared, @A may be an offset from the pointer to the procedure's data area (its ARP), an offset from some global label, or an arbitrary address. The first two are compile-time constants.
low: the lower bound on the index. ARP: Activation Record Pointer (see next lecture).

Computing an Address for an Array Element

  A[i]: @A + (i - low) × sizeof(A[1])
  In general: base(A) + (i - low) × sizeof(A[1])

int A[1:10] ⇒ low is 1. Make low be 0 for faster access (saves a subtract).
sizeof(A[1]) is almost always a power of 2, known at compile time ⇒ use a shift for speed.
Note, for the record, that a naïve vector reference turns into a subtract, a shift, and an address-offset load (e.g., loadAO): three operations before optimization. This sequence is amazingly cheap.
low: the lower bound on the index.

Computing an Address for an Array Element
Assume sizeof(A[1]) is 1, for simplicity.

  A[i]: @A + (i - low) × sizeof(elt)
  In general: base(A) + (i - low) × sizeof(A[1])

What about A[i1,i2]? Let:
- low1 = lower bound on the row index (1); high1 = upper bound on the row index (5)
- low2 = lower bound on the column index (1); high2 = upper bound on the column index (4)
Row-major order, two dimensions:

  @A + ((i1 - low1) × (high2 - low2 + 1) + i2 - low2) × sizeof(elt)

Example: A[4,3] in the 5 x 4 array laid out as rows 1,1 ... 5,4:
- (i1 - low1) is 3
- (high2 - low2 + 1) is 4
- 3 × 4 is 12
- (i2 - low2) is 2
Combined, the two terms take us to the start of A[4,3]. The address computation took 3 adds, 3 subtracts, 1 multiply and 1 shift: cheap in comparison to the register-save costs of a function call.

Array Layout: Row-Major Order
Lay out as a sequence of consecutive rows; the rightmost subscript varies fastest:
A[1,1], A[1,2], A[1,3], A[2,1], A[2,2], A[2,3]
Storage layout: 1,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4
Address polynomial for A[i1,i2]:

  @A + ((i1 - low1) × (high2 - low2 + 1) + i2 - low2) × sizeof(elt)

Why does C start array indices at zero? To avoid those subtractions.
This polynomial follows Horner's rule for evaluation.

Array Layout: Column-Major Order
Lay out as a sequence of columns; the leftmost subscript varies fastest:
A[1,1], A[2,1], A[1,2], A[2,2], A[1,3], A[2,3]
Storage layout: 1,1 2,1 1,2 2,2 1,3 2,3 1,4 2,4
Address polynomial for A[i1,i2]:

  @A + ((i2 - low2) × (high1 - low1 + 1) + i1 - low1) × sizeof(elt)

The change in layout order swaps the subscripts.
This polynomial follows Horner's rule for evaluation.

Array Layout: Indirection Vectors
A vector of pointers to pointers to ... to values.
- Takes much more space; trades indirection for arithmetic
- Not amenable to analysis
Address polynomial for A[i1,i2]:

  *(A[i1])[i2], where each of the [i]'s is, itself, a 1-D array reference

Sketch of the code:

  A[i1]:  sub  i1, low  => r1
          add  @A, r1   => r2
          load r2       => r3
  *[i2]:  sub  i2, low  => r4
          add  r3, r4   => r5
          load r5       => r6

Back when systems supported 1- or 2-cycle indirect load operations, this scheme was efficient. It replaces a multiply & an add with an indirect load.

Array Address Calculations
In scientific codes, array address calculations are a major cost.
- Each additional dimension adds more arithmetic
- Efficiency in address calculation is a critical issue for performance
- A[i+1,j], A[i,j], and A[i,j+1] should all have some common terms
- Horner's-rule evaluation hides the common terms
- Improving these calculations is a matter of algebra (including distributivity!)
Improving the efficiency of array address calculations has been a major focus of code optimization (& hardware design) for the last 40 years:
- Generate better code; transform it to fit the context
- Design memory systems that support common array access patterns
- Optimize hardware for stride-one access
- Include sophisticated prefetch engines that recognize access patterns