(just another operator) Ambiguous scalars or aggregates go into memory

Transcription:

Handling Assignment (just another operator)

  lhs ← rhs

Strategy
  Evaluate rhs to a value (an rvalue)
  Evaluate lhs to a location (an lvalue)
    > lvalue is a register ⇒ move rhs
    > lvalue is an address ⇒ store rhs
  If rvalue & lvalue have different types
    > Evaluate rvalue to its natural type
    > Convert that value to the type of *lvalue

Unambiguous scalars go into registers
Ambiguous scalars or aggregates go into memory (let hardware sort out the addresses!)

CMSC430 Spring 2007

Handling Assignment

What if the compiler cannot determine the type of the rhs?
  This is a property of the language & the specific program
  If type-safety is desired, compiler must insert run-time checks
  Add a tag field to the data items to hold type information

Code for assignment becomes more complex

  evaluate rhs
  if type(lhs) ≠ rhs.tag
    then convert rhs to type(lhs) or signal a run-time error
  lhs ← rhs
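The tagged assignment sequence above can be sketched in Python. This is only an illustration under stated assumptions: the `Tagged` record, the `CONVERSIONS` table, and the `assign` helper are hypothetical names that do not come from the slides, and a real compiler would emit the check as machine code rather than call a function.

```python
from dataclasses import dataclass


@dataclass
class Tagged:
    """A run-time value that carries its own type tag (hypothetical layout)."""
    tag: type
    value: object


# Hypothetical conversion table: (tag of rhs, type of lhs) -> converter.
CONVERSIONS = {
    (int, float): float,
    (float, int): int,  # truncates toward zero
}


def assign(store, name, lhs_type, rhs):
    """Perform lhs <- rhs, checking rhs.tag against type(lhs) at run time."""
    if rhs.tag is not lhs_type:
        convert = CONVERSIONS.get((rhs.tag, lhs_type))
        if convert is None:
            # No legal conversion: signal a run-time error.
            raise TypeError(
                f"cannot assign {rhs.tag.__name__} to {lhs_type.__name__}"
            )
        rhs = Tagged(lhs_type, convert(rhs.value))
    store[name] = rhs


store = {}
assign(store, "x", float, Tagged(int, 3))  # int rhs converted to float lhs
```

When the compiler can prove `rhs.tag` equals the type of the left-hand side, both the `if` and the tag itself become unnecessary, which is the point of the compile-time checking described next.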

Handling Assignment

Compile-time type-checking
  Goal is to eliminate both the check & the tag
  Determine, at compile time, the type of each subexpression
  Use compile-time types to determine whether a check is needed

Optimization strategy
  If compiler knows the type, move the check to compile-time
  If tags are not needed for garbage collection, eliminate the tags
  If check is unavoidable, try to overlap it with other computation

Can design the language so all checks are static

Handling Assignment (with reference counting)

The problem with reference counting
  Must adjust the count on each pointer assignment
  Overhead is significant, relative to assignment

Code for assignment becomes

  evaluate rhs
  lhs→count ← lhs→count - 1
  lhs ← addr(rhs)
  rhs→count ← rhs→count + 1

Plus a check for zero at the end

This adds 1 +, 1 -, 2 loads, & 2 stores

With extra functional units & large caches, this may become free
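A minimal sketch of the reference-counting sequence, assuming a hypothetical `Cell` heap object and a `freed` list that stands in for the allocator. One deliberate deviation from the slide's order: the sketch increments the new target before decrementing the old one, so that a self-assignment cannot drop the count to zero prematurely.

```python
class Cell:
    """Heap object with a reference count (hypothetical layout)."""

    def __init__(self, payload):
        self.payload = payload
        self.count = 0


freed = []  # payloads of reclaimed cells, recorded for illustration


def assign_pointer(env, lhs, rhs_cell):
    """lhs <- addr(rhs), adjusting both reference counts."""
    old = env.get(lhs)
    rhs_cell.count += 1          # rhs->count <- rhs->count + 1
    if old is not None:
        old.count -= 1           # lhs->count <- lhs->count - 1
        if old.count == 0:       # the check for zero
            freed.append(old.payload)
    env[lhs] = rhs_cell          # lhs <- addr(rhs)


env = {}
a, b = Cell("a"), Cell("b")
assign_pointer(env, "p", a)
assign_pointer(env, "q", a)
assign_pointer(env, "p", b)
assign_pointer(env, "q", b)      # last reference to a disappears here
```

Each call performs one increment, one decrement, and the loads and stores of both counts, which matches the overhead the slide tallies.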

How does the compiler handle A[i][j]?

First, must agree on a storage scheme

Row-major order (most languages)
  Lay out as a sequence of consecutive rows
  Rightmost subscript varies fastest
  A[1,1], A[1,2], A[1,3], A[2,1], A[2,2], A[2,3]

Column-major order (Fortran)
  Lay out as a sequence of columns
  Leftmost subscript varies fastest
  A[1,1], A[2,1], A[1,2], A[2,2], A[1,3], A[2,3]

Indirection vectors (Java)
  Vector of pointers to pointers to ... to values
  Takes much more space, trades indirection for arithmetic
  Not amenable to analysis

Laying Out Arrays

The Concept
  A    1,1  1,2  1,3  1,4
       2,1  2,2  2,3  2,4

Row-major order
  A    1,1  1,2  1,3  1,4  2,1  2,2  2,3  2,4

Column-major order
  A    1,1  2,1  1,2  2,2  1,3  2,3  1,4  2,4

Indirection vectors
  A →  [ row 1 ] →  1,1  1,2  1,3  1,4
       [ row 2 ] →  2,1  2,2  2,3  2,4

These have distinct & different cache behavior
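The three storage schemes can be compared by listing the order in which elements would sit in memory. The sketch below is illustrative only; the function names are assumptions, and subscripts start at 1 to match the slides.

```python
def row_major(rows, cols):
    """Element order when the rightmost subscript varies fastest."""
    return [(i, j) for i in range(1, rows + 1) for j in range(1, cols + 1)]


def column_major(rows, cols):
    """Element order when the leftmost subscript varies fastest."""
    return [(i, j) for j in range(1, cols + 1) for i in range(1, rows + 1)]


def indirection_vectors(rows, cols):
    """A vector of separately stored rows; each row is its own 1-d array."""
    return [[(i, j) for j in range(1, cols + 1)] for i in range(1, rows + 1)]


print(row_major(2, 3))
print(column_major(2, 3))
print(indirection_vectors(2, 3))
```

For a 2 x 3 array the first two lists reproduce the element sequences given on the slide for row-major and column-major order.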

Computing an Array Address

A[i]
  @A + (i - low) x sizeof(A[1])
  In general: base(A) + (i - low) x sizeof(A[1])
  sizeof(A[1]) is almost always a power of 2, known at compile-time ⇒ use a shift for speed
  int A[1:10] ⇒ low is 1
  Make low 0 for faster access (saves a -)

What about A[i1][i2]?
  This stuff looks expensive! Lots of implicit +, -, x ops

Row-major order, two dimensions
  @A + ((i1 - low1) x (high2 - low2 + 1) + i2 - low2) x sizeof(A[1])

Column-major order, two dimensions
  @A + ((i2 - low2) x (high1 - low1 + 1) + i1 - low1) x sizeof(A[1])

Indirection vectors, two dimensions
  *(A[i1])[i2], where A[i1] is, itself, a 1-d array reference

Optimizing Address Calculation for A[i][j]

In row-major order, with w = sizeof(A[1][1])
  @A + (i - low1) x (high2 - low2 + 1) x w + (j - low2) x w

Which can be factored into
  @A + i x (high2 - low2 + 1) x w + j x w
     - (low1 x (high2 - low2 + 1) x w + low2 x w)

If low_i, high_i, and w are known, the last term is a constant

Define @A0 as
  @A - (low1 x (high2 - low2 + 1) x w + low2 x w)

And len2 as (high2 - low2 + 1)

Then, the address expression becomes
  @A0 + (i x len2 + j) x w

where @A0, len2, and w are compile-time constants
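The factoring into @A0 and len2 can be checked numerically. The following sketch assumes row-major order and integer byte addresses; `addr_direct` and `make_addr_fn` are hypothetical helper names, with `make_addr_fn` standing in for the work a compiler would do once at compile time.

```python
def addr_direct(base, i, j, low1, low2, high2, w):
    """Unoptimized row-major address polynomial for A[i][j]."""
    return base + ((i - low1) * (high2 - low2 + 1) + (j - low2)) * w


def make_addr_fn(base, low1, low2, high2, w):
    """Fold the bounds into constants, as a compiler would at compile time."""
    len2 = high2 - low2 + 1
    a0 = base - (low1 * len2 + low2) * w   # the "false zero" @A0

    def addr(i, j):
        # Only one multiply by len2, one add, and one scale by w remain.
        return a0 + (i * len2 + j) * w

    return addr


# A[1:2][1:4] of 4-byte elements stored at address 1000 (made-up numbers).
addr = make_addr_fn(base=1000, low1=1, low2=1, high2=4, w=4)
for i in (1, 2):
    for j in (1, 2, 3, 4):
        assert addr(i, j) == addr_direct(1000, i, j, 1, 1, 4, 4)
```

Because w is a power of two here, the final multiply could also be emitted as a left shift by 2.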

Array References

What about arrays as actual parameters?

Whole arrays, as call-by-reference parameters
  Need dimension information ⇒ build a dope vector
  Store the values in the calling sequence
  Pass the address of the dope vector in the parameter slot
  Generate complete address polynomial at each reference

  Dope vector:  @A | low1 | high1 | low2 | high2

Some improvement is possible
  Save len_i and low_i rather than low_i and high_i
  Pre-compute the fixed terms in prolog sequence (a win if used)

What about call-by-value?
  Most c-b-v languages pass arrays by reference
  This is a language design issue

Array References

What about A[12] as an actual parameter?

If corresponding parameter is a scalar, it's easy
  Pass the address or value, as needed
  Must know about both formal & actual parameter
  Language definition must force this interpretation

What if corresponding parameter is an array?
  Must know about both formal & actual parameter
  Meaning must be well-defined and understood
  Cross-procedural checking of conformability

Again, we're treading on language design issues
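A dope vector for a two-dimensional array parameter might look like the sketch below. It follows the suggested improvement of storing low_i and len_i instead of low_i and high_i. The `DopeVector` class, its `elem_size` field, and the bounds check are assumptions added for illustration; the slides list only @A and the bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DopeVector:
    """Run-time descriptor passed in place of the array (hypothetical layout)."""
    base: int       # @A
    low1: int
    len1: int       # high1 - low1 + 1
    low2: int
    len2: int       # high2 - low2 + 1
    elem_size: int  # assumption: element size is carried in the descriptor


def element_address(dv, i, j):
    """Row-major address polynomial evaluated from the dope vector."""
    in_rows = dv.low1 <= i < dv.low1 + dv.len1
    in_cols = dv.low2 <= j < dv.low2 + dv.len2
    if not (in_rows and in_cols):
        raise IndexError(f"A[{i}][{j}] is out of bounds")
    return dv.base + ((i - dv.low1) * dv.len2 + (j - dv.low2)) * dv.elem_size


# Caller builds the descriptor for A[1:2][1:4]; the callee sees only dv.
dv = DopeVector(base=1000, low1=1, len1=2, low2=1, len2=4, elem_size=4)
print(element_address(dv, 2, 4))
```

Since the descriptor does not change during an activation, the callee could compute the fixed part of the polynomial once in its prolog.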

Array References

What about variable-sized arrays?

Local arrays dimensioned by actual parameters
  Same set of problems as parameter arrays
  Requires dope vectors (or equivalent)
    > dope vector at fixed offset in activation record
  ⇒ Different access costs for textually similar references

This presents a lot of opportunity for a good optimizer
  Common subexpressions in the address polynomial
  Contents of dope vector are fixed during each activation
  Should be able to recover much of the lost ground
  ⇒ Handle them like parameter arrays

Example: Array Address Calculations in a Loop

  DO J = 1, N
    A[I,J] = A[I,J] + B[I,J]
  END DO

Naïve: Perform the address calculation twice

  DO J = 1, N
    R1 = @A0 + (J x len1 + I) x floatsize
    R2 = @B0 + (J x len1 + I) x floatsize
    MEM(R1) = MEM(R1) + MEM(R2)
  END DO

Example: Array Address Calculations in a Loop

  DO J = 1, N
    A[I,J] = A[I,J] + B[I,J]
  END DO

Sophisticated: Move common calculations out of loop

  R1 = I x floatsize
  c = len1 x floatsize      ! Compile-time constant
  R2 = @A0 + R1
  R3 = @B0 + R1
  DO J = 1, N
    a = J x c
    R4 = R2 + a
    R5 = R3 + a
    MEM(R4) = MEM(R4) + MEM(R5)
  END DO

Example: Array Address Calculations in a Loop

  DO J = 1, N
    A[I,J] = A[I,J] + B[I,J]
  END DO

Very sophisticated: Convert multiply to add (Strength Reduction)

  R1 = I x floatsize
  c = len1 x floatsize      ! Compile-time constant
  R2 = @A0 + R1 ; R3 = @B0 + R1
  DO J = 1, N
    R2 = R2 + c
    R3 = R3 + c
    MEM(R2) = MEM(R2) + MEM(R3)
  END DO
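The naïve, hoisted, and strength-reduced versions of the loop should touch exactly the same addresses. The sketch below transcribes each version into Python and compares the address pairs; the function names and the sample numbers are assumptions made for the check.

```python
def naive(a0, b0, i, n, len1, floatsize):
    """Full address polynomial evaluated twice per iteration."""
    pairs = []
    for j in range(1, n + 1):
        r1 = a0 + (j * len1 + i) * floatsize
        r2 = b0 + (j * len1 + i) * floatsize
        pairs.append((r1, r2))
    return pairs


def hoisted(a0, b0, i, n, len1, floatsize):
    """Loop-invariant terms computed once; one multiply left in the loop."""
    r1 = i * floatsize
    c = len1 * floatsize
    r2, r3 = a0 + r1, b0 + r1
    pairs = []
    for j in range(1, n + 1):
        a = j * c
        pairs.append((r2 + a, r3 + a))
    return pairs


def strength_reduced(a0, b0, i, n, len1, floatsize):
    """The multiply by J is replaced with a running add of c."""
    r1 = i * floatsize
    c = len1 * floatsize
    r2, r3 = a0 + r1, b0 + r1
    pairs = []
    for _ in range(n):
        r2 += c
        r3 += c
        pairs.append((r2, r3))
    return pairs


args = dict(a0=1000, b0=5000, i=3, n=6, len1=10, floatsize=4)
assert naive(**args) == hoisted(**args) == strength_reduced(**args)
```

The final version keeps only two adds per iteration for addressing, at the cost of losing the direct relationship between J and the address registers.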

Boolean & Relational Values

How should the compiler represent them?
  Answer depends on the target machine

Two classic approaches
  Numerical representation
  Positional (implicit) representation

Correct choice depends on both context and ISA

Boolean & Relational Values

Numerical representation
  Assign values to TRUE and FALSE
  Use hardware AND, OR, and NOT operations
  Use comparison to get a boolean from a relational expression

Examples

  x < y  becomes

      cmp_lt rx, ry ⇒ r1

  if (x < y) then stmt1 else stmt2  becomes

      cmp_lt rx, ry ⇒ r1
      cbr    r1 → _stmt1, _stmt2

Boolean & Relational Values

What if the ISA uses a condition code? (KDC: a seductive, evil idea)
  Must use a conditional branch to interpret result of compare
  Necessitates branches in the evaluation

Example: x < y becomes

        comp   rx, ry ⇒ cc1
        cbr_lt cc1 → LT, LF
  LT:   loadi  1 ⇒ r2
        br     → LE
  LF:   loadi  0 ⇒ r2
  LE:   ...other stmts...

This positional representation is much more complex

Condition codes
  are an architect's hack
  allow ISA to avoid some comparisons
  complicate code for simple cases

Boolean & Relational Values

The last example actually encodes result in the PC
If result is used to control an operation, this may be enough

Example
  if (x < y)
    then a ← c + d
    else a ← e + f

VARIATIONS ON THE ILOC BRANCH STRUCTURE

  Straight Condition Codes            Boolean Compares

        comp   rx, ry ⇒ cc1                 cmp_lt rx, ry ⇒ r1
        cbr_lt cc1 → L1, L2                 cbr    r1 → L1, L2
  L1:   add    rc, rd ⇒ ra            L1:   add    rc, rd ⇒ ra
        br     → LOUT                       br     → LOUT
  L2:   add    re, rf ⇒ ra            L2:   add    re, rf ⇒ ra
        br     → LOUT                       br     → LOUT
  LOUT: nop                           LOUT: nop

Condition code version does not directly produce (x < y)
Boolean version does
Still, there is no significant difference in the code produced

Boolean & Relational Values

Conditional move & predication both simplify this code

Example
  if (x < y)
    then a ← c + d
    else a ← e + f

OTHER ARCHITECTURAL VARIATIONS

  Conditional Move                    Predicated Execution

        comp  rx, ry ⇒ cc1                  cmp_lt rx, ry ⇒ r1
        add   rc, rd ⇒ r1             (r1)?  add   rc, rd ⇒ ra
        add   re, rf ⇒ r2             (¬r1)? add   re, rf ⇒ ra
        i2i_< cc1, r1, r2 ⇒ ra

Both versions avoid the branches
Both are shorter than CCs or Boolean-valued compare
Are they better?

Boolean & Relational Values

Consider the assignment x ← a < b ∧ c < d

VARIATIONS ON THE ILOC BRANCH STRUCTURE

  Straight Condition Codes            Boolean Compare

        comp   ra, rb ⇒ cc1                 cmp_lt ra, rb ⇒ r1
        cbr_lt cc1 → L1, L2                 cmp_lt rc, rd ⇒ r2
  L1:   comp   rc, rd ⇒ cc2                 and    r1, r2 ⇒ rx
        cbr_lt cc2 → L3, L2
  L2:   loadi  0 ⇒ rx
        br     → LOUT
  L3:   loadi  1 ⇒ rx
        br     → LOUT
  LOUT: nop

Here, the boolean compare produces much better code

Boolean & Relational Values

Conditional move & predication help here, too

OTHER ARCHITECTURAL VARIATIONS

  Conditional Move                    Predicated Execution

        comp  ra, rb ⇒ cc1                  cmp_lt ra, rb ⇒ r1
        i2i_< cc1, rT, rF ⇒ r1              cmp_lt rc, rd ⇒ r2
        comp  rc, rd ⇒ cc2                  and    r1, r2 ⇒ rx
        i2i_< cc2, rT, rF ⇒ r2
        and   r1, r2 ⇒ rx

Conditional move is worse than Boolean compares
Predication is identical to Boolean compares

Context & hardware determine the appropriate choice
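The two code shapes for x ← a < b ∧ c < d can be mirrored in Python to show that they compute the same value by different means. This is a sketch only; the function names are assumptions, and the comments map each line to the corresponding ILOC operation from the slides.

```python
from itertools import product


def numerical(a, b, c, d):
    """Boolean-compare shape: the result lives in a register as 0 or 1."""
    r1 = int(a < b)   # cmp_lt ra, rb => r1
    r2 = int(c < d)   # cmp_lt rc, rd => r2
    return r1 & r2    # and    r1, r2 => rx


def positional(a, b, c, d):
    """Condition-code shape: the result is encoded by where control ends up."""
    if a < b:         # cbr_lt cc1 -> L1, L2
        if c < d:     # L1: cbr_lt cc2 -> L3, L2
            return 1  # L3: loadi 1 => rx
    return 0          # L2: loadi 0 => rx


# Exhaustive check over a small range of operand values.
for a, b, c, d in product(range(3), repeat=4):
    assert numerical(a, b, c, d) == positional(a, b, c, d)
```

The positional version skips the second comparison when the first fails, while the numerical version always executes three straight-line operations with no branches.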

Control Flow

If-then-else
  Follow model for evaluating relationals & booleans with branches

Loops
  Evaluate condition before loop (if needed)
  Evaluate condition after loop
  Branch back to the top (if needed)
  Merges test with last block of loop body

  Pre-test → Loop body → Post-test → Next block

While, for, do, & until all fit this basic model

Loop Implementation Code

  for (i = 1; i < 100; i++) { body }

        loadi  1 ⇒ r1
        loadi  1 ⇒ r2
        loadi  100 ⇒ r3
        cmp_ge r1, r3 ⇒ r4
        cbr    r4 → L2, L1
  L1:   body
        add    r1, r2 ⇒ r1
        cmp_lt r1, r3 ⇒ r5
        cbr    r5 → L1, L2
  L2:   next statement
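The pre-test / body / post-test shape can be written out explicitly in Python for the loop `for (i = 1; i < 100; i++)`. The helper name `run_for_loop` and its parameters are assumptions for this sketch; the structure is what matters.

```python
def run_for_loop(start, limit, step, body):
    """Execute body(i) for i = start; i < limit; i += step, in compiled shape."""
    i = start
    if i < limit:                # pre-test: guards against a zero-trip loop
        while True:
            body(i)              # L1: loop body
            i += step            # increment
            if not (i < limit):  # post-test, merged with the end of the body
                break            # fall through to L2
    return i                     # value of i at "next statement"


seen = []
final_i = run_for_loop(1, 100, 1, seen.append)
print(len(seen), final_i)
```

With this shape each iteration executes a single conditional branch at the bottom, rather than a test at the top plus an unconditional jump back.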

Control Flow

Case Statements
  1 Evaluate the controlling expression
  2 Branch to the selected case
  3 Execute the code for that case
  4 Branch to the statement after the case

Parts 1, 3, & 4 are well understood, part 2 is the key

Strategies
  Linear search (nested if-then-else constructs)
    (Surprisingly many compilers do this for all cases!)
  Build a table of case expressions & binary search it
  Directly compute an address (requires dense case set)

Activation record structure

Registers
  R9      Link register
  R8      Activation Record ptr
  R7      Temporary Act Rec ptr
  R6      End of stack
  R2-R5   Four arithmetic Regs
  R0-R1   Mult-Div registers

Registers 2-5 used for + and -

Multiply code pattern:
  L   R0, arg1
  M   R0, arg2
  LI  Ri, (R1)0      where Ri = allocated reg.

Assume a procedure at level 3 calls a function with 2 arguments at static level 3: X(A,B)

Activation record layout
  R8 →  Dynamic Link
        Return Address
        End of Stack
        Display[1]
        Display[2]
        Variable 1
        Variable 2
        Variables
  R6 →  Argument 1
        Argument 2
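The three case-statement strategies listed under "Control Flow" can be sketched as dispatch functions that map a case value to a target label. All names here are hypothetical, and labels are plain strings standing in for code addresses.

```python
from bisect import bisect_left


def dispatch_linear(cases, default, value):
    """Linear search: equivalent to nested if-then-else tests."""
    for label, target in cases:
        if value == label:
            return target
    return default


def make_binary_dispatch(cases, default):
    """Sorted table of case labels searched with binary search."""
    ordered = sorted(cases, key=lambda case: case[0])
    labels = [label for label, _ in ordered]

    def dispatch(value):
        k = bisect_left(labels, value)
        if k < len(labels) and labels[k] == value:
            return ordered[k][1]
        return default

    return dispatch


def make_jump_table(cases, default):
    """Direct address computation: index a dense table by (value - low)."""
    low = min(label for label, _ in cases)
    high = max(label for label, _ in cases)
    table = [default] * (high - low + 1)   # holes in the range go to default
    for label, target in cases:
        table[label - low] = target

    def dispatch(value):
        if low <= value <= high:           # range check before indexing
            return table[value - low]
        return default

    return dispatch


cases = [(3, "L3"), (1, "L1"), (2, "L2"), (5, "L5")]
binary = make_binary_dispatch(cases, "LD")
table = make_jump_table(cases, "LD")
for v in range(0, 7):
    assert dispatch_linear(cases, "LD", v) == binary(v) == table(v)
```

The jump table costs one range check and one indexed load regardless of the number of cases, but its size grows with the span of the labels, which is why it needs a dense case set.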

Activation record structure

Function call code: X(A,B)

Code:
  LI    R0, A          Get first arg address
  ST    R0, (R6)0      Save address in stack
  LI    R0, B          Get second arg address
  ST    R0, (R6)1      Save address in stack
  JSRI  R9, X          Goto Function, R9 = retn addr
  ST    R0, X          Save function value in Stack

Activation record layout
  R8 →  Dynamic Link
        Return Address
        End of Stack
        Display[1]
        Display[2]
        Variable 1
        Variable 2
        Variables
  R6 →  A
        B

Activation record structure

Function X(integer: I, J)

Prolog Code:
  ST  R8, (R6)2        New Dyn Link
  ST  R9, (R6)3        New Retn Addr
  L   R0, (R6)0        Get arg 1
  ST  R0, (R6)I+2      Store in I in new AR
  L   R0, (R6)1        Get arg 2
  ST  R0, (R6)J+2      Store in J in new AR
  AI  R6, 2            New AR start
  L   R0, (R8)3        Calling Disp[1]
  ST  R0, (R6)3        New Disp[1]
  L   R0, (R8)4        Calling Disp[2]
  ST  R0, (R6)4        New Disp[2]
  ST  R6, (R6)5        New Disp[3]
  LI  R8, (R6)0        Set new Act Rec Ptr
  AI  R6, sizeof(X)    Set new end stack
  ST  R6, (R8)2        Save end stack

Figure: the caller's record (R8: Dynamic Link, Return Address, End of Stack, Display[1], Display[2], Variable 1, Variable 2, Variables; R6: A, B) followed by the new record for X (R8, R6: X Dynamic Link, X Return Address, X End of Stack, X Display[1], X Display[2], X Display[3])

Activation record structure

Function X(A,B)

Epilog Code:
  L   R0, fcn-value    Set value of X
  L   R9, (R8)1        Return Address
  L   R8, (R8)0        Load Dyn Link
  L   R6, (R8)2        Reset end of stack
  JI  0, (R9)0         Return

Figure: the caller's record (R8: Dynamic Link, Return Address, End of Stack, Display[1], Display[2], Variable 1, Variable 2, Variables; R6: A, B) followed by the record for X (R8: X Dynamic Link, X Return Address, X End of Stack, X Display[1], X Display[2], X Display[3])