Compiler Theory. (Intermediate Code Generation Abstract S yntax + 3 Address Code)

Similar documents
G Compiler Construction Lecture 12: Code Generation I. Mohamed Zahran (aka Z)

Intermediate Code Generation

COMPILER CONSTRUCTION (Intermediate Code: three address, control flow)

Chapter 6 Intermediate Code Generation

CSE 504: Compiler Design. Code Generation

CODE GENERATION Monday, May 31, 2010

Code Generation (#')! *+%,-,.)" !"#$%! &$' (#')! 20"3)% +"#3"0- /)$)"0%#"

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

CMPT 379 Compilers. Anoop Sarkar. 11/13/07 1. TAC: Intermediate Representation. Language + Machine Independent TAC

Concepts Introduced in Chapter 6

UNIT IV INTERMEDIATE CODE GENERATION

Principle of Compilers Lecture VIII: Intermediate Code Generation. Alessandro Artale

Intermediate Representa.on

Concepts Introduced in Chapter 6

Compiler Optimization Techniques

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation

Intermediate Representations

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Intermediate Code Generation

Intermediate Code Generation

TACi: Three-Address Code Interpreter (version 1.0)

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona

Principles of Compiler Design

Principles of Compiler Design

UNIT-3. (if we were doing an infix to postfix translator) Figure: conceptual view of syntax directed translation.

General issues. Section 9.1. Compiler Construction: Code Generation p. 1/18

Intermediate Representations

Intermediate representation

Formal Languages and Compilers Lecture X Intermediate Code Generation

intermediate-code Generation

UNIT-V. Symbol Table & Run-Time Environments Symbol Table

NARESHKUMAR.R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY Page 1

Static Checking and Intermediate Code Generation Pat Morin COMP 3002

Dixita Kagathara Page 1

Page No 1 (Please look at the next page )

Figure 28.1 Position of the Code generator

Alternatives for semantic processing

tokens parser 1. postfix notation, 2. graphical representation (such as syntax tree or dag), 3. three address code

CSc 453 Intermediate Code Generation

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Three-address code (TAC) TDT4205 Lecture 16

Lecture 3 Local Optimizations, Intro to SSA

Semantic Analysis computes additional information related to the meaning of the program once the syntactic structure is known.

opt. front end produce an intermediate representation optimizer transforms the code in IR form into an back end transforms the code in IR form into

CSC D70: Compiler Optimization

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

Code Generation. M.B.Chandak Lecture notes on Language Processing

& Simulator. Three-Address Instructions. Three-Address Instructions. Three-Address Instructions. Structure of an Assembly Program.

6. Intermediate Representation

Intermediate Code & Local Optimizations

Object Code (Machine Code) Dr. D. M. Akbar Hussain Department of Software Engineering & Media Technology. Three Address Code

CS2210: Compiler Construction. Code Generation

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space

Group B Assignment 8. Title of Assignment: Problem Definition: Code optimization using DAG Perquisite: Lex, Yacc, Compiler Construction

Compiler Optimization Intermediate Representation

6. Intermediate Representation!

Intermediate Code Generation

Tour of common optimizations

Lecture 7. Instruction Scheduling. I. Basic Block Scheduling II. Global Scheduling (for Non-Numeric Code)

Question Bank. 10CS63:Compiler Design

Intermediate Representations. Reading & Topics. Intermediate Representations CS2210

More Code Generation and Optimization. Pat Morin COMP 3002

Lexical Analysis-: The role of lexical analysis buffing, specification of tokens. Recognitions of tokens the lexical analyzer generator lexical

CSE 401/M501 Compilers

Plan for Today. Concepts. Next Time. Some slides are from Calvin Lin s grad compiler slides. CS553 Lecture 2 Optimizations and LLVM 1

CSE P 501 Compilers. Intermediate Representations Hal Perkins Spring UW CSE P 501 Spring 2018 G-1

PRINCIPLES OF COMPILER DESIGN

Intermediate-Code Generat ion

Intermediate Code Generation

CS 403 Compiler Construction Lecture 10 Code Optimization [Based on Chapter 8.5, 9.1 of Aho2]

LECTURE 17. Expressions and Assignment

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

COMPILER DESIGN - CODE OPTIMIZATION

Functions: flow of control (foc)

Optimization. ASU Textbook Chapter 9. Tsan-sheng Hsu.

CS415 Compilers. Intermediate Represeation & Code Generation

8 Optimisation. 8.1 Introduction to Optimisation

COMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013

Compiler Construction I

Instructions: Language of the Computer

CS606- compiler instruction Solved MCQS From Midterm Papers

Compiler Optimization

THEORY OF COMPILATION

Other Forms of Intermediate Code. Local Optimizations. Lecture 34

Administrative. Other Forms of Intermediate Code. Local Optimizations. Lecture 34. Code Generation Summary. Why Intermediate Languages?

CS 406/534 Compiler Construction Putting It All Together

Control-Flow Graph and. Local Optimizations

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

CSIS1120A. 10. Instruction Set & Addressing Mode. CSIS1120A 10. Instruction Set & Addressing Mode 1

Machine-Independent Optimizations

VIVA QUESTIONS WITH ANSWERS

Intermediate Code Generation

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis?

What Do Compilers Do? How Can the Compiler Improve Performance? What Do We Mean By Optimization?

Programming Language Processor Theory

CMPSC 160 Translation of Programming Languages. Three-Address Code

Transcription:

Compiler Theory (Intermediate Code Generation Abstract S yntax + 3 Address Code) 006

Why intermediate code? Details of the source language are confined to the frontend (analysis phase) of a compiler, while details of the target machine are confined to the back-end (synthesis) part. This saves a considerable amount of effort since with m front-ends and n back-ends we have m*n compilers.

Intermediate representations Syntax Trees Code is represented in the form of a tree where nodes represent constructs in the source program; the children of a node represent the meaningful components of a construct. Three-Address Code Made up of instructions of the general form x=y op z X, y and z are the three addresses. C Often used as an intermediate representation. e.g. LOTOS language

Directed Acyclic Graphs (i) Nodes in a syntax tree represent constructs in the source program A DAG is used to identify common subexpressions. e.g. a+a*(b-c)+(b-c)*d By doing so it gives the compiler important hints on how to generate efficient code to evaluate the expressions.

DAG for a+a*(b-c)+(b-c)*d

Three Address Code (TAC) An alternative form of intermediate (lower level) representation. x+y*x becomes t 1 = y * z t 2 = x + t 1 For expressions TAC is very similar to syntax trees. For statements it would produce labels and jumps in a similar fashion to machine code.

TAC for a+a*(b-c)+(b-c)*d

Addresses and Instructions An address can be A name : source program names In an implementation we would have these as pointers pointing towards the symbol table. A constant A compiler-generated temporary To store temporary results

Addresses and Instructions (i) Assignment instructions x = y op z where op is binary x = op y where op is unary (-,!, type casting) Copy Instructions x = y Jumps Unconditional : goto L Conditional : if x goto L iffalse x goto L if x relop y goto L

Addresses and Instructions (ii) Procedure Calls and returns Parameters : param x Procedure with n params: call p,n Function with n params : y = call f,n Indexed Array[] Access x = y[i] x[i] = y Address and Pointer Assignments (no need to be covered) x = &y x = *y *x = y

do i = i+1; while (a[i] < v);

Quadruples and Triples Data structures to hold three address code instructions. A Quadruple has four fields ( x = y+z) Op (+) Arg1 (y) Arg2 (z) Result (x)

An example... Note that in an actual implementation a,b and c should be pointers to the symbol table.

Triples Omit result field. Instead of a result field we can use pointers to the triple structure itself. This makes DAG and triple representation practically identical, since we are pointing to a node. In next example (n) indicates position n in the triple structure

An example... Note similarity...

Code Generation This is the final phase of a compiler Takes an intermediate representation and generates the equivalent target program Code optimisation (if any) occurs between the intermediate and target code generation We require that the code is correct and effectively uses the resources on the target machine Is itself efficient in generating code

Code Generation Whoever is designing the code generator must have a very good knowledge of the architecture of the target hardware and operating system. Should keep in mind Memory management Instruction selection Register allocation Evaluation order We shall look at generic issues

Input to the Code Generator Consists of Intermediate Code produced by front end Symbol Table to determine the runtime addresses of the data objects denoted by the names in the intermediate representation The underlying machine memory is byte-addressable and would have a number (say n) of general-purpose registers.

Assembly Language Instructions Most instructions consist of an operator, followed by a target, followed by a list of source operands. A label may precede an instruction We the next few slides we shall look at a number of different instruction classes.

Load Operations LD dst, addr Loads the value in location addr into location dst. dst = addr LD r, x loads the value at addr x into r LD r1, r2 loads the contents of register r2 into register r1

Store operations ST x, r Stored the value in register r into the location x. This instruction denotes the assignment x = r. Note difference from LD

Computation Operators OP dst, src1, src2 OP would be ADD, SUB etc dst, src1, src2 are locations Applies operation OP to the values in locations src1 and src2, and place the result of this operation in location dst ADD r1, r2, r3 computes r1=r2+r3 Unary OP do not have src2

Unconditional Jumps BR L Causes control to branch to the machine instruction with label L BR stands for branch

Conditional Jumps Bcond r, L Where r is a register and L is a label Cond stands for any of the of the common tests on values eg. LT, GT, etc BLTZ r, L causes a jump to label L if the value in register r is less than zero, and allows control to pass to the next machine instruction if not.

3AC to machine code (x=y-z) LD R1, y LD R2, z SUB R1, R1, R2 ST x, R1 // R1 = y // R2 = z // R1 = R1 - R2 // x = R1

3AC to machine code (if x<y goto L) LD R1, x // R1 = x LD R2, y // R2 = y SUB R1, R1, R2 // R1 = R1 - R2 BLTZ R1, M // if R1<0 jump M

Instruction Selection Instruction selection effects Execution speed and Size A rich instruction set may provide several ways to perform any given operation. Typical e.g.... INC x instruction replaces ADD x + 1;

Register Allocation Instructions involving registry operands are usually shorter and much faster than those involving memory operands. For this reason utilization of registers is important in generating fast code. The use of registers is often subdivided into two problems During register allocation, we select the set of variables that will reside in registers at a point in the program During subsequent register assignments, we pick the specific register that the variable will be stored in Optimal assignment of registers is NP-complete... hence some heuristics have to be used.

Standard code optimizations (i) Common Sub Expression An expression E is called a common sub-expression if E was previously computed and the values of the variables in E have not changed. In such cases we can use the previously computed value of E. Copy Propagation Reorganises assignment statements so that: x = y z = x Becomes x = y z = y

Standard code optimizations (ii) Dead Code Elimination A variable is 'live' at a point in a program if its value can be used subsequently, otherwise it is 'dead'. Statements may compute values that are never used in a program. e.g. Computing expressions values which are never assigned e.g. If (debug) then print... and someone (using data flow analysis) the compiler can deduce that debug is always false. Check and print code can be removed.

Standard code optimizations (iii) copy propagations + dead code elimination x = t3 a[t2] = t5 a[t4] = x goto b2 Elimination of copy propagation x= t3 a[t2] = t5 a[t4] = t3 goto b5 Elimination of dead code a[t2] = t5 a[t4] = t3 goto b5

Standard code optimizations (iii) Loop optimisation Loops are an important place where optimisations may occur. Clearly we try to reduce the number of instructions inside the loop! One option is to use code motion ( and maintain semantics ) This transformation takes out of the loop any expressions that have the same evaluation independent of the number of times the loop executes (loop-invariant computation) and places it before the loop. e.g. while (i <= (limit-1))... limit 1 can be computed before the loop!! i <= t where t = limit-1