Intermediate Representations

Similar documents
Intermediate Code Generation

CMPSC 160 Translation of Programming Languages. Three-Address Code

Formal Languages and Compilers Lecture X Intermediate Code Generation

CSc 453 Intermediate Code Generation

Principle of Compilers Lecture VIII: Intermediate Code Generation. Alessandro Artale

COMPILER CONSTRUCTION (Intermediate Code: three address, control flow)

Compilers. Compiler Construction Tutorial The Front-end

Intermediate Code Generation

Projects for Compilers

COP5621 Exam 3 - Spring 2005

Intermediate Code Generation

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Concepts Introduced in Chapter 6

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

SOURCE LANGUAGE DESCRIPTION

Syntax-Directed Translation

Compiler Theory. (Intermediate Code Generation Abstract S yntax + 3 Address Code)

Intermediate Code Generation

THEORY OF COMPILATION

CS2210: Compiler Construction. Code Generation

UNIT-3. (if we were doing an infix to postfix translator) Figure: conceptual view of syntax directed translation.

intermediate-code Generation

CSE 340 Fall 2014 Project 4

Concepts Introduced in Chapter 6

Talen en Compilers. Johan Jeuring , period 2. December 15, Department of Information and Computing Sciences Utrecht University

NARESHKUMAR.R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY Page 1

Using an LALR(1) Parser Generator

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Intermediate Code Generation

S3.0 : A multicore 32-bit Processor

Quadsim Version 2.1 Student Manual

Dixita Kagathara Page 1

A simple syntax-directed

Semantic Analysis computes additional information related to the meaning of the program once the syntactic structure is known.

Module 27 Switch-case statements and Run-time storage management

CPSC 411, 2015W Term 2 Midterm Exam Date: February 25, 2016; Instructor: Ron Garcia

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis?

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CSE P 501 Compilers. Intermediate Representations Hal Perkins Spring UW CSE P 501 Spring 2018 G-1

Virtual Machine Tutorial

Time : 1 Hour Max Marks : 30

CMPT 379 Compilers. Anoop Sarkar. 11/13/07 1. TAC: Intermediate Representation. Language + Machine Independent TAC

Where we are. What makes a good IR? Intermediate Code. CS 4120 Introduction to Compilers

UNIT IV INTERMEDIATE CODE GENERATION

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Code Generation. Dragon: Ch (Just part of it) Holub: Ch 6.

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation

Intermediate Representations

Quick Reference Guide

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

A Simple Syntax-Directed Translator

TACi: Three-Address Code Interpreter (version 1.0)

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

Name :. Roll No. :... Invigilator s Signature : INTRODUCTION TO PROGRAMMING. Time Allotted : 3 Hours Full Marks : 70

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

CSE 401/M501 Compilers

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

Type Checking. Chapter 6, Section 6.3, 6.5

Syntax Intro and Overview. Syntax

Chapter 6 Intermediate Code Generation

Intermediate Representa.on

CSCI Compiler Design

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Semantic actions for declarations and expressions

Object Code (Machine Code) Dr. D. M. Akbar Hussain Department of Software Engineering & Media Technology. Three Address Code

COMPILERS BASIC COMPILER FUNCTIONS

Compiler Principle and Technology. Prof. Dongming LU April 15th, 2019

Semantic actions for declarations and expressions. Monday, September 28, 15

7 Translation to Intermediate Code

Introduction to Programming Using Java (98-388)

Intermediate Code Generation

Principles of Compiler Design

CPS 506 Comparative Programming Languages. Syntax Specification

KU Compilerbau - Programming Assignment

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Writing Evaluators MIF08. Laure Gonnord

Control Structures. Boolean Expressions. CSc 453. Compilers and Systems Software. 16 : Intermediate Code IV

Compiler Code Generation COMP360

CS 432 Fall Mike Lam, Professor. Code Generation

Question Bank. 10CS63:Compiler Design

Lexical Considerations

Intermediate Representations

The CPU and Memory. How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram:

Intermediate Code Generation

Intermediate Code Generation

ECE232: Hardware Organization and Design

CS 403: Scanning and Parsing

Semantic actions for declarations and expressions

Compiler Construction I

Summary: Direct Code Generation

CS251 Programming Languages Handout # 29 Prof. Lyn Turbak March 7, 2007 Wellesley College

Code generation scheme for RCMA

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

The SPL Programming Language Reference Manual

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

Page No 1 (Please look at the next page )

Module 25 Control Flow statements and Boolean Expressions

CMa simple C Abstract Machine

Transcription:

Intermediate Representations A variety of intermediate representations are used in compilers Most common intermediate representations are: Abstract Syntax Tree Directed Acyclic Graph (DAG) Three-Address Code Code for a simplified stack-based virtual machine (Example: P-code) Abstract Syntax is adequate representation of the source code However, it is not linear and does not resemble target code High-level constructs, such as if and while, should be translated into jumps Directed acyclic graph is an optimization of a syntax tree Intermediate Code Generation 1

Three-Address Code Generalized assembly code for a virtual 3-address machine 3-address code represents a linearization of the syntax tree 3-address code can be: High level: representing all operations as abstractly as a syntax tree Low level: closely resembling target code Basic 3-address instruction consists of an operator and 3 addresses Two addresses for the two operands and one address for the result General form x := y op z op is an operator code x, y, and z are typically implemented as pointers to symbols x is either an identifier or a temporary symbol y and z can be an identifier, a literal, or a temporary symbol Intermediate Code Generation 2

Types of 3-Address Instructions Arithmetic, logical, and shift instructions are of the form: x := y op z Binary Operator op can be any of the following operators: ADD, SUB, MUL, DIV, MOD, AND, OR, XOR, SHL, SHR, SHRA, EQ, NE, LT, LE, GT, GE The above operators are generic Type information can be also added to each operator Binary Arithmetic Logical and shift To distinguish between integer and floating-point operations Unary op instructions are of the form: x := op y op can be PLUS, MINUS, or NOT operator Relational operators Unary Operator op can also be conversion operators to convert between integer and FP Intermediate Code Generation 3

Example on Translating an Expression Consider the translation of ( 2 + a * ( b c / d ) ) / e t1 := c / d t2 := b t1 t3 := a * t2 t4 := 2 + t3 t5 := t4 / e Compiler generates temporaries when translating into 3AC t1, t2, t3, t4, and t5 are generated temporaries Temporaries are identified by number Temporaries are stored in symbols, like identifiers Type information is also added to temporary symbols Intermediate Code Generation 4

3-Address Instructions for Copy and Jumps Move or copy instruction is of the form: x := y Unconditional jump and label instructions are of the form: GOTO Ln Ln is a label identified by a number n LABEL Ln A label is not an instruction; It is the address of the following instruction Conditional branch instructions are of the form: IF x goto Ln Branch if x is true IFNOT x goto Ln Branch if x is false x is a Boolean variable that evaluate to true or false The general conditional branch instruction is of the form: if x relop y goto Ln Conditional branches are used to implement conditional and loop statements Intermediate Code Generation 5

Example on Translating a While Loop Consider the following while loop: sum:=0; i:=1; while (i<n) { sum := sum + i; i := i+1; } A translation of the above loop into 3AC is shown below: sum := 0 i := 1 goto L2 label L1 sum := sum + i i := i + 1 label L2 if i < n goto L1 Intermediate Code Generation 6

Implementation of 3-Address Code A 3-address instruction is implemented as a quadruple: An operator code Two pointers to operand symbols A pointer to result symbol, goto target, or label number A code sequence can be implemented as an array or a linked list A linked list is preferable because it Facilitates reordering and concatenation of instructions Grows dynamically as required A 3-address instruction has a link to the next instruction opcode result target label first second link Intermediate Code Generation 7

Three-Address Instruction Structure The structure of a 3-address instruction is given below: struct Inst { Inst(OpType op, Symbol* r=0, Symbol* f=0, Symbol* s=0); Inst(OpType op, Inst* t, Symbol* f=0, Symbol* s=0); Inst(OpType op, unsigned l); OpType opcode; // Operation Code union { // Anonymous union Symbol* result; // Either a symbol pointer, or Inst* target; // Target label used with GOTO unsigned label; // Label number used with LABEL }; Symbol* first; // First operand symbol Symbol* second; // Second operand symbol Inst* link; // Link to next instruction }; Intermediate Code Generation 8

Generating Temporaries and Labels newtemp() allocates and returns a new temporary symbol The static variable num ensures a unique number for every call Symbol* newtemp() { static int num = 1; Symbol* temp = new Symbol(TEMP,num); num++; return temp; } newlabel() allocates and returns a new label instruction Inst* newlabel() { static int num = 1; Inst* label = new Inst(LABEL,num); num++; return label; } Intermediate Code Generation 9

Concatenating Instructions and Code A code sequence is a linear linked list of instructions We identify the first and last instructions in a code sequence struct Code { }; Inst* first; Inst* last; // Code sequence structure // First instruction in code sequence // Last instruction in code sequence The + operator is overloaded to mean concatenation Four + operators will concatenate code sequences and instructions The result of concatenation is a code sequence Code operator+(code a, Code b); Code operator+(code c, Inst* i); Code operator+(inst* i, Code c); Code operator+(inst* i, Inst* j); Intermediate Code Generation 10

Concatenating Two Code Sequences The + operator links the pointers of two code sequences: Code operator+(code a, Code b) { Code c; if (a.first == 0) { // Code a is NULL c = b; } else if (b.first == 0) { // Code b is NULL c = a; } else { // General Case c.first = a.first; c.last = b.last; a.last->link = b.first; } return c; } Intermediate Code Generation 11

Translating Expressions into 3-Address Code Synthesized Attributes E.code: Code sequence evaluating E E.sym: Symbol representing value of E addop.op: addition operator: ADD or SUB mulop.op: multiplication operator: MUL, DIV, or MOD Grammar Rules E E 1 addop E 2 E E 1 mulop E 2 E UnaryOp E 1 Semantic Rules E.sym := newtemp(); AddInst := new Inst(addop.op, E.sym, E 1.sym, E 2.sym); E.code := E 1.code + E 2.code + AddInst; E.sym := newtemp(); MulInst := new Inst(mulop.op, E.sym, E 1.sym, E 2.sym); E.code := E 1.code + E 2.code + MulInst; E.sym := newtemp(); UnaryInst := new Inst(UnaryOp.op, E.sym, E 1.sym); E.code := E 1.code + UnaryInst; Intermediate Code Generation 12

Translating Expressions cont'd Synthesized Attributes UnaryOp.op: id.name: num.sym: Grammar Rules E ( E 1 ) E id E num UnaryOp addop UnaryOp not Unary operator: PLUS, MINUS, NOT Identifier name Literal symbol holding number value Semantic Rules E.sym := E 1.sym; E.code := E 1.code; E.sym := idtable.lookup(id.name); E.code.first := E.code.last = 0; E.sym := num.sym; E.code.first := E.code.last = 0; if addop.op = ADD then UnaryOp.op := PLUS else UnaryOp.op := MINUS UnaryOp.op := NOT Intermediate Code Generation 13

Translating If-Statement into 3-Address Code S.code E.code S if E then Slist end ; S.code E.code IFNOT E.sym goto Next SList.code LABEL Next S if E then Slist 1 else Slist 2 end ; IFNOT E.sym goto Else SList 1.code GOTO Next LABEL Else SList 2.code LABEL Next Intermediate Code Generation 14

Translating If-Statement cont'd Synthesized Attributes E.code: E.sym: S.code: Slist.code: Grammar Rules S if E then Slist end ; S if E then Slist 1 else Slist 2 end ; Code sequence for E Symbol representing E Code sequence of a statement Code sequence of a statement list Semantic Rules Next := newlabel(); IfNotNext := new Inst(IFNOT, Next, E.sym); S.code := E.code + IfNotNext + Slist.code + Next; Else := newlabel(); Next := newlabel(); IfNotElse := new Inst(IFNOT, Else, E.sym); GotoNext := new Inst(GOTO, Next); S.code := E.code + IfNotElse + Slist 1.code + GotoNext + Else + Slist 2.code + Next; Intermediate Code Generation 15

Translating While Statement Possible translation S.code LABEL Expr S while E do Slist end ; E.code IFNOT E.sym goto Next SList.code GOTO Expr LABEL Next Intermediate Code Generation 16

Translating While and Statement Lists Synthesized Attributes Grammar Rules E.code: Code sequence evaluating E S.code: Code sequence of statement S Slist.code: list Slist Code sequence of statement E.sym: Symbol representing value of E Semantic Rules S while E do Slist end ; next := newlabel(); Expr := newlabel(); GotoExpr := new Inst(GOTO, Expr); IfNotNext := new Inst(IFNot, next, E.sym); S.code := Expr + E.code + IfNotNext+ Slist.code+ GotoExpr + next; Slist Slist 1 S Slist.code := Slist 1.code + S.code; Slist ε Slist.code.first := 0; Slist.code.last := 0; Intermediate Code Generation 17

Implementation Example of Interpretation function expr( ) : Pointer-> ST // The BNF of expr is converted to EBNF to avoid the left recursion // The code in black is common for parsing and to create an entry in the symbol table for temp var. begin left := term( ); //term() will return a pointer to the term record in the ST and add its value // The code highlighted in red is Related to type checking type:= get_type(left); if type(left)!= integer or float then semantic-error();// Related to type checking entry = new_temp(); //The code highlighted in green is added as part of the interpreter put_value (entry, get_value(left); while token = ADDOP do op := ADDOP.op ; match(addop); right := term( ); if get_type(right)!= type then semantic-error(); put_value(entry, (get_value(entry)+get_value(right))); end while; return entry; end expr; Intermediate Code Generation 18