Theory of Compilation Erez Petrank. Lecture 7: Intermediate Representation Part 2: Backpatching

Similar documents
THEORY OF COMPILATION

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 7, vrijdag 2 november 2018

Module 26 Backpatching and Procedures

CMPSC 160 Translation of Programming Languages. Three-Address Code

UNIT-3. (if we were doing an infix to postfix translator) Figure: conceptual view of syntax directed translation.

Formal Languages and Compilers Lecture X Intermediate Code Generation

Intermediate Code Generation

Intermediate Code Generation

Intermediate Code Generation

Intermediate Representa.on

Principle of Compilers Lecture VIII: Intermediate Code Generation. Alessandro Artale

Intermediate Code Generation

NARESHKUMAR.R, AP\CSE, MAHALAKSHMI ENGINEERING COLLEGE, TRICHY Page 1

Intermediate Code Generation

Intermediate Code Generation

Compilerconstructie. najaar Rudy van Vliet kamer 124 Snellius, tel rvvliet(at)liacs(dot)nl

Syntax Directed Translation

Intermediate Code Generation Part II

Intermediate Representations

CSc 453 Intermediate Code Generation

Compilers. Compiler Construction Tutorial The Front-end

intermediate-code Generation

Syntax-Directed Translation

Intermediate Code Generation

UNIT IV INTERMEDIATE CODE GENERATION

Module 25 Control Flow statements and Boolean Expressions

Intermediate Code Generation

CS2210: Compiler Construction. Code Generation

Intermediate Code Generation

CS322 Languages and Compiler Design II. Spring 2012 Lecture 7

Principle of Complier Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Syntax-Directed Translation Part II

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CMPT 379 Compilers. Anoop Sarkar. 11/13/07 1. TAC: Intermediate Representation. Language + Machine Independent TAC

A Simple Syntax-Directed Translator

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Principles of Programming Languages

Fall Compiler Principles Lecture 6: Intermediate Representation. Roman Manevich Ben-Gurion University of the Negev

Chapter 6 Intermediate Code Generation

CS 406: Syntax Directed Translation

Control Structures. Boolean Expressions. CSc 453. Compilers and Systems Software. 16 : Intermediate Code IV

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis?

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

LECTURE 3. Compiler Phases

COMP455: COMPILER AND LANGUAGE DESIGN. Dr. Alaa Aljanaby University of Nizwa Spring 2013

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Time : 1 Hour Max Marks : 30

CSc 453. Compilers and Systems Software. 16 : Intermediate Code IV. Department of Computer Science University of Arizona

Fall Compiler Principles Lecture 5: Intermediate Representation. Roman Manevich Ben-Gurion University of the Negev

CSCI565 Compiler Design

Concepts Introduced in Chapter 6

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 19: Efficient IL Lowering 5 March 08

COP5621 Exam 3 - Spring 2005

Languages and Compiler Design II IR Code Generation I

Bottom-Up Parsing. Lecture 11-12

Principles of Programming Languages

CS 4201 Compilers 2014/2015 Handout: Lab 1

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design

PRINCIPLES OF COMPILER DESIGN

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Compilers. 5. Attributed Grammars. Laszlo Böszörmenyi Compilers Attributed Grammars - 1

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Dixita Kagathara Page 1

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Writing Evaluators MIF08. Laure Gonnord

Principles of Compiler Design

Exercises II. Exercise: Lexical Analysis

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

A programming language requires two major definitions A simple one pass compiler

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

CSCI565 Compiler Design

Related Course Objec6ves

Lecture 14 Sections Mon, Mar 2, 2009

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

5. Syntax-Directed Definitions & Type Analysis

IR Lowering. Notation. Lowering Methodology. Nested Expressions. Nested Statements CS412/CS413. Introduction to Compilers Tim Teitelbaum

LR Parsing LALR Parser Generators

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Compiler Theory. (Intermediate Code Generation Abstract S yntax + 3 Address Code)

Concepts Introduced in Chapter 6

Code Genera*on for Control Flow Constructs

LR Parsing LALR Parser Generators

2.2 Syntax Definition

LECTURE NOTES ON COMPILER DESIGN P a g e 2

CMPT 379 Compilers. Parse trees

Bottom-Up Parsing. Lecture 11-12

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

CS606- compiler instruction Solved MCQS From Midterm Papers

More On Syntax Directed Translation

Lecture Overview Code generation in milestone 2 o Code generation for array indexing o Some rational implementation Over Express Over o Creating

Question Bank. 10CS63:Compiler Design

Front End. Hwansoo Han

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

EDA180: Compiler Construc6on Context- free grammars. Görel Hedin Revised:

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Transcription:

Theory of Compilation 236360 rez Petrank Lecture 7: Intermediate Representation Part 2: ackpatching 1

You are here Compiler Source text txt Lexical Analysis Syntax Analysis Semantic Analysis Inter. Rep. (IR) Gen. Code xecutable code exe characters tokens (Abstract Syntax Tree) AST Annotated AST 2

Intermediate Representation neutral representation between the front-end and the back-end Abstracts away details of the source language Abstract away details of the target language A compiler may have multiple intermediate representations and move between them Java C Ada Objec2ve C GNRIC arm x86 ia64 3

Our Representation: Three Address Code (3AC) very instruction operates on three addresses result = operand1 operator operand2 Close to low-level operations in the machine language Operator is a basic operation Statements in the source language may be mapped to multiple instructions in three address code 4

Three address code: example assign a + * b uminus b * uminus t 1 t 2 t 3 t 4 t 5 a := := := := := := c b * t 1 c b * t 3 t 2 + t 4 t 5 c c 5

Three address code: example instructions instruc(on meaning x := y op z assignment with binary operator x := op y assignment unary operator x:= y assignment x := &y assign address of y x:=*y assignment from deref y *x := y assignment to deref x instruc(on goto L if x relop y goto L meaning uncondi2onal jump condi2onal jump 6

Creating 3AC Assume bottom up parser Covers a wider range of grammars LALR sufficient to cover most programming languages Creating 3AC via syntax directed translation Attributes examples: code code generated for a nonterminal var name of variable that stores result of nonterminal freshvar() helper function that returns the name of a fresh variable 7

Creating 3AC: expressions produc(on S id := 1 + 2 1 * 2-1 (1) seman(c rule S.code :=. code gen(id.var :=.var).var := freshvar();.code = 1.code 2.code gen(.var := 1.var + 2.var).var := freshvar();.code = 1.code 2.code gen(.var := 1.var * 2.var).var := freshvar();.code = 1.code gen(.var := uminu 1.var).var := 1.var.code = ( 1.code ) id.var := id.var;.code = (we use to denote concatena2on of intermediate code fragments) 8

example a = b * - c + b* - c a assign +.var = t5.code = t1 = - c t2 = b*t1 t3 = - c t4 = b*t3 t5 = t2+t4 *.var = t2.code = t1 = - c t2 = b*t1.var = t4.code = t3 = - c t4 = b*t3 *.var = t3.code = t3 = - c b.var = b.code = uminus c.var = t1.code = t1 = - c b.var = b.code = uminus c.var = c.code =.var = c.code = 9

Creating 3AC: control statements S t while do S1 S.begin: S.after:.code if.var = 0 goto S.after S 1.code goto S.begin produc(on S t while do S1 seman(c rule S.begin := freshlabel(); S.a_er = freshlabel(); S.code := gen(s.begin : ).code gen( if.var = 0 goto S.a_er) S1.code gen( goto S.begin) gen(s.a_er : ) 10

Generating IR code Option 1 accumulate code in AST attributes Option 2 emit IR code to a file during compilation If for every production the code of the left-hand-side is constructed from a concatenation of the code of the RHS in some fixed order 11

xpressions and assignments produc(on S id := seman(c ac(on { p:= lookup(id.name); if p null then emit(p :=.var) else error } 1 op 2 {.var := freshvar(); emit(.var := 1.var op 2.var) } - 1 {.var := freshvar(); emit(.var := uminus 1.var) } ( 1) {.var := 1.var } id { p:= lookup(id.name); if p null then.var :=p else error } 12

oolean xpressions produc(on seman(c ac(on 1 op 2 {.var := freshvar(); emit(.var := 1.var op 2.var) } not 1 {.var := freshvar(); emit(.var := not 1.var) } ( 1) {.var := 1.var } true {.var := freshvar(); emit(.var := 1 ) } false {.var := freshvar(); emit(.var := 0 ) } Represent true as 1, false as 0 Wasteful representa2on, crea2ng variables for true/false 13

oolean expressions via jumps produc(on seman(c ac(on id1 op id2 {.var := freshvar(); emit( if id1.var relop id2.var goto nextstmt+2); emit(.var := 0 ); emit( goto nextstmt + 1); emit(.var := 1 ) } This is useful for short circuit evaluation. Let us start with an example. 14

oolean xpressions: an xample or a < b and c < d e < f 15

oolean xpressions: an xample or a < b and 100: 101: 102: 103: if a < b goto 103 T 1 := 0 goto 104 T 1 := 1 c < d e < f 16

oolean xpressions: an xample or a < b and 100: 101: 102: 103: if a < b goto 103 T 1 := 0 goto 104 T 1 := 1 c < d 104: if c < d goto 107 105: T 2 := 0 106: goto 108 107: T 2 := 1 e < f 17

oolean xpressions: an xample or a < b and 100: 101: 102: 103: if a < b goto 103 T 1 := 0 goto 104 T 1 := 1 c < d 104: if c < d goto 107 105: T 2 := 0 106: goto 108 107: T 2 := 1 e < f 108: 109: 110: 111: 112: if e < f goto 111 T 3 := 0 goto 112 T 3 := 1 18

oolean xpressions: an xample or a < b and 100: 101: 102: 103: if a < b goto 103 T 1 := 0 goto 104 T 1 := 1 c < d 104: if c < d goto 107 105: T 2 := 0 106: goto 108 107: T 2 := 1 e < f 108: 109: 110: 111: 112: 113: if e < f goto 111 T 3 := 0 goto 112 T 3 := 1 T 4 := T 2 and T 3 19

oolean xpressions: an xample or a < b and 100: 101: 102: 103: if a < b goto 103 T 1 := 0 goto 104 T 1 := 1 c < d 104: if c < d goto 107 105: T 2 := 0 106: goto 108 107: T 2 := 1 e < f 108: 109: 110: 111: 112: 113: if e < f goto 111 T 3 := 0 goto 112 T 3 := 1 T 4 := T 2 and T 3 T 5 := T 1 or T 4 20

Short circuit evaluation Second argument of a boolean operator is only evaluated if the first argument does not already determine the outcome (x and y) is equivalent to: if x then y else false ; (x or y) is equivalent to: if x then true else y ; Is short circuit evaluation equivalent to standard evaluation? We use it if the language definition dictates its use. 21

Our previous example a < b or (c<d and e<f) 100: if a < b goto 103 101: T 1 := 0 102: goto 104 103: T 1 := 1 104: if c < d goto 107 105: T 2 := 0 106: goto 108 107: T 2 := 1 108: if e < f goto 111 109: T 3 := 0 110: goto 112 111: T 3 := 1 112: T 4 := T 2 and T 3 113: T 5 := T 1 and T 4 100: if a < b goto 105 101: if!(c < d) goto 103 102: if e < f goto 105 103: T := 0 104: goto 106 105: T := 1 106: naive Short circuit evalua2on 22

Control Structure: if, else, while. Consider the conditional jumps: S if then S 1 if then S 1 else S 2 while do S 1 One option is to create code for, create code for S, and then create a jump to the beginning of S or the end of S according to s value. ut it would be more efficient to create code that while executing will jump to the correct location immediately when the value of is discovered.

Control Structure: if, else, while. The problem is that we do not know where to jump to While parsing s tree for if then S, we don t know S s code and where it starts or ends. Yet, we must generate jumps to these locations. Solution: for each expression keep labels:.truelabel and.falselabel. We jump to these labels when s value is true or false correspondingly. Also, for each statement S, we have a label S.nextLabel which keeps the address of the code that follows the statement S. The semantic equation will be.falselabel = S.next. For.true, we create a new label between s code and S s code. S if then S

התכונה next While parsing statement S, generate the code with its next label. produc(on P t S S t S1 S2 seman(c ac(on S.next = freshlabel(); P.code = S.code label(s.next) S1.next = freshlabel(); S2.next = S.next; S.code = S1.code label(s1.next) S2.code The attribute S.next is inherited: the child gets it. The attribute code is generated: the parent gets it. The label S.next is symbolic. The address will only be known after we parse S.

Control Structures: conditional produc(on seman(c ac(on S t if then S1.trueLabel = freshlabel();.falselabel = S.next; S1.next = S.next; S.code =.code gen (.truelabel : ) S1.code Are S1.next,.falseLabel inherited or synthesized? Is S.code inherited or synthesized? 26

Control Structures: conditional produc(on S t if then S1 else S2 seman(c ac(on.truelabel = freshlabel();.falselabel = freshlabel(); S1.next = S.next; S2.next = S.next; S.code =.code gen(.truelabel : ) S1.code gen( goto S.next) gen(.falselabel : ) S2.code.trueLabel and.falselabel considered inherited 27

S t if then S1 else S2.trueLabel:.falseLabel: S.next:.code S1.code goto S.next S2.code.trueLabel = freshlabel();.falselabel = freshlabel(); S1.next = S.next; S2.next = S.next; S.code =.code gen(.truelabel : ) S1.code gen( goto S.next) gen(.falselabel : ) S2.code 28

oolean expressions Generate code that jumps to.truelabel if s value is true and to.falselabel if s value is false. produc(on t 1 or 2 t 1 and 2 t not 1 t (1) seman(c ac(on 1.trueLabel =.truelabel; 1.falseLabel = freshlabel(); 2.trueLabel =.truelabel; 2.falseLabel =.falselabel;.code = 1.code gen (1.falseLabel : ) 2.code 1.trueLabel = freshlabel(); 1.falseLabel =.falselabel; 2.trueLabel =.truelabel; 2.falseLabel =.falselabel;.code = 1.code gen (1.trueLabel : ) 2.code 1.trueLabel =.falselabel; 1.falseLabel =.truelabel;.code = 1.code; 1.trueLabel =.truelabel; 1.falseLabel =.falselabel;.code = 1.code; t id1 relop id2.code=gen ( if id1.var relop id2.var goto.truelabel) gen( goto.falselabel); t true t false.code = gen( goto.truelabel).code = gen( goto.falselabel); 29

oolean expressions produc(on t 1 or 2 seman(c ac(on 1.trueLabel =.truelabel; 1.falseLabel = freshlabel(); 2.trueLabel =.truelabel;.falselabel =.falselabel;.code = 1.code gen (1.falseLabel : ) 2.code How can we determine the address of 1.falseLabel? Only possible after we know the code of 1 and all the code preceding 1 30

xample S if then S1 1 true and 2 false 1.trueLabel = freshlabel(); 1.falseLabel =.falselabel; 2.trueLabel =.truelabel; 2.falseLabel =.falselabel;.code = 1.code gen (1.trueLabel : ) 2.code.code = gen( goto 2.falseLabel).code = gen( goto 1.trueLabel) 31

xample.truelabel = freshlabel();.falselabel = S.next; S S1.next = S.next; S.code =.code gen (.truelabel : ) S1.code if then S1 1 true and 2 false 1.trueLabel = freshlabel(); 1.falseLabel =.falselabel; 2.trueLabel =.truelabel; 2.falseLabel =.falselabel;.code = 1.code gen (1.trueLabel : ) 2.code.code = gen( goto 2.falseLabel).code = gen( goto 1.trueLabel) 32

Computing the labels We can build the code while parsing the tree bottomup, leaving the jump targets vacant. We can compute the values for the labels in a second traversal of the AST Can we do it in a single pass? 33

So far Three address code. Intermediate code generation is executed with parsing (via semantic actions). Creating code for oolean expressions and for control statements is more involved. We typically use short circuit evaluation, value of expression is implicit in control location. We need to compute the branching addresses. Option 1: compute them in a second AST pass. 34

(תיקון לאחור) ackpatching Goal: generate code in a single pass Generate code as we did before, but manage labels differently Keep targets of jumps empty until values are known, and then back-patch them New synthesized attributes for.truelist list of jump instructions that eventually get the label where goes when is true..falselist list of jump instructions that eventually get the label where goes when is false. 35

ackpatching For every label, maintain a list of instructions that jump to this label When the address of the label is known, go over the list and update the address of the label Previous solutions do not guarantee a single pass The attribute grammar we had before is not S-attributed (e.g., next is inherited), and is not L-attributed (because the inherited attributes are not necessarily from left siblings). 36

S Why is it difficult? if then else S 1 S 2 Think of an if-then-else statement. To compute S1.next, we need to know the code of all blocks S1,, S2, in order to compute the address that follows them. On the other hand, to compute the code of S1, we need to know S1.next, or S.next, which is not known before S1 is computed. So we cannot compute S1 s code with all jump targets, but we can compute it up to leaving space for future insertion of S1.next. Using backpatching we maintain a list of all empty spaces that need to jump to S1.next. When we know S1.next, we go over this list and update the jump targets with S1.next s value.

Functions for list handling makelist(addr) create a list of instructions containing addr merge(p1,p2) concatenate the lists pointed to by p1 and p2, returns a pointer to the new list backpatch(p,addr) inserts i as the target label for each of the instructions in the list pointed to by p 38

Accumulating the code Assume a bottom-up parsing so that the code is created in the correct order from left to right, bottom-up. The semantic analysis is actually executed during parsing and the code is accumulated. Recall that we associated two labels.truelabel and.falselabel for each expression, which are the labels to which we should jump if turns out true or false. We will now replace these with two lists.truelist and.falselist, which record all instructions that should be updated with the address of.truelabel or.falselabel when we discover their values. In addition, we also replace S.next with S.nextlist, maintaining all instructions that should jump to S.next, when we know what it is.

Accumulating Code (cont d).truelist and.falselist are generated attributes: the descendants tell the parent where there is code to fix. When ascending to the parent of, we know what are the relevant addresses and we can backpatch them into all the instruction locations that are in the list. Similarly for S.nextlist. We will need a function nextinstr() that retruns the address of the next instruction.

Computing oolean expressions Recall the code jumps to.true ot.false according to s value. Previously: id 1 relop id 2 true false {.code := gen ( ' if ' id 1.var relop.op id 2.var ' goto '.true ) gen ( ' goto '.false ) } {.code := gen ( ' goto '.true ) } {.code := gen ( ' goto '.false ) } Now with backpatching: id 1 relop id 2 true false {.truelist := makelist ( nextinstr ) ;.falselist := makelist ( nextinstr + 1 ) ; emit ( ' if ' id 1.var relop.op id 2.var ' goto_ ' ) emit ( ' goto_ ' ) } {.truelist := makelist ( nextinstr ) ; emit ( ' goto_ ' ) } {.falselist := makelist ( nextinstr ) ; emit ( ' goto_ ' ) }

Computing oolean expressions Recall the code jumps to.true ot.false according to s value. Previously: not 1 { 1.true :=.false ; 1.false :=.true ;.code := 1.code } ( 1 ) { 1.true :=.true ; 1.false :=.false ;.code := 1.code } Now with backpatching: not 1 {.truelist := 1.falselist ;.falselist := 1.truelist } ( 1 ) {.truelist := 1.truelist ;.falselist := 1.falselist }

Computing oolean expressions Recall the code jumps to.true ot.false according to s value. Previously: 1 or 2 1 and 2 { 1.true :=.true ; 1.false := newlable() ; 2.true :=.true ; 2.false :=.false ;.code := 1.code gen ( 1.false ' : ') 2.code } { 1.true := newlabel (); 1.false :=.false ; 2.true :=.true ; 2.false :=.false ;.code := 1.code gen ( 1.true ' : (' 2.code } Now with backpatching: 1 or M 2 1 and M 2 M { backpatch ( 1.falselist, M.instr ) ;.truelist := merge ( 1.truelist, 2.truelist ) ;.falselist := 2.falselist } { backpatch ( 1.truelist, M.instr ) ;.truelist := 2.truelist ;.falselist := merge ( 1.falselist, 2.falselist ) } { M.instr := nextinstr }

ackpatching oolean expressions produc(on t 1 or M 2 seman(c ac(on backpatch(1.falselist,m.instr);.truelist = merge(1.truelist,2.truelist);.falselist = 2.falseList; t 1 and M 2 backpatch(1.truelist,m.instr);.truelist = 2.trueList;.falseList = merge(1.falselist,2.falselist); t not 1 t (1).trueList = 1.falseList;.falseList = 1.trueList;.trueList = 1.trueList;.falseList = 1.falseList; t id1 relop id2.truelist = makelist(nextinstr);.falselist = makelist(nextinstr+1); emit ( if id1.var relop id2.var goto _ ) emit( goto _ ); t true t false M t ε.truelist = makelist(nextinstr); emit ( goto _ );.falselist = makelist(nextinstr); emit ( goto _ ); M.instr = nex2nstr; 44

Using a marker xample: 1 or M 2 { M.instr = nextinstr;} Use M to obtain the address just before 2 code starts being generated 45

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} or M x < 100 100: if x< 100 goto _ 101: goto _ ε and M ε x > 200 x!= y t id1 relop id2.truelist = makelist(nextinstr);.falselist = makelist(nextinstr+1); emit ( if id1.var relop id2.var goto _ ) emit( goto _ ); 46

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} M.i = 102 or M x < 100 100: if x< 100 goto _ 101: goto _ ε and M ε x > 200 x!= y M t ε M.instr = nex(nstr; 47

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} M.i = 102 or M x < 100 100: if x< 100 goto _ 101: goto _ ε.t = {102}.f = {103} and M ε 102: if x> 200 goto _ 103: goto _ x > 200 x!= y t id1 relop id2.truelist = makelist(nextinstr);.falselist = makelist(nextinstr+1); emit ( if id1.var relop id2.var goto _ ) emit( goto _ ); 48

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} M.i = 102 or M x < 100 100: if x< 100 goto _ 101: goto _ ε.t = {102}.f = {103} and M ε M.i = 104 102: if x> 200 goto _ 103: goto _ x > 200 x!= y M t ε M.instr = nex(nstr; 49

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} M.i = 102 or M x < 100 100: if x< 100 goto _ 101: goto _ 102: if x> 200 goto _ 103: goto _ ε.t = {102}.f = {103} x > 200 and M ε 104: if x!=y goto _ 105: goto _ M.i = 104 x!= y.t = {104}.f = {105} t id1 relop id2.truelist = makelist(nextinstr);.falselist = makelist(nextinstr+1); emit ( if id1.var relop id2.var goto _ ) emit( goto _ ); 50

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} or M M.i = 102.t = {104}.f = {103,105} x < 100 100: if x< 100 goto _ 101: goto _ 102: if x> 200 goto 104 103: goto _ ε.t = {102}.f = {103} x > 200 and M ε 104: if x!=y goto _ 105: goto _ M.i = 104 x!= y.t = {104}.f = {105} t 1 and M 2 backpatch(1.truelist,m.instr);.truelist = 2.trueList;.falseList = merge(1.falselist,2.falselist); 51

xample X < 100 or x > 200 and x!= y.t = {100}.f = {101} or M.t = {100,104}.f = {103,105} M.i = 102.t = {104}.f = {103,105} x < 100 100: if x< 100 goto _ 101: goto 102 102: if x> 200 goto 104 103: goto _ ε.t = {102}.f = {103} x > 200 and M ε 104: if x!=y goto _ 105: goto _ M.i = 104 x!= y.t = {104}.f = {105} t 1 or M 2 backpatch(1.falselist,m.instr);.truelist = merge(1.truelist,2.truelist);.falselist = 2.falseList; 52

xample 100: if x<100 goto _ 101: goto _ 102: if x>200 goto _ 103: goto _ 104: if x!=y goto _ 105: goto _ 100: if x<100 goto _ 101: goto _ 102: if x>200 goto 104 103: goto _ 104: if x!=y goto _ 105: goto _ 100: if x<100 goto _ 101: goto 102 102: if x>200 goto 104 103: goto _ 104: if x!=y goto _ 105: goto _ efore backpatching A_er backpatching by the produc2on t 1 and M 2 A_er backpatching by the produc2on t 1 or M 2 53

ackpatching for statements produc(on S t if () M S1 S t if () M1 S1 N else M2 S2 S t while M1 () M2 S1 S t { L } S t A M t ε N t ε L t L1 M S L t S seman(c ac(on backpatch(.truelist,m.instr); S.nextList = merge(.falselist,s1.nextlist); backpatch(.truelist,m1.instr); backpatch(.falselist,m2.instr); temp = merge(s1.nextlist,n.nextlist); S.nextList = merge(temp,s2.nextlist); backpatch(s1.nextlist,m1.instr); backpatch(.truelist,m2.instr); S.nextList =.falselist; emit( goto M1.instr); S.nextList = L.nextList; S.nextList = null; M.instr = nex2nstr; N.nextList = makelist(nextinstr); emit( goto _ ); backpatch(l1.nextlist,m.instr); L.nextList = S.nextList; L.nextList = S.nextList 54

Generate code for procedures n = f(a[i]); t1 = i * 4 t2 = a[t1] // could have expanded this as well param t2 t3 = call f, 1 n = t3 we will see handling of procedure calls in much more detail later 55

xtend grammar for procedures D t define T id (F) { S } F t ε T id, F S t return ; t id (A) A t ε, A statements expressions type checking function type: return type, type of formal parameters within an expression function treated like any other operator symbol table parameter names 56

Summary Three address code is our IR. Intermediate code generation is executed while parsing (via semantic actions). Creating code for oolean expressions and for control statements is more involved. We typically use short circuit evaluation, value of an expression is implicit in control location. We need to compute the branching addresses. Option 1: compute them in a second AST pass. Option 2: backpatching (a single pass): maintain lists of incomplete jumps, where all jumps in a list have the same target. When the target becomes known, all instructions on its list are filled in. 57

Recap Lexical analysis regular expressions identify tokens ( words ) Syntax analysis context-free grammars identify the structure of the program ( sentences ) Contextual (semantic) analysis type checking defined via typing judgements can be encoded via attribute grammars Syntax directed translation attribute grammars Intermediate representation many possible IRs generation of intermediate representation 3AC 58

Journey inside a compiler float posi2on; float ini2al; float rate; posi2on = ini2al + rate * 60 Token Stream <float> <ID,posi2on> <;> <float> <ID,ini2al> <;> <float> <ID,rate> <;> <ID,1> <=> <ID,2> <+> <ID,3> <*> <60> Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. 59

Journey inside a compiler <ID,1> <=> <ID,2> <+> <ID,3> <*> <60> symbol table id symbol type data = AST 1 posi2on float 2 ini2al float <id,1> + 3 rate float <id,2> * S t ID = t ID + * NUM <id,3> 60 Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. 60

Journey inside a compiler = AST = AST symbol table id symbol type <id,1> + <id,1> + 1 posi2on float 2 ini2al float <id,2> * <id,2> * 3 rate float invofloat <id,3> 60 <id,3> 60 coercion: automa2c conversion from int to float inserted by the compiler Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. 61

Journey inside a compiler id1 = t3 <id,1> = <id,2> + * t3 = id2 * t2 3AC t1 = inttofloat(60) t2 = id3 * t1 t3 = id2 + t2 id1 = t3 t2 = id3 * t1 <id,3> invofloat 60 t1 = invofloat(60) produc(on S t id = t 1 op 2 t invofloat(num) seman(c rule S.code :=. code gen(id.var :=.var).var := freshvar();.code = 1.code 2.code gen(.var := 1.var op 2.var).var := freshvar();.code = gen(.var := invofloat(num)) t id.var := id.var;.code = Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. (for brevity, bubbles show only code generated by the node and not all accumulated code avribute) note the structure: translate 1 translate 2 handle operator 62

Journey inside a compiler 3AC Op2mized t1 = inttofloat(60) t2 = id3 * t1 t3 = id2 + t2 id1 = t3 t1 = id3 * 60.0 id1 = id2 + t1 value known at compile 2me can generate code with converted value eliminated temporary t3 Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. 63

Journey inside a compiler Op2mized Code Gen t1 = id3 * 60.0 id1 = id2 + t1 LDF R2, id3 MULF R2, R2, #60.0 LDF R1, id2 ADDF R1,R1,R2 STF id1,r1 Lexical Analysis Syntax Analysis Sem. Analysis Inter. Rep. Code Gen. 64

Problem 3.8 from [Appel] A simple left-recursive grammar: + id id A simple right-recursive grammar accepting the same language: id + id Which has better behavior for shift-reduce parsing? 65

Answer Input id+id+id+id+id id (reduce) + + id (reduce) + + id (reduce) + + id (reduce) + + id (reduce) + id id le_ recursive stack The stack never has more than three items on it. In general, with LR-parsing of left-recursive grammars, an input string of length O(n) requires only O(1) space on the stack. 66

Answer Input id+id+id+id+id id + id right recursive id id + id + id id + id + id + id + id id + id + id id + id + id + id id + id + id + id + id + id + id + id + id (reduce) id + id + id + id + (reduce) id + id + id + (reduce) id + id + (reduce) id + (reduce) stack The stack grows as large as the input string. In general, with LR-parsing of right-recursive grammars, an input string of length O(n) requires O(n) space on the stack. 67

Next Runtime nvironments 68