Computing Inside The Parser Syntax-Directed Translation, II. Comp 412


COMP 412, Fall 2018. Computing Inside The Parser: Syntax-Directed Translation, II. Chapter 4 in EaC2e. (Figure: source code → Front End → IR → Optimizer → IR → Back End → target code.) Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Midterm Exam: Thursday, October 18, 7 PM, Herzstein Amphitheater. The midterm will cover class content through the last class: Chapters 1, 2, 3, and 5 in the book, plus the local register allocation material from the Chapter 13 excerpt. The format will be closed book, closed notes, closed devices, designed as a two-hour exam. Review session: Tuesday, October 16, 7 PM, location TBA. Bring your questions.

Review Example: Computing the value of an unsigned integer. Consider the simple grammar:

    1 Number    → digit DigitList
    2 DigitList → digit DigitList
    3           → epsilon

One obvious use of the grammar is to convert the ASCII string for a number into its integer value. Build the computation into the parser: an easy introduction to syntax-directed translation. SDT in a top-down, recursive-descent parser:

    pair(boolean, int) Number(value) {
      if (word == digit) then {
        value = ValueOf(digit);
        word = NextWord();
        return DigitList(value);
      }
      else return (false, invalid value);
    }

    pair(boolean, int) DigitList(value) {
      if (word == digit) then {
        value = value * 10 + ValueOf(digit);
        word = NextWord();
        return DigitList(value);
      }
      else return (true, value);
    }
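The recursive-descent SDT above can be rendered as runnable code. This is a minimal Python sketch, not the slide's exact pseudocode: the scanner is just an index into the input string, and parse_number / digit_list are illustrative names standing in for Number() and DigitList().

```python
# A sketch of the Number/DigitList scheme: the value is accumulated as
# the recursion descends, digit by digit, exactly as in the slide's code.
def parse_number(s):
    """Return (ok, value) for an unsigned-integer string."""
    def digit_list(i, value):
        if i < len(s) and s[i].isdigit():
            # value = value * 10 + ValueOf(digit); advance the "scanner"
            return digit_list(i + 1, value * 10 + int(s[i]))
        # epsilon production: succeed only if all input was consumed
        return (True, value) if i == len(s) else (False, None)
    if s and s[0].isdigit():
        return digit_list(1, int(s[0]))
    return (False, None)
```

Unlike the slide's version, this sketch also rejects trailing non-digit characters, since it has no surrounding parser to hand them to.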

Review: SDT in a Bottom-Up Parser (e.g., LR(1)). We specify SDT actions relative to the syntax:

    1 Number    → DigitList        { value(LHS) = value(DL); }
    2 DigitList → DigitList digit  { value(LHS) = value(DL) * 10 + value(digit); }
    3           → digit            { value(LHS) = value(digit); }

The compiler writer can attach actions to productions and have those actions execute each time the parser reduces by that production. It is a simple mechanism that ties computation to syntax. It needs storage associated with each symbol in each production, and a scheme to name that storage.

Review: SDT in Bison or Yacc. In Bison or Yacc, we specify SDT actions using a simple notation:

    1 Number    → DigitList        { $$ = $1; }
    2 DigitList → DigitList digit  { $$ = $1 * 10 + $2; }
    3           → digit            { $$ = $1; }

The scanner provides the value of digit. The compiler writer provides production-specific code snippets that execute when the parser reduces by that production. Positional notation names the value associated with each symbol: $$ is the LHS; $1 is the first symbol in the RHS; $2 the second, and so on. The compiler writer can put arbitrary code in these snippets: solve a travelling salesman problem, compute pi to 100 digits, ... More importantly, the snippets can compute on the lexemes of grammar symbols and on information derived and stored earlier in translation. How does this fit into the LR(1) skeleton parser?

Fitting AHSDT into the LR(1) Skeleton Parser:

    stack.push(invalid);
    stack.push(s0);                      // initial state
    word = scanner.next_word();
    loop forever {
      s = stack.top();
      if (ACTION[s,word] == "reduce A→b") then {
        stack.popnum(2 * |b|);           // pop 2*|b| symbols
        s = stack.top();
        stack.push(A);                   // push LHS, A
        stack.push(GOTO[s,A]);           // push next state
      }
      else if (ACTION[s,word] == "shift si") then {
        stack.push(word);
        stack.push(si);
        word = scanner.next_word();
      }
      else if (ACTION[s,word] == "accept" && word == EOF) then
        break;
      else throw a syntax error;
    }
    report success;

Actions are taken on reductions. Insert a call to Work() before the call to stack.popnum(). Work() contains a case statement that switches on the production number. Code in Work() can read items from the stack; that is why the parser calls Work() before stack.popnum().

Fitting AHSDT into the LR(1) Skeleton Parser, continued: passing values between actions. Tie values to instances of grammar symbols; an instance is equivalent to a parse tree node. We can pass values on the stack: push and pop 3 items per symbol rather than 2. Work() takes the stack as input (conceptually) and returns the value for the reduction it processes. Shift creates initial values.

Fitting AHSDT into the LR(1) Skeleton Parser, with values on the stack:

    stack.push(invalid);
    stack.push(s0);                      // initial state
    word = scanner.next_word();
    loop forever {
      s = stack.top();
      if (ACTION[s,word] == "reduce A→b") then {
        r = Work(stack, "A→b");
        stack.popnum(3 * |b|);           // pop 3*|b| symbols
        s = stack.top();                 // save exposed state
        stack.push(A);                   // push A
        stack.push(r);                   // push result of Work()
        stack.push(GOTO[s,A]);           // push next state
      }
      else if (ACTION[s,word] == "shift si") then {
        stack.push(word);
        stack.push(initial value);
        stack.push(si);
        word = scanner.next_word();
      }
      else if (ACTION[s,word] == "accept" && word == EOF) then
        break;
      else throw a syntax error;
    }
    report success;

The modifications are minor: insert the call to Work(), and change the push() and pop() behavior. The algorithm keeps the same asymptotic behavior as the original, with 50% more stack space. The last obstacle is making it easy to write the code for Work(). Note that, in C, the stack has some odd union type.

Translating Code Snippets Into Work(). For each production, the compiler writer can provide a code snippet, such as { value = value * 10 + digit; }. We need a scheme to name stack locations. Yacc introduced a simple one that has been widely adopted: $$ refers to the result, which will be pushed onto the stack; $1 is the first item on the production's right-hand side; $2 is the second item; $3 is the third item, and so on. The digits example above becomes { $$ = $1 * 10 + $2; }.

Translating Code Snippets Into Work(): how do we implement Work()? Work() takes two arguments: the stack and a production number. Work() contains a case statement that switches on the production number; each case contains the code snippet for a reduction by that production. The $1, $2, $3 macros translate into references into the stack, and the $$ macro translates into the return value. In the reduce case of the skeleton parser, $$ translates to r, the value returned by Work(), and $i translates to the stack location 3 * (|b| - i + 1) units down from the stack top. Note that |b|, i, 3, and 1 are all constants, so $i can be evaluated to a compile-time constant.
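The translation of $i into a fixed stack offset can be made concrete. This is a minimal Python sketch, assuming the stack holds three slots per symbol (symbol, value, state) as on the previous slides; val(), work(), and the production numbers are illustrative, taken from the DigitList grammar.

```python
# A sketch of Work() for the DigitList grammar. The stack is a flat list
# with three slots per symbol: symbol, value, state.
def val(stack, rhs_len, i):
    """Translate $i: the value slot of the i-th RHS symbol on the stack."""
    return stack[len(stack) - 3 * (rhs_len - i + 1) + 1]

def work(stack, prod):
    """Case statement on the production number; returns the $$ value."""
    if prod == 2:                        # DigitList -> DigitList digit
        return val(stack, 2, 1) * 10 + val(stack, 2, 2)
    if prod == 3:                        # DigitList -> digit
        return val(stack, 1, 1)
    return val(stack, 1, 1)              # Number -> DigitList (copy rule)

# Stack just before reducing by production 2: DigitList (value 14) and
# digit (value 3) sit on top of the bottom-of-stack marker.
stack = ["$", None, 0, "DigitList", 14, 1, "digit", 3, 2]
result = work(stack, 2)                  # the skeleton would push this back
```

Because rhs_len and i are constants for each case, a real Yacc-style generator folds each val() call into a fixed index expression at table-construction time.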

SDT is a Mechanism. AHSDT allows the compiler writer to specify syntax-driven computations: the compiler writer ties computation to syntax, and the parser determines when to use each computation. Uses include: build an IR representation (critical for later phases of the compiler); execute it directly to create an interpreter (calculators, accounting systems, command-line shells, ...); analyze it to derive knowledge (optimizing compilers, theorem provers, ...); transform it and re-generate source code (automatic parallelization systems, code refactoring tools, ...); measure and report on program properties. Key point: the mechanism does not determine how you use it. Corollary: the syntax does not dictate the form or content of the IR.


Example: Building an Abstract Syntax Tree.

    1 Goal   → Expr            { $$ = $1; }
    2 Expr   → Expr + Term     { $$ = MakeAddNode($1,$3); }
    3        → Expr - Term     { $$ = MakeSubNode($1,$3); }
    4        → Term            { $$ = $1; }
    5 Term   → Term * Factor   { $$ = MakeMulNode($1,$3); }
    6        → Term / Factor   { $$ = MakeDivNode($1,$3); }
    7        → Factor          { $$ = $1; }
    8 Factor → ( Expr )        { $$ = $2; }
    9        → number          { $$ = MakeNumNode(token); }
    10       → ident           { $$ = MakeIdNode(token); }

Rules 2, 3, 5, and 6 create interior nodes; rules 9 and 10 create leaf nodes. Assumptions: constructors exist for each kind of node, and the stack holds pointers to nodes. Rules 1, 4, 7, and 8 are just copy rules that pass values upward in the derivation; we call $$ = $1; a copy rule. Rule 8 only exists to enforce precedence (evaluation order).
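The reduce actions can be replayed by hand to show the tree they build for x - 2 * y. This is a minimal Python sketch in which nodes are plain tuples; the Make*Node names follow the rule set above, but their tuple representation is an assumption.

```python
# A sketch of the AST-building actions; each constructor returns a node.
def MakeIdNode(name):   return ("id", name)
def MakeNumNode(value): return ("num", value)
def MakeMulNode(l, r):  return ("mul", l, r)
def MakeSubNode(l, r):  return ("sub", l, r)

# Replaying the reduce sequence for "x - 2 * y" by hand:
x = MakeIdNode("x")                     # reduce 10; rules 7 and 4 copy it up
prod = MakeMulNode(MakeNumNode(2),      # reduce 9 for "2",
                   MakeIdNode("y"))     # reduce 10 for "y", then reduce 5
ast = MakeSubNode(x, prod)              # reduce 3 builds the root
```

The copy rules contribute nothing to the tree; they only move the pointer upward as the parser climbs from Factor to Term to Expr.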

Example: Building an Abstract Syntax Tree. Consider x - 2 * y. Trace of the LR(1) parse, with the detail of states abstracted and the shift steps between reductions elided:

    Stack                    | Input          | Action
    $                        | id - num * id  | shift
    $ id                     | - num * id     | reduce 10
    $ Factor                 | - num * id     | reduce 7
    $ Term                   | - num * id     | reduce 4
    $ Expr - num             | * id           | reduce 9
    $ Expr - Factor          | * id           | reduce 7
    $ Expr - Term * id       |                | reduce 10
    $ Expr - Term * Factor   |                | reduce 5
    $ Expr - Term            |                | reduce 3
    $ Expr                   |                | accept

Expression Grammar:

    1 Goal   → Expr
    2 Expr   → Expr + Term
    3        → Expr - Term
    4        → Term
    5 Term   → Term * Factor
    6        → Term / Factor
    7        → Factor
    8 Factor → ( Expr )
    9        → num
    10       → id

The subsequent animation frames replay this trace against the AST-building rules. Reduce 10 creates the leaf <id,x>; reduces 7 and 4 are copy rules that pass it upward unchanged; reduce 9 creates the leaf <num,2>; reduce 10 creates <id,y>; reduce 5 calls MakeMulNode to build the * node over <num,2> and <id,y>; and reduce 3 calls MakeSubNode to build the - node at the root. AHSDT works: the parser built the AST, the same tree as on the earlier slide. Some reduce actions just copied values; others called constructors.

Example: Emitting ILOC.

    1 Goal   → Expr
    2 Expr   → Expr + Term    { $$ = NextRegister(); Emit(add, $1, $3, $$); }
    3        → Expr - Term    { $$ = NextRegister(); Emit(sub, $1, $3, $$); }
    4        → Term           { $$ = $1; }
    5 Term   → Term * Factor  { $$ = NextRegister(); Emit(mult, $1, $3, $$); }
    6        → Term / Factor  { $$ = NextRegister(); Emit(div, $1, $3, $$); }
    7        → Factor         { $$ = $1; }
    8 Factor → ( Expr )       { $$ = $2; }
    9        → number         { $$ = NextRegister(); Emit(loadI, Value(lexeme), $$); }
    10       → ident          { $$ = NextRegister(); EmitLoad(ident, $$); }

Assumptions: NextRegister() returns a virtual register name; Emit() can format assembly code; EmitLoad() handles addressability and gets a value into a register. Copy rules appear on the same productions as in the last example.
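The emitting actions can likewise be replayed by hand. This is a minimal Python sketch, assuming a global instruction buffer and a simple virtual-register counter; the bodies of NextRegister(), Emit(), and EmitLoad() are illustrative, and the "=>" formatting merely mimics ILOC's assignment arrow.

```python
# A sketch of the code-emitting actions for "x - 2 * y".
code, next_vr = [], [0]

def NextRegister():
    next_vr[0] += 1
    return f"vr{next_vr[0]}"

def Emit(op, *args):
    # last argument is the destination; the rest are sources
    dst, srcs = args[-1], ", ".join(str(a) for a in args[:-1])
    code.append(f"{op} {srcs} => {dst}")

def EmitLoad(ident, dst):
    # assume every ident lives at a known offset @ident from base pr0
    code.append(f"loadAI pr0, @{ident} => {dst}")

# Replaying the reduce actions (copy rules emit nothing):
r1 = NextRegister(); EmitLoad("x", r1)             # reduce 10 for x
r2 = NextRegister(); Emit("loadI", 2, r2)          # reduce 9 for 2
r3 = NextRegister(); EmitLoad("y", r3)             # reduce 10 for y
r4 = NextRegister(); Emit("mult", r2, r3, r4)      # reduce 5
r5 = NextRegister(); Emit("sub", r1, r4, r5)       # reduce 3
```

Running the replay leaves the five-operation sequence from the slide in the code buffer, in reduction order.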

Example: Emitting ILOC. Same parse as in the AST example, with a different rule set; the trace of the LR(1) parse is identical. The actions that generated AST leaves and the actions that generated AST interior nodes now emit code instead. With pr0 holding a base address and NextRegister() returning the next virtual register, the parse emits:

    loadAI pr0, @x  => vr1
    loadI  2        => vr2
    loadAI pr0, @y  => vr3
    mult   vr2, vr3 => vr4
    sub    vr1, vr4 => vr5

The animation frames step through the same trace: the copy-rule reductions (7 and 4) emit nothing, while reductions 10, 9, 5, and 3 each allocate a new virtual register and emit one operation, producing the five-operation sequence shown. The result is simple but clean code; EmitLoad() hides all of the messy details of naming, addressability, and address modes.

Handling multI with AHSDT. (Figure: the syntax tree for x - 2 * y, with leaves <id,x>, <num,2>, and <id,y> under their Factor nodes.) Rule 5 must cope with multiple scenarios. If the Factor is an id, then the rules need to pass a register upward. If the Factor is a num, then the rules need to return either a register or an immediate value, depending on the magnitude of the num (will it fit in an immediate field?). Rule 5 must recognize the choices, decide, and generate the better code, e.g., a single multI rather than a loadI followed by a mult. This kind of computation makes the AHSDT rules complex and context dependent. We cannot refactor the grammar to handle this situation; at least, I have not seen how to do it in a reasonable manner.

    loadAI pr0, @x  => vr1
    loadI  2        => vr2
    loadAI pr0, @y  => vr3
    mult   vr2, vr3 => vr4
    sub    vr1, vr4 => vr5

Handling multI outside of AHSDT. Two simple strategies generate a multI operation outside of AHSDT. First, generate the simpler code and improve it in the optimizer; this kind of rewrite is easily handled during optimization, and we will see the issue again when we discuss value numbering (Chapter 8). Second, keep a short buffer of emitted operations (3 or 4) and fix them after the fact; if the buffer is short, the cost is small and constant, and we will see the issue again when we discuss peephole optimization (Chapter 11). Each of these strategies achieves the same result without creating complicated code in the middle of the AHSDT actions, and each achieves more general benefits.
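The second strategy, a short buffer of emitted operations, can be sketched as follows. This is an illustrative Python sketch, not the book's peephole optimizer: operations are (op, src1, src2, dst) tuples, emit() is a hypothetical helper, and the rewrite shown folds a loadI that feeds a mult into a single multI.

```python
# A sketch of the fix-up buffer: when a mult's operand was defined by a
# loadI still sitting in the buffer, replace the pair with one multI.
BUFFER = []

def emit(op, s1, s2, dst):
    if op == "mult":
        for i, (bop, b1, _, bdst) in enumerate(BUFFER):
            if bop == "loadI" and bdst in (s1, s2):
                other = s2 if bdst == s1 else s1   # the register operand
                del BUFFER[i]                      # drop the loadI
                BUFFER.append(("multI", other, b1, dst))
                return
    BUFFER.append((op, s1, s2, dst))

# The sequence from the example: loadI 2 feeds the mult, so it folds.
emit("loadI", 2, None, "vr2")
emit("loadAI", "pr0", "@y", "vr3")
emit("mult", "vr2", "vr3", "vr4")
```

A real peephole pass would also bound the buffer's length and check that the folded register has no other uses; both checks are omitted here for brevity.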

What About EmitLoad()? EmitLoad() hides lots of details. It needs to map an ident to a location: a register, a symbolic address, or a formula to compute a virtual address. That mapping implies that the ident has allocated storage; someone or something needs to lay out the contents of data memory. It also needs to generate code to compute the address: a register becomes a direct reference; a symbolic address becomes a loadI/load sequence (as in Lab 1); a formula becomes a more complex expression. The formula may be an offset from some known point, a chain of one or more pointers, an indirect reference through a class definition, ... We need to deal with the issue of storage layout, but not today.

Code Generation. In a modern, multi-IR compiler, code generation happens several times. (Figure: source code → Front End → IR 1 → Optimizer → IR 2 → Optimizer → IR 3 → Back End → target code, with a symbol table (ST) at each stage; the Front End performs SDT, and the later stages are lowering passes.) The compiler generates an IR straight out of the parser: it might be an AST, or it might be some high-level (abstract) linear form, and it is almost always accompanied by a symbol table of some form. The code then passes through one or more lowering passes. A lowering pass takes the code and decreases ("lowers") its level of abstraction: it expands complex operations (e.g., call or mvcl) and makes control flow explicit. The problems are, essentially, the same; the mechanisms are likely quite different.

Mechanisms for Code Generation. In different contexts, the compiler might use different techniques. In an LR(1) parser, SDT might be the right tool: generate a tree, a graph, or linear code, with actions driven by the syntax of the input code. In a lowering pass, traversing the IR might be necessary: for a graphical IR, a treewalk or graph-walk that produces the new IR; for a linear IR, a walk through the IR in one direction or the other. In instruction selection, pattern matching is the tool of choice; the set of choices, particularly address modes, makes this problem complex, and the resulting code is not subject to subsequent optimization. The context makes these tasks subtly different: early in compilation, generate code that will optimize well; late in compilation, generate code that will run well.

Handling Identifiers. For each identifier and each literal constant, the compiler must know: its type, size, and lifetime (tied to a block, procedure, file, or the entire program); its dimension (scalar, structure, array, array of structures, ...); and its location (a register, an address in RAM, or an offset from some knowable address). Most of that information can be derived from declarations; the programmer specifies her intent for each named identifier in the code. Literal constants can be inferred from syntax and context. Fragment of a typical grammar:

    Decl     → Type SpecList
    Type     → INT | CHAR
    SpecList → SpecList , Spec | Spec
    Spec     → NAME

The datatype is defined in the production for Type; the datatype for a name is set in the production(s) for Spec. An AHSDT action must carry the value from Type to Spec.

Processing Declarations. Consider the following declaration: int x, y, z;. (Figure: the corresponding parse tree, with Decl at the root, Type deriving INT, and a SpecList spine whose Spec children derive <NAME,x>, <NAME,y>, and <NAME,z>.) The compiler needs to get INT from the Type node to the NAME nodes. That requires propagation across long distances, both up and down in the syntax tree. AHSDT rules don't model this kind of value propagation well.

Processing Declarations, continued. For int x, y, z;, we can add action rules to move the type upward:

    Decl     → Type SpecList
    Type     → INT               { $$ ← INT; }
             → CHAR              { $$ ← CHAR; }
    SpecList → SpecList , Spec
             → Spec
    Spec     → NAME

But we cannot get the value back down the SpecList structure to the NAMEs.

Processing Declarations, continued. We need to reach outside the AHSDT framework. AHSDT only provides local names ($$, $1, $2, ...), but here we need to pass values from one rule to another: the framework passes values up the tree, so we reach outside it to pass values down or across the tree, using a global variable.

Processing Declarations, continued. This technique is ugly, but effective. Using an outside variable, CUR:

    Decl     → Type SpecList
    Type     → INT               { CUR ← INT; }
             → CHAR              { CUR ← CHAR; }
    SpecList → SpecList , Spec
             → Spec
    Spec     → NAME              { set the type of NAME to CUR, in the compiler's symbol table; }
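The CUR technique can be sketched as straight-line Python. The sketch relies on LR reduce order: INT is the leftmost symbol of the declaration, so Type reduces before any Spec; symtab, CUR, and the reduce_* helpers are illustrative stand-ins for the parser's action code.

```python
# A sketch of the global-variable technique for "int x, y, z;".
symtab, CUR = {}, [None]

def reduce_type(keyword):
    # action on Type -> INT | CHAR: stash the type in the outside variable
    CUR[0] = keyword

def reduce_spec(name):
    # action on Spec -> NAME: record the type carried in CUR
    symtab[name] = CUR[0]

# Replaying the reduce order: Type first, then one Spec per name.
reduce_type("INT")
for name in ("x", "y", "z"):
    reduce_spec(name)
```

The ugliness the slide warns about is visible even here: nothing in reduce_spec's interface says it depends on CUR having been set by an earlier, distant reduction.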

Processing Declarations. Another way to handle this kind of problem is to refactor the grammar. We tend to think of a grammar from the language viewpoint: what syntax does it generate? We can also think of a grammar as a computational vehicle: what does it allow us to compute?

    Decl     → INT IntList | CHAR CharList
    IntList  → IntList , Spec | Spec
    CharList → CharList , Spec | Spec
    Spec     → NAME

Processing Declarations, continued. (Figure: the AST for the example, with Decl over INT and an IntList spine covering x, y, and z.) With the refactored grammar, the type is implicit in the rule: actions on IntList productions can use INT as a compile-time constant, and actions on CharList productions can use CHAR. This avoids the opaque use of a global, though the actions must still record the types in the symbol table. The grammar is shaped for the computation, not just for the language definition, which leads to cleaner, more obvious, more maintainable code.
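Under the refactored grammar, the same bookkeeping needs no global, which a short sketch makes plain. The reduce_* helpers and symtab are again illustrative; the point is that each list production knows its type as a compile-time constant.

```python
# A sketch of actions under the refactored grammar: the type is baked
# into the production that fires, so no outside variable is needed.
symtab = {}

def reduce_int_spec(name):
    # action on IntList -> IntList , Spec | Spec: type is the constant INT
    symtab[name] = "INT"

def reduce_char_spec(name):
    # action on CharList -> CharList , Spec | Spec: type is the constant CHAR
    symtab[name] = "CHAR"

# Replaying "int x, y, z;": every reduction goes through IntList.
for name in ("x", "y", "z"):
    reduce_int_spec(name)
```

Compared with the CUR sketch, each helper is self-contained: its behavior no longer depends on a distant reduction having run first.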