CS 406: Syntax Directed Translation

Similar documents
Syntax Directed Translation

Syntax-Directed Translation. Lecture 14

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

A Simple Syntax-Directed Translator

Syntax-Directed Translation

Syntax-Directed Translation

[Syntax Directed Translation] Bikash Balami

A programming language requires two major definitions A simple one pass compiler

Syntax Directed Translation

Syntax-Directed Translation. Introduction

Syntactic Directed Translation

LR Parsing LALR Parser Generators

Recursive Descent Parsers

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

Syntax-Directed Translation

CS 403: Scanning and Parsing

Abstract Syntax Trees & Top-Down Parsing

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

LR Parsing LALR Parser Generators

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Compilers. 5. Attributed Grammars. Laszlo Böszörmenyi Compilers Attributed Grammars - 1

Yacc: A Syntactic Analysers Generator

Syntax-directed translation. Context-sensitive analysis. What context-sensitive questions might the compiler ask?

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

COP4020 Spring 2011 Midterm Exam

Object-oriented Compiler Construction

LECTURE 3. Compiler Phases

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Syntax-Directed Translation Part II

Summary: Semantic Analysis

Lecture 8: Deterministic Bottom-Up Parsing

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 97 INSTRUCTIONS

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans

CJT^jL rafting Cm ompiler

Lecture 7: Deterministic Bottom-Up Parsing

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Syntax Analysis Part I

LL parsing Nullable, FIRST, and FOLLOW

Introduction to Parsing. Lecture 8

COP4020 Programming Languages. Semantics Prof. Robert van Engelen

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

More On Syntax Directed Translation

Compiler Principle and Technology. Prof. Dongming LU April 15th, 2019

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

CSCI312 Principles of Programming Languages!

Building Compilers with Phoenix

Lecture 14 Sections Mon, Mar 2, 2009

Top down vs. bottom up parsing

CS606- compiler instruction Solved MCQS From Midterm Papers

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

CSE P 501 Exam 11/17/05 Sample Solution

Syntax-Directed Translation. Concepts Introduced in Chapter 5. Syntax-Directed Definitions

Compiler Design Overview. Compiler Design 1

COMPILER DESIGN - QUICK GUIDE COMPILER DESIGN - OVERVIEW

Unit 13. Compiler Design

Abstract Syntax Trees Synthetic and Inherited Attributes

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

CSE 12 Abstract Syntax Trees

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Compiler Construction

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Syntax-Directed Translation Part I

CSE 582 Autumn 2002 Exam 11/26/02

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Parsing CSCI-400. Principles of Programming Languages.

CS 4120 Introduction to Compilers

Project Compiler. CS031 TA Help Session November 28, 2011

Compiler Theory. (Semantic Analysis and Run-Time Environments)

Using an LALR(1) Parser Generator

Question Points Score

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

We now allow any grammar symbol X to have attributes. The attribute a of symbol X is denoted X.a

Parsing Techniques. AST Review. AST Data Structures. LL AST Construction. AST Construction CS412/CS413. Introduction to Compilers Tim Teitelbaum

Downloaded from Page 1. LR Parsing

Compiler Construction

More Assigned Reading and Exercises on Syntax (for Exam 2)

Context-sensitive analysis. Semantic Processing. Alternatives for semantic processing. Context-sensitive analysis

COP4020 Programming Languages. Semantics Robert van Engelen & Chris Lacher

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

CS 536 Midterm Exam Spring 2013

Time : 1 Hour Max Marks : 30

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

QUESTIONS RELATED TO UNIT I, II And III

2.2 Syntax Definition

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

COP5621 Exam 3 - Spring 2005

Chapter 4 :: Semantic Analysis

Transcription:

CS 406: Syntax Directed Translation Stefan D. Bruda Winter 2015

SYNTAX DIRECTED TRANSLATION Syntax-directed translation the source language translation is completely driven by the parser The parsing process and parse trees/ast used to direct semantic analysis and the translation of the source program Separate phase of a compiler or grammar augmented with information to control the semantic analysis and translation (attribute grammars) Attribute grammars associate attributes with each grammar symbol An attribute has a name and an associated value: string, number, type, memory location, register whatever information we need. Examples Attributes for a variable include type (as declared, useful later in type-checking) An integer constant will have an attribute value (used later to generate code) With each grammar rule we also give semantic rules or actions, describing how to compute the attribute values associated with each grammar symbol in the rule An attribute value for a parse node may depend on information from its children nodes, its siblings, and its parent CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 1 / 14

ATTRIBUTE GRAMMARS AND ACTIONS Grammar Action(s) int ::= digit { int 0.value = digit.value; } int digit { int 0.value = int 1.value 10 + digit.value; } digit ::= 0 { digit.value = 0; } 1 { digit.value = 1; } 2 { digit.value = 2; }... 9 { digit.value = 9; } Attributes are computed during the construction of the parse tree and are typically included in the node objects of that tree Two general classes of attributes: Synthesized: passed up in the parse tree Inherited: passed down the parse tree CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 2 / 14

ATTRIBUTES Synthesized attributes: the left hand-side attribute is computed from the right hand-side attributes X ::= Y 1 Y 2... Y n X.a = f (Y 1.a, Y 2.a,..., Y n.a) The lexical analyzer supplies the attributes of terminals The attributes for nonterminals are built up for the nonterminals and passed up the tree int e = 4 10+ 2 = 42 int e = 4 digit e = 4 4 digit e = 2 Inherited attributes: the right hand-side attributes are derived from the left hand-side attributes or other right hand-side attributes X ::= Y 1 Y 2... Y n Y k.a = f (X.a, Y 1.a, Y 2.a,..., Y k 1.a, Y k+1.a,..., Y n.a) Used for passing information about the context to nodes further down the tree CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 3 / 14 2

ATTRIBUTES (CONT D) P ::= D S { S.dl = D.dl; } D ::= var V ; D { D 0.dl = addlist( V.name, D 1.dl); } ε { D 0.dl = NULL; } S ::= V := E ; S {check( V.name, S 0.dl); S 1.dl = S 0.dl; } ε {} V ::= x { V.name = x ; } y { V.name = y ; } z { V.name = z ; } CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 4 / 14

ATTRIBUTES (CONT D) P ::= D S { S.dl = D.dl; } D ::= var V ; D { D 0.dl = addlist( V.name, D 1.dl); } ε { D 0.dl = NULL; } S ::= V := E ; S {check( V.name, S 0.dl); S 1.dl = S 0.dl; } ε {} V ::= x { V.name = x ; } y { V.name = y ; } z { V.name = z ; } Two attributes: name for the name of the variable and dl for the list of declarations CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 4 / 14

ATTRIBUTES (CONT D) P ::= D S { S.dl = D.dl; } D ::= var V ; D { D 0.dl = addlist( V.name, D 1.dl); } ε { D 0.dl = NULL; } S ::= V := E ; S {check( V.name, S 0.dl); S 1.dl = S 0.dl; } ε {} V ::= x { V.name = x ; } y { V.name = y ; } z { V.name = z ; } Two attributes: name for the name of the variable and dl for the list of declarations Each time a new variable is declared a synthesized attribute for its name is attached to it That name is added to a list of variables declared so far in the synthesized attribute dl created from the declaration block CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 4 / 14

ATTRIBUTES (CONT D) P ::= D S { S.dl = D.dl; } D ::= var V ; D { D 0.dl = addlist( V.name, D 1.dl); } ε { D 0.dl = NULL; } S ::= V := E ; S {check( V.name, S 0.dl); S 1.dl = S 0.dl; } ε {} V ::= x { V.name = x ; } y { V.name = y ; } z { V.name = z ; } Two attributes: name for the name of the variable and dl for the list of declarations Each time a new variable is declared a synthesized attribute for its name is attached to it That name is added to a list of variables declared so far in the synthesized attribute dl created from the declaration block The list of variables is then passed as an inherited attribute to the statements following the declarations so that it can be checked that variables are declared before use CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 4 / 14

ATTRIBUTES (CONT D) Most programming languages require both synthesized and inherited attributes A given style of parsing favors attribute flow in one direction Top-down parsing deals trivially with inherited attributes Bottom-up parsing deals trivially with synthesized attributes The other direction is handled using other techniques For example, a symbol table is often used to pass attributed back and forth irrespective of the direction favored by any particular parsing method CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 5 / 14

ATTRIBUTE IMPLEMENTATION Typically handling of attributes: associate with each symbol either member variables in the AST node structure or some sort of structure (e.g., list) with all the necessary attributes If we have a list then we store it as a member variable in each node structure Associate code to the processing of each nonterminal to carry on the attribute computations Also need some convention for referring to individual symbols in a rule while defining the associated action Typical convention in compiler generators: $$ to refer to the left hand side and $i to refer to the i-th component of the right hand side: P -> DS { $$.list = $1.list; } D -> var V; D { $$.list = add_to_list($2.name, $4.list); } { $$.list = NULL; } S -> V := E; S { check($1.name, $$.list); $5.list = $$.list; } V -> x { $$.name = "x"; } y { $$.name = "y"; } z { $$.name = "z"; } CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 6 / 14

BOTTOM-UP SYNTAX DIRECTED TRANSLATION Consider a LR parser ready to reduce using A ::= X 1... X n The synbols X i are on the stack before the reduction Previous reductions have associated semantic values (attributes) to these symbols They are then popped and A is pushed in their place While we do this, we execute some code that compute the attribute valued for A In effect we have a syntactic stack (for the actual parsing) and a semantic stack (for the semantic values) CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 7 / 14

ISSUES IN BOTTOM-UP SYNTAX DIRECTED TRANSLATION digit ::= 0 1... 9 int ::= digit int digit num ::= o int int We require that the o-prefixed numbers be evaluated in octal CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 8 / 14

ISSUES IN BOTTOM-UP SYNTAX DIRECTED TRANSLATION digit ::= 0 1... 9 int ::= digit int digit num ::= o int int We require that the o-prefixed numbers be evaluated in octal Drawback: no restriction to octal digits for octal numbers CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 8 / 14

ISSUES IN BOTTOM-UP SYNTAX DIRECTED TRANSLATION digit ::= 0 1... 9 int ::= digit int digit num ::= o int int We require that the o-prefixed numbers be evaluated in octal Drawback: no restriction to octal digits for octal numbers Major drawback: not enough information from below for the differentiation between decimal and octal numbers Semantic rules for computing these are different, yet they should all get attached to the rules for int The decision on whether to process a decimal or octal number happens when o is shifted on the stack At that time however an int has already been reduced and so its semantic actions have already been applied In addition, semantic rules can only be applied to reductions, not shifts CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 8 / 14

FIRST SOLUTION: RULE CLONING Since our problem is caused by using the same rules for two different things, we can clone those rules so that we have separate copies for separate purposes When to use one set of rules and when to use the other is given based on the context of the nonterminal (i.e., where is the nonterminal used) digit ::= 0 1... 9 int ::= digit int digit intoct ::= digit intoct digit num ::= o intoct int Drawback: Grammar inflation The added rules are not meaningful syntactically Extreme care should be taken when modifying a grammar to make sure that the new version still generates the same language The problem of context-free grammar equivalence is undecidable CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 9 / 14

SECOND SOLUTION: FORCING SEMANTIC ACTIONS Suppose we need a semantic action when shifting some token x We can insert a new rule A ::= x, and attach the action to this rule All the occurrences of x in the original grammar will be replaced by A Suppose we need a semantic action between two symbols x and y CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 10 / 14

SECOND SOLUTION: FORCING SEMANTIC ACTIONS Suppose we need a semantic action when shifting some token x We can insert a new rule A ::= x, and attach the action to this rule All the occurrences of x in the original grammar will be replaced by A Suppose we need a semantic action between two symbols x and y We then insert a new rule A ::= ε and attach the action to it All the occurrences of x y in the original grammar will be replaced by x A y num ::= oct int {ans = int.value; } dec int {ans = int.value; } oct ::= o {base = 8; } dec ::= ε {base = 10; } int ::= digit { int 0.value = digit.value; } int digit { int 0.value = int 1.value base + digit.value; } digit ::= 0 { digit.value = 0; }... 9 { digit.value = 9; } Note the use of the global variable base (common occurrence) The same caveats about modifying the grammar (semantic-only rules, equivalence) apply CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 10 / 14

THIRD SOLUTION: GRAMMAR RESTRUCTURING Global variables are undesirable because rules may be recursive and this may have unexpected consequences on these variables Global variables can also make the semantic actions difficult to write and maintain since there is a lack of separation between actions Proper initialization and resetting may be problematic A more robust solution is to restructure the parse tree as to eliminate the need for global variables: 1 Sketch a parse tree that allows bottom-up synthesis without global variables 2 Revise the grammar to achieve that parse tree 3 Verify that the grammar is still suitable for parsing (LALR(1), etc.) 4 Verify that the grammar still generate the same language CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 11 / 14

THIRD SOLUTION: GRAMMAR RESTRUCTURING Global variables are undesirable because rules may be recursive and this may have unexpected consequences on these variables Global variables can also make the semantic actions difficult to write and maintain since there is a lack of separation between actions Proper initialization and resetting may be problematic A more robust solution is to restructure the parse tree as to eliminate the need for global variables: 1 Sketch a parse tree that allows bottom-up synthesis without global variables 2 Revise the grammar to achieve that parse tree 3 Verify that the grammar is still suitable for parsing (LALR(1), etc.) 4 Verify that the grammar still generate the same language int ::= int digit { int 0.value = int 1.value int 1.base + digit.value; int 0.base = int 1.base; } base { int 0.base = base.base; int 0.value = 0; } base ::= ε { base.base = 10; } o { base.base = 8; } digit ::= 0 { digit.value = 0; }... 9 { digit.value = 9; } CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 11 / 14

TOP-DOWN SYNTAX DIRECTED TRANSLATION Top-down parsers are usually recursive descent parsers The computation of attributes is naturally inserted in the code, just like the code for constructing the AST Same ideas as above may be required to modify the grammar so that all the attributes can be computed class Node {...}; Node* Sequence() { Node* current = new Node(SEQ,...); if (t == CLS BRACE) /* <empty> */ ; else { /* <statement> <sequence> */ current.addchild(statement()); current.addchild(sequence()); } return current; } Also see the example in the textbook CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 12 / 14

ABSTRACT SYNTAX TREES The most common semantic actions are the ones that construct the abstract syntax tree for the input program AST is a simplified and more compact representation of the parse tree Just like in a parse tree, an AST node can have an arbitrary number of children Links to the parent often needed (depending on the algorithms used in the semantic analysis) The data structure for an AST node can be approached in two ways 1 Have individual types for individual nodes (assignment, conditional, loop, etc.) see assignments Handy for languages that provide type definitions with inheritance, case in which this is the preferred method Awkward in languages that do not offer inheritance constructs CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 13 / 14

ABSTRACT SYNTAX TREES The most common semantic actions are the ones that construct the abstract syntax tree for the input program AST is a simplified and more compact representation of the parse tree Just like in a parse tree, an AST node can have an arbitrary number of children Links to the parent often needed (depending on the algorithms used in the semantic analysis) The data structure for an AST node can be approached in two ways 1 Have individual types for individual nodes (assignment, conditional, loop, etc.) see assignments Handy for languages that provide type definitions with inheritance, case in which this is the preferred method Awkward in languages that do not offer inheritance constructs 2 Have the same data structure for all nodes General, language-independent solution Needs efficient representation for nodes with arbitrary number of children Typical implementation: left-child-right-sibling Each node is a node in a binary tree The left child of a node points to the first child of that node The right child of a node points to the next (right) sibling of that node CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 13 / 14

AST DESIGN PRINCIPLES AST design is crucial for the next phases of the compilation process It should be possible to reconstitute ( unparse ) the program from an AST An AST node must hold enough information to recall the program fragment that generated it Subsequent phases of the compilation process must access the AST through suitable interfaces Different phases have different requirements (and so will use different interfaces) Several phases will modify AST nodes It is crucial to provide proper encapsulation to ensure that the AST information is not altered inadvertently Subsequent compilation phases will traverse the AST (possibly repeatedly) The easiest way to accomplish this is through polymorphic and recursive functions defined within the class hierarchy of AST node The functions must be virtual to ensure the proper application for each node type Most useful pattern for such functions: visitors traverse the whole tree recursively CS 406: Syntax Directed Translation (S. D. Bruda) Winter 2015 14 / 14