Parsing Techniques. AST Review. AST Data Structures. LL AST Construction. AST Construction CS412/CS413. Introduction to Compilers Tim Teitelbaum

Similar documents
Parsing Techniques. AST Review. AST Data Structures. Implicit AST Construction. AST Construction CS412/CS413. Introduction to Compilers Tim Teitelbaum

Summary: Semantic Analysis

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

Recursive Descent Parsers

Plan for Today. Regular Expressions: repetition and choice. Syntax and Semantics. Context Free Grammars

A programming language requires two major definitions A simple one pass compiler

Syntax-Directed Translation. Concepts Introduced in Chapter 5. Syntax-Directed Definitions

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

A Simple Syntax-Directed Translator

Syntax-Directed Translation. Lecture 14

Lecture 14 Sections Mon, Mar 2, 2009

Syntax-Directed Translation. Introduction

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

CS 406: Syntax Directed Translation

Semantic Analysis. How to Ensure Type-Safety. What Are Types? Static vs. Dynamic Typing. Type Checking. Last time: CS412/CS413

CS 4120 Introduction to Compilers

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

LR Parsing LALR Parser Generators

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COP5621 Exam 3 - Spring 2005

Syntax Directed Translation

Parsing II Top-down parsing. Comp 412

Project Compiler. CS031 TA Help Session November 28, 2011

Syntax-Directed Translation

Bottom-Up Parsing. Lecture 11-12

Context-free grammars

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Compiler Construction: Parsing

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

Syntax-Directed Translation

MIT Top-Down Parsing. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 17: Types and Type-Checking 25 Feb 08

Compiler Principle and Technology. Prof. Dongming LU April 15th, 2019

Introduction to Parsing. Lecture 8

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

3. Parsing. Oscar Nierstrasz

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Abstract Syntax Trees Synthetic and Inherited Attributes

Introduction to Compiler Design

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

CPS 506 Comparative Programming Languages. Syntax Specification

Object-oriented Compiler Construction

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

LR Parsing LALR Parser Generators

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Bottom-Up Parsing. Lecture 11-12

Syntax-Directed Translation Part I

CSE302: Compiler Design

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

Compilers. Compiler Construction Tutorial The Front-end

Chapter 4. Lexical and Syntax Analysis

Semantic Analysis Attribute Grammars

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Programs as data Parsing cont d; first-order functional language, type checking

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

Programming Languages

Building Compilers with Phoenix

Lexical and Syntax Analysis. Top-Down Parsing

Syntactic Directed Translation

Syntax Analysis Part I

CS 314 Principles of Programming Languages

CSE 431S Type Checking. Washington University Spring 2013

Compiler Construction

Compilers. Bottom-up Parsing. (original slides by Sam

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Semantic actions for declarations and expressions

Compiler Design Concepts. Syntax Analysis

Tree visitors. Context classes. CS664 Compiler Theory and Design LIU 1 of 11. Christopher League* 24 February 2016

Compiler Construction

Types of parsing. CMSC 430 Lecture 4, Page 1

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

CSCI312 Principles of Programming Languages!

In One Slide. Outline. LR Parsing. Table Construction

COP4020 Spring 2011 Midterm Exam

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

4. Lexical and Syntax Analysis

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

More On Syntax Directed Translation

How do LL(1) Parsers Build Syntax Trees?

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Lexical and Syntax Analysis. Bottom-Up Parsing

Introduction to Parsing. Lecture 5

Outline. Top Down Parsing. SLL(1) Parsing. Where We Are 1/24/2013

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Using an LALR(1) Parser Generator

4. Lexical and Syntax Analysis

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

More Assigned Reading and Exercises on Syntax (for Exam 2)

POLITECNICO DI TORINO. (01JEUHT) Formal Languages and Compilers. Laboratory N 3. Lab 3. Cup Advanced Use

Transcription:

Parsing Techniques C41/C413 Introduction to Compilers Tim Teitelbaum Lecture 11: yntax-directed Definitions February 14, 005 LL parsing Computes a Leftmost derivation Determines the derivation top-down LL parsing table indicates which production to use for expanding the leftmost non-terminal LR parsing Computes a Rightmost derivation Determines the derivation bottom-up Uses a set of LR states and a stack of symbols LR parsing table indicates, for each state, what action to perform (shift/reduce) and what state to go to next Use these techniques to construct an AT C 41/413 pring 007 Introduction to Compilers 1 C 41/413 pring 007 Introduction to Compilers AT Review Derivation = sequence of applied productions E + 1 + 1 + E 1 + Parse tree = graph representation of a derivation Doesn t capture the order of applying the productions Abstract yntax Tree (AT) discards unnecessary information from the parse tree Parse Tree E + 1 E + E 3 AT 1 + + 3 AT Data tructures abstract class Expr { class extends Expr { Expr left, right; (Expr L, Expr R) { left = L; right = R; class Num extends Expr { int value; Num (int v) { value = v; Num 1 Num Num 3 C 41/413 pring 007 Introduction to Compilers 3 C 41/413 pring 007 Introduction to Compilers 4 AT Construction LL/LR parsing techniques implicitly walk the parse tree during parsing LL parsing: Parse tree is implicitly represented by the sequence of applied derivation steps (preorder) LR parsing: Parse tree is implicitly represented by the sequence of applied reductions (endorder) The AT is implicitly defined by the parse tree We want to explicitly construct the AT during parsing: add code in the parser to explicitly build the AT LL AT Construction LL parsing: extend procedures for nonterminals Example: E' ' ε + E num ( ) vo parse_() { switch (token) { case num: case ( : parse_e(); parse_ (); return; default: throw new ParseError(); Expr parse_() { switch (token) { case num: case ( : Expr left = parse_e(); Expr right = parse_ (); if (right == null) return left; else return new (left, right); default: throw new ParseError(); C 41/413 pring 007 Introduction to Compilers 5 C 41/413 pring 007 Introduction to Compilers 6 1

LR AT Construction LR parsing We also need to add code for explicit AT construction AT construction mechanism for LR Parsing tore parts of the tree on the stack For each symbol X on stack, also store the AT subtree rooted at X on stack Whenever the parser performs a reduce operation for a production A β, create an AT node for A from AT fragments on stack for constituents of β Example stack LR AT Construction, ctd. + E Before reduction E+ Num() Num(3) Num(1) E+ E num ( ) Num(1) After reduction E+ Num() Num(3) C 41/413 pring 007 Introduction to Compilers 7 C 41/413 pring 007 Introduction to Compilers 8 Issues Unstructured code: mixed parsing code with AT construction code Automatic parser generators The generated parser needs to contain AT construction code How to construct a customized AT data structure using an automatic parser generator? May want to perform other actions concurrently with the parsing phase E.g., semantic checks This can reduce the number of compiler passes yntax-directed Definition olution: syntax-directed definition Extends each grammar production with an associated semantic action (code): E+ { action The parser generator adds these actions into the generated parser Each action is executed when the corresponding production is reduced C 41/413 pring 007 Introduction to Compilers 9 C 41/413 pring 007 Introduction to Compilers 10 emantic Actions Actions = code in a programming language ame language as the automatically generated parser Examples: Yacc = actions written in C CUP = actions written in Java The actions can access the parser stack! Parser generators extend the stack of states (corresponding to RH symbols) symbols with entries for user-defined structures (e.g., parse trees) The action code should be able to refer to the states (corresponding to the RH grammar symbols in the production Need a naming scheme C 41/413 pring 007 Introduction to Compilers 11 Naming cheme Need names for grammar symbols to use in the semantic action code Need to refer to multiple occurrences of the same nonterminal symbol E E 1 + E Distinguish the nonterminal on the LH E 0 E + E C 41/413 pring 007 Introduction to Compilers 1

Naming cheme: CUP CUP: Name RH nonterminal occurrences using distinct, user-defined labels: expr ::= expr:e1 PLU expr:e Use keyword REULT for LH nonterminal CUP Example: expr ::= expr:e1 PLU expr:e {: REULT = e1 + e; : Naming cheme: yacc Yacc: Uses keywords: $1 refers to the first RH symbol, $ refers to the second RH symbol, etc. Keyword $$ refers to the LH nonterminal YaccExample: expr ::= expr PLU expr { $$ = $1 + $3; C 41/413 pring 007 Introduction to Compilers 13 C 41/413 pring 007 Introduction to Compilers 14 Building the AT Use semantic actions to build the AT AT is built bottom-up during parsing non terminal Expr expr; User-defined type for semantic objects on the stack Nonterminal name expr ::= NUM:i {: REULT = new Num(i.val); : expr ::= expr:e1 PLU expr:e {: REULT = new (e1,e); : expr ::= expr:e1 MULT expr:e {: REULT = new Mul(e1,e); : expr ::= LPAR expr:e RPAR {: REULT = e; : Example E num (E) E+E E*E Parser stack stores value of each symbol (1+)*3 (1 +)*3 (E Num(1) +)*3 REULT=new Num(1) (E+ )*3 (E+E Num() )*3 REULT=new Num() (E (, ) )*3 REULT=new (e1,e) (E) *3 E *3 REULT=e C 41/413 pring 007 Introduction to Compilers 15 C 41/413 pring 007 Introduction to Compilers 16 AT Design Keep the AT abstract Do not introduce a tree node for every node in parse tree (not very abstract) E + ( ) E E + 3 1 E? 1 3 um Expr um LPar um RPar Expr Expr um 3 1 Expr C 41/413 pring 007 Introduction to Compilers 17 AT Design Do not use one single class AT_node E.g., need information for if, while, +, *, ID, NUM class AT_node { int node_type; AT_node[ ] children; tring name; int value; etc Problem: must have fields for every different kind of node with attributes Not extensible, Java type checking no help C 41/413 pring 007 Introduction to Compilers 18 3

Use Class Hierarchy Can use subclassing to solve problem Use an abstract class for each interesting set of non-terminals in grammar (e.g. expressions) E E+E E*E -E (E) abstract class Expr { class extends Expr { Expr left, right; class Mult extends Expr { Expr left, right; // or: class BinExpr extends Expr { Oper o; Expr l, r; class Minus extends Expr { Expr e; Another Example E ::= num (E) E+E ::= E ; if (E) if (E) else = E ; ; abstract class Expr { class Num extends Expr { Num(int value) class extends Expr { (Expr e1, Expr e) class Id extends Expr { Id(tring name) abstract class tmt { class If extends tmt { If(Expr c, tmt s1, tmt s) class Empty extends tmt { Empty() class Assign extends tmt { Assign(tring, Expr e) C 41/413 pring 007 Introduction to Compilers 19 C 41/413 pring 007 Introduction to Compilers 0 Other yntax-directed Definitions Can use syntax-directed definitions to perform semantic checks during parsing E.g., type-checking Benefit = efficiency One single compiler pass for multiple tasks Disadvantage = unstructured code Mixes parsing and semantic checking phases Perform checks while AT is changing Limited to one pass in bottom-up order C 41/413 pring 007 Introduction to Compilers 1 Type Declaration Example D T D D 1, { Type(, T.type); D.type = T.type; { Type(, D 1.type); D.type = D 1.type; T int { T.type = inttype; T float { T.type = floattype; C 41/413 pring 007 Introduction to Compilers Propagation of Values Propagate type attributes while building the AT int a, b T.type D.type D T inttype int D.type D, Type(,D.type) Type(,T.type) D T L Another Example { D.type = T.type; L.type = T.type; T int { T.type = inttype; T float { T.type = floattype; L { Type(,???); L L 1, { Type(, L 1.type);??? C 41/413 pring 007 Introduction to Compilers 3 C 41/413 pring 007 Introduction to Compilers 4 4

Propagation of Values Propagate values both bottom-up and top-down int a, b T.type T inttype int LR parsing: AT is built bottom-up! D L.type L.type L L Type(,L.Type), Type(,L.type) tructured Approach eparate AT construction from semantic checking phase Traverse the AT and perform semantic checks (or other actions) only after the tree has been built and its structure is stable This approach is more flexible and less error-prone It is better when efficiency is not a critical issue C 41/413 pring 007 Introduction to Compilers 5 C 41/413 pring 007 Introduction to Compilers 6 Where We Are Visitor Methodology for AT Traversal ource code (character stream) Token stream Abstract syntax tree (AT) if (b == 0) a = b; if ( b == 0 ) a = b ; if == b 0 = a b Lexical Analysis yntax Analysis (Parsing) emantic Analysis Visitor pattern: useful OO programming pattern that separates data structure definition (e.g., the AT) from code that traverses the structure (e.g., the name resolution code and the type checking code). Define a Visitor interface for all traversals of the AT Extend each AT class with a method that accepts any Visitor Code each traversal as a separate class that implements the Visitor interface C 41/413 pring 007 Introduction to Compilers 7 C 41/413 pring 007 Introduction to Compilers 8 AT Data tructure abstract class Expr Expr { class extends Expr Expr {...... Expr Expre1, e e class Num Numextends Expr Expr {...... int intvalue class Id Idextends Expr Expr {...... tring name interface Visitor { vo visit( e); e); vo visit(num e); e); vo visit(id e); e); Visitor Interface C 41/413 pring 007 Introduction to Compilers 9 C 41/413 pring 007 Introduction to Compilers 30 5

Accept methods Visitor Methods abstract class classexpr {{ abstract public publicvo voaccept(visitor v); v); class class extends Expr Expr {{ public publicvo voaccept(visitor v) v) {{ v.visit(this); class classnum Numextends Expr Expr {{ public publicvo voaccept(visitor v) v) {{ v.visit(this); class classid Idextends Expr Expr {{ public publicvo voaccept(visitor v) v) { v.visit(this); The declared type of this is the subclass it which it occurs. Overload resolution of v.visit(this); invokes appropriate visit function in the Visitor. For each kind of traversal, implement the Visitor interface, e.g., class PostfixOutputVisitor implements Visitor { vo visit( e) { e.e1.accept(this); e.e.accept(this); ystem.out.print( + ); vo visit(num e) { ystem.out.print(value); vo visit(id e) { ystem.out.print(); Dispatch to e.accept in the visit methods eliminates case analysis on AT subclasses To traverse expression e: PostfixOutputVisitor v = new PostfixOutputVisitor(); e.accept(v); C 41/413 pring 007 Introduction to Compilers 31 C 41/413 pring 007 Introduction to Compilers 3 Inherited and ynthesized Information o far, OK for traversal and action w/o communication of values But we need a way to pass information Down the AT (inherited) Up the AT (synthesized) To pass information down the AT add parameter to visit functions To pass information up the AT add return value to visit functions Visitor Interface () interface Visitor { Object visit( e, e, Object inh); Object visit(num e, e, Object inh); inh); Object visit(id e, e, Object inh); inh); C 41/413 pring 007 Introduction to Compilers 33 C 41/413 pring 007 Introduction to Compilers 34 Accept methods () abstract class classexpr {{ abstract public publicobject Objectaccept(Visitor v, v, Object Object inh); inh); class class extends Expr Expr {{ public publicobject Objectaccept(Visitor v, v, Object Object inh) inh) {{ return returnv.visit(this, inh); inh); class classnum Numextends Expr Expr {{ public publicobject Objectaccept(Visitor v, v, Object Object inh) inh) {{ return returnv.visit(this, inh); inh); class classid Idextends Expr Expr {{ public publicobject Objectaccept(Visitor v, v, Object Object inh) inh) {{ return returnv.visit(this, inh); inh); C 41/413 pring 007 Introduction to Compilers 35 Visitor Methods () For kind of traversal, implement the Visitor interface, e.g., class EvaluationVisitor implements Visitor { Object visit( e, Object inh) { int left = (int) e.e1.accept(this, inh); int right = (int) e.e.accept(this, inh); return left+right; Object visit(num e, Object inh) { return value; Object visit(id e, Object inh) { return Lookup(, (ymboltable)inh); To traverse expression e: EvaluationVisitor v = new EvaluationVisitor (); e.accept(v, EmptyTable()); C 41/413 pring 007 Introduction to Compilers 36 6

ummary yntax-directed definitions attach semantic actions to grammar productions Easy to construct the AT using syntax-directed definitions Can use syntax-directed definitions to perform semantic checks eparate AT construction from semantic checks or other actions that traverse the AT C 41/413 pring 007 Introduction to Compilers 37 7