
Eliminating Ambiguity

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Example: consider the following grammar

    stat → if expr then stat
         | if expr then stat else stat
         | other

This grammar is ambiguous. The sentence

    if E1 then if E2 then S1 else S2

has the following two different parse trees, each written as parent( children ).

First parse tree (the else belongs to the inner if):

    stat( if  expr(E1)  then
          stat( if  expr(E2)  then  stat(S1)  else  stat(S2) ) )

Second parse tree (the else belongs to the outer if):

    stat( if  expr(E1)  then
          stat( if  expr(E2)  then  stat(S1) )
          else  stat(S2) )
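
The ambiguity can also be checked mechanically by counting parse trees. The following is a minimal sketch, not part of the original slides. It assumes every expr and every other-statement is a single token, that expr tokens start with "E" and other-statement tokens start with "S"; the function name count_stat_parses is hypothetical.

```python
from functools import lru_cache


def count_stat_parses(tokens):
    """Count the parse trees that derive `tokens` from `stat` in the grammar

        stat -> if expr then stat | if expr then stat else stat | other

    Assumption: tokens starting with "E" are exprs and tokens starting
    with "S" are `other` statements, each exactly one token long.
    """
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways toks[i:j] can be derived from stat.
        if j - i == 1:
            return 1 if toks[i].startswith("S") else 0
        starts_with_if = (
            j - i >= 4
            and toks[i] == "if"
            and toks[i + 1].startswith("E")
            and toks[i + 2] == "then"
        )
        if not starts_with_if:
            return 0
        # stat -> if expr then stat
        total = count(i + 3, j)
        # stat -> if expr then stat else stat, for every position of "else"
        for k in range(i + 4, j - 1):
            if toks[k] == "else":
                total += count(i + 3, k) * count(k + 1, j)
        return total

    return count(0, len(toks))


sentence = "if E1 then if E2 then S1 else S2".split()
print(count_stat_parses(sentence))  # prints 2
```

A count greater than 1 for any sentence shows that the grammar is ambiguous; a sentence with a single if, such as if E1 then S1 else S2, has exactly one tree.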

In all programming languages with conditional statements of this form the first parse tree is preferred. The general rule is to match each else with the closest previous unmatched then. This disambiguating rule can be incorporated directly into the grammar. Thus the previous grammar can be rewritten as the following unambiguous grammar:

    stat           → matched_stat
                   | unmatched_stat
    matched_stat   → if expr then matched_stat else matched_stat
                   | other
    unmatched_stat → if expr then stat
                   | if expr then matched_stat else unmatched_stat

Left recursive grammars

A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α.

Example: the following left-recursive grammar for arithmetic expressions

    Expr → Expr + Term | Term
    Term → Term * Fact | Fact
    Fact → ( Expr ) | id

can be transformed into the following equivalent grammar without left recursion

    Expr  → Term Expr1
    Expr1 → + Term Expr1 | λ
    Term  → Fact Term1
    Term1 → * Fact Term1 | λ
    Fact  → ( Expr ) | id

The new grammar can be handled by top-down parsers, which cannot handle left-recursive grammars.
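
The transformation of the expression grammar can be sketched in code. The following is a minimal sketch that handles only immediate left recursion (productions of the form A → Aα), which is all the example needs. The dictionary representation, the function names, and the choice to name the new nonterminal by appending "1" are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b1 A1 | ..., A1 -> a1 A1 | ... | λ.

    `grammar` maps each nonterminal to a list of alternatives; an alternative
    is a list of symbols and the empty list stands for λ.
    Assumption: the name head + "1" is not already used in the grammar.
    """
    result = {}
    for head, alternatives in grammar.items():
        # Tails of the left-recursive alternatives (the alphas).
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # Alternatives that do not start with the head itself (the betas).
        others = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        tail = head + "1"
        result[head] = [alt + [tail] for alt in others]
        result[tail] = [alpha + [tail] for alpha in recursive] + [[]]
    return result


def format_grammar(grammar):
    """Render a grammar with one line per nonterminal."""
    lines = []
    for head, alternatives in grammar.items():
        rhs = " | ".join(" ".join(alt) if alt else "λ" for alt in alternatives)
        lines.append(f"{head} → {rhs}")
    return "\n".join(lines)


expression_grammar = {
    "Expr": [["Expr", "+", "Term"], ["Term"]],
    "Term": [["Term", "*", "Fact"], ["Fact"]],
    "Fact": [["(", "Expr", ")"], ["id"]],
}
print(format_grammar(eliminate_immediate_left_recursion(expression_grammar)))
```

Indirect left recursion (A ⇒ Bα ⇒ Aβα) is not detected by this sketch and needs the more general algorithm.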

Left Factoring

Left factoring is a grammar transformation that is useful for producing a grammar suitable for top-down (predictive) parsing. The basic idea is as follows:

1. let A → αβ1 | αβ2 be two production rules for the nonterminal symbol A
2. if the input begins with a nonempty string derived from α
3. and we do not know whether to expand A to αβ1 or to αβ2
4. then we may defer the decision by expanding A to αA′
5. after seeing the input derived from α, we expand A′ to β1 or to β2
6. that is, left-factored, the original productions become

    A  → αA′
    A′ → β1 | β2

Example: the following grammar

    stmt → if expr then stmt else stmt
         | if expr then stmt

can be left-factored to the following grammar

    stmt → if expr then stmt A
    A    → else stmt | λ
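
A single left-factoring step can be sketched as follows. This is a minimal sketch under stated assumptions: alternatives are lists of symbols, the empty list stands for λ, the caller supplies the name of the new nonterminal, and only the first group of alternatives sharing a first symbol is factored. The function names are hypothetical.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(head, alternatives, new_head):
    """Factor the first group of alternatives that share a first symbol.

    Returns a dict with the rewritten productions for `head` and, if a
    group was found, the productions for `new_head`.
    """
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)
    for first, group in groups.items():
        if first is not None and len(group) > 1:
            prefix = group[0]
            for alt in group[1:]:
                prefix = common_prefix(prefix, alt)
            kept = [alt for alt in alternatives if alt not in group]
            return {
                head: [prefix + [new_head]] + kept,
                # What remains of each alternative after the shared prefix;
                # an empty remainder is the λ alternative.
                new_head: [alt[len(prefix):] for alt in group],
            }
    return {head: alternatives}


stmt_alternatives = [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
]
factored = left_factor_once("stmt", stmt_alternatives, "A")
print(factored)
```

For the stmt grammar the shared prefix is if expr then stmt, so the result has stmt → if expr then stmt A and A → else stmt | λ, matching the example above.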

Top Down Parsing

Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string. Equivalently, it can be viewed as an attempt to construct a parse tree for the input string, starting from the root and creating the nodes of the parse tree in preorder.

We will study a general form of top-down parsing called recursive descent, which may involve backtracking, i.e. making repeated scans of the input. We will also study a special case of recursive-descent parsing called predictive parsing.

The following example shows how backtracking is used in forming a parse tree for a given input.

Example: consider the following grammar

    S → cAd
    A → ab | a

and the input string w = cad. Backtracking is used to construct the parse tree of w in three steps, with each tree written as parent( children ):

    (step 1)  S( c  A  d )          expand S; c matches the first input symbol
    (step 2)  S( c  A( a  b )  d )  try A → ab; a matches, but b does not match d
    (step 3)  S( c  A( a )  d )     back up and try A → a; a and then d match

A left-recursive grammar can cause a recursive-descent parser, even one with backtracking, to go into an infinite loop.
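
The backtracking in this example can be sketched with a small recursive-descent recognizer. This is a minimal sketch, not the slides' own algorithm: it uses Python generators so that a failed alternative is simply abandoned and the next one is tried. The names GRAMMAR, derive and parse, and the trace of attempted productions, are assumptions added for illustration.

```python
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}


def derive(symbols, text, pos, trace):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Nonterminal: try each alternative in order; moving on to the next
        # alternative after a failure is the backtracking step.
        for alternative in GRAMMAR[first]:
            trace.append(f"{first} → {' '.join(alternative)}")
            for mid in derive(alternative, text, pos, trace):
                yield from derive(rest, text, mid, trace)
    elif pos < len(text) and text[pos] == first:
        # Terminal: it must equal the current input symbol.
        yield from derive(rest, text, pos + 1, trace)


def parse(text):
    """Return (accepted, productions tried in order) for the input string."""
    trace = []
    accepted = any(end == len(text) for end in derive(["S"], text, 0, trace))
    return accepted, trace


print(parse("cad"))
```

For the input cad the trace lists S → c A d, then A → a b (which fails on d), then A → a, mirroring steps 1 to 3. With a left-recursive production such as A → A a, derive would call itself at the same input position without end, which is the infinite loop mentioned above.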

Top-down parsing construction

The top-down construction of a parse tree is done by starting with the root, labeled with the starting nonterminal symbol, and repeatedly performing the following two steps:

1. at node n, labeled with a nonterminal symbol A, select one of the production rules for A and construct children at n for the symbols on the right side of the rule
2. find the next node at which a subtree is to be constructed

Example: consider the following grammar that defines simple types

    type   → simple
           | ^id
           | array [ simple ] of type
    simple → integer
           | char
           | num .. num

The parse tree for array [ num .. num ] of integer can be constructed by top-down parsing as follows, with each tree written as parent( children ):

    (1)  type
    (2)  type( array [ simple ] of type )
    (3)  type( array [ simple( num .. num ) ] of type )
    (4)  type( array [ simple( num .. num ) ] of type( simple ) )
    (5)  type( array [ simple( num .. num ) ] of type( simple( integer ) ) )
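
The two steps can be replayed in code as a leftmost derivation, since creating parse-tree nodes in preorder expands the leftmost unexpanded nonterminal each time. This is a minimal sketch: the choice of production at each step is supplied by hand as an index into the list of alternatives, which is an assumption of the sketch, and the names GRAMMAR and leftmost_derivation are hypothetical.

```python
GRAMMAR = {
    "type": [
        ["simple"],
        ["^", "id"],
        ["array", "[", "simple", "]", "of", "type"],
    ],
    "simple": [
        ["integer"],
        ["char"],
        ["num", "..", "num"],
    ],
}


def leftmost_derivation(start, choices):
    """Return the sentential forms obtained by repeatedly expanding the
    leftmost nonterminal with the alternative selected by each choice."""
    form = [start]
    forms = [list(form)]
    for choice in choices:
        # Step 2 of the construction: find the next node to expand.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        # Step 1: replace it by the right side of the selected production.
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        forms.append(list(form))
    return forms


# Choices for array [ num .. num ] of integer:
# type -> array [...] of type, simple -> num .. num, type -> simple, simple -> integer
for form in leftmost_derivation("type", [2, 2, 0, 0]):
    print(" ".join(form))
```

The five printed sentential forms correspond to trees (1) to (5) above. A predictive parser makes the same choices automatically by inspecting the lookahead token.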

Recursive descent parsing: recursive-descent parsing is a top-down method of syntax analysis in which we execute a set of recursive procedures to process the input. A procedure is associated with each nonterminal symbol of the grammar.

Predictive parsing: predictive parsing is a special form of recursive-descent parsing that needs no backtracking.

Example: consider the following grammar for simple types

    type   → simple
           | ^id
           | array [ simple ] of type
    simple → integer
           | char
           | num .. num

The nonterminal symbols of this grammar are type and simple, so we have the following two procedures (written in pseudo-code):

    procedure type;
    begin
        if lookahead is in { integer, char, num } then
            simple
        else if lookahead = '^' then begin
            match('^'); match(id)
        end
        else if lookahead = array then begin
            match(array); match('['); simple; match(']'); match(of); type
        end
        else error
    end;

    procedure simple;
    begin
        if lookahead = integer then
            match(integer)
        else if lookahead = char then
            match(char)
        else if lookahead = num then begin
            match(num); match('..'); match(num)
        end
        else error
    end;
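
The pseudo-code translates almost line for line into a runnable program. The following is a minimal sketch in Python, assuming the input has already been split into a list of token strings and that "$" marks the end of the input. The class name PredictiveParser, the exception ParseError and the end marker are assumptions, not part of the slides.

```python
class ParseError(Exception):
    """Raised when the lookahead token does not fit the grammar."""


class PredictiveParser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, token):
        # Consume the lookahead if it is the expected token.
        if self.lookahead != token:
            raise ParseError(f"expected {token!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_type(self):
        if self.lookahead in ("integer", "char", "num"):
            self.parse_simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.parse_simple()
            self.match("]")
            self.match("of")
            self.parse_type()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} at start of type")

    def parse_simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("..")
            self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r} at start of simple")


def parse(tokens):
    """Return True if `tokens` form a type; raise ParseError otherwise."""
    parser = PredictiveParser(tokens)
    parser.parse_type()
    parser.match("$")
    return True


print(parse("array [ num .. num ] of integer".split()))  # prints True
```

Each procedure decides which production to use from the lookahead alone, so the parser never re-reads input; an input such as array [ num .. ] of char raises ParseError at the token "]".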

Here the auxiliary procedure match() is used to simplify the code of type and simple. It has the form:

    procedure match(t : token);
    begin
        if lookahead = t then
            lookahead := nexttoken
        else error
    end;

It advances the variable lookahead, which holds the currently scanned input token.

The predictive parsing process proceeds as follows:

1. parsing begins with a call of the procedure for the starting nonterminal symbol (in the above example, type)
2. the variable lookahead is initialized with the first token of the input (in the above example, array); the FIRST sets defined below determine which production that token selects
3. then the corresponding code is executed. In our example the procedure type executes the code

    match(array); match('['); simple; match(']'); match(of); type

   corresponding to the right side of the production rule

    type → array [ simple ] of type

We define FIRST(α) to be the set of tokens that appear as the first symbols of one or more strings generated from α. For example:

    FIRST(simple) = { integer, char, num }
    FIRST(^id) = { ^ }
    FIRST(array [ simple ] of type) = { array }
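
The FIRST sets of the example can be computed by iterating until nothing changes. This is a minimal sketch under stated assumptions: the grammar is a dictionary from nonterminals to lists of alternatives, and λ is put into FIRST(α) when α can derive the empty string, which is the usual textbook convention rather than something the definition above states. The function names are hypothetical.

```python
LAMBDA = "λ"


def first_of(symbols, first, grammar):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        if sym not in grammar:
            # A terminal starts every string derived from here on.
            result.add(sym)
            return result
        result |= first[sym] - {LAMBDA}
        if LAMBDA not in first[sym]:
            return result
    # Every symbol can vanish (or `symbols` is empty).
    result.add(LAMBDA)
    return result


def first_sets(grammar):
    """Compute FIRST(A) for every nonterminal A by fixed-point iteration."""
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of(alt, first, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


type_grammar = {
    "type": [["simple"], ["^", "id"], ["array", "[", "simple", "]", "of", "type"]],
    "simple": [["integer"], ["char"], ["num", "..", "num"]],
}
first = first_sets(type_grammar)
print(sorted(first["simple"]))  # prints ['char', 'integer', 'num']
```

The three alternatives of type have pairwise disjoint FIRST sets, which is why the procedure type can choose among them by looking at one token.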

Transition diagrams for predictive parsers

To construct the transition diagram of a predictive parser from a grammar we do the following:

1. eliminate left recursion from the grammar
2. left factor the grammar
3. for each nonterminal A, create an initial and a final (return) state
4. for each production A → X1 X2 … Xn, create a path from the initial to the final state, with edges labeled X1, X2, …, Xn

Example: consider the following grammar

    Expr → Expr + Term | Term
    Term → Term * Fact | Fact
    Fact → ( Expr ) | id

To construct the transition diagram for this grammar we follow the above 4 steps:

Step 1: first we eliminate left recursion, getting the following equivalent grammar

    rule1: Expr  → Term Expr1
    rule2: Expr1 → + Term Expr1 | λ
    rule3: Term  → Fact Term1
    rule4: Term1 → * Fact Term1 | λ
    rule5: Fact  → ( Expr ) | id

Step 2: this grammar is already left-factored

Step 3: we have 5 nonterminal symbols, Expr, Expr1, Term, Term1, and Fact, so we construct an initial and a final state for each one.

Step 4: finally, for each production rule we construct a transition diagram:

    for rule1: Expr  → Term Expr1
    for rule2: Expr1 → + Term Expr1 | λ
    for rule3: Term  → Fact Term1
    for rule4: Term1 → * Fact Term1 | λ
    for rule5: Fact  → ( Expr ) | id

[Figure: transition diagrams for rule1 to rule5]
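
Steps 3 and 4 can be sketched as a function that turns each nonterminal's productions into a list of labeled edges. This is a minimal sketch: the state numbering (0 for the initial state, 1 for the final state, fresh numbers from 2 for intermediate states) and the use of an edge labeled λ for an empty alternative are assumptions of the sketch, and the diagrams drawn on the slides may number their states differently.

```python
def build_transition_diagrams(grammar):
    """For each nonterminal return a list of (from_state, label, to_state) edges.

    State 0 is the initial state and state 1 the final (return) state of
    every diagram; intermediate states are numbered from 2 upwards.
    """
    diagrams = {}
    for head, alternatives in grammar.items():
        edges = []
        next_state = 2
        for alt in alternatives:
            # An empty alternative becomes a single edge labeled λ.
            labels = alt if alt else ["λ"]
            state = 0
            for i, label in enumerate(labels):
                if i == len(labels) - 1:
                    target = 1  # the last edge of a path enters the final state
                else:
                    target = next_state
                    next_state += 1
                edges.append((state, label, target))
                state = target
        diagrams[head] = edges
    return diagrams


expression_rules = {
    "Expr": [["Term", "Expr1"]],
    "Expr1": [["+", "Term", "Expr1"], []],
    "Term": [["Fact", "Term1"]],
    "Term1": [["*", "Fact", "Term1"], []],
    "Fact": [["(", "Expr", ")"], ["id"]],
}
for head, edges in build_transition_diagrams(expression_rules).items():
    print(head, edges)
```

An edge labeled with a terminal is taken when the lookahead matches it, and an edge labeled with a nonterminal corresponds to a call of that nonterminal's diagram, in the same way that the procedures type and simple call each other.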