Eliminating Ambiguity

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Example: consider the following grammar

    stat → if expr then stat
         | if expr then stat else stat
         | other

One can easily see that this grammar is ambiguous. The sentence

    if E1 then if E2 then S1 else S2

has the following two different parse trees.

First parse tree (the else belongs to the inner if):

                 stat
            /   /    \    \
         if  expr   then  stat
              |        / / / \ \ \
              E1   if expr then stat else stat
                       |         |         |
                       E2        S1        S2

Second parse tree (the else belongs to the outer if):

                      stat
            /  /   /       \    \    \
         if expr then    stat  else  stat
             |        /  /  \  \      |
             E1   if expr then stat   S2
                      |         |
                      E2        S1
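The ambiguity can also be checked mechanically. The following is a minimal sketch, not part of the original notes, that counts the parse trees of a token sequence under the ambiguous grammar. It assumes that every expr is a single token whose name starts with E and that every other statement is a single token whose name starts with S; the function name count_parses is hypothetical.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count the parse trees of `tokens` derived from stat in the grammar

        stat -> if expr then stat | if expr then stat else stat | other

    Assumption: an expr is one token starting with "E", and an `other`
    statement is one token starting with "S".
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def stat(i, j):
        # Number of ways tokens[i:j] can be derived from stat.
        if j - i == 1:
            return 1 if tokens[i].startswith("S") else 0
        if j - i < 4 or tokens[i] != "if" or tokens[i + 2] != "then":
            return 0
        if not tokens[i + 1].startswith("E"):
            return 0
        # stat -> if expr then stat
        total = stat(i + 3, j)
        # stat -> if expr then stat else stat: try every `else` as the split
        for k in range(i + 4, j - 1):
            if tokens[k] == "else":
                total += stat(i + 3, k) * stat(k + 1, j)
        return total

    return stat(0, len(tokens))


sentence = "if E1 then if E2 then S1 else S2".split()
print(count_parses(sentence))  # 2: one tree per way of attaching the else
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous.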
In all programming languages with conditional statements of this form the first parse tree is preferred. The general rule is to match each else with the closest previous unmatched then. This disambiguating rule can be incorporated directly into the grammar. Thus the previous grammar can be rewritten as the following unambiguous grammar:

    stat           → matched_stat
                   | unmatched_stat
    matched_stat   → if expr then matched_stat else matched_stat
                   | other
    unmatched_stat → if expr then stat
                   | if expr then matched_stat else unmatched_stat

Left recursive grammars

A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α.

Example: the following left-recursive grammar for arithmetic expressions

    Expr → Expr + Term | Term
    Term → Term * Fact | Fact
    Fact → ( Expr ) | id

can be transformed into the following equivalent grammar without left recursion

    Expr  → Term Expr1
    Expr1 → + Term Expr1 | λ
    Term  → Fact Term1
    Term1 → * Fact Term1 | λ
    Fact  → ( Expr ) | id

The new grammar can be handled by top-down parsing, which cannot handle left-recursive grammars.
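The transformation used in this example can be sketched in code. The following is an illustrative sketch, not a general algorithm: it removes only immediate left recursion (rules of the form A → Aα), not left recursion that arises through several derivation steps. The dictionary representation of the grammar, the function name, and the convention of naming the new nonterminal by appending "1" are assumptions made for the example; an empty list stands for λ.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite  A -> A a1 | ... | A am | b1 | ... | bn  as

        A  -> b1 A1 | ... | bn A1
        A1 -> a1 A1 | ... | am A1 | λ

    `grammar` maps a nonterminal to a list of alternatives; each
    alternative is a list of symbols and [] stands for λ.
    """
    result = {}
    for head, alternatives in grammar.items():
        # Tails of the left-recursive alternatives (the a_i above).
        recursive = [alt[1:] for alt in alternatives if alt[:1] == [head]]
        # Alternatives that do not start with the head (the b_j above).
        others = [alt for alt in alternatives if alt[:1] != [head]]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        tail = head + "1"  # assumed naming convention for the new nonterminal
        result[head] = [alt + [tail] for alt in others]
        result[tail] = [alt + [tail] for alt in recursive] + [[]]
    return result


EXPRESSIONS = {
    "Expr": [["Expr", "+", "Term"], ["Term"]],
    "Term": [["Term", "*", "Fact"], ["Fact"]],
    "Fact": [["(", "Expr", ")"], ["id"]],
}

for head, alternatives in eliminate_immediate_left_recursion(EXPRESSIONS).items():
    shown = " | ".join(" ".join(alt) if alt else "λ" for alt in alternatives)
    print(f"{head} -> {shown}")
```

Applied to the expression grammar, the sketch produces the same five rules as the transformed grammar above.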
Left Factoring

Left factoring is a grammar transformation that is useful for producing a grammar suitable for top-down (predictive) parsing. The basic idea is, in general, as follows:

1. let A → αβ1 | αβ2 be two production rules for the nonterminal symbol A
2. if the input begins with a nonempty string derived from α
3. and we do not know whether to expand A to αβ1 or αβ2
4. then we may defer the decision by expanding A to αA'
5. after seeing the input derived from α, we expand A' to β1 or to β2
6. this means that, left-factored, the original productions become

    A  → αA'
    A' → β1 | β2

Example: the following grammar

    stmt → if expr then stmt else stmt
         | if expr then stmt

can be left-factored to the following grammar

    stmt → if expr then stmt A
    A    → else stmt | λ
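The factoring step can be sketched as follows. This is a simplified illustration under stated assumptions: it factors only the prefix shared by all alternatives of one nonterminal, whereas a full left-factoring procedure would group alternatives by common prefix and repeat until no two alternatives share one. The function names and the list-of-symbols representation are hypothetical; an empty list stands for λ.

```python
def common_prefix(a, b):
    """Return the longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(head, alternatives, new_name):
    """Factor out the prefix shared by ALL alternatives of `head`.

    Returns a dict of rules. If fewer than two alternatives exist, or
    they share no prefix, the rule is returned unchanged.
    """
    if len(alternatives) < 2:
        return {head: alternatives}
    prefix = alternatives[0]
    for alt in alternatives[1:]:
        prefix = common_prefix(prefix, alt)
    if not prefix:
        return {head: alternatives}
    return {
        head: [prefix + [new_name]],
        # What remains of each alternative after the prefix; [] is λ.
        new_name: [alt[len(prefix):] for alt in alternatives],
    }


rules = left_factor(
    "stmt",
    [
        ["if", "expr", "then", "stmt", "else", "stmt"],
        ["if", "expr", "then", "stmt"],
    ],
    "A",
)
print(rules)
```

For the stmt grammar the sketch yields stmt → if expr then stmt A and A → else stmt | λ, matching the left-factored grammar above.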
Top Down Parsing

Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string, or as an attempt to construct a parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder.

We will study a general form of top-down parsing called recursive descent that may involve backtracking, i.e. making repeated scans of the input. We will also study a special case of recursive-descent parsing called predictive parsing.

The following example shows how to use backtracking in forming a parse tree for a given input.

Example: consider the following grammar

    S → c A d
    A → a b | a

and the input string w = cad. The following figure shows how backtracking is used to construct the parse tree of w:

          S              S              S
        / | \          / | \          / | \
       c  A  d        c  A  d        c  A  d
                        / \             |
                       a   b            a
       (step 1)       (step 2)       (step 3)

In step 2 the first alternative A → a b is tried; it matches a but fails on b, because the next input symbol is d. The parser backtracks and in step 3 tries the second alternative A → a, which succeeds.

A left-recursive grammar can cause a recursive-descent parser, even one with backtracking, to go into an infinite loop.
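The backtracking in this example can be sketched with Python generators. This is a minimal illustration for this small grammar, not a production parser: each nonterminal yields every input position at which it could end, so when a later symbol fails to match, the caller simply asks for the next alternative. The names GRAMMAR, parse, parse_sequence and accepts are assumptions made for the sketch.

```python
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],  # alternatives are tried in this order
}


def parse(symbol, text, pos):
    """Yield every position where `symbol` can end if it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must match the input literally
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for alternative in GRAMMAR[symbol]:
        yield from parse_sequence(alternative, text, pos)


def parse_sequence(symbols, text, pos):
    """Yield every end position for the symbol sequence starting at `pos`."""
    if not symbols:
        yield pos
        return
    for middle in parse(symbols[0], text, pos):
        # If the rest fails, the loop resumes with the next way of
        # parsing symbols[0]; this is the backtracking step.
        yield from parse_sequence(symbols[1:], text, middle)


def accepts(text):
    """True if the whole of `text` can be derived from S."""
    return any(end == len(text) for end in parse("S", text, 0))


print(accepts("cad"))   # True: A -> a b fails on d, then A -> a succeeds
print(accepts("cabd"))  # True: A -> a b succeeds directly
print(accepts("cd"))    # False
```

With a left-recursive rule such as A → A a in GRAMMAR, parse would call itself at the same position without consuming input, which in Python ends in a RecursionError rather than a literal infinite loop.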
Top-down parsing construction

The top-down construction of a parse tree is done by starting with the root, labeled with the starting nonterminal symbol, and repeatedly performing the following two steps:

1. at node n, labeled with a nonterminal symbol A, select one of the production rules for A and construct children at n for the symbols on the right side of the rule
2. find the next node at which a subtree is to be constructed

Example: consider the following grammar that defines simple types

    type   → simple
           | ^id
           | array [ simple ] of type
    simple → integer
           | char
           | num .. num

The parse tree for array [ num .. num ] of integer can be constructed by top-down parsing as follows:

(1)            type

(2)            type
         /   /    /    \   \   \
      array  [  simple  ]  of  type

(3)            type
         /   /    /    \   \   \
      array  [  simple  ]  of  type
               /  |   \
             num  ..  num

(4)            type
         /   /    /    \   \   \
      array  [  simple  ]  of  type
               /  |   \         |
             num  ..  num     simple

(5)            type
         /   /    /    \   \   \
      array  [  simple  ]  of  type
               /  |   \         |
             num  ..  num     simple
                                |
                             integer
Recursive descent parsing: recursive-descent parsing is a top-down method of syntax analysis in which we execute a set of recursive procedures to process the input. A procedure is associated with each nonterminal symbol of the grammar.

Predictive parsing: predictive parsing is a special form of recursive-descent parsing that needs no backtracking.

Example: consider the following grammar for simple types

    type   → simple
           | ^id
           | array [ simple ] of type
    simple → integer
           | char
           | num .. num

The nonterminal symbols of this grammar are type and simple, so we can have the following procedures (written in pseudocode):

    procedure type;
    begin
        if lookahead is in { integer, char, num } then
            simple
        else if lookahead = '^' then begin
            match('^'); match(id)
        end
        else if lookahead = array then begin
            match(array); match('['); simple; match(']'); match(of); type
        end
        else error
    end;

    procedure simple;
    begin
        if lookahead = integer then
            match(integer)
        else if lookahead = char then
            match(char)
        else if lookahead = num then begin
            match(num); match('..'); match(num)
        end
        else error
    end;
Here the auxiliary procedure match() is used to simplify the code of type and simple. It has the form:

    procedure match(t : token);
    begin
        if lookahead = t then
            lookahead := nexttoken
        else error
    end;

It advances the variable lookahead, which holds the currently scanned input token.

Predictive parsing proceeds as follows:

1. parsing begins with a call of the procedure for the starting nonterminal symbol (type in the above example)
2. the variable lookahead is initialized with the first token of the input (array in the above example); the FIRST sets defined below determine which production is chosen for a given lookahead token
3. then the corresponding code is executed. In the example above, the procedure type executes the code

    match(array); match('['); simple; match(']'); match(of); type

   corresponding to the right side of the production rule

    type → array [ simple ] of type

We define FIRST(α) to be the set of tokens that appear as the first symbols of one or more strings generated from α. For example:

    FIRST(simple) = { integer, char, num }
    FIRST(^id) = { ^ }
    FIRST(array [ simple ] of type) = { array }
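The procedures type, simple and match translate almost line by line into a runnable program. The following Python sketch is one possible rendering, under several assumptions that are not in the original: the input is a list of token names, "$" is used as an end-of-input marker, errors are reported by raising a hypothetical ParseError, and the method for type is called type_ to avoid shadowing a Python built-in.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


FIRST_SIMPLE = {"integer", "char", "num"}  # FIRST(simple)


class Parser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, token):
        # Advance past `token` if it is the lookahead, otherwise fail.
        if self.lookahead != token:
            raise ParseError(f"expected {token!r}, found {self.lookahead!r}")
        self.pos += 1

    def type_(self):
        # The lookahead token alone selects the production: no backtracking.
        if self.lookahead in FIRST_SIMPLE:
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r} in type")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("..")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r} in simple")


def parse_type(tokens):
    """Return True if `tokens` form a complete type, else raise ParseError."""
    parser = Parser(tokens)
    parser.type_()
    parser.match("$")  # the whole input must be consumed
    return True


print(parse_type("array [ num .. num ] of integer".split()))  # True
```

Each branch is chosen by comparing the lookahead token with the FIRST set of the corresponding right side, which is why the FIRST sets of the alternatives of a nonterminal must not overlap for this scheme to work.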
Transition diagrams for predictive parsers

To construct the transition diagram of a predictive parser from a grammar we do the following:

1. eliminate left recursion from the grammar
2. left factor the grammar
3. for each nonterminal A, create an initial and a final (return) state
4. for each production A → X1 X2 ... Xn, create a path from the initial to the final state, with edges labeled X1, X2, ..., Xn

Example: consider the following grammar

    Expr → Expr + Term | Term
    Term → Term * Fact | Fact
    Fact → ( Expr ) | id

To construct the transition diagram for this grammar we follow the above 4 steps:

Step 1: first we eliminate left recursion, getting the following equivalent grammar

    rule1: Expr  → Term Expr1
    rule2: Expr1 → + Term Expr1 | λ
    rule3: Term  → Fact Term1
    rule4: Term1 → * Fact Term1 | λ
    rule5: Fact  → ( Expr ) | id

Step 2: this grammar is already left-factored.

Step 3: we have 5 nonterminal symbols, Expr, Expr1, Term, Term1 and Fact, so we construct an initial and a final state for each one.
Step 4: finally, for each production rule we construct a transition diagram as follows (diagrams not reproduced):

    for rule1: Expr  → Term Expr1
    for rule2: Expr1 → + Term Expr1 | λ
    for rule3: Term  → Fact Term1
    for rule4: Term1 → * Fact Term1 | λ
    for rule5: Fact  → ( Expr ) | id
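Steps 3 and 4 of the construction can be sketched in code: every nonterminal gets an initial and a final state, and every production contributes one path of labeled edges between them. The sketch below is illustrative only. The state numbering, the dictionary representation of the grammar, and the name build_transition_diagrams are assumptions; a λ alternative is drawn as a single edge labeled λ from the initial to the final state.

```python
GRAMMAR = {
    "Expr": [["Term", "Expr1"]],
    "Expr1": [["+", "Term", "Expr1"], []],  # [] stands for λ
    "Term": [["Fact", "Term1"]],
    "Term1": [["*", "Fact", "Term1"], []],
    "Fact": [["(", "Expr", ")"], ["id"]],
}


def build_transition_diagrams(grammar):
    """Return {nonterminal: (initial, final, edges)}.

    Each edge is a triple (from_state, label, to_state). States are
    numbered consecutively across all diagrams (an assumed convention).
    """
    diagrams = {}
    next_state = 0
    for head, alternatives in grammar.items():
        # Step 3: one initial and one final state per nonterminal.
        initial, final = next_state, next_state + 1
        next_state += 2
        edges = []
        for alternative in alternatives:
            if not alternative:
                edges.append((initial, "λ", final))
                continue
            # Step 4: a path initial -X1-> ... -Xn-> final.
            state = initial
            for index, symbol in enumerate(alternative):
                if index == len(alternative) - 1:
                    target = final
                else:
                    target = next_state
                    next_state += 1
                edges.append((state, symbol, target))
                state = target
        diagrams[head] = (initial, final, edges)
    return diagrams


for head, (initial, final, edges) in build_transition_diagrams(GRAMMAR).items():
    print(f"{head}: initial={initial}, final={final}")
    for source, label, target in edges:
        print(f"    {source} --{label}--> {target}")
```

An edge labeled with a nonterminal corresponds to a call of that nonterminal's diagram, and an edge labeled with a token corresponds to matching that token against the lookahead.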
