

Compiler Construction F6S

Bottom-up Parsing:
- Table-driven, using an explicit stack (no recursion!).
- The stack can be viewed as containing both terminals and nonterminals.
- The basic operation is to shift terminals from the input onto the stack until the right-hand side of an appropriate grammar rule is seen, and then to reduce the symbols on the stack that match the right-hand side to the single nonterminal of the rule. Hence, bottom-up parsers are often called shift-reduce parsers.

Example
Grammar: E → E + n | n
Input: 2 + 3, or n + n
Parse: ($ is EOF in the input, and also the bottom of the stack)

     Parsing stack   Input     Action
  1  $               n + n $   shift
  2  $ n             + n $     reduce E → n
  3  $ E             + n $     shift
  4  $ E +           n $       shift
  5  $ E + n         $         reduce E → E + n
  6  $ E             $         accept
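The trace above can be reproduced with a few lines of Python. This is only a sketch for this one grammar: the shift/reduce decisions are hard-coded rather than read from a table, and the function name `shift_reduce_parse` is an illustrative assumption, not part of the slides.

```python
def shift_reduce_parse(tokens):
    """Parse tokens for the grammar E -> E + n | n.

    Returns a list of (stack, remaining input, action) steps.
    The decisions are hard-coded for this grammar only.
    """
    stack = ["$"]                 # "$" marks the bottom of the stack
    rest = list(tokens) + ["$"]   # "$" marks the end of the input
    steps = []

    def record(action):
        steps.append((" ".join(stack), " ".join(rest), action))

    while True:
        if stack[-3:] == ["E", "+", "n"]:
            record("reduce E -> E + n")
            stack[-3:] = ["E"]
        elif stack[-1] == "n":
            record("reduce E -> n")
            stack[-1] = "E"
        elif stack == ["$", "E"] and rest == ["$"]:
            record("accept")
            return steps
        elif rest[0] != "$":
            record("shift")
            stack.append(rest.pop(0))
        else:
            raise SyntaxError("unexpected end of input")


for number, step in enumerate(shift_reduce_parse(["n", "+", "n"]), start=1):
    print(number, *step, sep=" | ")
```

Note that the two reductions are told apart only by inspecting what lies below the top of the stack; a real shift-reduce parser gets this information from its stack states instead.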
Notes:
- Left recursion is not a problem in bottom-up parsing.
- Indeed, as we shall see, lookahead is not as serious an issue.
- Keeping track of what is on the stack, however, is an issue (note the difference in the grammar rule reductions at lines 2 and 5 of the previous example). See the later discussion on stack states.
- Right recursion is actually a bit of a problem, because it makes the stack grow large (see next example).

Example
Grammar: E → n + E | n
Input: 2 + 3, or n + n
Parse:

     Parsing stack   Input     Action
  1  $               n + n $   shift
  2  $ n             + n $     shift
  3  $ n +           n $       shift
  4  $ n + n         $         reduce E → n
  5  $ n + E         $         reduce E → n + E
  6  $ E             $         accept
Decision Problems in Bottom-up Parsing (parsing conflicts):
- Shift-reduce conflicts: almost always come from ambiguities, and almost always the right disambiguating rule is to shift (dangling else).
- Reduce-reduce conflicts: more difficult; bottom-up parsers try to resolve them using Follow contexts.
- There are no shift-shift conflicts.

Dangling-else Example:
Grammar: S → I | o
         I → i S | i S e S
Input: i i o e o
Parse:

      Parsing stack   Input         Action
   1  $               i i o e o $   shift
   2  $ i             i o e o $     shift
   3  $ i i           o e o $       shift
   4  $ i i o         e o $         reduce S → o
   5  $ i i S         e o $         shift/reduce (shift)
   6  $ i i S e       o $           shift
   7  $ i i S e o     $             reduce S → o
   8  $ i i S e S     $             reduce I → i S e S
   9  $ i I           $             reduce S → I
  10  $ i S           $             reduce I → i S
  11  $ I             $             reduce S → I
  12  $ S             $             accept
Reduce-reduce Example
Grammar: S → A B, A → x, B → x
Input: x x
Parse: (Follow(A) = {x}, Follow(B) = {$})

     Parsing stack   Input   Action
  1  $               x x $   shift
  2  $ x             x $     reduce A → x (not reduce B → x)
  3  $ A             x $     shift
  4  $ A x           $       reduce B → x
  5  $ A B           $       reduce S → A B
  6  $ S             $       accept

At line 2 both A → x and B → x match the top of the stack. The next input symbol x is in Follow(A) but not in Follow(B), so the parser reduces by A → x.

Shift-reduce parsers differ in their use of Follow information:
- LR(0) parsers never consult the lookahead at all.
- SLR(1) parsers use the Follow sets as previously constructed.
- LR(1) parsers use context to split the Follow sets into subsets for different parsing paths (huge, inefficient parsers).
- LALR(1) parsers: like LR(1), but coarser subsets are used (achieves most of the benefit, but much smaller and faster).
Technical Addendum
Shift-reduce parsers have trouble figuring out when to accept, so acceptance is turned into a reduction by a new rule S' → S with a new start symbol S'. Adding this rule is called augmenting the grammar:

     Parsing stack   Input   Action
     <previous example>
  5  $ A B           $       reduce S → A B
  6  $ S             $       reduce S' → S
  7  $ S'            $       accept

Format of a Yacc/Bison definition file

    {definitions}
    %%
    {rules}
    %%
    {auxiliary C functions}

Typical file extensions: .y, .yacc, .bison
Stack States: the DFA of sets of LR(0) items
- An item is a grammar rule option with a distinguished position (indicated by a period or other symbol): A → α . β
- The position in an item indicates that a parse has reached that position in recognizing that rule. α is then on the stack, and a β may be coming in the input (the α is called a viable prefix).
- A stack state consists of the set of items which have compatible viable prefixes.

Computing the DFA of sets of LR(0) items
- Add the augmentation item S' → . S to the start state of the DFA.
- Then add all the initial items for S to the state: S → . α
- Continue by adding all the initial items for those nonterminals which appear right after the dot in any previous item (this is called the closure of the set of items).
- Every symbol that comes immediately after the dot gives rise to a transition to a state generated by adding closure items to the item with the dot moved past that symbol.
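The construction just described can be sketched in Python. The encoding below is an assumption made for illustration: an item is a pair (rule index, dot position), the grammar is the small S → A B, A → x, B → x grammar, and the helper names `closure`, `goto` and `build_dfa` are hypothetical. State numbers come out in discovery order and need not match any particular drawing of the DFA.

```python
# Grammar as (lhs, rhs) pairs; rule 0 is the augmentation rule S' -> S.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "B")),
    ("A", ("x",)),
    ("B", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add the initial items of every nonterminal that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot past `symbol` wherever possible, then take the closure."""
    moved = {(rule, dot + 1) for rule, dot in state
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol}
    return closure(moved)


def build_dfa():
    """Return (states, transitions) of the DFA of sets of LR(0) items."""
    start = closure({(0, 0)})
    states, transitions, work = [start], {}, [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[r][1][d] for r, d in state if d < len(GRAMMAR[r][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                work.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = build_dfa()
print(len(states), "states,", len(transitions), "transitions")
```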
Example: S → A B, A → x, B → x
The DFA of sets of items:

  State 0:  S' → . S
            S → . A B
            A → . x
            (on S go to 5, on A go to 2, on x go to 1)
  State 1:  A → x .
  State 2:  S → A . B
            B → . x
            (on B go to 4, on x go to 3)
  State 3:  B → x .
  State 4:  S → A B .
  State 5:  S' → S .

Parsing Table from the DFA
(1) S → A B, (2) A → x, (3) B → x

           Input                   Goto
  State    x     other   $         S    A    B
  0        s1                      5    2
  1        r2    r2      r2
  2        s3                                4
  3        r3    r3      r3
  4        r1    r1      r1
  5                      accept

Note that this table is LR(0), since each state is either a shift state or a reduce state by a single rule. Thus, no lookahead needs to be consulted.
Example of a parse by the table

     Parsing stack     Input   Action
  1  $ 0               x x $   s1
  2  $ 0 x 1           x $     r2 (A → x)
  3  $ 0 A 2           x $     s3
  4  $ 0 A 2 x 3       $       r3 (B → x)
  5  $ 0 A 2 B 4       $       r1 (S → A B)
  6  $ 0 S 5           $       accept

Note that, with the state numbers on the stack, the symbols themselves need not be put on the stack, and we will delete them occasionally from future examples.

Bison table is essentially the same:

    state 0
        'x'  shift, and go to state 1
        S    go to state 5
        A    go to state 2
    state 1
        A -> 'x' .   (rule 2)
        $default  reduce using rule 2 (A)
    state 2
        S -> A . B   (rule 1)
        'x'  shift, and go to state 3
        B    go to state 4
    state 3
        B -> 'x' .   (rule 3)
        $default  reduce using rule 3 (B)
    state 4
        S -> A B .   (rule 1)
        $default  reduce using rule 1 (S)
    state 5
        $  go to state 6
    state 6
        $  go to state 7
    state 7
        $default  accept
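The parse by the table can be written as a short driver loop. The sketch below hand-copies the LR(0) table for S → A B, A → x, B → x (states 0 to 5, using the hand-built numbering rather than Bison's extra end-of-input states); the dictionary layout and the name `lr_parse` are assumptions for illustration. Only state numbers are kept on the stack.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("S", 2), 2: ("A", 1), 3: ("B", 1)}

# Shift and accept entries, indexed by (state, input symbol).
ACTION = {
    (0, "x"): ("shift", 1),
    (2, "x"): ("shift", 3),
    (5, "$"): ("accept", None),
}
# LR(0) reduce states: the rule is applied without looking at the input.
REDUCE_STATE = {1: 2, 3: 3, 4: 1}
GOTO = {(0, "S"): 5, (0, "A"): 2, (2, "B"): 4}


def lr_parse(tokens):
    """Return the list of actions taken while parsing `tokens`."""
    stack = [0]                   # state numbers only
    rest = list(tokens) + ["$"]
    actions = []
    while True:
        state = stack[-1]
        if state in REDUCE_STATE:
            rule = REDUCE_STATE[state]
            lhs, length = RULES[rule]
            del stack[len(stack) - length:]      # pop one state per rhs symbol
            stack.append(GOTO[(stack[-1], lhs)])
            actions.append(f"r{rule}")
            continue
        kind, target = ACTION.get((state, rest[0]), ("error", None))
        if kind == "shift":
            stack.append(target)
            rest.pop(0)
            actions.append(f"s{target}")
        elif kind == "accept":
            actions.append("accept")
            return actions
        else:
            raise SyntaxError(f"unexpected {rest[0]!r} in state {state}")


print(lr_parse(["x", "x"]))
```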
Example: E → E + n | n
The DFA of sets of items:

  State 0:  E' → . E
            E → . E + n
            E → . n
            (on E go to 1, on n go to 2)
  State 1:  E' → E .
            E → E . + n
            (on + go to 3)
  State 2:  E → n .
  State 3:  E → E + . n
            (on n go to 4)
  State 4:  E → E + n .

Parsing Table (not LR(0)!):

           Input                                     Goto
  State    n     +               $                   E
  0        s2                                        1
  1              s3              accept
  2              r (E → n)       r (E → n)
  3        s4
  4              r (E → E + n)   r (E → E + n)

This table uses the SLR(1) rule: consult the Follow sets to decide between a shift and a reduce, or between two reduces:
Follow(E) = { +, $ }, Follow(E') = { $ }
(Note the clear need for augmentation in this case.)
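The SLR(1) rule for filling in one row of the table can be sketched as follows. The function `slr_row` and its argument layout are assumptions for illustration: it takes the shift transitions of a state and its complete items (dot at the right end), and places each reduction only under the symbols in the Follow set of the rule's left-hand side. A cell that ends up with more than one action is a conflict.

```python
FOLLOW = {"E": {"+", "$"}, "E'": {"$"}}   # Follow sets of the augmented grammar


def slr_row(shift_targets, complete_items, follow, start="E'"):
    """Build one row of an SLR(1) action table.

    shift_targets:  dict terminal -> target state
    complete_items: list of (lhs, rhs) items whose dot is at the right end
    follow:         dict nonterminal -> set of terminals (and "$")
    Returns a dict lookahead -> list of actions.
    """
    row = {}
    for terminal, target in shift_targets.items():
        row.setdefault(terminal, []).append(f"s{target}")
    for lhs, rhs in complete_items:
        for lookahead in follow[lhs]:
            if lhs == start:
                action = "accept"             # reducing by E' -> E means accept
            else:
                action = f"r ({lhs} -> {' '.join(rhs)})"
            row.setdefault(lookahead, []).append(action)
    return row


# State 1 holds E' -> E . and E -> E . + n
print(slr_row({"+": 3}, [("E'", ("E",))], FOLLOW))
# State 2 holds E -> n .
print(slr_row({}, [("E", ("n",))], FOLLOW))
```

State 1 would be a shift-reduce conflict for an LR(0) parser; with Follow(E') = {$} the shift on + and the accept on $ fall into different cells.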
Sample Parse:

     Parsing stack         Input           Action
  1  $ 0                   n + n + n $     shift 2
  2  $ 0 n 2               + n + n $       reduce E → n
  3  $ 0 E 1               + n + n $       shift 3
  4  $ 0 E 1 + 3           n + n $         shift 4
  5  $ 0 E 1 + 3 n 4       + n $           reduce E → E + n
  6  $ 0 E 1               + n $           shift 3
  7  $ 0 E 1 + 3           n $             shift 4
  8  $ 0 E 1 + 3 n 4       $               reduce E → E + n
  9  $ 0 E 1               $               accept

Example with ε (Bison table): S → A B, A → x, A → ε, B → x

    state 0
        'x'  shift, and go to state 1
        'x'  [reduce using rule 3 (A)]
        $default  reduce using rule 3 (A)
        S    go to state 5
        A    go to state 2
    state 1
        A -> 'x' .   (rule 2)
        $default  reduce using rule 2 (A)
    state 2
        etc.
Remarks on Example with ε
- This grammar is unambiguous, but cannot be parsed with one symbol of lookahead.
- The default disambiguation rule in Bison (prefer the shift) means it will never accept the string x, only the string x x.
- Thus, even bottom-up parsers have their power limited by lookahead.

A Contrasting Example with ε: S → A B, A → x, B → x, B → ε

    state 0
        'x'  shift, and go to state 1
        S    go to state 5
        A    go to state 2
    state 1
        A -> 'x' .   (rule 2)
        $default  reduce using rule 2 (A)
    state 2
        S -> A . B   (rule 1)
        'x'  shift, and go to state 3
        $default  reduce using rule 4 (B)
        B    go to state 4
Contrasting Example with ε (continued):

    state 3
        B -> 'x' .   (rule 3)
        $default  reduce using rule 3 (B)
    state 4
        S -> A B .   (rule 1)
        $default  reduce using rule 1 (S)
    state 5
        $  go to state 6
    state 6
        $  go to state 7
    state 7
        $default  accept

Example: E → n + E | n

    state 0
        'n'  shift, and go to state 1
        E    go to state 4
    state 1
        E -> 'n' . '+' E   (rule 1)
        E -> 'n' .   (rule 2)
        '+'  shift, and go to state 2
        $default  reduce using rule 2 (E)
    state 2
        E -> 'n' '+' . E   (rule 1)
        'n'  shift, and go to state 1
        E    go to state 3
    state 3
        E -> 'n' '+' E .   (rule 1)
        $default  reduce using rule 1 (E)
    state 4
        $ ... accept
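Dangling-else:
Grammar:
(1) S → I
(2) S → other
(3) I → if S
(4) I → if S else S

DFA of sets of LR(0) items:

  State 0:  S' → . S
            S → . I
            S → . other
            I → . if S
            I → . if S else S
            (on S go to 1, on I go to 2, on other go to 3, on if go to 4)
  State 1:  S' → S .
  State 2:  S → I .
  State 3:  S → other .
  State 4:  I → if . S
            I → if . S else S
            S → . I
            S → . other
            I → . if S
            I → . if S else S
            (on S go to 5, on I go to 2, on other go to 3, on if go to 4)
  State 5:  I → if S .
            I → if S . else S
            (on else go to 6)
  State 6:  I → if S else . S
            S → . I
            S → . other
            I → . if S
            I → . if S else S
            (on S go to 7, on I go to 2, on other go to 3, on if go to 4)
  State 7:  I → if S else S .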
The parsing table (with conflict):

           Input                             Goto
  State    if    else     other   $          S    I
  0        s4             s3                 1    2
  1                               accept
  2              r1               r1
  3              r2               r2
  4        s4             s3                 5    2
  5              s6/r3            r3
  6        s4             s3                 7    2
  7              r4               r4

When Follow Sets Fail:
Grammar: S → id | V = E
         V → id
         E → V | n
But Yacc/Bison has no trouble with this grammar!

    state 0
        id   shift, and go to state 1
        S    go to state 8
        V    go to state 2
    state 1
        S -> id .   (rule 1)
        V -> id .   (rule 3)
        '='  reduce using rule 3 (V)
        $default  reduce using rule 1 (S)
Notes on this example
- Legal strings are: { id, id=id, id=n }
- First(S) = First(V) = {id}, First(E) = {id, n}
- Follow(S) = Follow(E) = {$}
- Follow(V) = {$, =}
- There is a reduce-reduce SLR(1) conflict on $ in state 1 (Follow(S) and Follow(V) both contain $).

LALR(1) parsers add lookahead
- An LR(1) item has a lookahead component: [S' → . S, $]
- Begin with the above item, and propagate lookahead to closure items as follows: given an item [A → α . B γ, a], add the closure item [B → . β, b] for every production B → β and every b in First(γ a).
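The lookahead propagation rule can be sketched for the grammar S → id | V = E, V → id, E → V | n. The encoding is an assumption for illustration: an LR(1) item is a triple (rule index, dot position, lookahead), and because this grammar has no ε-productions the helper `first_of` only needs to look at the first symbol of γ a. The names `lr1_closure` and `first_of` are hypothetical.

```python
GRAMMAR = [
    ("S'", ("S",)),           # 0: augmentation rule
    ("S", ("id",)),           # 1
    ("S", ("V", "=", "E")),   # 2
    ("V", ("id",)),           # 3
    ("E", ("V",)),            # 4
    ("E", ("n",)),            # 5
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
FIRST = {"S": {"id"}, "V": {"id"}, "E": {"id", "n"}}


def first_of(symbols):
    """First of a nonempty string; valid only because no symbol derives ε."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def lr1_closure(items):
    """Close a set of LR(1) items (rule index, dot position, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot, lookahead = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # Item is [A -> alpha . B gamma, a]; new lookaheads are First(gamma a).
            for b in first_of(rhs[dot + 1:] + (lookahead,)):
                for index, (lhs, _) in enumerate(GRAMMAR):
                    item = (index, 0, b)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return result


start_state = lr1_closure({(0, 0, "$")})
print(sorted(start_state))
```

In the start state the item for V → . id carries the lookahead = only, while the item for S → . id carries $; this is what separates the two reductions that collide under SLR(1).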
The DFA for the example is then:

  State 0:  [S' → . S, $]
            [S → . id, $]
            [S → . V = E, $]
            [V → . id, =]
            (on S go to 8, on id go to 1, on V go to 2)
  State 1:  [S → id ., $]
            [V → id ., =]
  State 2:  [S → V . = E, $]
            (on = go to 3)
  State 3:  [S → V = . E, $]
            [E → . V, $]
            [E → . n, $]
            [V → . id, $]
            (on E go to 7; on V, n and id go to the three single-item states below)
  State 7:  [S → V = E ., $]
  State 8:  [S' → S ., $]
  Remaining states, reached from state 3 on V, n and id respectively:
            [E → V ., $]
            [E → n ., $]
            [V → id ., $]

Definition of First Sets for Symbols (and ε):
- If X is a terminal or ε, then First(X) = {X}.
- If X is a nonterminal, then for each production choice X → X1 X2 ... Xn, First(X) contains First(X1) - {ε}.
- If also for some i < n, all of the sets First(X1), ..., First(Xi) contain ε, then First(X) contains First(Xi+1) - {ε}.
- If all of the sets First(X1), ..., First(Xn) contain ε, then First(X) also contains ε.
Definition of First Sets for Strings of Symbols:
First(α) for any string α = X1 X2 ... Xn (a string of terminals and nonterminals) is as follows.
- First(α) contains First(X1) - {ε}.
- For each i = 2, ..., n, if First(Xk) contains ε for all k = 1, ..., i-1, then First(α) contains First(Xi) - {ε}.
- Finally, if for all i = 1, ..., n, First(Xi) contains ε, then First(α) contains ε.

Example
Grammar: S → A B | z, A → x | ε, B → y | ε
First(A) = { x, ε }
First(B) = { y, ε }
First(S) = { x, y, z, ε }
First(ASw) = { x, y, z, w }
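The definition for strings translates almost directly into a loop. In this sketch the First sets of the nonterminals are taken as given from the example, any symbol without an entry is treated as a terminal, and the name `first_of_string` is an illustrative assumption.

```python
EPS = "ε"

# First sets of the nonterminals, copied from the example.
FIRST = {
    "S": {"x", "y", "z", EPS},
    "A": {"x", EPS},
    "B": {"y", EPS},
}


def first_of_string(symbols, first):
    """Compute First(X1 X2 ... Xn) from the First sets of the single symbols."""
    result = set()
    for symbol in symbols:
        # A symbol with no entry is a terminal a, with First(a) = {a}.
        symbol_first = first.get(symbol, {symbol})
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result        # later symbols cannot contribute
    result.add(EPS)              # every Xi can vanish (or the string is empty)
    return result


print(sorted(first_of_string(["A", "S", "w"], FIRST)))
```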
Algorithm for Computing First Sets for Nonterminals:

    for all nonterminals A do First(A) := {} ;
    while there are changes to any First(A) do
      for each production choice A → X1 X2 ... Xn do
        k := 1 ; Continue := true ;
        while Continue = true and k <= n do
          add First(Xk) - {ε} to First(A) ;
          if ε is not in First(Xk) then Continue := false ;
          k := k + 1 ;
        if Continue = true then add ε to First(A) ;

Example of Algorithm

  Grammar Rule                               Pass 1                         Pass 2
  statement → ifstmt                                                        First(statement) = {if, other}
  statement → other                          First(statement) = {other}
  ifstmt → if ( exp ) statement elsepart     First(ifstmt) = {if}
  elsepart → else statement                  First(elsepart) = {else}
  elsepart → ε                               First(elsepart) = {else, ε}
  exp → 0                                    First(exp) = {0}
  exp → 1                                    First(exp) = {0, 1}
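A direct Python transcription of the algorithm, run on the if-statement grammar of the example, might look like this. The grammar encoding (a list of (lhs, rhs) pairs with an empty tuple for an ε-production) and the name `compute_first` are assumptions for illustration.

```python
EPS = "ε"

GRAMMAR = [
    ("statement", ("ifstmt",)),
    ("statement", ("other",)),
    ("ifstmt", ("if", "(", "exp", ")", "statement", "elsepart")),
    ("elsepart", ("else", "statement")),
    ("elsepart", ()),                      # elsepart -> ε
    ("exp", ("0",)),
    ("exp", ("1",)),
]


def compute_first(grammar):
    """Iterate over all production choices until no First set changes."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = {a: set() for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            before = len(first[lhs])
            all_nullable = True            # plays the role of Continue
            for symbol in rhs:
                symbol_first = first[symbol] if symbol in nonterminals else {symbol}
                first[lhs] |= symbol_first - {EPS}
                if EPS not in symbol_first:
                    all_nullable = False
                    break
            if all_nullable:
                first[lhs].add(EPS)
            if len(first[lhs]) != before:
                changed = True
    return first


for name, symbols in sorted(compute_first(GRAMMAR).items()):
    print(f"First({name}) = {sorted(symbols)}")
```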
Definition of Follow Sets for Nonterminals:
Given a nonterminal A, the set Follow(A), consisting of terminals, and possibly $, is defined as follows.
1. If A is the start symbol, then $ is in Follow(A).
2. If there is a production B → α A γ, then First(γ) - {ε} is contained in Follow(A).
3. If there is a production B → α A γ such that ε is in First(γ), then Follow(A) contains Follow(B).

Algorithm for Computing Follow Sets for Nonterminals:

    Follow(start-symbol) := {$} ;
    for all nonterminals A ≠ start-symbol do Follow(A) := {} ;
    while there are changes to any Follow sets do
      for each production A → X1 X2 ... Xn do
        for each Xi that is a nonterminal do
          add First(Xi+1 Xi+2 ... Xn) - {ε} to Follow(Xi)
          (* Note: if i = n, then Xi+1 Xi+2 ... Xn = ε *)
          if ε is in First(Xi+1 Xi+2 ... Xn) then
            add Follow(A) to Follow(Xi)
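The Follow algorithm can be sketched in Python in the same style. The First sets are taken as given for the if-statement grammar used in the First-set example; the grammar encoding and the names `first_of_string` and `compute_follow` are assumptions for illustration.

```python
EPS = "ε"

GRAMMAR = [
    ("statement", ("ifstmt",)),
    ("statement", ("other",)),
    ("ifstmt", ("if", "(", "exp", ")", "statement", "elsepart")),
    ("elsepart", ("else", "statement")),
    ("elsepart", ()),                      # elsepart -> ε
    ("exp", ("0",)),
    ("exp", ("1",)),
]
FIRST = {
    "statement": {"if", "other"},
    "ifstmt": {"if"},
    "elsepart": {"else", EPS},
    "exp": {"0", "1"},
}


def first_of_string(symbols, first):
    """First of a string of symbols; symbols without an entry are terminals."""
    result = set()
    for symbol in symbols:
        symbol_first = first.get(symbol, {symbol})
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, first, start):
    """Iterate over all productions until no Follow set changes."""
    nonterminals = {lhs for lhs, _ in grammar}
    follow = {a: set() for a in nonterminals}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, symbol in enumerate(rhs):
                if symbol not in nonterminals:
                    continue
                before = len(follow[symbol])
                rest_first = first_of_string(rhs[i + 1:], first)
                follow[symbol] |= rest_first - {EPS}
                if EPS in rest_first:      # the rest can vanish (or is empty)
                    follow[symbol] |= follow[lhs]
                if len(follow[symbol]) != before:
                    changed = True
    return follow


for name, symbols in sorted(compute_follow(GRAMMAR, FIRST, "statement").items()):
    print(f"Follow({name}) = {sorted(symbols)}")
```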
Example
Grammar: S → A B | z, A → x | ε, B → y | ε
First(A) = { x, ε }
First(B) = { y, ε }
First(S) = { x, y, z, ε }
First(ASw) = { x, y, z, w }
Follow(S) = { $ }
Follow(A) = { y, $ }
Follow(B) = { $ }

A Second Example
Grammar: statement → ifstmt | other
         ifstmt → if ( exp ) statement elsepart
         elsepart → else statement | ε
         exp → 0 | 1
First(statement) = {if, other}
First(ifstmt) = {if}
First(elsepart) = {else, ε}
First(exp) = {0, 1}
Follow(statement) = {$, else}
Follow(ifstmt) = {$, else}
Follow(elsepart) = {$, else}
Follow(exp) = { ) }
