Concepts of Programming Languages, CSCI 305, Fall 2017
Exercise 11, LL Grammars and Predict Sets, Oct. 30

1. The following grammar defines a subset of Lisp:

    P  → E $$
    E  → atom
    E  → ' E            // Tick (') is a token
    E  → ( E Es )
    Es → E Es
    Es → ε

a. Determine the EPS, FIRST and FOLLOW sets for this grammar.

    Symbol  EPS    FIRST set     FOLLOW set
    P       False  atom, ', (    empty
    E       False  atom, ', (    $$, atom, ', (, )
    Es      True   atom, ', (    )
    $$      False  $$            empty
    atom    False  atom          $$, atom, ', (, )
    '       False  '             atom, ', (
    (       False  (             atom, ', (
    )       False  )             $$, atom, ', (, )

b. Add the predict sets to the grammar above.

    1. P  → E $$          { atom, ', ( }
    2. E  → atom          { atom }
    3. E  → ' E           { ' }
    4. E  → ( E Es )      { ( }
    5. Es → E Es          { atom, ', ( }
    6. Es → ε             { ) }

c. Create a parse table.

            atom  '    (    )    $$
    P       1     1    1    -    -
    E       2     3    4    -    -
    Es      5     5    5    6    -
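The sets in parts (a) and (b) can be cross-checked mechanically. The following is a minimal sketch of the usual fixed-point computation, not code from the course: the list-of-pairs grammar encoding and the function names `analyze` and `predict` are assumptions made for illustration.

```python
# Hypothetical encoding of the exercise 1 grammar as (left side, right side) pairs.
GRAMMAR = [
    ("P", ["E", "$$"]),
    ("E", ["atom"]),
    ("E", ["'", "E"]),
    ("E", ["(", "E", "Es", ")"]),
    ("Es", ["E", "Es"]),
    ("Es", []),  # Es -> epsilon
]


def analyze(grammar):
    """Return (eps, first, follow) dictionaries covering every grammar symbol."""
    nonterminals = {lhs for lhs, _ in grammar}
    symbols = nonterminals | {sym for _, rhs in grammar for sym in rhs}
    eps = {sym: False for sym in symbols}
    first = {sym: set() if sym in nonterminals else {sym} for sym in symbols}
    follow = {sym: set() for sym in symbols}

    changed = True
    while changed:  # iterate until no set grows any further
        changed = False
        for lhs, rhs in grammar:
            # FIRST(lhs) absorbs FIRST of each symbol up to the first non-nullable one.
            all_nullable = True
            for sym in rhs:
                if not first[sym] <= first[lhs]:
                    first[lhs] |= first[sym]
                    changed = True
                if not eps[sym]:
                    all_nullable = False
                    break
            if all_nullable and not eps[lhs]:
                eps[lhs] = True
                changed = True
            # Walk the right side backwards; `trailer` holds what can follow `sym`.
            trailer = set(follow[lhs])
            for sym in reversed(rhs):
                if not trailer <= follow[sym]:
                    follow[sym] |= trailer
                    changed = True
                trailer = (trailer | first[sym]) if eps[sym] else set(first[sym])
    return eps, first, follow


def predict(grammar, eps, first, follow):
    """Return one predict set per production, in grammar order."""
    result = []
    for lhs, rhs in grammar:
        tokens = set()
        for sym in rhs:
            tokens |= first[sym]
            if not eps[sym]:
                break
        else:  # the whole right side can derive epsilon
            tokens |= follow[lhs]
        result.append(tokens)
    return result


if __name__ == "__main__":
    eps, first, follow = analyze(GRAMMAR)
    for (lhs, rhs), tokens in zip(GRAMMAR, predict(GRAMMAR, eps, first, follow)):
        print(lhs, "->", " ".join(rhs) or "ε", sorted(tokens))
```

Because FOLLOW is tracked for every symbol, the sketch also reproduces the terminal rows of the table in part (a).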
2. Exercise 2.11 from the text. Prove that the following grammar is LL(1):

    decl      → ID decl_tail
    decl_tail → , decl
    decl_tail → : ID ;

LL(1) means that a single token of lookahead is enough to decide which production to apply. That is, the parser never needs to peek past the next token.

    Symbol     EPS    FIRST set
    decl       False  ID
    decl_tail  False  , and :
    ID         False  ID
    ,          False  ,
    :          False  :
    ;          False  ;

Since no symbol can derive ε (EPS is False in every row), the FOLLOW sets aren't needed.

    1. decl      → ID decl_tail     { ID }
    2. decl_tail → , decl           { , }
    3. decl_tail → : ID ;           { : }

                 ID   ,    :    ;
    decl         1    -    -    -
    decl_tail    -    2    3    -

No cell of the parse table holds more than one production, so the grammar is LL(1). Another way to argue this: the right side of every production begins with a token, and no two productions for the same nonterminal begin with the same token, so there is never any doubt about which production to use.
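To illustrate what "no ambiguity in the parse table" buys, here is a small sketch of a table-driven LL(1) parser that uses the three-entry table from this exercise. The names `RULES`, `TABLE`, `END` and `parse` are illustrative assumptions, and the `<eof>` marker is added only so the loop can detect the end of the input.

```python
# Productions numbered as in the exercise: number -> (left side, right side).
RULES = {
    1: ("decl", ["ID", "decl_tail"]),
    2: ("decl_tail", [",", "decl"]),
    3: ("decl_tail", [":", "ID", ";"]),
}
# Parse table: (nonterminal on the stack, lookahead token) -> production number.
TABLE = {
    ("decl", "ID"): 1,
    ("decl_tail", ","): 2,
    ("decl_tail", ":"): 3,
}
NONTERMINALS = {"decl", "decl_tail"}
END = "<eof>"  # assumed end marker, not part of the grammar


def parse(tokens):
    """Return the production numbers applied, or raise ValueError on bad input."""
    tokens = list(tokens) + [END]
    stack = ["decl"]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                raise ValueError(f"no production for {top} on {look!r}")
            applied.append(rule)
            # Push the right side reversed so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            pos += 1  # terminal on the stack matches the input token
        else:
            raise ValueError(f"expected {top!r}, saw {look!r}")
    if tokens[pos] != END:
        raise ValueError(f"unexpected trailing token {tokens[pos]!r}")
    return applied


if __name__ == "__main__":
    print(parse(["ID", ",", "ID", ":", "ID", ";"]))
```

Each step consults exactly one table cell, which is only possible because every cell holds at most one production.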
3. Do exercise 2.12 from the text. Consider the following grammar:

    G → S $$
    S → A M
    M → S
    M → ε
    A → a E
    A → b A A
    E → a B
    E → b A
    E → ε
    B → b E
    B → a B B

a. Describe in English the language that the grammar generates.

The grammar generates all strings of a's and b's (terminated by an end marker) in which there are more a's than b's.

b. Show a parse tree for the string a b a a.

There are two parse trees for the specified string, which means that the grammar is ambiguous. In one, the first A derives a b a (A → a E, E → b A, A → a E, E → ε) and a second A derives the final a. In the other, the first A derives only a (A → a E, E → ε) and the second A derives b a a (A → b A A).

c. Is the grammar LL(1)? If so, show the parse table; if not, identify a prediction conflict.

Since the grammar is ambiguous, it cannot be LL(1). A top-down parser would be unable to decide whether to predict an epsilon production when E is at the top of the stack.
To see this, construct the EPS, FIRST and FOLLOW sets.

    Nonterminals
    and terminals   EPS    FIRST     FOLLOW
    G               false  {a, b}    ∅
    S               false  {a, b}    {$$}
    A               false  {a, b}    {a, b, $$}
    M               true   {a, b}    {$$}
    E               true   {a, b}    {a, b, $$}
    B               false  {a, b}    {a, b, $$}
    $$              false  {$$}      ∅
    a               false  {a}       {a, b, $$}
    b               false  {b}       {a, b, $$}

Add the predict sets to the grammar above.

     1. G → S $$       {a, b}
     2. S → A M        {a, b}
     3. M → S          {a, b}
     4. M → ε          {$$}
     5. A → a E        {a}
     6. A → b A A      {b}
     7. E → a B        {a}
     8. E → b A        {b}
     9. E → ε          {a, b, $$}
    10. B → b E        {b}
    11. B → a B B      {a}

Create a parse table.

         a      b      $$
    G    1      1      -
    S    2      2      -
    A    5      6      -
    M    3      3      4
    E    7,9    8,9    9
    B    11     10     -

The grammar can't be LL(1) because, given E on the stack and the token a, either rule 7 or rule 9 can be applied. Similarly, given E and the token b, either rule 8 or rule 9 can be applied.
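The ambiguity claimed in part (b) can also be checked by brute force. The sketch below counts the parse trees of a string under the exercise 3 grammar with a memoized recursion over substrings. The dictionary encoding and the name `count_trees` are assumptions made for illustration, and the `$$` end marker is left out so that counting starts from S.

```python
from functools import lru_cache

# Exercise 3 grammar without the end marker; () stands for an epsilon right side.
GRAMMAR = {
    "S": [("A", "M")],
    "M": [("S",), ()],
    "A": [("a", "E"), ("b", "A", "A")],
    "E": [("a", "B"), ("b", "A"), ()],
    "B": [("b", "E"), ("a", "B", "B")],
}


def count_trees(text, start="S"):
    """Count the distinct parse trees that derive `text` from `start`."""
    tokens = tuple(text)

    @lru_cache(maxsize=None)
    def symbol(sym, i, j):
        # Number of ways `sym` derives tokens[i:j].
        if sym not in GRAMMAR:  # terminal: must match exactly one token
            return int(j == i + 1 and tokens[i] == sym)
        return sum(sequence(rhs, i, j) for rhs in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if not rhs:
            return int(i == j)
        total = 0
        for k in range(i, j + 1):
            head = symbol(rhs[0], i, k)
            if head:  # skip the tail when the head cannot match; this also
                total += head * sequence(rhs[1:], k, j)  # keeps the recursion finite
        return total

    return symbol(start, 0, len(tokens))


if __name__ == "__main__":
    for text in ("a", "ab", "abaa"):
        print(text, count_trees(text))
```

A count above 1 for any string is enough to show the grammar is ambiguous, and an ambiguous grammar cannot be LL(1).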
4. Do exercise 2.13 from the text. Consider the following grammar:

    stmt       → assignment
    stmt       → subr_call
    assignment → id := expr
    subr_call  → id ( arg_list )
    expr       → primary expr_tail
    expr_tail  → op expr
    expr_tail  → ε
    primary    → id
    primary    → subr_call
    primary    → ( expr )
    op         → + | - | * | /
    arg_list   → expr args_tail
    args_tail  → , arg_list
    args_tail  → ε

a. Construct a parse tree for the input string foo(a, b).

b. Give a canonical (right-most) derivation of this same string.
c. Prove that the grammar is not LL(1).

    Symbol      EPS    FIRST set      FOLLOW set
    stmt        false  id             empty
    assignment  false  id             empty
    subr_call   false  id             +, -, *, /, ",", )
    expr        false  id, (          ",", )
    expr_tail   true   +, -, *, /     ",", )
    primary     false  id, (          +, -, *, /, ",", )
    op          false  +, -, *, /     id, (
    arg_list    false  id, (          )
    args_tail   true   ","            )

     1. stmt       → assignment           { id }
     2. stmt       → subr_call            { id }
     3. assignment → id := expr           { id }
     4. subr_call  → id ( arg_list )      { id }
     5. expr       → primary expr_tail    { id, ( }
     6. expr_tail  → op expr              { +, -, *, / }
     7. expr_tail  → ε                    { ",", ) }
     8. primary    → id                   { id }
     9. primary    → subr_call            { id }
    10. primary    → ( expr )             { ( }
    11. op         → + | - | * | /        { +, -, *, / }
    12. arg_list   → expr args_tail       { id, ( }
    13. args_tail  → , arg_list           { "," }
    14. args_tail  → ε                    { ) }

Create a parse table.

                 id     :=   (    +    -    *    /    ,    )
    stmt         1,2    -    -    -    -    -    -    -    -
    assignment   3      -    -    -    -    -    -    -    -
    subr_call    4      -    -    -    -    -    -    -    -
    expr         5      -    5    -    -    -    -    -    -
    expr_tail    -      -    -    6    6    6    6    7    7
    primary      8,9    -    10   -    -    -    -    -    -
    op           -      -    -    11   11   11   11   -    -
    arg_list     12     -    12   -    -    -    -    -    -
    args_tail    -      -    -    -    -    -    -    13   14

The grammar is not LL(1): with stmt on the stack and id as the next token, both rule 1 and rule 2 are predicted. Likewise, with primary on the stack and id as the next token, both rule 8 and rule 9 are predicted.
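Filling in the table from the predict sets is mechanical, so the two conflicts can be found by a short script. This sketch hard-codes the predict sets listed above. The names `PREDICT`, `build_table` and `conflicts` are illustrative assumptions.

```python
from collections import defaultdict

OPS = {"+", "-", "*", "/"}

# Rule number -> (left side, predict set), copied from the numbered list above.
PREDICT = {
    1: ("stmt", {"id"}),
    2: ("stmt", {"id"}),
    3: ("assignment", {"id"}),
    4: ("subr_call", {"id"}),
    5: ("expr", {"id", "("}),
    6: ("expr_tail", OPS),
    7: ("expr_tail", {",", ")"}),
    8: ("primary", {"id"}),
    9: ("primary", {"id"}),
    10: ("primary", {"("}),
    11: ("op", OPS),
    12: ("arg_list", {"id", "("}),
    13: ("args_tail", {","}),
    14: ("args_tail", {")"}),
}


def build_table(predict):
    """Map each (nonterminal, token) cell to the list of rules predicted there."""
    table = defaultdict(list)
    for rule, (lhs, tokens) in sorted(predict.items()):
        for token in tokens:
            table[(lhs, token)].append(rule)
    return table


def conflicts(table):
    """Return only the cells that hold more than one rule."""
    return {cell: rules for cell, rules in table.items() if len(rules) > 1}


if __name__ == "__main__":
    for (lhs, token), rules in sorted(conflicts(build_table(PREDICT)).items()):
        print(f"{lhs} on {token}: rules {rules}")
```

A grammar is LL(1) exactly when `conflicts` comes back empty, so any non-empty result is a proof that it is not.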
d. Modify the grammar so that it is LL(1).

The original grammar:

    stmt       → assignment
    stmt       → subr_call
    assignment → id := expr
    subr_call  → id ( arg_list )
    expr       → primary expr_tail
    expr_tail  → op expr
    expr_tail  → ε
    primary    → id
    primary    → subr_call
    primary    → ( expr )
    op         → + | - | * | /
    arg_list   → expr args_tail
    args_tail  → , arg_list
    args_tail  → ε

Left-factor id out of the stmt and primary productions:

    stmt         → id stmt_tail
    stmt_tail    → := expr
    stmt_tail    → ( arg_list )
    expr         → primary expr_tail
    expr_tail    → op expr
    expr_tail    → ε
    primary      → id primary_tail
    primary_tail → ε
    primary_tail → ( arg_list )
    primary      → ( expr )
    op           → + | - | * | /
    arg_list     → expr args_tail
    args_tail    → , arg_list
    args_tail    → ε
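One way to see that the factored grammar needs only one token of lookahead is to write a recursive-descent recognizer for it, with one method per nonterminal and every choice made by inspecting the current token alone. The sketch below is an illustration under assumptions: the tokenizer, the `$$` end marker (which the exercise grammar does not include), and the names `tokenize`, `Parser` and `accepts` are all hypothetical.

```python
import re

OPS = {"+", "-", "*", "/"}
TOKEN = re.compile(r"\s*(:=|[A-Za-z_]\w*|[()+\-*/,])")


def tokenize(text):
    """Split text into grammar tokens; every identifier becomes the token 'id'."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        lexeme = m.group(1)
        tokens.append("id" if lexeme[0].isalpha() or lexeme[0] == "_" else lexeme)
        pos = m.end()
    return tokens + ["$$"]  # assumed end marker


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    @property
    def look(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.look != expected:
            raise ValueError(f"expected {expected!r}, saw {self.look!r}")
        self.pos += 1

    def stmt(self):  # stmt -> id stmt_tail
        self.match("id")
        self.stmt_tail()

    def stmt_tail(self):  # stmt_tail -> := expr | ( arg_list )
        if self.look == ":=":
            self.match(":=")
            self.expr()
        else:
            self.match("(")
            self.arg_list()
            self.match(")")

    def expr(self):  # expr -> primary expr_tail
        self.primary()
        self.expr_tail()

    def expr_tail(self):  # expr_tail -> op expr | epsilon
        if self.look in OPS:
            self.match(self.look)
            self.expr()

    def primary(self):  # primary -> id primary_tail | ( expr )
        if self.look == "id":
            self.match("id")
            self.primary_tail()
        else:
            self.match("(")
            self.expr()
            self.match(")")

    def primary_tail(self):  # primary_tail -> ( arg_list ) | epsilon
        if self.look == "(":
            self.match("(")
            self.arg_list()
            self.match(")")

    def arg_list(self):  # arg_list -> expr args_tail
        self.expr()
        self.args_tail()

    def args_tail(self):  # args_tail -> , arg_list | epsilon
        if self.look == ",":
            self.match(",")
            self.arg_list()


def accepts(text):
    """Return True if text is a stmt of the factored grammar."""
    try:
        parser = Parser(tokenize(text))
        parser.stmt()
        parser.match("$$")
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(accepts("foo(a, b)"))
```

No method ever backtracks or looks past `self.look`, which is the recursive-descent counterpart of a parse table with at most one rule per cell.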