Compiler construction in4303 lecture 3 Top-down parsing Chapter 2.2-2.2.4 Overview syntax analysis: tokens AST language grammar parser generator program text lexical analysis tokens syntax analysis AST context handling AST construction annotated AST by hand: recursive descent automatic: top-down LLgen, bottom-up yacc Syntax analysis parsing: given a context-free grammar and a stream of tokens, find the derivation that ties them together. result: parse tree - * factor factor * factor identifier factor * factor identifier identifier c identifier b constant a b 4 Context free grammar G = V N, V T, S, P V N : set of Non-inal symbols V T : set of Terminal symbols S : Start symbol S V N P : set of Production rules {N α V N V T = P = {N α N V N α V N V T * Top-down parsing Bottom-up parsing process tokens left to right grammar rest_ IFIER rest_ rest_ process tokens left to right grammar rest_ IFIER rest_ rest_ rest_expr IFIER example aap noot mies aap noot mies 1
Comparison Recursive descent parsing node creation alternative selection top-down pre-order 1 2 3 first token 4 5 bottom-up post-order 5 1 4 last token 2 3 each rule N translates to a boolean function return true if a inal production of N was matched return false otherwise without consuming any token try alternatives of N in turn a inal symbol must match the current token a non-inal is matched by calling its routine grammar type implementation restricted LL1 manual automatic relaxed LR1 automatic int void { return && requiretoken; Recursive descent parsing Recursive descent parsing rest_ int void { return && requirerest_; rest_ int rest_void { return token'' && require 1; IFIER int void { return tokenifier token'' && require && requiretoken''; auxiliary functions consume matched tokens report syntax errors int tokenint tk { if Token.class!= tk return 0; get_next_token; return 1; int requireint found { if!found error; return 1; Automatic top-down parsing Exercise 4 min. follow recursive descent scheme, but avoid interpretation overhead for each rule and alternative deine the tokens it can start with: FIRST set parsing scheme for rule N A 1 A 2 if token FIRSTA 1 then parse A 1 else if token FIRSTA 2 then parse A 2 else ERROR deine the FIRST set of the noninals in our grammar rest_ IFIER rest_ and, for this grammar S A B A A a B B b b 2
Answers Computing the FIRST set N w w V T N A 1 A 2 N closure algorithm: initial values inference rules iterative solution N A β Predictive parsing similar to recursive descent, but no back-tracking functions know what they are doing FIRST = {, void void { switch Token.class { case : case '': ; token; break; default: error; void tokenint tk { if Token.class!= tk error; get_next_token; Predictive parsing rest_ FIRST = {, void void { switch Token.class { case : case '': ; rest_; break; default: error; IFIER void void { switch Token.class { case : token; break; case '': token''; ; token''; break; default: error; Predictive parsing rest_ FIRSTrest_expr = {, void rest_void { switch Token.class { case '': token''; ; break; case : case '': break; default: break; error; FOLLOWrest_expr = {, FIRST = { check nothing? NO: token FOLLOWrest_expr Limitations of LL1 parsers FIRST/FIRST conflict IFIER IFIER [ ] FIRST/FOLLOW conflict S A a b A a left recursion - FIRSTA = { a, { a = FOLLOWA 3
Making grammars LL1 Left factoring manual labour rewrite grammar adjust semantic actions three rewrite methods left factoring substitution left-recursion removal IFIER IFIER [ ] factor out common prefix IFIER after_identifier after_identifier [ ] [ FOLLOWafter_identifier Substitution Left-recursion removal S A a b A a replace non-inal by its alternative S a a b a b N N α β replace by N βm M αm example β βα βαα βααα... N α β - tail tail - tail Exercise 7 min. Answers make the following grammar LL1 - * factor / factor factor factor func-call identifier constant func-call identifier expr-list? expr-list, * and what about S if E then S else S? 4
automatic generation LL1 push-down automaton language grammar parser generator program text lexical analysis tokens syntax analysis AST context handling top of stack transition table LL1 push-down automaton annotated AST stack right-hand side of production LL1 push-down automaton LL1 push-down automaton aap noot mies aap noot mies replace non-inal by transition entry replace non-inal by transition entry top of stack top of stack LL1 push-down automaton LL1 push-down automaton pop matching token aap noot mies aap noot mies replace non-inal by transition entry top of stack top of stack 5
LL1 push-down automaton LL1 push-down automaton pop matching token noot mies noot mies replace non-inal by transition entry top of stack top of stack 6