CS308 Compiler Principles Syntax Analyzer Li Jiang


1 CS308 Syntax Analyzer Li Jiang Department of Computer Science and Engineering Shanghai Jiao Tong University
2 Syntax Analyzer The syntax analyzer creates the syntactic structure of the given source program. This syntactic structure is usually a parse tree. The syntax analyzer is also known as the parser. The syntax of a program is described by a context-free grammar (CFG). We will use BNF (Backus-Naur Form) notation to describe CFGs. The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a context-free grammar. If it does, the parser creates the parse tree of that program. Otherwise, the parser reports error messages. A context-free grammar gives a precise syntactic specification of a programming language. The design of the grammar is an initial phase of the design of a compiler. A grammar can be converted directly into a parser by some tools.
3 Parser / Syntax Analyzer The parser works on a stream of tokens. The smallest item is a token.
[Figure: source program → Lexical Analyzer → Parser → parse tree; the parser issues "get next token" and the lexical analyzer returns a token]
The parser obtains a string of tokens from the lexical analyzer and verifies that the string of token names can be generated by the grammar for the source language.
4 Parsers (Cont'd) We categorize parsers into two groups:
1. Top-Down Parser: the parse tree is created top to bottom, starting from the root.
2. Bottom-Up Parser: the parse tree is created bottom to top, starting from the leaves.
Both scan the input from left to right (one symbol at a time). Efficient top-down and bottom-up parsers can be implemented only for subclasses of context-free grammars: LL for top-down parsing, LR for bottom-up parsing.
5 Outline: Context-Free Grammar, Parse Tree, Top-Down Parser, Bottom-Up Parser
6 Context-Free Grammars Recursive structures of a programming language are defined by a context-free grammar. A context-free grammar consists of:
- A finite set of terminals (in our case, the set of tokens)
- A finite set of nonterminals (syntactic variables)
- A finite set of production rules of the form A → α, where A is a nonterminal and α is a string of terminals and nonterminals (including the empty string)
- A start symbol (one of the nonterminal symbols)
Example:
E → E + E | E - E | E * E | E / E | - E
E → ( E )
E → id
7 Derivations E ⇒ E+E means that E derives E+E (E+E derives from E): we can replace E by E+E, and for this we must have a production rule E → E+E in our grammar.
E ⇒ E+E ⇒ id+E ⇒ id+id
A sequence of replacements of nonterminal symbols is called a derivation of id+id from E. In general a derivation step is αAβ ⇒ αγβ if there is a production rule A → γ in our grammar, where α and β are arbitrary strings of terminal and nonterminal symbols.
α1 ⇒ α2 ⇒ ... ⇒ αn (αn derives from α1, or α1 derives αn)
⇒ : derives in one step
⇒* : derives in zero or more steps
⇒+ : derives in one or more steps
8 CFG Terminology L(G) is the language of grammar G (the language generated by G). It is a set of sentences. A sentence of L(G) is a string of terminal symbols of G. If S is the start symbol of G, then ω is a sentence of L(G) iff S ⇒* ω, where ω is a string of terminals of G. If G is a context-free grammar, L(G) is a context-free language. Two grammars are equivalent if they produce the same language.
For S ⇒* α:
- If α contains nonterminals, it is called a sentential form of G.
- If α does not contain nonterminals, it is called a sentence of G.
9 Derivation Example
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
OR
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)
At each derivation step, we can choose any of the nonterminals in the sentential form of G for the replacement. If we always choose the leftmost nonterminal in each derivation step, the derivation is called a leftmost derivation. If we always choose the rightmost nonterminal in each derivation step, the derivation is called a rightmost derivation.
10 Left-Most and Right-Most Derivations
Left-most derivation: E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(id+E) ⇒lm -(id+id)
Right-most derivation: E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+id) ⇒rm -(id+id)
Top-down parsers try to find the leftmost derivation of the given source program. Bottom-up parsers try to find the rightmost derivation of the given source program in reverse order.
11 Quiz The set of all strings of 0s and 1s that are palindromes; that is, strings that read the same backward as forward. The set of all strings of 0s and 1s with an equal number of 0s and 1s.
12 Outline: Context-Free Grammar, Parse Tree, Top-Down Parser, Bottom-Up Parser
13 Parse Tree A parse tree is a graphical representation of a derivation. Inner nodes of a parse tree are nonterminal symbols. The leaves of a parse tree are terminal symbols.
[Figure: the parse tree grown step by step along the derivation E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)]
14 Ambiguity A grammar that produces more than one parse tree for a sentence is an ambiguous grammar.
E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id
E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id
[Figure: the two different parse trees of id+id*id, one with + at the root and one with * at the root]
15 Ambiguity (Cont'd) For most parsers, the grammar must be unambiguous. An unambiguous grammar gives a unique parse tree for each sentence. We should eliminate the ambiguity in the grammar during the design phase of the compiler. An ambiguous grammar should be rewritten to eliminate the ambiguity. How? We choose one of the parse trees of a sentence (generated by the ambiguous grammar) as the preferred one, and rewrite the grammar so that it allows only this choice.
16 Ambiguity Elimination (Cont'd) Ambiguous grammars (ambiguous because of their operators) can be disambiguated according to precedence and associativity rules.
E → E+E | E*E | E^E | id | (E)
Disambiguate the grammar using the precedence (highest first): ^ (right to left), * (left to right), + (left to right)
E → E+T | T
T → T*F | F
F → G^F | G
G → id | (E)
17 Ambiguity (Cont'd)
stmt → if expr then stmt | if expr then stmt else stmt | otherstmts
Sentence: if E1 then if E2 then S1 else S2
[Figure: two parse trees for this sentence. In the first, else S2 belongs to the outer if (E1); in the second, else S2 belongs to the inner if (E2).]
18 Ambiguity Elimination (Cont'd) We prefer the parse tree in which else matches the closest if. So, we can disambiguate our grammar to reflect this choice. The unambiguous grammar will be:
stmt → matchedstmt | unmatchedstmt
matchedstmt → if expr then matchedstmt else matchedstmt | otherstmts
unmatchedstmt → if expr then stmt | if expr then matchedstmt else unmatchedstmt
Try again: if E1 then if E2 then S1 else S2
19 Left Recursion A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α. Top-down parsing techniques cannot handle left-recursive grammars. So, we have to convert our left-recursive grammar into an equivalent grammar which is not left-recursive. The left recursion may appear in a single step of the derivation (immediate left recursion), or may appear in more than one step of the derivation.
20 Immediate Left-Recursion Elimination
A → Aα | β, where β does not start with A
Eliminate immediate left recursion to get an equivalent grammar:
A → βA'
A' → αA' | ε
In general:
A → Aα1 | ... | Aαm | β1 | ... | βn, where β1 ... βn do not start with A
Eliminate immediate left recursion to get an equivalent grammar:
A → β1A' | ... | βnA'
A' → α1A' | ... | αmA' | ε
21 Immediate Left-Recursion Elimination Example
E → E+T | T
T → T*F | F
F → id | (E)
Eliminate immediate left recursion:
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → id | (E)
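The transformation on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `eliminate_immediate_left_recursion`, the tuple-based representation of production bodies, and the convention that the empty tuple stands for ε are assumptions made here, not part of the slides.

```python
EPSILON = ()  # assumption: the empty body stands for an epsilon-production


def eliminate_immediate_left_recursion(nonterminal, bodies):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    Each body is a tuple of grammar symbols. Returns a dict mapping A (and,
    when A is left recursive, the new nonterminal A') to its list of bodies.
    """
    alphas = [b[1:] for b in bodies if b and b[0] == nonterminal]
    betas = [b for b in bodies if not b or b[0] != nonterminal]
    if not alphas:  # nothing to do: A is not immediately left recursive
        return {nonterminal: list(bodies)}
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + (new_nt,) for beta in betas],
        new_nt: [alpha + (new_nt,) for alpha in alphas] + [EPSILON],
    }


# E -> E+T | T   and   T -> T*F | F   from the slide
expr = eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)])
term = eliminate_immediate_left_recursion("T", [("T", "*", "F"), ("F",)])
print(expr)
print(term)
```

The results correspond to E → TE', E' → +TE' | ε and T → FT', T' → *FT' | ε.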
22 Non-Immediate Left-Recursion Just eliminating the immediate left recursion is not enough to get a left-recursion-free grammar.
S → Aa | b
A → Sc | d
This grammar is still left-recursive: S ⇒ Aa ⇒ Sca, or A ⇒ Sc ⇒ Aac, causes a left recursion. The left recursion is hidden! We have to eliminate all left recursions from our grammar.
23 Algorithm for Eliminating Left-Recursion
- Arrange the nonterminals in some order: A1 ... An
- for i from 1 to n do {
    for j from 1 to i-1 do {
      replace each production Ai → Aj γ by Ai → α1 γ | ... | αk γ, where Aj → α1 | ... | αk    (expose the hidden left recursion!)
    }
    eliminate immediate left recursions among the Ai productions
  }
24 Example for Eliminating Left-Recursion
S → Aa | b
A → Ac | Sd | f
- Order of nonterminals: S, A
for S:
- we do not enter the inner loop.
- there is no immediate left recursion in S.
for A:
- Replace A → Sd with A → Aad | bd. So, we will have A → Ac | Aad | bd | f
- Eliminate the immediate left recursion in A:
  A → bdA' | fA'
  A' → cA' | adA' | ε
So, the resulting equivalent grammar which is not left-recursive is:
S → Aa | b
A → bdA' | fA'
A' → cA' | adA' | ε
What about another order?
25 Example for Eliminating Left-Recursion (Cont'd)
S → Aa | b
A → Ac | Sd | f
- Order of nonterminals: A, S
for A:
- Eliminate the immediate left recursion in A:
  A → SdA' | fA'
  A' → cA' | ε
for S:
- Replace S → Aa with S → SdA'a | fA'a. So, we will have S → SdA'a | fA'a | b
- Eliminate the immediate left recursion in S:
  S → fA'aS' | bS'
  S' → dA'aS' | ε
So, the resulting equivalent grammar which is not left-recursive is:
S → fA'aS' | bS'
S' → dA'aS' | ε
A → SdA' | fA'
A' → cA' | ε
See the difference?
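The general algorithm and the two orderings above can be reproduced with a short Python sketch. The names `eliminate_immediate` and `eliminate_left_recursion`, the dict-of-tuples grammar representation, and the use of `()` for ε are illustrative assumptions. Like the algorithm on the slides, the sketch assumes the grammar has no cycles and no ε-productions to begin with.

```python
def eliminate_immediate(nt, bodies):
    """A -> A a | b   becomes   A -> b A',  A' -> a A' | epsilon."""
    alphas = [b[1:] for b in bodies if b and b[0] == nt]
    betas = [b for b in bodies if not b or b[0] != nt]
    if not alphas:
        return {nt: bodies}
    new = nt + "'"
    return {nt: [b + (new,) for b in betas],
            new: [a + (new,) for a in alphas] + [()]}


def eliminate_left_recursion(grammar, order):
    """Apply the ordering-based algorithm; `order` lists the nonterminals."""
    g = {nt: list(bodies) for nt, bodies in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for body in g[ai]:
                if body and body[0] == aj:
                    # Ai -> Aj gamma: substitute every alternative of Aj
                    expanded += [delta + body[1:] for delta in g[aj]]
                else:
                    expanded.append(body)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


grammar = {"S": [("A", "a"), ("b",)],
           "A": [("A", "c"), ("S", "d"), ("f",)]}
by_s_first = eliminate_left_recursion(grammar, ["S", "A"])
by_a_first = eliminate_left_recursion(grammar, ["A", "S"])
print(by_s_first)
print(by_a_first)
```

The two orders give different but equivalent grammars, matching the two results worked out on the slides.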
26 Left-Factoring A predictive parser (a top-down parser without backtracking) needs the grammar to be left-factored. Left factoring turns a grammar into a new equivalent grammar suitable for predictive parsing.
stmt → if expr then stmt else stmt | if expr then stmt
When we see if, we cannot know which production rule to choose to rewrite stmt in the derivation.
27 Left-Factoring (Cont'd) In general,
A → αβ1 | αβ2
where α is nonempty and the first symbols of β1 and β2 (if they have one) are different. When processing α, we cannot know whether to expand A to αβ1 or A to αβ2. But if we rewrite the grammar as follows
A → αA'
A' → β1 | β2
we can immediately expand A to αA'.
28 Algorithm for Left-Factoring For each nonterminal A with two or more alternatives (production rules) with a common nonempty prefix, say
A → αβ1 | ... | αβn | γ1 | ... | γm
where α is the longest common prefix, convert it into
A → αA' | γ1 | ... | γm
A' → β1 | ... | βn
29 Left-Factoring Example 1
A → abB | aB | cdg | cdeB | cdfB
⇓
A → aA' | cdg | cdeB | cdfB
A' → bB | B
⇓
A → aA' | cdA''
A' → bB | B
A'' → g | eB | fB
30 Left-Factoring Example 2
A → ad | a | ab | abc | b
⇓
A → aA' | b
A' → d | ε | b | bc
⇓
A → aA' | b
A' → d | ε | bA''
A'' → ε | c
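A minimal Python sketch of left factoring follows. It repeatedly factors out the longest prefix shared by at least two alternatives of a nonterminal, so new nonterminals may be introduced in a different order (and with different primes) than on the slides, although the resulting grammar is equivalent. The names `common_prefix` and `left_factor` and the tuple representation are assumptions made for illustration.

```python
def common_prefix(a, b):
    """Longest common prefix of two bodies (tuples of symbols)."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(grammar):
    """Factor common prefixes until no two alternatives share one."""
    g = {nt: list(bodies) for nt, bodies in grammar.items()}
    changed = True
    while changed:
        changed = False
        for nt in list(g):
            bodies = g[nt]
            best = ()  # longest prefix shared by at least two alternatives
            for i in range(len(bodies)):
                for j in range(i + 1, len(bodies)):
                    p = common_prefix(bodies[i], bodies[j])
                    if len(p) > len(best):
                        best = p
            if not best:
                continue
            new_nt = nt + "'"
            while new_nt in g:  # pick a fresh name: A', A'', ...
                new_nt += "'"
            with_prefix = [b for b in bodies if b[:len(best)] == best]
            others = [b for b in bodies if b[:len(best)] != best]
            g[nt] = [best + (new_nt,)] + others
            g[new_nt] = [b[len(best):] for b in with_prefix]
            changed = True
    return g


# Example 1: A -> abB | aB | cdg | cdeB | cdfB
factored = left_factor({"A": [("a", "b", "B"), ("a", "B"), ("c", "d", "g"),
                              ("c", "d", "e", "B"), ("c", "d", "f", "B")]})
print(factored)
```

Here the prefix cd is factored first (it is the longest), so the sketch names the nonterminals the other way round: A → aA'' | cdA', A' → g | eB | fB, A'' → bB | B.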
31 CFG vs. Regular Expression A grammar is a more powerful notation than regular expressions. Every language described by a regular expression can be described by a grammar:
- For each state i of the FA, create a nonterminal Ai.
- If state i has a transition to state j on input a (including ε), add the production Ai → aAj.
- If i is an accepting state, add Ai → ε.
- If i is the start state, make Ai the start symbol of the grammar.
Example: (a|b)*ab
A0 → bA0 | aA1
A1 → aA1 | bA2
A2 → aA1 | bA0
A2 → ε
32 CFG vs. Regular Expression (Cont'd) A language described by a grammar may not be describable by a regular expression, because regular expressions and finite automata cannot count.
Example: the language L = { a^n b^n | n >= 1 } can be written as the grammar S → aSb | ab, but cannot be expressed by a regular expression.
33 Quiz Given each of the following grammars: a) left factor it; b) is the result suitable for top-down parsing? c) eliminate left recursion from the original grammar; d) is the resulting grammar suitable for top-down parsing?
S → 0S1 | 01
S → SS+ | SS* | a
S → S(S)S | ε
S → (L) | a,  L → L,S | S
34 S → SS+ | SS* | a
1. Left factor
2. No: the grammar is still left recursive
3. Eliminate left recursion
4. Yes
35 CS308 Top-Down Parsing
36 Top-Down Parsing The parse tree is created top to bottom. Top-down parsers:
- Recursive-Descent Parsing: backtracking is needed (if a choice of a production rule does not work, we backtrack to try other alternatives). It is a general parsing technique, but it is not widely used because it is not efficient.
- Predictive Parsing: no backtracking; efficient.
  - Recursive Predictive Parsing is a special form of recursive-descent parsing without backtracking.
  - The Non-Recursive (Table-Driven) Predictive Parser is also known as the LL(1) parser.
37 Recursive-Descent Parsing A recursive-descent parsing program consists of a set of procedures, one for each nonterminal. Backtracking is needed (repeated scans over the input may be needed). It tries to find the leftmost derivation. Execution begins with the procedure for the start symbol, which halts and announces success if its procedure body scans the entire input string.
S → aBc
B → bc | b
input: abc
[Figure: the first attempt expands B → bc under S → aBc, fails on the input abc, and backtracks; the second attempt expands B → b and succeeds]
38 Procedure for stmt The match procedure compares its argument with the lookahead symbol; if they match, it advances to the next input terminal and changes the value of lookahead. In the body of a procedure, each terminal is matched and each nonterminal leads to a call of its procedure. A left-recursive grammar can cause a recursive-descent parser to go into an infinite loop. How can we get this procedure? Let's continue.
39 Predictive Parser
a grammar → (eliminate left recursion) → (left factoring) → a grammar suitable for predictive parsing (an LL(1) grammar); this is not 100% guaranteed.
When rewriting a nonterminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string.
A → α1 | ... | αn
input: ... a ... (a is the current token)
40 Predictive Parser Example
stmt → if ... | while ... | begin ... | for ...
When we are trying to rewrite the nonterminal stmt, we can uniquely choose the production rule by just looking at the current token. If the current token is if, we have to choose the first production rule.
41 Recursive Predictive Parsing Each nonterminal corresponds to a procedure.
Example: A → aBb (the only production rule for A)
proc A {
  - match the current token with a, and move to the next token;
  - call proc B;
  - match the current token with b, and move to the next token;
}
42 Recursive Predictive Parsing (Cont'd)
A → aBb | bAB
proc A {
  case of the current token {
    a: - match the current token with a, and move to the next token;
       - call B;
       - match the current token with b, and move to the next token;
    b: - match the current token with b, and move to the next token;
       - call A;
       - call B;
  }
}
43 Recursive Predictive Parsing (Cont'd) When to apply ε-productions?
A → aA | bB | ε
If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production. Most correct choice: we should apply an ε-production for a nonterminal A when the current token is in the follow set of A (the terminals that can follow A in the sentential forms).
44 Recursive Predictive Parsing Example
A → aBe | cBd | C
B → bB | ε
C → f
proc A {
  case of the current token {
    a: - match the current token with a, and move to the next token;
       - call B;
       - match the current token with e, and move to the next token;
    c: - match the current token with c, and move to the next token;
       - call B;
       - match the current token with d, and move to the next token;
    f: - call C    (f is in the first set of C)
  }
}
proc B {
  case of the current token {
    b: - match the current token with b, and move to the next token;
       - call B
    d, e: do nothing    (d and e form the follow set of B)
  }
}
proc C {
  match the current token with f, and move to the next token;
}
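The three procedures for the grammar A → aBe | cBd | C, B → bB | ε, C → f translate almost line for line into a runnable Python sketch. The class name `Parser`, the `ParseError` exception, the `accepts` helper, and the choice of single-character tokens with `$` as endmarker are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the input cannot be derived from A."""


class Parser:
    def __init__(self, text):
        self.tokens = list(text) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Compare the current token with `expected` and move to the next one."""
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def A(self):
        if self.lookahead == "a":          # A -> a B e
            self.match("a"); self.B(); self.match("e")
        elif self.lookahead == "c":        # A -> c B d
            self.match("c"); self.B(); self.match("d")
        elif self.lookahead == "f":        # A -> C, since f is in FIRST(C)
            self.C()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in A")

    def B(self):
        if self.lookahead == "b":          # B -> b B
            self.match("b"); self.B()
        elif self.lookahead in ("d", "e"): # B -> epsilon on FOLLOW(B)
            return
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in B")

    def C(self):
        self.match("f")                    # C -> f


def accepts(text):
    """True if the whole string can be derived from A."""
    parser = Parser(text)
    try:
        parser.A()
        parser.match("$")
        return True
    except ParseError:
        return False


print(accepts("abbe"), accepts("cd"), accepts("f"), accepts("abd"))
```

No backtracking is needed because each branch is selected by the lookahead token alone.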
45 Compute FIRST & FOLLOW X > FIRST FIRST & FOLLOW set for tokens! 49
46 Non-Recursive Predictive Parsing Non-recursive predictive parsing is a table-driven parsing method. It is a top-down parser. It is also known as the LL(1) parser.
LL(1): the first L means the input is scanned from left to right, the second L means a leftmost derivation, and 1 means one input symbol is used as a lookahead symbol to determine the parser action.
[Figure: the non-recursive predictive parser reads the input buffer, uses a stack and the parsing table, and produces the output]
We need an algorithm to implement the aforementioned procedures. What is a proper data structure?
47 LL(1) Parser
- input buffer: the string of tokens to be parsed, followed by the endmarker $.
- output: a production rule representing a step of the derivation sequence (leftmost derivation) of the string in the input buffer.
- stack: contains grammar symbols. At the bottom of the stack there is a special endmarker $. Initially the stack contains only $ and the starting symbol S ($S). When the stack is emptied (i.e., only $ is left in the stack), the parsing is completed.
- parsing table: a two-dimensional array M[A,a]. Each row is a nonterminal symbol, each column is a terminal symbol or the special symbol $, and each entry holds a production rule.
48 LL(1) Parser: Parser Actions The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action. There are four possible parser actions:
1. If X and a are both $: the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $): the parser pops X from the stack and moves to the next symbol in the input buffer.
3. If X is a nonterminal: the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1 Y2 ... Yk, it pops X from the stack and pushes Yk, Yk-1, ..., Y1 onto the stack.
4. None of the above: error. All empty entries in the parsing table are errors. If X is a terminal symbol different from a, this is also an error case.
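49 LL(1) Parser Example 1
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Parsing table (- marks an empty, i.e. error, entry):
   | id | + | * | ( | ) | $
E | E → TE' | - | - | E → TE' | - | -
E' | - | E' → +TE' | - | - | E' → ε | E' → ε
T | T → FT' | - | - | T → FT' | - | -
T' | - | T' → ε | T' → *FT' | - | T' → ε | T' → ε
F | F → id | - | - | F → (E) | - | -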
50 LL(1) Parser Example 1 (Cont'd)
stack | input | output
$E | id+id$ | E → TE'
$E'T | id+id$ | T → FT'
$E'T'F | id+id$ | F → id
$E'T'id | id+id$ |
$E'T' | +id$ | T' → ε
$E' | +id$ | E' → +TE'
$E'T+ | +id$ |
$E'T | id$ | T → FT'
$E'T'F | id$ | F → id
$E'T'id | id$ |
$E'T' | $ | T' → ε
$E' | $ | E' → ε
$ | $ | accept
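The four parser actions and the trace above can be reproduced with a small table-driven driver. This is a sketch under stated assumptions: the parsing table is hard-coded from Example 1, `()` stands for ε, and the names `TABLE`, `NONTERMINALS`, and `ll1_parse` are hypothetical.

```python
EPS = ()  # assumption: the empty body stands for epsilon

# Parsing table M[A, a] of Example 1; missing keys are error entries.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions used (a leftmost derivation) or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]
    pos = 0
    output = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == "$" and a == "$":        # action 1: accept
            return output
        if top in NONTERMINALS:            # action 3: expand by M[top, a]
            body = TABLE.get((top, a))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {a}]")
            stack.pop()
            stack.extend(reversed(body))   # push Yk ... Y1, so Y1 ends on top
            output.append((top, body))
        elif top == a:                     # action 2: match a terminal
            stack.pop()
            pos += 1
        else:                              # action 4: error
            raise SyntaxError(f"expected {top!r}, found {a!r}")


steps = ll1_parse(["id", "+", "id"])
for head, body in steps:
    print(head, "->", " ".join(body) or "ε")
```

The printed productions appear in the same order as the output column of the trace.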
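51 LL(1) Parser Example 2
S → aBa
B → bB | ε
LL(1) parsing table (- marks an empty entry):
   | a | b | $
S | S → aBa | - | -
B | B → ε | B → bB | -
stack | input | output
$S | abba$ | S → aBa
$aBa | abba$ |
$aB | bba$ | B → bB
$aBb | bba$ |
$aB | ba$ | B → bB
$aBb | ba$ |
$aB | a$ | B → ε
$a | a$ |
$ | $ | accept, successful completion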
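52 LL(1) Parser Example 2 (Cont'd)
Outputs: S → aBa, B → bB, B → bB, B → ε
Derivation (leftmost): S ⇒ aBa ⇒ abBa ⇒ abbBa ⇒ abba
[Figure: parse tree with root S and children a, B, a; B expands to b B, then to b B, then to ε]
Remaining question: how do we derive the parsing table?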
53 Constructing LL(1) Parsing Tables Two functions are used in the construction of LL(1) parsing tables.
FIRST(α) is the set of terminal symbols which occur as first symbols in strings derived from α, where α is any string of grammar symbols. If α derives ε, then ε is also in FIRST(α).
FOLLOW(A) is the set of terminals which occur immediately after (follow) the nonterminal A in the strings derived from the starting symbol:
- a terminal a is in FOLLOW(A) if S ⇒* αAaβ
- the endmarker $ is in FOLLOW(A) if S ⇒* αA
54 Computing FIRST(X)
- If X is a terminal symbol, FIRST(X) = {X}.
- If X is a nonterminal symbol and X → ε is a production rule, ε is in FIRST(X).
- If X is a nonterminal symbol and X → Y1 Y2 ... Yn is a production rule:
  - if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for j = 1, ..., i-1, then a is in FIRST(X);
  - if ε is in all FIRST(Yj) for j = 1, ..., n, then ε is in FIRST(X).
- If X is ε, FIRST(X) = {ε}.
We apply these rules until nothing more can be added to any FIRST set.
55 FIRST Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FIRST(F) = { (, id }      FIRST(TE') = { (, id }
FIRST(T') = { *, ε }      FIRST(+TE') = { + }
FIRST(T) = { (, id }      FIRST(ε) = { ε }
FIRST(E') = { +, ε }      FIRST(FT') = { (, id }
FIRST(E) = { (, id }      FIRST(*FT') = { * }
                          FIRST((E)) = { ( }
                          FIRST(id) = { id }
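The FIRST rules amount to a fixed-point iteration, which the following Python sketch makes explicit for the expression grammar. The names `GRAMMAR`, `first_of_string`, and `compute_first`, and the use of the string "ε" as a marker inside the sets, are illustrative assumptions.

```python
EPS = "ε"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],   # () is the epsilon-production
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}


def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the nonterminals."""
    result = set()
    for sym in symbols:
        # A symbol without an entry in `first` is a terminal: FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive ε, or the string is empty
    return result


def compute_first(grammar):
    """Apply the rules until nothing more can be added to any FIRST set."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for nt in GRAMMAR:
    print(f"FIRST({nt}) =", sorted(FIRST[nt]))
```

The computed sets agree with the FIRST sets of the nonterminals listed on the slide.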
56 Computing FOLLOW(X)
- If S is the start symbol, $ is in FOLLOW(S).
- If A → αBβ is a production rule, everything in FIRST(β) except ε is in FOLLOW(B).
- If (A → αB is a production rule) or (A → αBβ is a production rule and ε is in FIRST(β)), everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any FOLLOW set.
57 FOLLOW Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FOLLOW(E) = { $, ) }
FOLLOW(E') = { $, ) }
FOLLOW(T) = { +, ), $ }       (uses FIRST(E') = { +, ε })
FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }    (uses FIRST(T') = { *, ε })
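The FOLLOW rules can be sketched the same way. To keep the example short, the FIRST sets of the nonterminals are hard-coded from the FIRST example rather than recomputed; the names `compute_follow` and `first_of_string` are assumptions made here.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
# FIRST sets of the nonterminals, copied from the FIRST example.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}


def first_of_string(symbols):
    """FIRST of a string of symbols; terminals have FIRST(a) = {a}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                     # rule 1
    changed = True
    while changed:
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                for i, b in enumerate(body):
                    if b not in grammar:       # only nonterminals have FOLLOW
                        continue
                    beta_first = first_of_string(body[i + 1:])
                    new = beta_first - {EPS}   # rule 2
                    if EPS in beta_first:      # rule 3 (also covers empty beta)
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for nt in GRAMMAR:
    print(f"FOLLOW({nt}) =", sorted(FOLLOW[nt]))
```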
58 Constructing LL(1) Parsing Table For each production A → α of grammar G:
- for each terminal a in FIRST(α), add A → α to M[A,a];
- if ε is in FIRST(α), for each terminal a in FOLLOW(A), add A → α to M[A,a];
- if ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$].
All other undefined entries of the parsing table are error entries.
59 Constructing LL(1) Parsing Table Example
E → TE': FIRST(TE') = { (, id }, so E → TE' goes into M[E,(] and M[E,id]
E' → +TE': FIRST(+TE') = { + }, so E' → +TE' goes into M[E',+]
E' → ε: FIRST(ε) = { ε }, so no entry from FIRST; but since ε is in FIRST(ε) and FOLLOW(E') = { $, ) }, E' → ε goes into M[E',$] and M[E',)]
T → FT': FIRST(FT') = { (, id }, so T → FT' goes into M[T,(] and M[T,id]
T' → *FT': FIRST(*FT') = { * }, so T' → *FT' goes into M[T',*]
T' → ε: FIRST(ε) = { ε }, so no entry from FIRST; but since ε is in FIRST(ε) and FOLLOW(T') = { $, ), + }, T' → ε goes into M[T',$], M[T',)] and M[T',+]
F → (E): FIRST((E)) = { ( }, so F → (E) goes into M[F,(]
F → id: FIRST(id) = { id }, so F → id goes into M[F,id]
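The table construction can be sketched in Python as follows, with the FIRST and FOLLOW sets hard-coded from the earlier examples. Each table entry is kept as a list so that multiply defined entries would become visible; `build_table` and the other names are hypothetical. Because `$` is stored inside the FOLLOW sets, the separate rule for `$` needs no extra code in this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}


def first_of_string(symbols):
    """FIRST of a string of symbols; terminals have FIRST(a) = {a}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Map (nonterminal, terminal) to the list of applicable bodies."""
    table = {}
    for a, bodies in grammar.items():
        for body in bodies:
            body_first = first_of_string(body)
            lookaheads = body_first - {EPS}
            if EPS in body_first:
                lookaheads |= FOLLOW[a]
            for terminal in lookaheads:
                table.setdefault((a, terminal), []).append(body)
    return table


TABLE = build_table(GRAMMAR)
is_ll1 = all(len(entries) == 1 for entries in TABLE.values())
print(len(TABLE), "entries; LL(1):", is_ll1)
```

A grammar is LL(1) exactly when no entry of this table holds more than one production.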
60 LL(1) Grammars A grammar whose parsing table has no multiply defined entries is said to be an LL(1) grammar. An entry in the parsing table of a grammar may contain more than one production rule. In this case, we say that it is not an LL(1) grammar.
a grammar → (eliminate left recursion) → (left factoring) → an LL(1) grammar (no 100% guarantee)
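61 A Grammar which is not LL(1)
S → iCtSE | a      FOLLOW(S) = { $, e }
E → eS | ε         FOLLOW(E) = { $, e }
C → b              FOLLOW(C) = { t }
FIRST(iCtSE) = { i }, FIRST(a) = { a }, FIRST(eS) = { e }, FIRST(ε) = { ε }, FIRST(b) = { b }
Parsing table (- marks an empty entry):
   | a | b | e | i | t | $
S | S → a | - | - | S → iCtSE | - | -
E | - | - | E → eS, E → ε | - | - | E → ε
C | - | C → b | - | - | - | -
Problem: ambiguity. There are two production rules for M[E,e].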
62 A Grammar which is not LL(1) (Cont'd) What can we do if the resulting parsing table contains multiply defined entries? Eliminate the left recursion and left factor the grammar. If the parsing table still contains multiply defined entries, the grammar is ambiguous or it is inherently not an LL(1) grammar.
- A left-recursive grammar cannot be an LL(1) grammar. For A → Aα | β, any terminal that appears in FIRST(β) also appears in FIRST(Aα), because Aα ⇒ βα. If β is ε, any terminal that appears in FIRST(α) also appears in FIRST(Aα) and FOLLOW(A).
- A grammar that is not left factored cannot be an LL(1) grammar. For A → αβ1 | αβ2, any terminal that appears in FIRST(αβ1) also appears in FIRST(αβ2).
- An ambiguous grammar cannot be an LL(1) grammar.
63 Properties of LL(1) Grammars A grammar G is LL(1) if and only if the following conditions hold for any two distinct production rules A → α and A → β:
1. α and β do not both derive strings starting with the same terminal.
2. At most one of α and β can derive ε.
3. If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).
64 Quiz For the grammar S → S+S | SS | (S) | S* | a, devise predictive parsers and show the parsing tables. You may left-factor and/or eliminate left recursion from your grammars.
65 S → S+S | SS | (S) | S* | a: left factoring; eliminate left recursion
66 S → S+S | SS | (S) | S* | a: revised productions; FIRST and FOLLOW
67 S → S+S | SS | (S) | S* | a: parsing table
68 CS308 Bottom-Up Parsing
69 Bottom-Up Parsing A bottom-up parser creates the parse tree of the given input starting from the leaves towards the root. A bottom-up parser tries to find the rightmost derivation of the given input in reverse order: S ⇒ ... ⇒ ω (the rightmost derivation of ω), which the bottom-up parser finds from ω back to S. Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce. At each shift action, the current symbol in the input string is pushed onto a stack. At each reduction step, the symbols at the top of the stack (this symbol sequence is the right side of a production) are replaced by the nonterminal at the left side of that production.
70 Shift-Reduce Parsing A shift-reduce parser tries to reduce the given input string to the starting symbol: a string → (reduced to) → the starting symbol. At each reduction step, a substring of the input matching the right side of a production rule is replaced by the nonterminal at the left side of that production rule. If the substring is chosen correctly, the rightmost derivation of that string is created in reverse order.
Rightmost derivation: S ⇒*rm ω
Shift-reduce parser finds: ω ⇐rm ... ⇐rm S
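71 Shift-Reduce Parsing Example
S → aABb
A → aA | a
B → bB | b
input string: aaabb
Reductions: aaabb, then aaAbb, then aAbb, then aABb, then S
Read in reverse, this is the rightmost derivation S ⇒rm aABb ⇒rm aAbb ⇒rm aaAbb ⇒rm aaabb; all of these strings are right-sentential forms.
How do we know which substring to replace at each reduction step?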
72 Handle In the following reduction, a handle of γ = αβω is the body β of production A → β in the position following α:
S ⇒*rm αAω ⇒rm αβω    (ω is a string of terminals)
A handle is a substring that matches the right side of a production rule. But not every substring that matches the right side of a production rule is a handle: only one whose reduction moves towards the start symbol along the reverse of a rightmost derivation. If the grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.
73 S ab ba A a as baa B abb bs b Handle Example What is the handle of aabbab? S ab aabb aabb aabsb aabbab Handle is ba 83
74 Handle Pruning A rightmost derivation in reverse can be obtained by handle pruning.
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω (the input string)
From γn, find a handle An → βn in γn, and replace βn by An to get γn-1. Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2. Repeat this until we reach S.
75 Handle Pruning Example
E → E+T | T
T → T*F | F
F → (E) | id
Rightmost derivation of id+id*id:
E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id
Right-sentential form | Reducing production
id+id*id | F → id
F+id*id | T → F
T+id*id | E → T
E+id*id | F → id
E+F*id | T → F
E+T*id | F → id
E+T*F | T → T*F
E+T | E → E+T
E |
(On the original slide the handles are shown in red and underlined in the right-sentential forms.)
76 Shift-Reduce Parsing The initial stack contains only the endmarker $. The end of the input string is also marked by the endmarker $. There are four possible actions in a shift-reduce parser:
- Shift: the next input symbol is shifted onto the top of the stack.
- Reduce: replace the handle on the top of the stack by the nonterminal.
- Accept: successful completion of parsing.
- Error: the parser discovers a syntax error and calls an error recovery routine.
77 Shift-Reduce Parsing Example
E → E+T | T
T → T*F | F
F → (E) | id
Stack | Input | Action
$ | id+id*id$ | shift
$id | +id*id$ | reduce by F → id
$F | +id*id$ | reduce by T → F
$T | +id*id$ | reduce by E → T
$E | +id*id$ | shift
$E+ | id*id$ | shift
$E+id | *id$ | reduce by F → id
$E+F | *id$ | reduce by T → F
$E+T | *id$ | shift
$E+T* | id$ | shift
$E+T*id | $ | reduce by F → id
$E+T*F | $ | reduce by T → T*F
$E+T | $ | reduce by E → E+T
$E | $ | accept
[Figure: parse tree of id+id*id with its nonterminal nodes numbered in the order they are created by reductions: F1, T2, E3, F4, T5, F6, T7, E8]
78 Try It on Your Own Grammar: S → SS+ | SS* | a. Right-sentential form: aaa*a++. Give the bottom-up parse.
79 Conflicts During Shift-Reduce Parsing There are context-free grammars for which shift-reduce parsers cannot be used. The stack contents and the next input symbol may not be enough to decide the action:
- shift/reduce conflict: the parser cannot decide whether to shift or to reduce.
- reduce/reduce conflict: the parser cannot decide which of several reductions to make.
If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar. An ambiguous grammar can never be an LR grammar.
80 Shift-Reduce Parsers There are two main categories of shift-reduce parsers:
1. Operator-Precedence Parser: simple, but handles only a small class of grammars.
2. LR-Parsers: cover a wide range of grammars.
   - SLR: simple LR parser
   - LR: most general LR parser
   - LALR: intermediate LR parser (lookahead LR parser)
   SLR, LR and LALR parsers work the same way; only their parsing tables are different.
[Figure: nested grammar classes, SLR inside LALR inside LR inside CFG]
81 LR Parsers The most powerful (yet efficient) shift-reduce parsing is LR(k) parsing: L stands for left-to-right scanning, R for rightmost derivation, and k for the number of lookahead symbols (when k is omitted, it is 1).
LR parsing's advantages:
- LR parsing is the most general non-backtracking shift-reduce parsing, yet it is still efficient.
- The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers: LL(1) grammars ⊂ LR(1) grammars.
- An LR parser can detect a syntactic error in a left-to-right scan of the input.
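82 Model of LR Parser
[Figure: the LR parsing algorithm reads the input a1 ... ai ... an $, keeps a stack S0 X1 S1 ... Xm-1 Sm-1 Xm Sm of states and grammar symbols, and produces the output. It consults two tables indexed by state: the action table (columns are terminals and $, each entry is one of four different actions) and the goto table (columns are nonterminals, each entry is a state number).]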
83 A Configuration of LR Parsing Algorithm A configuration of an LR parser is:
( S_0 X_1 S_1 ... X_m S_m, a_i a_i+1 ... a_n $ )
where the first component is the stack and the second is the rest of the input. S_m and a_i decide the parser action by consulting the parsing action table. (The initial stack contains just S_0.) A configuration of an LR parser represents the right-sentential form:
X_1 ... X_m a_i a_i+1 ... a_n $
84 Actions of an LR-Parser
1. shift s: shifts the next input symbol and the state s onto the stack
   ( S_0 X_1 S_1 ... X_m S_m, a_i a_i+1 ... a_n $ ) → ( S_0 X_1 S_1 ... X_m S_m a_i s, a_i+1 ... a_n $ )
2. reduce A → β: pop 2|β| items (r = |β|) from the stack; then push A and s, where s = goto[S_m-r, A]
   ( S_0 X_1 S_1 ... X_m S_m, a_i a_i+1 ... a_n $ ) → ( S_0 X_1 S_1 ... X_m-r S_m-r A s, a_i ... a_n $ )
   The output is the reducing production A → β.
3. accept: parsing successfully completed
4. error: the parser detected an error (an empty entry in the action table)
85 Reduce Action Pop 2|β| items (r = |β|) from the stack; assume that β = Y_1 Y_2 ... Y_r. Then push A and s, where s = goto[S_m-r, A].
( S_0 X_1 S_1 ... X_m-r S_m-r Y_1 S_m-r+1 ... Y_r S_m, a_i a_i+1 ... a_n $ ) → ( S_0 X_1 S_1 ... X_m-r S_m-r A s, a_i ... a_n $ )
In fact, Y_1 Y_2 ... Y_r is a handle:
X_1 ... X_m-r A a_i ... a_n $ ⇒ X_1 ... X_m-r Y_1 ... Y_r a_i a_i+1 ... a_n $
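86 (SLR) Parsing Table
1) E → E+T   2) E → T   3) T → T*F   4) T → F   5) F → (E)   6) F → id
Action table (columns id to $) and goto table (columns E, T, F); - marks an empty entry:
state | id | + | * | ( | ) | $ | E | T | F
0 | s5 | - | - | s4 | - | - | 1 | 2 | 3
1 | - | s6 | - | - | - | acc | - | - | -
2 | - | r2 | s7 | - | r2 | r2 | - | - | -
3 | - | r4 | r4 | - | r4 | r4 | - | - | -
4 | s5 | - | - | s4 | - | - | 8 | 2 | 3
5 | - | r6 | r6 | - | r6 | r6 | - | - | -
6 | s5 | - | - | s4 | - | - | - | 9 | 3
7 | s5 | - | - | s4 | - | - | - | - | 10
8 | - | s6 | - | - | s11 | - | - | - | -
9 | - | r1 | s7 | - | r1 | r1 | - | - | -
10 | - | r3 | r3 | - | r3 | r3 | - | - | -
11 | - | r5 | r5 | - | r5 | r5 | - | - | -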
87 Moves of an LR-Parser Example
stack | input | action | output
0 | id*id+id$ | shift 5 |
0id5 | *id+id$ | reduce by F → id | F → id
0F3 | *id+id$ | reduce by T → F | T → F
0T2 | *id+id$ | shift 7 |
0T2*7 | id+id$ | shift 5 |
0T2*7id5 | +id$ | reduce by F → id | F → id
0T2*7F10 | +id$ | reduce by T → T*F | T → T*F
0T2 | +id$ | reduce by E → T | E → T
0E1 | +id$ | shift 6 |
0E1+6 | id$ | shift 5 |
0E1+6id5 | $ | reduce by F → id | F → id
0E1+6F3 | $ | reduce by T → F | T → F
0E1+6T9 | $ | reduce by E → E+T | E → E+T
0E1 | $ | accept |
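The moves above can be replayed with a compact LR driver. This is a sketch under stated assumptions: the ACTION and GOTO tables are typed in by hand for the expression grammar, the stack holds only state numbers (the grammar symbols that the slides also keep on the stack are redundant for the algorithm), and the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `lr_parse` are hypothetical.

```python
PRODUCTIONS = {
    1: ("E", ("E", "+", "T")), 2: ("E", ("T",)),
    3: ("T", ("T", "*", "F")), 4: ("T", ("F",)),
    5: ("F", ("(", "E", ")")), 6: ("F", ("id",)),
}


def reduce_row(n):
    """Row of a state that reduces by production n on +, *, ) and $."""
    return {t: f"r{n}" for t in ("+", "*", ")", "$")}


ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: reduce_row(4),
    4: {"id": "s5", "(": "s4"},
    5: reduce_row(6),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: reduce_row(3),
    11: reduce_row(5),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def lr_parse(tokens):
    """Return the reductions performed (a rightmost derivation in reverse)."""
    tokens = list(tokens) + ["$"]
    stack = [0]                      # stack of states only
    pos = 0
    output = []
    while True:
        state, a = stack[-1], tokens[pos]
        act = ACTION[state].get(a)
        if act is None:
            raise SyntaxError(f"unexpected {a!r} in state {state}")
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the new state, advance input
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce by A -> beta: pop |beta| states
            head, body = PRODUCTIONS[int(act[1:])]
            del stack[len(stack) - len(body):]
            stack.append(GOTO[stack[-1]][head])
            output.append((head, body))


reductions = lr_parse(["id", "*", "id", "+", "id"])
for head, body in reductions:
    print(head, "->", " ".join(body))
```

The printed reductions appear in the same order as the output column of the trace.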
88 Constructing SLR Parsing Tables: LR(0) Item An LR(0) item of a grammar G is a production of G with a dot at some position of the body.
Ex: A → aBb has four possible LR(0) items:
A → .aBb
A → a.Bb
A → aB.b
A → aBb.    (reduction: the dot has moved to the right end)
A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers (the LR(0) automaton). The sets of LR(0) items in the collection will be the states.
Augmented grammar: G' is G with a new production rule S' → S, where S' is the new starting symbol.
Two functions are needed: CLOSURE and GOTO.
89 The Closure Operation If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G, then B → .γ will be in closure(I). Apply this rule until no more new LR(0) items can be added to closure(I).
90 The Closure Operation Example
E' → E
E → E+T | T
T → T*F | F
F → (E) | id
closure({E' → .E}) = {
  E' → .E      (kernel item)
  E → .E+T
  E → .T
  T → .T*F
  T → .F
  F → .(E)
  F → .id
}
Kernel items: the initial item, S' → .S, and all items whose dots are not at the left end.
Nonkernel items: all items with their dots at the left end, except for S' → .S.
91 Goto Operation If I is a set of LR(0) items and X is a grammar symbol (terminal or nonterminal), then goto(I,X) is defined as follows: if A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X).
Example:
I = { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
goto(I,E) = { E' → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
goto(I,id) = { F → id. }
92 Construction of the Canonical LR(0) Collection To create the SLR parsing tables for a grammar G, we will create the canonical LR(0) collection of the augmented grammar G'.
Algorithm:
C is { closure({S' → .S}) }
repeat the following until no more sets of LR(0) items can be added to C:
  for each I in C and each grammar symbol X
    if goto(I,X) is not empty and not in C
      add goto(I,X) to C
The goto function is a DFA on the sets in C.
93 The Canonical LR(0) Collection Example
I0: E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id
I1: E' → E., E → E.+T
I2: E → T., T → T.*F
I3: T → F.
I4: F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id
I5: F → id.
I6: E → E+.T, T → .T*F, T → .F, F → .(E), F → .id
I7: T → T*.F, F → .(E), F → .id
I8: F → (E.), E → E.+T
I9: E → E+T., T → T.*F
I10: T → T*F.
I11: F → (E).
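The closure and goto operations, and the loop that builds the canonical collection, can be sketched in Python. An item is represented here as a triple (head, body, dot position); this representation and the names `closure`, `goto`, and `canonical_collection` are illustrative assumptions. The states come out in discovery order, which need not match the numbering I0 to I11 used on the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),            # augmented production E' -> E
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every nonterminal B that appears right after a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, prod_body in GRAMMAR:
                item = (head, prod_body, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    start = closure({("E'", ("E",), 0)})
    states = [start]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:                 # the list grows while we iterate
        for x in symbols:
            target = goto(state, x)
            if target and target not in states:
                states.append(target)
    return states


states = canonical_collection()
print(len(states), "sets of LR(0) items")
```

For the expression grammar the sketch finds the same twelve item sets as the slide.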
94 Transition Diagram (DFA) of Goto Function
[Figure: the DFA of the goto function, with the following transitions]
I0: E → I1, T → I2, F → I3, ( → I4, id → I5
I1: + → I6
I2: * → I7
I4: E → I8, T → I2, F → I3, ( → I4, id → I5
I6: T → I9, F → I3, ( → I4, id → I5
I7: F → I10, ( → I4, id → I5
I8: ) → I11, + → I6
I9: * → I7
95 Constructing SLR Parsing Table
1. Construct the canonical collection of sets of LR(0) items for G': C = { I0, ..., In }
2. Create the parsing action table as follows:
   - If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
   - If A → α. is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S'.
   - If S' → S. is in Ii, then action[i,$] is accept.
   - If any conflicting actions are generated by these rules, the grammar is not SLR.
3. Create the parsing goto table: for all nonterminals A, if goto(Ii,A) = Ij then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S' → .S
96 Parsing Tables of Expression Grammar
1) E → E+T   2) E → T   3) T → T*F   4) T → F   5) F → (E)   6) F → id

        Action Table                        Goto Table
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
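To see how an LR parser uses these tables, here is a small table-driven driver. It is a sketch, not part of the original slides: the `ACTION` and `GOTO` dictionaries are typed in by hand from the SLR table of the expression grammar (entries the transcription dropped are filled in from the canonical LR(0) collection, which is an assumption of this example), and the names `PRODUCTIONS` and `parse` are hypothetical.

```python
# Productions numbered as on the slide: number -> (head, length of body).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Return the production numbers used, i.e. a rightmost derivation in reverse."""
    tokens = list(tokens) + ["$"]
    stack = [0]                     # stack of states; state 0 is the initial state
    pos = 0
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:             # empty table entry = syntax error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return reductions
        if act[0] == "s":           # shift: push the new state, consume the token
            stack.append(int(act[1:]))
            pos += 1
        else:                       # reduce: pop |body| states, then take the goto
            n = int(act[1:])
            head, length = PRODUCTIONS[n]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(n)


print(parse(["id", "*", "id", "+", "id"]))   # [6, 4, 6, 3, 2, 6, 4, 1]
```

The driver keeps only states on the stack, since each state already determines the grammar symbol that led to it.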
97 SLR(1) Grammar
An LR parser using SLR(1) parsing tables for a grammar G is called an SLR(1) parser for G. If a grammar G has an SLR(1) parsing table without conflicting entries, it is called an SLR(1) grammar (SLR grammar for short). Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.
98 Shift/Reduce and Reduce/Reduce Conflicts
If a state cannot decide whether to shift or to reduce on a terminal, we say that there is a shift/reduce conflict. If a state cannot decide whether to reduce by production rule i or by production rule j on a terminal, we say that there is a reduce/reduce conflict. If the SLR parsing table of a grammar G has a conflict, we say that the grammar is not an SLR grammar.
99 Conflict Example 1
Grammar: S → L=R | R, L → *R | id, R → L
I0: S' → .S, S → .L=R, S → .R, L → .*R, L → .id, R → .L
I1: S' → S.
I2: S → L.=R, R → L.
I3: S → R.
I4: L → *.R, R → .L, L → .*R, L → .id
I5: L → id.
I6: S → L=.R, R → .L, L → .*R, L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.
Problem: FOLLOW(R) = {=,$}, so in I2 the input symbol = calls for both shift 6 (from S → L.=R) and reduce by R → L (from R → L.): a shift/reduce conflict.
100 Conflict Example 2
Grammar: S → AaAb | BbBa, A → ε, B → ε
I0: S' → .S, S → .AaAb, S → .BbBa, A → ., B → .
Problem: FOLLOW(A) = {a,b} and FOLLOW(B) = {a,b}, so in I0:
on a: reduce by A → ε and reduce by B → ε (reduce/reduce conflict)
on b: reduce by A → ε and reduce by B → ε (reduce/reduce conflict)
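Both conflicts come from the FOLLOW sets, which can be computed with a standard fixed-point iteration. The sketch below is one possible way to do it; the function `first_follow`, the grammar encoding and the use of the empty string `""` to stand for ε are assumptions of this example, not notation from the slides.

```python
def first_follow(grammar, start):
    """Compute FIRST and FOLLOW for a grammar given as (head, body) pairs."""
    nonterminals = {head for head, _ in grammar}
    first = {A: set() for A in nonterminals}     # "" stands for the empty string
    follow = {A: set() for A in nonterminals}
    follow[start].add("$")

    def first_of(symbols):
        out = set()
        for X in symbols:
            fx = first[X] if X in nonterminals else {X}
            out |= fx - {""}
            if "" not in fx:
                return out
        return out | {""}                        # every symbol can vanish

    changed = True
    while changed:                               # iterate until nothing grows
        changed = False
        for head, body in grammar:
            new_first = first_of(body)
            if not new_first <= first[head]:
                first[head] |= new_first
                changed = True
            for i, X in enumerate(body):
                if X in nonterminals:
                    rest = first_of(body[i + 1:])
                    new = (rest - {""}) | (follow[head] if "" in rest else set())
                    if not new <= follow[X]:
                        follow[X] |= new
                        changed = True
    return first, follow


EXAMPLE_1 = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
             ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
EXAMPLE_2 = [("S'", ("S",)), ("S", ("A", "a", "A", "b")),
             ("S", ("B", "b", "B", "a")), ("A", ()), ("B", ())]

_, follow1 = first_follow(EXAMPLE_1, "S'")
_, follow2 = first_follow(EXAMPLE_2, "S'")
print(sorted(follow1["R"]))                        # ['$', '=']
print(sorted(follow2["A"]), sorted(follow2["B"]))  # ['a', 'b'] ['a', 'b']
```

Because = is in FOLLOW(R), the SLR rule places a reduce by R → L next to the shift in state I2 of Example 1; because FOLLOW(A) and FOLLOW(B) coincide, both ε-reductions land in the same entries of state I0 in Example 2.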
101 Constructing Canonical LR(1) Items
In the SLR method, state i makes a reduction by A → α when the current token is a if A → α. is in I_i and a is in FOLLOW(A). In some situations, however, βA cannot be followed by the terminal a in any right-sentential form when βα and state i are on top of the stack. Making the reduction in that case is not correct. Consider Conflict Example 1: in state I2 with L on the stack and = as the next token, reducing by R → L would put R in front of =, but no right-sentential form of that grammar begins with R=.
102 LR(1) Item
To avoid some of these invalid reductions, the states need to carry more information. The extra information is put into a state by including a terminal symbol as a second component of an item. An LR(1) item is A → α.β,a where a is the lookahead of the LR(1) item (a is a terminal or the end marker $).
103 LR(1) Item (cont'd)
When β in the LR(1) item A → α.β,a is not empty, the lookahead has no effect. When β is empty (A → α.,a), we reduce by A → α only if the next input symbol is a, not for every terminal in FOLLOW(A). A state will contain A → α.,a1 ... A → α.,an where {a1,...,an} ⊆ FOLLOW(A).
104 A Short Notation
A set of LR(1) items containing the items A → α.β,a1 ... A → α.β,an can be written as A → α.β,a1/a2/.../an
105 Canonical Collection of Sets of LR(1) Items
The construction of the canonical collection of sets of LR(1) items is similar to that of the sets of LR(0) items, except that the closure and goto operations work slightly differently.
closure(I), where I is a set of LR(1) items, is defined as follows:
every LR(1) item in I is in closure(I);
if A → α.Bβ,a is in closure(I) and B → γ is a production rule of G, then B → .γ,b is in closure(I) for each terminal b in FIRST(βa).
106 goto operation
If I is a set of LR(1) items and X is a grammar symbol (terminal or nonterminal), then goto(I,X) is defined as follows: if A → α.Xβ,a is in I, then every item in closure({A → αX.β,a}) is in goto(I,X).
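The LR(1) versions of closure and goto differ from the LR(0) ones only in how lookaheads are propagated. The following sketch applies them to the grammar of Conflict Example 1; the item representation `(head, body, dot, lookahead)` and all helper names are assumptions made for illustration.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
           ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
NONTERMINALS = {h for h, _ in GRAMMAR}


def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols; "" marks that it can be empty."""
    out = set()
    for X in symbols:
        fx = first[X] if X in NONTERMINALS else {X}
        out |= fx - {""}
        if "" not in fx:
            return out
    return out | {""}


def compute_first():
    first = {A: set() for A in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            new = first_of_string(body, first)
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


FIRST = compute_first()


def closure(items):
    result, work = set(items), list(items)
    while work:
        head, body, dot, a = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            # Lookaheads of the new items: FIRST(beta a), as in the definition.
            lookaheads = first_of_string(body[dot + 1:] + (a,), FIRST)
            for h, b in GRAMMAR:
                if h == body[dot]:
                    for la in lookaheads:
                        if (h, b, 0, la) not in result:
                            result.add((h, b, 0, la))
                            work.append((h, b, 0, la))
    return frozenset(result)


def goto(items, X):
    moved = {(h, b, d + 1, a) for h, b, d, a in items
             if d < len(b) and b[d] == X}
    return closure(moved) if moved else frozenset()


I0 = closure({("S'", ("S",), 0, "$")})
I2 = goto(I0, "L")
print(len(I0))                       # 8 items once the $/= pairs are expanded
print(sorted(I2))
```

In `I2` the completed item R → L. carries only the lookahead $, so the reduction is no longer proposed on =, and the SLR shift/reduce conflict of that state does not arise.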
107 Construction of the Canonical LR(1) Collection
Algorithm:
C is { closure({S' → .S,$}) }
repeat the following until no more sets of LR(1) items can be added to C:
    for each I in C and each grammar symbol X
        if goto(I,X) is not empty and not in C
            add goto(I,X) to C
The goto function defines a DFA on the sets in C.
108 Canonical LR(1) Collection Example 1
Grammar: S' → S, 1) S → L=R, 2) S → R, 3) L → *R, 4) L → id, 5) R → L
I0: S' → .S,$ ; S → .L=R,$ ; S → .R,$ ; L → .*R,$/= ; L → .id,$/= ; R → .L,$
I1: S' → S.,$
I2: S → L.=R,$ ; R → L.,$
I3: S → R.,$
I4: L → *.R,$/= ; R → .L,$/= ; L → .*R,$/= ; L → .id,$/=
I5: L → id.,$/=
I6: S → L=.R,$ ; R → .L,$ ; L → .*R,$ ; L → .id,$
I7: L → *R.,$/=
I8: R → L.,$/=
I9: S → L=R.,$
I10: R → L.,$
I11: L → *.R,$ ; R → .L,$ ; L → .*R,$ ; L → .id,$
I12: L → id.,$
I13: L → *R.,$
Transitions:
from I0: S to I1, L to I2, R to I3, * to I4, id to I5
from I2: = to I6
from I4: R to I7, L to I8, * to I4, id to I5
from I6: R to I9, L to I10, * to I11, id to I12
from I11: R to I13, L to I10, * to I11, id to I12
Sets with the same core: I4 and I11, I5 and I12, I7 and I13, I8 and I10
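109 Canonical LR(1) Collection Example 2
Grammar: S' → S, S → AaAb | BbBa, A → ε, B → ε
I0: S' → .S,$ ; S → .AaAb,$ ; S → .BbBa,$ ; A → .,a ; B → .,b
I1: S' → S.,$
I2: S → A.aAb,$
I3: S → B.bBa,$
I4: S → Aa.Ab,$ ; A → .,b
I5: S → Bb.Ba,$ ; B → .,a
I6: S → AaA.b,$
I7: S → BbB.a,$
I8: S → AaAb.,$
I9: S → BbBa.,$
Transitions:
from I0: S to I1, A to I2, B to I3
from I2: a to I4
from I3: b to I5
from I4: A to I6
from I5: B to I7
from I6: b to I8
from I7: a to I9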
110 Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G: C = {I0,...,In}.
2. Create the parsing action table as follows:
If a is a terminal, A → α.aβ,b is in I_i and goto(I_i,a) = I_j, then action[i,a] is shift j.
If A → α.,a is in I_i, then action[i,a] is reduce A → α, where A ≠ S'.
If S' → S.,$ is in I_i, then action[i,$] is accept.
If these rules generate any conflicting actions, the grammar is not LR(1).
3. Create the parsing goto table: for all nonterminals A, if goto(I_i,A) = I_j then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S' → .S,$.
111 LR(1) Parsing Tables for Example 1
state   id    *     =     $     S    L    R
0       s5    s4                1    2    3
1                         acc
2                   s6    r5
3                         r2
4       s5    s4                     8    7
5                   r4    r4
6       s12   s11                    10   9
7                   r3    r3
8                   r5    r5
9                         r1
10                        r5
11      s12   s11                    10   13
12                        r4
13                        r3
There is no shift/reduce or reduce/reduce conflict, so the grammar is an LR(1) grammar.
112 LALR Parsing Tables
LALR stands for LookAhead LR. LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables. The number of states in the SLR and LALR parsing tables for a grammar G is the same, but LALR parsers recognize more grammars than SLR parsers. A state of an LALR parser is a set of LR(1) items with modifications. Yacc creates an LALR parser for the given grammar.
113 The Core of a Set of LR(1) Items
The core of a set of LR(1) items is the set of its first components.
Example: the set { S → L.=R,$ ; R → L.,$ } has the core { S → L.=R ; R → L. }.
Find the states (sets of LR(1) items) in a canonical LR(1) parser that have the same core, and merge them into a single state.
Example: I1: L → id.,= and I2: L → id.,$ are merged into a new state I12: L → id.,=/$
Do this for all states of a canonical LR(1) parser to get the states of the LALR parser.
114 Creating LALR Parsing Tables
Canonical LR(1) parser → (shrink the number of states) → LALR parser
This shrinking process may introduce a reduce/reduce conflict in the resulting LALR parser (in which case the grammar is not LALR). It cannot, however, introduce a shift/reduce conflict.
115 Shift/Reduce Conflict
We cannot introduce a shift/reduce conflict during the shrinking process that creates the states of an LALR parser. Assume that we could. Then some state of the LALR parser must contain A → α.,a and B → β.aγ,b. Since merged sets share the same core, the state of the canonical LR(1) parser that contributed A → α.,a must also contain B → β.aγ,c for some lookahead c. But that state already has a shift/reduce conflict on a, i.e. the original canonical LR(1) parser has a conflict. Contradiction!
116 Reduce/Reduce Conflict
We may, however, introduce a reduce/reduce conflict during the shrinking process that creates the states of an LALR parser.
I1: A → α.,a ; B → β.,b
I2: A → α.,b ; B → β.,c
Merged: I12: A → α.,a/b ; B → β.,b/c, which has a reduce/reduce conflict on b.
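The effect of merging on lookaheads can be checked mechanically. In this small sketch a completed item is just a pair of a production label and a lookahead; the labels and the helper `reduce_conflicts` are hypothetical names chosen for the example.

```python
# Completed LR(1) items as (production, lookahead); every dot is at the end.
I1 = {("A -> alpha .", "a"), ("B -> beta .", "b")}
I2 = {("A -> alpha .", "b"), ("B -> beta .", "c")}


def reduce_conflicts(state):
    """Map each lookahead to its productions, keeping only the ambiguous ones."""
    by_lookahead = {}
    for production, lookahead in state:
        by_lookahead.setdefault(lookahead, set()).add(production)
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}


I12 = I1 | I2                      # same core, so the LALR construction merges them
print(reduce_conflicts(I1))        # {}
print(reduce_conflicts(I2))        # {}
print(sorted(reduce_conflicts(I12)))   # ['b']
```

Neither original state has a conflict, but after the union both productions are candidates for reduction on lookahead b.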
117 Creation of LALR Parsing Tables
Create the canonical LR(1) collection of sets of LR(1) items for the given grammar.
For each core, find all sets having that core and replace them by their union: C = {I0,...,In} becomes C' = {J0,...,Jm} where m ≤ n.
Create the parsing tables (action and goto tables) the same way as for the LR(1) parser.
Note: if J = I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same core, the cores of goto(I1,X),...,goto(Ik,X) must also be the same. So goto(J,X) = K, where K is the union of all sets of items having the same core as goto(I1,X).
If no conflict is introduced, the grammar is an LALR(1) grammar.
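The first two steps can be sketched for the grammar S → L=R | R, L → *R | id, R → L. The code below is a simplified illustration: it assumes a grammar without ε-productions (so FIRST of a string is FIRST of its first symbol), and every name in it is made up for this example.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
           ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
NONTERMINALS = ["S'", "S", "L", "R"]
SYMBOLS = ["S", "L", "R", "*", "id", "="]


def first_sets():
    # Assumes no epsilon-productions: only the first body symbol matters.
    first = {A: set() for A in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            new = first[body[0]] if body[0] in first else {body[0]}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


FIRST = first_sets()


def first_of(symbols):
    return FIRST[symbols[0]] if symbols[0] in FIRST else {symbols[0]}


def closure(items):
    result, work = set(items), list(items)
    while work:
        _, body, dot, a = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for la in first_of(body[dot + 1:] + (a,)):
                for h, b in GRAMMAR:
                    if h == body[dot] and (h, b, 0, la) not in result:
                        result.add((h, b, 0, la))
                        work.append((h, b, 0, la))
    return frozenset(result)


def goto(items, X):
    moved = {(h, b, d + 1, a) for h, b, d, a in items
             if d < len(b) and b[d] == X}
    return closure(moved) if moved else frozenset()


def canonical_lr1():
    C = [closure({("S'", ("S",), 0, "$")})]
    for I in C:                          # C grows while it is being scanned
        for X in SYMBOLS:
            J = goto(I, X)
            if J and J not in C:
                C.append(J)
    return C


def core(I):
    return frozenset((h, b, d) for h, b, d, _ in I)


def merge_by_core(C):
    merged = {}
    for I in C:                          # union all sets that share a core
        merged[core(I)] = merged.get(core(I), frozenset()) | I
    return list(merged.values())


lr1_states = canonical_lr1()
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), len(lalr_states))   # 14 10
```

The 14 canonical LR(1) sets shrink to 10 LALR sets, one for each of the four pairs with a common core; building the action and goto tables from `lalr_states` then proceeds exactly as in the LR(1) case.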
118 Canonical LR(1) Collection Example 1: Merging?
The canonical LR(1) collection I0, ..., I13 of the grammar S' → S, 1) S → L=R, 2) S → R, 3) L → *R, 4) L → id, 5) R → L is the one shown on slide 108. Which sets can be merged? The following pairs have the same core: I4 and I11, I5 and I12, I7 and I13, I8 and I10.
119 Canonical LALR(1) Collection Example 1
Grammar: S' → S, 1) S → L=R, 2) S → R, 3) L → *R, 4) L → id, 5) R → L
I0: S' → .S,$ ; S → .L=R,$ ; S → .R,$ ; L → .*R,$/= ; L → .id,$/= ; R → .L,$
I1: S' → S.,$
I2: S → L.=R,$ ; R → L.,$
I3: S → R.,$
I411: L → *.R,$/= ; R → .L,$/= ; L → .*R,$/= ; L → .id,$/=
I512: L → id.,$/=
I6: S → L=.R,$ ; R → .L,$ ; L → .*R,$ ; L → .id,$
I713: L → *R.,$/=
I810: R → L.,$/=
I9: S → L=R.,$
Transitions:
from I0: S to I1, L to I2, R to I3, * to I411, id to I512
from I2: = to I6
from I411: R to I713, L to I810, * to I411, id to I512
from I6: R to I9, L to I810, * to I411, id to I512
Same cores: I4 and I11, I5 and I12, I7 and I13, I8 and I10
Let's construct the parsing table!
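120 LALR(1) Parsing Tables for Example 1
States 4, 5, 7 and 8 stand for the merged sets I411, I512, I713 and I810.
state   id    *     =     $     S    L    R
0       s5    s4                1    2    3
1                         acc
2                   s6    r5
3                         r2
4       s5    s4                     8    7
5                   r4    r4
6       s5    s4                     8    9
7                   r3    r3
8                   r5    r5
9                         r1
There is no shift/reduce or reduce/reduce conflict, so the grammar is an LALR(1) grammar.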
121 Homework Exercise Exercise 4.4.1(e), Exercise Exercise Due date: Oct. 31,
122 Summary
Parsers, Context-free grammars, Derivations, Parse Trees, Ambiguity, Top-Down and Bottom-Up Parsing, Design of Grammars, Recursive-Descent Parsers, LL(1) parsers, Shift-reduce parsing, Viable prefixes, Valid Items
LALR Parsing. What Yacc and most compilers employ.
LALR Parsing Canonical sets of LR(1) items Number of states much larger than in the SLR construction LR(1) = Order of thousands for a standard prog. Lang. SLR(1) = order of hundreds for a standard prog.
More informationLR Parsing Techniques
LR Parsing Techniques Introduction BottomUp Parsing LR Parsing as Handle Pruning ShiftReduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 BottomUP Parsing A bottomup parser
More informationBottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.
Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And
More information3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino
3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the
More informationPrinciples of Programming Languages
Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp 14/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 8! Bo;om Up Parsing Shi? Reduce LR(0) automata and
More informationUNITIII BOTTOMUP PARSING
UNITIII BOTTOMUP PARSING Constructing a parse tree for an input string beginning at the leaves and going towards the root is called bottomup parsing. A general type of bottomup parser is a shiftreduce
More informationSection A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.
Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or
More informationSyntax Analysis (Parsing)
An overview of parsing Functions & Responsibilities Context Free Grammars Concepts & Terminology Writing and Designing Grammars Syntax Analysis (Parsing) Resolving Grammar Problems / Difficulties TopDown
More informationLR Parsing Techniques
LR Parsing Techniques BottomUp Parsing  LR: a special form of BU Parser LR Parsing as Handle Pruning ShiftReduce Parser (LR Implementation) LR(k) Parsing Model  k lookaheads to determine next action
More informationSYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic
More informationTableDriven Parsing
TableDriven Parsing It is possible to build a nonrecursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls [1] The nonrecursive parser looks up the production
More informationParsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)
TD parsing  LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers
More informationSyn S t yn a t x a Ana x lysi y s si 1
Syntax Analysis 1 Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token, tokenval Get next token Parser and rest of frontend Intermediate representation Lexical error Syntax
More informationPART 3  SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309
PART 3  SYNTAX ANALYSIS F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2016 64 / 309 Goals Definition of the syntax of a programming language using context free grammars Methods for parsing
More informationWWW.STUDENTSFOCUS.COM UNIT 3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.
More informationBottomup parsing. BottomUp Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form
Bottomup parsing Bottomup parsing Recall Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form If α V t,thenα is called a sentence in L(G) Otherwise it is just
More informationA leftsentential form is a sentential form that occurs in the leftmost derivation of some sentence.
Bottomup parsing Recall For a grammar G, with start symbol S, any string α such that S α is a sentential form If α V t, then α is a sentence in L(G) A leftsentential form is a sentential form that occurs
More informationMODULE 14 SLR PARSER LR(0) ITEMS
MODULE 14 SLR PARSER LR(0) ITEMS In this module we shall discuss one of the LR type parser namely SLR parser. The various steps involved in the SLR parser will be discussed with a focus on the construction
More informationSyntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38
Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program
More informationParsing. Roadmap. > Contextfree grammars > Derivations and precedence > Topdown parsing > Leftrecursion > Lookahead > Tabledriven parsing
Roadmap > Contextfree grammars > Derivations and precedence > Topdown parsing > Leftrecursion > Lookahead > Tabledriven parsing The role of the parser > performs contextfree syntax analysis > guides
More informationAcknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal
Acknowledgements The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error
More informationSyntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay
Syntax Analysis (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay September 2007 College of Engineering, Pune Syntax Analysis: 2/124 Syntax
More informationTop down vs. bottom up parsing
Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs
More information8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing
8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string
More informationCompilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017
Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege
More informationFormal Languages and Compilers Lecture VII Part 3: Syntactic A
Formal Languages and Compilers Lecture VII Part 3: Syntactic Analysis Free University of BozenBolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/
More informationIntroduction to parsers
Syntax Analysis Introduction to parsers Contextfree grammars Pushdown automata Topdown parsing LL grammars and parsers Bottomup parsing LR grammars and parsers Bison/Yacc  parser generators Error
More informationMIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology
MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) $ X s0 shift to s2 error error goto s1
More informationBottomUp Parsing. Parser Generation. LR Parsing. Constructing LR Parser
Parser Generation Main Problem: given a grammar G, how to build a topdown parser or a bottomup parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence 
More informationCSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D1
CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D1 Agenda LR Parsing Tabledriven Parsers Parser States ShiftReduce and ReduceReduce conflicts UW CSE P 501 Spring 2018
More informationCA Compiler Construction
CA4003  Compiler Construction David Sinclair A topdown parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back
More informationCompiler Construction 2016/2017 Syntax Analysis
Compiler Construction 2016/2017 Syntax Analysis Peter Thiemann November 2, 2016 Outline 1 Syntax Analysis Recursive topdown parsing Nonrecursive topdown parsing Bottomup parsing Syntax Analysis tokens
More informationConcepts Introduced in Chapter 4
Concepts Introduced in Chapter 4 Grammars ContextFree Grammars Derivations and Parse Trees Ambiguity, Precedence, and Associativity Top Down Parsing Recursive Descent, LL Bottom Up Parsing SLR, LR, LALR
More informationCS 321 Programming Languages and Compilers. VI. Parsing
CS 321 Programming Languages and Compilers VI. Parsing Parsing Calculate grammatical structure of program, like diagramming sentences, where: Tokens = words Programs = sentences For further information,
More information3. Parsing. Oscar Nierstrasz
3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/
More informationCompiler Design 1. BottomUP Parsing. Goutam Biswas. Lect 6
Compiler Design 1 BottomUP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,
More informationSYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram
CS6660 COMPILER DESIGN Question Bank UNIT IINTRODUCTION TO COMPILERS 1. Define compiler. 2. Differentiate compiler and interpreter. 3. What is a language processing system? 4. List four software tools
More informationDownloaded from Page 1. LR Parsing
Downloaded from http://himadri.cmsdu.org Page 1 LR Parsing We first understand Context Free Grammars. Consider the input string: x+2*y When scanned by a scanner, it produces the following stream of tokens:
More informationPrinciples of Compiler Design Presented by, R.Venkadeshan,M.TechIT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology
Principles of Compiler Design Presented by, R.Venkadeshan,M.TechIT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology 6/30/2010 Principles of Compiler Design R.Venkadeshan 1 Preliminaries
More informationParsing Wrapup. Roadmap (Where are we?) Last lecture Shiftreduce parser LR(1) parsing. This lecture LR(1) parsing
Parsing Wrapup Roadmap (Where are we?) Last lecture Shiftreduce parser LR(1) parsing LR(1) items Computing closure Computing goto LR(1) canonical collection This lecture LR(1) parsing Building ACTION
More informationContextfree grammars
Contextfree grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals  tokens nonterminals  represent higherlevel structures of a program start symbol,
More informationNote that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).
LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and
More informationWednesday, September 9, 15. Parsers
Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda
More informationParsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:
What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda
More informationUNIT III & IV. Bottom up parsing
UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.
More informationCompiler Construction: Parsing
Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Contextfree grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax
More informationPrinciple of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale
Free University of Bolzano Principles of Compilers Lecture IV Part 4, 2003/2004 AArtale (1) Principle of Compilers Lecture IV Part 4: Syntactic Analysis Alessandro Artale Faculty of Computer Science Free
More informationLet us construct the LR(1) items for the grammar given below to construct the LALR parsing table.
MODULE 18 LALR parsing After understanding the most powerful CALR parser, in this module we will learn to construct the LALR parser. The CALR parser has a large set of items and hence the LALR parser is
More informationLR Parsers. Aditi Raste, CCOEW
LR Parsers Aditi Raste, CCOEW 1 LR Parsers Most powerful shiftreduce parsers and yet efficient. LR(k) parsing L : left to right scanning of input R : constructing rightmost derivation in reverse k : number
More informationFormal Languages and Compilers Lecture VII Part 4: Syntactic A
Formal Languages and Compilers Lecture VII Part 4: Syntactic Analysis Free University of BozenBolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/
More informationWhere We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser
More informationSyntax Analysis Part I
Syntax Analysis Part I Chapter 4: ContextFree Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,
More informationCMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters
: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter
More informationVIVA QUESTIONS WITH ANSWERS
VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another languagethe
More informationLR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 ShiftReduce Parsing.
LR Parsing Compiler Design CSE 504 1 ShiftReduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26
More informationCSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D1
CSE 401 Compilers LR Parsing Hal Perkins Autumn 2011 10/10/2011 200211 Hal Perkins & UW CSE D1 Agenda LR Parsing Tabledriven Parsers Parser States ShiftReduce and ReduceReduce conflicts 10/10/2011
More informationParsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468
Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure
More informationParser Generation. BottomUp Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottomup  from leaves to the root
Parser Generation Main Problem: given a grammar G, how to build a topdown parser or a bottomup parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence 
More informationMonday, September 13, Parsers
Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of nonterminals S is the start symbol P is the set of productions
More informationSyntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1
Syntactic Analysis Chapter 4 Compiler Construction Syntactic Analysis 1 Contextfree Grammars The syntax of programming language constructs can be described by contextfree grammars (CFGs) Relatively simple
More informationCompiler Design. Spring Syntactic Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compiler Design Spring 2010 Syntactic Analysis Sample Exercises and Solutions Prof. Pedro C. Diniz USC / Information Sciences Institute 4676 Admiralty Way, Suite 1001 Marina del Rey, California 90292 pedro@isi.edu
More informationCSC 4181 Compiler Construction. Parsing. Outline. Introduction
CC 4181 Compiler Construction Parsing 1 Outline Topdown v.s. Bottomup Topdown parsing Recursivedescent parsing LL1) parsing LL1) parsing algorithm First and follow sets Constructing LL1) parsing table
More informationArchitecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End
Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter
More informationParsing. Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018
Parsing Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018 Character stream Lexical Analyzer MachineIndependent Code Code Optimizer F r o n t e n d Token stream Syntax Analyzer Syntax tree Semantic
More informationThe analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
COMPILER DESIGN 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another languagethe target
More informationGeneral Overview of Compiler
General Overview of Compiler Compiler:  It is a complex program by which we convert any high level programming language (source code) into machine readable code. Interpreter:  It performs the same task
More informationS Y N T A X A N A L Y S I S LR
LR parsing There are three commonly used algorithms to build tables for an LR parser: 1. SLR(1) = LR(0) plus use of FOLLOW set to select between actions smallest class of grammars smallest tables (number
More informationSyntax Analyzer  Parser
Syntax Analyzer  Parser ASU Textbook Chapter 4.24.9 (w/o error handling) Tsansheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 A program represented by a sequence of tokens
More informationTabledriven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and nonterminals.
Bottomup Parsing: Tabledriven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and nonterminals. Basic operation is to shift terminals from the input to the
More informationSyntax Analysis: Contextfree Grammars, Pushdown Automata and Parsing Part  4. Y.N. Srikant
Syntax Analysis: Contextfree Grammars, Pushdown Automata and Part  4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationQUESTIONS RELATED TO UNIT I, II And III
QUESTIONS RELATED TO UNIT I, II And III UNIT I 1. Define the role of input buffer in lexical analysis 2. Write regular expression to generate identifiers give examples. 3. Define the elements of production.
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler