Bottom-Up Parsing II. Lecture 8


Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Shift: ABC | xyz → ABCx | yz. Reduce: Cbxy | ijk → CbA | ijk (using the production A → xy).

Recall: The Stack. The left string can be implemented by a stack; the top of the stack is the |. Shift pushes a terminal on the stack. Reduce pops 0 or more symbols off of the stack (the production rhs) and pushes a non-terminal on the stack (the production lhs).
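The two moves above can be sketched on a list-based stack (a minimal sketch; the (lhs, rhs) encoding of productions is an assumption for illustration, not the lecture's notation):

```python
# Sketch of the two shift-reduce moves on a list-based stack.

def shift(stack, inp):
    """Shift: move the next input terminal onto the top of the stack."""
    stack.append(inp.pop(0))

def reduce(stack, lhs, rhs):
    """Reduce: pop len(rhs) symbols (the production's rhs, which must be
    on top of the stack) and push the production's lhs (a non-terminal)."""
    assert tuple(stack[len(stack) - len(rhs):]) == tuple(rhs)
    del stack[len(stack) - len(rhs):]
    stack.append(lhs)

stack, inp = [], ["int", "*", "int"]
shift(stack, inp)              # stack: ['int'], input: ['*', 'int']
reduce(stack, "T", ("int",))   # apply T -> int; stack: ['T']
# (as the next slide shows, this particular reduction turns out to be
#  exactly the kind of premature reduce we must learn to avoid)
```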

Key Issue: How do we decide when to shift or reduce? Example grammar: E → T + E | T, T → int * T | int | (E). Consider the step int | * int + int. We could reduce by T → int, giving T | * int + int. A fatal mistake! There is now no way to reduce to the start symbol: basically, no production has a rhs that looks like T * …

So What This Shows. We definitely don't always want to reduce just because we have the right-hand side of a production on top of the stack. To repeat: just because there's the right-hand side of a production sitting on the top of the stack, it might be a mistake to do a reduction. We might instead want to wait and do our reduction someplace else. And the idea about how we decide is that we only want to reduce if the result can still be reduced to the start symbol.

Recall Some Terminology. If S is the start symbol of a grammar G, and α is such that S →* α, then α is a sentential form of G. Note α may contain both terminals and non-terminals. A sentence of G is a sentential form that contains no non-terminals. Technically speaking, the language of G, L(G), is the set of sentences of G.

Handles. Intuition: we want to reduce only if the result can still be reduced to the start symbol. Informally, a handle is a substring that matches the body of a production, and whose reduction (using that production) represents one step along the reverse of a rightmost derivation.

Handles. Intuition: we want to reduce only if the result can still be reduced to the start symbol. Assume a rightmost derivation S →* αXω → αβω. (Remember, the parser is going from right to left here.) Then the production X → β is a handle of αβω. What this means is that you can reduce using this production and still potentially get back to the start symbol.

Handles (Cont.). Handles formalize the intuition: a handle is a string that can be reduced (using a particular production at a specific spot) and also allows further reductions back to the start symbol. We only want to reduce at handles. Clearly, if we do a reduction at a place that is not a handle, we won't be able to get back to the start symbol! Note: we have said what a handle is, not how to find handles. Finding handles consumes most of the rest of our discussion of parsing.

Techniques for Recognizing Handles. Bad news: there is no known efficient algorithm that recognizes handles in general. So for an arbitrary grammar, we don't have a fast way to find handles while we are parsing. Good news: there are heuristics for guessing handles, AND for some fairly large classes of context-free grammars, these heuristics always identify the handles correctly. For the heuristics we will use, these are the SLR (Simple LR) grammars (L = Left-to-right scan, R = Rightmost derivation in reverse).

Important Fact #2. Important Fact #2 about bottom-up parsing: In shift-reduce parsing, handles appear only at the top of the stack, never inside. Recall that Important Fact #1 was in the previous slide set: a bottom-up parser traces a rightmost derivation in reverse.

Why? Informal induction on the number of reduce moves: True initially, when the stack is empty. Immediately after reducing a handle, the right-most non-terminal is on top of the stack. The next handle must be to the right of that right-most non-terminal, because this is a right-most derivation. A sequence of shift moves then reaches the next handle.

Summary of Handles. In shift-reduce parsing, handles always appear at the top of the stack. Handles are never to the left of the rightmost non-terminal. Therefore, shift-reduce moves are sufficient; the | need never move left. Bottom-up parsing algorithms are based on recognizing handles.

Recognizing Handles. There are no known efficient algorithms to recognize handles. Solution: use heuristics to guess which stacks are handles. On some CFGs, the heuristics always guess correctly. For the heuristics we use here, these are the SLR grammars. Other heuristics work for other grammars.

Classes of CFGs (equivalently, parsers). Unambiguous CFG: only one parse tree for a given string of tokens. Equivalently, an ambiguous grammar is one that produces more than one leftmost derivation or more than one rightmost derivation for a given sentence. LR(k) CFG: bottom-up parser that scans left-to-right, produces a rightmost derivation (in reverse), and uses k tokens of lookahead.

Classes of CFGs. LALR(k) CFG: Look-Ahead LR grammar. Basically a more memory-efficient variant of an LR parser (at the time of development, the memory requirements of LR parsers made them impractical). Most parser generator tools use this class. SLR(k) CFG: Simple LR parser. Smaller parse table and simpler algorithm.

Grammars. Strict containment relations here: All CFGs ⊃ Unambiguous CFGs ⊃ LR(k) CFGs ⊃ LALR(k) CFGs ⊃ SLR(k) CFGs. (Ambiguous grammars will generate conflicts.)

Viable Prefixes. It is not obvious how to detect handles. At each step the parser sees only the stack, not the entire input; start with that... α is a viable prefix if there is an ω such that α | ω is a valid state of a shift-reduce parse (α = stack, ω = rest of input). The parser might know the first token of ω, but typically not more than that.

Huh? What does this mean? A few things: A viable prefix does not extend past the right end of the handle. It's a "viable prefix" because it is a prefix of the handle. As long as a parser has viable prefixes on the stack, no parsing error has been detected. This all isn't very deep. Saying that α is a viable prefix is just saying that if α is on the stack, we haven't yet made an error. (These are the valid states of a shift-reduce parse.)

Important Fact #3. Important Fact #3 about bottom-up parsing: For any grammar, the set of viable prefixes is a regular language. This is an amazing fact, and one that is the key to bottom-up parsing. Almost all the bottom-up parsing tools are based on this fact.

Important Fact #3 (Cont.). Important Fact #3 is non-obvious. We show how to compute automata that accept viable prefixes. But first, we'll need a number of additional definitions.

Items. An item is a production with a "." somewhere on the rhs. We can create these from the usual productions. E.g., the items for T → (E) are: T → .(E), T → (.E), T → (E.), T → (E).
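The items of a production can be enumerated mechanically; a small sketch (the (lhs, rhs, dot) tuple encoding is an assumption for illustration):

```python
# Enumerate the LR(0) items of a production: one item per dot position.
# An item is encoded as (lhs, rhs, dot), with the dot sitting before rhs[dot].

def items(lhs, rhs):
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]

def show(item):
    lhs, rhs, dot = item
    syms = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return lhs + " -> " + " ".join(syms)

# The four items of T -> ( E )
for it in items("T", ("(", "E", ")")):
    print(show(it))
# T -> . ( E )
# T -> ( . E )
# T -> ( E . )
# T -> ( E ) .

# The special case on the next slide: X -> epsilon has the single item X -> .
print(show(items("X", ())[0]))
# X -> .
```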

Items (Cont.). The only item for X → ε is X → . This is a special case. Items are often called LR(0) items.

Intuition. The problem in recognizing viable prefixes is that the stack has only bits and pieces of the rhs of productions. If it had a complete rhs, we could reduce. These bits and pieces are always prefixes of the rhs of productions.

It Turns Out that what is on the stack is not random. It has a special structure: these bits and pieces are always prefixes of the right-hand sides of productions. That is, in any successful parse, what is on the stack always has to be a prefix of the right-hand side of some production (or productions).

Example. Consider the input (int). Then (E | ) is a valid state of a shift-reduce parse (stack | input). (E is a prefix of the rhs of T → (E); it will be reduced after the next shift. The item T → (E.) says that so far we have seen (E of this production and hope to see ). I.e., this item describes this state of affairs (what we've seen and what we hope to see). Note: we may not see what we hope to see.

Stated Another Way. The left of the dot tells what we're working on; the right of the dot tells what we're waiting to see before we can perform a reduction.

Generalization. The stack always has the structure of (possibly multiple) prefixes of rhs's. Several prefixes literally stacked up on the stack: Prefix_1 Prefix_2 ... Prefix_(n-1) Prefix_n

Generalization. Let Prefix_i be a prefix of the rhs of X_i → α_i. Prefix_i will eventually reduce to X_i. The missing part of α_(i-1) starts with X_i (i.e., the missing suffix of α_(i-1) must start with X_i). That is, when we eventually replace α_i with X_i, X_i must extend Prefix_(i-1) to be closer to a complete rhs of the production X_(i-1) → α_(i-1). I.e., there is a production X_(i-1) → Prefix_(i-1) X_i β for some β (which may be empty).

Generalization. Recursively, Prefix_(k+1) ... Prefix_n must eventually reduce to the missing part of α_k. That is, all the prefixes above Prefix_k must eventually reduce to the missing part of α_k. So the image: a stack of prefixes. We are always working on the topmost one (shifting and reducing). But every time we perform a reduction, it must extend the prefix immediately below it on the stack. And when a bunch of these have been removed via reduction, then we get to work on prefixes lower in the stack.

An Example. E → T + E | T, T → int * T | int | (E). Consider the string (int * int): then (int * | int) is a state of a shift-reduce parse (stack | input). ( is a prefix of the rhs of T → (E): we've seen (, and are working on the rest of this production. ε is a prefix of the rhs of E → T: we've seen none of this. int * is a prefix of the rhs of T → int * T: we've seen the start of this.

An Example (Cont.). We can record this with a stack of items: T → (.E), E → .T, T → int *.T (top of stack). This says: we've seen ( of T → (E); we've seen ε of E → T; we've seen int * of T → int * T. Note how the lhs of each production is going to become part of the rhs of the previous production.

Recognizing Viable Prefixes. Idea: to recognize viable prefixes, we must recognize a sequence of partial right-hand sides of productions, where each partial rhs can eventually reduce to part of the missing suffix of its predecessor.

Now, onto the algorithm for implementing this idea. That is, we finally get to the technical highlight of bottom-up parsing!

An NFA Recognizing Viable Prefixes. We're going to build an NFA that recognizes the viable prefixes of G. This should be possible, since we claim that the set of viable prefixes of G is a regular language. Input to the NFA is the stack (the NFA reads the stack, from bottom to top). Output of the NFA is either yes (viable prefix) or no (not a viable prefix).

An NFA Recognizing Viable Prefixes. But let's be clear about what this is saying: Yes: this stack is OK; we could still end up parsing the input. No: what we've got on the stack now doesn't resemble any valid stack for any possible parse of any input string for this grammar.

An NFA Recognizing Viable Prefixes. A dummy production simplifies some things: 1. Add a dummy production S' → S to G. 2. The NFA states are the items of G (including the extra production). 3. For item E → α.Xβ, add the transition (E → α.Xβ) --X--> (E → αX.β). X can be a terminal or a non-terminal; "X" here means "on input X". And remember, items are the states of the NFA, so this is telling us when to move from one state to another (i.e., a transition).

An NFA Recognizing Viable Prefixes. 4. For item E → α.Xβ and production X → γ, add the ε-transition (E → α.Xβ) --ε--> (X → .γ). X must be a non-terminal here. Is it possible that there could be a valid configuration of the parser where we saw α but then X didn't appear next? Yes: the stack is a sequence of partial rhs's. It could be that all that's on the stack right now for this production is α, and the next thing on the stack is eventually going to reduce to X.

An NFA Recognizing Viable Prefixes. What does this mean? It means that whatever is there on the stack has to be derived from X (has to be something that can be generated by a sequence of X-productions), because eventually it's going to reduce to X. So, for every item that looks like this, and for every production for X, we're going to say: if there is no X on the stack, we make an ε-move; we shift to a state where we can try to recognize something derived from X.

An NFA Recognizing Viable Prefixes. Summary of the construction so far: 1. Add a dummy production S' → S to G. 2. The NFA states are the items of G, including the extra production. 3. For item (NFA state) E → α.Xβ, add the transition (E → α.Xβ) --X--> (E → αX.β). 4. For item E → α.Xβ and production X → γ, add (E → α.Xβ) --ε--> (X → .γ). (At this point the parser must be at the start of another rhs.)

An NFA Recognizing Viable Prefixes (Cont.). 5. Every state is an accepting state: if the automaton successfully consumes the entire stack, then the stack is viable. (Note that not every state has a transition on every symbol, so lots of possible stacks will be rejected.) 6. The start state is S' → .S. The dummy production was added so we can conveniently name the start state.
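Rules 1-4 of this construction can be sketched directly in code (a sketch, assuming the same (lhs, rhs, dot) item encoding as before; "eps" marks ε-transitions):

```python
# Build the transition relation of the NFA whose states are LR(0) items:
# rule 3 gives a transition on the symbol after the dot; rule 4 gives
# eps-transitions to the start items of that symbol's productions.

GRAMMAR = {                       # the lecture's example grammar
    "S'": [("E",)],               # rule 1: dummy production S' -> E
    "E":  [("T", "+", "E"), ("T",)],
    "T":  [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

def nfa_transitions(grammar):
    delta = []                    # list of (source item, label, target item)
    for lhs, prods in grammar.items():
        for rhs in prods:
            for dot in range(len(rhs)):       # rule 2: items are states
                x = rhs[dot]
                src = (lhs, rhs, dot)
                delta.append((src, x, (lhs, rhs, dot + 1)))   # rule 3
                for gamma in grammar.get(x, []):              # rule 4
                    delta.append((src, "eps", (x, gamma, 0)))
    return delta

delta = nfa_transitions(GRAMMAR)
# e.g. S' -> .E moves to S' -> E. on E, and has eps-moves to E -> .T+E, E -> .T
```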

An Example. S' → E, E → T + E | T, T → int * T | int | (E). The usual grammar, augmented with the extra production.

NFA for Viable Prefixes of the Example. [Diagram: the complete NFA whose states are the LR(0) items of the augmented grammar; it is built up step by step on the following slides.]

The Example Automaton. It's actually quite complicated: many states and transitions. And this is for a relatively simple grammar, so these NFAs for recognizing the viable prefixes of a grammar are in general quite elaborate. Let's build it!

NFA for Viable Prefixes in Detail (1). Start state: S' → .E. This is saying we're hoping to see an E on the stack. If we don't, then we're hoping to see something derived from E.

NFA for Viable Prefixes in Detail 2) S. S... + If we see and on the stack, we move the dot over, saying we ve read the on the stack. his would indicate we re probably done with parsing you would have reduced the old start symbol and be about to reduce to the new start symbol.) Lacking at top of stack, hope for either a or a + use Rule 4) 46

NFA for Viable Prefixes in Detail 2) S. S... + Notice how we crucially use the power of nondeterministic automata. We don t know which rhs of a production is going to appear on the stack and this grammar isn t even left-factored) so we just use the guessing power of the NFA, let it choose which one to use because it can always guess correctly). 47

NFA for Viable Prefixes in Detail 2) S. S... + Another way of describing the intuition here: From the red box, we d like to see an as the next input symbol. If we do, then we can reduce to the start state S. But, if we don t see an next, then we d like to see something that can be derived from, because then we could eventually reduce that derived thing back to, and then back to S. he - transitions take us to things that can be derived from. 48

NFA for Viable Prefixes in Detail 2) S. S... + Of course we can compile this down to a DFA, but at this point it s really nice to not have to know which choice to make. S + int * int ) 49

NFA for Viable Prefixes in Detail (3). Same as before. Note that when we see a dot at the end of a rhs, we've got a handle that we might want to reduce. We'll talk about this later, but this is essentially how we're going to go about recognizing handles.

NFA for Viable Prefixes in Detail (3, cont.). Note that in these newly added links, the dot is all the way to the left, indicating that we haven't actually seen anything from the rhs of the production yet.

NFA for Viable Prefixes in Detail (4). Note that when we see a non-terminal to the right of the dot, we create a link corresponding to that non-terminal being on top of the stack, and other links (ε-transitions) to the productions that derive things from that non-terminal.

NFA for Viable Prefixes in Detail (5). Note that when we see a terminal to the right of the dot, there is only one possible move: we create a link corresponding to that terminal being on top of the stack.

NFA for Viable Prefixes in Detail (6)-(13). [Diagrams: the remaining transitions and ε-moves are added, one production at a time, until every item of the augmented grammar appears as a state and the NFA is complete.]

NFA (for Viable Prefixes) to DFA. [Diagrams: starting from the start symbol, the NFA of items is converted to a DFA by the subset construction; each DFA state is a set of items closed under ε-moves.]

Translation to the DFA. [Diagram: the resulting DFA for the example grammar; its numbered states are used in the parsing traces below.]

Lingo. The states of the DFA are "canonical collections of items" or "canonical collections of LR(0) items". The Dragon book gives another way of constructing LR(0) items; it's a little more complex (and a bit more difficult to understand if you're seeing this for the first time).

Valid Items. The item X → β.γ is valid for a viable prefix αβ if S' →* αXω → αβγω by a right-most derivation. After parsing αβ, the valid items are the possible tops of the stack of items. Remember what this item is saying: "I hope to see γ at the top of the stack so that I can reduce βγ to X." Saying the item is valid is saying that doing that reduction leaves open the possibility of a successful parse.

Simpler Way: Items Valid for a Prefix. For a given viable prefix α, the items that are valid for that prefix are exactly the items in the final state of the DFA after it reads α. That is, an item I is valid for a given viable prefix α if the DFA that recognizes viable prefixes terminates on input α in a state s containing I. The items in s describe what the top of the item stack might be after reading input α. So valid items describe what might be on top of the stack after you've seen stack α.

Valid Items Example. An item is often valid for many prefixes. Example: the item T → (.E) is valid for the prefixes (, ((, (((, ((((, ... To see this, look at the DFA.

Valid Items for ( ( ( (. [Diagram: running the DFA on this prefix ends in the state containing T → (.E), together with the start items added by its ε-closure.]

So Now... We're finally going to give the actual bottom-up parsing algorithm. But first (because we like to keep you in suspense), we give a few minutes to a very weak, but very similar, bottom-up parsing algorithm. All this builds on what we've been talking about so far.

LR(0) Parsing Algorithm. Idea: assume the stack contains α, the next input is t, and the DFA on input α terminates in state s (i.e., s is a final state of the DFA). Then: Reduce by X → β if s contains the item X → β. (we've seen the entire rhs here). Shift if s contains an item X → β.tω (OK to push t onto the stack; equivalent to saying s has a transition labeled t).

LR(0) Conflicts (i.e., when does LR(0) get into trouble?). LR(0) has a reduce/reduce conflict if any state has two reduce items X → β. and Y → ω. (two complete rhs's; we might not be able to decide between the two possible reduce moves). LR(0) has a shift/reduce conflict if any state has a reduce item and a shift item, X → β. and Y → ω.tδ (do we shift or reduce? only an issue if t is the next input symbol).

LR(0) Conflicts. [Diagram: in the DFA state containing E → T. and E → T.+ E, if the next input is +, we can do either a shift or a reduce. In the state containing T → int. and T → int.* T, if the next input is *, we can do either a shift or a reduce.] Two shift/reduce conflicts with the LR(0) rules.

SLR. LR = Left-to-right scan. SLR = Simple LR. SLR improves on the LR(0) shift/reduce heuristics: add one small heuristic so that fewer states have conflicts.

SLR Parsing. Idea: assume the stack contains α, the next input is t, and the DFA on input α terminates in state s. Reduce by X → β if s contains the item X → β. (β is on top of the stack and valid) and t ∈ Follow(X). Shift if s contains an item X → β.tω. Without the Follow condition, the decision is based entirely on the stack contents; but it might not make sense to reduce, given what the next input symbol is.

So what's going on here? We have our stack contents, and it ends in a β. Now we're going to make a move (a reduction) and replace that β by X. So things look like this: ...β | t..., and we're going to make them look like ...X | t.... And this makes no sense whatsoever if t cannot follow X. So we add the restriction that we only make the move if t can follow X.

SLR Parsing (Cont.). If there are conflicts under these rules, the grammar is not SLR. The rules amount to a heuristic for detecting handles. Defn: the SLR grammars are those where the heuristics detect exactly the handles. So we take into account two pieces of information: 1) the contents of the stack (that's what the DFA does for us: it tells us what items are possible at the top of the stack), and 2) what's coming in the input.

SLR Conflicts. [Diagram: the same two DFA states as before, now with the Follow restriction applied.] Follow(E) = { ), $ }. Follow(T) = { +, ), $ }. No conflicts with the SLR rules!

Precedence Declarations (Digression). Lots of grammars aren't SLR, including all ambiguous grammars. So SLR is an improvement, but not a really very general class of grammars. We can make SLR parse more grammars by using precedence declarations: instructions for resolving conflicts.

Precedence Declarations (Cont.). Consider our favorite ambiguous grammar: E → E + E | E * E | (E) | int. The DFA for this grammar contains a state with the following items: E → E * E. and E → E.+ E — a shift/reduce conflict! This is exactly the question of the precedence of + and *. Declaring that * has higher precedence than + resolves this conflict in favor of reducing. Note that reducing here places the * lower in the parse tree!

Precedence Declarations (Cont.). The term "precedence declaration" is misleading. These declarations do not define precedence; they define conflict resolutions. Not quite the same thing! In our example, because we're dealing with a natural grammar, the conflict resolution has the same effect as enforcing the precedence we want. But a more complicated grammar means more interactions between the various pieces of the grammar, and you might not get what you want in terms of enforcing precedence.
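As an aside, in Yacc/Bison-style parser generators such conflict resolutions are written as precedence declarations in the grammar file; a minimal sketch (the surrounding grammar is assumed):

```yacc
/* Later declarations bind tighter: '*' gets higher precedence than '+',
   so the conflict state with E -> E * E .  and  E -> E . + E
   is resolved in favor of reducing by E -> E * E. */
%left '+'
%left '*'
```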

Precedence Declarations (Cont.). Fortunately, the tools provide means for printing out the parsing automaton, which allows you to see exactly how conflicts are being resolved and whether these are the resolutions you expected. It is recommended that when you are building a parser, especially a fairly complex one, you examine the parsing automaton to make sure that it's doing exactly what you expect.

Naïve SLR Parsing Algorithm. 1. Let M be the DFA for viable prefixes of G. 2. Let | x1 ... xn $ be the initial configuration. 3. Repeat until the configuration is S | $: Let α | ω be the current configuration. Run M on the current stack α. If M rejects α, report a parsing error (stack α is not a viable prefix). If M accepts α with item set I, let a be the next input symbol: Shift if X → β.aγ ∈ I. Reduce if X → β. ∈ I and a ∈ Follow(X). Report a parsing error if neither applies.

Naïve SLR Parsing Algorithm (Cont.). Note: the "If M rejects α" check is not actually needed. The checks at the bottom (shift, reduce, otherwise error) will always catch parse errors, because we never form an invalid stack.
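The naïve algorithm can be sketched end-to-end for the example grammar. This is a sketch under assumptions: the viable-prefix automaton is simulated directly as an NFA of LR(0) items (subset construction on the fly) rather than a precomputed numbered DFA, and on every iteration it is re-run on the whole stack, exactly as the naïve algorithm says:

```python
# Naive SLR parsing: re-run the viable-prefix automaton on the stack each
# step; the resulting item set I plus the next input decide shift vs. reduce.

GRAMMAR = {
    "S'": [("E",)],
    "E":  [("T", "+", "E"), ("T",)],
    "T":  [("int", "*", "T"), ("int",), ("(", "E", ")")],
}
FOLLOW = {"S'": {"$"}, "E": {")", "$"}, "T": {"+", ")", "$"}}  # from the slides

def closure(items):
    """Add X -> .gamma for every non-terminal X appearing after a dot."""
    items, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for gamma in GRAMMAR[rhs[dot]]:
                it = (rhs[dot], gamma, 0)
                if it not in items:
                    items.add(it)
                    work.append(it)
    return items

def run(stack):
    """Run the viable-prefix automaton on the whole stack; return item set."""
    state = closure({("S'", ("E",), 0)})
    for sym in stack:
        state = closure({(l, r, d + 1) for (l, r, d) in state
                         if d < len(r) and r[d] == sym})
    return state

def parse(tokens):
    stack, inp = [], list(tokens) + ["$"]
    while stack != ["E"] or inp != ["$"]:
        I, a = run(stack), inp[0]
        red = [(l, r) for (l, r, d) in I if d == len(r) and a in FOLLOW[l]]
        if red:                                      # reduce by X -> beta
            lhs, rhs = red[0]
            stack[len(stack) - len(rhs):] = [lhs]
        elif any(d < len(r) and r[d] == a for (_, r, d) in I):
            stack.append(inp.pop(0))                 # shift
        else:
            raise SyntaxError("parse error at " + a)
    return True

print(parse(["int", "*", "int"]))   # True
```

On int * int this sketch makes exactly the moves of the extended example that follows: shift, shift (since * is not in Follow(T)), shift, then reduce by T → int, T → int * T, and E → T.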

Notes. If there is a conflict in the last step, the grammar is not SLR(k). (k is the amount of lookahead; in practice k = 1.)

2 Configuration int * int$ S.. S... +.) 1. + int int. * int int. int * int *..) 5 3 11 ). + 4 +... +.) int int *..) ) 10 6 9 +.. ).. +.) 89 8 7

Extended Example of SLR Parsing

Configuration    DFA Halt State    Action
|int * int$      1                 shift
int|* int$       3                 * not in Follow(T): shift
int *|int$       11                shift
int * int|$      3                 $ ∈ Follow(T): reduce T → int
int * T|$        4                 $ ∈ Follow(T): reduce T → int * T
T|$              5                 $ ∈ Follow(E): reduce E → T
E|$                                accept

Notes

Rerunning the automaton at each step is wasteful; most of the work is repeated. Put another way, at each iteration of the algorithm the change occurs only at the top of the stack, yet each iteration reruns the DFA over the whole stack, which is unnecessary.

An Improvement

Remember the state of the automaton on each prefix of the stack. Change the stack to contain pairs (before, it just contained symbols):

  ⟨Symbol, DFA State⟩

An Improvement (Cont.)

For a stack

  ⟨sym1, state1⟩ ... ⟨symn, staten⟩

staten is the final state of the DFA on sym1 ... symn.

Detail: the bottom of the stack is ⟨any, start⟩ where
- any is any dummy symbol (it doesn't matter what)
- start is the start state of the DFA
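The invariant above can be checked mechanically. Here is a minimal Python sketch, where DELTA is a hypothetical fragment of the lecture DFA's transition function, just enough for the running example (the state numbers follow the lecture's diagram):

```python
# A minimal sketch of the invariant on this slide: in a stack of
# <symbol, state> pairs, each recorded state is the state the DFA reaches
# after reading the symbols below it. DELTA is a hypothetical fragment of
# the lecture DFA's transition function, enough for the running example.

DELTA = {(1, "int"): 3, (3, "*"): 11, (11, "int"): 3}

def check_invariant(stack, start_state=1):
    state = start_state
    for symbol, recorded in stack[1:]:  # skip the <dummy, start> bottom pair
        state = DELTA[(state, symbol)]  # rerun the DFA one symbol at a time
        assert state == recorded, (symbol, state, recorded)
    return True

# Stack after shifting int, *, int in the running example:
assert check_invariant([("dummy", 1), ("int", 3), ("*", 11), ("int", 3)])
```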

Goto Table

Define goto[i, A] = j if state i has a transition on A to state j (goto is just the transition function of the DFA, written out as an array). This is one of the two parsing tables (the other is shown a few slides further on).

Refined Parser Has 4 Possible Moves

- Shift x: push ⟨a, x⟩ on the stack, where a is the current input and x is a DFA state
- Reduce X → α: as before, except pop one pair per symbol of the rhs off the stack and push one pair (for the lhs X) onto the stack
- Accept
- Error

Action Table (the second table)

For each state si and terminal a:
- If si has item X → α.aβ and goto[i, a] = j, then action[i, a] = shift j
- If si has item X → α. and a ∈ Follow(X) and X ≠ S', then action[i, a] = reduce X → α
- If si has item S' → S., then action[i, $] = accept
- Otherwise, action[i, a] = error

The action table tells us which kind of move to make in every possible state; it is indexed by the state of the automaton and the next input symbol.
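As a sketch of how the shift and reduce rules above play out, the following Python computes the action entries for a single state. This is not the lecture's code: the item encoding, the FOLLOW sets, and the state-3 transition on '*' are assumptions based on the running example (the accept and error cases are omitted):

```python
# A hypothetical sketch of the SLR action rules applied to one state.
# Items are encoded as (lhs, rhs, dot_position); FOLLOW is assumed
# precomputed for the running example grammar E -> T + E | T,
# T -> int * T | int | ( E ).

FOLLOW = {"E": {")", "$"}, "T": {"+", ")", "$"}}

def actions_for_state(items, transitions, follow, terminals):
    """Compute action[s, a] for a single state s from the SLR rules."""
    action = {}
    for a in terminals:
        for lhs, rhs, dot in items:
            if dot < len(rhs) and rhs[dot] == a and a in transitions:
                action[a] = ("shift", transitions[a])   # X -> alpha . a beta
            elif dot == len(rhs) and a in follow[lhs]:
                action[a] = ("reduce", lhs, rhs)        # X -> alpha . , a in Follow(X)
    return action

# State 3 of the lecture's DFA holds T -> int .  and  T -> int . * T,
# with a transition on '*' to state 11.
state3 = [("T", ("int",), 1), ("T", ("int", "*", "T"), 1)]
acts = actions_for_state(state3, {"*": 11}, FOLLOW, ["int", "*", "+", "(", ")", "$"])
# '*' is not in Follow(T), so there is no shift/reduce conflict here:
# '*' shifts to state 11, while '+', ')', and '$' reduce T -> int.
```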

SLR Parsing Algorithm

  Let I = w$ be the initial input
  Let j = 0
  Let DFA state 1 have item S' → .S
  Let stack = ⟨dummy, 1⟩
  repeat
    case action[top_state(stack), I[j]] of
      shift k:       push ⟨I[j++], k⟩
      reduce X → α:  pop |α| pairs,
                     push ⟨X, goto[top_state(stack), X]⟩
      accept:        halt normally
      error:         halt and report error
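The loop above can be sketched as runnable Python. The ACTION and GOTO entries below are hand-filled reconstructions covering only the configurations visited on the input int * int $; they are assumptions consistent with the worked example, not generated tables:

```python
# A runnable sketch of the SLR driver loop above. State numbers follow the
# lecture's DFA for the grammar E -> T + E | T, T -> int * T | int | ( E ).

# action[state, terminal]: ("shift", k) | ("reduce", lhs, rhs_length) | ("accept",)
ACTION = {
    (1, "int"): ("shift", 3),
    (11, "int"): ("shift", 3),
    (3, "*"): ("shift", 11),       # '*' not in Follow(T): shift
    (3, "$"): ("reduce", "T", 1),  # $ in Follow(T): reduce T -> int
    (4, "$"): ("reduce", "T", 3),  # $ in Follow(T): reduce T -> int * T
    (5, "$"): ("reduce", "E", 1),  # $ in Follow(E): reduce E -> T
    (2, "$"): ("accept",),
}
# goto[state, nonterminal] = state: the DFA transition function on nonterminals
GOTO = {(11, "T"): 4, (1, "T"): 5, (1, "E"): 2}

def slr_parse(tokens):
    stack = [("dummy", 1)]  # bottom of stack: <dummy symbol, DFA start state>
    j = 0
    trace = []
    while True:
        state = stack[-1][1]
        move = ACTION.get((state, tokens[j]), ("error",))
        trace.append((state, move[0]))
        if move[0] == "shift":
            stack.append((tokens[j], move[1]))
            j += 1
        elif move[0] == "reduce":
            _, lhs, rhs_len = move
            del stack[-rhs_len:]  # pop one <symbol, state> pair per rhs symbol
            stack.append((lhs, GOTO[(stack[-1][1], lhs)]))
        elif move[0] == "accept":
            return trace
        else:
            raise SyntaxError(f"no action in state {state} on {tokens[j]!r}")
```

On slr_parse(["int", "*", "int", "$"]) this visits halt states 1, 3, 11, 3, 4, 5 as in the worked example, with the accept occurring (in this sketch) in state 2.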

Notes on SLR Parsing Algorithm

Note that the algorithm uses only the DFA states and the input; the stack symbols are never used!

However, we still need the symbols for semantic actions: throwing away the stack symbols amounts to throwing away the program, which we'll obviously need for later stages of compilation!

More Notes

- There could be k tokens of lookahead, and thus k-token lookaheads in the items (and the lookahead need not be $)
- Some common constructs are not SLR(1)
- LR(1) is more powerful
  - Main difference between SLR and LR: build the lookahead into the items
  - An LR(1) item is a pair: LR(0) item × lookahead
  - [T → . int * T, $] means: after seeing T → int * T, reduce if the lookahead is $
  - More accurate than just using follow sets
- Take a look at the LR(1) automaton for your parser! Actually, your parser is an LALR(1) automaton, a bit of an optimization over LR(1)
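To make the item-pair idea concrete, here is a small sketch; the tuple encoding is an assumption for illustration, not the lecture's notation:

```python
# A small sketch of an LR(1) item as a pair: (LR(0) item, lookahead).

lr0_item = ("T", ("int", "*", "T"), 3)  # T -> int * T .   (dot at the end)
lr1_item = (lr0_item, "$")              # [T -> int * T . , $]

def lr1_reduce_ok(item, next_token):
    # An LR(1) reduce is keyed on the item's own lookahead, which can be a
    # proper subset of Follow(T) = {+, ), $} -- the extra precision over SLR.
    (lhs, rhs, dot), lookahead = item
    return dot == len(rhs) and next_token == lookahead

assert lr1_reduce_ok(lr1_item, "$")      # reduce on $
assert not lr1_reduce_ok(lr1_item, "+")  # no reduce on +, though + is in Follow(T)
```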