
Bottom Up Parsing Handout

Compiled by: Nomair A. Naeem
Additional Material by: Adriel Dean-Hall and Brad Lushman

This handout is intended to accompany material covered during lectures and is not considered a replacement for lectures. Dissemination of this handout (including posting on a website) is explicitly prohibited unless permission is obtained. Please report any errors to nanaeem@uwaterloo.ca.

1 Introduction

In bottom-up parsing we start from the input string w and create a reverse derivation up to the start symbol S:

    w = α_k → α_{k-1} → ... → α_1 → S

where each step is a reduction (the reverse of a derivation step): an occurrence of the RHS of some rule is replaced by its LHS.

2 Example illustrating bottom-up parsing

Using the augmented grammar (⊢ and ⊣ mark the beginning and end of the input):

    S' → ⊢ S ⊣
    S  → A y B
    A  → a b
    A  → c d
    B  → z
    B  → w x

the INPUT string is: ⊢ a b y w x ⊣

We will use a stack to store the partially reduced input read so far. At each step, the algorithm has two choices:

    SHIFT:  consume the next input symbol and push it onto the stack
    REDUCE: pop the RHS of a rule off the stack and push its LHS

Consumed Input    Remaining Input    Stack
ε                 ⊢ abywx ⊣          ε
    The stack does not hold the RHS of any rule, so SHIFT ⊢.
⊢                 abywx ⊣            ⊢
    Check if the contents of the stack are the RHS of some rule; they are not, so shift a.
⊢ a               bywx ⊣             ⊢ a
    Check again; still no RHS on the stack, so shift b.
⊢ ab              ywx ⊣              ⊢ a b
    This time ab is on top of the stack (TOS). REDUCE using A → ab: pop b and a, push A.
⊢ ab              ywx ⊣              ⊢ A
    No RHS on the stack, so shift y.
⊢ aby             wx ⊣               ⊢ A y
    Shift w.
⊢ abyw            x ⊣                ⊢ A y w
    Shift x.
⊢ abywx           ⊣                  ⊢ A y w x
    We have wx on TOS. REDUCE using B → wx: pop x and w, push B.
⊢ abywx           ⊣                  ⊢ A y B
    We have AyB on TOS. REDUCE using S → AyB: pop B, y and A, push S.
⊢ abywx           ⊣                  ⊢ S
    No RHS on TOS, so shift ⊣.
⊢ abywx ⊣         ε                  ⊢ S ⊣
    We have ⊢ S ⊣ on TOS. REDUCE using S' → ⊢ S ⊣: pop ⊣, S and ⊢, push S'.
⊢ abywx ⊣         ε                  S'
    Accept.
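A minimal sketch of this shift/reduce loop in C++ (the course's sample code is in Racket/C/C++, but this sketch is not that code). The grammar and input are hard-coded, the tokens "BOF" and "EOF" stand in for ⊢ and ⊣, and the loop simply reduces whenever the top of the stack spells out the RHS of some rule. That greedy rule happens to work for this grammar and input; deciding when to reduce in general is exactly the problem the rest of this handout addresses.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// One production: LHS -> RHS (a sequence of symbols).
struct Rule { std::string lhs; std::vector<std::string> rhs; };

int main() {
    // Augmented grammar from the example above (BOF/EOF stand in for the ⊢/⊣ markers).
    std::vector<Rule> rules = {
        {"S'", {"BOF", "S", "EOF"}},
        {"S",  {"A", "y", "B"}},
        {"A",  {"a", "b"}},
        {"A",  {"c", "d"}},
        {"B",  {"z"}},
        {"B",  {"w", "x"}},
    };
    std::vector<std::string> input = {"BOF", "a", "b", "y", "w", "x", "EOF"};
    std::vector<std::string> stack;

    size_t next = 0;
    while (true) {
        // REDUCE if the top of the stack is the RHS of some rule.
        bool reduced = false;
        for (const Rule& r : rules) {
            size_t n = r.rhs.size();
            if (stack.size() >= n &&
                std::equal(r.rhs.begin(), r.rhs.end(), stack.end() - n)) {
                stack.erase(stack.end() - n, stack.end());  // pop the RHS
                stack.push_back(r.lhs);                      // push the LHS
                reduced = true;
                break;
            }
        }
        if (reduced) continue;
        // Otherwise SHIFT the next input symbol, if any is left.
        if (next < input.size()) { stack.push_back(input[next++]); continue; }
        break;  // no reduce possible and no input left
    }
    // Accept if everything was reduced to the start symbol S'.
    std::cout << (stack == std::vector<std::string>{"S'"} ? "accept" : "reject") << "\n";
}

The check "does the top of the stack match some RHS" is exactly what the items and the LR(0) DFA described next mechanize.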

The accepting condition is reached when the stack contains just S' (that is, ⊢ S ⊣ has been reduced to S') and the remaining input is ε.

Since we only shift if we cannot reduce, the key challenge is to decide when to reduce. We use the notion of items, which can be viewed as tracking what portion of the RHS of a rule is on the stack.

USING THE SIMPLER GRAMMAR:

    S' → ⊢ E ⊣
    E  → E + T
    E  → T
    T  → id

ITEM: an item is a production with a bookmark (represented by a dot) somewhere between the symbols of the RHS.

    e.g.  S' → . ⊢ E ⊣    (a FRESH item: the dot is at the far left, none of the RHS is on the stack yet)
    e.g.  S' → ⊢ E ⊣ .    (a REDUCIBLE item: the dot is at the far right, the entire RHS is on the stack)

The items are organized into the states of a DFA (the LR(0) DFA), built starting from the fresh item for the S' rule:

    - Label a transition with the symbol that follows the dot.
    - Advance the dot past that symbol in the destination state.
    - If the dot now precedes a non-terminal, add FRESH items for all rules for this non-terminal. Do this RECURSIVELY.

    [Figure: the LR(0) DFA for the simpler grammar — each state is a set of items.]

IDEA: to decide on an action (shift/reduce) to perform, run the contents of the stack through the LR(0) DFA and see which state we end up in. If we end up in a state which has a reducible item we will reduce; if not, we will shift the next symbol from the unconsumed portion of the input.
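The construction just described can be sketched in code. The C++ program below (again, not the course-provided code) builds the LR(0) DFA for the simpler grammar: an item is a (rule, dot position) pair, closure adds fresh items recursively, and goTo advances the dot past a symbol. "BOF"/"EOF" stand in for ⊢/⊣, and the state numbering is simply the order in which the worklist discovers states, not necessarily the numbering used in lectures.

#include <algorithm>
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

// The "simpler" grammar.  Rule 0 is the augmented start rule; BOF/EOF stand in for ⊢/⊣.
struct Rule { std::string lhs; std::vector<std::string> rhs; };
static const std::vector<Rule> G = {
    {"S'", {"BOF", "E", "EOF"}},  // 0
    {"E",  {"E", "+", "T"}},      // 1
    {"E",  {"T"}},                // 2
    {"T",  {"id"}},               // 3
};
static bool isNonterminal(const std::string& s) { return s == "S'" || s == "E" || s == "T"; }

// An item is a rule plus a dot position in its RHS; dot == rhs.size() means "reducible".
using Item  = std::pair<int, int>;   // (rule index, dot position)
using State = std::set<Item>;        // one DFA state = a set of items

// Add fresh items (dot at the far left) for every rule of a nonterminal that
// appears right after a dot; repeat until nothing new appears ("do this recursively").
State closure(State s) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Item& it : State(s)) {
            const Rule& r = G[it.first];
            if (it.second < (int)r.rhs.size() && isNonterminal(r.rhs[it.second])) {
                for (int i = 0; i < (int)G.size(); ++i)
                    if (G[i].lhs == r.rhs[it.second])
                        changed |= s.insert({i, 0}).second;
            }
        }
    }
    return s;
}

// Take every item whose dot precedes `sym` and advance the dot past it.
State goTo(const State& s, const std::string& sym) {
    State out;
    for (const Item& it : s) {
        const Rule& r = G[it.first];
        if (it.second < (int)r.rhs.size() && r.rhs[it.second] == sym)
            out.insert({it.first, it.second + 1});
    }
    return closure(out);
}

int main() {
    // Build all reachable states of the LR(0) DFA with a simple worklist.
    std::vector<std::string> symbols = {"BOF", "EOF", "+", "id", "S'", "E", "T"};
    std::vector<State> states = {closure({{0, 0}})};
    std::map<std::pair<int, std::string>, int> trans;   // (state, symbol) -> state
    for (size_t i = 0; i < states.size(); ++i) {
        for (const std::string& sym : symbols) {
            State t = goTo(states[i], sym);
            if (t.empty()) continue;
            size_t j = std::find(states.begin(), states.end(), t) - states.begin();
            if (j == states.size()) states.push_back(t);
            trans[{(int)i, sym}] = (int)j;
        }
    }
    std::cout << "LR(0) DFA has " << states.size() << " states and "
              << trans.size() << " transitions\n";
}

Here closure implements the "add FRESH items recursively" step and goTo the "advance the dot" step; running the parser's symbol stack through trans is exactly the IDEA above.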

EXAMPLE RUN: Parse ⊢ id + id ⊣

Symbol Stack    Read Input    Unread Input
ε               ε             ⊢ id + id ⊣
    Pass the contents of the stack through the DFA and see what state we end up in. Since the stack is empty we are in the start state, which has no reducible item, so shift the next input symbol onto the stack.
⊢               ⊢             id + id ⊣
    Again run the DFA with the current stack contents. The state we end up in has no reducible item, so shift the next input symbol onto the stack.
⊢ id            ⊢ id          + id ⊣
    Pass the contents through the DFA. We reach a state with the reducible item T → id .  REDUCE: pop the RHS of the rule (id), push its LHS (T).
⊢ T             ⊢ id          + id ⊣
    Pass the contents through the DFA. We reach a state with the reducible item E → T .  REDUCE: pop T, push E.
⊢ E             ⊢ id          + id ⊣
    Pass the contents through the DFA. We reach a state with no reducible item, so shift the next input symbol (+) onto the stack.
⊢ E +           ⊢ id +        id ⊣
    ... and so on.

2.1 Making the algorithm efficient

The algorithm discussed above is inefficient, because after each action we pass the entire contents of the stack through the DFA. We can instead keep track of the state using a separate (or the same) stack. The key is to ensure that the top of the state stack always contains the state we would end up in if we ran the symbol stack contents through the DFA.

In the trace below, q(x) denotes the state the LR(0) DFA reaches after reading the string x from its start state, i.e. the state a full re-scan of the symbol stack x would end in.

Symbol Stack    State Stack                              Unread Input
ε               q(ε)                                     ⊢ id + id ⊣
    Check the top of the state stack: q(ε) has no reducible item. Shift ⊢ onto the symbol stack AND push the corresponding state onto the state stack.
⊢               q(ε) q(⊢)                                id + id ⊣
    The top of the state stack has no reducible item. Shift id and push the corresponding state.
⊢ id            q(ε) q(⊢) q(⊢ id)                        + id ⊣
    The top of the state stack has the reducible item T → id .
    Symbol stack: reduce by popping id and pushing T.
    State stack: pop an equal number of states, then push the state corresponding to the new top, q(⊢ T).
⊢ T             q(ε) q(⊢) q(⊢ T)                         + id ⊣
    The top of the state stack has the reducible item E → T .  Symbol stack: pop T, push E.  State stack: pop q(⊢ T), push q(⊢ E).
⊢ E             q(ε) q(⊢) q(⊢ E)                         + id ⊣
    No reducible item: shift + onto the symbol stack and push the corresponding state.
⊢ E +           q(ε) q(⊢) q(⊢ E) q(⊢ E +)                id ⊣
    No reducible item: shift id and push the corresponding state.
⊢ E + id        q(ε) q(⊢) q(⊢ E) q(⊢ E +) q(⊢ E + id)    ⊣
    Reducible item T → id . : pop id and one state, push T and q(⊢ E + T).
⊢ E + T         q(ε) q(⊢) q(⊢ E) q(⊢ E +) q(⊢ E + T)     ⊣
    Reducible item E → E + T . : pop T, + and E and three states, push E and q(⊢ E).
⊢ E             q(ε) q(⊢) q(⊢ E)                         ⊣
    No reducible item: shift ⊣ onto the symbol stack and push the corresponding state.
⊢ E ⊣           q(ε) q(⊢) q(⊢ E) q(⊢ E ⊣)                ε
    The top state contains the reducible item S' → ⊢ E ⊣ . and no input remains: accept.
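The invariant above — the top of the state stack is the state a full re-scan of the symbol stack would reach — is easy to demonstrate in code. The sketch below is illustrative only: it hard-codes just the handful of transitions needed for the first few steps of the trace, using an arbitrary numbering chosen for this sketch (0 = q(ε), 1 = q(⊢), 2 = q(⊢ id), 3 = q(⊢ T)).

#include <cassert>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// (state, symbol) -> state, for a tiny fragment of the simpler grammar's LR(0) DFA.
using Trans = std::map<std::pair<int, std::string>, int>;

// The slow way: rerun the whole symbol stack through the DFA after every action.
int stateByRescan(const std::vector<std::string>& symStack, const Trans& trans) {
    int q = 0;
    for (const std::string& s : symStack) q = trans.at({q, s});
    return q;
}

int main() {
    Trans trans = {{{0, "BOF"}, 1}, {{1, "id"}, 2}, {{1, "T"}, 3}};
    std::vector<std::string> symStack;
    std::vector<int> stateStack = {0};  // the state reached on the (empty) symbol stack

    // SHIFT: push the symbol and, in the same step, push the state the DFA moves to.
    for (const std::string& a : {std::string("BOF"), std::string("id")}) {
        symStack.push_back(a);
        stateStack.push_back(trans.at({stateStack.back(), a}));
        // Invariant: the top of the state stack is exactly what a full rescan would give.
        assert(stateByRescan(symStack, trans) == stateStack.back());
    }

    // REDUCE T -> id: pop one symbol and one state, push T and the state reached on T.
    symStack.pop_back(); stateStack.pop_back();
    symStack.push_back("T");
    stateStack.push_back(trans.at({stateStack.back(), "T"}));
    assert(stateByRescan(symStack, trans) == stateStack.back());

    std::cout << "invariant held after every action\n";
}

The assert is precisely the invariant from the paragraph above; the full algorithm in Section 4 maintains it on every shift and reduce.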

2.2 Summary

The method described above is called LR parsing: Left-to-right scan, Rightmost derivation (in reverse). The DFA we used did not look at the next input symbol; hence the exact method is LR(0). For WLP4 you will be using SLR(1) / LR(1) parsing, i.e. you will need to look at the next input symbol in order to make the shift/reduce decisions.

Bottom Up Parsing Invariant: (symbol stack) · (unread input) = α_i, i.e. the contents of the symbol stack followed by the unread input always form one of the sentential forms α_i of the derivation from Section 1.

2.3 Possible Errors

What if a DFA state looks like this?

    A → α . c β
    B → γ .

Do we try to shift the next character (as suggested by A → α . c β) or do we reduce by B → γ (as suggested by B → γ .)? We cannot decide: this is called a shift-reduce conflict (error).

What if a DFA state looks like this?

    A → α .
    B → β .

Do we reduce by A → α or do we reduce by B → β? We cannot decide: this is a reduce-reduce conflict.

If any reducible item occurs in a state in which it is NOT alone, then there is a conflict and the grammar is not LR(0).

2.4 Example of grammar that is NOT LR(0)

Consider the simpler grammar extended with a rule for multiplication:

    S' → ⊢ E ⊣
    E  → E + T
    E  → T
    T  → T * id
    T  → id

In the LR(0) DFA for this grammar, the state reached after reading ⊢ T now contains two items:

    E → T .          (reducible)
    T → T . * id     (wants to shift *)

a shift-reduce conflict.

Suppose the input starts with ⊢ id ... Then, once we shift id, we will be in the state containing T → id . , and reducing (by T → id) would lead us to the conflicted state above. Should we reduce using E → T, or wait for a * symbol? The answer depends on what is next. If the input is ⊢ id + ... then YES, we should reduce using E → T. However, if the input is ⊢ id * ... then we should shift.
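In code, the rule "a reducible item that is not alone in its state means trouble" is a short check. The sketch below classifies a single LR(0) state, represented as a vector of items; the Item type and the example state are written out by hand for illustration.

#include <iostream>
#include <string>
#include <vector>

// An item: LHS, RHS, and dot position.  Reducible when the dot is at the end.
struct Item { std::string lhs; std::vector<std::string> rhs; size_t dot; };
static bool reducible(const Item& it) { return it.dot == it.rhs.size(); }

// Classify one LR(0) state: a reducible item that is not alone in its state
// means the grammar is not LR(0).
std::string classify(const std::vector<Item>& state) {
    size_t reducibleCount = 0;
    for (const Item& it : state) reducibleCount += reducible(it) ? 1 : 0;
    if (reducibleCount >= 2) return "reduce-reduce conflict";
    if (reducibleCount == 1 && state.size() > 1) return "shift-reduce conflict";
    return "no conflict";
}

int main() {
    // The problematic state from Section 2.4: { E -> T . ,  T -> T . * id }.
    std::vector<Item> state = {
        {"E", {"T"}, 1},
        {"T", {"T", "*", "id"}, 1},
    };
    std::cout << classify(state) << "\n";  // prints "shift-reduce conflict"
}

For LR(0) this classification looks only at the items; SLR(1), described next, weakens it by also consulting follow sets.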

2.5 SLR(1)

We can add a lookahead to the automaton to make this decision. For each reducible item A → α . , attach Follow(A) to the item. Recall:

    Follow(A) = { b | S ⇒* α A b β for some α, β }

i.e. starting from the START SYMBOL, if we can get to a situation where we have b right after A, then b is in the follow set of A. Essentially, what the follow set is checking is that if we do reduce using A → α and we know that the next symbol is in the follow set, then we know there is at least some derivation still possible. For the example:

    Follow(E) = { +, ⊣ }
    Follow(T) = { +, *, ⊣ }

The example had a shift-reduce conflict. We can extend the LR(0) DFA by using follow sets: the item E → T . becomes E → T . { +, ⊣ }. For each reducible item in the LR(0) DFA, add an indication that we will reduce only if the next input symbol is in the FOLLOW set. We read such a state as: if we reach this state and the next input symbol is in the follow set (in this case { +, ⊣ }), only then will we reduce; otherwise we will shift. Notice that the conflict is resolved: we reduce on + and ⊣, and shift on *. This construction is called an SLR(1) parser (Simple LR(1) parser) and can resolve many conflicts that arise in an LR(0) parser.

3 LR(1) parsing

LR(1) parsers are strictly more powerful than LR(0) and SLR(1) parsers. They use the next input symbol itself when building the automaton, rather than the weaker condition that the next input symbol be in the follow set.

    ADVANTAGE:    can parse more grammars
    DISADVANTAGE: a more complex automaton (expensive to store)
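The follow sets above can be computed with a small fixed-point iteration. The sketch below is a simplification that is sufficient for this particular grammar: there are no ε-rules and no nonterminal is immediately followed by another nonterminal, so no First sets are needed. The Rule type and the "BOF"/"EOF" spellings of ⊢/⊣ are illustrative.

#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Rule { std::string lhs; std::vector<std::string> rhs; };

int main() {
    // The non-LR(0) grammar of Section 2.4 (BOF/EOF stand in for ⊢/⊣).
    std::vector<Rule> G = {
        {"S'", {"BOF", "E", "EOF"}},
        {"E",  {"E", "+", "T"}},
        {"E",  {"T"}},
        {"T",  {"T", "*", "id"}},
        {"T",  {"id"}},
    };
    std::set<std::string> nonterminals = {"S'", "E", "T"};

    // Follow(A) = symbols that can appear immediately after A in some sentential form.
    // Simplified fixed point: with no ε-rules and no nonterminal directly followed by
    // a nonterminal, "the symbol right after A" is always a terminal or absent.
    std::map<std::string, std::set<std::string>> follow;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : G) {
            for (size_t i = 0; i < r.rhs.size(); ++i) {
                if (!nonterminals.count(r.rhs[i])) continue;
                std::set<std::string>& f = follow[r.rhs[i]];
                size_t before = f.size();
                if (i + 1 < r.rhs.size())
                    f.insert(r.rhs[i + 1]);                              // symbol right after A
                else
                    f.insert(follow[r.lhs].begin(), follow[r.lhs].end()); // A at the end: add Follow(lhs)
                changed = changed || f.size() != before;
            }
        }
    }
    for (const std::string& a : {std::string("E"), std::string("T")}) {
        std::cout << "Follow(" << a << ") = {";
        for (const std::string& t : follow[a]) std::cout << " " << t;
        std::cout << " }\n";
    }
}

Running it should print Follow(E) = { + EOF } and Follow(T) = { * + EOF }, matching the sets above.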

    [Figure: the hierarchy of grammar classes LR(0), SLR(1), LALR(1), LR(1), LR(k) and LL(0), LL(1), LL(k); each class in these lists contains the previous one, and every LL(k) grammar is LR(k).]

4 Algorithm

The SLR(1), LALR(1) and LR(1) parsing algorithms are the same; the only difference is the DFA used by the algorithm.

LR(1) algorithm (concrete, linear time), input: DFA (Σ, Q, q0, Trans, Reduce):

    symstack.push ⊢
    statestack.push Trans[q0, ⊢]
    for each symbol a in input, from left to right:
        while Reduce[statestack.top, a] is some production A → γ do
            symstack.pop the symbols in γ (right end first)
            statestack.pop |γ| states
            symstack.push A
            statestack.push Trans[statestack.top, A]
        end while
        symstack.push a
        reject if Trans[statestack.top, a] is undefined (ERROR)
        statestack.push Trans[statestack.top, a]
    end for
    accept (symstack is necessarily ⊢ S ⊣)

A key thing to notice is that the reduce decision looks at the top of the state stack: if that state contains reducible items, the choice to reduce is made only if the reduction is indicated for the next input symbol a. Notice also how the symbol stack and state stack are always kept in sync: any update to the symbol stack leads to a corresponding update to the state stack.
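Here is a C++ rendering of that pseudocode. It is a sketch, not the WLP4 machine and not the course-provided code: the Trans and Reduce tables are hand-built for the simpler grammar of Section 2 (with the SLR(1) reduce entries restricted to the follow sets), the state numbering is arbitrary, and "BOF"/"EOF" stand in for ⊢/⊣.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// A production: LHS and the length of its RHS (that is all the parser needs).
struct Rule { std::string lhs; int rhsLen; };

int main() {
    // Grammar (the "simpler" grammar; BOF/EOF stand in for ⊢/⊣):
    //   0: S' -> BOF E EOF    1: E -> E + T    2: E -> T    3: T -> id
    std::vector<Rule> rules = {{"S'", 3}, {"E", 3}, {"E", 1}, {"T", 1}};

    // Hand-built SLR(1) tables with an illustrative state numbering.
    std::map<std::pair<int, std::string>, int> Trans = {
        {{0, "BOF"}, 1}, {{1, "id"}, 4}, {{1, "T"}, 3}, {{1, "E"}, 2},
        {{2, "+"}, 6},   {{2, "EOF"}, 5}, {{6, "id"}, 4}, {{6, "T"}, 7},
    };
    // Reduce[state, lookahead] = rule to reduce by; entries exist only when the
    // lookahead is in the follow set of the rule's LHS (the SLR(1) refinement).
    std::map<std::pair<int, std::string>, int> Reduce = {
        {{4, "+"}, 3}, {{4, "EOF"}, 3},   // T -> id .
        {{3, "+"}, 2}, {{3, "EOF"}, 2},   // E -> T .
        {{7, "+"}, 1}, {{7, "EOF"}, 1},   // E -> E + T .
    };

    // Input to parse: id + id, followed by the EOF marker.
    std::vector<std::string> input = {"id", "+", "id", "EOF"};

    std::vector<std::string> symStack = {"BOF"};
    std::vector<int> stateStack = {Trans.at({0, "BOF"})};

    for (const std::string& a : input) {
        // Reduce as long as the current state says to reduce on lookahead a.
        while (Reduce.count({stateStack.back(), a})) {
            const Rule& r = rules[Reduce.at({stateStack.back(), a})];
            for (int i = 0; i < r.rhsLen; ++i) { symStack.pop_back(); stateStack.pop_back(); }
            symStack.push_back(r.lhs);
            stateStack.push_back(Trans.at({stateStack.back(), r.lhs}));
        }
        // Then shift a; an undefined transition means a parse error.
        if (!Trans.count({stateStack.back(), a})) { std::cout << "ERROR at " << a << "\n"; return 1; }
        symStack.push_back(a);
        stateStack.push_back(Trans.at({stateStack.back(), a}));
    }
    std::cout << "accept; symbol stack =";
    for (const std::string& s : symStack) std::cout << " " << s;   // BOF E EOF
    std::cout << "\n";
}

With these tables the program parses id + id and prints accept; symbol stack = BOF E EOF, mirroring the final row of the two-stack trace in Section 2.1.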

5 Hints for the last question of the assignment

We have seen that the LR(1) parsing algorithm gives us a reversed rightmost derivation. In P4 you will implement this algorithm, perhaps using the pseudocode given above. In P5, you can use the same implementation, this time using the LR(1) machine for WLP4, also provided (aren't we nice!). This will give you a reversed rightmost derivation for the input WLP4 program. The challenge is that the required output (.wlp4i format) needs a leftmost derivation, whereas you have a rightmost derivation. Note that .wlp4i is more than just a print of the leftmost derivation, since it also requires you to print terminals (and their lexemes).

But doing this is fairly straightforward if you think about it for a second. Using the reversed rightmost derivation you can create a parse tree. Recall the property that a derivation uniquely specifies a parse tree. Therefore, since you have a rightmost derivation for the input, you should be able to create a parse tree for the input. Now recall the property that a parse tree has a unique leftmost derivation. Therefore, once you have created the parse tree (using the rightmost derivation), you can traverse this tree to generate a leftmost derivation.

Look at the sample code given in CFGrl (Racket/C/C++ versions are available). This code has functions to create a tree from a rightmost derivation and to traverse a tree to get a leftmost derivation. What the code does NOT have is a way to interleave the leftmost derivation output with information about the terminals you are encountering (these are the leaf nodes in the tree). You will have to implement this functionality on your own.
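The sketch below is not the CFGrl code; it is a from-scratch illustration of the same idea. It builds the parse tree directly from the parser's actions — a leaf for every shift, an internal node for every reduce, i.e. from the reductions that make up the reversed rightmost derivation — and then prints a leftmost derivation, with the terminal leaves interleaved, by a preorder traversal. The Node type and the shift/reduce helpers are invented for this sketch, and the replayed actions are those of the abywx example from Section 2.

#include <iostream>
#include <memory>
#include <string>
#include <vector>

// A parse-tree node: a grammar symbol plus its children (empty for terminal leaves).
struct Node {
    std::string symbol;
    std::vector<std::unique_ptr<Node>> children;
};

// Preorder traversal: printing each internal node before its children yields the
// leftmost derivation; terminal leaves are printed in between, which is the kind
// of interleaving the .wlp4i output needs.
void printLeftmost(const Node& n) {
    if (n.children.empty()) { std::cout << "terminal: " << n.symbol << "\n"; return; }
    std::cout << n.symbol << " ->";
    for (const auto& c : n.children) std::cout << " " << c->symbol;
    std::cout << "\n";
    for (const auto& c : n.children) printLeftmost(*c);
}

int main() {
    // Forest of subtrees, maintained exactly like the parser's symbol stack:
    // SHIFT pushes a leaf, REDUCE pops the RHS subtrees and pushes a new node.
    std::vector<std::unique_ptr<Node>> forest;
    auto shift = [&](const std::string& t) {
        auto leaf = std::make_unique<Node>();
        leaf->symbol = t;
        forest.push_back(std::move(leaf));
    };
    auto reduce = [&](const std::string& lhs, size_t rhsLen) {
        auto parent = std::make_unique<Node>();
        parent->symbol = lhs;
        for (size_t i = forest.size() - rhsLen; i < forest.size(); ++i)
            parent->children.push_back(std::move(forest[i]));
        forest.resize(forest.size() - rhsLen);
        forest.push_back(std::move(parent));
    };

    // Replaying the reductions from Section 2's trace for input BOF a b y w x EOF.
    shift("BOF"); shift("a"); shift("b"); reduce("A", 2);
    shift("y"); shift("w"); shift("x"); reduce("B", 2);
    reduce("S", 3); shift("EOF"); reduce("S'", 3);

    printLeftmost(*forest.back());   // the single remaining tree is the parse tree
}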