Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Similar documents
Formal Languages and Compilers Lecture VII Part 3: Syntactic A

S Y N T A X A N A L Y S I S LR

Principles of Programming Languages

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

LR Parsing Techniques

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Downloaded from Page 1. LR Parsing

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

Lecture Bottom-Up Parsing

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

MODULE 14 SLR PARSER LR(0) ITEMS

Wednesday, August 31, Parsers

UNIT-III BOTTOM-UP PARSING

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LALR Parsing. What Yacc and most compilers employ.

shift-reduce parsing

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

Monday, September 13, Parsers

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

CS308 Compiler Principles Syntax Analyzer Li Jiang

LR Parsers. Aditi Raste, CCOEW

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Lecture 7: Deterministic Bottom-Up Parsing


Lecture 8: Deterministic Bottom-Up Parsing

Top down vs. bottom up parsing

Compiler Construction 2016/2017 Syntax Analysis

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Context-free grammars

Lexical and Syntax Analysis. Bottom-Up Parsing

LR Parsing Techniques

Compiler Construction: Parsing

UNIT III & IV. Bottom up parsing

SLR parsers. LR(0) items

Bottom-Up Parsing II. Lecture 8

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Bottom-Up Parsing. Parser Generation. LR Parsing. Constructing LR Parser

Syn S t yn a t x a Ana x lysi y s si 1

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)


Bottom-Up Parsing. Lecture 11-12

Compilers. Predictive Parsing. Alex Aiken

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Bottom-Up Parsing LR Parsing

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Concepts Introduced in Chapter 4

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing LALR Parser Generators

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

3. Parsing. Oscar Nierstrasz

Conflicts in LR Parsing and More LR Parsing Types

LR Parsing - The Items

Table-Driven Parsing

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

Compilers. Bottom-up Parsing. (original slides by Sam

LR Parsing LALR Parser Generators

Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal

ECE 468/573 Midterm 1 October 1, 2014

Bottom-Up Parsing. Lecture 11-12

Syntax Analyzer --- Parser

Automatic generation of LL(1) parsers

CA Compiler Construction

Syntax Analysis Part I

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

CS606- compiler instruction Solved MCQS From Midterm Papers

VIVA QUESTIONS WITH ANSWERS

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

CS 4120 Introduction to Compilers

BSCS Fall Mid Term Examination December 2012

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

CSC 4181 Compiler Construction. Parsing. Outline. Introduction

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Transcription:

TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers : Principles, Techniques and Tools Aho + Ullman, Theory of Parsing and Compiling, vol II. 1 TD Parsing Elimination of left recursion. E " E # $ becomes E " $ A A " # A % Example: S " E S " E E " E + T E " T A E " T A " + T A % T " id T " id Can also left factor the grammar removing shared prefixes of right-hand-sides. 2

Parse Tree E + T T S E + T i d i d T id S E A + T A id Parse tree converted from left recursive to right recursive. id + T A id % 3 TD Parsing Problem: predicting which nonterminal to expand next, from a leading string of symbols Idea: generate parse tree top down so its frontier is always a sentential form Use First and Follow sets to understand the shape of sentential forms possibly generated by the grammar 4

TD Stack Parser, EG Stack Input Production S " E $S id+id+id$ $E id+id+id$ S " E $A T id+id+id$ E " T A $A +id+id$ T " id $A T id+id$ A " + T A $A +id$ T " id $A T id$ A " + T A $A $ T " id $ $ A " % E " T A A " + T A % T " id See algm in ASU Fig 4.14, p 187 5 How to mechanize? Define # to be string of nonterminals and terminals First(#) is the set of terminals that begin strings derivable from #. If # * %, then % is in First(#). Follow(A) is the set of terminals that can appear directly to the right of A in a sentential form S * # A a $ means a is in Follow(A). If A can be rightmost symbol in a sentential form, that is, X * # & ' where ' * %, then Follow(A)( Follow(X). 6

Example First(S) = First(E) = First(T) = {id} First(A) = { +, % } Follow(S) = Follow(E) = Follow(A) = {$} Follow(T) = {+, $} S " E E " T A A " + T A % T " id 7 LL(k) Grammars Can choose next production to expand by during TD phase, by looking k symbols ahead into input Use First sets to choose production Use Follow sets to handle % cases 8

Example: LL(1) Nonterms\Inputs: id + $ S S " E E E " T A T T " id A A " + T A A " % Ambiguous or left recursive grammars result in multiply defined entries in table First(S) = First(E) = First(T) = {id} First(A) = { +, % } Follow(S) = Follow(E) = Follow(A) = {$} Follow(T) = {+, $} S " E E " T A A " + T A % T " id 9 A View During TD Parsing S derives # ), a string of terminals; X is nonterminal at top of stack, X derives ); Initially X==S, # == %, ) is input S X partially constructed parse tree # ) 10

A View During BU Parsing * S rm # A w " # $ w, rm so $ is the handle. S # partially constructed parse tree A $ w, a string of terminals 11 Intuitive Comparison S w x z LR(k) can recognize A " # knowing w, x, and First k (z). LL(k) can recognize A " # knowing only w and First k (x). Therefore, the set of languages recognizable by LR(k) contain those recognizable by LL(k). A # 12

BU Parsing (Shift-Reduce) Handle - part of sentential form last added in a rightmost derivation. BU parsing as handle hunting (1) S " * (2) * " * + + (3) * " T (4) T " id Rightmost derivation of a+b+c, handles in red S " E " * + + " * + id " * + + + id " * + id + id " T + id + id " id + id + id 13 Shift-Reduce Parser, Example Actions: shift, reduce, accept, error Stack Input Action $ id1 + id2 + id3 $ shift $ id1 + id2 + id3 $ reduce (4) $ T + id2 + id3 $ reduce (3) $ E + id2 + id3 $ shift $ E + id2 + id3 $ shift $ E + id2 + id3 $ reduce(4) $ E + T + id3 $ reduce (2) $ E + id3 $ shift $ E + id3 $ shift $ E + id3 $ reduce (4) $ E + T $ reduce(2) $ E $ reduce (1) $ S $ accept (1) S " * (2) * " * + + (3) * " T (4) T " id 14

Problems Shift-reduce conflicts S " if E then S if E then S else S other On stack: if E then S Input: else Should shift trying for 2nd alternative or reduce by first rule? Reduce-reduce conflicts if A " # and B " # both in grammar When # on stack, how know which production to choose? 15 Predictive Parsing Top Down: LL(k), Bottom Up: LR(k) Avoids backtracking while parsing by using lookahead into input NO cases where more than 1 action possible 16

LR(k) Left to right scan parsing does a rightmost derivation in reverse, using k symbols of lookahead into input Three flavors Simple LR, SLR(1) Cheap Doesn t always work LR Most powerful Most expensive 17 LR(k) LALR Intermediate in cost and power All SLR(1) languages are also LR(1), but parsers generated by corresponding grammars for the same language will differ in size. LR(k) catches syntax errors as early as possible in a left-to-right scan of the input Covers most programming languages 18

LR Parsing DFA is embedded in parser which is a PDA (top stack, input_symbol) accesses a particular entry in the parser table Shift to state s Reduce by A " $ Accept Error Goto: (state, top stack ) " state 19 LR Parser symbol state input state\input Action/ goto table stack 20

LR Parsing Viable prefix - set of prefixes of right sentential forms which can appear on a stack of a shift/reduce parser Prefix of right sentential form that doesn t contain symbols beyond the handle Goto function is transition function of DFA that recognizes viable prefixes of the grammar Idea: continue to stack inputs until have handle on top of stack and then reduce 21 Building an SLR Parser Need states, goto s, Follow sets Item - rule with embedded dot S ". * Closure of item I I, {B ".), if A " #. B $ in I} States built from items and their closures 22

Example - States S " E I 0 : S ". E E " E + T E ". E + T E " + E ". + + " id + ". id Closure of S ". E I 1 : S " *. - 2 : E " +. * " *. + T I 3 : + " id. I 4 : * " * +. T T ". id I 5 : * " * + T. 23 Example - Goto s + Follow sets goto (0, E) = 1 goto (0, id) = 3 goto (0, T) = 2 goto (1, +) = 4 goto (4, T) = 5 goto (4, id) = 3 goto ({set of items}, X) = closure {[A " # X. $] [A " #. X $] in {set of items}} where X is a terminal or nonterminal Follow(S) = {$} Follow(E) = Follow(T) = { +, $} 24

Example - Parser Table si, shift to state I; r(j) reduce by rule j States\ inputs: id + $ E T 0 s3 1 2 1 s4 accept 2 r(3) r(3) 3 r(4) r(4) 4 s3 5 5 r(2) r(2) goto s 25 Example Stack input action 0 id1 + id2 $ s3 0 id1 3 + id2 $ r(4), goto on T 0 T 2 + id2 $ r(3), goto on E 0 E 1 + id2 $ s4 0 E 1 + 4 id2 $ s3 0 E 1 + 4 id2 3 $ r(4), goto on T 0 E 1 + 4 T 5 $ r(2), goto on E 0 E 1 $ accept 26

SLR(1) Parser Rules If A " #. a $ is in state I j and goto(i j, a) is I r then (I j,, a) transitions by shift r (sr) If A " #. is in state I j, set action [j,a] to reduce A " # for all a in Follow(A) Note: A!= S If S " *. in I j, action (j,$) is accept Any table entry not defined is error. 27 Problems Shift-reduce conflicts happen when Ab can occur in some sentential form and b. Follow(A). S " L = R I 0 : S ". L = R S " R S ". R L " * R R ". L L " id L ". * R R " L L ".!d I 1 : S " L. = R (1) R " L. (2) In I 1 shift when see = in input(item 1); reduce on = because = in Follow(R) (item 2). Note: S " L = R " * R = R, but this is not a rightmost derivation! 28

Problems, cont. Can see that rightmost derivation is: S " L = R " L = L "L = id " * R = id " *L = id " * id = id Therefore, should reduce *R to L when see =, not shift in order to get *R onto the stack. Problem is that we can t distinguish those Follow elements corresponding to a rightmost derivation in a specific context. 29 Nomenclature in ASU An item [ A " $. ) ] is valid for viable prefix # $ if S * # A w # $ ) w. rm Means can continue towards accumulating an handle on the stack by shifting Previously, shift would have changed viable prefix *R to nonviable prefix *R= If I is set of items valid for viable prefix $ then goto(i, X) is set of items valid for viable prefix $X where X is terminal or nonterminal rm 30

LR Parsing LR items include a lookahead symbol, (into the input) which helps in conflict resolution Need new closure rule: For [ A " #. B ), a ] item add [B ". ', b] for every b in First() a). 31 Example I 0 : S ". E, $ - initial item E ". * + +, $ - closure initial item E ". T, $ E ". E + T, + - closure 1st red item E ". +, + + ".id, $ - closure 2nd red item T ".id, + - closure 2nd blue item Will write these in more compact form by combining lookaheads. For [ A " #. B ), a ] item add [B ". ', b] for every b in First() a). 32

Example, LR(1) Parser I 0 :S ".E, $ I 1 : [goto (I 0, E)] E ".E + T, $/+ S " E., $ E ". T, $/+ S " E. + T, $/+ T ".id, +/$ I 2 :[goto (I 0, T)] I 4 :[goto(i 1, +)] E " T., $/+ E " * +. +, $/+ I 3 : [goto (I 0, id)] T ". id, $/+ T " id., $/+ I 5 : [goto (I 4, T)] E " * + +., $/+ 33 LR(1) Parser Reduce based on lookaheads in item which are a subset of Follow set Rules similar to SLR Shift in I k, [A " #. a $, b], goto (I k, a) = I j Reduce [A " #., b] reduce # to A on b Accept [S " E., $], accept on $ 34

LALR Parsing Idea: merge all states with common first components in their LR(1) items Implementation problem: need to reduce number of states to get smaller parser table Reduced size parser will perform Same as LR on correct inputs (will be parsed by LALR) On incorrect inputs, LR may find error faster; LALR will never do an incorrect shift but may do some wrong reductions 35 LALR Parsing Conceptually, build LALR(1) parser from LR(1) parser Merge all states with same first components Union all goto s of these merged states (goto s are independent of second components) Correctness of conceptual derivation Can never produce a shift-reduce conflict or else [A " #., a] and [B "$. a ), b] existed in some LR(1) state 36

LALR, cont. But can create reduce-reduce conflicts State 1 State 2 [A " c., d] [A " c., e] [B " c., e] [B " c., d] After merge: [A " c., d/e] [B " c., e/d] 37 Handles (Informal) Show the handle always remains on the top of the stack in shift-reduce parsing Assume ) B w is a right sentential form with ) B on the stack. 1. Assume handle includes B. * handle S # 1 A # 2 # 1 $ 1 / $ 2 # 2 rm rm ) w where $ 1 or $ 2 can be %. 38

Handles (Informal) 2. Assume handle included in w. S * # 1 / # 2 A # 3 # 1 B # 2 $ # 3 3. Assume handle included in ). * rm S # 1 & # 2 # 1 $ B w rm rm But if $ is handle, then A isn t rightmost nonterminal, B is. This is a contradiction! rm handle ) ) handle w 39 Handles, More Formally A " $ 1. $ 2 valid for viable prefix # $ * 1 means 0 S # A w # $ 1 $ 2 w rm If $ 2 = % then should reduce by A " $ 1 If $ 2!= % then should shift Note can have two valid items indicating different actions for same viable prefix A " $ 1. $ 2 and A " $ 1. Lookahead chooses which action is taken rm 40

Handles, More Formally Previous argument shows if A " $. is valid item for viable prefix # $ then # A is viable prefix (i.e., we needn t rescan the parse after a reduction) 41 Ambiguous Grammars Used to build compact parse trees Get rid of useless nonterminal to nonterminal productions (e.g., S-->E-->T) Conflicts resolvable through desired properties of operators (e.g., precedence) Generate smaller parsers Example of expression grammar 42

Example S " E I 0 : S ". * - 1 : goto (I 0,E) E " * + * * ". * + * S " E. E " id E ". id E " E. + E I 2 : goto (- 1,+) I 3 : goto (- 2,E) I 4 :goto(- 0,id) E " E +. E E " E + E. E " id. E ". * + * * " E. + E E ". id (reduce on + in Follow(E) shift on +) Choose reduce action making + left associative; can resolve operator precedences the same way (e.g., + versus *) 43 Grammar Classification CFL {0 n 1 n n >= 1} union {0 n 1 2n n >= 1} LR(k) ~ LR(1) SLR(k) LL(k) LALR(k) S " L = R R L " *R a R " L 44