Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.


MODULE 18 LALR parsing

After understanding the most powerful LR parser, the canonical LR (CALR) parser, in this module we will learn to construct the LALR parser. The CALR parser has a large number of item sets. The LALR parser is designed to have fewer item sets while still avoiding many of the conflicts that are a problem for the SLR parser. This module discusses the construction of the LR(1) items necessary for LALR parsing and of the LALR parsing table, followed by parsing a string using the LALR parser.

18.1 Need for LALR parser

Though the CALR parser is powerful enough to avoid the conflicts of the SLR parser, it suffers from a large set of LR(1) items. This increases the number of entries in the CALR parsing table and thus the time and space needed to construct and store it. The LALR parsing table reduces the number of item sets by combining those that have the same core but different look-aheads. The LALR parser is therefore less powerful than the CALR parser. Combining item sets cannot introduce a new shift/reduce conflict, because shift actions depend only on the core and not on the look-ahead. Since item sets with different look-aheads are merged into one, the LALR parser may introduce reduce/reduce conflicts, but this is rarely a problem for grammars of programming languages.

18.2 LR(1) items

The LR(1) items for the LALR parser are computed by first constructing the LR(1) item sets as in the case of the CALR parser and then combining the item sets that have the same core but differing look-aheads into one item set. The algorithm for constructing the CALR parser's LR(1) items is discussed in module 17. Only the combining step is discussed here, by means of an example.

Example 18.1 Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

S -> CC
C -> cC
C -> d

The augmented grammar is given below, and the CALR parser's LR(1) items are repeated in Table 18.1 for quick reference.

S' -> S
S -> CC
C -> cC
C -> d
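The closure and goto computations that produce the canonical LR(1) item sets for this grammar can be sketched in Python as follows. This is a minimal illustration and not the algorithm text of module 17: the names `GRAMMAR`, `first_of_symbol`, `closure`, `goto` and `canonical_collection` are chosen here for the example, an item is represented as a tuple (production index, dot position, look-ahead), and the FIRST computation assumes a grammar without epsilon productions, which holds for Example 18.1.

```python
# Augmented grammar of Example 18.1: production index -> (head, body).
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("C", "C")),
    ("C", ("c", "C")),
    ("C", ("d",)),
]
NONTERMINALS = {"S'", "S", "C"}


def first_of_symbol(symbol):
    """FIRST of a single symbol (assumes no epsilon productions)."""
    if symbol not in NONTERMINALS:
        return {symbol}
    result = set()
    for head, body in GRAMMAR:
        if head == symbol and body[0] != symbol:
            result |= first_of_symbol(body[0])
    return result


def closure(items):
    """Close a set of LR(1) items (production index, dot, look-ahead)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot == len(body) or body[dot] not in NONTERMINALS:
            continue
        # The item is [A -> alpha . B beta, a]; new look-aheads are FIRST(beta a).
        beta = body[dot + 1:]
        lookaheads = first_of_symbol(beta[0]) if beta else {look}
        for index, (head, _) in enumerate(GRAMMAR):
            if head != body[dot]:
                continue
            for b in lookaheads:
                new_item = (index, 0, b)
                if new_item not in items:
                    items.add(new_item)
                    work.append(new_item)
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over `symbol` and close the resulting kernel items."""
    moved = {
        (p, d + 1, a)
        for (p, d, a) in items
        if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol
    }
    return closure(moved)


def canonical_collection():
    """Return the canonical LR(1) item sets, starting from [S' -> .S, $]."""
    states = [closure({(0, 0, "$")})]
    for state in states:  # the list grows while it is being scanned
        for symbol in ("S", "C", "c", "d"):
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states


print(len(canonical_collection()))  # 10 item sets, I0 to I9
```

With the symbol order used above, the sets are discovered in the order I0 to I9, so `canonical_collection()[3]` corresponds to I3 and `canonical_collection()[6]` to I6.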

Table 18.1 LR(1) items of CALR parsing

Item set | Set of items | Goto(I, X) | Comments
I0 | S' -> .S, $; S -> .CC, $; C -> .cC, c/d; C -> .d, c/d | (initial) | This is the initial item set. In [S' -> .S, $] the non-terminal S follows the dot, so we add the productions of S with look-ahead FIRST($), since β is ε. In [S -> .CC, $] the non-terminal C follows the dot, with β = C and a = $, so we add the productions of C with look-ahead FIRST(C$) = FIRST(C) = {c, d}. Each production of C thus yields two items, one with c and the other with d as look-ahead, written here in the combined form c/d.
I1 | S' -> S., $ | (I0, S) | Shifting the dot results in a kernel item; the look-ahead remains the same.
I2 | S -> C.C, $; C -> .cC, $; C -> .d, $ | (I0, C) | The dot is shifted one position to the right. Now C follows the dot and β is ε, so we add the items of C with FIRST($) as look-ahead.
I3 | C -> c.C, c/d; C -> .cC, c/d; C -> .d, c/d | (I0, c), (I3, c) | Shifting the dot by one position and keeping the look-ahead as it is gives the first item. Now C follows the dot and β is ε, so we add the items of C with the look-aheads c/d.
I4 | C -> d., c/d | (I0, d), (I3, d) | Kernel item with the look-ahead unchanged.
I5 | S -> CC., $ | (I2, C) | Kernel item.
I6 | C -> c.C, $; C -> .cC, $; C -> .d, $ | (I2, c), (I6, c) | The dot is shifted one position to the right. Now C follows the dot and β is ε, so we add the items of C with FIRST($) as look-ahead.
I7 | C -> d., $ | (I2, d), (I6, d) | Kernel item.
I8 | C -> cC., c/d | (I3, C) | Kernel item.
I9 | C -> cC., $ | (I6, C) | Kernel item; no more new items need to be added.

From Table 18.1, consider the item sets I3 and I6. Both have the same core but differ in their look-aheads, so we combine them and call the result I36, as given below.

I36: goto(I0, c), goto(I2, c), goto(I36, c)
C -> c.C, c/d/$
C -> .cC, c/d/$
C -> .d, c/d/$

Similarly, the item sets I4 and I7 are combined as I47, and I8 and I9 as I89. The combined item sets are given below:

I47: goto(I0, d), goto(I2, d), goto(I36, d)
C -> d., c/d/$

I89: goto(I36, C)
C -> cC., c/d/$

Thus we have eliminated 3 of the CALR parser's 10 item sets and are left with the 7 item sets I0, I1, I2, I36, I47, I5 and I89.

18.2 LALR parsing table

After combining the necessary item sets, we use this reduced collection to construct the LALR parsing table. The table construction is the same as that discussed for the CALR parser in the previous module, except that we work with the LALR parser's LR(1) items. The LALR parsing table for the grammar of Example 18.1 is given in Table 18.2. The productions are numbered 1: S -> CC, 2: C -> cC and 3: C -> d.

Table 18.2 LALR parsing table

State | Action c | Action d | Action $ | Goto S | Goto C | Comments
0 | s36 | s47 | | 1 | 2 | Goto(I0, c) = I36 => [0, c] = s36; Goto(I0, d) = I47 => [0, d] = s47; Goto(I0, S) = I1 => [0, S] = 1; Goto(I0, C) = I2 => [0, C] = 2
1 | | | accept | | | I1 has [S' -> S., $], so at [1, $] we have the accept action
2 | s36 | s47 | | | 5 | Goto(I2, c) = I36 => [2, c] = s36; Goto(I2, d) = I47 => [2, d] = s47; Goto(I2, C) = I5 => [2, C] = 5
36 | s36 | s47 | | | 89 | Goto(I36, c) = I36 => [36, c] = s36; Goto(I36, d) = I47 => [36, d] = s47; Goto(I36, C) = I89 => [36, C] = 89
47 | r3 | r3 | r3 | | | [C -> d., c/d/$], so at [47, c], [47, d] and [47, $] we set reduce by C -> d
5 | | | r1 | | | [S -> CC., $], so at [5, $] we set reduce by S -> CC
89 | r2 | r2 | r2 | | | [C -> cC., c/d/$], so at [89, c], [89, d] and [89, $] we set reduce by C -> cC
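The merging step that leads from the ten item sets of Table 18.1 to the seven states of Table 18.2 can be sketched as below. This is only an illustration under stated assumptions: the item sets are transcribed by hand from Table 18.1 as (dotted production, look-aheads) pairs, and the names `CANONICAL` and `merge_by_core` are hypothetical, chosen for this example.

```python
from collections import defaultdict

# Item sets of Table 18.1, transcribed by hand: state -> [(dotted production, look-aheads)].
CANONICAL = {
    0: [("S' -> .S", "$"), ("S -> .CC", "$"), ("C -> .cC", "cd"), ("C -> .d", "cd")],
    1: [("S' -> S.", "$")],
    2: [("S -> C.C", "$"), ("C -> .cC", "$"), ("C -> .d", "$")],
    3: [("C -> c.C", "cd"), ("C -> .cC", "cd"), ("C -> .d", "cd")],
    4: [("C -> d.", "cd")],
    5: [("S -> CC.", "$")],
    6: [("C -> c.C", "$"), ("C -> .cC", "$"), ("C -> .d", "$")],
    7: [("C -> d.", "$")],
    8: [("C -> cC.", "cd")],
    9: [("C -> cC.", "$")],
}


def merge_by_core(item_sets):
    """Group item sets with the same core and take the union of their look-aheads."""
    groups = defaultdict(list)
    for number, items in item_sets.items():
        core = frozenset(dotted for dotted, _ in items)  # look-aheads are ignored
        groups[core].append(number)

    merged = {}
    for numbers in groups.values():
        # Name the merged state by concatenating the original numbers, e.g. 3 and 6 -> "36".
        name = "".join(str(n) for n in sorted(numbers))
        lookaheads = defaultdict(set)
        for n in numbers:
            for dotted, look in item_sets[n]:
                lookaheads[dotted] |= set(look)
        merged[name] = dict(lookaheads)
    return merged


lalr_states = merge_by_core(CANONICAL)
print(sorted(lalr_states))  # ['0', '1', '2', '36', '47', '5', '89']
```

Item sets whose core is unique, such as I0 or I5, pass through unchanged, so only the look-ahead sets of the merged states 36, 47 and 89 grow to c/d/$.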

18.3 LALR Parsing

The LALR parsing algorithm is the same as the CALR parsing algorithm, except that it refers to the LALR parsing table. For a grammar that is LR(1), the LALR table will not contain a new shift/reduce conflict, but for some grammars it will contain a reduce/reduce conflict. Parser generators such as Yacc resolve a reduce/reduce conflict in favor of the production that appears first in the grammar.

Example 18.2 Consider the grammar of Example 18.1 and see the parsing action of the LALR parser for the input ccdd. As with other parsers, the input string is appended with $. The parsing action is shown in Table 18.3.

Table 18.3 Parsing action of the LALR parser

Stack | Input | Action
0 | ccdd$ | [0, c] = shift 36
0 c 36 | cdd$ | [36, c] = shift 36
0 c 36 c 36 | dd$ | [36, d] = shift 47
0 c 36 c 36 d 47 | d$ | [47, d] = reduce 3; pop 2 symbols from the stack, push C, goto(36, C) = 89
0 c 36 c 36 C 89 | d$ | [89, d] = reduce 2; pop 4 symbols from the stack, push C, goto(36, C) = 89
0 c 36 C 89 | d$ | [89, d] = reduce 2; pop 4 symbols from the stack, push C, goto(0, C) = 2
0 C 2 | d$ | [2, d] = shift 47
0 C 2 d 47 | $ | [47, $] = reduce 3; pop 2 symbols from the stack, push C, goto(2, C) = 5
0 C 2 C 5 | $ | [5, $] = reduce 1; pop 4 symbols from the stack, push S, goto(0, S) = 1
0 S 1 | $ | [1, $] = accept; parsing is successful

On a correct input such as this one, the LALR parser makes the same sequence of shift and reduce moves as the CALR parser. The saving is in the size of the table, which has 7 states instead of 10.

Example 18.4 For the pointer variable declaration grammar, with productions numbered 1: S -> L=R, 2: S -> R, 3: L -> *R, 4: L -> id and 5: R -> L, the modified set of LR(1) items and the parsing table are given in Tables 18.4 and 18.5 respectively.
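The moves of Table 18.3 can be reproduced with a small table-driven driver. The sketch below is illustrative only: the dictionaries `ACTION` and `GOTO` are transcribed by hand from Table 18.2, and the names `PRODUCTIONS` and `parse` are assumptions made for this example. Unlike Table 18.3, the stack here holds states only, so a reduction pops as many entries as the production body has symbols rather than twice that number.

```python
# Production number -> (head, length of body), as numbered in Table 18.2.
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}

# ACTION and GOTO entries transcribed from Table 18.2.
ACTION = {
    (0, "c"): ("shift", 36), (0, "d"): ("shift", 47),
    (1, "$"): ("accept", None),
    (2, "c"): ("shift", 36), (2, "d"): ("shift", 47),
    (36, "c"): ("shift", 36), (36, "d"): ("shift", 47),
    (47, "c"): ("reduce", 3), (47, "d"): ("reduce", 3), (47, "$"): ("reduce", 3),
    (5, "$"): ("reduce", 1),
    (89, "c"): ("reduce", 2), (89, "d"): ("reduce", 2), (89, "$"): ("reduce", 2),
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (36, "C"): 89}


def parse(tokens):
    """Run the LR driver and return the list of moves, e.g. ['s36', 'r3', 'acc']."""
    tokens = list(tokens) + ["$"]
    stack = [0]  # stack of states only
    position = 0
    moves = []
    while True:
        state, symbol = stack[-1], tokens[position]
        entry = ACTION.get((state, symbol))
        if entry is None:  # blank table entry: syntax error
            raise SyntaxError(f"no action for state {state} on {symbol!r}")
        kind, argument = entry
        if kind == "shift":
            stack.append(argument)
            position += 1
            moves.append(f"s{argument}")
        elif kind == "reduce":
            head, length = PRODUCTIONS[argument]
            del stack[-length:]  # pop one state per body symbol
            stack.append(GOTO[(stack[-1], head)])
            moves.append(f"r{argument}")
        else:  # accept
            moves.append("acc")
            return moves


print(parse("ccdd"))
# ['s36', 's36', 's47', 'r3', 'r2', 'r2', 's47', 'r3', 'r1', 'acc']
```

Calling `parse("cd")` raises `SyntaxError` only after two reductions have been made, because the merged states 47 and 89 carry $ as a look-ahead.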

Table 18.4 LR(1) items for the pointer variable declaration grammar

Item set | Set of items | Goto(I, X) | Comments
I0 | S' -> .S, $; S -> .L=R, $; S -> .R, $; L -> .*R, =/$; L -> .id, =/$; R -> .L, $ | (initial) | Initial item set. All the items for S and R are added with $ as look-ahead. For L we have two look-aheads: = from S -> .L=R and $ from R -> .L.
I1 | S' -> S., $ | (I0, S) | Kernel item that results in the accept action.
I2 | S -> L.=R, $; R -> L., $ | (I0, L) | In the first item a terminal follows the dot, so no additional items need to be added. The second is a kernel item.
I3 | S -> R., $ | (I0, R) | Kernel item.
I4 | L -> *.R, =/$; R -> .L, =/$; L -> .*R, =/$; L -> .id, =/$ | (I0, *), (I4, *) | The items of R are added with the same look-ahead, which in turn adds the items of L.
I5 | L -> id., =/$ | (I0, id), (I4, id) | Kernel item.
I6 | S -> L=.R, $; R -> .L, $; L -> .*R, $; L -> .id, $ | (I2, =) | The items of R are added with the same look-ahead, and in turn the items of L are added.
I7 | L -> *R., =/$ | (I4, R) | Kernel item.
I8 | R -> L., =/$ | (I4, L) | Kernel item.
I9 | S -> L=R., $ | (I6, R) | Kernel item.
I10 | R -> L., $ | (I6, L), (I11, L) | Kernel item.
I11 | L -> *.R, $; R -> .L, $; L -> .*R, $; L -> .id, $ | (I6, *), (I11, *) | This is a new item set, different from I4 because it has a different look-ahead.
I12 | L -> id., $ | (I6, id), (I11, id) | Kernel item.
I13 | L -> *R., $ | (I11, R) | Kernel item.

I4 and I11 can be combined and called I411.
I5 and I12 can be combined and called I512.
I7 and I13 can be combined and called I713.
I8 and I10 can be combined and called I810.

Thus we reduce the number of item sets from 14 to 10 in the LALR construction. The resulting LALR parsing table is given in Table 18.5.

Table 18.5 LALR parsing table

State | Action id | Action * | Action = | Action $ | Goto S | Goto L | Goto R
0 | s512 | s411 | | | 1 | 2 | 3
1 | | | | acc | | |
2 | | | s6 | r5 | | |
3 | | | | r2 | | |
411 | s512 | s411 | | | | 810 | 713
512 | | | r4 | r4 | | |
6 | s512 | s411 | | | | 810 | 9
713 | | | r3 | r3 | | |
810 | | | r5 | r5 | | |
9 | | | | r1 | | |

18.4 Conflicts in LL and LR Parsers

LL parsing tables are computed using FIRST and FOLLOW; the rows correspond to the non-terminals and the columns to the terminals. To construct the parsing table, the grammar needs to be pre-processed: left recursion is removed and the grammar is left-factored, giving a modified grammar. This modified grammar is used to compute the FIRST and FOLLOW sets, which are then used to construct the parsing table.

LR parsing tables are computed using Closure and Goto, and their entries are the shift, reduce, accept and error actions. The three types of LR parsers are SLR, CALR and LALR. All of them construct a parsing table whose rows correspond to the states, which are the sets of LR(0) or LR(1) items, and whose columns correspond to the terminals and non-terminals. This parsing table drives every parsing action. A table that has more than one action in an entry leaves the parser unable to decide its next move.

A grammar is said to be LL(1) if its LL(1) parse table has no conflicts, SLR if its SLR parse table has no conflicts, LALR(1) if its LALR(1) parse table has no conflicts and CALR(1) if its CALR(1) parse table has no conflicts. In an LR parsing table a conflict is either a shift/reduce conflict or a reduce/reduce conflict.
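Checking a table for conflicts amounts to looking for entries that hold more than one action. A minimal sketch follows, assuming the table is supplied as (state, terminal, action) triples with actions written as in Table 18.5; the names `find_conflicts` and `LALR_ENTRIES` are hypothetical. The extra r5 entry added in the second call is what the SLR construction would place at [2, =] for this grammar, since = is in FOLLOW(R).

```python
def find_conflicts(entries):
    """Return {(state, terminal): kind} for every entry holding more than one action."""
    cells = {}
    for state, terminal, action in entries:
        cells.setdefault((state, terminal), set()).add(action)

    conflicts = {}
    for cell, actions in cells.items():
        if len(actions) < 2:
            continue
        # A shift together with a reduce is shift/reduce; several reduces are reduce/reduce.
        has_shift = any(action.startswith("s") for action in actions)
        conflicts[cell] = "shift/reduce" if has_shift else "reduce/reduce"
    return conflicts


# Action part of Table 18.5, transcribed by hand.
LALR_ENTRIES = [
    (0, "id", "s512"), (0, "*", "s411"),
    (1, "$", "acc"),
    (2, "=", "s6"), (2, "$", "r5"),
    (3, "$", "r2"),
    (411, "id", "s512"), (411, "*", "s411"),
    (512, "=", "r4"), (512, "$", "r4"),
    (6, "id", "s512"), (6, "*", "s411"),
    (713, "=", "r3"), (713, "$", "r3"),
    (810, "=", "r5"), (810, "$", "r5"),
    (9, "$", "r1"),
]

print(find_conflicts(LALR_ENTRIES))                     # {}
print(find_conflicts(LALR_ENTRIES + [(2, "=", "r5")]))  # {(2, '='): 'shift/reduce'}
```

The first call reports no conflicts, which is what makes the grammar LALR(1) by the definition above.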

Conflicts that involve operators are resolved using the associativity and the precedence of the operators or symbols involved. The parsers resolve such conflicts in the following manner.

For left-associative operators of equal precedence, the conflict is resolved in favor of the reduce action.
For right-associative operators of equal precedence, the conflict is resolved in favor of the shift action.
If the operator on the stack has higher precedence than the incoming operator, the conflict is resolved in favor of the reduce action.
If the operator on the stack has lower precedence than the incoming operator, the conflict is resolved in favor of the shift action.

18.5 Error Detection and Recovery

The canonical LR parser uses the full LR(1) parse table and, when a syntax error occurs in the input, will never make a single reduction before recognizing the error. The SLR and LALR parsers may still make reductions when a syntax error occurs in the input, but will never shift the erroneous input symbol. An error is detected when the parsing table has no entry for the state on top of the stack and the current input symbol.

The parsers recover from errors so that compilation can be carried forward and further errors can be reported; the recovery does not make a permanent change to the source program. Errors are recovered from in one of the following ways:

Panic mode: In this mode of error recovery, the stack is popped until a state with a goto on some non-terminal A of the grammar is found. Input symbols are then discarded until a symbol that belongs to the FOLLOW set of A is found. The parser pushes the goto state for A and continues parsing.

Phrase-level recovery: We implement individual error routines and call the appropriate routine from each error entry of the table. A routine may pop the stack, discard input, or both. It records this information in an error log and recovers from the error so that parsing can continue.

Error productions: New error productions are added to the grammar. When the state on the stack and the input symbol have no table entry, the stack is popped until a state that has an error production among its items is found, and the error symbol is pushed onto the stack. After that, input symbols are discarded until a parsing action can continue.

Summary: In this module we discussed the construction of the LR(1) items for the LALR parser, which are obtained by modifying the LR(1) items constructed for the CALR parser. Using the modified LR(1) items, the LALR parsing table is constructed and used to parse a given string. We also discussed the LALR parsing action, along with error recovery in LR parsers.