LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Similar documents
Context-free grammars

Compiler Construction: Parsing

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

Principles of Programming Languages

S Y N T A X A N A L Y S I S LR

Compilers. Bottom-up Parsing. (original slides by Sam

Lexical and Syntax Analysis. Bottom-Up Parsing

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

LALR Parsing. What Yacc and most compilers employ.

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

UNIT-III BOTTOM-UP PARSING

MODULE 14 SLR PARSER LR(0) ITEMS

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Lecture Bottom-Up Parsing

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12

4. Lexical and Syntax Analysis

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

4. Lexical and Syntax Analysis

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

LR Parsing - The Items

Compiler Construction 2016/2017 Syntax Analysis

SLR parsers. LR(0) items

Lecture 8: Deterministic Bottom-Up Parsing

Downloaded from Page 1. LR Parsing

LR Parsers. Aditi Raste, CCOEW

UNIT III & IV. Bottom up parsing

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

CS 4120 Introduction to Compilers

Wednesday, August 31, Parsers

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Monday, September 13, Parsers

LR Parsing LALR Parser Generators

LR Parsing Techniques

Lecture 7: Deterministic Bottom-Up Parsing

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

How do LL(1) Parsers Build Syntax Trees?

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

Chapter 4. Lexical and Syntax Analysis

Syntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1


Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

LR Parsing LALR Parser Generators

Bottom-up Parser. Jungsik Choi

shift-reduce parsing

Syntax Analyzer --- Parser

Conflicts in LR Parsing and More LR Parsing Types

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

Chapter 4: LR Parsing

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)


In One Slide. Outline. LR Parsing. Table Construction

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Concepts Introduced in Chapter 4

Using an LALR(1) Parser Generator

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Bottom-Up Parsing II. Lecture 8

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

CSE302: Compiler Design

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

CS308 Compiler Principles Syntax Analyzer Li Jiang

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

Syntax-Directed Translation

LR Parsing Techniques

Syn S t yn a t x a Ana x lysi y s si 1

Outline. The strategy: shift-reduce parsing. Introduction to Bottom-Up Parsing. A key concept: handles

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

VIVA QUESTIONS WITH ANSWERS

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal

CS415 Compilers. LR Parsing & Error Recovery

Principles of Programming Languages

Syntax-Directed Translation. Lecture 14

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

LR Parsing. Table Construction

Bottom up parsing. General idea LR(0) SLR LR(1) LALR To best exploit JavaCUP, should understand the theoretical basis (LR parsing);

General Overview of Compiler

Outline. 1 Introduction. 2 Context-free Grammars and Languages. 3 Top-down Deterministic Parsing. 4 Bottom-up Deterministic Parsing

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Yacc: A Syntactic Analysers Generator

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis.

Lex and Yacc. More Details

Programming Language Syntax and Analysis

Transcription:

LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26 Compiler Design LR Parsing CSE 504 1 / 32 Shift-Reduce Parsing Leftmost and Rightmost Derivations Derivations for id + id: E = E+T = T +T = id+t = id+id LEFTMOST E = E+T = E+id = T +id = id+id RIGHTMOST Compiler Design LR Parsing CSE 504 2 / 32

Bottom-up Parsing Shift-Reduce Parsing Given a stream of tokens w, reduce it to the start symbol. Parse input stream: id + id: Reduction Derivation 1. id + id T + id E + id E + T E Compiler Design LR Parsing CSE 504 3 / 32 Shift-Reduce Parsing Shift-Reduce Parsing: An Example Stack Input Stream Action $ id + id $ shift $ id + id $ reduce by $ T + id $ reduce by $ E + id $ shift $ E + id $ shift $ E + id $ reduce by $ E + T $ reduce by $ E $ ACCEPT Compiler Design LR Parsing CSE 504 4 / 32

Shift-Reduce Parsing Handles A structure that furnishes a means to perform reductions Parse input stream: id + id: id + id T + id E + id E + T E Compiler Design LR Parsing CSE 504 5 / 32 Handles Shift-Reduce Parsing Handles are substrings of sentential forms: 1 A substring that matches the right hand side of a production 2 Reduction using that rule can lead to the start symbol 3 The rule forms one step in a rightmost derivation of the string E = E + T = E + id = T + id = id + id Handle Pruning: replace handle by corresponding LHS. Compiler Design LR Parsing CSE 504 6 / 32

Shift-Reduce Parsing Shift-Reduce Parsing Bottom-up parsing Shift: Construct leftmost handle on top of stack Reduce: Identify handle and replace by corresponding RHS Accept: Continue until string is reduced to start symbol and input token stream is empty Error: Signal parse error if no handle is found. Compiler Design LR Parsing CSE 504 7 / 32 Shift-Reduce Parsing Implementing Shift-Reduce Parsers Stack to hold grammar symbols (corresponding to tokens seen thus far). Input stream of yet-to-be-seen tokens. Handles appear on top of stack. Stack is initially empty (denoted by $). Parse is successful if stack contains only the start symbol when the input stream ends. Compiler Design LR Parsing CSE 504 8 / 32

Shift-Reduce Parsing Preparing for Shift-Reduce Parsing 1 Identify a handle in string. Top of stack is the rightmost end of the handle. What is the leftmost end? 2 If there are multiple productions with the handle on the RHS, which one to choose? Construct a parsing table, just as in the case of LL(1) parsing. Compiler Design LR Parsing CSE 504 9 / 32 Shift-Reduce Parsing Shift-Reduce Parsing: Derivations Stack Input Stream Action $ id + id $ shift $ id + id $ reduce by $ T + id $ reduce by $ E + id $ shift $ E + id $ shift $ E + id $ reduce by $ E + T $ reduce by $ E $ ACCEPT Left to Right Scan of input Rightmost Derivation in reverse. Compiler Design LR Parsing CSE 504 10 / 32

LR Parsers A Simple Example of LR Parsing S BC B a C a Stack Input Stream Action $ a a $ shift $ a a $ reduce by B a $ B a $ shift $ B a $ reduce by C a $ B C $ reduce by S BC $ S $ ACCEPT Compiler Design LR Parsing CSE 504 11 / 32 LR Parsers A Simple Example of LR Parsing: A Detailed Look S S S B C B a C a Stack Input State Action $ a a $ S S S BC shift B a $ a a $ B a reduce by 3 $ B a $ S B C C a shift $ B a $ C a reduce by 4 $ B C $ S BC reduce by 2 $ S $ S S ACCEPT Compiler Design LR Parsing CSE 504 12 / 32

LR Parsers LR Parsing: Another Example E E Stack Input State Action $ id + id $ E E E E+T E T shift T id $ id + id $ reduce by 4 $ T + id $ reduce by 3 $ E + id $ E E E E +T shift $ E + id $ E E+ T T id shift $ E + id $ reduce by 4 $ E + T $ reduce by 2 $ E $ E E +T E E ACCEPT Compiler Design LR Parsing CSE 504 13 / 32 States of an LR parser LR Parsers I 0 : E E E E+T E T T id Item: A production with somewhere on the RHS. Intuitively, grammar symbols before the are on stack; grammar symbols after the represent symbols in the input stream. Item set: A set of items; corresponds to a state of the parser. Compiler Design LR Parsing CSE 504 14 / 32

LR Parsers States of an LR parser (contd.) I 0 E E E E+T E T T id Initial State = closure({e E}) Closure: What other items are equivalent to the given item? Given an item A α Bβ, closure(a α Bβ) is the smallest set that contains 1 the item A α Bβ, and 2 every item in closure(b γ) for every production B γ G Compiler Design LR Parsing CSE 504 15 / 32 LR Parsers States of an LR parser (contd.) I 0 E E E E+T E T T id Initial State = closure({e E}) I 3 = goto(i 0, id) Goto: goto(i, X ) specifies the next state to visit. X is a terminal: when the next symbol on input stream is X. X is a nonterminal: when the last reduction was to X. goto(i, X ) contains all items in closure(a αx β) for every item A α X β I. Compiler Design LR Parsing CSE 504 16 / 32

LR Parsers Collection of LR(0) Item Sets The canonical collection of LR(0) item sets, C = {I 0, I 1,...} is the smallest set such that closure({s S}) C. I C X, goto(i, X ) C. Compiler Design LR Parsing CSE 504 17 / 32 LR Parsers Canonical LR(0) Item Sets: An Example E E I 0 = closure({e E}) E E E E+T E T T id I 1 = goto(i 0, E) E E E E +T I 2 = goto(i 0, T ) I 3 = goto(i 0, id) I 4 = goto(i 1, +) E E+ T T id I 5 = goto(i 4, T ) Compiler Design LR Parsing CSE 504 18 / 32

LR Parsers LR Action Table E E id + $ 0 S, 3 1 S, 4 A 2 R3 R3 R3 3 R4 R4 R4 4 S, 3 5 R2 R2 R2 Compiler Design LR Parsing CSE 504 19 / 32 LR Goto Table LR Parsers E E E T 0 1 2 1 2 3 4 5 5 Compiler Design LR Parsing CSE 504 20 / 32

LR Parsers LR Parsing: States and Transitions Action Table: id + $ 0 S, 3 1 S, 4 A 2 R3 R3 R3 3 R4 R4 R4 4 S, 3 5 R2 R2 R2 Goto Table: E T 0 1 2 1 2 3 4 5 5 E E State Stack Symbol Stack Input Action $ 0 $ id + id $ shift, 3 $ 0 3 $ id + id $ reduce by 4 $ 0 2 $ T + id $ reduce by 3 $ 0 1 $ E + id $ shift, 4 $ 0 1 4 $ E + id $ shift, 3 $ 0 1 4 3 $ E + id $ reduce by 4 $ 0 1 4 5 $ E + T $ reduce by 2 $ 0 1 $ E $ ACCEPT Compiler Design LR Parsing CSE 504 21 / 32 LR Parser LR Parsers while (true) { switch (action(state stack.top(), current token)) { case shift s : symbol stack.push(current token); state stack.push(s ); next token(); case reduce A β: pop β symbols off symbol stack and state stack; symbol stack.push(a); state stack.push(goto(state stack.top(), A)); case accept: return; default: error; }} Compiler Design LR Parsing CSE 504 22 / 32

LR Parsers LR Parsing: A review E E Table-driven shift reduce parsing: Shift Move terminal symbols from input stream to stack. Reduce Replace top elements of stack that form an instance of the RHS of a production with the corresponding LHS Accept Stack top is the start symbol when the input stream is exhausted Table constructed using LR(0) Item Sets. Compiler Design LR Parsing CSE 504 23 / 32 SLR and LR(1) Parsers Conflicts in Parsing Table Grammar: S S S a S S ɛ Item Sets: I 0 = closure({s S}) S S S a S S I 1 = goto(i 0, S) S S I 2 = goto(i 0, a) S a S S a S S I 3 = goto(i 2, S) S a S Action Table: a $ 0 S, 2 R 3 R 3 1 A S, 2 2 R 3 R 3 3 R 2 R 2 Shift-Reduce Conflict Compiler Design LR Parsing CSE 504 24 / 32

SLR and LR(1) Parsers Simple LR (SLR) Parsing Constructing Action Table action, indexed by states terminals, and Goto Table goto, indexed by states nonterminals: Construct {I 0, I 1,..., I n }, the LR(0) sets of items for the grammar. For each i, 0 i n, do the following: If A α aβ I i, and goto(i i, a) = I j, set action[i, a] = shift j. If A γ I i (A is not the start symbol), for each a FOLLOW (A), set action[i, a] = reduce A γ. If S S I i, set action[i, $] = accept. If goto(i i, A) = I j (A is a nonterminal), set goto[i, A] = j. Compiler Design LR Parsing CSE 504 25 / 32 SLR Parsing Table SLR and LR(1) Parsers Grammar: S S S a S S ɛ Item Sets: I 0 = closure({s S}) S S S a S S I 1 = goto(i 0, S) S S I 2 = goto(i 0, a) S a S S a S S I 3 = goto(i 2, S) S a S FOLLOW (S) = {$} SLR Action Table: a $ 0 S, 2 R 3 1 A 2 S, 2 R 3 3 R 2 Compiler Design LR Parsing CSE 504 26 / 32

SLR and LR(1) Parsers Deficiencies of SLR Parsing SLR(1) treats all occurrences of a RHS on stack as identical. Only a few of these reductions may lead to a successful parse. Example: S AaAb S BbBa A ɛ B ɛ I 0 = {[S S], [S AaAb], [S BbBa], [A ], [B ]}. Since FOLLOW (A) = FOLLOW (B), we have reduce/reduce conflict in state 0. Compiler Design LR Parsing CSE 504 27 / 32 LR(1) Item Sets SLR and LR(1) Parsers Construct LR(1) items of the form A α β, a, which means: The production A αβ can be applied when the next token on input stream is a. S AaAb S BbBa A ɛ B ɛ An example LR(1) item set: I 0 = {[S S, $], [S AaAb, $], [S BbBa, $], [A, a], [B, b]}. Compiler Design LR Parsing CSE 504 28 / 32

SLR and LR(1) Parsers LR(1) and LALR(1) Parsing LR(1) parsing: Parse tables built using LR(1) item sets. LALR(1) parsing: Look Ahead LR(1) Merge LR(1) item sets; then build parsing table. Typically, LALR(1) parsing tables are much smaller than LR(1) parsing table. SLR(1) LALR(1) LR(1). LL(1) SLR(1), but LL(1) LR(1). Compiler Design LR Parsing CSE 504 29 / 32 YACC SLR and LR(1) Parsers Yet Another Compiler Compiler: LALR(1) parser generator. Grammar rules written in a specification (.y) file, analogous to the regular definitions in a lex specification file. Yacc translates the specifications into a parsing function yyparse(). spec.y yacc spec.tab.c yyparse() calls yylex() whenever input tokens need to be consumed. bison: GNU variant of yacc. ply: Python s yacc ; provides function yacc() that is similar to Yacc s yyparse() Compiler Design LR Parsing CSE 504 30 / 32

SLR and LR(1) Parsers Using Yacc %{... C headers (#include) %}... Yacc declarations: %token... %union{...} precedences %%... Grammar rules with actions: Expr: Expr TOK_PLUS Expr Expr TOK_STAR Expr ; %%... C support functions Compiler Design LR Parsing CSE 504 31 / 32 Parsing in PLY SLR and LR(1) Parsers See http://www.dabeaz.com/ply/ply.html import ply.yacc as yacc... import tokens from PLY/lexer # precedences: precedence = ( ( left, TOK_PLUS ), ( left, TOK_STAR ) ) # Grammar rules with actions: def p_expression_plus(p): expr: expr TOK_PLUS expr pass #action, if necessary def p_expression_minus(p): expr: expr TOK_STAR expr pass #action, if necessary Compiler Design LR Parsing CSE 504 32 / 32