Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Similar documents
Lecture 7: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing

Syntax Errors; Static Semantics

Bottom-Up Parsing. Lecture 11-12

Conflicts in LR Parsing and More LR Parsing Types

In One Slide. Outline. LR Parsing. Table Construction

Bottom-Up Parsing. Lecture 11-12

LR Parsing. Table Construction

LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators

Compilers. Bottom-up Parsing. (original slides by Sam

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

More Bottom-Up Parsing

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Context-free grammars

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

Compiler Construction: Parsing

Monday, September 13, Parsers

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Wednesday, August 31, Parsers

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Lecture Notes on Bottom-Up LR Parsing

Lecture Notes on Bottom-Up LR Parsing

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

S Y N T A X A N A L Y S I S LR

3. Parsing. Oscar Nierstrasz

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Context-free grammars (CFG s)

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Bottom-Up Parsing II. Lecture 8

How do LL(1) Parsers Build Syntax Trees?

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

CS 4120 Introduction to Compilers

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Concepts Introduced in Chapter 4

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Configuration Sets for CSX- Lite. Parser Action Table

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

LALR Parsing. What Yacc and most compilers employ.

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

Principles of Programming Languages

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

CS415 Compilers. LR Parsing & Error Recovery

Building a Parser Part III

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

shift-reduce parsing

Parsing Algorithms. Parsing: continued. Top Down Parsing. Predictive Parser. David Notkin Autumn 2008

Lecture Notes on Shift-Reduce Parsing

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

Formal Languages and Compilers Lecture VII Part 3: Syntactic A


Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Syntax Analysis Part I

Lexical and Syntax Analysis. Bottom-Up Parsing

Building Compilers with Phoenix

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Compiler Construction

Syntax Analysis. Chapter 4

LR Parsing - Conflicts

Compiler Construction

CS 314 Principles of Programming Languages

LR Parsing - The Items

Semantic Analysis. Lecture 9. February 7, 2018

Syntax-Directed Translation

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

UNIT III & IV. Bottom up parsing

Compiler Construction

UNIVERSITY OF CALIFORNIA

Using an LALR(1) Parser Generator

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

Lecture Bottom-Up Parsing

Syntactic Analysis. Syntactic analysis, or parsing, is the second phase of compilation: The token file is converted to an abstract syntax tree.

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

Transcription:

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 1

Shift/Reduce Conflicts If a DFA state contains both [X: α aβ, b] and [Y: γ, a], then we have two choices when the parser gets into that state at the and the next input symbol is a: Shift into the state containing [X: αa β, b], or Reduce with Y: γ. This is called a shift-reduce conflict. Often due to ambiguities in the grammar. Classic example: the dangling else S: "if" E "then" S "if" E "then" S "else" S... This grammar gives rise to a DFA state containing [S: "if" E "then" S, "else"] and [S: "if" E "then" S "else" S,...] So if else is next, we can shift or reduce. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 2

More Shift/Reduce Conflicts Consider the ambiguous grammar E : E + E E * E int We will have states containing [E: E + E, */+] [E: E + E, */+] E [E: E + E, */+] = [E: E + E, */+] [E: E * E, */+] [E: E * E, */+]...... Again we have a shift/reduce conflict on input * or + (in the item set on the right). We probably want to shift on * (which is usually supposed to bind more tightly than + ) We probably want to reduce on + (left-associativity). Solution: provide extra information (the precedence of * and + ) that allows the parser generator to decide what to do. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 3

Using Precedence in Bison/Horn In Bison or Horn, you can declare precedence and associativity of both terminal symbols and rules, For terminal symbols (tokens), there are precedence declarations, listed from lowest to highest precedence: %left + %left * %right "**" Symbols on each such line have the same precedence. For a rule, precedence = that of its last terminal (Can override with %prec if needed, cf. the Bison manual). Now, we resolve shift/reduce conflict with a shift if: The next input token has higher precedence than the rule, or The next input token has the same precedence as the rule and the relevent precedence declaration was %right. and otherwise, we choose to reduce the rule. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 4

Example of Using Precedence to Solve S/R Conflict (1) Assuming we ve declared %left + %left * the rule E: E + E will have precedence 1 (left-associative) and the rule E: E * E will have precedence 2. So, when the parser confronts the choice in state 6 w/next token *, 5 E: E + E, */+ E: E + E, */+ E: E * E, */+ etc. E E: E + E, */+ E: E + E, */+ E: E * E, */+ 6 it will choose to shift because the * has higher precedence than the rule E + E. On the other hand, with input symbol +, it will choose to reduce, because the input token then has the same precedence as the rule to be reduced, and is left-associative. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 5

Example of Using Precedence to Solve S/R Conflict (2) Back to our dangling else example. We ll have the state 10 S: "if" E "then" S, "else" S: "if" E "then" S "else" S, "else" etc. Can eliminate conflict by declaring the token else to have higher precedence than then (and thus, than the first rule above). HOWEVER: best to limit use of precedence to these standard examples (expressions, dangling elses). If you simply throw them in because you have a conflict you don t understand, you re like to end up with unexpected parse trees or syntax errors. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 6

Reduce/Reduce Conflicts The lookahead symbols in LR(1) items are only considered for reductions in items that end in. If a DFA state contains both [X: α, a] and [Y: β, a] then on input a we don t know which production to reduce. Such reduce/reduce conflicts are often due to a gross ambiguity in the grammar. Example: defining a sequence of identifiers with S: ǫ id id S There are two parse trees for the string id: S id or S id S id. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 7

Reduce/Reduce Conflicts in DFA For this example, you ll get states: 0 S : S, S:, S: id, S: id S, S : S, id S: id, S: id S, S:, S: id S, S: id, S: id S, 1 Reduce/reduce conflict on input. Better rewrite the grammar: S: ǫ id S. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 8

Parsing Errors One purpose of the parser is to filter out errors that show up in parsing Later stages should not have to deal with possibility of malformed constructs Parser must identify error so programmer knows what to correct Parser should recover so that processing can continue (and other errors found). Parser might even correct error (e.g., PL/C compiler could correct some Fortran programs into equivalent PL/1 programs!) Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 9

Identifying Errors All of the valid parsers we ve seen identify syntax errors as soon as possible. Valid prefix property: all the input that is shifted or scanned is the beginning of some valid program......but the rest of the input might not be. So in principle, deleting the lookahead (and subsequent symbols) and inserting others will give a valid program. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 10

Automating Recovery Unfortunately, best results require using semantic knowledge and hand tuning. E.g., a(i].y = 5 might be turned to a[i].y = 5 if a is statically known to be a list, or a(i).y = 5 if a function. Some automatic methods can do an OK job that at least allows parser to catch more than one error. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 11

Bison s Technique The special terminal symbol error is never actually returned by the lexer. Gets inserted by parser in place of erroneous tokens. Parsing then proceeds normally. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 12

Example of Bison s Error Rules Suppose we want to throw away bad statements and carry on stmt : whilestmt ifstmt... error NEWLINE ; Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 13

Consider erroneous text like if x y:... Response to Error When parser gets to the y, will detect error. Then pops items off parsing stack until it finds a state that allows a shift or reduction on error terminal Does reductions, then shifts error. Finally, throws away input until it finds a symbol it can shift after error, according to the grammar. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 14

So with our example: Error Response, contd. stmt : whilestmt ifstmt... error NEWLINE ; We see y, throw away the if x, so as to be back to where a stmt can start. Shift error andthrowawaymoresymbolsto NEWLINE. Thencarry on. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 15

Of Course, It s Not Perfect Throw away and punt is sometimes called panic-mode error recovery Results are often annoying. For example, in our example, there could be an INDENT after the NEWLINE, which doesn t fit the grammar and causes another error. Bison compensates in this case by not reporting errors that are too close together But in general, can get cascade of errors. Doing it right takes a lot of work. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 16

[See lecture15 directory.] Bison Examples Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 17

A Hierarchy of Grammar Classes Unambiguous Grammars LL(k) LR(k) LR(1) LL(1) LALR(1) SLR LL(0) LR(0) GLR Ambiguous Grammars From Andrew Appel, Modern Compiler Implementation in Java Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 18

Summary Parsing provides a means of tying translation actions to syntax clearly. A simple parser: LL(1), recursive descent A more powerful parser: LR(1) An efficiency hack: LALR(1), as in Bison. Earley s algorithm provides a complete algorithm for parsing all contextfree languages. WecangetthesameeffectinBisonbyothermeans(the%glr-parser option, for GeneralizedLR), as seenin one of the examplesfrom lecture #5. Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 19