Panic-mode error recovery. Top-down parsing with a parsing table (once more)


Top-down parsing with a parsing table (once more)

CURRENT INPUT TOKEN
VAR   a      b      c      d      e         f      g      h      $
S            b      AaS    AaS    AaS
A                   cB     dB     eCDBfDB
B     ε                                     ε      DB     DB
C                   c      d      eCDBf
D                                                  gC     hC

STACK         CURRENT INPUT   PRODUCTION TO APPLY
S$            cgedhcfab$      S → AaS
AaS$          cgedhcfab$      A → cB
cBaS$         cgedhcfab$      match
BaS$          gedhcfab$       B → DB
DBaS$         gedhcfab$       D → gC
gCBaS$        gedhcfab$       match
CBaS$         edhcfab$        C → eCDBf
eCDBfBaS$     edhcfab$        match
CDBfBaS$      dhcfab$

Panic-mode error recovery

Idea 1: If you have a variable on top of the stack, skip input tokens until a synchronizing token for that variable appears. At that point, pop the variable and try to resume. (Of course, also say something about what has happened.)

When does it (possibly) make sense to discard the variable on top of the stack? When we see a token that can follow whatever that variable can generate.

This idea suggests that the synchronizing tokens for variable A will be the elements of FOLLOW(A).
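The table-driven parse traced above can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's implementation: the names `TABLE` and `ll1_parse` are made up here, the table entries are transcribed from the parsing table on the slide, and every grammar symbol is assumed to be a single character.

```python
# Illustrative sketch; TABLE and ll1_parse are hypothetical names.
# Entries are transcribed from the slide's parsing table ("" stands for ε).
TABLE = {
    "S": {"b": "b", "c": "AaS", "d": "AaS", "e": "AaS"},
    "A": {"c": "cB", "d": "dB", "e": "eCDBfDB"},
    "B": {"a": "", "f": "", "g": "DB", "h": "DB"},
    "C": {"c": "c", "d": "d", "e": "eCDBf"},
    "D": {"g": "gC", "h": "hC"},
}


def ll1_parse(tokens, start="S"):
    """Return the productions applied, in order, or raise SyntaxError."""
    stack = ["$", start]            # top of the stack is the end of the list
    inp = list(tokens) + ["$"]
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = inp[pos]
        if top in TABLE:            # variable on top: consult the table
            rhs = TABLE[top].get(look)
            if rhs is None:
                raise SyntaxError(f"no table entry for ({top}, {look})")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))   # push the right-hand side in reverse
        elif top == look:           # terminal (or $) on top: match
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, saw {look}")
    return applied
```

Calling `ll1_parse("cgedhcfab")` applies the same productions as the trace above and then continues to the end of the input. Without any recovery, an input such as `cgah` simply raises `SyntaxError` at the first empty table entry.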

S → AaS | b
A → cB | dB | eCDBfDB
B → DB | ε
C → c | d | eCDBf
D → gC | hC

For every variable A, FOLLOW(A) is the set consisting of

all terminals a such that S ⇒* αAaβ for some strings α, β over V ∪ Σ, along with

$, if S ⇒* αA for some string α over V ∪ Σ.

We previously saw that FOLLOW(B) = {a, f}.

$ ∈ FOLLOW(S), since S ⇒* S.

a ∈ FOLLOW(A), since S ⇒ AaS.

a, f, g, h ∈ FOLLOW(D), since S ⇒* eCDfDBaS and FIRST(B) = {g, h, ε}.

a, f, g, h ∈ FOLLOW(C), since S ⇒* eCgCfgCBaS and FIRST(B) = {g, h, ε}.
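The FOLLOW sets can also be computed mechanically. The following is a minimal sketch of the usual fixed-point computation of FIRST and FOLLOW, written for this grammar only; the names `GRAMMAR`, `first_of`, `compute_first` and `compute_follow` are illustrative assumptions, and each grammar symbol is assumed to be one character.

```python
# Illustrative sketch; all names here are hypothetical.
# The grammar is transcribed from the slide ("" stands for the ε right-hand side).
GRAMMAR = {
    "S": ["AaS", "b"],
    "A": ["cB", "dB", "eCDBfDB"],
    "B": ["DB", ""],
    "C": ["c", "d", "eCDBf"],
    "D": ["gC", "hC"],
}
EPS = "ε"


def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each variable."""
    out = set()
    for x in symbols:
        fx = first[x] if x in GRAMMAR else {x}
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    out.add(EPS)                    # every symbol can vanish (or the string is empty)
    return out


def compute_first():
    first = {v: set() for v in GRAMMAR}
    changed = True
    while changed:                  # iterate until nothing new is added
        changed = False
        for v, rhss in GRAMMAR.items():
            for rhs in rhss:
                new = first_of(rhs, first)
                if not new <= first[v]:
                    first[v] |= new
                    changed = True
    return first


def compute_follow(first, start="S"):
    follow = {v: set() for v in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for v, rhss in GRAMMAR.items():
            for rhs in rhss:
                for i, x in enumerate(rhs):
                    if x not in GRAMMAR:
                        continue    # only variables have FOLLOW sets
                    tail = first_of(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:  # the rest can vanish: inherit FOLLOW of the head
                        new |= follow[v]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow
```

For example, `compute_follow(compute_first())` can be compared against the FOLLOW sets derived by hand above.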

FOLLOW TABLE

VARIABLE   FOLLOW SET
S          {$}
A          {a}
B          {a, f}
C          {a, f, g, h}
D          {a, f, g, h}

So we can augment the parsing table to indicate variable/lookahead pairs that may be useful for error recovery synchronization. If there is currently no entry for a pair from the follow table, add a synch entry.

CURRENT INPUT TOKEN
VAR   a      b      c      d      e         f      g      h      $
S            b      AaS    AaS    AaS                            synch
A     synch         cB     dB     eCDBfDB
B     ε                                     ε      DB     DB
C     synch         c      d      eCDBf     synch  synch  synch
D     synch                                 synch  gC     hC

Let's try this on an example.

STACK         CURRENT INPUT   PRODUCTION TO APPLY
S$            cgah$           S → AaS
AaS$          cgah$           A → cB
cBaS$         cgah$           match
BaS$          gah$            B → DB
DBaS$         gah$            D → gC
gCBaS$        gah$            match
CBaS$         ah$             error, synch
BaS$          ah$             B → ε
aS$           ah$             match
S$            h$              error
S$            $               error, synch
$             $               parse complete

First error? Missing C (missing term)
Second error? Ignored unexpected h (unexpected )
Third error? Missing S (missing eof?)

Two more ideas for what to do when an error occurs:

Idea 2: If you have a variable A on top of the stack, skip input tokens until you get a token in FIRST(A). (Also say something about what has happened.)

Notice that we don't need to add any information to the parse table in order to implement Idea 2.

So in combination with Idea 1, when there is a parse error and a variable A on top of the stack, we skip input tokens until we see either a token in FIRST(A), in which case we simply continue, or a token in FOLLOW(A), in which case we pop A off the stack and continue.

Idea 3: If you have a token a on top of the stack, discard it, and say "inserting a in input".

STACK         CURRENT INPUT   PRODUCTION TO APPLY
S$            caab$           S → AaS
AaS$          caab$           A → cB
cBaS$         caab$           match
BaS$          aab$            B → ε
aS$           aab$            match
S$            ab$             error
S$            b$              S → b
b$            b$              match
$             $               parse complete

What to say about this error? "Ignored unexpected a".
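Ideas 1 to 3 fit together into one small driver loop. The sketch below is one possible reading of them, not the book's or the course's code: the names `TABLE`, `FOLLOW` and `parse_with_recovery` and the exact wording of the messages are assumptions. It also assumes two details the slides do not spell out: a variable is popped when the lookahead is `$` (there is nothing left to skip), and leftover input after the stack is down to `$` is skipped.

```python
# Illustrative sketch; names and message wording are hypothetical.
# TABLE is the slide's parsing table ("" stands for ε); FOLLOW is the slide's follow table.
TABLE = {
    "S": {"b": "b", "c": "AaS", "d": "AaS", "e": "AaS"},
    "A": {"c": "cB", "d": "dB", "e": "eCDBfDB"},
    "B": {"a": "", "f": "", "g": "DB", "h": "DB"},
    "C": {"c": "c", "d": "d", "e": "eCDBf"},
    "D": {"g": "gC", "h": "hC"},
}
FOLLOW = {
    "S": {"$"},
    "A": {"a"},
    "B": {"a", "f"},
    "C": {"a", "f", "g", "h"},
    "D": {"a", "f", "g", "h"},
}


def parse_with_recovery(tokens, start="S"):
    """Parse to the end of the input and return the list of error messages."""
    stack = ["$", start]            # top of the stack is the end of the list
    inp = list(tokens) + ["$"]
    pos = 0
    errors = []
    while stack:
        top, look = stack[-1], inp[pos]
        if top in TABLE:
            rhs = TABLE[top].get(look)
            if rhs is not None:     # ordinary table entry: expand
                stack.pop()
                stack.extend(reversed(rhs))
            elif look in FOLLOW[top] or look == "$":
                # Idea 1: synch entry, pop the variable
                # (assumption: also pop at end of input)
                errors.append(f"missing {top}")
                stack.pop()
            else:                   # Ideas 1 and 2: skip the offending token
                errors.append(f"ignored unexpected {look}")
                pos += 1
        elif top == look:           # terminal (or $) matches
            stack.pop()
            pos += 1
        elif top == "$":            # assumption: skip input left over at the end
            errors.append(f"ignored unexpected {look}")
            pos += 1
        else:                       # Idea 3: discard the terminal on the stack
            errors.append(f"inserted {top}")
            stack.pop()
    return errors
```

On `caab` this sketch reports the single error from the trace above, and on `cgah` it reports the three errors from the earlier example. No skipping is needed for tokens in FIRST(A): as soon as such a token is the lookahead, the table has an entry and parsing continues.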

STACK         CURRENT INPUT   PRODUCTION TO APPLY
S$            feab$           error
S$            eab$            S → AaS
AaS$          eab$            A → eCDBfDB
eCDBfDBaS$    eab$            match
CDBfDBaS$     ab$             error, synch
DBfDBaS$      ab$             error, synch
BfDBaS$       ab$             B → ε
fDBaS$        ab$             error
DBaS$         ab$             error, synch
BaS$          ab$             B → ε
aS$           ab$

First error, ignoring unexpected f
Second error, missing C
Third error, missing D
Fourth error, inserted f

The book describes two more ideas for panic-mode error handling in top-down parsing. They are less convincing. It appears that on this example, the techniques we've looked at work pretty well.

Of course, if you like, you can simply insert error routines as actions in the parse table, doing arbitrarily helpful and/or complex things in response to errors. The book calls this phrase-level recovery.

Closing Remarks on Top-Down Parsing

Top-down parsing is appealing because it is relatively intuitive. But in practice, the approach often leads to grammars that are unintuitive, because we need an LL(1) grammar. Moreover, there are many languages that are eminently parsable, but for which there is no LL(1) grammar.

In many cases, as in the long example last time, we can eliminate all left recursion (in three steps), and if we simply left factor at that point, we will fail to obtain an LL(1) grammar even though there is in fact an equivalent LL(1) grammar. Finding an equivalent LL(1) grammar is too much of an art! And if we do find one, it may be hard to understand and awkward for producing a translation.

Bottom-up parsing

Bottom-up parsing is more widely applicable than top-down parsing, and more widely used, but less intuitive.

Rough idea: construct a parse tree from the bottom up (instead of from the top down). That sounds simple enough, but it seems to be harder to understand in detail how it works.

Bottom-up parsing aka shift-reduce parsing

Bottom-up parsing is also called shift-reduce parsing. A successful parse reduces the input string to the start symbol.

Example: Consider input a + b * a and grammar

E → E + E | E * E | a | b

STACK     INPUT     ACTION
$         a+b*a$    shift
$a        +b*a$     E → a
$E        +b*a$     shift
$E+       b*a$      shift
$E+b      *a$       E → b
$E+E      *a$       E → E + E
$E        *a$       shift
$E*       a$        shift
$E*a      $         E → a
$E*E      $         E → E * E
$E        $         accept

To what derivation does this correspond? Notice something remarkable about the correspondence between the derivation steps and the stack and input contents at each step?

Before beginning to say more precisely what is happening here (a long story!), let's consider another example. Here's the (transposed) grammar used to specify the first problem in homework 1.

S → AaS | b
A → c | d | B
B → AgC | AhC | DgC | DhC
C → c | d | D
D → eBf

STACK     INPUT     ACTION
$         cab$      shift
$c        ab$       A → c
$A        ab$       shift
$Aa       b$        shift
$Aab      $         S → b
$AaS      $         S → AaS
$S        $         accept

Now why didn't we use production C → c at the second step? Cab is not a sentential form!

Right-sentential forms

We notice that at each step in a successful bottom-up parse, as illustrated in the previous examples, the concatenation of the current stack and input corresponds to a sentential form. Moreover, the parse as a whole corresponds to a rightmost derivation of the input.

We call a sentential form a right-sentential form if it appears in some rightmost derivation from the start symbol. Notice, in particular, that every sentence of the grammar (i.e., every string in the language generated by the grammar) is a right-sentential form.
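The correspondence between a shift-reduce parse and a rightmost derivation can be checked by replaying the actions of the trace for input cab. This is a small illustrative sketch: the function name `replay` and the encoding of actions (the string `"shift"`, or a pair `(lhs, rhs)` for a reduction) are assumptions made for this example.

```python
# Illustrative sketch; the function name and the action encoding are hypothetical.
def replay(tokens, actions):
    """Replay shift/reduce actions and return stack+input after each reduction."""
    stack, rest = [], list(tokens)
    forms = ["".join(stack + rest)]          # the input itself comes first
    for act in actions:
        if act == "shift":
            stack.append(rest.pop(0))        # move one token onto the stack
        else:
            lhs, rhs = act
            cut = len(stack) - len(rhs)
            # the right-hand side must be sitting on top of the stack
            assert "".join(stack[cut:]) == rhs, f"cannot reduce by {lhs} -> {rhs}"
            del stack[cut:]
            stack.append(lhs)
            forms.append("".join(stack + rest))
    return forms


# Actions copied from the trace for input cab.
ACTIONS = ["shift", ("A", "c"), "shift", "shift", ("S", "b"), ("S", "AaS")]
FORMS = replay("cab", ACTIONS)
```

Here `FORMS` is `["cab", "Aab", "AaS", "S"]`. Read backwards, this is the rightmost derivation S ⇒ AaS ⇒ Aab ⇒ cab, so each concatenation of stack and input along the way is a right-sentential form.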

Handles

We can reduce by a production B → β when there is a right-sentential form αβγ such that the stack holds $αβ, the current input is γ$, and αBγ ⇒ αβγ is the last step in a rightmost derivation of αβγ from the start symbol.

Definition: A production B → β is a handle of αβγ in the position following α if αBγ ⇒ αβγ is the last step in a rightmost derivation from the start symbol.

It is convenient to simply say that β is a handle of αβγ, if it is clear what production and position are meant.

Example

S → cAd
A → ab | a

So a is a handle for cad, since S ⇒ cAd ⇒ cad. What production is meant? And in what position?

Is a a handle for cabd? So, is there a rightmost derivation that ends cAbd ⇒ cabd?

Is ab a handle for cabd?

Does ab have a handle? Does cad?
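For a grammar as small as the one in this example, the definition of a handle can be tested directly by brute force. The sketch below is an illustration only: the names `PRODS`, `VARS`, `handles` and `is_right_sentential` are made up, symbols are single characters, and the recursion is only guaranteed to terminate for grammars like this one that have no ε-productions and no unit productions.

```python
# Illustrative sketch; all names are hypothetical.
# Assumes no ε-productions and no unit productions, so the recursion terminates.
PRODS = [("S", "cAd"), ("A", "ab"), ("A", "a")]
VARS = {"S", "A"}


def handles(form, start="S"):
    """Return every (lhs, rhs, position) that is a handle of the given form."""
    found = []
    for lhs, rhs in PRODS:
        i = form.find(rhs)
        while i != -1:
            gamma = form[i + len(rhs):]          # what follows the candidate handle
            prev = form[:i] + lhs + gamma        # the form one derivation step earlier
            # In a rightmost step nothing to the right of B is a variable,
            # and the earlier form must itself be right-sentential.
            if not (set(gamma) & VARS) and is_right_sentential(prev, start):
                found.append((lhs, rhs, i))
            i = form.find(rhs, i + 1)
    return found


def is_right_sentential(form, start="S"):
    """A form is right-sentential if it is the start symbol or has a handle."""
    return form == start or bool(handles(form, start))
```

For instance, `handles("cad")` returns `[("A", "a", 1)]`, which matches the derivation S ⇒ cAd ⇒ cad given above. The other questions on the slide can be explored the same way, for example with `handles("cabd")` and `handles("ab")`.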

For next time

We'll continue our study of bottom-up parsing. Read 4.5.