Configuration Sets for CSX- Lite. Parser Action Table

Similar documents
Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

Table-Driven Top-Down Parsers

How do LL(1) Parsers Build Syntax Trees?

LL(1) Grammars. Example. Recursive Descent Parsers. S A a {b,d,a} A B D {b, d, a} B b { b } B λ {d, a} D d { d } D λ { a }

Building A Recursive Descent Parser. Example: CSX-Lite. match terminals, and calling parsing procedures to match nonterminals.

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Wednesday, August 31, Parsers

Monday, September 13, Parsers

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12

UNIVERSITY OF CALIFORNIA

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

CA Compiler Construction

Conflicts in LR Parsing and More LR Parsing Types

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Context-free grammars

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Compiler Construction: Parsing

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Lecture 7: Deterministic Bottom-Up Parsing

Properties of Regular Expressions and Finite Automata

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

LR Parsing LALR Parser Generators

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CS415 Compilers. LR Parsing & Error Recovery

In One Slide. Outline. LR Parsing. Table Construction

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

Lecture 8: Deterministic Bottom-Up Parsing


shift-reduce parsing

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Optimizing Finite Automata

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Compiler Construction 2016/2017 Syntax Analysis

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

S Y N T A X A N A L Y S I S LR

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Compilers. Bottom-up Parsing. (original slides by Sam

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Lexical and Syntax Analysis. Bottom-Up Parsing

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

LR Parsing LALR Parser Generators

Compilers. Predictive Parsing. Alex Aiken

Types of parsing. CMSC 430 Lecture 4, Page 1

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

CS 314 Principles of Programming Languages

Compiler construction in4303 lecture 3

Compiler construction lecture 3

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

LR Parsing. Table Construction

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Top down vs. bottom up parsing

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

LR Parsing Techniques

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Syntax-Directed Translation

UNIT III & IV. Bottom up parsing

Some Basic Definitions. Some Basic Definitions. Some Basic Definitions. Language Processing Systems. Syntax Analysis (Parsing) Prof.

Building a Parser Part III


CSE 582 Autumn 2002 Exam Sample Solution

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

4. Lexical and Syntax Analysis

CSCI312 Principles of Programming Languages

Lexical and Syntax Analysis (2)

Lecture Bottom-Up Parsing

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

Defining syntax using CFGs

Concepts Introduced in Chapter 4

a. Determine the EPS, FIRST and FOLLOWs set for this grammar.

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

LALR Parsing. What Yacc and most compilers employ.

3. Parsing. Oscar Nierstrasz

Parsing Algorithms. Parsing: continued. Top Down Parsing. Predictive Parser. David Notkin Autumn 2008

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

UNIT-III BOTTOM-UP PARSING

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Principles of Programming Languages

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

4. Lexical and Syntax Analysis

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Chapter 3. Parsing #1

Transcription:

Configuration Sets for CSX- Lite State s 6 s 7 Cofiguration Set Prog { Stmts } Eof Stmts Stmt Stmts State s s Cofiguration Set Prog { Stmts } Eof Prog { Stmts } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr ) Stmt Stmt id = Expr ; s Expr Expr - id s 9 Stmt if ( Expr ) Stmt Expr Expr - id s Prog { Stmts } Eof s Prog { Stmts } Eof s Stmts Stmt Stmts Stmts Stmt Stmts Stmts λ Stmt if ( Expr ) Stmt s Stmt id = Expr ; s 5 Stmt if ( Expr ) Stmt Stmt id = Expr ; s s Expr id s Stmt if ( Expr ) Stmt 9 State Cofiguration Set r Action Table s s 5 s 6 Stmt id = Expr ; Expr Expr + id Expr Expr - id Stmt if ( Expr ) Stmt s 7 Stmt if ( Expr ) Stmt s Expr Expr + id We will table possible parser actions based on the current state (configuration set) and token. Given configuration set C and input token T four actions are possible: Reduce i: The i-th production has been matched. s 9 Expr Expr - id Shift: Match the current token. s Stmt if ( Expr ) Stmt Accept: is correct and complete. Error: A syntax error has been discovered.

We will let A[C][T] represent the possible parser actions given configuration set C and input token T. A[C][T] = {Reduce i i-th production is A α and A α is in C and T in Follow(A) } U (If (B β T γ is in C) {Shift} else φ) This rule simply collects all the actions that a parser might do given C and T. But we want parser actions to be unique so we require that the parser action always be unique for any C and T. If the parser action isn t unique, then we have a shift/reduce error or reduce/reduce error. The grammar is then rejected as unparsable. If parser actions are always unique then we will consider a shift of EOF to be an accept action. An empty (or undefined) action for C and T will signify that token T is illegal given configuration set C. A syntax error will be signaled. LALR r Driver Action Table for CSX-Lite Given the GoTo and parser action tables, a Shift/Reduce (LALR) parser is fairly simple: { S 5 6 7 9 5 6 7 9 void LALRDriver(){ Push(S ); } R S R R R R5 if S S R S R5 while(true){ //Let S = Top state on parse stack //Let CT = current token to match switch (A[S][CT]) { case error: SyntaxError(CT);return; case accept: return; case shift: push(goto[s][ct]); CT= Scanner(); break; case reduce i: //Let prod i = A Y...Y m pop m states; //Let S = new top state push(goto[s ][A]); break; } } } ( S ) R S R6 R7 id S S S S R S S S = S + S R S R6 R7 - S R S R6 R7 ; S R R6R7R5 eof A 5 6

GoTo Table for CSX-Lite 5 6 7 9 { } 6 if 5 5 5 ( 9 ) 7 id 9 = + 5 5-6 6 ; eof stmts 7 stmt expr 5 6 7 9 Example of LALR() Parsing We ll again parse { a = b + c; } Eof We start by pushing state on the parse stack. Prog { Stmts } Eof Shift {a=b + c; } Eof Prog { Stmts } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr ) Shift a = b + c; } Eof Stmt id = Expr ; = b + c; } Eof Stmt id = Expr ; Expr Expr - id Shift b + c; } Eof 7 5 Expr id Reduce + c; } Eof Stmt id = Expr ; Shift + c; } Eof Expr Expr + id Shift c; } Eof 5 Expr Expr + id Reduce 6 ; } Eof Stmt id = Expr ; Shift ; } Eof Stmt id = Expr ; Reduce } Eof 9

7 6 Stmts Stmt Stmts Reduce } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr ) Stmt Stmts Stmt Stmts Reduce } Eof Prog { Stmts } Eof Shift } Eof Prog { Stmts } Eof Accept Eof Error Detection in LALR rs In bottom-up, LALR parsers syntax errors are discovered when a blank (error) entry is fetched from the parser action table. Let s again trace how the following illegal CSX-lite program is parsed: { b + c = a; } Eof LALR is More Powerful Prog { Stmts } Eof Shift {b+c=a;}eof Prog { Stmts } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr ) Stmt id = Expr ; Shift Error (blank) b + c = a; } Eof +c=a;}eof Essentially all LL() grammars are LALR() plus many more. Grammar constructs that confuse LL() are readily handled. Common prefixes are no problem. Since sets of configurations are tracked, more than one prefix can be followed. For example, in Stmt id = Expr ; Stmt id ( Args ) ; after we match an id we have Stmt id = Expr ; Stmt id ( Args ) ; The next token will tell us which production to use.

Left recursion is also not a problem. Since sets of configurations are tracked, we can follow a left-recursive production and all others it might use. For example, in we can first match an id: Expr id Then the Expr is recognized: But ambiguity will still block construction of an LALR parser. Some shift/reduce or reduce/ reduce conflict must appear. (Since two or more distinct parses are possible for some input). Consider our original productions for if-then and if-then-else statements: Stmt if ( Expr ) Stmt Stmt if ( Expr ) Stmt else Stmt Since else can follow Stmt, we have an unresolvable shift/reduce conflict. The left-recursion is handled! 5 6 Grammar Engineering Though LALR grammars are very general and inclusive, sometimes a reasonable set of productions is rejected due to shift/reduce or reduce/reduce conflicts. In such cases, the grammar may need to be engineered to allow the parser to operate. A good example of this is the definition of MemberDecls in CSX. A straightforward definition is MemberDecls FieldDecls MethodDecls FieldDecls FieldDecl FieldDecls FieldDecls λ MethodDecls MethodDecl MethodDecls MethodDecls λ FieldDecl int id ; MethodDecl int id ( ) ; Body When we predict MemberDecls we get: MemberDecls FieldDecls MethodDecls FieldDecls FieldDecl FieldDecls FieldDecls λ FieldDecl int id ; Now int follows FieldDecls since MethodDecls + int... Thus an unresolvable shift/reduce conflict exists. The problem is that int is derivable from both FieldDecls and MethodDecls, so when we see an int, we can t tell which way to parse it (and FieldDecls λ requires we make an immediate decision!). 7

If we rewrite the grammar so that we can delay deciding from where the int was generated, a valid LALR parser can be built: MemberDecls FieldDecl MemberDecls MemberDecls MethodDecls MethodDecls MethodDecl MethodDecls MethodDecls λ FieldDecl int id ; MethodDecl int id ( ) ; Body When MemberDecls is predicted we have Now Follow(MethodDecls) = Follow(MemberDecls) = }, so we have no shift/reduce conflict. After int id is matched, the next token (a ; or a ( ) will tell us whether a FieldDecl or a MethodDecl is being matched. MemberDecls FieldDecl MemberDecls MemberDecls MethodDecls MethodDecls MethodDecl MethodDecls MethodDecls λ FieldDecl int id ; MethodDecl int id ( ) ; Body 9