LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

Similar documents
Lexical and Syntax Analysis. Bottom-Up Parsing

Conflicts in LR Parsing and More LR Parsing Types

LR Parsing LALR Parser Generators

shift-reduce parsing

Bottom-Up Parsing. Lecture 11-12

In One Slide. Outline. LR Parsing. Table Construction

LR Parsing LALR Parser Generators

CS 4120 Introduction to Compilers

Bottom-Up Parsing. Lecture 11-12

S Y N T A X A N A L Y S I S LR

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

LR Parsing. Table Construction

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MODULE 14 SLR PARSER LR(0) ITEMS

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Monday, September 13, Parsers

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

UNIT III & IV. Bottom up parsing

More Bottom-Up Parsing

Wednesday, August 31, Parsers

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Compilers. Bottom-up Parsing. (original slides by Sam

Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing Techniques

Lecture 8: Deterministic Bottom-Up Parsing

How do LL(1) Parsers Build Syntax Trees?

Lecture 7: Deterministic Bottom-Up Parsing

Lecture Bottom-Up Parsing

Chapter 4. Lexical and Syntax Analysis

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Context-free grammars

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Principles of Programming Languages

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Compiler Construction: Parsing

LR Parsing - The Items

Bottom-Up Parsing II. Lecture 8

Downloaded from Page 1. LR Parsing

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

LALR Parsing. What Yacc and most compilers employ.

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

LR Parsing - Conflicts

Bottom-Up Parsing LR Parsing

Configuration Sets for CSX- Lite. Parser Action Table

4. Lexical and Syntax Analysis

FROWN An LALR(k) Parser Generator

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

4. Lexical and Syntax Analysis

Compiler Construction 2016/2017 Syntax Analysis

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Bottom-up Parser. Jungsik Choi

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

CS415 Compilers. LR Parsing & Error Recovery

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Building a Parser Part III

CS453 : Shift Reduce Parsing Unambiguous Grammars LR(0) and SLR Parse Tables by Wim Bohm and Michelle Strout. CS453 Shift-reduce Parsing 1

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

Tasks of the Tokenizer

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Zhizheng Zhang. Southeast University

Syntax-Directed Translation

Chapter 3: Lexing and Parsing

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution

Formal Languages and Compilers Lecture VI: Lexical Analysis

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412


Context-Free Grammars and Parsers. Peter S. Housel January 2001

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

Lecture Notes on Bottom-Up LR Parsing

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

CS143 Handout 14 Summer 2011 July 6 th, LALR Parsing

CS606- compiler instruction Solved MCQS From Midterm Papers

Lecture Notes on Bottom-Up LR Parsing

Syntax Analysis, VII One more LR(1) example, plus some more stuff. Comp 412 COMP 412 FALL Chapter 3 in EaC2e. target code.

I 1 : {E E, E E +E, E E E}

Compiler Construction

LR Parsing Techniques


Transcription:

LALR parsing 1

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the decisions without using a look ahead. Sometimes, a look ahead of 1 is required. Most parser generators (and in particular Bison and Yacc) construct LALR parsers. In LALR parsing, a deterministic finite automaton is used for determining when reductions have to be made. The deterministic finite automaton is usually called prefix automaton. On the following slides, I will explain how it is constructed. 2

Items Let G = (Σ, R, S) be a grammar. Definition Let σ Σ, w 1, w 2 Σ. If σ w 1 w 2 R, then σ w 1.w 2 is called an item. An item is a rule with a dot added somewhere in the right hand side. The intuitive meaning of an item σ w 1.w 2 is that w 1 has been read, and if w 2 is also found, then rule σ w 1 w 2 can be reduced. 3

Items Let a bbc be a rule. The following items can be constructed from this rule: a.bbc, a b.bc, a bb.c, a bbc. For a given grammar G, the set of possible items is finite. 4

Operations on Itemsets (1) Definition: An itemset is a set of items. Because for a given grammar, there exists only a finite set of possible items, the set of itemsets is also finite. Let I be an itemset. The closure CLOS(I) of I is defined as the smallest itemset J, s.t. I J, If σ w 1.Aw 2 J, and there exists a rule A v R, then A.v J. 5

Operations on Itemsets (2) Let I be an itemset, let α Σ be a symbol. The set TRANS(I, α) is defined as {σ w 1 α.w 2 σ w 1.αw 2 I }. 6

The Prefix Automaton Let G = (Σ, R, S) be a grammar. The prefix automaton of G is the deterministic finite automaton A = (Σ, Q, Q s, Q a, δ), that is the result of the following algorithm: Start with A = (Σ, {CLOS(I)}, {CLOS(I)},, ), where I = {Ŝ.S #}, Ŝ Σ is a new start symbol, S is the original start symbol of G, and # Σ is the EOF symbol. As long as there exists an I Q, and a σ Σ, s.t. I = CLOS(TRANS(I, σ)) Q, put Q := Q {I }, δ := δ {(I, σ, I )}. As long as there exist I, I Q, and a σ Σ, s.t. I = CLOS(TRANS(I, σ)), and (I, σ, I ) δ, put δ := δ {(I, σ, I )}. 7

The Prefix Automaton (2) The prefix automaton can be big, but it can be easily computed. Every context-free language has a prefix automaton, but not every language can be parsed by an LALR parser, because of the look ahead sets. 8

Parse Algorithm (1) std::vector< state > states; // Stack of states of the prefix automaton. std::vector< token > tokens; // We assume that a token has attributes, so // we don t encode them separately. std::dequeue< token > lookahead; // Will never be longer than one. states. push_back( q0 ); // The initial state. while( true ) { 9

Parse Algorithm (2) decision = unknown; state topstate = states. back( ); if(topstate has only one reduction R and no shifts) decision = reduce(r); // We know for sure that we need lookahead: if( decision == unknown && lookahead. size( ) == 0 ) { lookahead. push_back( inputstream. readtoken( )); } 10

Parse Algorithm (3) if( lookahead. front( ) == EOF ) { if( topstate is an accepting state ) return tokens. back( ); else return error, unexpected end of input. } 11

Parse Algorithm (4) if( decision == unknown && topstate has only one reduction R with lookahead. front( ) && no shift is possible with lookahead. front( )) { decision = reduce(r); } if( decision == unknown && topstate has only a shift Q with lookahead. front( ) && no reduction is possible with lookahead. front()) { decision = shift(q); } 12

Parse Algorithm (5) if( decision == unknown ) { // Either we have a conflict, or the parser is // stuck. if( no reduction/no shift is possible ) print error message, try to recover. 13

Parse Algorithm (6) // A conflict can be shift/reduce, or // reduce/reduce: Let R, from the set of possible reductions, (taking into account lookahead. front( )), be the rule with the smallest number. } decision = reduce(r); 14

Parse Algorithm (7) if( decision == push(q)) { states. push_back( Q ); tokens. push_back( lookahead. front( )); lookahead. pop_front( ); } else { // decision has form reduce(r) unsigned int n = the length of the rhs of R. 15

Parse Algorithm (8) token lhs = compute_lhs( R, tokens. begin( ) + tokens. size( ) - n, tokens. begin( ) + tokens. size( )); // this also computes the attribute. for( unsigned int i = 0; i < n; ++ i ) { states. pop_back( ); tokens. pop_back( ); } 16

Parse Algorithm (9) // The shift of the lhs after a reduction is // also called goto topstate = states. back( ); state newstate = the state reachable from topstate under lhs. } } states. push_back( newstate ); tokens. push_back( lhs ); // Unreachable. 17

Lookahead Sets We already have seen lookahead sets in action. If a state has more than one reduction, or a reduction and a shift, the parser looks at the lookahead symbol, in order to decide what to do next. LA(I, σ w) Σ is defined a set of tokens. If the parser is in state I, and the lookahead LA(I, σ w), then the parser can reduce σ w. When should a token σ be in LA(I, σ w)? 18

Lookahead Sets (2) Definition: s LA(I, σ w) if 1. σ w. I (obvious) 2. There exists a correct input word w 1 s w 2 #, such that 3. The parser reaches a state with state stack (...,I) and token stack (...,w), the lookahead (of the parser) is s, and 4. the parser can reduce the rule σ w, after which 5. it can read the rest of the input w 2 and reach an accepting state. 19

Computing Look Ahead Sets For every rule A w of the grammar G, such that there exist states I 1, I 2, I 3, s.t. A.w I 1, A w. I 2, there exists a path from I 1 to I 2 in the prefix automaton using w, and there is a transition from I 1 to I 3 based on A, the following must hold: For every symbol σ Σ, for which a transition from I 3 to some other state is possible in the prefix automaton, σ LA(I 2, A w.). For every item of form B v. I 3, LA(I 3, B v.) LA(I 2, A w.) Compute the LA as the smallest such sets. 20

Computing Look Ahead Sets (2) Example S Aa, A B, A Bb, B C, B Cc, C d. 21

The algorithm on the previous slides can sometimes compute too big look ahead sets. You will see this in the exercises. 22

Computing the Correct Sets I don t want to say much about this, because it is complicated. Definition: An LR(1)-item has form σ w 1.w 2 /s, where σ w 1 w 2 is a rule of the grammar, and s S. STEP remains the same. CLOS has to be modified. 23