Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Similar documents
Compilers. Predictive Parsing. Alex Aiken

CS502: Compilers & Programming Systems

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Types of parsing. CMSC 430 Lecture 4, Page 1

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5

CA Compiler Construction

Table-Driven Parsing

Top down vs. bottom up parsing

Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1. Top-Down Parsing. Lect 5. Goutam Biswas

CS 132 Compiler Construction, Fall 2012 Instructor: Jens Palsberg Multiple Choice Exam, Nov 6, 2012

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

CS 321 Programming Languages and Compilers. VI. Parsing

CS Parsing 1

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Monday, September 13, Parsers

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

CSCI312 Principles of Programming Languages

CSC 4181 Compiler Construction. Parsing. Outline. Introduction

3. Parsing. Oscar Nierstrasz

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

LL parsing Nullable, FIRST, and FOLLOW

Chapter 3. Parsing #1

Abstract Syntax Trees & Top-Down Parsing

Syntactic Analysis. Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

CSE431 Translation of Computer Languages

Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

Extra Credit Question


Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam


Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Lexical and Syntax Analysis. Top-Down Parsing

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Wednesday, August 31, Parsers

4 (c) parsing. Parsing. Top down vs. bo5om up parsing

PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering

Introduction to parsers

Automatic generation of LL(1) parsers

Outline. Top Down Parsing. SLL(1) Parsing. Where We Are 1/24/2013

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Bottom-Up Parsing. Parser Generation. LR Parsing. Constructing LR Parser

Lexical and Syntax Analysis (2)

shift-reduce parsing

More Bottom-Up Parsing

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compila(on (Semester A, 2013/14)

The procedure attempts to "match" the right hand side of some production for a nonterminal.

Lexical and Syntax Analysis

Language Processing note 12 CS

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Lecture Notes on Top-Down Predictive LL Parsing

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

CS308 Compiler Principles Syntax Analyzer Li Jiang

Parsing III. (Top-down parsing: recursive descent & LL(1) )

CS 536 Midterm Exam Spring 2013

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

3. Parsing. 3.1 Context-Free Grammars and Push-Down Automata 3.2 Recursive Descent Parsing 3.3 LL(1) Property 3.4 Error Handling

Fall Compiler Principles Lecture 3: Parsing part 2. Roman Manevich Ben-Gurion University

Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky

LL(1) Grammars. Example. Recursive Descent Parsers. S A a {b,d,a} A B D {b, d, a} B b { b } B λ {d, a} D d { d } D λ { a }

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Principles of Programming Languages

Compiler construction in4303 lecture 3

Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

ECE 468/573 Midterm 1 September 30, 2015

Topdown parsing with backtracking

CS453 : Shift Reduce Parsing Unambiguous Grammars LR(0) and SLR Parse Tables by Wim Bohm and Michelle Strout. CS453 Shift-reduce Parsing 1

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Recursive Descent Parsers

Configuration Sets for CSX- Lite. Parser Action Table

COMP 181. Prelude. Next step. Parsing. Study of parsing. Specifying syntax with a grammar

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Parsing Part II (Top-down parsing, left-recursion removal)

CS143 Midterm Sample Solution Fall 2010

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

CS143 Handout 08 Summer 2009 June 30, 2009 Top-Down Parsing

ECE 468/573 Midterm 1 October 1, 2014

LR Parsing. The first L means the input string is processed from left to right.

Chapter 4. Lexical and Syntax Analysis

Syn S t yn a t x a Ana x lysi y s si 1

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

Transcription:

LL(k) Grammars

We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and non-terminal symbols and any non-terminal symbol A, we say A = * > g if we can derive g from A. In English we say A derives g. For any non-terminal symbol First k (A) is the set of First k ( a) of all terminal strings A derives. Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

For any two sets S1 and S2 of strings of terminal symbols S1 k S2 is the set of prefixes of length k of all of the strings you an get by concatenating a string from S1 with a string from S2.

For any non-terminal symbol A, Follow k (A) is the set of prefixes of length k of all the terminal strings that could come after strings generated from symbol A. To be precise Follow k (A) = {FIrst k (x) S=*> wax any strings w and x} We say a grammar is LL(k) if for every non-terminal symbol a and every pair of rules A ::= a b we have [First k (a) k Follow k (A)] [First k (b) k Follow k (A)] = The idea is that by looking ahead k token we can decide if we should use the rule A ::= a or the rule A::= b. We will see algorithms for generating the First k and Follow k sets.

For now, suppose we know how to generate these sets. If the grammar is LL(k) we can build a table-driven parser for it. Assume first that k=1. We build a table whose columns are indexed by tokens and whose rows are indexed by the nonterminal symbols of the grammar. The entries of the table are the grammar rules: 1. If A ::=a is a grammar rule and a is in First(a), then Table[A, a] is the rule A ::= a. 2. If A ::= a is a rule, and First(a) contains the empty string, and if b is in Follow(A), then Table[A, b] is the rule A ::= a. 3. If A ::= a is a rule, and First(a) contains the empty string, and if EOF is in Follow(A), then Table[A, EOF] is the rule A ::= a.

To use the table we maintain a stack that contains the current sentential form (string of symbols derived from the Start symbol): 1. Begin by pushing the Start symbol on an empty stack. 2. If the current token is on top of the stack, pop the stack and get the next token. 3. If there is a non-terminal symbol A on top of the stack and the current token is a, and if there is a rule in Table[A, a], pop the A off the stack and push on the right side of the rule from the table. 4. In (3), if there is no rule in Table[A, a] issue an error. 5. The stack should be empty when you reach the EOF token.

Example. S ::= if (C) S fi if (C) S else S fi a C ::= b We would need too many tokens to disambiguate the two if-rules, so we "factor" the grammar into S ::= if (C) S S' a S' ::= fi else S fi C ::= b This is actually 5 rules: (P1) S ::= if (C) S S' First( if (C) S S') = { if } (P2) S ::= a First( a ) = { a } (P3) S ' ::= fi First(fi) = { fi } (P4) S' ::= else S fi First(else S fi ) = { else } (P5) C ::= b First( b ) = { b }

This gives the following parse table: if ( ) fi else a b EOF S P1 P2 S' P3 P4 C P5 It is easy to walk through parsing an expression like if (b) if (b) a else a fi fi

Note that if we had the more familiar grammar S ::= if (C) S if (C) S else S a C ::= b We would factor it into (P1) S ::= if (C) S S' First( if (C) S S') = { if } (P2) S ::= a First( a ) = { a } (P3) S ' ::= else S First(else S) = { else } (P4) S' ::= e First(e ) Follow(S')= { else, EOF } (P5) C ::= b First( b ) = { b } There is an ambiguity between P3 and P4 on the token else.

This time the table is if ( ) else a b EOF S P1 P2 S' P3/P4 P4 C P5 We can effectively "disambiguate" the language by choosing which rule we use for Table[S', else]. The standard choice is to use P3, which puts a nested else with the nearest possible if.

We can build parse trees with this parser. Each time we pop a nonterminal string off the stack, we replace it by a tree node pointing at the right-side elements that are on the stack. We have built the parse table for an LL(1) grammar. An LL(k) table is exactly the same, only the columns are indexed by strings of k tokens. The primary rule for building the table is: If A ::= a is a grammar rule and w is in [First k (a) k Follow k (A)], then Table[A, w] is the rule A ::= a.

Left-recursive rules like E ::= E+T T are not LL(k) for any k. Here is a way to eliminate them. This is called "left factoring". Example: Consider the grammar E ::= E + T T T ::= T * F F F ::= id We factor it into E ::= T E' E' ::= + T E' e T ::= F T' T' ::= * F T' e F ::= id This is completely unambiguous. Unfortunately, it rules right-associative trees, but we can deal with that as we complete the tree.

Algorithm to find the First 1 sets (you can generalize to First k ) Step I: First compute First(X) for every individual symbol X 1. If x is a terminal symbol, First(x) = {x} 2. For the non-terminal symbols X, start with First(X) empty for every X. Apply the following rules until nothing changes: a) If X ::= e is a rule, add e to First(X). b) If X ::= Y 1 Y 2..Y n is a rule and e is in First(Y 1 ).. First(Y j ), then add all the symbols of First(Y j+1 ) to First(X). c) If X ::= Y 1 Y 2..Y n is a rule and e is in every First(Y i ), then add e to First(X).

Step 2: Now find the First set for the right hand side of every rule: Suppose X ::= X 1 X 2..X n is a grammar rule 1. Start with First(X 1 X 2..X n ) = First(X 1 ) 2. Let i be the smallest index so e is not in First(X i ). Then all of the symbols in First(X j ) for j <= i are in First(X 1 X 2..X n ). 3. If e is in all of the First(X i ) then it is also in First(X 1 X 2..X n ).

Algorithm: To compute Follow(X) for every non-terminal symbol X: 1. Include EOF in Follow(Start). 2. If there is a rule X ::= abb, include First(b)-e in Follow(B). Now apply the following rules until nothing changes: 3. If X ::= ab is a grammar rule, then include all of Follow(X) in Follow(B). 4. If X ::= abb is a grammar rule and e is in First(b), then include all of Follow(X) in Follow(B).