LL(1) predictive parsing

LL(1) predictive parsing
Informatics 2A: Lecture 11

John Longley
School of Informatics
University of Edinburgh
jrl@staffmail.ed.ac.uk

13 October, 2011

Outline

1. LL(1) grammars and parse tables

LL(1) grammars: the intuition

Think about a (one-state) NPDA derived from a CFG G. Intuition: at each step, our NPDA can see two things:
- the current input symbol
- the topmost stack symbol

Roughly speaking, we say G is an LL(1) grammar if, just from this information, it's possible to determine which transition/production to apply next. (Precise definition later!) If this is so, parsing can proceed deterministically and efficiently.

Here "LL(1)" means: read the input from Left to right, build a Leftmost derivation, looking just 1 symbol ahead.

Subtle point: we can use the input symbol to help us choose a transition, even if this transition doesn't consume that input symbol... hence "look ahead".

Parse tables

Saying "the current input symbol and stack symbol uniquely determine the production" means we can draw up a two-dimensional parse table telling us which production to apply in any situation.

Consider e.g. the following grammar for well-bracketed sequences:

    S → ɛ | TS
    T → (S)

This has the following parse table:

         (           )         $
    S    S → TS      S → ɛ     S → ɛ
    T    T → (S)

- Columns are labelled by input symbols. We include an extra column for an end-of-input marker $.
- Rows are labelled by non-terminal stack symbols. If the current stack symbol is a terminal, we just match it against the current input symbol.
- Blank entries correspond to situations that can never arise in processing a legal input string.
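
As a concrete illustration (not part of the original slides), such a parse table can be stored as a dictionary keyed by (non-terminal, lookahead) pairs. The names PARSE_TABLE and NONTERMINALS below are my own; a minimal Python sketch for the bracket grammar:

    # Sketch of the parse table for:  S -> epsilon | T S ,  T -> ( S )
    # Keys are (non-terminal, lookahead); values are the right-hand side to push.
    # "$" stands for the end-of-input marker; epsilon is the empty tuple.
    PARSE_TABLE = {
        ("S", "("): ("T", "S"),        # S -> T S
        ("S", ")"): (),                # S -> epsilon
        ("S", "$"): (),                # S -> epsilon
        ("T", "("): ("(", "S", ")"),   # T -> ( S )
    }
    NONTERMINALS = {"S", "T"}

Blank entries of the table simply correspond to keys that are absent from the dictionary.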

Predictive parsing with parse tables

Given such a parse table, parsing can be done very efficiently using a stack. The stack (reading downwards) records the predicted form for the remaining part of the input.

- Begin with just the start symbol S on the stack.
- If the current input symbol is a (maybe $), and the current stack symbol is a non-terminal X, look up the rule for a, X in the table. [Error if no rule.] If the rule is X → β, pop X and replace it with β (pushed end-first!).
- If the current input symbol is a and the stack symbol is a, just pop a from the stack and advance the input read position. [Error if the stack symbol is any other terminal.]
- Accept if the stack empties with $ as the input symbol. [Error if the stack empties sooner.]
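
A minimal Python sketch of this table-driven loop, assuming the PARSE_TABLE and NONTERMINALS names from the earlier illustration (the function name ll1_parse is likewise an assumption, not something defined in the course):

    def ll1_parse(tokens, table, nonterminals, start="S"):
        """Table-driven LL(1) parser sketch: True iff the token list is accepted.

        `tokens` should not include the end marker; "$" is appended here.
        """
        input_syms = list(tokens) + ["$"]
        pos = 0
        stack = [start]                       # top of stack = last element
        while stack:
            top = stack.pop()
            lookahead = input_syms[pos]
            if top in nonterminals:
                rule = table.get((top, lookahead))
                if rule is None:              # blank table entry: syntax error
                    return False
                stack.extend(reversed(rule))  # push right-hand side end-first
            elif top == lookahead:            # terminal on stack matches input
                pos += 1
            else:
                return False                  # terminal mismatch: syntax error
        return input_syms[pos] == "$"         # accept only if stack empties at $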

Example of predictive parsing

         (           )         $
    S    S → TS      S → ɛ     S → ɛ
    T    T → (S)

Let's use this table to parse the input string (()).

    Operation       Remaining input    Stack state
                    (())$              S
    Lookup (, S     (())$              TS
    Lookup (, T     (())$              (S)S
    Match (         ())$               S)S
    Lookup (, S     ())$               TS)S
    Lookup (, T     ())$               (S)S)S
    Match (         ))$                S)S)S
    Lookup ), S     ))$                )S)S
    Match )         )$                 S)S
    Lookup ), S     )$                 )S
    Match )         $                  S
    Lookup $, S     $                  (empty stack)

(Also easy to build a syntax tree as we go along!)
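
Using the hypothetical ll1_parse sketch above, this run (and a failing one) can be reproduced programmatically:

    print(ll1_parse(list("(())"), PARSE_TABLE, NONTERMINALS))  # True: accepted
    print(ll1_parse(list("(()"),  PARSE_TABLE, NONTERMINALS))  # False: ')' expected but end of input reached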

Clicker questions

         (           )         $
    S    S → TS      S → ɛ     S → ɛ
    T    T → (S)

For each of the following two input strings:

    )
    (

what will go wrong when we try to apply the predictive parsing algorithm?

1. Blank entry in table encountered
2. Input symbol (or end marker) doesn't match expected symbol
3. Stack empties before end of string reached

Further remarks

Slogan: the parse table entry for a, X tells us which rule to apply if we're expecting an X and see an a.

Often, the a will simply be the first symbol of the X-subphrase in question. But not always: maybe the X-subphrase in question is ɛ, and the a belongs to whatever follows the X. E.g. in the lookups for ), S on the previous slide, the S in question turns out to be empty.

Once we've got a parse table for a given grammar G, we can parse strings of length n in O(n) time (and O(n) space).

Our algorithm is an example of a top-down predictive parser: it works by predicting the form of the remainder of the input, and it builds syntax trees in a top-down way (i.e. starting from the root). There are other parsers (e.g. LR(1)) that work bottom-up.

LL(1) grammars: formal definition

Suppose G is a CFG containing no useless nonterminals, i.e.
- every nonterminal appears in some sentential form derived from the start symbol;
- every nonterminal can be expanded to some (possibly empty) string of terminals.

We say G is LL(1) if for each terminal a and nonterminal X, there is some production X → α with the following property:

    If b_1 ... b_n X γ is a sentential form appearing in a leftmost derivation of some string b_1 ... b_n a c_1 ... c_m (n, m ≥ 0), then the next sentential form appearing in the derivation is necessarily b_1 ... b_n α γ.

Note that if a, X corresponds to a blank entry in the table, any production X → α will satisfy this property, because a sentential form b_1 ... b_n X γ can't arise.
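
Restating the condition in symbols (my own notation, not taken from the slides; ⇒_lm denotes a leftmost derivation step):

    % For every terminal a and nonterminal X there is a production X -> alpha
    % such that any leftmost derivation reaching X with lookahead a must use it:
    \[
      \big( S \Rightarrow_{lm}^{*} b_1 \cdots b_n X \gamma
            \Rightarrow_{lm}^{*} b_1 \cdots b_n \, a \, c_1 \cdots c_m \big)
      \;\Longrightarrow\;
      \big( b_1 \cdots b_n X \gamma \Rightarrow_{lm} b_1 \cdots b_n \alpha \gamma \big)
      \qquad (n, m \ge 0).
    \]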

Non-LL(1) grammars

Roughly speaking, a grammar is by definition LL(1) if and only if there's a parse table for it. Not all CFGs have this property!

Consider e.g. a different grammar for the same language of well-bracketed sequences:

    S → ɛ | (S) | SS

Suppose we'd set up our initial stack with S, and we see the input symbol (. What rule should we apply?
- If the input string is (()), we should apply S → (S).
- If the input string is ()(), we should apply S → SS.
We can't tell without looking further ahead.

Put another way: if we tried to build a parse table for this grammar, the two rules S → (S) and S → SS would be competing for the slot (, S. So this grammar isn't LL(1).
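
To make the conflict concrete (an illustration of mine, not from the slides): if we tried to record both candidate rules in the dictionary representation used earlier, the second insertion would clash with the first at the key ("S", "("):

    # Hypothetical conflict check for the grammar  S -> epsilon | (S) | SS
    candidate_rules = [
        ("S", "(", ("(", "S", ")")),   # S -> ( S )
        ("S", "(", ("S", "S")),        # S -> S S
    ]

    table = {}
    for nonterminal, lookahead, rhs in candidate_rules:
        key = (nonterminal, lookahead)
        if key in table and table[key] != rhs:
            print(f"LL(1) conflict at {key}: {table[key]} vs {rhs}")
        else:
            table[key] = rhs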

Remaining issues

Easy to see from the definition that any LL(1) grammar will be unambiguous: we never have two syntax trees for the same string.
- For computer languages, this is fine: we normally want to avoid ambiguity anyway.
- For natural languages, ambiguity is a fact of life! So LL(1) grammars are less suitable here.

Two outstanding questions...
- How can we tell if a grammar is LL(1), and if it is, how can we construct a parse table? (See Lecture 12.)
- If a grammar isn't LL(1), is there any hope of replacing it by an equivalent one that is? (See Lecture 13.)

Reading and prospectus

Relevant reading:
- Some lecture notes from a previous year (covering the same material but with different examples) are available via the course website.
- See also Aho, Sethi and Ullman, Compilers: Principles, Techniques, and Tools, Section 4.4.