LL Parsing

1 Introduction

The LL parsing that is provided in JFLAP is what is formally referred to as LL(1) parsing. Grammars that can be parsed using this algorithm are called LL grammars, and they form a subset of the grammars that can be represented using deterministic pushdown automata. The key properties of the LL parsing algorithm are: the first L means the input string is processed from left to right; the second L means the derivation will be a leftmost derivation (the leftmost variable is replaced at each step); and the 1 means that one symbol of the input string is used to help guide the parse.

It is a top-down parser, which means you start your parsing from the start symbol and then consider the production rules that can be used to build up the string in a top-down manner. Contrast this to the SLR parser, which works in a bottom-up fashion.

2 Building the FIRST and FOLLOW sets

The example we will do is to parse a simple language of matched parentheses (note that our LR parsing tutorial gives a similar walkthrough). Enter the following grammar into JFLAP or load the file LLParsing Parens.jff:

S → (S)S
S → ε

LL parsing works by first creating the sets called FIRST and FOLLOW for each of the non-terminals (aside from the special start variable that is added in). The FIRST set of a variable contains the terminals that can possibly appear first in a derivation that starts from that variable. The FOLLOW set consists of the terminals that can appear immediately after the variable in question. In JFLAP, when you load the grammar and hit LL parse, you have to compute the FIRST and FOLLOW sets first. If you are incorrect in figuring out the FIRST sets for any variable, JFLAP highlights the rows where there is an error. Also remember that the exclamation point (!) represents the empty string in this setting.
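The FIRST/FOLLOW computation that JFLAP asks you to perform by hand can be sketched as a small fixpoint iteration. This is an illustrative sketch, not JFLAP's own code; the names (`EPS`, `first_of_string`, and so on) are made up for the example, and `!` stands for the empty string as it does in JFLAP.

```python
# Illustrative sketch (not JFLAP's code) of the FIRST/FOLLOW fixpoint
# for the grammar S -> (S)S | !, with ! standing for the empty string.
EPS = '!'
productions = {'S': [['(', 'S', ')', 'S'], []]}  # [] is the epsilon rule
nonterminals = set(productions)

def first_of_string(symbols, first):
    """FIRST of a string of terminals and non-terminals."""
    result = set()
    for X in symbols:
        f = first[X] if X in nonterminals else {X}
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)  # every symbol in the string can derive the empty string
    return result

first = {A: set() for A in nonterminals}
follow = {'S': {'$'}}  # $ marks the end of the input
changed = True
while changed:
    changed = False
    for A, rules in productions.items():
        for rhs in rules:
            f = first_of_string(rhs, first)
            if not f <= first[A]:
                first[A] |= f
                changed = True
            for i, X in enumerate(rhs):
                if X not in nonterminals:
                    continue
                tail = first_of_string(rhs[i + 1:], first)
                extra = (tail - {EPS}) | (follow[A] if EPS in tail else set())
                if not extra <= follow[X]:
                    follow[X] |= extra
                    changed = True

print(sorted(first['S']))   # ['!', '(']
print(sorted(follow['S']))  # ['$', ')']
```

The loop simply re-applies the defining rules for FIRST and FOLLOW until nothing changes, which is exactly the computation you verify interactively in JFLAP.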

In our example, since S is the only non-terminal, it is easy to see that the FIRST set for S will be {(, !}. As far as the FOLLOW set goes, a closing ) can come immediately after an S, and the end-of-input symbol $ follows the start variable, so the FOLLOW set is {), $}. At the end, this should be what you have for your FIRST and FOLLOW sets for this example.

The next step is building the all-important parse table. There are two main rules for filling in the table. To understand those rules, it is important to define FIRST(w), where w is a string. For any string w = X1 X2 ... Xn (a string of terminals and non-terminals), FIRST(w) is defined as follows:

1. FIRST(w) contains FIRST(X1) − {ε}.
2. For each i = 2, ..., n: if FIRST(Xk) contains ε for all k = 1, ..., i − 1, then FIRST(w) contains FIRST(Xi) − {ε}.
3. If every single Xi has ε in its FIRST set, then ε will be in FIRST(w).

Given that definition of FIRST(w) for any string w, we can build up the parse table as follows:

1. For a production A → w, for each terminal in FIRST(w), put w as an entry in the A row under the column for that terminal.
2. For a production A → w, if the empty string is in FIRST(w), put w as an entry in the A row under the columns for all symbols in FOLLOW(A).

Applying these two rules to our simple grammar, we obtain the following parse table.
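The two filling rules translate directly into code. Here is a minimal sketch (with made-up names, and with FIRST and FOLLOW for S → (S)S | ε taken as given from the text); the table maps (row, column) pairs to right-hand sides:

```python
# Sketch of the two table-filling rules for S -> (S)S | !,
# assuming FIRST(S) = {(, !} and FOLLOW(S) = {), $} as in the text.
EPS = '!'
productions = [('S', ['(', 'S', ')', 'S']), ('S', [])]
first = {'S': {'(', EPS}}
follow = {'S': {')', '$'}}

def first_of(rhs):
    out = set()
    for X in rhs:
        f = first.get(X, {X})  # a terminal is its own FIRST set
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

table = {}  # (nonterminal, terminal) -> right-hand side to use
for A, rhs in productions:
    f = first_of(rhs)
    for t in f - {EPS}:       # rule 1: terminals in FIRST(w)
        table[(A, t)] = rhs
    if EPS in f:              # rule 2: every symbol in FOLLOW(A)
        for t in follow[A]:
            table[(A, t)] = rhs

print(table[('S', '(')])  # ['(', 'S', ')', 'S']
print(table[('S', ')')])  # [] (the epsilon rule)
```

The resulting three entries are exactly the cells you fill in by hand in JFLAP: (S)S in the ( column, and the ε-rule in the ) and $ columns.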

At this point you are ready to go to the next step of building the parser. So click next. If you have done everything correctly, JFLAP will display a congratulatory message.

3 The actual parsing!

Click on parse, and let us try to parse the simple string of matched parentheses, (()). In the first step, we just mark the end of the string with a $. The parsing is done in a top-down fashion, so the first step is to begin with the start symbol S; the start symbol S will also be on the top of the stack. The next step is to look at the next character in the input, which is (, and then use the parse table to see which rule gets applied.

Applying the rule means that the variable on the top of the stack gets popped off. The stack stores variables and terminals from the right sides of productions: variables are popped off the stack and replaced by the right side of a production, and terminals are popped off and matched in the order they appear in the string. In our case the right side is pushed in reverse order, so the first symbol to go onto the stack is the trailing S of (S)S. That also means the derivation tree changes to add a node corresponding to S.
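The pop/replace/match loop that the following screenshots step through one move at a time can be sketched compactly. This is an illustrative sketch (the function name `ll1_parse` is invented, not a JFLAP routine), hard-coding the parse table from the previous section:

```python
# Compact sketch of the table-driven loop the screenshots walk through,
# using the parse table for S -> (S)S | epsilon built earlier.
table = {
    ('S', '('): ['(', 'S', ')', 'S'],
    ('S', ')'): [],   # S -> epsilon
    ('S', '$'): [],   # S -> epsilon
}
NONTERMINALS = {'S'}

def ll1_parse(text, start='S'):
    tokens = list(text) + ['$']   # mark the end of the string with $
    stack = [start]               # the start symbol begins on the stack
    pos = 0
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            rhs = table.get((top, tokens[pos]))
            if rhs is None:
                return False      # empty table cell: syntax error
            stack.extend(reversed(rhs))  # push the right side in reverse
        elif top == tokens[pos]:
            pos += 1              # terminal on the stack matches the input
        else:
            return False          # terminal mismatch
    return tokens[pos] == '$'     # accepted once all input is consumed

print(ll1_parse('(())'))  # True
print(ll1_parse('(()'))   # False
```

Each iteration is one JFLAP step: a pop-and-replace when a variable is on top, or a match when a terminal is on top.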

The next step adds more of the right side of the production to the stack.

In two more steps, the stack has accommodated the entire right side of the rule and the derivation tree has gained its first level.

The next step is to observe the top of the stack, match the ( with the ( at the start of the remaining string, and remove both of them (we have effectively derived the start of the string).

JFLAP also puts a message in the bottom corner telling you that the ( was matched. In the next step, S is on the top of the stack and the next symbol in the input is (. That means the highlighted rule will be applied and the S will be popped.

As before, the right side of the production is pushed onto the stack and the derivation tree is modified as well.

Now, after matching the ), we are in a situation where there is an S on the top of the stack and the first symbol of the remaining input is a ). That means we will have to use the S → ε rule, as shown below.

Again the ) is matched. Then we have one more step of using S → ε, shown below.

Finally you end up with an S on the stack and the end-of-input symbol (the $). That means the following highlighted rule is applied, as shown below,

and that completes the derivation.

A more complicated example

Now that the basics are clear, we can try a more complicated example. Either enter these rules into JFLAP or download the file called LLParsingabd.jff:

S → Aa
A → BD
B → b
B → ε
D → d
D → ε

Step 1 - the FIRST sets for each of the non-terminals

For a quick example, A has the possibility of generating BD. Since both B and D have an ε-rule, the FIRST set for A will have to include ε. Since B can produce a b and D can produce a d, b and d also show up in the FIRST set, giving {b, d, ε}. Here are the FIRST sets for the remaining variables:

B {b, ε}
D {d, ε}
S {a, b, d}

Now the FOLLOW sets need to be determined. Again, just for a quick example, let us look at the FOLLOW set for A. Clearly a needs to be in it, thanks to the first rule, and that is the only thing that can immediately follow an A, so we are done with A. Fill out the others:

B {d, a}
D {a}
S {$}

Remember that the $ sign indicates the end of input.

Now we have to build our parse table. The same idea is used, and we can do it on a per-production-rule basis. The first rule is S → Aa. FIRST(Aa) will correspond to the FIRST set of A, which we have computed to be {b, d, ε}. This means we will fill in Aa in the columns for both b and d. Also, since A does contain ε in its FIRST set, a will need to be added to FIRST(Aa). Therefore, corresponding to the columns for b, d and a, we will put down Aa. For a rule that has ε in its FIRST set, say B → ε, we have to put ε in the columns corresponding to the terminals in the FOLLOW set for B. This means that ε will be put down in the columns corresponding to a and d.

Try it: fill in the rest of the parse table yourself.
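Once you have filled in the table by hand, you can check your answer by running the same fixpoint-plus-filling procedure on this grammar. Again this is only an illustrative sketch with made-up names, not part of JFLAP:

```python
# Sketch that recomputes FIRST, FOLLOW, and the parse table for
# S -> Aa, A -> BD, B -> b | eps, D -> d | eps, so you can check
# your hand-filled table. '!' stands for the empty string, as in JFLAP.
EPS = '!'
G = {
    'S': [['A', 'a']],
    'A': [['B', 'D']],
    'B': [['b'], []],
    'D': [['d'], []],
}
NT = set(G)

def str_first(rhs, first):
    out = set()
    for X in rhs:
        f = first[X] if X in NT else {X}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

first = {A: set() for A in NT}
follow = {A: set() for A in NT}
follow['S'].add('$')
changed = True
while changed:
    changed = False
    for A, rules in G.items():
        for rhs in rules:
            f = str_first(rhs, first)
            if not f <= first[A]:
                first[A] |= f
                changed = True
            for i, X in enumerate(rhs):
                if X in NT:
                    tail = str_first(rhs[i + 1:], first)
                    extra = (tail - {EPS}) | (follow[A] if EPS in tail else set())
                    if not extra <= follow[X]:
                        follow[X] |= extra
                        changed = True

table = {}
for A, rules in G.items():
    for rhs in rules:
        f = str_first(rhs, first)
        for t in f - {EPS}:
            table[(A, t)] = rhs
        if EPS in f:
            for t in follow[A]:
                table[(A, t)] = rhs

print(sorted(first['A']))   # ['!', 'b', 'd']
print(sorted(follow['B']))  # ['a', 'd']
print(table[('A', 'a')])    # ['B', 'D']
```

The computed sets agree with the ones derived above, and the table entries reproduce the Aa and ε cells worked out in the text.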

Now let us try to parse a string using the rules of this grammar: ba. The S, upon seeing a b as the first character of the remaining input, gets popped, and the derivation tree produces the Aa of the right side. Now the A is on top of the stack and the entry corresponding to the b column is BD. So

to make that step we use the rule as follows. Note how both B and D are pushed onto the stack. With B on the stack and b as the next input symbol, the B → b rule is used. The b on the stack gets matched with the b at the start of the input, which then leaves

D on the stack with an a at the beginning of the input. When we look at the D row and the column corresponding to a, we see the D → ε rule. Apply the rule to get the following. And of course, finally, the a gets matched with an a and the string is successfully parsed.

4 Questions

Try parsing the string bda using the same grammar.

What is wrong with trying to use LL parsing on the following grammar?

S → 0S1S
S → 1S0S
S → ε