# The procedure attempts to "match" the right hand side of some production for a nonterminal.

## Parsing

A parser is an algorithm that determines whether a given input string is in a language and, as a side effect, usually produces a parse tree for the input. There is a procedure for generating a parser from a given context-free grammar.

## Recursive-Descent Parsing

Recursive-descent parsing is one of the simplest parsing techniques used in practice. Recursive-descent parsers are also called top-down parsers, since they construct the parse tree from the top down (rather than from the bottom up).

The basic idea of recursive-descent parsing is to associate each non-terminal with a procedure. The goal of each such procedure is to read a sequence of input characters that can be generated by the corresponding non-terminal, and to return a pointer to the root of the parse tree for the non-terminal. The structure of the procedure is dictated by the productions for the corresponding non-terminal.

The procedure attempts to "match" the right hand side of some production for a non-terminal. To match a terminal symbol, the procedure compares the terminal symbol to the input; if they agree, the procedure is successful, and it consumes the terminal symbol in the input (that is, it moves the input cursor over one symbol). To match a non-terminal symbol, the procedure simply calls the corresponding procedure for that non-terminal symbol (which may be a recursive call, hence the name of the technique).

## Recursive-Descent Parser for Expressions

Consider the following grammar for expressions (we'll look at the reasons for the peculiar structure of this grammar later):

    1. <E>  --> <T> <E*>
    2. <E*> --> + <T> <E*> | - <T> <E*> | epsilon
    3. <T>  --> <F> <T*>
    4. <T*> --> * <F> <T*> | / <F> <T*> | epsilon
    5. <F>  --> ( <E> ) | number

We create a procedure for each of the non-terminals. According to production 1, the procedure to match expressions (<E>) must match a term (by calling the procedure for <T>), and then the rest of the expression (by calling the procedure for <E*>).

    procedure E;
      T;
      Estar;

Some procedures, such as the one for <E*>, must examine the input to determine which production to choose.

    procedure Estar;
      if NextInputChar = "+" or NextInputChar = "-" then
        read(NextInputChar);
        T;
        Estar;

We will append a special marker symbol (ENDM) to the input string; this marker symbol notifies the parser that the entire input has been seen. We should also modify the procedure for the start symbol, E, to recognize the end marker after seeing an expression.

## Top-Down Parser for Expressions

    procedure E;
      T;
      Estar;
      if NextInputChar = ENDM then
        /* done */
      else
        print("syntax error")

    procedure Estar;
      if NextInputChar = "+" or NextInputChar = "-" then
        read(NextInputChar);
        T;
        Estar;

    procedure T;
      F;
      Tstar;

    procedure Tstar;
      if NextInputChar = "*" or NextInputChar = "/" then
        read(NextInputChar);
        F;
        Tstar;

    procedure F;
      if NextInputChar = "(" then
        read(NextInputChar);
        E;
        if NextInputChar = ")" then
          read(NextInputChar)
        else
          print("syntax error")
      else if NextInputChar is a number then
        read(NextInputChar)
      else
        print("syntax error");

## Tracing the Parser

As an example, consider the following input: 1 + (2 * 3) / 4. We just call the procedure corresponding to the start symbol.

    NextInputChar = "1"
    Call E
      Call T
        Call F
          NextInputChar = "+"        /* Match 1 with F */
        Call Tstar                   /* Match epsilon */
      Call Estar
        NextInputChar = "("          /* Match + */
        Call T
          Call F                     /* Match (, looking for E ) */
            NextInputChar = "2"
            Call E
              Call T
                Call F               /* Match 2 with F */
                  NextInputChar = "*"
                Call Tstar           /* Match * */
                  NextInputChar = "3"
                  Call F             /* Match 3 with F */
                    NextInputChar = ")"
                  Call Tstar         /* Match epsilon */
              Call Estar             /* Match epsilon */
            NextInputChar = "/"      /* Match ")" */
          Call Tstar
            NextInputChar = "4"      /* Match "/" */
            Call F                   /* Match 4 with F */
              NextInputChar = ENDM
            Call Tstar               /* Match epsilon */
        Call Estar                   /* Match epsilon */
    /* Match ENDM */

## Observations about Recursive-Descent Parser

In procedures Estar and Tstar, we match one of the productions with an arithmetic operator if we see such an operator in the input; otherwise we simply return. A procedure that returns without matching any symbols is, in effect, choosing the epsilon production.

In our expression parser, we choose the epsilon production only if NextInputChar doesn't match the first terminal on the right hand side of any other production for that non-terminal.
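The procedures traced above translate almost line for line into a runnable recognizer. The following is a minimal Python sketch, not the original course code. The names `Parser`, `tokenize`, `accepts` and `ParseError` are illustrative assumptions, `number` is assumed to mean an unsigned integer, and errors are raised as exceptions instead of being printed.

```python
import re

ENDM = "ENDM"  # assumed spelling of the end marker appended to the input


class ParseError(Exception):
    """Raised when the input is not in the language of the grammar."""


def tokenize(text):
    # Assumption: tokens are unsigned integers and the symbols + - * / ( ).
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise ParseError("unexpected character in input")
    return tokens + [ENDM]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def next_input(self):
        return self.tokens[self.pos]

    def read(self):
        # Consume the current token (move the input cursor one symbol).
        self.pos += 1

    def parse_E(self):      # <E> --> <T> <E*>
        self.parse_T()
        self.parse_Estar()

    def parse_Estar(self):  # <E*> --> + <T> <E*> | - <T> <E*> | epsilon
        if self.next_input in ("+", "-"):
            self.read()
            self.parse_T()
            self.parse_Estar()
        # Otherwise return without consuming anything: the epsilon production.

    def parse_T(self):      # <T> --> <F> <T*>
        self.parse_F()
        self.parse_Tstar()

    def parse_Tstar(self):  # <T*> --> * <F> <T*> | / <F> <T*> | epsilon
        if self.next_input in ("*", "/"):
            self.read()
            self.parse_F()
            self.parse_Tstar()

    def parse_F(self):      # <F> --> ( <E> ) | number
        if self.next_input == "(":
            self.read()
            self.parse_E()
            if self.next_input != ")":
                raise ParseError("expected )")
            self.read()
        elif self.next_input.isdigit():
            self.read()
        else:
            raise ParseError("expected ( or a number")

    def parse(self):
        self.parse_E()
        if self.next_input != ENDM:
            raise ParseError("expected end of input")


def accepts(text):
    """Return True if text is in the language of the expression grammar."""
    try:
        Parser(text).parse()
        return True
    except ParseError:
        return False
```

Calling `accepts("1 + (2 * 3) / 4")` follows the same sequence of calls as the trace. Like the procedures above, this sketch only recognizes the input and builds no parse tree.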

We never attempt to read beyond the end marker (ENDM), which is matched only at the end of an expression. In all other circumstances, the presence of the end marker signals a syntax error.

As written, our recursive-descent parser only determines whether or not the input string is in the language of the grammar; it does not give the structure of the string according to the grammar. We could easily build a parse tree incrementally during parsing.

## Lookahead in Recursive-Descent Parsing

In order to implement a recursive-descent parser for a grammar, it must be possible, for each non-terminal in the grammar, to determine which production to apply for that non-terminal by looking only at the current input symbol. (We want to avoid having the compiler or other text processing program scan ahead in the input to determine what action to take next.)

The lookahead symbol is simply the next terminal that we will try to match in the input. We use a single lookahead symbol to decide which production to match.

Consider a production A --> X1...Xm. We need to know the set of possible lookahead symbols that indicate this production is to be chosen. This set consists of those terminal symbols that can be produced by the symbols X1...Xm (which may be either terminals or non-terminals). Since a lookahead is only a single terminal symbol, we want the first (i.e., leftmost) symbol that could be produced by X1...Xm. We denote the set of symbols that could be produced first by X1...Xm as First(X1...Xm).

## First Sets

To distinguish two productions with the same non-terminal on the left hand side, we examine the First sets for their corresponding right hand sides. Given the production A --> X1...Xm, we must determine First(X1...Xm).

We first consider the leftmost symbol, X1. If this is a terminal symbol, then First(X1...Xm) = {X1}. If X1 is a non-terminal, then we compute the First sets for each right hand side corresponding to X1. In our expression grammar above:

    First(<E>) = First(<T> <E*>)
    First(<T> <E*>) = First(<T>)
    First(<T>) = First(<F> <T*>)
    First(<F> <T*>) = First(<F>) = {(, number}

If X1 can generate epsilon, then X1 can (in effect) be erased, and First(X1...Xm) also depends on X2. If X2 is a terminal, it is included in First(X1...Xm). If X2 is a non-terminal, we compute the First sets for each of its corresponding right hand sides. Similarly, if both X1 and X2 can produce epsilon, we consider X3, then X4, etc.

## Follow Sets

Suppose we are attempting to compute the lookahead symbols that suggest the production A --> X1...Xm. What if each of the Xi can produce epsilon? If the entire right hand side of a production can produce epsilon, then the lookahead for A is determined by those terminal symbols that can follow A in a parse. We denote the set of terminal symbols that can follow a non-terminal A in a parse as Follow(A).

We inspect the grammar for all occurrences of the non-terminal A. In each production, A is either:

- followed by a terminal symbol x, so x is in Follow(A);
- followed by a non-terminal symbol B, so Follow(A) includes First(B) (and, if B can produce epsilon, also whatever can follow B at that point); or
- at the end of a production for some non-terminal S (as in S --> Y1...YmA), in which case Follow(A) includes Follow(S).

## First and Follow Sets for Expression Grammar

We compute the First and Follow sets for our expression grammar, augmented with a new start symbol whose production includes ENDM:

    1. <S>  --> <E> ENDM
    2. <E>  --> <T> <E*>
    3. <E*> --> + <T> <E*> | - <T> <E*> | epsilon
    4. <T>  --> <F> <T*>
    5. <T*> --> * <F> <T*> | / <F> <T*> | epsilon
    6. <F>  --> ( <E> ) | number

    First(<E>)   = First(<T> <E*>) = First(<T>)
    First(<E*>)  = {+} U {-} U Follow(<E*>)
    Follow(<E*>) = Follow(<E>) = {), ENDM}
    First(<E*>)  = {+, -, ), ENDM}
    First(<T>)   = First(<F> <T*>) = First(<F>)
    First(<T*>)  = {*} U {/} U Follow(<T*>)
    Follow(<T*>) = Follow(<T>) = First(<E*>)
    First(<T*>)  = {*, /, +, -, ), ENDM}
    First(<F>)   = {(, number}

Because <E*> and <T*> can produce epsilon, the sets computed for them here include their Follow sets; that is, they are the complete sets of lookahead symbols for these non-terminals.

## LL(1) Grammars for Recursive-Descent Parsing

The set of lookahead symbols that will cause the selection (i.e., prediction) of the production A --> X1...Xm is

    Predict(A --> X1...Xm) = First(X1...Xm) U (if X1...Xm can produce epsilon then Follow(A) else the empty set)

That is, any symbol that can be the first symbol produced by the right hand side of a production will predict that production. Further, if the entire right hand side can produce epsilon, then symbols that can immediately follow the left hand side of the production will also predict that production.

If, for two productions

    1. A --> X1...Xm
    2. A --> Y1...Yn

we have some symbol s for which

    1. s is in Predict(A --> X1...Xm)
    2. s is in Predict(A --> Y1...Yn)

then we cannot in general know which production to select by looking at a single input symbol. Recursive-descent parsing can only parse those CFGs that have disjoint predict sets for productions that share a common left hand side. CFGs that obey this restriction are called LL(1).
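The First, Follow and Predict computations above can be mechanized as a fixed-point iteration. The sketch below is one possible way to do it in Python and is not taken from the original notes. The grammar encoding (a dictionary from each non-terminal to a list of right hand sides, with the empty tuple standing for epsilon) and the names `analyse` and `is_ll1` are assumptions made for illustration. Unlike the listing above, it keeps First sets free of Follow symbols and tracks separately which non-terminals can produce epsilon; the resulting Predict sets are the same.

```python
GRAMMAR = {
    "S":  [("E", "ENDM")],
    "E":  [("T", "E*")],
    "E*": [("+", "T", "E*"), ("-", "T", "E*"), ()],
    "T":  [("F", "T*")],
    "T*": [("*", "F", "T*"), ("/", "F", "T*"), ()],
    "F":  [("(", "E", ")"), ("number",)],
}


def analyse(grammar):
    """Return (first, follow, predict) for a grammar in the encoding above."""
    nullable = set()                       # non-terminals that can produce epsilon
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}

    def first_of(symbols):
        """Return (First set, can-produce-epsilon) for a sequence of symbols."""
        result = set()
        for x in symbols:
            if x not in grammar:           # x is a terminal
                result.add(x)
                return result, False
            result |= first[x]
            if x not in nullable:
                return result, False
        return result, True

    changed = True
    while changed:                         # repeat until nothing grows any more
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                f, eps = first_of(rhs)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                if eps and a not in nullable:
                    nullable.add(a)
                    changed = True
                for i, x in enumerate(rhs):
                    if x in grammar:       # Follow is only defined for non-terminals
                        f, eps = first_of(rhs[i + 1:])
                        new = f | (follow[a] if eps else set())
                        if not new <= follow[x]:
                            follow[x] |= new
                            changed = True

    predict = {}
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            f, eps = first_of(rhs)
            predict[(a, rhs)] = f | (follow[a] if eps else set())
    return first, follow, predict


def is_ll1(grammar):
    """True if productions sharing a left hand side have disjoint Predict sets."""
    _, _, predict = analyse(grammar)
    for a, alternatives in grammar.items():
        seen = set()
        for rhs in alternatives:
            if seen & predict[(a, rhs)]:
                return False
            seen |= predict[(a, rhs)]
    return True
```

For the augmented expression grammar, `is_ll1(GRAMMAR)` returns `True`, and `predict[("T*", ())]` is the set containing `+`, `-`, `)` and `ENDM`, which agrees with the lookahead set for <T*> given above minus the operators `*` and `/` that predict the other two productions.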

From experience we know that it is usually possible to create an LL(1) CFG for a programming language. However, not all CFGs are LL(1), and a CFG that is not LL(1) may be parsable using some other (usually more complex) parsing technique.

## Creating LL(1) Grammars

Recursive-descent parsing can only parse grammars that have disjoint predict sets for productions that share a common left hand side. Two common properties of grammars that violate this condition are:

- Left recursion: any grammar containing productions with left recursion, that is, productions of the form A --> A X1...Xm, cannot be LL(1). The problem is that any symbol that predicts this production the first time will, of necessity, continue to predict this production forever (and never be matched).
- Common prefix: any grammar containing two productions for the same non-terminal that share a common prefix on the right hand side cannot be LL(1). The problem is that any symbol that predicts the first production must also predict the second; since the predict sets for the two productions are not disjoint, the grammar is not LL(1).

## Creating an LL(1) Grammar

Consider the following grammar for expressions:

    1. <E> --> <E> + <T>
    2. <E> --> <E> - <T>
    3. <E> --> <T>
    4. <T> --> <T> * <F>
    5. <T> --> <T> / <F>
    6. <T> --> <F>
    7. <F> --> ( <E> )
    8. <F> --> number

This grammar has left recursion, and therefore cannot be LL(1). We can replace the use of left recursion with right recursion as follows:

    1. <E> --> <T> + <E>
    2. <E> --> <T> - <E>
    3. <E> --> <T>
    4. <T> --> <F> * <T>
    5. <T> --> <F> / <T>

    6. <T> --> <F>
    7. <F> --> ( <E> )
    8. <F> --> number

The resulting grammar is still not LL(1); productions 1-3 share a common prefix, as do productions 4-6. We can eliminate the common prefix by deferring the decision as to which production to pick until after seeing the common prefix. This technique is called factoring the common prefix.

    1. <E>  --> <T> <E*>
    2. <E*> --> + <T> <E*> | - <T> <E*> | epsilon
    3. <T>  --> <F> <T*>
    4. <T*> --> * <F> <T*> | / <F> <T*> | epsilon
    5. <F>  --> ( <E> ) | number

## Table-Driven Parsing

In recursive-descent parsing, the decision as to which production to choose for a particular non-terminal is hard-coded into the procedure for the non-terminal. The procedure uses the Predict sets (computed from the First and Follow sets) for the grammar to decide which production to choose based on the lookahead symbol.

The problem with recursive-descent parsing is that it is inflexible; changes in the grammar can cause significant (and in some cases non-obvious) changes to the parser.

Since recursive-descent parsing uses an implicit stack of procedure calls, it is possible to replace the parsing procedures and implicit stack with an explicit stack and a single parsing procedure that manipulates the stack. In this scheme, we encode the actions the parsing procedure should take in a table. This table can be generated automatically (with the grammar as input), which is why this approach adapts more easily to changes in the grammar.

## A Table-Driven Parser

The parse table encodes the choice of production as a function of the current non-terminal of interest and the lookahead symbol.

    T: Non-terminals x Terminals -> Productions U {Error}

The entry T[A,x] gives the number of the production to choose when A is the non-terminal of interest and x is the current input symbol:

    T[A,x] == A --> X1...Xm   if x is in Predict(A --> X1...Xm)
    T[A,x] == Error           otherwise

The driver procedure is very simple. It stacks symbols that are to be matched or expanded. Terminal symbols on the stack must match an input symbol; non-terminal symbols are expanded via the Predict function (which is encoded in the parse table).

## Parse Table for Expressions

Here is an LL(1) expression grammar, augmented to include the end marker:

    1.  <S>  --> <E> ENDM
    2.  <E>  --> <T> <E*>
    3.  <E*> --> + <T> <E*>
    4.  <E*> --> - <T> <E*>
    5.  <E*> --> epsilon
    6.  <T>  --> <F> <T*>
    7.  <T*> --> * <F> <T*>
    8.  <T*> --> / <F> <T*>
    9.  <T*> --> epsilon
    10. <F>  --> ( <E> )
    11. <F>  --> number

The table for this expression grammar is (where a blank entry corresponds to an error):

          (    )    +    -    *    /    number  ENDM
    S     1                             1
    E     2                             2
    E*         5    3    4                      5
    T     6                             6
    T*         9    9    9    7    8            9
    F     10                            11

This table is constructed from the Predict sets described earlier.
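Filling in the table from the Predict sets is mechanical, as the following Python sketch suggests. It is an illustration under stated assumptions, not the original course code: the names `PRODUCTIONS`, `PREDICT` and `build_table` are hypothetical, and the Predict sets are written out by hand from the First and Follow sets computed earlier for the eleven numbered productions.

```python
# Production number -> (left hand side, right hand side); [] stands for epsilon.
PRODUCTIONS = {
    1:  ("S",  ["E", "ENDM"]),
    2:  ("E",  ["T", "E*"]),
    3:  ("E*", ["+", "T", "E*"]),
    4:  ("E*", ["-", "T", "E*"]),
    5:  ("E*", []),
    6:  ("T",  ["F", "T*"]),
    7:  ("T*", ["*", "F", "T*"]),
    8:  ("T*", ["/", "F", "T*"]),
    9:  ("T*", []),
    10: ("F",  ["(", "E", ")"]),
    11: ("F",  ["number"]),
}

# Predict set of each production, derived by hand from the First/Follow sets.
PREDICT = {
    1:  {"(", "number"},
    2:  {"(", "number"},
    3:  {"+"},
    4:  {"-"},
    5:  {")", "ENDM"},
    6:  {"(", "number"},
    7:  {"*"},
    8:  {"/"},
    9:  {"+", "-", ")", "ENDM"},
    10: {"("},
    11: {"number"},
}


def build_table(productions, predict):
    """Map (non-terminal, terminal) to a production number.

    Pairs that are absent from the result correspond to the blank (error)
    entries of the table. A pair claimed by two productions means the
    Predict sets overlap, so the grammar is not LL(1).
    """
    table = {}
    for number, (lhs, _rhs) in productions.items():
        for terminal in predict[number]:
            key = (lhs, terminal)
            if key in table:
                raise ValueError(f"grammar is not LL(1): conflict at {key}")
            table[key] = number
    return table


TABLE = build_table(PRODUCTIONS, PREDICT)
```

A lookup such as `TABLE.get(("T*", "+"))` yields production 9, while `TABLE.get(("F", ")"))` yields `None`, the error entry.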

## Driver Procedure

Under table-driven parsing, there is a single procedure that "interprets" the parse table. This "driver" procedure takes the following form:

    procedure Parser;
      /* Push the start symbol S onto the stack */
      Push(S, stack)
      /* Initialize lookahead symbol */
      scanner(NextInputSymbol)
      while not Empty(stack) do
        top = Top(stack)
        if top is a nonterminal then
          action = ParseTable[top, NextInputSymbol]
          if action > 0 then
            /* Pop top symbol */
            Pop(stack)
            /* Push RHS of production in reverse order, so that
               its leftmost symbol ends up on top of the stack */
            for each symbol on RHS #action, from right to left, do
              Push(symbol, stack)
          else
            print("syntax error"); halt
        else if NextInputSymbol == top then
          /* Match terminal symbol in input */
          Pop(stack)
          /* Get next terminal symbol in input */
          scanner(NextInputSymbol)
        else
          print("syntax error"); halt

## Example Parse

Let's trace the parse for the input 1 + (2 * 3) / 4 ENDM. The top of the stack is at the left, and N stands for the terminal number.

    Step Stack contents          Current input         Action
     1:  S                       1 + (2 * 3) / 4 ENDM  1
     2:  E ENDM                  1 + (2 * 3) / 4 ENDM  2
     3:  T E* ENDM               1 + (2 * 3) / 4 ENDM  6
     4:  F T* E* ENDM            1 + (2 * 3) / 4 ENDM  11
     5:  N T* E* ENDM            1 + (2 * 3) / 4 ENDM  Pop
     6:  T* E* ENDM              + (2 * 3) / 4 ENDM    9
     7:  E* ENDM                 + (2 * 3) / 4 ENDM    3
     8:  + T E* ENDM             + (2 * 3) / 4 ENDM    Pop
     9:  T E* ENDM               (2 * 3) / 4 ENDM      6
    10:  F T* E* ENDM            (2 * 3) / 4 ENDM      10
    11:  ( E ) T* E* ENDM        (2 * 3) / 4 ENDM      Pop
    12:  E ) T* E* ENDM          2 * 3) / 4 ENDM       2
    13:  T E* ) T* E* ENDM       2 * 3) / 4 ENDM       6
    14:  F T* E* ) T* E* ENDM    2 * 3) / 4 ENDM       11
    15:  N T* E* ) T* E* ENDM    2 * 3) / 4 ENDM       Pop
    16:  T* E* ) T* E* ENDM      * 3) / 4 ENDM         7

    17:  * F T* E* ) T* E* ENDM  * 3) / 4 ENDM         Pop
    18:  F T* E* ) T* E* ENDM    3) / 4 ENDM           11
    19:  N T* E* ) T* E* ENDM    3) / 4 ENDM           Pop
    20:  T* E* ) T* E* ENDM      ) / 4 ENDM            9
    21:  E* ) T* E* ENDM         ) / 4 ENDM            5
    22:  ) T* E* ENDM            ) / 4 ENDM            Pop
    23:  T* E* ENDM              / 4 ENDM              8
    24:  / F T* E* ENDM          / 4 ENDM              Pop
    25:  F T* E* ENDM            4 ENDM                11
    26:  N T* E* ENDM            4 ENDM                Pop
    27:  T* E* ENDM              ENDM                  9
    28:  E* ENDM                 ENDM                  5
    29:  ENDM                    ENDM                  Pop
    30:  Done!
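The trace can be reproduced with a small table-driven driver. What follows is a hedged Python sketch rather than the original implementation: `PRODUCTIONS`, `TABLE`, `tokenize` and `parse` are illustrative names, every run of digits is assumed to be the terminal `number`, and a syntax error is reported by raising `ValueError` instead of printing a message.

```python
import re

# Right hand side of each numbered production; [] stands for epsilon.
PRODUCTIONS = {
    1: ["E", "ENDM"], 2: ["T", "E*"],
    3: ["+", "T", "E*"], 4: ["-", "T", "E*"], 5: [],
    6: ["F", "T*"],
    7: ["*", "F", "T*"], 8: ["/", "F", "T*"], 9: [],
    10: ["(", "E", ")"], 11: ["number"],
}

# Parse table: TABLE[non-terminal][lookahead] -> production number.
# A missing entry is a syntax error.
TABLE = {
    "S":  {"(": 1, "number": 1},
    "E":  {"(": 2, "number": 2},
    "E*": {"+": 3, "-": 4, ")": 5, "ENDM": 5},
    "T":  {"(": 6, "number": 6},
    "T*": {"*": 7, "/": 8, "+": 9, "-": 9, ")": 9, "ENDM": 9},
    "F":  {"(": 10, "number": 11},
}


def tokenize(text):
    # Assumption: tokens are unsigned integers and the symbols + - * / ( ).
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise ValueError("unexpected character in input")
    return ["number" if t.isdigit() else t for t in tokens] + ["ENDM"]


def parse(text):
    """Return the production numbers applied, in order; raise on a syntax error."""
    tokens = tokenize(text)
    pos = 0
    stack = ["S"]                      # the top of the stack is the END of the list
    applied = []
    while stack:
        top = stack[-1]
        lookahead = tokens[pos]
        if top in TABLE:               # non-terminal: expand it using the table
            action = TABLE[top].get(lookahead)
            if action is None:
                raise ValueError(f"syntax error at token {pos}: {lookahead}")
            stack.pop()
            # Push the right hand side in reverse so its leftmost symbol is on top.
            stack.extend(reversed(PRODUCTIONS[action]))
            applied.append(action)
        elif top == lookahead:         # terminal: match it and advance the input
            stack.pop()
            pos += 1
        else:
            raise ValueError(f"syntax error at token {pos}: {lookahead}")
    return applied
```

For the input `1 + (2 * 3) / 4`, the list returned by `parse` contains the numbered actions of the trace in order, with the Pop steps left out.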

### 8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

### CA Compiler Construction

CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

### Syntactic Analysis. Top-Down Parsing

Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

### Top down vs. bottom up parsing

Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

### LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Predictive Parsers LL(k) Parsing Can we avoid backtracking? es, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. his is possible if the first terminal of

### Compilers. Predictive Parsing. Alex Aiken

Compilers Like recursive-descent but parser can predict which production to use By looking at the next fewtokens No backtracking Predictive parsers accept LL(k) grammars L means left-to-right scan of input

### Table-Driven Parsing

Table-Driven Parsing It is possible to build a non-recursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls [1] The non-recursive parser looks up the production

### CSCI312 Principles of Programming Languages

Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

### CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

CS1622 Lecture 9 Parsing (4) CS 1622 Lecture 9 1 Today Example of a recursive descent parser Predictive & LL(1) parsers Building parse tables CS 1622 Lecture 9 2 A Recursive Descent Parser. Preliminaries

### Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

### Syntax Analysis, III Comp 412

COMP 412 FALL 2017 Syntax Analysis, III Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp

### Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser III CS164 3:30-5:00 TT 10 Evans 1 Overview Finish recursive descent parser when it breaks down and how to fix it eliminating left recursion reordering productions Predictive parsers (aka

### Syntax Analysis, III Comp 412

Updated algorithm for removal of indirect left recursion to match EaC3e (3/2018) COMP 412 FALL 2018 Midterm Exam: Thursday October 18, 7PM Herzstein Amphitheater Syntax Analysis, III Comp 412 source code

### Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Parsing III. (Top-down parsing: recursive descent & LL(1) )

Parsing III (Top-down parsing: recursive descent & LL(1) ) Roadmap (Where are we?) Previously We set out to study parsing Specifying syntax Context-free grammars Ambiguity Top-down parsers Algorithm &

### Chapter 3. Parsing #1

Chapter 3 Parsing #1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs)

### CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts UW CSE P 501 Spring 2018

### LL(1) Grammars. Example. Recursive Descent Parsers. S A a {b,d,a} A B D {b, d, a} B b { b } B λ {d, a} D d { d } D λ { a }

LL(1) Grammars A context-free grammar whose Predict sets are always disjoint (for the same non-terminal) is said to be LL(1). LL(1) grammars are ideally suited for top-down parsing because it is always

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

### CSE431 Translation of Computer Languages

CSE431 Translation of Computer Languages Top Down Parsers Doug Shook Top Down Parsers Two forms: Recursive Descent Table Also known as LL(k) parsers: Read tokens from Left to right Produces a Leftmost

### Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

### Question Points Score

CS 453 Introduction to Compilers Midterm Examination Spring 2009 March 12, 2009 75 minutes (maximum) Closed Book You may use one side of one sheet (8.5x11) of paper with any notes you like. This exam has

### Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

### Outline. Top Down Parsing. SLL(1) Parsing. Where We Are 1/24/2013

Outline Top Down Parsing Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form Recursive-descent parsing 1 CS 412/413 Spring 2008 Introduction to Compilers 2 Where We Are SLL(1) Parsing

### CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

CSE 401 Compilers LR Parsing Hal Perkins Autumn 2011 10/10/2011 2002-11 Hal Perkins & UW CSE D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts 10/10/2011

### Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers

### Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

Parsing #1 Leonidas Fegaras CSE 5317/4305 L3: Parsing #1 1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and

### Lexical and Syntax Analysis (2)

Lexical and Syntax Analysis (2) In Text: Chapter 4 N. Meng, F. Poursardar Motivating Example Consider the grammar S -> cad A -> ab a Input string: w = cad How to build a parse tree top-down? 2 Recursive-Descent

### Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Example CFG Lectures 16 & 17 Bottom-Up Parsing CS 241: Foundations of Sequential Programs Fall 2016 1. Sʹ " S 2. S " AyB 3. A " ab 4. A " cd Matt Crane University of Waterloo 5. B " z 6. B " wz 2 LL(1)

### Table-Driven Top-Down Parsers

Table-Driven Top-Down Parsers Recursive descent parsers have many attractive features. They are actual pieces of code that can be read by programmers and extended. This makes it fairly easy to understand

### CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

### 4 (c) parsing. Parsing. Top down vs. bo5om up parsing

4 (c) parsing Parsing A grammar describes syntac2cally legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

### Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

### Chapter 4: LR Parsing

Chapter 4: LR Parsing 110 Some definitions Recall For a grammar G, with start symbol S, any string α such that S called a sentential form α is If α Vt, then α is called a sentence in L G Otherwise it is

### Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

MODULE 18 LALR parsing After understanding the most powerful CALR parser, in this module we will learn to construct the LALR parser. The CALR parser has a large set of items and hence the LALR parser is

### Building A Recursive Descent Parser. Example: CSX-Lite. match terminals, and calling parsing procedures to match nonterminals.

Building A Recursive Descent Parser We start with a procedure Match, that matches the current input token against a predicted token: vo Match(Terminal a) { if (a == currenttoken) currenttoken = Scanner()

### Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia Building a Parser III CS164 3:30-5:00 10 vans WA1 due on hu PA2 in a week Slides on the web site I do my best to have slides ready and posted by the end of the preceding logical day yesterday,

### LR Parsing E T + E T 1 T

LR Parsing 1 Introduction Before reading this quick JFLAP tutorial on parsing please make sure to look at a reference on LL parsing to get an understanding of how the First and Follow sets are defined.

### MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) \$ X s0 shift to s2 error error goto s1

### CS502: Compilers & Programming Systems

CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:

### Types of parsing. CMSC 430 Lecture 4, Page 1

Types of parsing Top-down parsers start at the root of derivation tree and fill in picks a production and tries to match the input may require backtracking some grammars are backtrack-free (predictive)

### UNIT III & IV. Bottom up parsing

UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.

### Parsing II Top-down parsing. Comp 412

COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

### 1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

CMSC 51086 Winter 2004 Computer Language Implementation Lecture Note 3 February 4, 2004 Predicative parsing 1 Introduction This note continues the discussion of parsing based on context free languages.

### CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

LL(1) Parse Tables CSX-lite Example An LL(1) parse table, T, is a twodimensional array. Entries in T are production numbers or blank (error) entries. T is indexed by: A, a non-terminal. A is the nonterminal

### Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

### Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Concepts Introduced in Chapter 2 A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A

### Lecture Bottom-Up Parsing

Lecture 14+15 Bottom-Up Parsing CS 241: Foundations of Sequential Programs Winter 2018 Troy Vasiga et al University of Waterloo 1 Example CFG 1. S S 2. S AyB 3. A ab 4. A cd 5. B z 6. B wz 2 Stacks in

### Wednesday, September 9, 15. Parsers

Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### UNIT-III BOTTOM-UP PARSING

UNIT-III BOTTOM-UP PARSING Constructing a parse tree for an input string beginning at the leaves and going towards the root is called bottom-up parsing. A general type of bottom-up parser is a shift-reduce

### 3. Parsing. Oscar Nierstrasz

3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

### PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering

TEST 1 Date : 24 02 2015 Marks : 50 Subject & Code : Compiler Design ( 10CS63) Class : VI CSE A & B Name of faculty : Mrs. Shanthala P.T/ Mrs. Swati Gambhire Time : 8:30 10:00 AM SOLUTION MANUAL 1. a.

### COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

### Syntax-Directed Translation. Lecture 14

Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik) 9/27/2006 Prof. Hilfinger, Lecture 14 1 Motivation: parser as a translator syntax-directed translation stream of tokens parser ASTs,

### Introduction to Parsing. Comp 412

COMP 412 FALL 2010 Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make

### Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008. Outline. Top-down versus Bottom-up Parsing. Recursive Descent Parsing. Left Recursion Removal. Left Factoring. Predictive Parsing. Introduction.

### Monday, September 13, Parsers

Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

### Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1. Top-Down Parsing. Lect 5. Goutam Biswas

Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1 Top-Down Parsing Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed

### Compiler Construction. Fall 2017. Rudy van Vliet, room 140 Snellius, rvvliet(at)liacs(dot)nl. Lecture 3, Friday 22 September 2017

Compiler Construction, fall 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet, room 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl Lecture 3, Friday 22 September 2017 + tutorial

### Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

### How do LL(1) Parsers Build Syntax Trees?

How do LL(1) Parsers Build Syntax Trees? So far our LL(1) parser has acted like a recognizer. It verifies that input tokens are syntactically correct, but it produces no output. Building complete (concrete)

### Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, with start symbol S, any string α such that S ⇒* α is called a sentential form

Bottom-up parsing Bottom-up parsing Recall Goal: For a grammar G, with start symbol S, any string α such that S ⇒* α is called a sentential form If α ∈ Vt*, then α is called a sentence in L(G) Otherwise it is just

### Chapter 4. Lexical and Syntax Analysis

Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

### Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character

### Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5

Compiler Design 1 Top-Down Parsing Compiler Design 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed as a generator of a substring of the input. We may view a non-terminal

### CS 230 Programming Languages

CS 230 Programming Languages 10 / 16 / 2013 Instructor: Michael Eckmann Today s Topics Questions/comments? Top Down / Recursive Descent Parsers Top Down Parsers We have a left sentential form xa Expand

### Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Syntax Analysis COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Based in part on slides and notes by Bjoern Brandenburg, S. Olivier and A. Block. 1 The Big Picture Character Stream Token

### Parsing Part II (Top-down parsing, left-recursion removal)

Parsing Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

### BSCS Fall Mid Term Examination December 2012

PUNJAB UNIVERSITY COLLEGE OF INFORMATION TECHNOLOGY University of the Punjab Sheet No.: Invigilator Sign: BSCS Fall 2009 Date: 14-12-2012 Mid Term Examination December 2012 Student ID: Section: Morning

### Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

### It parses an input string of tokens by tracing out the steps in a leftmost derivation.

CS 4203 Compiler Theory CHAPTER 4: TOP-DOWN PARSING Part 1 It parses an input string of tokens by tracing out the steps in a leftmost derivation. And the implied traversal of the parse tree is a preorder

### DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

KATHMANDU UNIVERSITY SCHOOL OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING REPORT ON NON-RECURSIVE PREDICTIVE PARSER Fourth Year First Semester Compiler Design Project Final Report submitted

### Lexical and Syntax Analysis

Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Easy for humans to write and understand String of characters

### COMP3131/9102: Programming Languages and Compilers

COMP3131/9102: Programming Languages and Compilers Jingling Xue School of Computer Science and Engineering The University of New South Wales Sydney, NSW 2052, Australia http://www.cse.unsw.edu.au/~cs3131

### CS 4120 Introduction to Compilers

CS 4120 Introduction to Compilers Andrew Myers Cornell University Lecture 6: Bottom-Up Parsing 9/9/09 Bottom-up parsing A more powerful parsing technology LR grammars -- more expressive than LL can handle

### Recursive Descent Parsers

Recursive Descent Parsers Lecture 7 Robb T. Koether Hampden-Sydney College Wed, Jan 28, 2015 Robb T. Koether (Hampden-Sydney College) Recursive Descent Parsers Wed, Jan 28, 2015 1 / 18 1 Parsing 2 LL Parsers

### Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive -Descent Parsing Bottom-Up parsing 2 Language Implementation Compilation There are three possible approaches

### CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

### A simple syntax-directed

Syntax-directed is a grammar-oriented compiling technique Programming languages: Syntax: what do its programs look like? Semantics: what do its programs mean? 1 A simple syntax-directed Lexical Syntax Character

### Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 9/18/06 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Please let me know if there are continued

### Formal Languages and Compilers Lecture VII Part 3: Syntactic A

Formal Languages and Compilers Lecture VII Part 3: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

### CSC 4181 Compiler Construction. Parsing. Outline. Introduction

CSC 4181 Compiler Construction Parsing 1 Outline Top-down vs. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First and follow sets Constructing LL(1) parsing table

### Alternatives for semantic processing

Semantic Processing Copyright c 2000 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies

### LL(1) predictive parsing

LL(1) predictive parsing Informatics 2A: Lecture 11 John Longley School of Informatics University of Edinburgh jrl@staffmail.ed.ac.uk 13 October, 2011 1 / 12 1 LL(1) grammars and parse tables 2 3 2 / 12

### Wednesday, August 31, Parsers

Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

### Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Derivations vs Parses Grammar is used to derive string or construct parser Context Free Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)

### Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Bottom-up Parsing: Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals. Basic operation is to shift terminals from the input to the

### CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

### shift-reduce parsing

Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up

### Revisit the example. Transformed DFA 10/1/16 A B C D E. Start

Revisit the example [NFA diagram: states 0 through 10, start state 0, transitions labeled ε, a, b] ε-closure(0) = {0, 1, 2, 4, 7} = A Trans(A, a) = {1, 2, 3, 4, 6, 7, 8} = B Trans(A, b) = {1, 2, 4, 5, 6, 7} = C Trans(B, a) = {1,

WWW.STUDENTSFOCUS.COM UNIT -3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.

### Context-free grammars

Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,