# Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1. Top-Down Parsing. Lect 5. Goutam Biswas

Size: px
Start display at page:

Download "Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1. Top-Down Parsing. Lect 5. Goutam Biswas"

Transcription

1 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1 Top-Down Parsing

2 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed as a procedure matching a portion of the input. In our expression grammar G with the E-productions E E +T, E E T, E T, the function looks as follows:

3 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 3 Function of E E() Select an E-production p // here is the nondeterminism if p = E E +T call E() if yylex()= + call T() else error else if

4 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 4 Example Let the input be ic. The parser starts with the start symbol E and chooses the production rule E E+T. But then there is no change in the input and the leftmost non-terminal E may be expanded again and again ad infinitum. A left recursive grammar may lead to non-termination.

5 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 5 Example In the previous example let the parser starts with the start symbol E and chooses the following sequence of production rules: E T, T F and F ic. The first symbol of the input matches, but the choice may be incorrect if the next input symbol is +, as there is no rule with right hand side F +. It may be necessary to backtrack on the choice of production rules.

6 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 6 Example Consider the grammar: S asa atba c T bs If the input is a, we cannot decide whether to use the first or the second production rule of S. But if the parser is designed to look-ahead another symbol (2-look-ahead), the correct choice can be made. If the input is aa, the selected rule for derivation is S asa. But if it is ab, the choice is S atba.

7 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 7 Note In case of the expression grammar G, no fixed amount of look-ahead can help. We may have 5-look-ahead and the input is ic+ic+ic. The derivation sequence will be E E +T E +T +T. But the next step is not known as the operator after the rightmost ic is not known. Note that no token has been consumed (read) so far.

8 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 8 Example Consider the ambiguous grammar: S asa bsb atba c T bs There is no way to decide a rule entirely on the basis of the input, without removing the ambiguity.

9 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 9 LL(k) An unambiguous context-free grammar without left recursion is called an LL(k) grammar a, if a top-down predictive parser for its language can be constructed with at most k input look-ahead. We shall consider the case of k = 1. a The parser scansthe input fromleft-to-rightand usesthe leftmostderivation.

10 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 10 FIRST(X) Informally, the FIRST() set of a terminal or a non-terminal X or a string over terminals and non-terminals, is the collection of all terminals (also ε) that can be derive from X, in the grammar, as the first (leftmost) terminal symbol.

11 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 11 FIRST(X) If X Σ N {ε}, then FIRST(X) Σ {ε} is defined inductively as follows: FIRST(X) = {X}, if X Σ {ε}, FIRST(X) is X α P FIRST(α), X N, ε FIRST(X), if there is X α and α ε, FIRST(A α) is the FIRST(α).

12 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 12 FIRST(X) If α = X 1 X 2 X k, then FIRST(X 1 )\{ε} FIRST(α) and FIRST(X i )\{ε} FIRST(α) when ε i 1 j=1first(x j ), 1 < i k. If ε k j=1 FIRST(X j ), then ε FIRST(α).

13 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 13 Example Consider the classic expression grammar; FIRST(E) =FIRST(T) =FIRST(F) = {ic,(}. There are two production rules for each of E and T with the identical FIRST() sets: E E +T, E T and T T F, T F

14 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 14 Example Consider the grammar obtained after removing the left-recursion from G: E TE E +TE ε T FT T FT ε F (E) ic

15 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 15 Example FIRST(E) =FIRST(T) =FIRST(F) = {ic,(}, FIRST(E ) = {+,ε}, and FIRST(T ) = {,ε}. No non-terminal has more than one production rule with the identical FIRST() set.

16 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 16 FOLLOW(X) For every non-terminal X, the FOLLOW() set is the collection of all terminals that can follow X in a sentential form. The set can be defined inductively as follows. The special symbol eof or \$ is in FOLLOW(S), where S is the start symbol. If A αbβ be a production rule, FIRST(β)\{ε} FOLLOW(B).

17 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 17 FOLLOW(X) If A αbβ, where β = ε or β ε, then FOLLOW(A) FOLLOW(B). The reason is simple: S uav uαbβv uαbv, naturally FIRST(v) FOLLOW(A), FOLLOW(B).

18 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 18 Computation of FOLLOW() Sets for each A N FOLLOW(A) FOLLOW(S) {\$} while (FOLLOW sets are not fixed points) for each A β 1 β 2 β k P if (β k N) FOLLOW(β k ) FOLLOW(β k ) FOLLOW(A) FA FOLLOW(A)

19 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 19 Computation of FOLLOW() Sets for i k downto 2 if (β i N & ε FOLLOW(β i )) FOLLOW(β i 1 ) FOLLOW(β i 1 ) FIRST(β i )\{ε} FA else FOLLOW(β i 1 ) FOLLOW(β i 1 ) FIRST(β i )\{ε} FA

20 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 20 Example In the expression grammar G: FOLLOW(E) = {\$,+,)}, FOLLOW(T) = FOLLOW(E) { } = {\$,+,), } and FOLLOW(F) = {\$,+,), }. In the transformed grammar: FOLLOW(E) = FOLLOW(E ) = {\$,)}, FOLLOW(T) = FOLLOW(T ) = {\$,),+} and FOLLOW(F) = {\$,),+, }.

21 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 21 LL(1) Grammar A context-free grammar G is LL(1) iff for any pair of distinct productions A α, A β, the following conditions are satisfied. FIRST(α) FIRST(β) = i.e. no a Σ {ε} can belong to both. If α ε or α = ε, then FIRST(β) FOLLOW(A) =.

22 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 22 Example Consider the following grammar with the set of terminals, Σ = {id ; := int float main do else end if print scan then while} {E BE} a ; the set of non-terminals, N = {P DL D VL T SL S ES IS WS IOS}; the start symbol is P and the set of production rules are: a E and BE, corresponds to expression and boolean expressions, are actually non-terminals. But here we treat them as terminals.

23 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 23 Production Rules 1 P main DL SL end 2 DL D DL D 4 D T VL ; 5 VL id VL id 7 T int float 9 SL S SL ε 11 S ES IS WS IOS

24 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 24 Production Rules 15 ES id := E ; 16 IS if BE then SL end if BE then SL else SL end 18 WS while BE do SL end 19 IOS scan id ; print E ;

25 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 25 Note There is no production rule with left-recursion. But the rules 2,3, 5,6, and 16,17 needs left-factoring as the FIRST() sets are not disjoint. The transformed grammar after factoring is:

26 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 26 New Production Rules 1 P main DL SL end 2 DL D DO 3 DO DL ε 4 D T VL ; 5 VL id VO 6 VO VL ε 7 T int float

27 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 27 Production Rules 9 SL S SL ε 11 S ES IS WS IOS 15 ES id := E ; 16 IS if BE then SL EO 17 EO end else SL end 18 WS while BE do SL end 19 IOS scan id ; print E ;

28 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 28 FIRST() The next step is to calculate the FIRST() sets of different rules. NT/Rule FIRST() P (1) main DL (2) int float DO (3) int float DO (3a) ε D (4) int float

29 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 29 FIRST() NT/Rule FIRST() VL (5) id VO (6) id VO (6a) ε T (7) int T (8) float SL (9) id if while scan print

30 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 30 FIRST() NT/Rule FIRST() SL (10) ε S (11) id S (12) if S (13) while S (14) scan print

31 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 31 FIRST() NT/Rule FIRST() ES (15) id IS (16) if EO (17) end EO (17a) else

32 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 32 FIRST() NT/Rule FIRST() WS (18) while IOS (19) scan IOS (20) print

33 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 33 Note Three rules have ε-productions. Their applications in a predictive parser depends on what can follow the corresponding non-terminals. So it is necessary to compute the FOLLOW() sets corresponding to these non-terminals. The rules are: DO ε(3a), VO ε(6a), SL ε(10).

34 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 34 FOLLOW() NT FOLLOW() DO id if while scan print end VO ; SL end else

35 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 35 Note FOLLOW(DO) = FOLLOW(DL) (rule 2). The FOLLOW(DL) = FIRST(SL)\{ε} FOLLOW(P) (rule 1) as SL is nullable (rule 10). Now FOLLOW(P) = {end}.

36 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 36 Note It is clear from the previous computation that no two production rules of the form A α 1 α 2 have common elements in their FIRST() sets. There is also no common element in the FIRST() set of the production rule A α and the FOLLOW() set of A in cases where A ε. So the grammar is LL(1) and a predictive parser can be constructed.

37 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 37 Recursive-Descent Parser We write a function (may be recursive) for every non-terminal. The function corresponding to a non-terminal A returns ACCEPT if the corresponding portion of the input can be generated by A. Otherwise it returns a REJECT with proper error message.

38 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 38 Example Consider the production rule P main DL SL end The function corresponding to the non-terminal P is as follows:

39 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 39 int P() int P(){ if(yylex() == MAIN){ // MAIN for "main" nexttoken = NOTOK; if(dl() == ACCEPT) if(sl() == ACCEPT) { if(nexttoken == NOTOK) nexttoken = yylex(); if(nexttoken == END) // END is the token return ACCEPT; // for "END" else { printf("end missing (1)\n"); return REJECT;

40 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 40 } } else { printf("sl mismatch (1)\n"); return REJECT; } else { printf("dl mismatch (1)\n"); return REJECT; } } else { printf("main missing (1)\n"); return REJECT;

41 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 41 } }

42 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 42 Note The global variable nexttoken stores the look-ahead input (token). If there is a valid nexttoken, it is to be consumed before calling yylex(). The stack of the push-down automaton is the stack of the recursive call. The body of the function corresponding to a non-terminal corresponds to all its production rules.

43 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 43 Example We now consider a non-terminal with ε-production. DO DL ε The members of FIRST(DL) are {int float} and the elements of FOLLOW(DO) are {id if while scan print end}.

44 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 44 int DO() int FDO(){ if(nexttoken == NOTOK) nexttoken = yylex(); if(nexttoken == INT nexttoken == FLOAT) if(dl() == ACCEPT) return ACCEPT; else { printf("dl mismatch (3)\n"); return REJECT; } else if(nexttoken == IDNTIFIER

45 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 45 } } nexttoken == IF nexttoken == WHILE nexttoken == SCAN nexttoken == PRINT nexttoken == END) return ACCEPT; else { printf("do follow mismatch (3)\n"); return REJECT;

46 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 46 Note The global variable nexttoken is used to store the look-ahead token. This helps to report an error earlier.

47 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 47 Table Driven Predictive Parser A non-recursive predictive parser can be constructed that maintains a stack (explicitly) and a table to select the appropriate production rule.

48 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 48 Parsing Table The rows of the predictive parser table are indexed by the non-terminals and the columns are indexed by the terminals including the end-of-input marker (\$). The content of the table are production rules or error situations. The table cannot have multiple entries.

49 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 49 Parsing Stack The parsing stack can hold both terminals and non-terminals. At the beginning, the stack contains the end-of-stack marker (\$) and the start symbol on top of it.

50 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 50 Parsing Table Construction If A α is a production rule and a FIRST(α), then P[A][a] = A α. If A ε is a production rule and a FOLLOW(A), then P[A][a] = A ε.

51 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 51 Actions If the top-of-stack is a terminal symbol (token) and matches with input token, both are consumed. A mismatch is an error. If the top-of-stack is a non-terminal A, the input token is a, P[A][a] has the entry A α, then A is to be replaced by α, with the head of α on the top of the stack.

52 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 52 Example Consider the production rules of the non-terminal SL. SL S SL ε The FIRST(SL S SL) = {id if while scan print} and FOLLOW(SL) ={end else}. So, P[SL][IDNTIFIER] = P[SL][IF] = P[SL][WHILE] = P[SL][SCAN] = P[SL][PRINT] = SL S SL and P[SL][END] = P[SL][ELSE] = SL ε.

53 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 53 Note Multiple entries in a table indicates that the grammar is not LL(1). But it is interesting to note that in some cases we can drop (with proper consideration) some of these entries and construct a parser.

54 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 54 Example Consider the ambiguous grammar G 1 for expressions. E E +E E E E E E/E (E) ic After the removal of left-recursion we get the following ambiguous, no-left-recursive grammar:

55 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 55 Example E (E)E ice E +EE EE EE /EE ε We calculate FIRST(E ) = {+ - * / ε } and the FOLLOW(E ) = FOLLOW(E) = {\$ ) + - * /}.

56 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 56 Example Naturally, P[E ][±] = {E +EE,E ε} and P[E ][ /] = {E EE,E ε}. We may drop the ε-productions from these four places and get a nice parsing table a. a But it does not work for all grammars. Consider S asa bsb ε.

57 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 57 Note It seems that the removal of two ε-production disambiguates the grammar. The corresponding unambiguoes grammar G 2 is as follows: E (E)E ice (E) ic E +E E E /E ε We have L(G 1 ) = L(G 2 ) and FOLLOW(E ) = {\$ )}, so there is no multiple entries in the table a. a How to maintain operator precedence?

58 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 58 Error Recovery The token on the top of stack does not match with the token in the input stream. The entry in the parsing table corresponding to nonterminal on the top of stack and the current input token is an error.

59 Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 59 Panic Mode

### Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5

Compiler Design 1 Top-Down Parsing Compiler Design 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed as a generator of a substring of the input. We may view a non-terminal

### Table-Driven Parsing

Table-Driven Parsing It is possible to build a non-recursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls [1] The non-recursive parser looks up the production

### CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

CS1622 Lecture 9 Parsing (4) CS 1622 Lecture 9 1 Today Example of a recursive descent parser Predictive & LL(1) parsers Building parse tables CS 1622 Lecture 9 2 A Recursive Descent Parser. Preliminaries

### CA Compiler Construction

CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

### Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and

### Syntactic Analysis. Top-Down Parsing

Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

### CS502: Compilers & Programming Systems

CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:

### Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

### Types of parsing. CMSC 430 Lecture 4, Page 1

Types of parsing Top-down parsers start at the root of derivation tree and fill in picks a production and tries to match the input may require backtracking some grammars are backtrack-free (predictive)

### Top down vs. bottom up parsing

Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

### LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Predictive Parsers LL(k) Parsing Can we avoid backtracking? es, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. his is possible if the first terminal of

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Compilers. Predictive Parsing. Alex Aiken

Compilers Like recursive-descent but parser can predict which production to use By looking at the next fewtokens No backtracking Predictive parsers accept LL(k) grammars L means left-to-right scan of input

### Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### 3. Parsing. Oscar Nierstrasz

3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

### Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 9/18/06 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Please let me know if there are continued

### Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,

### Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

### 1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

CMSC 51086 Winter 2004 Computer Language Implementation Lecture Note 3 February 4, 2004 Predicative parsing 1 Introduction This note continues the discussion of parsing based on context free languages.

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

### Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser III CS164 3:30-5:00 TT 10 Evans 1 Overview Finish recursive descent parser when it breaks down and how to fix it eliminating left recursion reordering productions Predictive parsers (aka

### 8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

### Introduction to Syntax Analysis

Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.

### CS 321 Programming Languages and Compilers. VI. Parsing

CS 321 Programming Languages and Compilers VI. Parsing Parsing Calculate grammatical structure of program, like diagramming sentences, where: Tokens = words Programs = sentences For further information,

### Lexical and Syntax Analysis

Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Easy for humans to write and understand String of characters

### Topdown parsing with backtracking

Top down parsing Types of parsers: Top down: repeatedly rewrite the start symbol; find a left-most derivation of the input string; easy to implement; not all context-free grammars are suitable. Bottom

### Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia Building a Parser III CS164 3:30-5:00 10 vans WA1 due on hu PA2 in a week Slides on the web site I do my best to have slides ready and posted by the end of the preceding logical day yesterday,

### Introduction to Syntax Analysis. The Second Phase of Front-End

Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program

### CSCI312 Principles of Programming Languages

Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

### Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

### Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

### Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Administrivia Ambiguity, Precedence, Associativity & op-down Parsing eam assignments this evening for all those not listed as having one. HW#3 is now available, due next uesday morning (Monday is a holiday).

### Concepts Introduced in Chapter 4

Concepts Introduced in Chapter 4 Grammars Context-Free Grammars Derivations and Parse Trees Ambiguity, Precedence, and Associativity Top Down Parsing Recursive Descent, LL Bottom Up Parsing SLR, LR, LALR

### Introduction to parsers

Syntax Analysis Introduction to parsers Context-free grammars Push-down automata Top-down parsing LL grammars and parsers Bottom-up parsing LR grammars and parsers Bison/Yacc - parser generators Error

### CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Comparison with Lexical Analysis The second phase of compilation Phase Input Output Lexer string of characters string of tokens Parser string of tokens Parse tree/ast What Parse Tree? CS2210: Compiler

### DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

KATHMANDU UNIVERSITY SCHOOL OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING REPORT ON NON-RECURSIVE PREDICTIVE PARSER Fourth Year First Semester Compiler Design Project Final Report submitted

### Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And

### Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

### CS143 Handout 08 Summer 2009 June 30, 2009 Top-Down Parsing

CS143 Handout 08 Summer 2009 June 30, 2009 Top-Down Parsing Possible Approaches Handout written by Maggie Johnson and revised by Julie Zelenski. The syntax analysis phase of a compiler verifies that the

### CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

### Parsing Part II (Top-down parsing, left-recursion removal)

Parsing Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

### Syntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1

Syntactic Analysis Chapter 4 Compiler Construction Syntactic Analysis 1 Context-free Grammars The syntax of programming language constructs can be described by context-free grammars (CFGs) Relatively simple

### Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky

Compilation 0368-3133 Lecture 3: Syntax Analysis: Top-Down parsing Noam Rinetzky 1 Recursive descent parsing Define a function for every nonterminal Every function work as follows Find applicable production

### Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Bottom-up Parsing: Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals. Basic operation is to shift terminals from the input to the

### 4 (c) parsing. Parsing. Top down vs. bo5om up parsing

4 (c) parsing Parsing A grammar describes syntac2cally legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

### Parsing. Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018

Parsing Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018 Character stream Lexical Analyzer Machine-Independent Code Code Optimizer F r o n t e n d Token stream Syntax Analyzer Syntax tree Semantic

### Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Parser Larissa von Witte Institut für oftwaretechnik und Programmiersprachen 11. Januar 2016 L. v. Witte 11. Januar 2016 1/23 Contents Introduction Taxonomy Recursive Descent Parser hift Reduce Parser

### 3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

### Fall Compiler Principles Lecture 3: Parsing part 2. Roman Manevich Ben-Gurion University

Fall 2014-2015 Compiler Principles Lecture 3: Parsing part 2 Roman Manevich Ben-Gurion University Tentative syllabus Front End Intermediate Representation Optimizations Code Generation Scanning Lowering

### Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

Parsing Lecture 11: Parsing CSC 131 Fall, 2014 Kim Bruce Build parse tree from an expression Interested in abstract syntax tree - drops irrelevant details from parse tree Arithmetic grammar ::=

### Monday, September 13, Parsers

Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

### Context-free grammars

Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

### Introduction to Parsing. Comp 412

COMP 412 FALL 2010 Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make

### Compiler Construction: Parsing

Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax

### CS308 Compiler Principles Syntax Analyzer Li Jiang

CS308 Syntax Analyzer Li Jiang Department of Computer Science and Engineering Shanghai Jiao Tong University Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program.

### Lexical and Syntax Analysis (2)

Lexical and Syntax Analysis (2) In Text: Chapter 4 N. Meng, F. Poursardar Motivating Example Consider the grammar S -> cad A -> ab a Input string: w = cad How to build a parse tree top-down? 2 Recursive-Descent

WWW.STUDENTSFOCUS.COM UNIT -3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

### CS 406/534 Compiler Construction Parsing Part II LL(1) and LR(1) Parsing

CS 406/534 Compiler Construction Parsing Part II LL(1) and LR(1) Parsing Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof.

### Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Derivations vs Parses Grammar is used to derive string or construct parser Context ree Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)

### Wednesday, September 9, 15. Parsers

Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### COMP 181. Prelude. Next step. Parsing. Study of parsing. Specifying syntax with a grammar

COMP Lecture Parsing September, 00 Prelude What is the common name of the fruit Synsepalum dulcificum? Miracle fruit from West Africa What is special about miracle fruit? Contains a protein called miraculin

### Extra Credit Question

Top-Down Parsing #1 Extra Credit Question Given this grammar G: E E+T E T T T * int T int T (E) Is the string int * (int + int) in L(G)? Give a derivation or prove that it is not. #2 Revenge of Theory

### Parsing III. (Top-down parsing: recursive descent & LL(1) )

Parsing III (Top-down parsing: recursive descent & LL(1) ) Roadmap (Where are we?) Previously We set out to study parsing Specifying syntax Context-free grammars Ambiguity Top-down parsers Algorithm &

### Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Prelude COMP Lecture Topdown Parsing September, 00 What is the Tufts mascot? Jumbo the elephant Why? P. T. Barnum was an original trustee of Tufts : donated \$0,000 for a natural museum on campus Barnum

### A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

### Compila(on (Semester A, 2013/14)

Compila(on 0368-3133 (Semester A, 2013/14) Lecture 4: Syntax Analysis (Top- Down Parsing) Modern Compiler Design: Chapter 2.2 Noam Rinetzky Slides credit: Roman Manevich, Mooly Sagiv, Jeff Ullman, Eran

### Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Bottom-up parsing Bottom-up parsing Recall Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form If α V t,thenα is called a sentence in L(G) Otherwise it is just

### PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering

TEST 1 Date : 24 02 2015 Marks : 50 Subject & Code : Compiler Design ( 10CS63) Class : VI CSE A & B Name of faculty : Mrs. Shanthala P.T/ Mrs. Swati Gambhire Time : 8:30 10:00 AM SOLUTION MANUAL 1. a.

### Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall 2017-2018 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman

### The procedure attempts to "match" the right hand side of some production for a nonterminal.

Parsing A parser is an algorithm that determines whether a given input string is in a language and, as a side-effect, usually produces a parse tree for the input. There is a procedure for generating a

### CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

### CSE431 Translation of Computer Languages

CSE431 Translation of Computer Languages Top Down Parsers Doug Shook Top Down Parsers Two forms: Recursive Descent Table Also known as LL(k) parsers: Read tokens from Left to right Produces a Leftmost

### Syntax Analysis (Parsing)

An overview of parsing Functions & Responsibilities Context Free Grammars Concepts & Terminology Writing and Designing Grammars Syntax Analysis (Parsing) Resolving Grammar Problems / Difficulties Top-Down

### Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Eliminating Ambiguity Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity. Example: consider the following grammar stat if expr then stat if expr then stat else stat other One can

### Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

### Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

### MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) \$ X s0 shift to s2 error error goto s1

### Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal

Acknowledgements The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error

### Wednesday, August 31, Parsers

Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

### Lecturer: William W.Y. Hsu. Programming Languages

Lecturer: William W.Y. Hsu Programming Languages Chapter 2 - Programming Language Syntax 3 Scanning The main task of scanning is to identify tokens. 4 Pseudo-Code Scanner We skip any initial white spaces

### Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Syntax Analysis: Context-free Grammars, Pushdown Automata and Part - 4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler

### COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

### Lecture Notes on Top-Down Predictive LL Parsing

Lecture Notes on Top-Down Predictive LL Parsing 15-411: Compiler Design Frank Pfenning Lecture 8 1 Introduction In this lecture we discuss a parsing algorithm that traverses the input string from l eft

### Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman

### CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

### Concepts of Programming Languages Recitation 1: Predictive Parsing

Concepts of Programming Languages Recitation 1: Predictive Parsing Oded Padon odedp@mail.tau.ac.il Reference: Modern Compiler Implementation in Java by Andrew W. Appel Ch. 3 Administrative Course website

### CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

### Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous