# Lexical and Syntax Analysis. Top-Down Parsing

1 Lexical and Syntax Analysis Top-Down Parsing

2 String of characters (easy for humans to write and understand) → lexemes identified → string of tokens → data structure (easy for programs to transform)

3 Syntax

A syntax is a set of rules defining the valid strings of a language, often specified by a context-free grammar. For example, a grammar E for arithmetic expressions:

    e → x | y | e + e | e - e | e * e | ( e )

4 Derivations

A derivation is a proof that some string conforms to a grammar. A leftmost derivation:

    e ⇒ e + e
      ⇒ x + e
      ⇒ x + ( e )
      ⇒ x + ( e * e )
      ⇒ x + ( y * e )
      ⇒ x + ( y * x )

5 Derivations

A rightmost derivation:

    e ⇒ e + e
      ⇒ e + ( e )
      ⇒ e + ( e * e )
      ⇒ e + ( e * x )
      ⇒ e + ( y * x )
      ⇒ x + ( y * x )

Many ways to derive the same string: many ways to write the same proof.

6 Parse tree: motivation

A parse tree is also a proof that a given input is valid according to the grammar. But a parse tree is more concise (we don't write out the sentence every time a non-terminal is expanded) and abstracts over the order in which rules are applied.

7 Parse tree: intuition

If non-terminal n has a production n → X Y Z, where X, Y, and Z are terminals or non-terminals, then a parse tree may have an interior node labelled n with three children labelled X, Y, and Z:

          n
        / | \
       X  Y  Z

8 Parse tree: definition

A parse tree is a tree in which: the root is labelled by the start symbol; each leaf is labelled by a terminal symbol, or ε; each interior node is labelled by a non-terminal; if n is a non-terminal labelling an interior node whose children are X1, X2, …, Xn then there must exist a production n → X1 X2 … Xn.
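The definition above maps naturally onto a small tree data structure. The following is a minimal sketch in C, not taken from the slides: the names `Node`, `leaf`, `interior`, `yield`, and `example_tree` are assumptions made for illustration. It builds one parse tree of x + y * x under grammar E and reads the input string back off the leaves.

```c
#include <stdlib.h>

/* Hypothetical parse-tree node: interior nodes carry a non-terminal,
   leaves carry a terminal and have no children. */
typedef struct Node {
  char label;             /* 'e' for the non-terminal, otherwise a terminal */
  int nkids;              /* 0 for a leaf */
  struct Node* kids[3];   /* productions of grammar E have at most 3 symbols */
} Node;

Node* leaf(char c) {
  Node* n = calloc(1, sizeof(Node));
  n->label = c;
  return n;
}

Node* interior(char label, int nkids, Node* a, Node* b, Node* c) {
  Node* n = leaf(label);
  n->nkids = nkids;
  n->kids[0] = a; n->kids[1] = b; n->kids[2] = c;
  return n;
}

/* Write the leaves, left to right, into out; returns the end of the string. */
char* yield(const Node* n, char* out) {
  if (n->nkids == 0) { *out++ = n->label; *out = '\0'; return out; }
  for (int i = 0; i < n->nkids; i++) out = yield(n->kids[i], out);
  return out;
}

/* e -> e + e, where the left e -> x and the right e -> e * e over y and x. */
Node* example_tree(void) {
  Node* product = interior('e', 3,
                           interior('e', 1, leaf('y'), NULL, NULL),
                           leaf('*'),
                           interior('e', 1, leaf('x'), NULL, NULL));
  return interior('e', 3,
                  interior('e', 1, leaf('x'), NULL, NULL),
                  leaf('+'),
                  product);
}
```

Calling `yield(example_tree(), buf)` fills `buf` with the leaves in order, which is the string the tree proves valid.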

9 Example 1

Example input string: x + y * x

A resulting parse tree according to grammar E:

            e
          / | \
         e  +  e
         |   / | \
         x  e  *  e
            |     |
            y     x

10 Example 2

The following is not a parse tree according to grammar E.

            e
          / | \
         x  +  e
             / | \
            e  *  e
            |     |
            y     x

Why? Because e → x + e is not a production in grammar E.

11 Grammar notation

Non-terminals are underlined. Rather than writing

    e → x
    e → e + e

we may write:

    e → x | e + e

(Also, symbols → and ::= will be used interchangeably.)

12 Syntax Analysis

String of symbols → Parse tree

A parse tree is: 1. A proof that a given input is valid according to the grammar; 2. A data structure that is convenient for compilers to process. (Syntax analysis may also report that the input string is invalid.)

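13 Ambiguity

If there exists more than one parse tree for some string then the grammar is ambiguous. For example, the string x+y*x has two parse trees. The first groups it as x+(y*x):

            e
          / | \
         e  +  e
         |   / | \
         x  e  *  e
            |     |
            y     x

The second groups it as (x+y)*x:

              e
            / | \
           e  *  e
         / | \   |
        e  +  e  x
        |     |
        x     y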

14 Operator precedence Different parse trees often have different meanings, so we usually want unambiguous grammars. Conventionally, * has a higher precedence (binds tighter) than +, so there is only one interpretation of x+y*x, namely x+(y*x).

15 Operator associativity Even with precedence rules, ambiguity remains, e.g. x-x-x-x. Binary operators are either: left-associative; right-associative; non-associative. Conventionally, - is left-associative, so there is only one interpretation of x-x-x-x, namely ((x-x)-x)-x.
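The effect of these conventions can be checked with ordinary arithmetic. The sketch below is an illustration only: the function names and the sample numbers are assumptions, chosen so that each bracketing gives a different value. C itself follows the conventional rules, so the unbracketed forms agree with the conventional readings.

```c
/* Precedence: two ways to bracket 2 + 3 * 4. */
int times_binds_tighter(void) { return 2 + (3 * 4); }   /* 14 */
int plus_binds_tighter(void)  { return (2 + 3) * 4; }   /* 20 */
int c_precedence(void)        { return 2 + 3 * 4; }     /* C picks the first reading */

/* Associativity: two ways to bracket 10 - 4 - 3 - 2. */
int minus_left_assoc(void)  { return ((10 - 4) - 3) - 2; }   /* 1 */
int minus_right_assoc(void) { return 10 - (4 - (3 - 2)); }   /* 7 */
int c_associativity(void)   { return 10 - 4 - 3 - 2; }       /* C picks the first reading */
```

Different parse trees give different results here, which is why a compiler needs a single agreed tree for each expression.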

16 Ambiguity removal

Example input:

    e → x | y | e + e | e - e | e * e | ( e )

All operators are left-associative, and * binds tighter than + and -.

17 Ambiguity removal

Example output:

    e   → e + e_1 | e - e_1 | e_1
    e_1 → e_1 * e_2 | e_2
    e_2 → ( e ) | x | y

Note: ignoring bracketed expressions, e_1 disallows + and -, and e_2 disallows +, -, and *.

18 Disallowed parse trees

After disambiguation, there are no parse trees corresponding to the following originals:

              e
            / | \
           e  *  e
         / | \   |
        e  +  e  x
        |     |
        x     y

            e
          / | \
         e  +  e
         |   / | \
         x  e  -  e
            |     |
            y     x

LHS of * cannot contain a +. RHS of + cannot contain a -.

19 Ambiguity removal: step-by-step

Given a non-terminal e which involves operators at n levels of precedence:

Step 1: introduce n+1 new non-terminals, e_0 … e_n.

20 Let op denote an operator with precedence i.

Step 2a: replace each production

    e → e op e

with

    e_i → e_i op e_{i+1} | e_{i+1}

if op is left-associative, or

    e_i → e_{i+1} op e_i | e_{i+1}

if op is right-associative.

21 Step 2b: replace each production

    e → op e

with

    e_i → op e_i | e_{i+1}

Step 2c: replace each production

    e → e op

with

    e_i → e_i op | e_{i+1}

22 Construct the precedence table:

    Operator   Precedence
    +, -       0
    *          1

Grammar E after step 2 becomes:

    e_0 → e_0 + e_1 | e_0 - e_1 | e_1
    e_1 → e_1 * e_2 | e_2
    e   → ( e ) | x | y

23 Step 3: replace each remaining production e → α with e_n → α.

After step 3:

    e_0 → e_0 + e_1 | e_0 - e_1 | e_1
    e_1 → e_1 * e_2 | e_2
    e_2 → ( e ) | x | y

24 Step 4: replace all occurrences of e_0 with e.

After step 4:

    e   → e + e_1 | e - e_1 | e_1
    e_1 → e_1 * e_2 | e_2
    e_2 → ( e ) | x | y
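As a worked check (not part of the original slides), here is the leftmost derivation of x + y * x in the grammar after step 4. Because e_1 cannot produce a top-level +, the product y * x has to sit under the right operand of +, so this is the only parse.

```latex
\begin{align*}
e &\Rightarrow e + e_1 \\
  &\Rightarrow e_1 + e_1 \\
  &\Rightarrow e_2 + e_1 \\
  &\Rightarrow x + e_1 \\
  &\Rightarrow x + e_1 * e_2 \\
  &\Rightarrow x + e_2 * e_2 \\
  &\Rightarrow x + y * e_2 \\
  &\Rightarrow x + y * x
\end{align*}
```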

25 Exercise 1

Consider the following ambiguous grammar for logical propositions.

    p → 0        (Zero)
      | 1        (One)
      | ~ p      (Negation)
      | p + p    (Disjunction)
      | p * p    (Conjunction)

Now let + and * be right-associative and the operators in increasing order of binding strength be: +, *, ~. Give an unambiguous grammar for logical propositions.

26 Exercise 2 Which of the following grammars are ambiguous? b 0 b e + e e e e x s if b then s if b then s else s skip

27 Homework exercise

Consider the following ambiguous grammar G.

    s → if b then s | if b then s else s | skip

Give an unambiguous grammar that accepts the same language as G.

28 Summary so far

The syntax of a language is often specified by a context-free grammar. Derivations and parse trees are proofs. Parse trees lead to a concise definition of ambiguity. Unambiguous grammars can be constructed using rules of precedence and associativity.

29 PART 2: TOP-DOWN PARSING Recursive-Descent Backtracking Left-Factoring Predictive Parsing Left-Recursion Removal First and Follow Sets Parsing tables and LL(1)

30 Top-down parsing

Top-down: begin with the start symbol and expand non-terminals, succeeding when the input string is matched. A good strategy for writing parsers: 1. Implement a syntax checker to accept or reject input strings. 2. Modify the checker to construct a parse tree (straightforward).

31 RECURSIVE DESCENT A popular top-down parsing technique.

32 Recursive descent A recursive descent parser consists of a set of functions, one for each non-terminal. The function for non-terminal n returns true if some prefix of the input string can be derived from n, and false otherwise.

33 Consuming the input

We assume a global variable next points to the input string.

    char* next;

Consume c from input if possible.

    int eat(char c) {
      if (*next == c) { next++; return 1; }
      return 0;
    }
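A short usage sketch for `eat`, under the same assumption of a global `next`. The helper `consumed_ab` is hypothetical and exists only to show that a failed `eat` leaves `next` where it was, while a successful one advances it by one character.

```c
char* next;   /* points at the unconsumed part of the input */

/* Consume c from the input if it is the next character. */
int eat(char c) {
  if (*next == c) { next++; return 1; }
  return 0;
}

/* Hypothetical helper: try to eat 'a' and then 'b', and report how many
   characters of input were consumed. */
int consumed_ab(char* input) {
  next = input;
  if (eat('a')) eat('b');
  return (int)(next - input);
}
```

With this helper, the input "abc" consumes two characters, "axc" consumes one, and "xbc" consumes none.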

34 Recursive descent

Let parse(X) denote X() if X is a non-terminal, and eat(X) if X is a terminal. For each non-terminal N, introduce:

    int N() {
      char* save = next;

      /* for each production N → X1 X2 … Xn: */
      if (parse(X1) && parse(X2) && … && parse(Xn)) return 1;
      else next = save;   /* backtrack */

      return 0;
    }

35 Exercise 4

Consider the following grammar G with start symbol e.

    e → ( e + e ) | ( e * e ) | v
    v → x | y

Using recursive descent, write a syntax checker for grammar G.

36 Answer (part 1)

    int v();   /* defined in part 2 */

    int e() {
      char* save = next;
      if (eat('(') && e() && eat('+') && e() && eat(')')) return 1;
      else next = save;
      if (eat('(') && e() && eat('*') && e() && eat(')')) return 1;
      else next = save;
      if (v()) return 1;
      else next = save;
      return 0;
    }

37 Answer (part 2)

    int v() {
      char* save = next;
      if (eat('x')) return 1;
      else next = save;
      if (eat('y')) return 1;
      else next = save;
      return 0;
    }

38 Exercise 5 How many function calls are made by the recursive descent parser to parse the following strings? (x*x) ((x*x)*x) (((x*x)*x)*x) (See animation of backtracking.)

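39 Answer

The number of calls grows much faster than the length of the input string: for these inputs it more than doubles with each extra level of nesting.

    Input string     Length   Calls
    (x*x)            5        21
    ((x*x)*x)        9        53
    (((x*x)*x)*x)

(Plot: function calls against string length.)

Lesson: backtracking is expensive!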
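The counts can be reproduced by instrumenting the checker for grammar G. This is a sketch under one assumed counting convention: every call to `e`, `v`, and `eat` counts as one function call. The counter `calls` and the wrapper `count_calls` are names introduced here for illustration.

```c
char* next;
int calls;   /* assumed convention: one tick per call to e, v, or eat */

int eat(char c) {
  calls++;
  if (*next == c) { next++; return 1; }
  return 0;
}

int v(void) {
  calls++;
  char* save = next;
  if (eat('x')) return 1;
  next = save;
  if (eat('y')) return 1;
  next = save;
  return 0;
}

int e(void) {
  calls++;
  char* save = next;
  /* e -> ( e + e ) */
  if (eat('(') && e() && eat('+') && e() && eat(')')) return 1;
  next = save;
  /* e -> ( e * e ): the inner e is parsed again after the '+' attempt fails */
  if (eat('(') && e() && eat('*') && e() && eat(')')) return 1;
  next = save;
  /* e -> v */
  if (v()) return 1;
  next = save;
  return 0;
}

/* Run the checker on input and return the number of calls made. */
int count_calls(char* input) {
  next = input;
  calls = 0;
  e();
  return calls;
}
```

Under this convention `count_calls` gives 21 for (x*x) and 53 for ((x*x)*x), matching the table, and 117 for (((x*x)*x)*x). The repeated work comes from the second alternative re-parsing the whole left operand that the first alternative already parsed and threw away.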

40 LEFT FACTORING Reducing backtracking!

41 Left factoring When two productions for a non-terminal share a common prefix, expensive backtracking can be avoided by left-factoring the grammar. Idea: Introduce a new nonterminal that accepts each of the different suffixes.

42 Example 3

Left-factoring grammar G by introducing non-terminal r:

    e → ( e r | v        common prefix: ( e
    r → + e ) | * e )    different suffixes
    v → x | y
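A recursive descent checker for the left-factored grammar might look as follows. This is a sketch rather than the slides' own answer: the wrapper `accepts` is an assumption added so that the whole input, not just a prefix, must be consumed.

```c
char* next;

int eat(char c) {
  if (*next == c) { next++; return 1; }
  return 0;
}

int e(void);

/* v -> x | y */
int v(void) { return eat('x') || eat('y'); }

/* r -> + e ) | * e ) : only the operator and what follows it is retried */
int r(void) {
  char* save = next;
  if (eat('+') && e() && eat(')')) return 1;
  next = save;
  if (eat('*') && e() && eat(')')) return 1;
  next = save;
  return 0;
}

/* e -> ( e r | v : the common prefix "( e" is parsed once */
int e(void) {
  char* save = next;
  if (eat('(') && e() && r()) return 1;
  next = save;
  if (v()) return 1;
  next = save;
  return 0;
}

/* Hypothetical wrapper: accept only if the entire input is matched. */
int accepts(char* input) {
  next = input;
  return e() && *next == '\0';
}
```

The left operand is now parsed before the choice between + and * is made, so a wrong guess costs one failed `eat` instead of a repeated parse of the operand.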

43 Effect of left-factoring

Number of calls is now linear in the length of the input string.

    Input string     Length   Calls
    (x*x)            5        13
    ((x*x)*x)        9        22
    (((x*x)*x)*x)

(Plot: function calls against string length.)

Lesson: left-factoring a grammar reduces backtracking.

44 PREDICTIVE PARSING Eliminating backtracking!

45 Predictive parsing Idea: know which production of a non-terminal to choose based solely on the next input symbol. Advantage: very efficient since it eliminates all backtracking. Disadvantage: not all grammars can be parsed in this way. (But many useful ones can.)

46 Running example

The following grammar H will be used as a running example to demonstrate predictive parsing.

    e → e + e | e * e | ( e ) | x | y

Example: x+y*(y+x)

47 Removing ambiguity

Since + and * are left-associative and * binds tighter than +, we can derive an unambiguous variant of H.

    e → e + t | t
    t → t * f | f
    f → ( e ) | x | y

48 Left recursion

Problem: left-recursive grammars cause recursive descent parsers to loop forever.

    int e() {
      char* save = next;
      if (e() && eat('+') && t()) return 1;   /* call to self without consuming any input */
      next = save;
      if (t()) return 1;
      next = save;
      return 0;
    }

49 Eliminating left recursion

Let α denote any sequence of grammar symbols.

    Rule 1:  n → n α    becomes    n' → α n'
    Rule 2:  n → α      becomes    n → α n'     (where α does not begin with n)
    Rule 3:  introduce the new production  n' → ε

50 Eliminating left recursion

Example before:

    e → e + v | v
    v → x | y

and after:

    e  → v e'
    v  → x | y
    e' → ε | + v e'
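With the left recursion removed, recursive descent terminates. The sketch below checks the "after" grammar; `e1` stands for e', and the wrapper `accepts` is an assumption added to require that the whole input is consumed.

```c
char* next;

int eat(char c) {
  if (*next == c) { next++; return 1; }
  return 0;
}

/* v -> x | y */
int v(void) { return eat('x') || eat('y'); }

/* e' -> + v e' | ε : every recursive call happens after eating a '+' */
int e1(void) {
  char* save = next;
  if (eat('+') && v() && e1()) return 1;
  next = save;
  return 1;   /* the ε alternative matches without consuming input */
}

/* e -> v e' */
int e(void) { return v() && e1(); }

/* Hypothetical wrapper: accept only if the entire input is matched. */
int accepts(char* input) {
  next = input;
  return e() && *next == '\0';
}
```

Each recursive call to `e1` is preceded by a successful `eat('+')`, so the input shrinks on every level of recursion and the loop of slide 48 cannot occur.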

51 Example 4

Running example, after eliminating left-recursion.

    e  → t e'
    e' → + t e' | ε
    t  → f t'
    t' → * f t' | ε
    f  → ( e ) | x | y

52 first and follow sets Predictive parsers are built using the first and follow sets of each non-terminal in a grammar.

53 Definition of first sets

Let α denote any sequence of grammar symbols. If α can derive a string beginning with terminal a then a ∈ first(α). If α can derive ε then ε ∈ first(α).

54 Computing first sets

If a is a terminal then a ∈ first(a α). The empty string ε ∈ first(ε). If X1 X2 … Xn is a sequence of grammar symbols and there is an i such that a ∈ first(Xi) and ε ∈ first(Xj) for all j < i, then a ∈ first(X1 X2 … Xn). If n → α is a production then everything in first(α) is in first(n).
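These rules can be applied repeatedly until no set grows any further. The sketch below does this for the running example of slide 51. The encoding is an assumption made for illustration: upper-case letters stand for non-terminals (E = e, P = e', T = t, Q = t', F = f), an empty right-hand side stands for ε, and '#' marks ε inside a set.

```c
#include <ctype.h>
#include <string.h>

typedef struct { char lhs; const char* rhs; } Prod;

/* Running example, in the assumed single-letter encoding. */
static const Prod prods[] = {
  {'E', "TP"}, {'P', "+TP"}, {'P', ""},
  {'T', "FQ"}, {'Q', "*FQ"}, {'Q', ""},
  {'F', "(E)"}, {'F', "x"}, {'F', "y"},
};
enum { NPRODS = sizeof prods / sizeof prods[0] };

static char first[128][16];   /* first[n]: string of terminals, '#' for ε */

/* Add c to set unless already present; report whether the set grew. */
static int add(char* set, char c) {
  if (strchr(set, c)) return 0;
  size_t len = strlen(set);
  set[len] = c;
  set[len + 1] = '\0';
  return 1;
}

/* Add first(X1 X2 ... Xn) to first[lhs], scanning the sequence left to right. */
static int add_first_of_seq(char lhs, const char* rhs) {
  int changed = 0;
  for (; *rhs; rhs++) {
    if (!isupper((unsigned char)*rhs))            /* a terminal starts the string */
      return changed | add(first[(int)lhs], *rhs);
    for (const char* p = first[(int)*rhs]; *p; p++)
      if (*p != '#') changed |= add(first[(int)lhs], *p);
    if (!strchr(first[(int)*rhs], '#'))           /* Xi cannot derive ε: stop */
      return changed;
  }
  return changed | add(first[(int)lhs], '#');     /* every Xi can derive ε */
}

/* Iterate over all productions to a fixed point, then return first(n). */
const char* first_of(char n) {
  int changed = 1;
  while (changed) {
    changed = 0;
    for (int i = 0; i < NPRODS; i++)
      changed |= add_first_of_seq(prods[i].lhs, prods[i].rhs);
  }
  return first[(int)n];
}
```

The fixed-point loop is needed because first(e) depends on first(t), which depends on first(f); a single pass in the order above would leave first(e) empty.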

55 Exercise 6

Give all members of the sets first( v ), first( e ), and first( v e ) for the grammar:

    e → ( e + e ) | ( e * e ) | v
    v → x | ε

56 Exercise 7

What are the first sets for each non-terminal in the following grammar?

    e  → t e'
    e' → + t e' | ε
    t  → f t'
    t' → * f t' | ε
    f  → ( e ) | x | y

57 Answer first( f ) = { (, x, y } first( t' ) = { *, ε } first( t ) = { (, x, y } first( e' ) = { +, ε } first( e ) = { (, x, y }

58 Definition of follow sets

Let α and β denote any sequence of grammar symbols. Terminal a ∈ follow(n) if the start symbol of the grammar can derive a string of grammar symbols in which a immediately follows n. The set follow(n) never contains ε.

59 End markers In predictive parsing, it is useful to mark the end of the input string with a \$ symbol. ((x*x)*x)\$ \$ is equivalent to '\0' in C.

60 Computing follow sets

If s is the start symbol of the grammar then \$ ∈ follow(s). If n → α x β then everything in first(β) except ε is in follow(x). If n → α x, or n → α x β and ε ∈ first(β), then everything in follow(n) is in follow(x).
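As a worked illustration of these rules (added here, not from the slides), consider follow(t) in the running grammar e → t e', e' → + t e' | ε, t → f t', t' → * f t' | ε, f → ( e ) | x | y. The same reasoning applied to e' → + t e' also puts follow(e') inside follow(t), and follow(e') equals follow(e), so nothing new is added.

```latex
\begin{align*}
\mathrm{first}(e') &= \{+, \varepsilon\} \\
e \to t\, e' &\;\Longrightarrow\; \mathrm{first}(e') \setminus \{\varepsilon\} = \{+\} \subseteq \mathrm{follow}(t) \\
e \to t\, e',\ \varepsilon \in \mathrm{first}(e') &\;\Longrightarrow\; \mathrm{follow}(e) \subseteq \mathrm{follow}(t) \\
\mathrm{follow}(e) &= \{\$, )\} \quad \text{(start symbol, and } f \to ( e ) \text{)} \\
\mathrm{follow}(t) &= \{+, \$, )\}
\end{align*}
```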

61 Exercise

Give all members of the sets follow( e ) and follow( v ) for the grammar:

    e → ( e + e ) | ( e * e ) | v
    v → x | ε

62 Exercise 8

What are the follow sets for each non-terminal in the following grammar?

    e  → t e'
    e' → + t e' | ε
    t  → f t'
    t' → * f t' | ε
    f  → ( e ) | x | y

63 Answer follow( e' ) = { \$, ) } follow( e ) = { \$, ) } follow( t' ) = { +, \$, ) } follow( t ) = { +, \$, ) } follow( f ) = { *, +, ), \$ }

64 Predictive parsing table

For each non-terminal n, a parse table T defines which production of n should be chosen, based on the next input symbol a. Rows are non-terminals, columns are terminals, and each cell holds a production:

              (              +              ...
    e         e → ( e r
    r                        r → + e )
    v

65 Predictive parsing table

    for each production n → α
      for each terminal a ∈ first(α)
        add n → α to T[n, a]
      if ε ∈ first(α) then
        for each b ∈ follow(n)
          add n → α to T[n, b]

66 Exercise 9

Construct a predictive parsing table for the following grammar.

    e  → t e'
    e' → + t e' | ε
    t  → f t'
    t' → * f t' | ε
    f  → ( e ) | x | y

67 LL(1) grammars

If each cell in the parse table contains at most one entry then a non-backtracking parser can be constructed and the grammar is said to be LL(1). First L: left-to-right scanning of the input. Second L: a leftmost derivation is constructed. The (1): using one input symbol of look-ahead to decide which grammar production to choose.

68 Exercise 10 Write a syntax checker for the grammar of Exercise 9, utilising the predictive parsing table. int e() {... } It should return a non-zero value if some prefix of the string pointed to by next conforms to the grammar, otherwise it should return zero.

69 Answer (part 1)

    int t(); int e1();   /* t is defined in part 2 */

    int e() {
      if (*next == 'x') return t() && e1();
      if (*next == 'y') return t() && e1();
      if (*next == '(') return t() && e1();
      return 0;
    }

    int e1() {
      if (*next == '+') return eat('+') && t() && e1();
      if (*next == ')') return 1;
      if (*next == '\0') return 1;
      return 0;
    }

70 Answer (part 2)

    int f(); int t1();   /* f is defined in part 3 */

    int t() {
      if (*next == 'x') return f() && t1();
      if (*next == 'y') return f() && t1();
      if (*next == '(') return f() && t1();
      return 0;
    }

    int t1() {
      if (*next == '+') return 1;
      if (*next == '*') return eat('*') && f() && t1();
      if (*next == ')') return 1;
      if (*next == '\0') return 1;
      return 0;
    }

71 Answer (part 3)

    int f() {
      if (*next == 'x') return eat('x');
      if (*next == 'y') return eat('y');
      if (*next == '(') return eat('(') && e() && eat(')');
      return 0;
    }

(Notice how backtracking is not required.)

72 Predictive parsing algorithm

Let s be a stack, initially containing the end marker \$ with the start symbol of the grammar on top, and let next point to the input string.

    while (top(s) != $) {
      if (top(s) is a terminal) {
        if (top(s) == *next) { pop(s); next++; }
        else error();
      }
      else if (T[top(s), *next] == X → Y1 … Yn) {
        pop(s);
        push(s, Yn … Y1);   /* Y1 on top */
      }
      else error();         /* empty table cell */
    }
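The algorithm can be sketched in C for the running grammar of Exercise 9. Everything specific here is an assumption made for illustration: the single-letter encoding of non-terminals (E = e, P = e', T = t, Q = t', F = f), the hard-coded `table` function derived from the first and follow sets given earlier, the fixed-size stack, and the name `ll1_accepts`. The sketch does not skip spaces, so inputs are written without them.

```c
#include <ctype.h>
#include <string.h>

/* Membership test that never matches the string terminator. */
static int in(char a, const char* set) {
  return a != '\0' && strchr(set, a) != NULL;
}

/* T[n, a]: right-hand side to expand n with ("" for ε), or NULL for an error. */
static const char* table(char n, char a) {
  switch (n) {
    case 'E': return in(a, "xy(") ? "TP" : NULL;
    case 'P': return a == '+' ? "+TP" : in(a, ")$") ? "" : NULL;
    case 'T': return in(a, "xy(") ? "FQ" : NULL;
    case 'Q': return a == '*' ? "*FQ" : in(a, "+)$") ? "" : NULL;
    case 'F': return a == '(' ? "(E)" : a == 'x' ? "x" : a == 'y' ? "y" : NULL;
  }
  return NULL;
}

/* Returns 1 if the whole input conforms to the grammar, 0 otherwise. */
int ll1_accepts(const char* input) {
  char stack[256];
  int top = 0;
  const char* next = input;

  stack[top++] = '$';   /* end marker at the bottom */
  stack[top++] = 'E';   /* start symbol on top */

  while (stack[top - 1] != '$') {
    char s = stack[top - 1];
    char a = *next ? *next : '$';           /* '\0' plays the role of $ */
    if (!isupper((unsigned char)s)) {       /* terminal on top: must match */
      if (s != a) return 0;
      top--;
      next++;
    } else {                                /* non-terminal: consult the table */
      const char* rhs = table(s, a);
      if (rhs == NULL) return 0;
      if (top + 3 > (int)sizeof stack) return 0;   /* sketch: give up if too deep */
      top--;
      for (size_t i = strlen(rhs); i > 0; i--)     /* push in reverse, Y1 on top */
        stack[top++] = rhs[i - 1];
    }
  }
  return *next == '\0';
}
```

For example, `ll1_accepts("x+x*y")` returns 1, while `ll1_accepts("x+")` returns 0 because the cell for t on the end marker is empty.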

73 Exercise 11 Give the steps that a predictive parser takes to parse the following input. x + x * y For each step (loop iteration), show the input stream, the stack, and the parser action.

74 Acknowledgements Plus Stanford University lecture notes by Maggie Johnson and Julie Zelenski.

75 APPENDIX

76 Context-free grammars

Have four components: 1. A set of terminal symbols. 2. A set of non-terminal symbols. 3. A set of productions (or rules) of the form n → X1 … Xn, where n is a non-terminal and X1 … Xn is any sequence of terminals, non-terminals, and ε. 4. The start symbol (one of the non-terminals).

77 Notation

Non-terminals are underlined. Rather than writing

    e → x
    e → e + e

we may write:

    e → x | e + e

(Also, symbols → and ::= will be used interchangeably.)

78 Why context-free?

Unrestricted ⊃ Context-Sensitive ⊃ Context-Free ⊃ Regular

Nice balance between expressive power and efficiency of parsing.

79 Chomsky hierarchy

Let t range over terminals, x and z over non-terminals, and α, β and γ over sequences of terminals, non-terminals, and ε.

    Grammar             Valid productions
    Unrestricted        α → β
    Context-Sensitive   α x γ → α β γ
    Context-Free        x → β
    Regular             x → t,  x → t z,  x → ε

80 Backus-Naur Form

BNF is a standard ASCII notation for specification of context-free grammars whose terminals are ASCII characters. For example:

    <exp> ::= <exp> "+" <exp> | <exp> "-" <exp> | <var>
    <var> ::= "x" | "y"

The BNF notation can itself be specified in BNF.

### Lexical and Syntax Analysis

Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Lexical and Syntax Analysis (of Programming Languages) Top-Down Parsing Easy for humans to write and understand String of characters

### A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

### CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

CS1622 Lecture 9 Parsing (4) CS 1622 Lecture 9 1 Today Example of a recursive descent parser Predictive & LL(1) parsers Building parse tables CS 1622 Lecture 9 2 A Recursive Descent Parser. Preliminaries

### Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

### Wednesday, September 9, 15. Parsers

Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### 3. Parsing. Oscar Nierstrasz

3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess

### Syntax Analysis Part I

Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

### Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,

### Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

### Monday, September 13, Parsers

Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

### Top down vs. bottom up parsing

Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

### Building Compilers with Phoenix

Building Compilers with Phoenix Syntax-Directed Translation Structure of a Compiler Character Stream Intermediate Representation Lexical Analyzer Machine-Independent Optimizer token stream Intermediate

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Predictive Parsers LL(k) Parsing Can we avoid backtracking? es, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. his is possible if the first terminal of

### Part 3. Syntax analysis. Syntax analysis 96

Part 3 Syntax analysis Syntax analysis 96 Outline 1. Introduction 2. Context-free grammar 3. Top-down parsing 4. Bottom-up parsing 5. Conclusion and some practical considerations Syntax analysis 97 Structure

### Wednesday, August 31, Parsers

Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

### Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Concepts Introduced in Chapter 2 A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A

### Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Abstract Syntax Trees & Top-Down Parsing

Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

### Table-Driven Parsing

Table-Driven Parsing It is possible to build a non-recursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls [1] The non-recursive parser looks up the production

### Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia Building a Parser III CS164 3:30-5:00 10 vans WA1 due on hu PA2 in a week Slides on the web site I do my best to have slides ready and posted by the end of the preceding logical day yesterday,

### 1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

CMSC 51086 Winter 2004 Computer Language Implementation Lecture Note 3 February 4, 2004 Predicative parsing 1 Introduction This note continues the discussion of parsing based on context free languages.

### COP 3402 Systems Software Top Down Parsing (Recursive Descent)

COP 3402 Systems Software Top Down Parsing (Recursive Descent) Top Down Parsing 1 Outline 1. Top down parsing and LL(k) parsing 2. Recursive descent parsing 3. Example of recursive descent parsing of arithmetic

### Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character

### COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

### Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Syntax Analysis COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Based in part on slides and notes by Bjoern Brandenburg, S. Olivier and A. Block. 1 The Big Picture Character Stream Token

### Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 9/18/06 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Please let me know if there are continued

### Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform

### Compilers. Predictive Parsing. Alex Aiken

Compilers Like recursive-descent but parser can predict which production to use By looking at the next fewtokens No backtracking Predictive parsers accept LL(k) grammars L means left-to-right scan of input

### Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Susan Eggers 1 CSE 401 Syntax Analysis/Parsing Context-free grammars (CFG s) Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax

### CS502: Compilers & Programming Systems

CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:

### Syntactic Analysis. Top-Down Parsing

Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

### Context-free grammars (CFG s)

Syntax Analysis/Parsing Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax tree (AST) AST: captures hierarchical structure of

### CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

### Parsing III. (Top-down parsing: recursive descent & LL(1) )

Parsing III (Top-down parsing: recursive descent & LL(1) ) Roadmap (Where are we?) Previously We set out to study parsing Specifying syntax Context-free grammars Ambiguity Top-down parsers Algorithm &

### Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

### Chapter 3. Parsing #1

Chapter 3 Parsing #1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs)

### Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

Parsing #1 Leonidas Fegaras CSE 5317/4305 L3: Parsing #1 1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and

### COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

### Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky

Compilation 0368-3133 Lecture 3: Syntax Analysis: Top-Down parsing Noam Rinetzky 1 Recursive descent parsing Define a function for every nonterminal Every function work as follows Find applicable production

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

### CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CMPS 3500 Programming Languages Dr. Chengwei Lei CEECS California State University, Bakersfield Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing

### Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser III CS164 3:30-5:00 TT 10 Evans 1 Overview Finish recursive descent parser when it breaks down and how to fix it eliminating left recursion reordering productions Predictive parsers (aka

### Lexical and Syntax Analysis. Bottom-Up Parsing

Lexical and Syntax Analysis Bottom-Up Parsing Parsing There are two ways to construct derivation of a grammar. Top-Down: begin with start symbol; repeatedly replace an instance of a production s LHS with

### COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

### Syntax. In Text: Chapter 3

Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program

### Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

### COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and

### CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Eliminating Ambiguity Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity. Example: consider the following grammar stat if expr then stat if expr then stat else stat other One can

### Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal

### Defining syntax using CFGs

Defining syntax using CFGs Roadmap Last time Defined context-free grammar This time CFGs for specifying a language s syntax Language membership List grammars Resolving ambiguity CFG Review G = (N,Σ,P,S)

### CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

### Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

LL(k) Grammars We need a bunch of terminology. For any terminal string a we write First k (a) is the prefix of a of length k (or all of a if its length is less than k) For any string g of terminal and

### CSCI312 Principles of Programming Languages

Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages! LL Parsing!! Xu Liu Derived from Keith Cooper s COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill

### CPS 506 Comparative Programming Languages. Syntax Specification

CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens

### Compiler Design 1. Top-Down Parsing. Goutam Biswas. Lect 5

Compiler Design 1 Top-Down Parsing Compiler Design 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed as a generator of a substring of the input. We may view a non-terminal

### Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Prelude COMP Lecture Topdown Parsing September, 00 What is the Tufts mascot? Jumbo the elephant Why? P. T. Barnum was an original trustee of Tufts : donated \$0,000 for a natural museum on campus Barnum

### CA Compiler Construction

CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

### Ambiguity. Grammar E → E + E | E * E | ( E ) | int. The string int * int + int has two parse trees.

Administrivia. Ambiguity, Precedence, Associativity & Top-down Parsing. Team assignments this evening for all those not listed as having one. HW#3 is now available, due next Tuesday morning (Monday is a holiday).

### Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Derivations vs Parses. Grammar is used to derive string or construct parser. Context Free Grammars. A derivation is a sequence of applications of rules, starting from the start symbol S ... (sentence)

### 8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

8 Parsing Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces strings A parser constructs a parse tree for a string

### Types of parsing. CMSC 430 Lecture 4, Page 1

Types of parsing. Top-down parsers: start at the root of the derivation tree and fill in; pick a production and try to match the input; may require backtracking; some grammars are backtrack-free (predictive)

### A simple syntax-directed

Syntax-directed is a grammar-oriented compiling technique. Programming languages: Syntax: what do its programs look like? Semantics: what do its programs mean? 1 A simple syntax-directed Lexical Syntax Character

### A Simple Syntax-Directed Translator

Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

### Homework. Lecture 7: Parsers & Lambda Calculus. Rewrite Grammar. Problems

Homework Lecture 7: Parsers & Lambda Calculus CSC 131 Spring, 2019 Kim Bruce First line: - module Hmwk3 where - Next line should be name as comment - Name of program file should be Hmwk3.hs Problems How

### Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A 1. What do you mean by a parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

### Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1. Top-Down Parsing. Lect 5. Goutam Biswas

Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1 Top-Down Parsing Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 2 Non-terminal as a Function In a top-down parser a non-terminal may be viewed

### EDA180: Compiler Construction Context-free grammars. Görel Hedin Revised:

EDA180: Compiler Construction. Context-free grammars. Görel Hedin. Revised: 2013-01-28. Compiler phases and program representations: source code, Lexical analysis (scanning), Intermediate code generation, tokens

### Syntax-Directed Translation. Lecture 14

Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik) 9/27/2006 Prof. Hilfinger, Lecture 14 1 Motivation: parser as a translator syntax-directed translation stream of tokens parser ASTs,

### Introduction to Syntax Analysis

Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.

### It parses an input string of tokens by tracing out the steps in a leftmost derivation.

CS 4203 Compiler Theory. CHAPTER 4: TOP-DOWN PARSING Part 1. It parses an input string of tokens by tracing out the steps in a leftmost derivation. And the implied traversal of the parse tree is a preorder

### LECTURE 7. Lex and Intro to Parsing

LECTURE 7 Lex and Intro to Parsing LEX Last lecture, we learned a little bit about how we can take our regular expressions (which specify our valid tokens) and create real programs that can recognize them.

### Introduction to Syntax Analysis. The Second Phase of Front-End

Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program

### Lecture 10 Parsing 10.1

10.1 The next two lectures cover parsing. To parse a sentence in a formal language is to break it down into its syntactic components. Parsing is one of the most basic functions every compiler carries out,

### Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing Lecture 11 CS 536 Spring 2001 1 Outline The strategy: shift-reduce parsing Ambiguity and precedence declarations Next lecture: bottom-up parsing algorithms CS 536 Spring

### CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

CIT3136 - Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1 Definition of a Context-free Grammar: An alphabet or set of basic symbols (like regular expressions, only now the symbols are whole tokens,

### LL parsing Nullable, FIRST, and FOLLOW

EDAN65: Compilers. LL parsing: Nullable, FIRST, and FOLLOW. Görel Hedin. Revised: 2014-09-22. Regular expressions, Context-free grammar, Attribute grammar, Lexical analyzer (scanner), Syntactic analyzer (parser)

### CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

### Course Overview. Introduction (Chapter 1) Compiler Frontend: Today. Compiler Backend:

Course Overview Introduction (Chapter 1) Compiler Frontend: Today Lexical Analysis & Parsing (Chapter 2,3,4) Semantic Analysis (Chapter 5) Activation Records (Chapter 6) Translation to Intermediate Code

### CSCI312 Principles of Programming Languages

CSCI312 Principles of Programming Languages. Chapter 3 Regular Expression and Lexer. Xu Liu. Recap. Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax. Input: a stream of characters from

### Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

Parsing Lecture 11: Parsing CSC 131 Fall, 2014 Kim Bruce Build parse tree from an expression Interested in abstract syntax tree - drops irrelevant details from parse tree Arithmetic grammar ::=

### 3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

### 3. Context-free grammars & parsing

3. Context-free grammars & parsing The parsing process sequences of tokens parse tree or syntax tree a / [ / index / ]/= / 4 / + / 2 The parsing process sequences of tokens parse tree or syntax tree a

### Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical

### LANGUAGE PROCESSORS. Introduction to Language processor:

LANGUAGE PROCESSORS Introduction to Language processor: A program that performs task such as translating and interpreting required for processing a specified programming language. The different types of

### CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

### Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2006

### Context-free grammars

Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

### Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall 2017-2018 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman