It parses an input string of tokens by tracing out the steps in a leftmost derivation.

Similar documents
CIT 3136 Lecture 7. Top-Down Parsing

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

3. Context-free grammars & parsing

Lexical and Syntax Analysis. Top-Down Parsing

Dr. D.M. Akbar Hussain

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Top down vs. bottom up parsing

Chapter 4. Lexical and Syntax Analysis

Lexical and Syntax Analysis

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Building Compilers with Phoenix

Topic of the day: Ch. 4: Top-down parsing

Chapter 4: Syntax Analyzer

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

CSC 4181 Compiler Construction. Parsing. Outline. Introduction

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

3. Parsing. Oscar Nierstrasz

A programming language requires two major definitions A simple one pass compiler

Recursive Descent Parsers

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

CS502: Compilers & Programming Systems

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Introduction to Bottom-Up Parsing

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Chapter 3. Describing Syntax and Semantics ISBN

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Programming Language Syntax and Analysis

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax


Abstract Syntax Trees & Top-Down Parsing

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

Context-free grammars

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Lexical and Syntax Analysis (2)

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Compiler Construction: Parsing

Compiler Construction 2016/2017 Syntax Analysis

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

LL(1) predictive parsing

Context-Free Grammars

CSE431 Translation of Computer Languages

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Syntax Analysis Part I

CSCI312 Principles of Programming Languages

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

CS 403: Scanning and Parsing

Parsing a primer. Ralf Lämmel Software Languages Team University of Koblenz-Landau

CS 314 Principles of Programming Languages

UNIT-III BOTTOM-UP PARSING

LL(1) predictive parsing

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

EECS 6083 Intro to Parsing Context Free Grammars

Homework. Lecture 7: Parsers & Lambda Calculus. Rewrite Grammar. Problems

4. Lexical and Syntax Analysis

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Syntax. In Text: Chapter 3

Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Syntactic Analysis. Top-Down Parsing

CS 4120 Introduction to Compilers

CSE302: Compiler Design

4. Lexical and Syntax Analysis

Context-free grammars (CFG s)

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

CS 536 Midterm Exam Spring 2013

The procedure attempts to "match" the right hand side of some production for a nonterminal.

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

4. LEXICAL AND SYNTAX ANALYSIS

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Chapter 3. Parsing #1

CA Compiler Construction

Compilers. Predictive Parsing. Alex Aiken

Defining syntax using CFGs

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols

컴파일러입문 제 6 장 구문분석

CS 406/534 Compiler Construction Parsing Part I

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsing II Top-down parsing. Comp 412

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

Transcription:

It parses an input string of tokens by tracing out CS 4203 Compiler Theory the steps in a leftmost derivation. CHAPTER 4: TOP-DOWN PARSING Part1 And the implied traversal of the parse tree is a preorder 1. Introduction traversal 1.1. Concept of and, Top-Down thus, Parsing occurs (1) from the root to the leaves. It parses an input string of tokens by tracing out the steps in a leftmost derivation. And the implied traversal of the parse tree is a preorder traversal and, thus, occurs from the root to the leaves. The example: The example: number + number, and corresponds to the parse tree number + number, and corresponds to the parse tree exp exp op exp number + number The above parse tree corresponds to the leftmost derivations: (1) exp => exp op exp (2) => number op exp (3) => number + exp (4) => number + number 1.2. Two forms of Top-Down Parsers Predictive parsers: attempts to predict the next construction in the input string using one or more look-ahead tokens Backtracking parsers: try different possibilities for a parse of the input, backing up an arbitrary amount in the input if one possibility fails. It is more powerful but much slower, unsuitable for practical compilers. 1.3. Two kinds of Top-Down parsing algorithms Recursive-descent parsing: is quite versatile and suitable for a handwritten parser. LL(1) parsing: The first L refers to the fact that it processes the input from left to right; The second L refers to the fact that it traces out a leftmost derivation for the input string; The number 1 means that it uses only one symbol of input to predict the direction of the parse. 4.1 Top-Down Parsing by Recursive-Descent 4.1.1 The Basic Method of Recursive-Descent 1. The idea of Recursive-Descent Parsing Viewing the grammar rule for a non-terminal A as a definition for a procedure to recognize an A The right-hand side of the grammar for A specifies the structure of the code for this procedure Requiring the Use of EBNF 1

a) The Expression Grammar: exp term { addop term } addop + - term factor { mulop factor } mulop * factor ( exp ) numberr A recursive-descent procedure that recognizes a factor procedure factor case token of ( : match( ( ); exp; match( )); number: match (number); else error; end case; end factor The token keeps the current next token in the input (one symbol of look-ahead) The Match procedure matches the current next token with its parameters, advances the input if it succeeds, and declares error if it does not b) Match Procedure The Match procedure matches the current next token with its parameters, advances the input if it succeeds, and declares error if it does not procedure match( expectedtoken); if token = expectedtoken then gettoken; else error; end if; end match c) procedure exp; term; while token = + or token = - do match(token); term; end exp; d) procedure term; factor; while token = * do match(token); factor; 2

end term; Some Notes The method of turning grammar rule in EBNF into code is quite powerful. There are a few pitfalls, and care must be taken in scheduling the actions within the code. In the previous pseudo-code for exp: (1) The match of Construction operation should be before of repeated the calls syntax to term; tree (2) The global token variable must be set before the parse s; (3) The gettoken must be called just after a successful test of a token 2. Construction of the syntax tree The Expression 3+4+5 The expression: 3+4+5 + + 5 3 4 a) The pseudo-code for constructing the syntax tree(1) function exp : syntaxtree; var temp, newtemp: syntaxtree; temp:=term; while token=+ or token = - do case token of + : match(+); newtemp:=makeopnode(+); leftchild(newtemp):=temp; rightchild(newtemp):=term; temp=newtemp; b) The pseudo-code for constructing the syntax tree(2) -:match(-); newtemp:=makeopnode(-); leftchild(newtemp):=temp; rightchild(newtemp):=term; temp=newtemp; end case; return temp; end exp; c) A simpler one function exp : syntaxtree; var temp, newtemp: syntaxtree; temp:=term; while token=+ or token = - do newtemp:=makeopnode(token); match(token); leftchild(newtemp):=temp; rightchild(newtemp):=term; temp=newtemp; 3

return temp; end exp; 4.1.3 Further Decision Problems More formal methods to deal with complex situation (1) It may be difficult to convert a grammar in BNF into EBNF form; (2) It is difficult to decide when to use the choice A α and the choice A β; if both α and β with non-terminals. Such a decision problem requires the computation of the First Sets. (3) It may be necessary to know what token legally coming after the non-terminal A, in writing the code for an ε-production: A ε. Such tokens indicate A may disappear at this point in the parse. This set is called the Follow Set of A. (4) It requires computing the First and Follow sets in order to detect the errors as early as possible. Such as )3-2), the parse will descend from exp to term to factor before an error is reported. 4.2 LL(1) Parsing 4.2.1 The Basic Method of LL(1) Parsing 1. Main idea LL(1) Parsing uses an explicit stack rather than recursive calls to perform a parse An example: a simple grammar for the strings of balanced parentheses: S (S) S ε The following table shows the actions of a top-down parser given this grammar and the string ( ) Table of Actions Steps Parsing Stack Input Action 1 S ( ) S (S) S 2 S)S( ( ) match 3 S)S ) 4 S) ) match 5 S 6 accept 2. General Schematic A top-down parser s by pushing the start symbol onto the stack It accepts an input string if, after a series of actions, the stack and the input become empty A general schematic for a successful top-down parse: StartSymbol Inputstring //one of the two actions //one of the two actions accept 4

a) Two Actions The two actions (1) Generate: Replace a non-terminal A at the top of the stack by a string α(in reverse) using a grammar rule A α, and (2) Match: Match a token on top of the stack with the next input token. The list of generating actions in the above table: S => (S)S [S (S) S] => ( )S [] => ( ) [] Which corresponds precisely to the steps in a leftmost derivation of string ( ). This is the characteristic of top-down parsing. 4.2.2 The LL(1) Parsing Table and Algorithm 1. Purpose and Example of LL(1) Parsing Table Purpose of the LL(1) Parsing Table: To express the possible rule choices for a non-terminal A when the A is at the top of parsing stack based on the current input token (the look-ahead). The LL(1) Parsing table for the following simple grammar: S (S) S ε M[N,T] ( ) S S (S) S 2. The General Definition of Table The table is a two-dimensional array indexed by non-terminals and terminals Containing production choices to use at the appropriate parsing step called M[N,T] N is the set of non-terminals of the grammar T is the set of terminals or tokens (including ) Any entrances remaining empty Representing potential errors 3. Table-Constructing Rule The table-constructing rule If A α is a production choice, and there is a derivation α=>*aβ, where a is a token, then add A αto the table entry M[A,a]; If A α is a production choice, and there are derivations α=>*ε A and Table-Constructing S=>*βAaγ, where S is the start symbol Caseand a is a token (or ), then add A αto the table entry M[A,a]; 4. A Table-Constructing The constructing-process Case of the following table The constructing-process For the production : S (S) of the S, α=(s)s, following where table a=(, this choice For the will production be added to the : S (S) entry M[S, α=(s)s, (] ; where a=(, this choice will be added to Since: the entry S=>(S)Sε,rule M[S, (] ; 2 applied withα= ε, β=(,a = S, a = ), and γ=s,so add the choice to M[S, )] Since: S=>(S)Sε,rule 2 applied withα= ε, β=(,a = S, a = ), and γ=s,so add Since S=>* S, is also added to M[S, ]. the choice to M[S, )] Since S=>* S, is also added to M[S, ]. M[N,T] ( ) S S (S) S 5

5. Properties of LL(1) Grammar Definition of LL(1) Grammar A grammar is an LL(1) grammar if the associated LL(1) parsing table has at most on production in each table entry An LL(1) grammar cannot be ambiguous 6. A Parsing Algorithm Using the LL(1) Parsing Table (* assumes marks the bottom of the stack and the end of the input *) push the start symbol onto the top the parsing stack; while the top of the parsing stack and the next input token do if the top of the parsing stack is terminal a and the next input token = a then (* match *) pop the parsing stack; advance the input; else if the top of the parsing stack is non-terminal A and the next input token is terminal a and parsing table entry M[A,a] contains production A X1X2 Xn then (* generate *) pop the parsing stack; for i:=n downto 1 do push Xi onto the parsing stack; else error; if the top of the parsing stack = and the next input token = then accept else error. 6