Building A Recursive Descent Parser. Example: CSX-Lite. match terminals, and calling parsing procedures to match nonterminals.

Similar documents
LL(1) Grammars. Example. Recursive Descent Parsers. S A a {b,d,a} A B D {b, d, a} B b { b } B λ {d, a} D d { d } D λ { a }

Table-Driven Top-Down Parsers

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

Configuration Sets for CSX- Lite. Parser Action Table

Error Detection in LALR Parsers. LALR is More Powerful. { b + c = a; } Eof. Expr Expr + id Expr id we can first match an id:

How do LL(1) Parsers Build Syntax Trees?

Chapter 3. Parsing #1

CSE431 Translation of Computer Languages

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

The procedure attempts to "match" the right hand side of some production for a nonterminal.

Lexical and Syntax Analysis (2)

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Class Information ANNOUCEMENTS

Recursive Descent Parsers

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

CSE 3302 Programming Languages Lecture 2: Syntax

Question Points Score

Alternatives for semantic processing

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Let s look at the CUP specification for CSX-lite. Recall its CFG is

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

CSCI312 Principles of Programming Languages

Properties of Regular Expressions and Finite Automata

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Types of parsing. CMSC 430 Lecture 4, Page 1

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Chapter 4. Lexical and Syntax Analysis

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

LECTURE 7. Lex and Intro to Parsing

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Syntax-Directed Translation. Lecture 14

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Revisit the example. Transformed DFA 10/1/16 A B C D E. Start

Defining syntax using CFGs

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Optimizing Finite Automata

Downloaded from Page 1. LR Parsing

Lexical and Syntax Analysis

CS 536 Midterm Exam Spring 2013

Syntax Analysis Part I

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

4. Lexical and Syntax Analysis

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Monday, September 13, Parsers

Chapter 4. Abstract Syntax

A programming language requires two major definitions A simple one pass compiler

Programming Language Syntax and Analysis

4. Lexical and Syntax Analysis

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Syntax Analysis Check syntax and construct abstract syntax tree

Chapter 4: Syntax Analyzer

COP4020 Programming Assignment 2 - Fall 2016

Top down vs. bottom up parsing

CS 314 Principles of Programming Languages

CS 230 Programming Languages

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Parsing II Top-down parsing. Comp 412

A parser is some system capable of constructing the derivation of any sentence in some language L(G) based on a grammar G.

Question Marks 1 /20 2 /16 3 /7 4 /10 5 /16 6 /7 7 /24 Total /100

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

It parses an input string of tokens by tracing out the steps in a leftmost derivation.

CA Compiler Construction

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

LL parsing Nullable, FIRST, and FOLLOW

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)


Topic 3: Syntax Analysis I

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

Course Overview. Introduction (Chapter 1) Compiler Frontend: Today. Compiler Backend:

COMP3131/9102: Programming Languages and Compilers

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Compiler construction in4303 lecture 3

Syntax-directed translation. Context-sensitive analysis. What context-sensitive questions might the compiler ask?

CS 314 Principles of Programming Languages. Lecture 9

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

CSE302: Compiler Design

CS502: Compilers & Programming Systems

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

Abstract Syntax Trees & Top-Down Parsing

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Transcription:

Building A Recursive Descent Parser We start with a procedure Match, that matches the current input token against a predicted token: vo Match(Terminal a) { if (a == currenttoken) currenttoken = Scanner() else SyntaxErrror() To build a parsing procedure for a non-terminal A, we look at all productions with A on the lefthand se: A X 1...X n A Y 1...Y m... We use predict sets to dece which production to match (LL(1) grammars always have disjoint predict sets). We match a production s righthand se by calling Match to match terminals, and calling parsing procedures to match nonterminals. The general form of a parsing procedure for A X 1...X n A Y 1...Y m... is vo A() { if (currenttoken in Predict(A X 1...X n )) for(i=1i<=ni++) if (X[i] is a terminal) Match(X[i]) else X[i]() else if (currenttoken in Predict(A Y 1...Y m )) for(i=1i<=mi++) if (Y[i] is a terminal) Match(Y[i]) else Y[i]() else // Handle other A... productions else // No production predicted SyntaxError() 241 242 Usually this general form isn t used. Instead, each production is macro-expanded into a sequence of Match and parsing procedure calls. Example: CSX-Lite Production Predict Set Prog { { Stmt if λ Stmt = Stmt if ( ) Stmt if + + - - λ ) 243 244

CSX-Lite Parsing Procedures vo Prog() { Match("{") vo () { if (currenttoken == currenttoken == if){ Stmt() else { vo () { Match() () vo () { if (currenttoken == "+") { Match("+") else if (currenttoken == "-"){ Match("-") else { vo Stmt() { if (currenttoken == ){ Match() Match("=") else { Match(if) Match("(") Match(")") Stmt() 245 246 Let s use recursive descent to parse { a = b + c We start by calling Prog() since this represents the start symbol. Prog() Match("{") Stmt() Match() Match("=") { a = b + c { a = b + c a = b + c a = b + c a = b + c Match("=") Match() () () = b + c b + c b + c + c 247 248

Match("+") + c c Match() () () c Done! All input matched 249 250 Syntax Errors in Recursive Descent Parsing In recursive descent parsing, syntax errors are automatically detected. In fact, they are detected as soon as possible (as soon as the first illegal token is seen). How? When an illegal token is seen by the parser, either it fails to predict any val production or it fails to match an expected token in a call to Match. Let s see how the following illegal CSX-lite program is parsed: { b + c = a (Where should the first syntax error be detected?) Prog() Match("{") Stmt() Match() Match("=") { b + c = a { b + c = a b + c = a b + c = a b + c = a 251 252

Match("=") + c = a Call to Match fails! + c = a Table-Driven Top-Down Parsers Recursive descent parsers have many attractive features. They are actual pieces of code that can be read by programmers and extended. This makes it fairly easy to understand how parsing is done. Parsing procedures are also convenient places to add code to build ASTs, or to do typechecking, or to generate code. A major drawback of recursive descent is that it is quite inconvenient to change the grammar being parsed. Any change, even a minor one, may force parsing procedures to be 253 254 reprogrammed, as productions and predict sets are modified. To a less extent, recursive descent parsing is less efficient than it might be, since subprograms are called just to match a single token or to recognize a righthand se. An alternative to parsing procedures is to encode all prediction in a parsing table. A pre-programed driver program can use a parse table (and list of productions) to parse any LL(1) grammar. If a grammar is changed, the parse table and list of productions will change, but the driver need not be changed. LL(1) Parse Tables An LL(1) parse table, T, is a twodimensional array. Entries in T are production numbers or blank (error) entries. T is indexed by: A, a non-terminal. A is the nonterminal we want to expand. CT, the current token that is to be matched. T[A][CT] = A X 1...X n if CT is in Predict(A X 1...X n ) T[A][CT] = error if CT predicts no production with A as its lefthand se 255 256

CSX-lite Example LL(1) Parser Driver Production Predict Set 1 Prog { { 2 Stmt if 3 λ 4 Stmt = 5 Stmt if ( ) Stmt if 6 7 + + 8 - - 9 λ ) { if ( ) = + - eof Prog 1 3 2 2 Stmt 5 4 6 9 7 8 9 Here is the driver we ll use with the LL(1) parse table. We ll also use a parse stack that remembers symbols we have yet to match. vo LLDriver(){ Push(StartSymbol) while(! stackempty()){ //Let X=Top symbol on parse stack //Let CT = current token to match if (isterminal(x)) { match(x) //CT is updated pop() //X is updated else if (T[X][CT]!= Error){ //Let T[X][CT] = X Y 1...Y m Replace X with Y 1...Y m on parse stack else SyntaxError(CT) 257 258 Example of LL(1) Parsing We ll again parse { a = b + c We start by placing Prog (the start symbol) on the parse stack. Parse Stack Prog { Stmt { a = b + c { a = b + c a = b + c a = b + c Parse Stack = = a = b + c = b + c b + c b + c 259 260

Parse Stack Parse Stack + c + + c c c Done! All input matched 261 262