CIT 3136 Lecture 7. Top-Down Parsing

Similar documents
It parses an input string of tokens by tracing out the steps in a leftmost derivation.

Parsing III. (Top-down parsing: recursive descent & LL(1) )

CSCI312 Principles of Programming Languages

CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

Chapter 4: Syntax Analyzer

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

Types of parsing. CMSC 430 Lecture 4, Page 1

CS 403: Scanning and Parsing

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Syntactic Analysis. Top-Down Parsing

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

CS 230 Programming Languages

Lexical and Syntax Analysis (2)

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Recursive Descent Parsers

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Chapter 4. Lexical and Syntax Analysis

Top down vs. bottom up parsing

MIT Top-Down Parsing. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Parsing II Top-down parsing. Comp 412

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

CSC 4181 Compiler Construction. Parsing. Outline. Introduction

Topic of the day: Ch. 4: Top-down parsing

A programming language requires two major definitions A simple one pass compiler

Syntax Analysis, III Comp 412

LL Parsing: A piece of cake after LR

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Syntax Errors; Static Semantics

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Lecture 12: Parser-Generating Tools

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Chapter 3. Parsing #1

Chapter 3. Describing Syntax and Semantics ISBN

Topdown parsing with backtracking

CA Compiler Construction

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Syntax Analysis, III Comp 412

Computer Science 160 Translation of Programming Languages

CS 406/534 Compiler Construction Parsing Part I

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

CS 4120 Introduction to Compilers

Question Points Score

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

Lexical and Syntax Analysis. Top-Down Parsing

Context-free grammars

Parsing Part II (Top-down parsing, left-recursion removal)

Context-free grammars (CFG s)

CSE431 Translation of Computer Languages

Compiler Construction: Parsing

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Project Compiler. CS031 TA Help Session November 28, 2011

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

Lexical and Syntax Analysis

CPS 506 Comparative Programming Languages. Syntax Specification

Abstract Syntax Trees & Top-Down Parsing

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

LECTURE 7. Lex and Intro to Parsing

4. Lexical and Syntax Analysis

shift-reduce parsing

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

CS 314 Principles of Programming Languages

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Programming Language Syntax and Analysis

3. Parsing. Oscar Nierstrasz

4. Lexical and Syntax Analysis

Homework. Lecture 7: Parsers & Lambda Calculus. Rewrite Grammar. Problems

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Compilers. Predictive Parsing. Alex Aiken

4. LEXICAL AND SYNTAX ANALYSIS

Lexical and Syntax Analysis

Question Marks 1 /20 2 /16 3 /7 4 /10 5 /16 6 /7 7 /24 Total /100

Lecture 6: Top-Down Parsing. Beating Grammars into Programs. Expression Recognizer with Actions. Example: Lisp Expression Recognizer

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

COP4020 Programming Assignment 2 - Fall 2016

3. Context-free grammars & parsing

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Parsing Algorithms. Parsing: continued. Top Down Parsing. Predictive Parser. David Notkin Autumn 2008

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Projects for Compilers

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Transcription:

CIT 3136 Lecture 7 Top-Down Parsing

Chapter 4: Top-down Parsing A top-down parsing algorithm parses an input string of tokens by tracing out the steps in a leftmost derivation. Such an algorithm is called top-down because the implied traversal of the parse tree is a preorder traversal. 4/18/2003 2

Two basic kinds of top-down parser: Predictive parsers that try to make decisions about the structure of the tree below a node based on a few lookahead tokens (usually one!). This is a weakness, since little program structure has been seen before predictive decisions must be made. Backtracking parsers that solve the lookahead problem by backtracking if one decision turns out to be wrong and making a different choice. But such parsers are slow (exponential time in general). 4/18/2003 3

Fortunately, many practical techniques have been developed to overcome the predictive lookahead problem, and the version of predictive parsing called recursive-descent is still the method of choice for hand-coding, due to its simplicity. But because of the inherent weakness of topdown parsing, it is not a good choice for machine-generated parsers. Instead, more powerful bottom-up parsing methods should be used (Chapter 5). 4/18/2003 4

Recursive-descent parsing Simple, elegant idea: use the grammar rules as recipes for procedure code. Each non-terminal corresponds to a procedure. Each appearance of a terminal in the rhs of a rule causes a token to be matched. Each appearance of a non-terminal corresponds to a call of the associated procedure. 4/18/2003 5

Example Grammar rule: factor ( exp ) number Code: void factor(void) { if (token == number) match(number); else { match( ( ); exp(); match( ) ); 4/18/2003 6

Example, continued (2) Note how lookahead is not a problem in this example: if the token is number, go one way, if the token is ( go the other, and if the token is neither, declare error: void match(token expect) { if (token == expect) gettoken(); else error(token,expect); 4/18/2003 7

Example, continued (3) A recursive-descent procedure can also compute values or syntax trees: int factor(void) { if (token == number) { int temp = atoi(tokstr); match(number); return temp; else { match( ( ); int temp = exp(); match( ) ); return temp; 4/18/2003 8

Errors in Recursive-descent are tricky to handle: If an error occurs, we must somehow gracefully exit possibly many recursive calls. Best solution: use exception handling to manage stack unwinding (which C doesn t have!). But there are worse problems: left recursion doesn t work! 4/18/2003 9

Left recursion is impossible! exp expaddopterm term void exp(void) { if (token ==??) { exp(); // uh, oh!! addop(); term(); else term(); 4/18/2003 10

EBNF to the rescue! exp term { addop term void exp(void) { term(); while (token is an addop) { addop(); term(); 4/18/2003 11

This code can even left associate: int exp(void) { int temp = term(); while (token == + token == - ) if (token == + ) { match( + ); temp += term(); else { match( - ); temp -= term(); return temp; 4/18/2003 12

Note that right recursion/assoc. is not a problem: exp term [ addop exp ] void exp(void) { term(); if (token is an addop) { addop(); exp(); 4/18/2003 13

Solving the lookahead problem in greater generality: Compute First sets for choices: First(α) = the set of tokens that can appear at the front of α. Then if we have a grammar rule A α 1... α n and First(α i ) First(α j ) is empty for all i and j, then a decision on a choice can be made. A problem occurs if an α i can disappear; then we must compute Follow sets (the tokens that can follow a symbol), and show that First Follow is empty. 4/18/2003 14

Error Recovery in Parsers A parser should try to determine that an error has occurred as soon as possible. Waiting too long before declaring error means the location of the actual error may have been lost. After an error has occurred, the parser must pick a likely place to resume the parse. A parser should always try to parse as much of the code as possible, in order to find as many real errors as possible during a single translation. 4/18/2003 15

Error Recovery in Parsers (continued) A parser should try to avoid the error cascade problem, in which one error generates a lengthy sequence of spurious error messages. A parser must avoid infinite loops on errors, in which an unending cascade of error messages is generated without consuming any input. 4/18/2003 16

Panic Mode in recursive-descent Extra parameter consisting of a set of synchronizing tokens. As parsing proceeds, tokens that may function as synchronizing tokens are added to the synchronizing set as each call occurs. If an error is encountered, the parser scans ahead, throwing away tokens until one of the synchronizing set of tokens is seen in the input, whence parsing is resumed. 4/18/2003 17

Example (in pseudocode) procedure scanto ( synchset ) ; begin while not ( token in synchset { EOF ) do gettoken ; end scanto ; procedure checkinput ( firstset, followset ) ; begin if not ( token in firstset ) then error ; scanto ( firstset followset ) ; end if ; end; 4/18/2003 18

Example (in pseudocode,, cont) procedure exp ( synchset ) ; begin checkinput ( { (, number, synchset ) ; if not ( token in synchset ) then term ( synchset ) ; while token = + or token = - do match (token) ; term ( synchset ) ; end while ; checkinput ( synchset, { (, number ) ; end if; end exp ; 4/18/2003 19

Example (in pseudocode, concl.) procedure factor ( synchset ) ; begin checkinput ( { (, number, synchset ) ; if not ( token in synchset ) then case token of ( : match(( ) ; exp ( { ) ) ; match( )) ; number : match(number) ; else error ; end case ; checkinput ( synchset, { (, number ) ; end if ; end factor ; 4/18/2003 20

C-Minus and recursive-descent parsing problems: 18. expression var = expression simple-expression 19. var ID ID [ expression ] 20. simple-expression additive-expression relop additive-expression additive-expression 21. relop <= < > >= ==!= 22. additive-expression additive-expression addop term term 23. addop + - 24. term term mulop factor factor 25. mulop * / 26. factor ( expression ) var call NUM 27. call ID ( args ) 4/18/2003 21