Lecture 11: Parsing
CSC 131, Fall 2014
Kim Bruce

Parsing
- Build parse tree from an expression
- Interested in abstract syntax tree
  - drops irrelevant details from parse tree

Arithmetic grammar
    <exp>    ::= <exp> <addop> <term> | <term>
    <term>   ::= <term> <mulop> <factor> | <factor>
    <factor> ::= ( <exp> ) | NUM | ID
    <addop>  ::= + | -
    <mulop>  ::= * | /
Look at parse tree & abstract syntax tree for 2 * 3 + 7

Recursive Descent Parser
Base recognizer (ignore building tree now) on productions:
    <exp> ::= <exp> <addop> <term>

    addop (fst:rest) = if fst == '+' || fst == '-' then rest else error...

    exp input =
      let inputafterexp = exp input
          inputafteraddop = addop inputafterexp
          rest = term inputafteraddop
      in rest
or
    fun exp input = term(addop(exp input));
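The abstract syntax tree for 2 * 3 + 7 mentioned above can be sketched as a Haskell data type. This is only an illustration: the type Exp, its constructor names, and the helper eval are assumptions made for this sketch and are not taken from the lecture.

```haskell
-- Hypothetical AST type for the arithmetic grammar (names are assumptions).
data Exp
  = Num Int
  | Id String
  | Add Exp Exp
  | Sub Exp Exp
  | Mul Exp Exp
  | Div Exp Exp
  deriving (Show, Eq)

-- AST for 2 * 3 + 7: the multiplication sits below the addition,
-- and parentheses or intermediate nonterminals no longer appear.
example :: Exp
example = Add (Mul (Num 2) (Num 3)) (Num 7)

-- Evaluate a tree; identifiers are looked up in an environment.
-- Returns Nothing for an unbound identifier or a division by zero.
eval :: [(String, Int)] -> Exp -> Maybe Int
eval _ (Num n) = Just n
eval env (Id x) = lookup x env
eval env (Add a b) = (+) <$> eval env a <*> eval env b
eval env (Sub a b) = (-) <$> eval env a <*> eval env b
eval env (Mul a b) = (*) <$> eval env a <*> eval env b
eval env (Div a b) = do
  x <- eval env a
  y <- eval env b
  if y == 0 then Nothing else Just (x `div` y)
```

Loading this in GHCi and evaluating `eval [] example` walks the tree bottom-up, which is why the AST only needs to keep the operator structure.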

Problems
- How do we select which production to use when alternatives?
- Left-recursive - never terminates

Rewrite Grammar
    <exp>        ::= <term> <termTail>                  (1)
    <termTail>   ::= <addop> <term> <termTail>          (2)
                   | ε                                  (3)
    <term>       ::= <factor> <factorTail>              (4)
    <factorTail> ::= <mulop> <factor> <factorTail>      (5)
                   | ε                                  (6)
    <factor>     ::= ( <exp> )                          (7)
                   | NUM                                (8)
                   | ID                                 (9)
    <addop>      ::= + | -                              (10)
    <mulop>      ::= * | /                              (11)
No left recursion
How do we know which production to take?

Predictive Parsing
Goal: a₁a₂...aₙ
S ⇒ α ⇒ ... ⇒ a₁a₂Xβ
Want next terminal character derived to be a₃
Need to apply a production X ::= γ where
1) γ can eventually derive a string starting with a₃ (a₃ in First(γ)), or
2) If X can derive the empty string, and also if β can derive a string starting with a₃ (a₃ in Follow(X)).

FIRST
Intuition: b ∈ First(X) iff there is a derivation X ⇒* bω for some ω.
1. First(b) = {b} for b a terminal or the empty string
2. If have X ::= ω₁ | ω₂ | ... | ωₙ then First(X) = First(ω₁) ∪ ... ∪ First(ωₙ)
3. For any right hand side u₁u₂...uₙ
   - First(u₁) ⊆ First(u₁u₂...uₙ)
   - if all of u₁, u₂, ..., uᵢ₋₁ can derive the empty string then also First(uᵢ) ⊆ First(u₁u₂...uₙ)
   - empty string is in First(u₁u₂...uₙ) iff all of u₁, u₂, ..., uₙ can derive the empty string
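The three FIRST rules above can be turned into a small fixed-point computation. The following is a minimal sketch, assuming the rewritten grammar (1)-(11); the names Symbol, Production, firstOfSeq, step and firstSets are invented for this illustration, and ε is represented by Nothing. When a leading nonterminal can derive ε, the sketch drops ε from its FIRST set and continues with the rest of the right-hand side, so that ε only ends up in the result when every symbol can vanish.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Grammar symbols: terminals and nonterminals (illustrative names).
data Symbol = T String | N String deriving (Eq, Ord, Show)

-- A production: left-hand nonterminal and right-hand side ([] means ε).
type Production = (String, [Symbol])

-- FIRST sets per nonterminal; Nothing stands for ε.
type FirstSets = Map.Map String (Set.Set (Maybe String))

-- The rewritten arithmetic grammar, productions (1)-(11).
grammar :: [Production]
grammar =
  [ ("exp", [N "term", N "termTail"])
  , ("termTail", [N "addop", N "term", N "termTail"])
  , ("termTail", [])
  , ("term", [N "factor", N "factorTail"])
  , ("factorTail", [N "mulop", N "factor", N "factorTail"])
  , ("factorTail", [])
  , ("factor", [T "(", N "exp", T ")"])
  , ("factor", [T "NUM"])
  , ("factor", [T "ID"])
  , ("addop", [T "+"]), ("addop", [T "-"])
  , ("mulop", [T "*"]), ("mulop", [T "/"])
  ]

-- Rule 3: FIRST of a right-hand side u1 u2 ... un.
firstOfSeq :: FirstSets -> [Symbol] -> Set.Set (Maybe String)
firstOfSeq _ [] = Set.singleton Nothing
firstOfSeq _ (T t : _) = Set.singleton (Just t)
firstOfSeq fs (N x : rest)
  | Nothing `Set.member` fx =
      Set.union (Set.delete Nothing fx) (firstOfSeq fs rest)
  | otherwise = fx
  where
    fx = Map.findWithDefault Set.empty x fs

-- Rule 2: First(X) is the union over all alternatives of X.
step :: FirstSets -> FirstSets
step fs =
  Map.unionWith Set.union fs
    (Map.fromListWith Set.union [ (x, firstOfSeq fs rhs) | (x, rhs) <- grammar ])

-- Start from empty sets and iterate until nothing changes.
firstSets :: FirstSets
firstSets = go (Map.fromList [ (x, Set.empty) | (x, _) <- grammar ])
  where
    go fs = let fs' = step fs in if fs' == fs then fs else go fs'
```

Evaluating `Map.lookup "termTail" firstSets` in GHCi shows the set for <termTail>, including the Nothing entry that records that <termTail> can derive ε.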

First for Arithmetic
    FIRST(<addop>)      = { +, - }
    FIRST(<mulop>)      = { *, / }
    FIRST(<factor>)     = { (, NUM, ID }
    FIRST(<term>)       = { (, NUM, ID }
    FIRST(<exp>)        = { (, NUM, ID }
    FIRST(<termTail>)   = { +, -, ε }
    FIRST(<factorTail>) = { *, /, ε }

Follow
Intuition: A terminal b ∈ Follow(X) iff there is a derivation S ⇒* vXbω for some v and ω.
1. If S is the start symbol then put EOF ∈ Follow(S)
2. For all rules of the form A ::= wXv,
   a. add all elements of First(v) (other than ε) to Follow(X)
   b. if v can derive the empty string then add all elts of Follow(A) to Follow(X)
Follow(X) only used if can derive empty string from X.

Follow for Arithmetic
Only needed to calculate for <termTail>, <factorTail>
    FOLLOW(<exp>)        = { EOF, ) }
    FOLLOW(<termTail>)   = FOLLOW(<exp>) = { EOF, ) }
    FOLLOW(<term>)       = FIRST(<termTail>) ∪ FOLLOW(<exp>) ∪ FOLLOW(<termTail>) = { +, -, EOF, ) }
    FOLLOW(<factorTail>) = { +, -, EOF, ) }
    FOLLOW(<factor>)     = { *, /, +, -, EOF, ) }
Not needed:
    FOLLOW(<addop>)      = { (, NUM, ID }
    FOLLOW(<mulop>)      = { (, NUM, ID }

Predictive Parsing, redux
Goal: a₁a₂...aₙ
S ⇒ α ⇒ ... ⇒ a₁a₂Xβ
Want next terminal character derived to be a₃
Need to apply a production X ::= γ where
1) γ can eventually derive a string starting with a₃, or
2) If X can derive the empty string, then see if β can derive a string starting with a₃.
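With the FIRST and FOLLOW sets above, a recursive descent recognizer for the rewritten grammar can pick each production by looking at one token. The following is a sketch under stated assumptions: the Token type, the Recognizer alias, and the function names (expr is used instead of exp to avoid clashing with the Prelude) are invented here, and the end of the token list plays the role of EOF.

```haskell
-- Hypothetical token type; a real lexer would produce these.
data Token = NUM Int | ID String | Plus | Minus | Times | Divide | LParen | RParen
  deriving (Eq, Show)

-- A recognizer consumes a prefix of the tokens and returns the rest,
-- or Nothing on a syntax error.
type Recognizer = [Token] -> Maybe [Token]

-- (1) <exp> ::= <term> <termTail>
expr :: Recognizer
expr ts = term ts >>= termTail

-- (2) chosen when the next token is in FIRST(<addop>) = { +, - }
-- (3) ε chosen when the next token is in FOLLOW(<termTail>) = { EOF, ) }
termTail :: Recognizer
termTail (t : ts) | t `elem` [Plus, Minus] = term ts >>= termTail
termTail ts@(RParen : _) = Just ts
termTail [] = Just []
termTail _ = Nothing

-- (4) <term> ::= <factor> <factorTail>
term :: Recognizer
term ts = factor ts >>= factorTail

-- (5) chosen when the next token is in FIRST(<mulop>) = { *, / }
-- (6) ε chosen when the next token is in FOLLOW(<factorTail>) = { +, -, EOF, ) }
factorTail :: Recognizer
factorTail (t : ts) | t `elem` [Times, Divide] = factor ts >>= factorTail
factorTail ts@(t : _) | t `elem` [Plus, Minus, RParen] = Just ts
factorTail [] = Just []
factorTail _ = Nothing

-- (7), (8), (9): the three alternatives start with different tokens.
factor :: Recognizer
factor (LParen : ts) = case expr ts of
  Just (RParen : rest) -> Just rest
  _ -> Nothing
factor (NUM _ : ts) = Just ts
factor (ID _ : ts) = Just ts
factor _ = Nothing

-- Accept only if the whole input is consumed.
recognize :: [Token] -> Bool
recognize ts = expr ts == Just []
```

Because no function calls itself before consuming a token, this version terminates, unlike the recognizer based on the left-recursive production.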

Building Table
Need unambiguous
Put X ::= α in entry (X,a) if either
- a in First(α), or
- ε in First(α) and a in Follow(X)
Consequence: X ::= α in entry (X,a) iff there is a derivation s.t. applying production can eventually lead to string starting with a.
No table entry should have more than one production to ensure it's unambiguous, as otherwise we don't know which rule to apply.

Laws of predictive parsing:
- If A ::= α₁ | ... | αₙ then for all i ≠ j, First(αᵢ) ∩ First(αⱼ) = ∅.
- If X ⇒* ε, then First(X) ∩ Follow(X) = ∅.
See ArithParse.hs

2nd is OK for arithmetic:
- FIRST(<termTail>) = { +, -, ε }
- FOLLOW(<termTail>) = { EOF, ) }
- FIRST(<factorTail>) = { *, /, ε }
- FOLLOW(<factorTail>) = { +, -, EOF, ) }
no overlap

    Nonterminals   ID   NUM  Addop  Mulop  (    )    EOF
    <exp>          1    1                  1
    <termTail>               2                  3    3
    <term>         4    4                  4
    <factorTail>             6      5           6    6
    <factor>       9    8                  7
    <addop>                  10
    <mulop>                         11
Read off from table which production to apply
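The table-filling rule above can be sketched directly. In this illustration the FIRST set of each right-hand side and the two FOLLOW sets that matter are written out by hand from the slides; the names productions, follow, table, conflictFree and lookupProd are assumptions made for this sketch and are not from ArithParse.hs. Terminals are the column labels of the table, and Nothing stands for ε.

```haskell
import qualified Data.Map as Map

-- Terminals are the table's column labels.
type Terminal = String

-- (production number, left-hand side, FIRST of the right-hand side).
productions :: [(Int, String, [Maybe Terminal])]
productions =
  [ (1, "exp", [Just "ID", Just "NUM", Just "("])
  , (2, "termTail", [Just "Addop"])
  , (3, "termTail", [Nothing])
  , (4, "term", [Just "ID", Just "NUM", Just "("])
  , (5, "factorTail", [Just "Mulop"])
  , (6, "factorTail", [Nothing])
  , (7, "factor", [Just "("])
  , (8, "factor", [Just "NUM"])
  , (9, "factor", [Just "ID"])
  , (10, "addop", [Just "Addop"])
  , (11, "mulop", [Just "Mulop"])
  ]

-- FOLLOW sets, only needed for nonterminals that can derive ε.
follow :: String -> [Terminal]
follow "termTail" = [")", "EOF"]
follow "factorTail" = ["Addop", ")", "EOF"]
follow _ = []

-- Entry (X, a) collects every production the rule places there:
-- a in First(α), or ε in First(α) and a in Follow(X).
table :: Map.Map (String, Terminal) [Int]
table =
  Map.fromListWith (++)
    [ ((x, a), [n])
    | (n, x, firsts) <- productions
    , a <- [ t | Just t <- firsts ]
           ++ (if Nothing `elem` firsts then follow x else [])
    ]

-- Predictive parsing works only if no entry holds two productions.
conflictFree :: Bool
conflictFree = all ((== 1) . length) (Map.elems table)

-- Which production to apply for nonterminal x on lookahead a, if any.
lookupProd :: String -> Terminal -> Maybe Int
lookupProd x a = case Map.lookup (x, a) table of
  Just [n] -> Just n
  _ -> Nothing
```

Keeping a list of production numbers per entry, rather than a single number, is a deliberate choice here: a conflict shows up as a list of length two instead of being silently overwritten.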

Alternatives to Recursive Descent Parsers
Table-Driven Stack-based Parser
http://en.wikipedia.org/wiki/ll_parser
- Start with S $ on stack and the input followed by $ to be recognized.
- Use table to replace non-terminals on top of stack.
- If terminal on top of stack matches next input then erase both and proceed.
- Success if end up clearing stack and input
Show with ID * (NUM + NUM)$

More Options
Another alternative:
- LR(1) parsers -- bottom up, gives right-most derivation (in reverse). Also stack-based.
- YACC is LALR(1), a restricted form of LR(1). ANTLR is LL-based (LL(k) in older versions, LL(*) later).
- k in LL(k) and LR(k) indicates how many tokens of look ahead are necessary -- e.g. length of strings in columns of table.
- Compiler writers are happiest with k=1 to avoid exponential blow-up of table. May have to rewrite grammars.

Parser Combinators
- Domain specific language for parsing.
- Even easier to tie to grammar than recursive descent
- Built into Haskell and Scala, definable elsewhere
Talk about these when we cover Scala
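The stack-based procedure described above can be sketched in a few lines. This is an illustrative recognizer only, assuming the parse table for the rewritten arithmetic grammar; the types Tok, NonTerm and Sym and the functions table, run and parse are names invented for this sketch. EOF plays the role of $, and operators are grouped into the token classes Addop and Mulop as in the table's columns.

```haskell
-- Token classes (table columns); EOF plays the role of $.
data Tok = ID | NUM | Addop | Mulop | LParen | RParen | EOF
  deriving (Eq, Show)

data NonTerm = Exp | TermTail | Term | FactorTail | Factor | AddopN | MulopN
  deriving (Eq, Show)

-- Stack symbols are terminals or nonterminals.
data Sym = T Tok | N NonTerm
  deriving (Eq, Show)

-- Parse table: right-hand side for (nonterminal, lookahead);
-- Nothing marks an empty (error) entry.
table :: NonTerm -> Tok -> Maybe [Sym]
table Exp a | a `elem` [ID, NUM, LParen] = Just [N Term, N TermTail]        -- (1)
table TermTail Addop = Just [N AddopN, N Term, N TermTail]                  -- (2)
table TermTail a | a `elem` [RParen, EOF] = Just []                         -- (3)
table Term a | a `elem` [ID, NUM, LParen] = Just [N Factor, N FactorTail]   -- (4)
table FactorTail Mulop = Just [N MulopN, N Factor, N FactorTail]            -- (5)
table FactorTail a | a `elem` [Addop, RParen, EOF] = Just []                -- (6)
table Factor LParen = Just [T LParen, N Exp, T RParen]                      -- (7)
table Factor NUM = Just [T NUM]                                             -- (8)
table Factor ID = Just [T ID]                                               -- (9)
table AddopN Addop = Just [T Addop]                                         -- (10)
table MulopN Mulop = Just [T Mulop]                                         -- (11)
table _ _ = Nothing

-- The stack machine: first argument is the stack (top at the head).
run :: [Sym] -> [Tok] -> Bool
run [] [] = True                              -- stack and input both cleared
run (T t : stack) (a : input)
  | t == a = run stack input                  -- terminal matches: erase both
run (N x : stack) input@(a : _) =
  case table x a of
    Just rhs -> run (rhs ++ stack) input      -- replace nonterminal by its RHS
    Nothing -> False                          -- empty table entry: syntax error
run _ _ = False

-- Start with the start symbol and EOF on the stack, and EOF after the input.
parse :: [Tok] -> Bool
parse input = run [N Exp, T EOF] (input ++ [EOF])
```

For the slide's example, `parse [ID, Mulop, LParen, NUM, Addop, NUM, RParen]` corresponds to ID * (NUM + NUM)$. Prepending the right-hand side to the stack list has the same effect as pushing its symbols in reverse order onto a conventional stack.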