Context Free Grammars and Recursive Descent Parsing

Tim Dawborn, January 2018

Outline

1. Context-Free Grammars (CFGs)
2. Parsing
3. Recursive Descent Parsing
4. Calculator

Regular Expressions

Regular expressions are a useful tool for pattern matching, as you'll no doubt recall.

To warm up, write a regular expression to match strings containing any number of a's, followed by two b's, followed by one or more c's:

    /a*bbc+/

Now write a regular expression to match any number of a's followed by the same number of b's.

Cannot be done! A regular expression corresponds to a finite automaton, which has no way to count an unbounded number of a's and then demand exactly that many b's.
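The contrast between the two exercises can be sketched in Python. This is only an illustration: the function names are hypothetical, and the second function checks "same number of a's and b's" by counting, which is exactly the ability a regular expression lacks.

```python
import re

# The warm-up pattern: any number of a's, two b's, one or more c's.
WARM_UP = re.compile(r"a*bbc+")


def matches_warm_up(text: str) -> bool:
    """True if the whole string matches /a*bbc+/."""
    return WARM_UP.fullmatch(text) is not None


def same_number_of_as_and_bs(text: str) -> bool:
    """Hypothetical helper: accept n a's followed by n b's by counting."""
    n = len(text) // 2
    return len(text) % 2 == 0 and text == "a" * n + "b" * n
```

For instance, matches_warm_up("aabbc") holds while matches_warm_up("abc") does not, and same_number_of_as_and_bs("aabb") holds while same_number_of_as_and_bs("aab") does not.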

Regular Languages

- There are limitations on the types of languages that regular expressions are able to match.
- Regular expressions are only able to match regular languages.
- Context-free grammars (CFGs) have more expressive power than regular expressions.

[Figure: nested language classes, from innermost to outermost: regular, context-free, context-sensitive, recursively enumerable]

Grammar Definitions

An alternative way to express the constraints on the valid strings of a language is to use a grammar.

A grammar consists of four things:
- Terminals (T)
- Non-terminals (V)
- Production rules (R : V → (T ∪ V)*)
- A start symbol, which is one of the non-terminals (S ∈ V)

Grammars

An example grammar for "micro-English":

    <sentence> ::= <subject> <verb> <object> "."
    <subject>  ::= "I" | "a" <noun> | "the" <noun>
    <object>   ::= "me" | "a" <noun> | "the" <noun>
    <noun>     ::= "cat" | "mat" | "rat"
    <verb>     ::= "like" | "see" | "is"

Notation:
- Terminals are represented as "string literals".
- Non-terminals are delimited with <angle-brackets>.
- The <start-symbol> is where we begin, conventionally the rule at the top left.
- The ::= symbol means "can be rewritten as".
- Alternative rewrite rules can be separated by vertical bars or put on separate lines.

For this grammar:
- Terminals: T = {".", "I", "a", "the", "me", "cat", "mat", "rat", "like", "see", "is"}
- Non-terminals: V = {<sentence>, <subject>, <object>, <noun>, <verb>}
- Production rules: R = the five rules above
- Start symbol: S = <sentence>
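One way to make the four components concrete is to write the micro-English grammar down as data. The sketch below is an assumption about representation, not part of the slides: the Grammar class and the dictionary layout (non-terminal mapped to a list of alternatives) are hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset
    non_terminals: frozenset
    rules: dict  # non-terminal -> list of alternatives, each a tuple of symbols
    start: str


RULES = {
    "<sentence>": [("<subject>", "<verb>", "<object>", ".")],
    "<subject>": [("I",), ("a", "<noun>"), ("the", "<noun>")],
    "<object>": [("me",), ("a", "<noun>"), ("the", "<noun>")],
    "<noun>": [("cat",), ("mat",), ("rat",)],
    "<verb>": [("like",), ("see",), ("is",)],
}

NON_TERMINALS = frozenset(RULES)

# Every right-hand-side symbol that is not a non-terminal is a terminal.
TERMINALS = frozenset(
    symbol
    for alternatives in RULES.values()
    for alternative in alternatives
    for symbol in alternative
    if symbol not in NON_TERMINALS
)

MICRO_ENGLISH = Grammar(TERMINALS, NON_TERMINALS, RULES, "<sentence>")
```

Deriving the terminals from the rules, rather than listing them by hand, keeps T and R from drifting apart if a rule is edited.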

The rules of the micro-English grammar define a set of rewrites. Starting from the start symbol, one non-terminal is rewritten at a time until only terminals remain:

    <sentence> => <subject> <verb> <object> .
               => I <verb> <object> .
               => I see <object> .
               => I see the <noun> .
               => I see the cat .

Other sentences in micro-English:
- I like the cat.
- I see a rat.
- The cat like the mat. (unfortunately not good English)
- The mat is the rat. (a syntactically valid string)

How many strings are in the language of micro-English?

    |<verb>|     = 3
    |<noun>|     = 3
    |<object>|   = 1 + 3 + 3 = 7
    |<subject>|  = 1 + 3 + 3 = 7
    |<sentence>| = 7 × 3 × 7 = 147
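The count can be checked by brute force. The sketch below enumerates every string the grammar generates; the expand function and the RULES dictionary are hypothetical names, and the approach only terminates because micro-English has no recursive rules.

```python
from itertools import product

RULES = {
    "<sentence>": [("<subject>", "<verb>", "<object>", ".")],
    "<subject>": [("I",), ("a", "<noun>"), ("the", "<noun>")],
    "<object>": [("me",), ("a", "<noun>"), ("the", "<noun>")],
    "<noun>": [("cat",), ("mat",), ("rat",)],
    "<verb>": [("like",), ("see",), ("is",)],
}


def expand(symbol):
    """Yield every terminal string (as a tuple of words) derivable from symbol."""
    if symbol not in RULES:  # a terminal expands only to itself
        yield (symbol,)
        return
    for alternative in RULES[symbol]:
        # Combine every expansion of each symbol in this alternative.
        for parts in product(*(list(expand(s)) for s in alternative)):
            yield tuple(word for part in parts for word in part)


sentences = {" ".join(words) for words in expand("<sentence>")}
print(len(sentences))  # 147
```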

Your turn

Write a grammar rule for a 0x-prefixed hexadecimal number.

    <hex> ::= "0" "x" ( "0" | ... | "9" | "A" | ... | "F" )+

Note the parentheses ( ) for grouping, and the regex-like + for "one or more of". (Admit it, you know you want to write it as a regular expression: 0x[0-9A-F]+)

Next, write a grammar rule for an integer (positive or negative).

    <integer> ::= "-"? ( "0" | "1" | ... | "9" )+
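Since neither rule is recursive, both can be written as regular expressions. A minimal sketch using Python's re module follows; the helper names is_hex and is_integer are hypothetical.

```python
import re

HEX = re.compile(r"0x[0-9A-F]+")   # <hex> ::= "0" "x" ( "0" | ... | "F" )+
INTEGER = re.compile(r"-?[0-9]+")  # <integer> ::= "-"? ( "0" | ... | "9" )+


def is_hex(text: str) -> bool:
    """True if the whole string is a 0x-prefixed upper-case hexadecimal number."""
    return HEX.fullmatch(text) is not None


def is_integer(text: str) -> bool:
    """True if the whole string is an optionally negated run of digits."""
    return INTEGER.fullmatch(text) is not None
```

Like the grammar rules, these accept leading zeros (for example "007") and reject lower-case hexadecimal digits.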

Write a grammar to accept expressions of integers, which could contain zero or more additions (e.g. 23, 1+5+23, -23+23). You will need two grammar rules.

    <expr>    ::= <integer> ( "+" <expr> )?
    <integer> ::= "-"? ( "0" | "1" | ... | "9" )+

Note how the first rule is recursive: that means you can create any sequence of <integer> + <integer> + <integer> ...

Write a grammar to accept any number of a's followed by the same number of b's. Again you'll need two rules.

    <string> ::= "a" <string> "b"
    <string> ::= ε
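The two <string> rules translate almost directly into a recursive function. The following is a sketch under the assumption that the input is a plain Python string; parse_string and accepts are hypothetical names.

```python
def parse_string(text: str, pos: int = 0) -> int:
    """Match <string> starting at pos.

    Returns the position just after the match, or -1 if the match fails.
    """
    if pos < len(text) and text[pos] == "a":  # <string> ::= "a" <string> "b"
        after_inner = parse_string(text, pos + 1)
        if (
            after_inner != -1
            and after_inner < len(text)
            and text[after_inner] == "b"
        ):
            return after_inner + 1
        return -1
    return pos  # <string> ::= ε (consume nothing)


def accepts(text: str) -> bool:
    """True if the whole input is n a's followed by n b's."""
    return parse_string(text) == len(text)
```

The recursion does the counting that a regular expression cannot: each "a" consumed leaves a pending call that insists on a matching "b".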

Regular Expression Language

The language of regular expressions is itself governed by a grammar. You know that /a*b|bbc/ is valid and /a(*/ is invalid.

Imagine our own regular expression language only has support for OR, bracketing, and the Kleene star. How might we write this grammar?

Not your turn! My turn!

    <re>        ::= <simple-re> ( "|" <re> )?
    <simple-re> ::= <basic-re>+
    <basic-re>  ::= <elem-re> "*"?
    <elem-re>   ::= "(" <re> ")"
    <elem-re>   ::= "\" ( "*" | "(" | ")" | "|" | "\" )
    <elem-re>   ::= ¬( "*" | "(" | ")" | "|" | "\" )

Here ¬( ... ) stands for any single character other than the ones listed, so an ordinary character such as a is an <elem-re>, while a special character must be escaped with a backslash.
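As a sketch of how this grammar can be checked mechanically, the recogniser below has one method per non-terminal. The class and method names are hypothetical, and it only answers "valid or not"; it does not build anything from the expression.

```python
SPECIAL = set("*()|\\")


class ReSyntaxChecker:
    """Hypothetical recogniser for the small regular expression grammar."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def parse_re(self):  # <re> ::= <simple-re> ( "|" <re> )?
        if not self.parse_simple_re():
            return False
        if self.peek() == "|":
            self.pos += 1
            return self.parse_re()
        return True

    def parse_simple_re(self):  # <simple-re> ::= <basic-re>+
        if not self.parse_basic_re():
            return False
        # Keep going while the next character can start another <basic-re>.
        while self.peek() is not None and self.peek() not in "|)*":
            if not self.parse_basic_re():
                return False
        return True

    def parse_basic_re(self):  # <basic-re> ::= <elem-re> "*"?
        if not self.parse_elem_re():
            return False
        if self.peek() == "*":
            self.pos += 1
        return True

    def parse_elem_re(self):
        ch = self.peek()
        if ch == "(":  # <elem-re> ::= "(" <re> ")"
            self.pos += 1
            if not self.parse_re() or self.peek() != ")":
                return False
            self.pos += 1
            return True
        if ch == "\\":  # <elem-re> ::= "\" followed by a special character
            self.pos += 1
            if self.peek() in SPECIAL:
                self.pos += 1
                return True
            return False
        if ch is not None and ch not in SPECIAL:  # any ordinary character
            self.pos += 1
            return True
        return False

    def is_valid(self):
        """True if the whole input is a <re>."""
        return self.parse_re() and self.pos == len(self.text)
```

With this sketch, ReSyntaxChecker("a*b|bbc").is_valid() holds and ReSyntaxChecker("a(*").is_valid() does not, matching the two examples above.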

Parsing

We saw before how you can use the grammar rules to generate strings of the language of the grammar:

    <sentence> => <subject> <verb> <object> .
               => I <verb> <object> .
               => I see <object> .
               => I see the <noun> .
               => I see the cat .

Parsing is the opposite process. Given a string:
1. Check that the string is in the language of the grammar.
2. Construct the derivation tree (parse tree) for the string.

Parse tree for "I see the cat.":

    <sentence>
        <subject>
            "I"
        <verb>
            "see"
        <object>
            "the"
            <noun>
                "cat"
        "."
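Both tasks can be sketched for micro-English in a few lines. This is an illustrative hand-written parser, not the general technique introduced later: the function names and the nested-tuple tree format are assumptions, and the input is assumed to be already split into a list of words with the full stop as its own token.

```python
NOUNS = {"cat", "mat", "rat"}
VERBS = {"like", "see", "is"}


def parse_sentence(tokens):
    """Return a parse tree (nested tuples), or None if tokens are not micro-English."""
    tokens = list(tokens)
    pos = 0

    def noun_phrase(label, pronoun):
        # Shared shape of <subject> and <object>:
        # a pronoun, or "a"/"the" followed by a <noun>.
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == pronoun:
            pos += 1
            return (label, pronoun)
        if (
            pos + 1 < len(tokens)
            and tokens[pos] in ("a", "the")
            and tokens[pos + 1] in NOUNS
        ):
            tree = (label, tokens[pos], ("<noun>", tokens[pos + 1]))
            pos += 2
            return tree
        return None

    subject = noun_phrase("<subject>", "I")
    if subject is None:
        return None
    if pos >= len(tokens) or tokens[pos] not in VERBS:
        return None
    verb = ("<verb>", tokens[pos])
    pos += 1
    obj = noun_phrase("<object>", "me")
    if obj is None:
        return None
    if tokens[pos:] != ["."]:  # exactly one token must remain: the full stop
        return None
    return ("<sentence>", subject, verb, obj, ".")
```

A result of None answers the membership question, and any other result is the parse tree itself, so one function covers both steps.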

Parsing in our re implementation

We need to be able to convert input strings of regular expressions into NFAs, i.e. we need to parse the language of regular expressions:

    /a(bc|d*)*e/

Parse tree for /a(bc|d*)*e/:

    <re>
        <simple-re>
            <basic-re>
                <elem-re> "a"
            <basic-re>
                <elem-re>
                    "("
                    <re>
                        <simple-re>
                            <basic-re>
                                <elem-re> "b"
                            <basic-re>
                                <elem-re> "c"
                        "|"
                        <re>
                            <simple-re>
                                <basic-re>
                                    <elem-re> "d"
                                    "*"
                    ")"
                "*"
            <basic-re>
                <elem-re> "e"

Parsing a calculator language

Here is a grammar for a basic calculator language:

    <e1> ::= <e2> ( "+" <e1> )?
    <e2> ::= <e3> ( "*" <e2> )?
    <e3> ::= "-"? ( "0" | "1" | ... | "9" )+
    <e3> ::= "(" <e1> ")"

- Come up with a string in the language of this grammar.
- Draw the parse tree for this string: 4 * (3 + 2)

Parse tree for 4*(3+2)

    <e1>
        <e2>
            <e3> "4"
            "*"
            <e2>
                <e3>
                    "("
                    <e1>
                        <e2>
                            <e3> "3"
                        "+"
                        <e1>
                            <e2>
                                <e3> "2"
                    ")"

Evaluating 4*(3+2)

Values are computed from the leaves of the parse tree upwards:
1. The <e1> inside the parentheses: 3+2=5
2. The outer <e2>: 4*5=20
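The bottom-up order can be sketched on a simplified tree. Note the assumptions: the tree below keeps only operators and numbers (it is not the full parse tree with <e1>, <e2> and <e3> nodes), and the tuple layout and the evaluate function are hypothetical.

```python
def evaluate(tree):
    """Evaluate ("+", left, right) / ("*", left, right) tuples with int leaves."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    # Children are evaluated first, so inner expressions finish before outer ones.
    left_value = evaluate(left)
    right_value = evaluate(right)
    return left_value + right_value if op == "+" else left_value * right_value


TREE = ("*", 4, ("+", 3, 2))  # 4*(3+2)
```

Calling evaluate(TREE) first reduces the inner ("+", 3, 2) to 5 and then computes 4*5.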


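Introduction

- There is a standard parsing technique for context-free grammars that uses recursion: recursive descent parsing.
- The idea is to come up with one function/method for each non-terminal in the grammar.
- Here we will learn how to construct a recursive descent parser from a CFG. (The direct translation shown here assumes the grammar has no left-recursive rules, since a method that calls itself before consuming any input would never terminate, and that looking at the next token is enough to choose a rule.)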

Balanced Language Grammar

We want to write a parser which accepts strings of the grammar:

    <s> ::= "a" <s> "b"
    <s> ::= "e"

What we're aiming for:

    >>> Parser('aeb').parse()
    True
    >>> Parser('aaebb').parse()
    True
    >>> Parser('aaebbb').parse()
    False

Note: this is called an undecorated parser because it does no more than check the syntactic correctness of the input.

Step 1: Basic parsing framework

    class Parser:
        def __init__(self, tokens):
            self._tokens = tokens         # the input sequence (here, a string of characters)
            self._length = len(tokens)
            self._upto = 0                # index of the next unread token

        def end(self):
            # True once every token has been consumed
            return self._upto == self._length

        def peek(self):
            # Look at the next token without consuming it (None at end of input)
            return None if self.end() else self._tokens[self._upto]

        def next(self):
            # Consume the next token, if there is one
            if not self.end():
                self._upto += 1

Step 2: Helper method for each non-terminal

        def _parse_s(self):
            if self.peek() == 'a':        # <s> ::= "a" <s> "b"
                self.next()               # move to the next input token
                ret = self._parse_s()     # recursively parse the <s>
                if not ret:
                    return False
                if self.peek() != 'b':    # assert the next input token
                    return False
                self.next()               # move to the next input token
            elif self.peek() == 'e':      # <s> ::= "e"
                self.next()
            else:                         # unknown input!
                return False
            return True

Step 3: Wrap the start-symbol parsing helper method

        def parse(self):
            # Succeed only if <s> matches and no input is left over
            return self._parse_s() and self.end()

Step 4: Test it

    >>> Parser('ab').parse()
    False
    >>> Parser('cd').parse()
    False
    >>> Parser('e').parse()
    True
    >>> Parser('aeb').parse()
    True
    >>> Parser('aaebb').parse()
    True
    >>> Parser('aaebbb').parse()
    False
    >>> Parser('aaccaaebbddbb').parse()
    False

Calculator

Let's build a calculator evaluator! (This is a decorated parser.) What we're aiming for:

    >>> Parser(['3','+','2']).parse().eval()
    5
    >>> Parser(['3','*','2']).parse().eval()
    6
    >>> Parser(['3','*','2','+','4']).parse().eval()
    10
    >>> Parser(['3','+','2','*','4']).parse().eval()
    11
    >>> Parser(['(','5','*','2',')','*','3']).parse().eval()
    30
    >>> Parser(['5','*','2','*','4']).parse().eval()
    40

Two steps:
1. Parse the input into something which can be evaluated.
2. Evaluate the returned object.

Build an object tree while parsing!

Here's the grammar we will use in this example:

    1 <e1> ::= <e2> ( "+" <e1> )?
    2 <e2> ::= <e3> ( "*" <e2> )?
    3 <e3> ::= "-"? ( "0" | "1" | ... | "9" )+
    4 <e3> ::= "(" <e1> ")"

- Rules 1 and 2: "+" will be evaluated after "*", so multiplication binds more tightly than addition.
- Rules 3 and 4: <e3> is either an integer or a compound (parenthesised) expression.

Expression Tree: Abstract Base Class

- Composite design pattern: an expression tree
- We need an abstract base class

    class Node:
        def __init__(self, left, right):
            self.left = left
            self.right = right

        def eval(self):
            # Each concrete subclass supplies its own evaluation rule
            raise NotImplementedError()

Expression Tree: Concrete Subclasses

We need concrete subclasses for each type of node:

    class AddNode(Node):
        def eval(self):
            return self.left.eval() + self.right.eval()

    class MultNode(Node):
        def eval(self):
            return self.left.eval() * self.right.eval()

    class LiteralNode(Node):
        def __init__(self, value):
            super().__init__(None, None)    # a literal is a leaf: no children
            self.value = value

        def eval(self):
            return self.value

Step 1: Basic parsing framework

    import re

    class Parser:
        RE_NUMBER = re.compile(r'-?[0-9]+')    # an integer token, optionally negative

        def __init__(self, tokens):
            self._tokens = tokens              # a list of token strings
            self._length = len(tokens)
            self._upto = 0                     # index of the next unread token

        def end(self):
            return self._upto == self._length

        def peek(self):
            return None if self.end() else self._tokens[self._upto]

        def next(self):
            if not self.end():
                self._upto += 1

Step 2: Helper method for each non-terminal (<e3>)

    <e3> ::= "-"? ( "0" | "1" | ... | "9" )+
    <e3> ::= "(" <e1> ")"

        def _parse_e3(self):
            token = self.peek()
            if token == '(':                   # <e3> ::= "(" <e1> ")"
                self.next()
                node = self._parse_e1()
                if self.peek() != ')':
                    raise Exception('Closing parenthesis not found!')
                self.next()
            elif token is not None and Parser.RE_NUMBER.fullmatch(token):
                node = LiteralNode(int(token)) # <e3> ::= integer literal
                self.next()
            else:
                # End of input, or a token that cannot start an <e3>
                raise Exception('Expected a number or "(" but found %r' % (token,))
            return node

Step 2: Helper method for each non-terminal (<e2>)

    <e2> ::= <e3> ( "*" <e2> )?

        def _parse_e2(self):
            node = self._parse_e3()
            if self.peek() == '*':
                self.next()
                node2 = self._parse_e2()       # the rest of the product
                node = MultNode(node, node2)
            return node

Step 2: Helper method for each non-terminal (<e1>)

    <e1> ::= <e2> ( "+" <e1> )?

        def _parse_e1(self):
            node = self._parse_e2()
            if self.peek() == '+':
                self.next()
                node2 = self._parse_e1()       # the rest of the sum
                node = AddNode(node, node2)
            return node

Step 3: Wrap the start-symbol parsing helper method

        def parse(self):
            node = self._parse_e1()
            if not self.end():
                raise Exception('Extra content found at end of input!')
            return node

Step 4: Test it

    >>> Parser(['3','+','2']).parse().eval()
    5
    >>> Parser(['3','*','2']).parse().eval()
    6
    >>> Parser(['3','*','2','+','4']).parse().eval()
    10
    >>> Parser(['3','+','2','*','4']).parse().eval()
    11
    >>> Parser(['(','5','*','2',')','*','3']).parse().eval()
    30
    >>> Parser(['5','*','2','*','4']).parse().eval()
    40
    >>> Parser(['5','+','2','+','4']).parse().eval()
    11
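The tests pass the parser a ready-made list of tokens. A small tokenizer can produce such a list from an ordinary string; the sketch below is an assumption about how that might look, and the names tokenize and TOKEN are hypothetical rather than part of the slides.

```python
import re

# One token: an optionally negative integer, or one of + * ( ),
# with any leading whitespace skipped.
TOKEN = re.compile(r"\s*(-?[0-9]+|[+*()])")


def tokenize(text):
    """Hypothetical helper: split '3 + 2*4' into ['3', '+', '2', '*', '4']."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("Cannot tokenize input at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens
```

With the calculator Parser defined, Parser(tokenize('3 + 2*4')).parse().eval() would then be equivalent to the fourth test. Because the grammar has no subtraction operator, a minus sign is always read as part of the following number.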