10/16/2006 Systems Software

Parsing

Determine whether an input string is a sentence of G.
G is a context-free grammar (covered later), assumed to be unambiguous.
Parsing is recognition of the string plus determination of its phrase structure.
We constantly do this.

Parsing

Compare bottom-up and top-down parsing using the following simple grammar.

    Sentence ::= Subject Verb Object .
    Subject  ::= I | a Noun | the Noun
    Object   ::= me | a Noun | the Noun
    Noun     ::= cat | mat | rat
    Verb     ::= like | is | see | sees

Input string: the cat sees a rat .

Bottom-up parsing of this input proceeds as follows.

First token: the. The parser can't do much with it yet, because two options are available (the Noun appears in both the Subject rule and the Object rule).
Second token: cat. Apply the Noun rule. This creates a Noun subtree, which will become part of the subject of the sentence once the Subject rule is applied.

Next input: sees. Using the Verb rule, this creates a Verb tree.
a is ambiguous at this point, so the parser moves on. The following terminal symbol, rat (Noun rule), creates another Noun tree.
Combined with the a, it creates the Object tree via the Object rule.

When the parser reaches the terminal ".", the Sentence rule can be applied.
The parse has succeeded: the string of input symbols has been reduced to a single S-tree, where S is the start symbol of the grammar G (here, Sentence).
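The bottom-up walk-through above can be sketched as a small shift-reduce recognizer. This is only an illustration under stated assumptions, not the method from the slides: the class name BottomUpSketch, the string-based stack, and the shortcut of deciding between Subject and Object from what is already on the stack (nothing before it means Subject, a Verb before it means Object) are all choices made here for brevity. It recognizes sentences but does not build the trees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a hand-written shift-reduce recognizer for the micro-English grammar.
final class BottomUpSketch {
    static final Set<String> NOUNS = Set.of("cat", "mat", "rat");
    static final Set<String> VERBS = Set.of("like", "is", "see", "sees");
    static final Set<String> OTHER_TERMINALS = Set.of("I", "me", "a", "the", ".");

    /** Returns true if the tokens can be reduced to the single symbol Sentence. */
    static boolean recognize(List<String> tokens) {
        List<String> stack = new ArrayList<>();
        for (String token : tokens) {
            // Reject anything that is not a terminal of the grammar, so that an
            // input word can never be mistaken for a nonterminal name on the stack.
            if (!NOUNS.contains(token) && !VERBS.contains(token)
                    && !OTHER_TERMINALS.contains(token)) {
                return false;
            }
            stack.add(token);                 // shift
            while (reduce(stack)) {           // reduce for as long as a rule applies
                // nothing else to do; reduce() mutates the stack
            }
        }
        return stack.equals(List.of("Sentence"));
    }

    /** Applies one reduction at the top of the stack; returns false if none applies. */
    static boolean reduce(List<String> s) {
        int n = s.size();
        String top = s.get(n - 1);
        if (NOUNS.contains(top)) { s.set(n - 1, "Noun"); return true; }   // Noun ::= cat | mat | rat
        if (VERBS.contains(top)) { s.set(n - 1, "Verb"); return true; }   // Verb ::= like | is | see | sees
        if (top.equals("I") && n == 1) { s.set(0, "Subject"); return true; }  // Subject ::= I
        if (top.equals("me") && n >= 2 && s.get(n - 2).equals("Verb")) {      // Object ::= me
            s.set(n - 1, "Object");
            return true;
        }
        if (top.equals("Noun") && n >= 2
                && (s.get(n - 2).equals("a") || s.get(n - 2).equals("the"))) {
            String phrase;
            if (n == 2) phrase = "Subject";                          // nothing before the determiner
            else if (s.get(n - 3).equals("Verb")) phrase = "Object"; // a Verb before the determiner
            else return false;
            s.remove(n - 1);          // pop the Noun
            s.set(n - 2, phrase);     // replace the determiner with the phrase
            return true;
        }
        if (top.equals(".") && n == 4
                && s.subList(0, 3).equals(List.of("Subject", "Verb", "Object"))) {
            s.clear();                // Sentence ::= Subject Verb Object .
            s.add("Sentence");
            return true;
        }
        return false;
    }
}
```

Calling BottomUpSketch.recognize(List.of("the", "cat", "sees", "a", "rat", ".")) follows the same order of reductions as the walk-through: Noun, Subject, Verb, Noun, Object, then Sentence.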
Top-down parsing works as follows.
The parser examines the terminal symbols of the input string in order, from left to right, and reconstructs its syntax tree from the top down, from the root node to the terminal nodes.

Same example sentence.
Decide which rule to apply to the Sentence node. There is only one.
It generates 4 stubs not yet connected to the input string: Subject, Verb, Object and ".".

Leftmost stub: Subject.
The parser decides which rule to apply. There are 3 to choose from, and only one is appropriate: Subject ::= the Noun.
It connects the stub the to the input terminal the and makes a new stub labelled Noun.
The stubs are now Noun, Verb, Object and ".".

The leftmost stub is now the node Noun.
The parser decides which rule to apply. In this case it chooses Noun ::= cat and connects the Noun node to the cat terminal.
It continues in this way until the sentence is parsed.

For a context-free grammar like G:
The parser starts with the root node.
At each step it takes the leftmost stub.
If the stub is a terminal symbol t, the parser connects it to the next input terminal symbol, which must also be t (this is how the stub the was connected to the input the). Otherwise, there is a syntax error.
If the stub is a nonterminal symbol N, it must be replaced using one of N's production rules.
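The leftmost-stub procedure above can be sketched with an explicit stack of stubs. This is a minimal sketch under assumptions: the class name TopDownSketch, the helper choose, and the "<end>" marker for exhausted input are hypothetical names introduced here, and the code only recognizes sentences rather than building the syntax tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: top-down recognition driven by a stack of stubs.
final class TopDownSketch {
    static final Set<String> NONTERMINALS =
            Set.of("Sentence", "Subject", "Object", "Noun", "Verb");
    static final Set<String> NOUNS = Set.of("cat", "mat", "rat");
    static final Set<String> VERBS = Set.of("like", "is", "see", "sees");

    /** Picks the right-hand side for a nonterminal stub from the next input terminal, or null. */
    static List<String> choose(String nonterminal, String next) {
        switch (nonterminal) {
            case "Sentence":
                return List.of("Subject", "Verb", "Object", ".");
            case "Subject":
                if (next.equals("I")) return List.of("I");
                if (next.equals("a") || next.equals("the")) return List.of(next, "Noun");
                return null;
            case "Object":
                if (next.equals("me")) return List.of("me");
                if (next.equals("a") || next.equals("the")) return List.of(next, "Noun");
                return null;
            case "Noun":
                return NOUNS.contains(next) ? List.of(next) : null;
            case "Verb":
                return VERBS.contains(next) ? List.of(next) : null;
            default:
                return null;
        }
    }

    /** Returns true if the input is a sentence of the micro-English grammar. */
    static boolean recognize(List<String> input) {
        Deque<String> stubs = new ArrayDeque<>();
        stubs.push("Sentence");                       // start with the root node
        int pos = 0;
        while (!stubs.isEmpty()) {
            String stub = stubs.pop();                // take the leftmost stub
            String next = pos < input.size() ? input.get(pos) : "<end>";
            if (NONTERMINALS.contains(stub)) {
                List<String> rhs = choose(stub, next);
                if (rhs == null) return false;        // no production fits: syntax error
                // Push in reverse so the leftmost symbol of the rule ends up on top.
                for (int i = rhs.size() - 1; i >= 0; i--) stubs.push(rhs.get(i));
            } else if (stub.equals(next)) {
                pos++;                                // terminal stub connected to the input
            } else {
                return false;                         // terminal mismatch: syntax error
            }
        }
        return pos == input.size();                   // nothing may follow the sentence
    }
}
```

The stack holds the stubs from left to right, so popping always yields the leftmost one, which matches the order in which the walk-through expands Subject, then Noun, then Verb and Object.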
Which production rule?
In this micro-English example the choice is easy: look at the next input terminal symbol.
If it is I, then Subject ::= I is used; if it is the, then Subject ::= the Noun is used.
Other grammars are not so convenient, and some prove unsuitable for this method of parsing.

Parsing

The parser has to make choices between production rules.
A parsing algorithm is defined by its way of making these choices.
Recursive descent is one such algorithm.
It consists of a group of methods parseN, one for each nonterminal N of the grammar G.
The task of each method is to parse a single N-phrase.

    private void parseNoun();      // parse a noun
    private void parseVerb();      // parse a verb
    private void parseSubject();   // parse a subject
    private void parseObject();    // parse an object
    private void parseSentence();  // parse a whole sentence

The methods cooperate to parse the input.
parseSentence parses the whole string: it delegates work to the others and accepts the last terminal, ".", itself.
A class is needed to contain the methods: the Parser class.
It contains an instance variable, currentTerminal.

The Parser class is declared as:

    public class Parser {
        private TerminalSymbol currentTerminal;
        // auxiliary methods here
        // parsing methods here
    }

currentTerminal is accessed by the following method.

Recursive Descent

    private void accept(TerminalSymbol expectedTerminal) {
        if (currentTerminal matches expectedTerminal)
            currentTerminal = next input terminal;
        else
            report a syntax error
    }

The parser calls accept(t) when it expects the current terminal to be t and wants to check this before getting the next input terminal.
The parsing method for a whole sentence, named parseSentence after its nonterminal, is implemented as:

    // Sentence ::= Subject Verb Object .
    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept('.');
    }

Method parseSubject

Consider the rule for a subject. It has three forms:

    I
    a Noun
    the Noun

The method has to decide which form applies, based on the current terminal symbol.
On entry to this method, the current terminal symbol should be the first symbol of the subject.
Any other condition is an error.

    private void parseSubject() {
        if (currentTerminal matches 'I')
            accept('I');
        else if (currentTerminal matches 'a') {
            accept('a');
            parseNoun();
        }
        else if (currentTerminal matches 'the') {
            accept('the');
            parseNoun();
        }
        else
            report a syntax error
    }

parseNoun is straightforward for our micro-English example.
A noun must be cat, rat or mat.
The method just checks the contents of currentTerminal to discover which it is.

    private void parseNoun() {
        if (currentTerminal matches 'cat')
            accept('cat');
        else if (currentTerminal matches 'rat')
            accept('rat');
        else if (currentTerminal matches 'mat')
            accept('mat');
        else
            report a syntax error
    }

parseObject and parseVerb are similar to parseNoun.

The parser is initiated using:

    public void parse() {
        currentTerminal = first input terminal;
        parseSentence();
        check that no terminal follows the sentence
    }
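Putting the pieces together, the pseudocode above might be turned into compilable Java roughly as follows. This is a sketch under assumptions the slides do not specify: terminals are plain strings held in a list, "report a syntax error" is rendered as a hypothetical SyntaxError exception, and the names MicroEnglishParser, END and position are introduced here for illustration.

```java
import java.util.List;

// Hypothetical sketch: a recursive descent parser for the micro-English grammar.
final class MicroEnglishParser {
    /** Assumed error-reporting mechanism; the slides only say "report a syntax error". */
    static final class SyntaxError extends RuntimeException {
        SyntaxError(String message) { super(message); }
    }

    private static final String END = "<end>";   // stands in for "no more input"
    private final List<String> input;
    private int position;
    private String currentTerminal;

    MicroEnglishParser(List<String> input) { this.input = input; }

    // Checks the current terminal, then moves on to the next input terminal.
    private void accept(String expectedTerminal) {
        if (!currentTerminal.equals(expectedTerminal)) {
            throw new SyntaxError("expected " + expectedTerminal + " but found " + currentTerminal);
        }
        position++;
        currentTerminal = position < input.size() ? input.get(position) : END;
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept(".");
    }

    // Subject ::= I | a Noun | the Noun
    private void parseSubject() {
        switch (currentTerminal) {
            case "I":   accept("I"); break;
            case "a":   accept("a"); parseNoun(); break;
            case "the": accept("the"); parseNoun(); break;
            default: throw new SyntaxError("not the start of a subject: " + currentTerminal);
        }
    }

    // Object ::= me | a Noun | the Noun
    private void parseObject() {
        switch (currentTerminal) {
            case "me":  accept("me"); break;
            case "a":   accept("a"); parseNoun(); break;
            case "the": accept("the"); parseNoun(); break;
            default: throw new SyntaxError("not the start of an object: " + currentTerminal);
        }
    }

    // Noun ::= cat | mat | rat
    private void parseNoun() {
        switch (currentTerminal) {
            case "cat": case "mat": case "rat": accept(currentTerminal); break;
            default: throw new SyntaxError("not a noun: " + currentTerminal);
        }
    }

    // Verb ::= like | is | see | sees
    private void parseVerb() {
        switch (currentTerminal) {
            case "like": case "is": case "see": case "sees": accept(currentTerminal); break;
            default: throw new SyntaxError("not a verb: " + currentTerminal);
        }
    }

    /** Parses the whole input; throws SyntaxError if it is not exactly one sentence. */
    public void parse() {
        position = 0;
        currentTerminal = input.isEmpty() ? END : input.get(0);
        parseSentence();
        if (position != input.size()) {           // no terminal may follow the sentence
            throw new SyntaxError("unexpected input after sentence: " + currentTerminal);
        }
    }
}
```

For example, new MicroEnglishParser(List.of("the", "cat", "sees", "a", "rat", ".")).parse() returns normally, while an input such as the cat sees . raises SyntaxError inside parseObject. Throwing an exception is only one way to report the error; a real parser might record a message and attempt recovery instead.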
In general, the methods cooperate.
The variable currentTerminal contains the current input terminal, and all methods have access to it.
On entry to parseN, currentTerminal contains the first terminal of the N-phrase.
On exit, it contains the input terminal immediately following that phrase.
On entry to accept with argument t, currentTerminal is expected to contain t.
On exit, currentTerminal is supposed to contain the next input terminal.