ITEC2620 Introduction to Data Structures
Lecture 10a
Grammars II

Review
- Three main problems
  - Derive a sentence from a grammar
  - Develop a grammar for a language
  - Given a grammar, determine if an input is valid (parsing)
- We have done derivation
- The next two parts are more interesting

Parse Trees I
- The derivation of a sentence can be visualized as a tree: the parse tree.
- The root of this tree is the start symbol, and the leaves of the tree are the terminals which form a sentence.

Parse Trees II
- Example
  G = ( T, N, S, P )
  T = { id, + }
  N = { E }
  S = E
  P: E → E + E
     E → id

Parse Trees III
- Derivation: E ⇒ E + E ⇒ id + E ⇒ id + id
- The parse tree grows with each derivation step; the final tree is:

        E
      / | \
     E  +  E
     |     |
     id    id

Parse Trees IV
- Derivation is the process of rewriting the start symbol into the sentence
  - Production rules are used to create a sentence
- Parsing is the reverse
  - Which production rules were used to create this sentence?

Parse Trees V
- How is parsing done?
  - Performed by building the parse tree
- Note: the final tree does not show the order in which the rules were performed
  - Only shows which rules were used

Building Parse Trees I
- How can we build a parse tree?
- Left-most derivation
  - Always expand left-most non-terminal
- Right-most derivation
  - Always expand right-most non-terminal

Building Parse Trees II
- Example
  G = ( T, N, S, P )
  T = { a, b, c, d, e }
  N = { S, A, B }
  P: S → aABe
     A → Abc | b
     B → d
- Is sentence abbcde valid?

Building Parse Trees III
- Left-most derivation
  S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde

          S
      / /   \ \
     a A     B e
      /|\    |
     A b c   d
     |
     b

Building Parse Trees IV
- Right-most derivation
  S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

          S
      / /   \ \
     a A     B e
      /|\    |
     A b c   d
     |
     b

Building Parse Trees V
- In the previous example, the same parse tree was derived
- When we parse the sentence abbcde, we discover exactly one set of rules which could have created it
- The sentence has a single meaning

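The two derivations above can be replayed mechanically. The following is a minimal Python sketch, assuming the grammar is stored as a dictionary and that upper-case letters are non-terminals. The names `RULES`, `derive`, `leftmost` and `rightmost` are illustrative and do not come from the lecture.

```python
# Grammar from the example: S -> aABe, A -> Abc | b, B -> d.
# Assumption: upper-case letters are non-terminals, lower-case are terminals.
RULES = {
    "S": ["aABe"],
    "A": ["Abc", "b"],
    "B": ["d"],
}

def leftmost(form):
    """Index of the left-most non-terminal in a sentential form, or -1."""
    for i, ch in enumerate(form):
        if ch.isupper():
            return i
    return -1

def rightmost(form):
    """Index of the right-most non-terminal in a sentential form, or -1."""
    for i in range(len(form) - 1, -1, -1):
        if form[i].isupper():
            return i
    return -1

def derive(choices, pick):
    """Replay a derivation starting from S.

    choices: for each step, which alternative (0-based) of the expanded
             non-terminal's rule to apply.
    pick:    leftmost or rightmost, selecting the non-terminal to expand.
    Returns the list of sentential forms, one per step.
    """
    form = "S"
    steps = [form]
    for choice in choices:
        i = pick(form)
        form = form[:i] + RULES[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps

left = derive([0, 0, 1, 0], leftmost)
right = derive([0, 0, 0, 1], rightmost)
print(" => ".join(left))   # S => aABe => aAbcBe => abbcBe => abbcde
print(" => ".join(right))  # S => aABe => aAde => aAbcde => abbcde
```

Both runs apply the same four productions (S → aABe, A → Abc, A → b, B → d) and differ only in the order, which is why they produce the same parse tree.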
Building Parse Trees VI
- Another example
  G = ( T, N, S, P )
  T = { id, +, * }
  N = { E }
  S = E
  P: E → E + E | E * E | id
- Is id + id * id a valid sentence?

Building Parse Trees VII
- Left-most derivation
  E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id

        E
      / | \
     E  +  E
     |   / | \
     id E  *  E
        |     |
        id    id

Building Parse Trees VIII
- Right-most derivation
  E ⇒ E * E ⇒ E * id ⇒ E + E * id ⇒ E + id * id ⇒ id + id * id

            E
          / | \
         E  *  E
       / | \   |
      E  +  E  id
      |     |
      id    id

Building Parse Trees IX
- Different rules lead to different parse trees, which lead to different meanings of the sentence
- Left-most derivation above: the last two terms are multiplied first
  - 3 + 2 * 4 = 3 + (2 * 4) = 11
- Right-most derivation above: the first two terms are added first
  - 3 + 2 * 4 = (3 + 2) * 4 = 20

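To see how the two parse trees above give two different values, here is a small Python sketch. It assumes a parse tree is written as a nested tuple `(left, operator, right)` with integers standing in for the terminal id; this representation and the name `evaluate` are illustrative choices, not part of the lecture.

```python
# A parse tree is either an integer (a leaf, standing in for id)
# or a tuple (left, operator, right).

def evaluate(tree):
    """Evaluate a parse tree bottom-up: children first, then the operator."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

# Tree from the left-most derivation: E + E at the root, E * E below it.
tree_plus_at_root = (3, "+", (2, "*", 4))
# Tree from the right-most derivation: E * E at the root, E + E below it.
tree_times_at_root = ((3, "+", 2), "*", 4)

print(evaluate(tree_plus_at_root))   # 11
print(evaluate(tree_times_at_root))  # 20
```

The operator nearest the root is evaluated last, so the shape of the tree alone decides the meaning of the sentence.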
Building Parse Trees X
- These sentences are ambiguous
  - Two meanings for the same sentence
- "The man has a cat and a dog with black hair."
  - Does the cat have black hair?
  - Depends on how the sentence is parsed
- English is an ambiguous language!

Building Parse Trees XI
- Programming languages cannot be ambiguous
  - 3 + 2 * 4
- Add precedence rules
  - Add non-terminals to ensure that certain production rules (e.g. addition) cannot be applied before other production rules
  - Note the inversion: the operator evaluated last (e.g. addition) is the one whose rule is applied first in the derivation, nearest the root of the parse tree

Languages I
- If we know the type of sentences that can appear in a language
  - Can we develop a grammar for it?
- Identify terminals
- Specify a start symbol
- Add non-terminals and production rules as necessary

Languages II
- Example
  - Design a grammar for the language in which all sentences consist of lower-case letters separated by commas (,)

Languages III
- Identify terminals
  - all sentences consist of lower-case letters separated by commas (,)
  - T = { a, b, c, ..., z, "," }

Languages IV
- Specify a start symbol
  - Start with S
  - Then try to put as much information as possible into the production rules from S
  - S = S

Languages V
- Add non-terminals and production rules as necessary
  - N = { S, A, ch }
  - A: non-terminal useful for replication
  - ch: the class of lower-case letters

Languages VI
- P: S → A | ε
  - The null sentence is a valid sentence
- A → ch | ch,A
  - Terminate or replicate
- ch → a | b | c | ... | z
  - ch is the class of lower-case letters

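The production rules above translate almost line for line into a recognizer. The following Python sketch is one possible way to do it, assuming the sentence is given as a string with no whitespace; the function names `is_letter_list` and `_match_A` are made up for this illustration.

```python
import string

def is_letter_list(sentence):
    """Return True if sentence can be derived from S.

    Follows the productions directly:
      S  -> A | ε
      A  -> ch | ch,A
      ch -> a | b | ... | z
    Assumption: the sentence contains no whitespace.
    """
    if sentence == "":              # S -> ε (the null sentence)
        return True
    return _match_A(sentence, 0) == len(sentence)

def _match_A(s, pos):
    """Try to match A starting at pos; return the end position, or -1."""
    if pos >= len(s) or s[pos] not in string.ascii_lowercase:
        return -1                   # ch must be a single lower-case letter
    pos += 1
    if pos < len(s) and s[pos] == ",":
        return _match_A(s, pos + 1) # A -> ch,A (replicate)
    return pos                      # A -> ch   (terminate)

print(is_letter_list("x,y,z"))  # True
print(is_letter_list("x,"))     # False: a comma must be followed by a letter
print(is_letter_list("xy"))     # False: letters must be separated by commas
```

Each `if` corresponds to one alternative of a production, which makes it easy to check the code against the grammar by eye.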
Languages VII
- Test sentence: x,y,z
  S ⇒ A ⇒ ch,A ⇒ x,A ⇒ x,ch,A ⇒ x,y,A ⇒ x,y,ch ⇒ x,y,z

Languages VIII
- Testing grammars
  - Try multiple examples
  - Try both valid and invalid sentences
- Testing is not proving
  - Formal proofs of grammars are extremely difficult

Expressions I
- G = ( T, N, S, P )
- Identify terminals
  - T = { true, false, &&, ||, !, (, ) }
  - Boolean values
  - Boolean operators
  - Parentheses

Expressions II
- Specify a start symbol
  - S = EXPR
  - Start with a simple expression

Expressions III
- Add non-terminals and production rules as necessary
  - N = { EXPR, OP, BOOL }
  - EXPR = a simple expression
  - OP = a binary Boolean operator
  - BOOL = a Boolean value

Expressions IV
- P: EXPR → BOOL
  - A single Boolean value
- EXPR → EXPR OP EXPR
  - Two expressions with a binary operator
- EXPR → ! EXPR
  - The inversion of an expression
- EXPR → ( EXPR )
  - An expression in parentheses

Expressions V
- P: OP → &&
     OP → ||
  - The class of Boolean operators
- BOOL → true
  BOOL → false
  - The class of Boolean values

Expressions VI
- Parse tree for ( true || false ) && ! ( false ), shown as an indented outline (children are indented under their parent):

  EXPR
    EXPR
      (
      EXPR
        EXPR → BOOL → true
        OP → ||
        EXPR → BOOL → false
      )
    OP → &&
    EXPR
      !
      EXPR
        (
        EXPR → BOOL → false
        )

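The rule EXPR → EXPR OP EXPR is ambiguous in the same way as E → E + E | E * E earlier, so a parser has to pick a precedence. The Python sketch below is one hedged way to do that: it assumes Java's precedence (! binds tightest, then &&, then ||) and adds one helper per precedence level. The helper names (`_or`, `_and`, `_not`, `_primary`) and this layering are assumptions for illustration; they are not the grammar given in the lecture.

```python
import re

# Terminals of T: true, false, &&, ||, !, (, )
TOKEN = re.compile(r"\s*(true|false|&&|\|\||!|\(|\))")

def tokenize(text):
    """Split text into terminals; raise ValueError on anything else."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def evaluate(text):
    """Parse and evaluate a Boolean expression; ValueError if invalid."""
    tokens = tokenize(text)
    value, pos = _or(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return value

def _or(tokens, pos):
    # Lowest precedence: a || b || ...
    value, pos = _and(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "||":
        right, pos = _and(tokens, pos + 1)
        value = value or right
    return value, pos

def _and(tokens, pos):
    # Middle precedence: a && b && ...
    value, pos = _not(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "&&":
        right, pos = _not(tokens, pos + 1)
        value = value and right
    return value, pos

def _not(tokens, pos):
    # Highest precedence: ! applied to the expression that follows
    if pos < len(tokens) and tokens[pos] == "!":
        value, pos = _not(tokens, pos + 1)
        return (not value), pos
    return _primary(tokens, pos)

def _primary(tokens, pos):
    # A Boolean value or a parenthesized expression
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok in ("true", "false"):
        return tok == "true", pos + 1
    if tok == "(":
        value, pos = _or(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing ')'")
        return value, pos + 1
    raise ValueError(f"unexpected token: {tok}")

print(evaluate("( true || false ) && ! ( false )"))  # True
print(evaluate("true || false && false"))            # True
```

With this layering, true || false && false is read as true || (false && false); grouping it as (true || false) && false would give false instead, which is exactly the kind of ambiguity the extra levels remove.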
Statements I
- G = ( T, N, S, P )
- Identify terminals
  - T = { if, else, EXPR, STMT, (, ), {, } }
  - if, else: Java keywords
  - EXPR: start symbol for Boolean expressions
  - STMT: start symbol for Java code
  - Parentheses and braces

Statements II
- Specify a start symbol
  - S = S
  - Put the information in the first production rule

Statements III
- Add non-terminals and production rules as necessary
  - N = { S, IF, ELSEIF, ELSE, THEN, STMTS }
  - IF = start of an if statement
  - ELSEIF = optional and multiple else ifs
  - ELSE = optional else
  - THEN = conditional statements
  - STMTS = multiple lines of Java code

Statements IV
- P: S → IF ELSEIF ELSE
  - Specify the general structure of an if statement
- IF → if ( EXPR ) THEN
  - The if part of an if statement
- ELSEIF → else if ( EXPR ) THEN ELSEIF
  ELSEIF → ε
  - Optional and arbitrary number of else ifs

Statements V
- ELSE → else THEN
  ELSE → ε
  - Optional (and only one) else
  - else doesn't have a Boolean expression
- THEN → STMT
  THEN → { STMTS }
  - Conditional code can be a single statement or multiple statements inside of braces

Statements VI
- STMTS → STMT STMTS
  STMTS → ε
  - A block of code can be 0, 1, 2, ..., n lines of code
  - Note: an empty THEN is valid if it is in braces

Summary
- (Old) Artificial Intelligence is about representing knowledge in formats that allow brute force computation to extract meaning
- Grammars allow us to parse sentences
  - Determine the rules used to construct them
- Define languages for specific purposes
  - E.g. programming languages

Readings and Assignments
- Suggested Readings from Shaffer (third edition)