CSC324 Formal Language Theory

Afsaneh Fazly
January 16, 2013

Thanks to A. Tafliovich, S. McIlraith, E. Joanis, S. Stevenson, G. Penn, D. Horton

A Few Admin Notes

Office hours (in BA 4237):
    Monday 3-4pm
    Wednesday 1-2pm
There will be a lecture on Friday, January 18, 2013 at 2pm.

Dealing with Ambiguity: Precedence

Impose precedence over operators:
    Low precedence: addition + and subtraction -
    Medium precedence: multiplication * and division /
    Higher precedence: exponentiation ^
    Highest precedence: parenthesized expressions ( <expr> )

Change the grammar by:
    introducing a distinct non-terminal for each precedence level.
    designing the grammar so that operators with higher precedence are lower in the parse tree (i.e., evaluated before lower-precedence operators).

Example

Remember G1:
    <expn> --> <expn> + <expn>
             | <expn> - <expn>
             | <expn> * <expn>
             | <expn> / <expn>
             | <expn> ^ <expn>
             | (<expn>)
             | <identifier>
             | <literal>

Introduce precedence into G1.
Example (cont'd)

Write grammar G3:
    <expn> -->

Grouping in the parse tree now reflects precedence.
Examples:
    parse 8 - 3 * 2
    parse 3 - 2 - 1

Dealing with Ambiguity: Associativity

We still have ambiguity. Example: 3 - 2 - 1 is still a problem.
The grouping of operators of the same precedence is not disambiguated.
For operators that are not associative (e.g., subtraction), only one parse tree is correct.
Operators may be left or right associative.

Dealing with Ambiguity: Associativity

Deals with operators of the same precedence.
Implicit grouping or parenthesizing.
Left associative: *, /, +, -.
Right associative: ^.
Approach:
    for left-associative operators, put the recursive term to the left of the non-recursive term in a production rule.
    for right-associative operators, put the recursive term to the right of the non-recursive term in a production rule.

Example

Remember G3:
    <expr>   --> <expr> + <expr> | <expr> - <expr> | <term>
    <term>   --> <term> * <term> | <term> / <term> | <factor>
    <factor> --> <factor> ^ <factor> | <X>
    <X>      --> (<expr>) | <identifier> | <literal>

Introduce associativity.
Grammar G4:
    <expn> -->
Example (cont'd)

Grammar G4:
    <expr>   --> <expr> + <term> | <expr> - <term> | <term>
    <term>   --> <term> * <factor> | <term> / <factor> | <factor>
    <factor> --> <X> ^ <factor> | <X>
    <X>      --> (<expr>) | <identifier> | <literal>

Dealing with Ambiguity

We can't always remove an ambiguity from a grammar by restructuring productions.
An inherently ambiguous language does not have a corresponding unambiguous grammar.
Example of an inherently ambiguous language:
    L = { a^i b^j c^k | i, j, k ≥ 1, i = j or j = k }
Write a CFG for L:

Limitations of CFGs

CFGs are not powerful enough to describe some languages.
Examples:
    { a^i b^i c^i | i ≥ 1 }
    { a^m b^n c^m d^n | m, n ≥ 1 }
Question: Exactly what things can and cannot be expressed with a CFG?
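Returning to grammar G4 from the example above, here is a minimal sketch of how its shape fixes both precedence and associativity. It is a hand-written recursive-descent parser in Python, not something taken from the course material. The tokenizer, the `Parser` class, and the tuple-based tree representation are all assumptions made for illustration. Because recursive descent cannot follow a left-recursive rule directly, the left recursion in `<expr>` and `<term>` is replaced by a loop that folds to the left, which yields the same left-associative grouping; `<factor>` keeps its right recursion as written in G4.

```python
import re

# Hypothetical token pattern: integer literals, identifiers, operators, parentheses.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/^()])")
OPERATORS = {"+", "-", "*", "/", "^", ")"}

def tokenize(text):
    """Split the input into a list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Builds nested tuples (op, left, right); leaves are token strings."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # <expr> --> <expr> + <term> | <expr> - <term> | <term>
        # Left recursion rewritten as a loop; folding left keeps + and - left associative.
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        # <term> --> <term> * <factor> | <term> / <factor> | <factor>
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # <factor> --> <X> ^ <factor> | <X>   (right recursion: ^ is right associative)
        node = self.x()
        if self.peek() == "^":
            self.advance()
            node = ("^", node, self.factor())
        return node

    def x(self):
        # <X> --> (<expr>) | <identifier> | <literal>
        tok = self.advance()
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or tok in OPERATORS:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

print(parse("8-3 * 2"))
print(parse("3-2 - 1"))
print(parse("2^3^2"))
```

The three calls cover the two expressions from the earlier parse exercises plus one exponentiation chain: `*` ends up below `-` in the tree, the two subtractions group to the left, and the two exponentiations group to the right.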
Specifying PL Syntax

Regular grammars:
    appropriate for lexical analysis: describing tokens such as numbers, identifiers, etc.
CFGs:
    can describe nested constructs, matching pairs of items.
    appropriate for syntactic analysis: describing the hierarchical structure of program statements, expressions, and other program units.
CFGs for Specifying PL Syntax

Some aspects of PL syntax can't be specified with CFGs:
    Cannot declare the same identifier twice in the same block.
    Must declare an identifier before using it.
    A[i,j] is valid only if A is two-dimensional.
    The number of actual parameters must equal the number of formal parameters.
    The operands of an operator must have compatible types.
These aspects are usually specified informally, separately from the formal grammar, and are checked during semantic analysis.
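As a rough illustration of what such a semantic check might look like, the sketch below covers the first two rules (no duplicate declaration in the same block, declaration before use) in Python. The program representation is an assumption invented for this example: a flat list of `(kind, name)` pairs where `kind` is one of `"begin"`, `"end"`, `"declare"`, or `"use"`. A real compiler would walk a parse tree instead, and the function name `check_declarations` is hypothetical.

```python
def check_declarations(statements):
    """Return a list of error messages for a flat list of (kind, name) pairs.

    Assumes "begin"/"end" pairs are balanced; name is None for both.
    """
    scopes = [set()]   # stack of blocks; each block holds its declared names
    errors = []
    for kind, name in statements:
        if kind == "begin":
            scopes.append(set())
        elif kind == "end":
            scopes.pop()
        elif kind == "declare":
            # Only the innermost block matters for the duplicate rule.
            if name in scopes[-1]:
                errors.append(f"'{name}' declared twice in the same block")
            scopes[-1].add(name)
        elif kind == "use":
            # A name is visible if any enclosing block has declared it so far.
            if not any(name in scope for scope in scopes):
                errors.append(f"'{name}' used before declaration")
    return errors

program = [
    ("declare", "x"),
    ("use", "x"),
    ("begin", None),
    ("declare", "x"),   # fine: a new block may redeclare x
    ("use", "y"),       # error: y was never declared
    ("end", None),
    ("declare", "x"),   # error: x is already declared in the outer block
]
for message in check_declarations(program):
    print(message)
```

The check needs a symbol table that grows as the program is read, which is exactly the kind of context a context-free production cannot carry.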
Translation Process: Revisited

1. Lexical Analysis: Converts source code into a sequence of tokens. We use regular grammars and finite state automata (recognizers).
2. Syntactic Analysis: Structures tokens into an initial parse tree. We use CFGs and parsing algorithms.
3. Semantic Analysis: Annotates the parse tree with semantic actions, e.g., making sure that the types match.
4. Code Generation: Produces final machine code.

Want to Learn More?

Take CSC488 Compilers & Interpreters!

Other Applications of Formal Grammars

Identifying strings for an operating system command, e.g., Unix commands that use extended REs:
    ls s[y-z]*
    grep Se.h syntax.tex
    awk '/to[kg]e/ {print $1}' syntax.tex
Automatic voice recognition: given recorded speech, produce a string containing the words that were spoken.
    Challenges?
    How can a grammar help?
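To see what the three Unix patterns above select, here is a small Python sketch that approximates them with the standard `fnmatch` and `re` modules. The helper names and the sample file names and lines are made up for illustration. Python's regular-expression syntax is not identical to POSIX extended REs, and `fnmatch` only approximates shell globbing, but for these simple patterns the behaviour should be the same.

```python
import fnmatch
import re

def ls_glob(names, pattern):
    """Rough analogue of `ls PATTERN`: keep names matching a shell-style glob."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]

def grep(lines, pattern):
    """Rough analogue of `grep PATTERN file`: keep lines containing a match."""
    return [ln for ln in lines if re.search(pattern, ln)]

def awk_first_field(lines, pattern):
    """Rough analogue of `awk '/PATTERN/ {print $1}' file`."""
    result = []
    for ln in lines:
        if re.search(pattern, ln):
            fields = ln.split()
            result.append(fields[0] if fields else "")
    return result

# Made-up sample data, not the contents of the real syntax.tex.
names = ["syntax.tex", "sz.txt", "semantics.tex", "style.css"]
lines = [
    "Seth wrote the lexer",
    "each token is classified",
    "put it together",
    "Search is unrelated",
]

print(ls_glob(names, "s[y-z]*"))          # names starting with s, then y or z
print(grep(lines, "Se.h"))                # S, e, any one character, h
print(awk_first_field(lines, "to[kg]e"))  # first word of lines with "toke" or "toge"
```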