Chapter 3. Topics. Languages. Formal Definition of Languages. BNF and Context-Free Grammars. Grammar 2/4/2019

Size: px
Start display at page:

Download "Chapter 3. Topics. Languages. Formal Definition of Languages. BNF and Context-Free Grammars. Grammar 2/4/2019"

Transcription

1 Chapter 3. Topics The terms of Syntax, Syntax Description Method: Context-Free Grammar (Backus-Naur Form) Derivation Parse trees Ambiguity Operator precedence and associativity Extended Backus-Naur Form Attribute grammars used to describe both the syntax and static semantics. Formal methods of describing semantics Operational semantics Axiomatic semantics Denotational semantics Languages A language, whether natural (English) or artificial (such as Java), is a set of statement. A language has rules (grammar) to describe a sentence and each sentence have meaning (semantics). Syntax: the form or structure of the expressions, statements, and program units : the meaning of the expressions, statements, and program units Syntax and semantics provide a language s definition Users of a language definition Other language designers Implementers Programmers (the users of the language) 1 2 The General Problem of Describing Syntax (Terminology) Formal Definition of Languages A statement (sentence) is a string of characters over some alphabet A language is a set of statements A lexeme is the lowest level syntactic unit of a language. A token is a category of lexemes. Ex) index = 2 * count + 17; Lexemes Tokens index identifier = equal_sign 2 int_literal * mult_op count identifier + plus_op 17 int_literal ; semicolon In general, languages can be formally defined in two distinct ways: by recognition and by generation. Recognizers (R) A recognition device reads input strings over the alphabet of the language L and decides whether the input strings belong to the language Example: syntax analysis part of a compiler - Detailed discussion of syntax analysis appears in Chapter 4 Generators A device that generates sentences of a language One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator 3 4 BNF and Developed by Noam Chomsky in the mid-1950s It is useful for describing the syntax of programming languages. The syntax of whole programming languages, with minor exceptions, can be described by context-free grammars. Backus-Naur Form (1959) Invented by John Backus to describe the syntax of Algol 58 BNF nearly identical to context-free grammars Grammar A grammar G is defined as a quadruple G = (V, T, S, P) V: a finite set variables (non-terminal), T: finite set of terminals, S: start variable (S V), P: finite set of production rules. The production rules are of the form X Y Where X (V T) + and Y (V T) * 5 6 1

2 Grammar A grammar G = (V, T, S, P) is said to be context-free if all productions in P have the form X Y, (where X V and Y (V T)* A grammar G = (V, T, S, P) is said to be context-sensitive if all productions in P have the form X Y, (where X, Y (V T) + and X Y ) Grammar Let G = (V, T, S, P) be a grammar. Then a language generated by grammar G is denoted by L (G) = {w w T *, S * w } Ex) The grammar G = {S, {a,b}, S, P) with production rules P: S asa bsb λ From the grammar G, we can get a language L, L(G) = {ww R w {a, b}*} 7 8 Grammar Derivation of string abbbba from the grammar G = {S, {a,b}, S, P) with production rules P: S asa bsb λ S asa absba abbsbba abbbba we can write S * abbbba Sentential forms The strings which contain variable as well as terminals. Ex) asa, absba, abbsbba are sentential forms Grammar Consider a language L = {a n b n : n 0}. We can easily get a grammar G for L. G = {{S},{a, b}, S, P} with production rules P: S asb λ 9 10 Grammar Let L = {w w {a, b} *, n a (w) = n b (w) where n i (w): number of i s in w}. A grammar for L is G = ({S}, {a, b}, S, P) with P given by S SS asb bsa λ To Prove L = L(G). We need to show 1. Every sting w generated by G has n a (w) = n b (w) 2. Every string w L could be generated by G. G = (V, T, S, P) is context free if production rules forms X Y where X V and Y (V T) * G = (V, T, S, P) is context sensitive if production rules forms X Y where X, Y (V T) + and X Y

3 Ex) L ={a n b n n0} Sol) The grammar G = {S, {a, b}, S, P) with production rules P: S asb Ex) L = {ww R w {a, b}*} is context-free language? Sol) The grammar G = {S, {a, b}, S, P) with production rules P P: S asa bsb Ex) L ={a n b m n m} is context-free language? Sol) The grammar G = ({S, A, B, C}, {a, b}, S, P) with production rules P S AC CB A aa a B bb b C acb (Leftmost and Rightmost derivation) (Leftmost and Rightmost derivation) Definition 5.2) Leftmost derivation in each step of derivation, the leftmost variable in the sentential form is replaced. Rightmost derivation in each step of derivation, the rightmost variable in the sentential form is replaced. Ex) G = ({A,B,S}, {a, b}, S, P) with production rule S AB A aaa B Bb Lets derive a string aab. Leftmost derivation: S AB aaab aab aabb aab Rightmost derivation: S AB ABb Ab aaab aab (Derivation Tree) (Derivation Tree) Definition ) For context-free grammar G = (V, T, S, P). An ordered tree is a derivation tree for G if and only if it has the following properties. The root is labeled S. Every leaf has a label from T { } Every interior vertex has a label from V If a vertex has label A V, and its children are labeled x 1,x 2, x n then there must be a production rule; A x 1 x 2 x n A leaf labeled has no siblings, since applied production rule A A derivation tree with every leaf is not labeled with T {} is called partial derivation tree. The strings obtained by reading the leaves of derivation tree from left to right is said to be the yield of tree. Ex) G = ({A,B,S}, {a, b}, S, P) with production rule S AB A aaa B Bb S AB aaab aab aabb aab

4 S AB aaab aab aabb aab S A B a a B B b (Derivation Tree) Theorem ) Let G = (V, T, S, P) be a context-free grammar. Then for every w L(G), there exist a derivation tree of G whose yield is w. Also, if t G is any partial derivation tree for G whose root is labeled S, then the yield of t G is a sentential form of G (Parsing and Ambiguity) Parsing A process finding a sequence of productions by which we can decide any w is in L(G) or not. Top-down parsing Start from starting symbol to the string with scanning left to right through the string corresponds to a left-most derivation. Bottom-up parsing Start from the sting to starting symbol with scanning left to right through the string corresponds to a right-most derivation in reverse order. (Parsing and Ambiguity) Ex) E E + T E T T T T * F T/F F F (E) a Parse for a + (a * a) Top-down parsing with leftmost derivation E E + T T + T F + T a + T a + F a + (E) a + (T) a + (T * F) a + (F * F) a + (a * F) a + (a * a) Bottom-up parsing with reverse of rightmost derivation a + (a * a) F + (a * a) T + (a * a) E + (a * a) E + (F * a) E + (T * a) E + (T * F) E + (T) E + (T) E + (E) E + F E + T E (Parsing and Ambiguity) (Parsing and Ambiguity) E E + T Top-down parsing Bottom-up parsing E T F T F E F ( E ) T a T E T * F T T F a F F F a a + ( a * a )

5 (Parsing and Ambiguity) Definition) A grammar G is said ambiguous if there exists w L(G) which has more than one distinct derivation (parse) tree. Ex) Consider the grammar G = (V, T, E, P) with V={E, I} T = {a, b, c, +, *, (, )} And production rules E I E + E E * E (E) I a b c (Parsing and Ambiguity) Definition ) Lets L is a context-free language, If every context-free grammar G L = L(G) is ambiguous, then we call L is inherently ambiguous Syntactic Ambiguity Syntactic Ambiguity Ambiguity in Grammars and Languages Ex) Show following grammar is ambiguous. A grammar G is said ambiguous if there exists w L(G) which has more than one distinct parse tree (multiple meanings). E E + E E E E * E E / E (E) D D Syntactic Ambiguity BNF Fundamentals For string * 3 E E + E D E * E 2 D D 4 3 E E * E + E D D 2 4 E D 3 In BNF, abstractions (nonterminal variables) are used to represent classes of syntactic structures-- they act like syntactic variables (also called nonterminal symbols, or just terminals) Terminals are lexemes or tokens A production rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals (variables) X -> Y X a nonterminal variable Y a string of terminals and/or nonterminals

6 BNF Fundamentals Backus-Naur form (BNF) it is also context-free grammar In BNF, nonterminals (variables) are enclosed in triangular brackets (<>). Terminal symbols are written without any special marking. Empty string is written as <empty>. ->(or ::=) is interpreted as can be derived to and is interpreted as or at production rule. Ex) BNF production rules for real number <real-num> -> <fraction>. <fraction> <fraction>: -> <digit> <digit> <fraction> <digit> -> BNF Fundamentals Nonterminal symbols can have two or more distinct definitions, representing two or more possible syntactic forms in the language Ex) <if_stmt> if ( <logic_expr> ) <stmt> if ( <logic_expr> ) <stmt> else <stmt> Syntactic lists are described using recursion <ident_list> ident ident, <ident_list> A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols) Derivations Every string of symbols in a derivation is a sentential form A sentence is a sentential form that has only terminal symbols A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded A derivation may be neither leftmost nor rightmost An Example Grammar 1 <program> <stmts> <stmts> <stmt> <stmt> ; <stmts> <stmt> <var> = <expr> <var> a b c d <expr> <term> + <term> <term> - <term> <term> <var> const Derivation for a sentense a = b + const from the grammar <program> => <stmts> => <stmt> => <var> = <expr> => a = <expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + const An Example Grammar 2 <program> begin <stmt_list> end <stmt_list> <stmt> <stmt> ; <stmt_list> <stmt> <var> = <expression> <var> A B C <expression> <var> + <var> <var> <var> <var> The derivation of sentence begin A = B + C ; B = C end from the grammar 2 <program> => begin <stmt_list> end => begin <stmt> ; <stmt_list> end => begin <var> = <expression> ; <stmt_list> end => begin A = <expression> ; <stmt_list> end => begin A = <var> + <var> ; <stmt_list> end => begin A = B + <var> ; <stmt_list> end => begin A = B + C ; <stmt_list> end => begin A = B + C ; <stmt> end => begin A = B + C ; <var> = <expression> end => begin A = B + C ; B = <expression> end => begin A = B + C ; B = <var> end => begin A = B + C ; B = C end Ex. Grammar for Assignment Statement <assign> = <expr> A B C <expr> + <expr> * <expr> ( <expr> ) The left most derivation for sentence A = B * ( A + C ) <assign> => = <expr> => A = <expr> => A = * <expr> => A = B * <expr> => A = B * ( <expr> ) => A = B * ( + <expr> ) => A = B * ( A + <expr> ) => A = B * ( A + ) => A = B * ( A + C )

7 Parse Tree Grammars naturally describe the hierarchical syntactic structure (Parse Tree) of the sentences of the languages they define. Ex) A= B * (A + C) Ambiguous Grammar A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees Ex) A = B * (A + C) with previous production rules Simple Assignment Grammar <assign> = <expr> A B C <expr> + <expr> * <expr> ( <expr> ) Parse tree for A= B * (A + C) Ambiguous Grammar Operator Precedence Syntactic ambiguity of language structures is a problem because compilers often base the semantics of those structures on their syntactic form. If a language structure has more than one parse tree, then the meaning of the structure cannot be determined uniquely. Some parsing algorithms can be based on ambiguous grammars. When such a parser encounters an ambiguous construct, it uses non-grammatical information provided by the designer to construct the correct parse tree. In many cases, an ambiguous grammar can be rewritten to be unambiguous but still generate the desired language. When an expression includes more than one different operators, one semantic issue is the order of evaluation of operators. (ex. X + Y * Z) By assigning different precedence levels to operators,this semantic question can be answered. Higher precedence is assigned to *, / than +, - Higher precedence operators evaluate before lower precedence operators. Between same precedence operators, evaluate left to right. A grammar can describe a certain syntactic structure so that part of the meaning of the structure can be determined from its parse tree (evaluate from the bottom to up in parse tree) Operator Precedence Operator Precedence Some grammar is not ambiguous but might not build proper parse tree for proper evaluation based on the precedence. Unambiguous Grammar for Expressions without considering precedence. <assign> = <expr> A B C <expr> + <expr> * <expr> ( <expr> ) Ex) check following two expressions A + B * C A * B + C Unambiguous grammar without considering precedence <assign> = <expr> A B C <expr> + <expr> * <expr> ( <expr> ) A = A + B * C <assign> => = <expr> => A = <expr> => A = + <expr> => A = A + <expr> => A = A + * <expr> => A = A + B * <expr> => A = A + B * => A = A + B * C <assign> = <expr> A + <expr> A B * <expr> B * C evaluate first C

8 Operator Precedence Operator Precedence Unambiguous grammar without considering precedence <assign> = <expr> A B C <expr> + <expr> * <expr> ( <expr> ) A = A * B + C <assign> => = <expr> => A = <expr> => A = * <expr> => A = A * <expr> => A = A * + <expr> => A = A * B + <expr> => A = A * B + => A = A * B + C B + C evaluate first <assign> = <expr> A * <expr> A B + <expr> C Unambiguous Grammar for Expressions with considering precedence <assign> = <expr> A B C <expr> <expr> + <term> <term> <term> <term> * <factor> <factor> <factor> ( <expr> ) Ex) check following two expressions A = B + C * A A = B * C + A Operator Precedence Operator Precedence Unambiguous grammar without considering precedence <assign> = <expr> A B C <expr> <expr> + <term> <term> <term> <term> * <factor> <factor> <assign> <factor> ( <expr> ) Unambiguous grammar without considering precedence <assign> = <expr> A B C <expr> <expr> + <term> <term> <term> <term> * <factor> <factor> <assign> <factor> ( <expr> ) A = B + C * A <assign> => = <expr> => A = <expr> = <expr> A <expr> + <term> A = B * C + A <assign> => = <expr> => A = <expr> = <expr> A <expr> + <term> => A = <expr> + <term> => A = <term> + <term> <term> <term> * <factor> => A = <expr> + <term> => A = <term> + <term> <term> <factor> => A = <factor> + <term> => A = + <term> => A = B + <term> => A = B + <term> * <factor> => A = B + <factor> * <factor> => A = B + * <factor> => A = B + C * <factor> <factor> B <factor> C A => A = <term> * <factor> + <term> => A = <factor> * <factor> + <term> => A = * <factor> + <term> => A = B * <factor> + <term> => A = B * + <term> => A = B * C + <term> => A = B * C + <factor> <term> <factor> * <factor> C A => A = B + C * => A = B + C * A => A = B * C + => A = B + C * A B Associativity of Operators Associativity of Operators The rule of associativity When an expression includes two operators that have the same precedence, which should be evaluate first. Ex) A + B + C A * B * C When a grammar rule has its LHS also appearing at the beginning of its RHS, the rule is said to be left recursive. This left recursion specifies left associativity. From the previous grammar <assign> = <expr> A B C <expr> <expr> + <term> <term> <term> <term> * <factor> <factor> <factor> ( <expr> ) Unfortunately, left recursion disallows the use of some important syntax analysis algorithms. When such algorithms are to be used, the grammar must be modified to remove the left recursion. Fortunately, left associativity can be enforced by the compiler, even though the grammar does not dictate it. The exponentiation operator is right associative. To indicate right associativity, right recursion can be used. <factor> <exp> ** <factor> <exp> <exp> ( <expr> )

9 An Unambiguous Grammar for if-then-else <stmt> <if_stmt> <if_stmt> if <logic_expr> then <stmt> if <logic_expr> then <stmt> else <stmt> An Unambiguous Grammar for if-then-else <stmt> <if_stmt> <if_stmt> if <logic_expr> then <stmt> if <logic_expr> then <stmt> else <stmt> Ex) if <logic_expr> then if <logic_expr> then <stmt> else <stmt> <stmt> => <if_stmt> => if <logic_expr> then <stmt> else <stmt> => if <logic_expr> then <if_stmt> else <stmt> => if <logic_expr> then if <logic_expr> then <stmt> else <stmt> Ex) if <logic_expr> then if <logic_expr> then <stmt> else <stmt> <stmt> => <if_stmt> => if <logic_expr> then <stmt> => if <logic_expr> then <if_stmt> => if <logic_expr> then if <logic_expr> then <stmt> else <stmt> An Unambiguous Grammar for if-then-else The rule for if constructs in many languages is that an else clause is matched with the nearest previous unmatched then. Therefore, there cannot be an if statement without an else between a then and its matching else. So, statements must be distinguished between those that are matched and those that are unmatched, where unmatched statements are else-less ifs and all other statements are matched. An Unambiguous Grammar for if-then-else <stmt> <matched> <unmatched> <matched> if <logic_expr> then <matched> else <matched> any non-if statement <unmatched> if <logic_expr> then <stmt> if <logic_expr> then <matched> else <unmatched> Ex) if <logic_expr> then if <logic_expr> then <stmt> else <stmt> <stmt> => <unmatched> => if <logic_expr> then <stmt> => if <logic_expr> then <matched> => if <logic_expr> then if <logic_expr> then <matched> else <matched> => if <logic_expr> then if <logic_expr> then <stmt> else <stmt> An Unambiguous Grammar for if-then-else if <logic_expr> then if <logic_expr> then <stmt> else <stmt> <stmt> <unmatched> if <logic_expr> then <stmt> <matched> if <logic_expr> then <matched> else <matched> <stmt> <stmt> Extended BNF (EBNF) The extensions do not enhance the descriptive power of BNF; they only increase its readability and writability. Three extensions are commonly included in the various versions of EBNF. 1. The first of these denotes an optional part of an RHS, which is delimited by brackets Ex) BNF <if_stmt> if (<expression>) <statement> if (<expression>) <statement> else <statement> Can be expressed with EBSF <if_stmt> if (<expression>) <statement> [else <statement>]

10 Extended BNF (EBNF) The extensions do not enhance the descriptive power of BNF; they only increase its readability and writability. Three extensions are commonly included in the various versions of EBNF. 2. The Second common extension deals with multiplechoice options. Ex) BNF <term> <term> * <factor> <term> / <factor> <term> % <factor> Can be expressed with EBNF <term> <term> (* / %) <factor> Extended BNF (EBNF) The extensions do not enhance the descriptive power of BNF; they only increase its readability and writability. Three extensions are commonly included in the various versions of EBNF. 3. The third extension is the use of braces in an RHS to indicate that the enclosed part can be repeated indefinitely or left out altogether. Ex) lists of identifiers separated by commas can be <ident_list> <identifier> {, <identifier>} The brackets, braces, and parentheses in the EBNF extensions are metasymbols, which means they are notational tools and not terminal symbols in thesyntactic entities they help describe Extended BNF (EBNF) Extended BNF (EBNF) BNF: EBNF: <expr> <expr> + <term> <expr> - <term> <term> <term> <term> * <factor> <term> / <factor> <factor> <factor> <exp> ** <factor> <exp> <exp> (<expr>) id <expr> <term> {(+ -) <term>} <term> <factor> {(* /) <factor>} <factor> <exp> { ** <exp>} <exp> (<expr>) id The BNF rule <expr> <expr> + <term> specifies the + operator to be left associative. However the EBNF version <expr> <term> {+ <term>} does not imply the direction of associativity. This problem is overcome in a syntax analyzer based on an EBNF grammar for expressions by designing the syntax analysis process to enforce the correct associativity Extended BNF (EBNF) Attribute Grammars Some versions of EBNF allow a numeric superscript to be attached to the right brace to indicate an upper limit to the number of times the enclosed part can be repeated. Some versions use a plus (+) superscript to indicate one or more repetitions. Ex) <compound> begin <stmt> {<stmt>} end equivalent to <compound> begin {<stmt>} + end There are some characteristics of the structure of programming languages that are difficult to describe with BNF, and some that are not even possible. An attribute grammar is a device used to describe more of the structure of a programming language than can be described with a contextfree grammar. An attribute grammar is an extension to a context-free grammar

11 Attribute Grammars (Static ) The static semantics of a language is only indirectly related to the meaning of programs during execution; rather, it has to do with the legal forms of programs (syntax rather than semantics). Ex) The common rule that all variables must be declared before they are referenced. Because of the problems of describing static semantics with BNF, a variety of more powerful mechanisms has been devised for that task. One such mechanism, attribute grammars, was designed by Knuth (1968) to describe both the syntax and the static semantics of programs. Attribute grammars are a formal approach both to describing and checking the correctness of the static semantics rules of a program Attribute Grammars (Basic Concept) Attribute grammars are context-free grammars with attributes, attribute computation functions, and predicate functions. Attributes associated with grammar symbols (terminal and nonterminal symbols) Attribute computation functions (semantic functions)- associated with grammar rules. They are used to specify how attribute values are computed. Predicate functions state the static semantic rules of the language, are associated with grammar rules Attribute Grammars (Formal Definition) Attribute Grammars (Additional Features of Attribute Grammars: I) An attribute grammar is a context-free grammar with the following additions: For each grammar symbol X there is a set A(X) of attribute values Each rule has a set of functions that define certain attributes of the nonterminals in the rule Each rule has a (possibly empty) set of predicates to check for attribute consistency I. Associated with each grammar symbol X is a set of attributes A(X). The set A(X) consists of two disjoint sets synthesized attributes S(X) and inherited attributes I(X). Synthesized attributes are used to pass semantic information up a parse tree. inherited attributes used to pass semantic information down and across a parse tree Attribute Grammars (Additional Features of Attribute Grammars: II) Attribute Grammars (Additional Features of Attribute Grammars: III) II. Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule. Ex) For a rule X 0 X 1... X n, The synthesized attributes of X 0 are computed with semantic functions of the form S(X 0 ) = f(a(x 1 ),..., A(X n )). So the value of a synthesized attribute on a parse tree node depends only on the values of the attributes on that node s children nodes. Inherited attributes of symbols X j, 1 j n (in the rule above), are computed with a semantic function of the form I(X j ) = f(a(x 0 ),, A(X n )). So the value of an inherited attribute on a parse tree node depends on the attribute values of that node s parent node and those of its sibling nodes. III. A predicate function has the form of a Boolean expression on the union of the attribute set {A(X 0 ),, A(X n )} and a set of literal attribute values. The only derivations allowed with an attribute grammar are those in which every predicate associated with every nonterminal is true. A false predicate function value indicates a violation of the syntax or static semantics rules of the language. A parse tree of an attribute grammar is the parse tree based on its underlying BNF grammar, with a possibly empty set of attribute values attached to each node. If all the attribute values in a parse tree have been computed, the tree is said to be fully attributed

12 Attribute Grammars Example #1 An attribute grammar that describes the rule that the name on the end of an Ada procedure must match the procedure s name (It cannot be stated in BNF). // Example - a procedure that prints 2 lines: procedure demo is procedure printlines is begin put(" "); put(" "); end printlines; begin printlines; -- do something printlines; end demo; Syntax rule: <proc_def> procedure <proc_name>[1] is begin <proc_body> end <proc_name>[2]; Predicate: <proc_name>[1]string == <proc_name>[2].string Attribute Grammars Example #1 (continue) Syntax rule: <proc_def> procedure <proc_name>[1] is begin <proc_body> end <proc_name>[2]; Predicate: <proc_name>[1].string == <proc_name>[2].string In this example, the predicate rule states that the name string attribute of the <proc_name> nonterminal in the subprogram header must match the name string attribute of the <proc_name> nonterminal following the end of the subprogram. The string attribute of <proc_name>, denoted by <proc_name>.string, is the actual string of characters that were found immediately following the reserved word procedure by the compiler. Notice that when there is more than one occurrence of a nonterminal in a syntax rule in an attribute grammar, the non-terminals are subscripted with brackets ([])to distinguish them Attribute Grammars Example #2 Attribute Grammars Example #2 (continue) Following example illustrates how an attribute grammar can be used to check the type rules of a simple assignment statement. The Syntax and static semantics of these assignment statements are: Syntax: The only variable names are A, B, and C. The right side of the assignments can be either a variable or an expression in the form of a variable added to another variable. <assign> <var> = <expr> <expr> <var> + <var> <var> <var> A B C : The variable can be one of two types: int or real When there are two variables on the right side of an assignment, they need not be the same type. The type of expression when the operand types are not the same is always real. When they are the same, the expression type is that of the operands. The type of the left side of the assignment must match the type of the right side. So the types of operands in the right side can be mixed, but the assignment is valid only if the target and the value resulting from evaluating the right side have the same type. <real> = <real>+<int> ok <int> = <int>+<int> ok <real>=<real><real> ok <real>=<int>+ <real> ok <real>=<int> + <int> not ok <int> = <real>+<int> or <int> = <int>+<real> not ok Attribute Grammars Example #2 (continue) Attribute Grammars Example #2 (continue) The attributes for the nonterminals in the example attribute grammar are described in the following paragraphs: actual_type A synthesized attribute associated with the nonterminals <var> and <expr>. It is used to store the actual type, int or real, of a variable or expression. In the case of a variable, the actual type is intrinsic. In the case of an expression, it is determined from the actual types of the child node or children nodes of the <expr> nonterminal. expected_type An inherited attribute associated with the nonterminal <expr>. It is used to store the type, either int or real, that is expected for the expression, as determined by the type of the variable on the left side of the assignment statement. 71 An Attribute Grammar for Simple Assignment Statements 1. Syntax rule: <assign> <var> = <expr> Semantic rule: <expr>.expected_type <var>.actual_type 2. Syntax rule: <expr> <var>[2] + <var>[3] Semantic rule: <expr>.actual_type if (<var>[2].actual_type = int) and (<var>[3].actual_type = int) else real end if Predicate: <expr>.actual_type == <expr>.expected_type 3. Syntax rule: <expr> <var> Semantic rule: <expr>.actual_type <var>.actual_type Predicate: <expr>.actual_type == <expr>.expected_type 4. Syntax rule: <var> A B C Semantic rule: <var>.actual_type look-up(<var>.string) then int The look-up function looks up a given variable name in the symbol table and returns the variable s type 72 12

13 Attribute Grammars Example #2 (continue) <assign> <expr> <var> <var>[2] <var>[3] A = A + B A parse tree for A = A + B Computing Attribute Values with Example #2 Now, consider the process of computing the attribute values of a parse tree, which is sometimes called decorating the parse tree. If all attributes were inherited, this could proceed in a completely top-down order, from the root to the leaves. If all the attributes were synthesized, it could proceed in a completely bottom-up order, from the leaves to the root. Because our grammar has both synthesized and inherited attributes, the evaluation process cannot be in any single direction Computing Attribute Values with Example #2 Computing Attribute Values with Example #2 The following is an evaluation of the attributes, in an order in which it is possible to compute them (A = A + B): <assign> expected_type actual_type <expr> actual_type <var> actual_type <var>[2] <var>[3] actual_type 1. <var>.actual_type look-up(a) (Rule 4) 2. <expr>.expected_type <var>.actual_type (Rule 1) 3. <var>[2].actual_type look-up(a) (Rule 4) <var>[3].actual_type look-up(b) (Rule 4) 4. <expr>.actual_type either int or real (Rule 2) 5. <expr>.expected_type == <expr>.actual_type is either TRUE or FALSE (Rule 2) A = A + B Shows the flow of attribute values in the example parse tree. Solid lines are used for the parse tree Dashed lines show attribute flow in the tree Computing Attribute Values with Example #2 Computing Attribute Values with Example #2 <assign> <assign> expected_type = real_type <expr> actual_type = real_type expected_type = int_type <expr> actual_type = real_type actual_type = real_type <var> actual_type = <var>[2] real_type actual_type = <var>[3] int_type actual_type = int_type <var> actual_type = <var>[2] int_type actual_type = <var>[3] real_type A = A + B A = A + B Shows A fully attributed parse tree with A is defined as a real and B is defined as an int. expected_type == actual_type Shows A fully attributed parse tree with A is defined as a int and B is defined as an real. expected_type!= actual_type

14 Evaluation of Attribute Grammar Checking the static semantic rules of a language is an essential part of all compilers. One of the main difficulties in using an attribute grammar to describe all of the syntax and static semantics of a real contemporary programming language is the size and complexity of the attribute grammar. The attribute values on a large parse tree are costly to evaluate. Less formal attribute grammars are commonly used for compiler writer. are meaning of the expressions, statements and program units of programming languages. There is no single widely acceptable notation or formalism for describing semantics Several needs for a methodology and notation for semantics: Programmers need to know what statements mean Compiler writers must know exactly what language constructs do Programs written in the language could be proven correct without testing. Compiler generators would be possible Designers could detect ambiguities and inconsistencies Software developers and compiler designers typically determine the semantics of programming languages by reading English explanations in language manuals. Because such explanations are often imprecise and incomplete, this approach is clearly unsatisfactory. Due to the lack of complete semantics specifications of programming languages, programs are rarely proven correct without testing, and commercial compilers are never generated automatically from language descriptions. (Operational ) Operational semantics is to describe the meaning of a statement or program by specifying the effects of running it on a machine. The effects on a machine are viewed as the sequence of machine state changes (the collection of the values in its storages). An obvious operational semantics description is given by executing a compiled version of the program on a computer. Most programmers write a small test program to determine the meaning of some programming language construct, while learning the language (Operational ) (Operational ) Problems with operational semantics The individual steps in the execution of machine language and the resulting changes to the state of the machine are too small and too numerous. The storage of a real computer is too large and complex. Different levels of uses of operational semantics. At the highest level, the interest is in the final result of the execution of a complete program (called natural operational semantics). At the lowest level, operational semantics can be used to determine the precise meaning of a program through an examination of the complete sequence of state changes that occur when the program is executed (called structural operational semantics)

15 ( The Basic Process of Operational ) ( The Basic Process of Operational ) Design an appropriate intermediate language, where the primary characteristic of the language is clarity. Every construct of the intermediate language must have an obvious and unambiguous meaning. If the semantics description is to be used for natural operational semantics, a virtual machine (an interpreter) must be constructed for the intermediate language. The basic process of operational semantics is not unusual. In fact, the concept is frequently used in programming textbooks and programming language reference manuals. Example with a C construct for (expr1; expr2; expr3) { } expr1; Loop: if expr2 == 0 goto out expr3; goto Loop Out: C Statement Meaning (Semantic) The human reader of such a description is the virtual computer and is assumed to be able to execute the instructions in the definition correctly and recognize the effects of the execution ( Evaluation of Operational ) (Summary of Operational ) Evaluation of operational semantics: Good if used informally (language manuals, etc.) Extremely complex if used formally (e.g., VDL), it was used for describing semantics of PL/I. Uses of operational semantics: - Language manuals and textbooks - Teaching programming languages Two different levels of uses of operational semantics: - Natural operational semantics - Structural operational semantics Evaluation - Good if used informally (language manuals, etc.) - Extremely complex if used formally (e.g.,vdl) (Denotation ) (Denotation ) Denotational semantics is the most widely known formal method based on recursive function theory for describing the meaning of programs. The process of building a denotational specification for a language: Define a mathematical object for each language entity Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects The difficulty with this method lies in creating the objects and the mapping functions. The method is named denotational because the mathematical objects denote the meaning of their corresponding syntactic entities. The mapping functions of a denotational semantics programming language specification have a domain and a range. The domain is the collection of values that are parameters to the function; the range is the collection of objects to which the parameters are mapped. The domain is called the syntactic domain, because it is syntactic structures that are mapped. The range is called the semantic domain, because each result of function based on parameter has a meaning. In denotational semantics, programming language constructs are mapped to mathematical objects, either sets or, more often, functions

16 (Denotation : Example: binary number) The syntax of binary numbers can be described by the following grammar rules: Notice that we put quotation mark to show they are not mathematical digits but a character. A example binary number '110' is a string with sequence of character '1', '1' and '0'. <bin_num> '0' '1' <bin_num> '0' <bin_num> '1' ( Denotation : Example: binary number) A parse tree for the example binary number, '110', is shown in Figure <bin_num> '1' <bin_num> <bin_num> '1' '0' (Denotation : Example: binary number) To describe the meaning of binary numbers using denotational semantics, we associate the actual meaning (a decimal number) with each rule that has a single terminal symbol as its RHS. In example, decimal numbers must be associated with the first two grammar rules. The other two grammar rules are, in a sense, computational rules, because they combine a terminal symbol, to which an object can be associated, with a nonterminal, which can be expected to represent some construct. The semantic function M bin, defined as recursive function. It maps the syntactic objects, as described in the previous grammar rules, to the objects in N, the set of nonnegative decimal numbers. (Denotation : Example: binary number) The function M bin is defined as follows: M bin ('0') = 0 M bin ('1') = 1 M bin (<bin_num> '0') = 2 * M bin (<bin_num>) M bin (<bin_num> '1') = 2 * M bin (<bin_num>) + 1 The meanings, or denoted objects (decimal numbers), can be attached to the nodes of the parse tree shown on the previous page, yielding the tree (Denotation : Example: binary number) M bin ('110')=M bin ('11' '0' )=2 * M bin ('11' ) = 2 * M bin ('1' '1' ) = 2* (2 * M bin ('1' ) + 1) = 2 * (2 * 1 + 1) = 2 * 3 = 6 6 <bin_num> ( Denotation : Example: binary number) M bin ('1010')=M bin ('101' '0' )=2 * M bin ('101' ) = 2 * M bin ('10' '1' ) = 2* (2 * M bin ('10' ) + 1) = 2 2 * M bin ('1' '0' ) + 2 = 2 2 * 2 * M bin ('1' ) <bin_num> = 2 2 * 2 * = 10 5 <bin_num> '0' 3 <bin_num> '0' 2 <bin_num> '1' 1 <bin_num> '1' 1 <bin_num> 0' '1' '1'

17 (Denotation : Example: decimal number) We can extend this idea to decimal number. The syntax of decimal numbers can be described by the following grammar rules: <dec_num> '0' '1' '2' '3' '4' '5' '6' '7''8' '9' <dec_num> ('0' '1' '2' '3' '4' '5' '6' '7' '8' '9') The denotational mapping functions M dec are defined as follows: M dec ('0') = 0, M dec ('1') = 1, M dec ('9') = 9 M dec (<dec_num> '0') = 10 * M dec (<dec_num>) M dec (<dec_num> '1') = 10 * M dec (<dec_num>) M dec (<dec_num> '9') = 10 * M dec (<dec_num>) + 9 (Denotation : The State of a Program) The denotational semantics of a program could be defined in terms of state changes in an ideal computer. Denotational semantics is defined in terms of only the values of all of the program s variables. In denotational semantics, state changes are defined by mathematical functions. Let the state s of a program be represented as a set of ordered pairs, as follows: s = {<i 1, v 1 >, <i 2, v 2 >,..., <i n, v n >} where i is the name of variable, and v s are the current values of those variables. Any of the v s can have the special value undef, which indicates that its associated variable is currently undefined (Denotation : The State of a Program) ( Denotation : Expressions) Let VARMAP be a function of two parameters: a variable name and the program state. It returns the current value of the variable. VARMAP(i j, s) = v j Most semantics mapping functions for programs and program constructs map states to states. These state changes are used to define the meanings of programs and program constructs. Following example BNF description of expressions are base on following assumption. No side effects The only operators are + and *, and an expression can have at most one operator. The only operands are scalar integer variables and integer literals There are no parentheses The value of expression is integer. <expr> <dec_num> <var> <binary_expr> <binary_expr> <left_expr> <operator> <right_expr> <left_expr> <dec_num> <var> <right_expr> <dec_num> <var> <operator> + * <dec_num> '0' '1' '2' '3' '4' '5' '6' '7''8' '9' <dec_num> ('0' '1' '2' '3' '4' '5' '6' '7' '8' '9') ( Denotation : Expressions) (Denotation : Expressions) The only error we consider in expressions is a variable having an undefined value. Let Z be the set of integers, and let error be the error value. Then Z {error} is the semantic domain for the denotational specification for example expressions. The mapping function for a given expression E and state s follows. Symbol = is used to define mathematical functions. The implication symbol, =>, used in this definition connects the form of an operand with its associated case (or switch) construct. Dot notation is used to refer to the child nodes of a node. For example, <binary_expr>.<left_expr> refers to the left child node of <binary_expr>. M e (<expr>, s) = case <expr> of <dec_num> => M dec (<dec_num>, s) <var> => if VARMAP(<var>, s) == undef then error else VARMAP(<var>, s) <binary_expr> => if (M e (<binary_expr>.<left_expr>, s) == undef OR M e (<binary_expr>.<right_expr>, s) = undef) then error else if (<binary_expr>.<operator> == '+' then M e (<binary_expr>.<left_expr>, s) + M e (<binary_expr>.<right_expr>, s) else M e (<binary_expr>.<left_expr>, s) * M e (<binary_expr>.<right_expr>, s)

18 (Denotation : Assignment Statements) An assignment statement is an expression evaluation plus the setting of the target variable to the expression s value. In this case, the meaning function maps a state to a state. M a (x = E, s) = if M e (E, s) == error then error else s ={<i 1,v 1 >,<i 2,v 2 >,...,<i n,v n >}, where for j = 1, 2,..., n, if i j == x then v j = M e (E, s) else v j = VARMAP(i j, s) (Denotation : Logical Pretest Loops) To expedite the discussion, we assume that there are two other existing mapping functions, M sl map statement lists and states to states M b, map Boolean expressions to Boolean values (or error). M l (while B do L, s) = if M b (B, s) == undef then error else if M b (B, s) == false then s else if M sl (L, s) == error then error else M l (while B do L, M sl (L, s)) Note that the comparison i j == x is of names, not values (Evaluation of Denotation ) (Axiomatic ) Can be used to prove the correctness of programs Provides a rigorous way to think about programs Can be an aid to language design Has been used in compiler generation systems Because of its complexity, it are of little use to language users Based on formal logic (predicate calculus) Original purpose: formal program verification (check the correctness of a program) Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions) The logic expressions are called assertions (or predicate)

3. DESCRIBING SYNTAX AND SEMANTICS

3. DESCRIBING SYNTAX AND SEMANTICS 3. DESCRIBING SYNTAX AND SEMANTICS CSc 4330/6330 3-1 9/15 Introduction The task of providing a concise yet understandable description of a programming language is difficult but essential to the language

More information

Programming Languages

Programming Languages Programming Languages Chapter 3 Describing Syntax and Semantics Instructor: Chih-Yi Chiu 邱志義 http://sites.google.com/site/programminglanguages Outline Introduction The General Problem of Describing Syntax

More information

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics ISBN Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units Syntax - the form or structure of the expressions, statements, and program units Semantics - the meaning of the expressions, statements, and program units Who must use language definitions? 1. Other language

More information

Syntax and Semantics

Syntax and Semantics Syntax and Semantics Syntax - The form or structure of the expressions, statements, and program units Semantics - The meaning of the expressions, statements, and program units Syntax Example: simple C

More information

Describing Syntax and Semantics

Describing Syntax and Semantics Describing Syntax and Semantics Introduction Syntax: the form or structure of the expressions, statements, and program units Semantics: the meaning of the expressions, statements, and program units Syntax

More information

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999 Chapter 3: Syntax and Semantics Matt Evett Dept. Computer Science Eastern Michigan University 1999 Syntax and Semantics Syntax - the form or structure of the expressions, statements, and program units

More information

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield CMPS 3500 Programming Languages Dr. Chengwei Lei CEECS California State University, Bakersfield Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing

More information

Chapter 3 (part 3) Describing Syntax and Semantics

Chapter 3 (part 3) Describing Syntax and Semantics Chapter 3 (part 3) Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings

More information

Chapter 3. Describing Syntax and Semantics. Introduction. Why and How. Syntax Overview

Chapter 3. Describing Syntax and Semantics. Introduction. Why and How. Syntax Overview Chapter 3 Describing Syntax and Semantics CMSC 331, Some material 1998 by Addison Wesley Longman, Inc. 1 Introduction We usually break down the problem of defining a programming language into two parts.

More information

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics ISBN Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Describing the Meanings of Programs: Dynamic Semantics Copyright 2015 Pearson. All rights reserved. 2 Semantics There is no

More information

Syntax. In Text: Chapter 3

Syntax. In Text: Chapter 3 Syntax In Text: Chapter 3 1 Outline Syntax: Recognizer vs. generator BNF EBNF Chapter 3: Syntax and Semantics 2 Basic Definitions Syntax the form or structure of the expressions, statements, and program

More information

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics ISBN Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Copyright 2009 Addison-Wesley. All

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

P L. rogramming anguages. Fall COS 301 Programming Languages. Syntax & Semantics. UMaine School of Computing and Information Science

P L. rogramming anguages. Fall COS 301 Programming Languages. Syntax & Semantics. UMaine School of Computing and Information Science COS 301 P L Syntax & Semantics Syntax & semantics Syntax: Defines correctly-formed components of language Structure of expressions, statements Semantics: meaning of components Together: define the p language

More information

Syntax & Semantics. COS 301 Programming Languages

Syntax & Semantics. COS 301 Programming Languages COS 301 P L Syntax & Semantics Syntax & semantics Syntax: Defines correctly-formed components of language Structure of expressions, statements Semantics: meaning of components Together: define the p language

More information

10/30/18. Dynamic Semantics DYNAMIC SEMANTICS. Operational Semantics. An Example. Operational Semantics Definition Process

10/30/18. Dynamic Semantics DYNAMIC SEMANTICS. Operational Semantics. An Example. Operational Semantics Definition Process Dynamic Semantics DYNAMIC SEMANTICS Describe the meaning of expressions, statements, and program units No single widely acceptable notation or formalism for describing semantics Two common approaches:

More information

Semantics. There is no single widely acceptable notation or formalism for describing semantics Operational Semantics

Semantics. There is no single widely acceptable notation or formalism for describing semantics Operational Semantics There is no single widely acceptable notation or formalism for describing semantics Operational Describe the meaning of a program by executing its statements on a machine, either simulated or actual. The

More information

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units Syntax - the form or structure of the expressions, statements, and program units Semantics - the meaning of the expressions, statements, and program units Who must use language definitions? 1. Other language

More information

10/18/18. Outline. Semantic Analysis. Two types of semantic rules. Syntax vs. Semantics. Static Semantics. Static Semantics.

10/18/18. Outline. Semantic Analysis. Two types of semantic rules. Syntax vs. Semantics. Static Semantics. Static Semantics. Outline Semantic Analysis In Text: Chapter 3 Static semantics Attribute grammars Dynamic semantics Operational semantics Denotational semantics N. Meng, S. Arthur 2 Syntax vs. Semantics Syntax concerns

More information

10/26/17. Attribute Evaluation Order. Attribute Grammar for CE LL(1) CFG. Attribute Grammar for Constant Expressions based on LL(1) CFG

10/26/17. Attribute Evaluation Order. Attribute Grammar for CE LL(1) CFG. Attribute Grammar for Constant Expressions based on LL(1) CFG Attribute Evaluation Order Determining attribute 2 expected_type evaluation order actual_type actual_type for any attribute grammar is a 5 complex problem, 1 [1] [2] requiring

More information

Programming Language Syntax and Analysis

Programming Language Syntax and Analysis Programming Language Syntax and Analysis 2017 Kwangman Ko (http://compiler.sangji.ac.kr, kkman@sangji.ac.kr) Dept. of Computer Engineering, Sangji University Introduction Syntax the form or structure of

More information

Defining Languages GMU

Defining Languages GMU Defining Languages CS463 @ GMU How do we discuss languages? We might focus on these qualities: readability: how well does a language explicitly and clearly describe its purpose? writability: how expressive

More information

Syntax. A. Bellaachia Page: 1

Syntax. A. Bellaachia Page: 1 Syntax 1. Objectives & Definitions... 2 2. Definitions... 3 3. Lexical Rules... 4 4. BNF: Formal Syntactic rules... 6 5. Syntax Diagrams... 9 6. EBNF: Extended BNF... 10 7. Example:... 11 8. BNF Statement

More information

Habanero Extreme Scale Software Research Project

Habanero Extreme Scale Software Research Project Habanero Extreme Scale Software Research Project Comp215: Grammars Zoran Budimlić (Rice University) Grammar, which knows how to control even kings - Moliere So you know everything about regular expressions

More information

Program Assignment 2 Due date: 10/20 12:30pm

Program Assignment 2 Due date: 10/20 12:30pm Decoration of parse tree for (1 + 3) * 2 N. Meng, S. Arthur 1 Program Assignment 2 Due date: 10/20 12:30pm Bitwise Manipulation of Hexidecimal Numbers CFG E E A bitwise OR E A A A ^ B bitwise XOR A B B

More information

EECS 6083 Intro to Parsing Context Free Grammars

EECS 6083 Intro to Parsing Context Free Grammars EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser

More information

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor COSC252: Programming Languages: Semantic Specification Jeremy Bolton, PhD Adjunct Professor Outline I. What happens after syntactic analysis (parsing)? II. Attribute Grammars: bridging the gap III. Semantic

More information

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF) Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical

More information

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction Topics Chapter 3 Semantics Introduction Static Semantics Attribute Grammars Dynamic Semantics Operational Semantics Axiomatic Semantics Denotational Semantics 2 Introduction Introduction Language implementors

More information

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters : Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter

More information

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser

More information

Programming Languages Third Edition

Programming Languages Third Edition Programming Languages Third Edition Chapter 12 Formal Semantics Objectives Become familiar with a sample small language for the purpose of semantic specification Understand operational semantics Understand

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2006

More information

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter

More information

CPS 506 Comparative Programming Languages. Syntax Specification

CPS 506 Comparative Programming Languages. Syntax Specification CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens

More information

Programming Language Definition. Regular Expressions

Programming Language Definition. Regular Expressions Programming Language Definition Syntax To describe what its programs look like Specified using regular expressions and context-free grammars Semantics To describe what its programs mean Specified using

More information

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3 Programming Language Specification and Translation ICOM 4036 Fall 2009 Lecture 3 Some parts are Copyright 2004 Pearson Addison-Wesley. All rights reserved. 3-1 Language Specification and Translation Topics

More information

ICOM 4036 Spring 2004

ICOM 4036 Spring 2004 Language Specification and Translation ICOM 4036 Spring 2004 Lecture 3 Copyright 2004 Pearson Addison-Wesley. All rights reserved. 3-1 Language Specification and Translation Topics Structure of a Compiler

More information

Part 5 Program Analysis Principles and Techniques

Part 5 Program Analysis Principles and Techniques 1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Syntax Analysis Parsing Syntax Or Structure Given By Determines Grammar Rules Context Free Grammar 1 Context Free Grammars (CFG) Provides the syntactic structure: A grammar is quadruple (V T, V N, S, R)

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007

More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules

More information

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols The current topic:! Introduction! Object-oriented programming: Python! Functional programming: Scheme! Python GUI programming (Tkinter)! Types and values! Logic programming: Prolog! Introduction! Rules,

More information

Syntax Analysis Check syntax and construct abstract syntax tree

Syntax Analysis Check syntax and construct abstract syntax tree Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers

More information

More Assigned Reading and Exercises on Syntax (for Exam 2)

More Assigned Reading and Exercises on Syntax (for Exam 2) More Assigned Reading and Exercises on Syntax (for Exam 2) 1. Read sections 2.3 (Lexical Syntax) and 2.4 (Context-Free Grammars) on pp. 33 41 of Sethi. 2. Read section 2.6 (Variants of Grammars) on pp.

More information

This book is licensed under a Creative Commons Attribution 3.0 License

This book is licensed under a Creative Commons Attribution 3.0 License 6. Syntax Learning objectives: syntax and semantics syntax diagrams and EBNF describe context-free grammars terminal and nonterminal symbols productions definition of EBNF by itself parse tree grammars

More information

Unit-1. Evaluation of programming languages:

Unit-1. Evaluation of programming languages: Evaluation of programming languages: 1. Zuse s Plankalkül 2. Pseudocodes 3. The IBM 704 and Fortran 4. Functional Programming: LISP 5. The First Step Toward Sophistication: ALGOL 60 6. Computerizing Business

More information

CSE 12 Abstract Syntax Trees

CSE 12 Abstract Syntax Trees CSE 12 Abstract Syntax Trees Compilers and Interpreters Parse Trees and Abstract Syntax Trees (AST's) Creating and Evaluating AST's The Table ADT and Symbol Tables 16 Using Algorithms and Data Structures

More information

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis.

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis. Chapter 4. Lexical and Syntax Analysis Introduction Introduction The Parsing Problem Three approaches to implementing programming languages Compilation Compiler translates programs written in a highlevel

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Introduction to Syntax Analysis

Introduction to Syntax Analysis Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Crafting a Compiler with C (II) Compiler V. S. Interpreter Crafting a Compiler with C (II) 資科系 林偉川 Compiler V S Interpreter Compilation - Translate high-level program to machine code Lexical Analyzer, Syntax Analyzer, Intermediate code generator(semantics Analyzer),

More information

CSCE 314 Programming Languages

CSCE 314 Programming Languages CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee 1 What Is a Programming Language? Language = syntax + semantics The syntax of a language is concerned with the form of a program: how

More information

Introduction to Syntax Analysis. The Second Phase of Front-End

Introduction to Syntax Analysis. The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program

More information

Homework & Announcements

Homework & Announcements Homework & nnouncements New schedule on line. Reading: Chapter 18 Homework: Exercises at end Due: 11/1 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 25 COS 140: Foundations of

More information

Chapter 4. Lexical and Syntax Analysis

Chapter 4. Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

More information

Programming Languages, Summary CSC419; Odelia Schwartz

Programming Languages, Summary CSC419; Odelia Schwartz Programming Languages, Summary CSC419; Odelia Schwartz Chapter 1 Topics Reasons for Studying Concepts of Programming Languages Programming Domains Language Evaluation Criteria Influences on Language Design

More information

Context-Free Grammars and Languages (2015/11)

Context-Free Grammars and Languages (2015/11) Chapter 5 Context-Free Grammars and Languages (2015/11) Adriatic Sea shore at Opatija, Croatia Outline 5.0 Introduction 5.1 Context-Free Grammars (CFG s) 5.2 Parse Trees 5.3 Applications of CFG s 5.4 Ambiguity

More information

CSCI312 Principles of Programming Languages!

CSCI312 Principles of Programming Languages! CSCI312 Principles of Programming Languages! Chapter 2 Syntax! Xu Liu Review! Principles of PL syntax, naming, types, semantics Paradigms of PL design imperative, OO, functional, logic What makes a successful

More information

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction Topics Chapter 3 Semantics Introduction Static Semantics Attribute Grammars Dynamic Semantics Operational Semantics Axiomatic Semantics Denotational Semantics 2 Introduction Introduction Language implementors

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees 3.3.1 Parse trees 1. Derivation V.S. Structure Derivations do not uniquely represent the structure of the strings

More information

Formal Languages. Formal Languages

Formal Languages. Formal Languages Regular expressions Formal Languages Finite state automata Deterministic Non-deterministic Review of BNF Introduction to Grammars Regular grammars Formal Languages, CS34 Fall2 BGRyder Formal Languages

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and

More information

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

More information

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built) Programming languages must be precise Remember instructions This is unlike natural languages CS 315 Programming Languages Syntax Precision is required for syntax think of this as the format of the language

More information

A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

More information

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs Chapter 2 :: Programming Language Syntax Programming Language Pragmatics Michael L. Scott Introduction programming languages need to be precise natural languages less so both form (syntax) and meaning

More information

Languages and Compilers

Languages and Compilers Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:

More information

Lecture 8: Context Free Grammars

Lecture 8: Context Free Grammars Lecture 8: Context Free s Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (12/10/17) Lecture 8: Context Free s 2017-2018 1 / 1 Specifying Non-Regular Languages Recall

More information

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1 Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind

More information

Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts. Syntax Analysis Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grammars 1 Informal Comments A context-free grammar is a notation for describing languages. It is more powerful than finite automata or RE s, but still cannot define all possible languages.

More information

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing Syntax/semantics Program program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing Meta-models 8/27/10 1 Program program execution Syntax Semantics

More information

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis. Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive -Descent Parsing Bottom-Up parsing 2 Language Implementation Compilation There are three possible approaches

More information

Parsing II Top-down parsing. Comp 412

Parsing II Top-down parsing. Comp 412 COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information

15 212: Principles of Programming. Some Notes on Grammars and Parsing

15 212: Principles of Programming. Some Notes on Grammars and Parsing 15 212: Principles of Programming Some Notes on Grammars and Parsing Michael Erdmann Spring 2011 1 Introduction These notes are intended as a rough and ready guide to grammars and parsing. The theoretical

More information

Compiler Construction

Compiler Construction Compiler Construction Exercises 1 Review of some Topics in Formal Languages 1. (a) Prove that two words x, y commute (i.e., satisfy xy = yx) if and only if there exists a word w such that x = w m, y =

More information

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Theory and Compiling COMP360

Theory and Compiling COMP360 Theory and Compiling COMP360 It has been said that man is a rational animal. All my life I have been searching for evidence which could support this. Bertrand Russell Reading Read sections 2.1 3.2 in the

More information

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2 Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence

More information

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;} Compiler Construction Grammars Parsing source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2012 01 23 2012 parse tree AST builder (implicit)

More information

Introduction to Parsing

Introduction to Parsing Introduction to Parsing The Front End Source code Scanner tokens Parser IR Errors Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 6 Decaf Language Wednesday, September 7 The project for the course is to write a

More information

4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

More information

DEMO A Language for Practice Implementation Comp 506, Spring 2018

DEMO A Language for Practice Implementation Comp 506, Spring 2018 DEMO A Language for Practice Implementation Comp 506, Spring 2018 1 Purpose This document describes the Demo programming language. Demo was invented for instructional purposes; it has no real use aside

More information