Compiled on 5/05/207 at 3:2pm

Abbreviations
NFA  Non-deterministic finite automaton
DFA  Deterministic finite automaton

Compiler Construction
Collection of exercises
Version May 5, 207

General Remarks

Please write legibly. When asked to find a solution using an algorithm (e.g. the Subset Construction Algorithm), show how you constructed the solution; simply submitting the final result is not sufficient. Discussion of the tasks among the lecture participants is welcome, but plagiarism will lead to a score of 0 points.

1 Lexical analysis

1. Construct deterministic finite automata for the following regular expressions.
   (a) ( ( a b a c ) (c a) + ) a
   (b) a ( a b) ( (c (d ɛ) a ) ɛ )
   (c) ( ( a b ) c (b a ɛ) ) +

2. For the following regular expressions:
   Build an NFA by means of the Thompson Construction Algorithm.
   Transform the NFA into a DFA by means of the Subset Construction Algorithm.
   (a) (a b) + c (b ɛ )
   (b) c ( a ( b c b )
   (c) ( a b ) a b?
   (d) 0 + )
   (e) a ( b a ) + ( c ɛ )
   (f) ( a ( b c ) + ) $

3. Integers must be enclosed by the hash sign (#) and consist of a sequence of digits (0-9). A list of integers consists of an arbitrary number of comma-separated integers and is surrounded by curly brackets.

   Valid lists           Invalid lists
   {}                    {##,}
   {##}                  {,2,3}
   {#02#,#0#,#080#}      {#02##0##080#}
   Use regular expressions to describe integers and lists of integers. Construct an equivalent DFA for these regular expressions using both the direct method, which exploits the syntax tree, and the indirect method, which first constructs the NFA by means of the Thompson Construction Algorithm and then transforms the NFA into a DFA by means of the Subset Construction Algorithm. Compare the two resulting DFAs.

4. For the following NFAs:
   Indicate an equivalent regular expression.
   Transform the NFA into a DFA by means of the Subset Construction Algorithm.
   (a) NFA 1
   (b) NFA 2
   (c) NFA 3
   [Figures: state diagrams of (a) NFA 1, (b) NFA 2 and (c) NFA 3]

5. Real constants are defined in the SML standard as follows:

   An integer constant (in decimal notation) is an optional negation symbol (~) followed by a non-empty sequence of decimal digits 0,...,9. [...] A real constant is an integer constant in decimal notation, either followed by a point (.) and one or more decimal digits possibly followed by an exponent symbol (E or e) and an integer constant in decimal notation, or followed by an exponent symbol (E or e) and an integer constant in decimal notation; at least one of the optional parts must occur, hence no integer constant is a real constant.

   Valid real constants    Invalid real constants
   0.7                     23
   3.32E5                  .3
   3E~7                    4.E5
   0.0                     E2.0
   Define a regular expression that fits real constants. Construct an equivalent NFA by means of the Thompson Construction Algorithm. Simplify the resulting NFA. Transform the NFA into a DFA by means of the Subset Construction Algorithm.

6. The following regular expression describes the token filename:

   filename  /? file (/ file)
   file      (char + ( char + ) ) (. + )
   char      x

   Valid filename      .././x/x x
   Invalid filename    /../ x x

   Construct an equivalent NFA by means of the Thompson Construction Algorithm. You are allowed to make simplifications. Transform the NFA into a DFA by means of the Subset Construction Algorithm.
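The prose definition of real constants quoted in exercise 5 can be checked mechanically. The following is a minimal sketch in Python, assuming the quoted definition is read literally; the names `INTEGER`, `REAL` and `is_real_constant` are illustrative and not part of the exercise, and the pattern uses Python's `re` syntax rather than the notation used in the lecture.

```python
import re

# Integer constant: optional negation symbol (~), then one or more decimal digits.
INTEGER = r"~?[0-9]+"

# Real constant: an integer constant followed by either
#   a point, one or more digits and an optional exponent part, or
#   an exponent part alone.
# The exponent part is E or e followed by an integer constant.
REAL = re.compile(rf"{INTEGER}(?:\.[0-9]+(?:[Ee]{INTEGER})?|[Ee]{INTEGER})")


def is_real_constant(text: str) -> bool:
    """Return True if the whole string matches the sketched definition."""
    return REAL.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["0.7", "3.32E5", "3E~7", "23", ".3", "4.E5"]:
        print(candidate, is_real_constant(candidate))
```

Because at least one of the two optional parts is required by the alternation, a plain integer constant such as 23 is rejected, which mirrors the closing remark of the quoted definition.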
7. Non-deterministic finite automaton (NFA 1) with 0 as initial state and 4 and 7 as final states:

   [Figure: state diagram of NFA 1]

   Transform the NFA into a DFA by means of the Subset Construction Algorithm.
   Indicate an equivalent regular expression.
   Explain which language is defined by that regular expression / automaton and give examples.
   Construct a non-deterministic finite automaton (NFA 2) from the regular expression by means of the Thompson Construction Algorithm.
   What are the main differences between NFA 1 and NFA 2?
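Several exercises in this section ask for the Subset Construction Algorithm. As a reminder of its mechanics, here is a minimal Python sketch. The dictionary encoding of the NFA, the use of the empty string as the ɛ label, and the small automaton `EXAMPLE_NFA` are assumptions made for illustration; the example automaton is not one of the NFAs from the exercises.

```python
from collections import deque

EPS = ""  # label for epsilon transitions (an assumption of this sketch)


def eps_closure(nfa, states):
    """All NFA states reachable from `states` via epsilon transitions only."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in nfa.get(state, {}).get(EPS, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def move(nfa, states, symbol):
    """All NFA states reachable from `states` by reading `symbol` once."""
    result = set()
    for state in states:
        result.update(nfa.get(state, {}).get(symbol, ()))
    return result


def subset_construction(nfa, start, finals, alphabet):
    """Return (DFA start state, DFA transition table, DFA final states)."""
    start_set = eps_closure(nfa, {start})
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # this set of NFA states was already expanded
        dfa[current] = {}
        for symbol in alphabet:
            target = eps_closure(nfa, move(nfa, current, symbol))
            if target:  # the empty set (dead state) is left implicit
                dfa[current][symbol] = target
                queue.append(target)
    dfa_finals = {state for state in dfa if state & set(finals)}
    return start_set, dfa, dfa_finals


# Hypothetical NFA: state -> {symbol -> set of successor states}.
EXAMPLE_NFA = {
    0: {EPS: {1}},
    1: {"a": {1, 2}, "b": {1}},
    2: {"b": {3}},
    3: {"b": {4}},
}

if __name__ == "__main__":
    start, table, finals = subset_construction(EXAMPLE_NFA, 0, {4}, "ab")
    for state, edges in table.items():
        print(sorted(state), {sym: sorted(t) for sym, t in edges.items()})
```

Each DFA state is a frozen set of NFA states, so the intermediate sets that the general remarks ask you to show correspond directly to the keys of the returned table.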
2 Syntactic analysis

1. For the following grammars (with S as start symbol):
   Transform the grammar to LL(1) if necessary.
   Compute the FIRST and FOLLOW sets.
   Build the LL(1) parsing table.
   (a) S E E E op E E F E id F E ( E )
   (b) S C C // T C /* A */ T char T T cr A char T A cr T A ɛ
   (c) S B B A B C A id C ( LL ) LL B LR LL ɛ LR, B LR LR ɛ
   (d) S S, S S F S id R R : num R ɛ F id ( S)
   (e) S T T @ ( T, T ) T @ L L id L num

2. Explain why the following grammar cannot be transformed into LL(1).
   Z $Z$ Z %Z% Z ɛ
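The FIRST sets asked for in exercise 1 of this section are usually computed by a fixed-point iteration. The following Python sketch shows one way to do this; the grammar encoding (a dictionary from nonterminals to lists of symbol tuples, with the empty tuple standing for an ɛ-production) and the grammar `TOY_GRAMMAR` are assumptions for illustration and are not taken from the exercises.

```python
EPS = "ɛ"  # marker for the empty word inside FIRST sets


def first_of_sequence(symbols, first, grammar):
    """FIRST set of a sequence of grammar symbols, given current FIRST sets."""
    result = set()
    for symbol in symbols:
        if symbol not in grammar:  # terminal symbol: it starts the sequence
            result.add(symbol)
            return result
        result |= first[symbol] - {EPS}
        if EPS not in first[symbol]:  # this nonterminal cannot vanish
            return result
    result.add(EPS)  # every symbol can derive the empty word
    return result


def first_sets(grammar):
    """Iterate until no FIRST set grows any more (fixed point)."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for nonterminal, productions in grammar.items():
            for production in productions:
                before = len(first[nonterminal])
                first[nonterminal] |= first_of_sequence(production, first, grammar)
                if len(first[nonterminal]) != before:
                    changed = True
    return first


# Hypothetical grammar: E -> T E',  E' -> + T E' | ɛ,  T -> id | ( E )
TOY_GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",), ("(", "E", ")")],
}

if __name__ == "__main__":
    for nonterminal, symbols in first_sets(TOY_GRAMMAR).items():
        print(nonterminal, sorted(symbols))
```

FOLLOW sets can be computed with the same kind of iteration, using `first_of_sequence` on the symbols that follow each nonterminal occurrence on a right-hand side.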
3. For the following grammars (with S as start symbol):
   Transform the grammar to LL(1) if necessary and build the LL(1) parsing table.
   Build the SLR(1) parsing table from the original (unmodified) grammar.
   (a) Grammar: S b W S S b W W a W ɛ
   (b) Grammar: S ( LO ) S atom LO S LR LO ɛ LR, S LR LR ɛ
   (c) Grammar: S C S assign C if P S S fi C if P S fi P pred
   (d) Grammar: S S, S S char

4. For the following grammars (with S as start symbol):
   Remove all left recursions and apply left factoring.
   Construct a Recursive Descent Parser for the transformed grammar.
   Build the LL(1) parsing table for the transformed grammar.
   Build the SLR(1) parsing table for the original (unmodified) grammar.
   (a) Grammar: S idlist statlist idlist identifier idlist, identifier statlist stat statlist ; stat stat identifier := expression if expression then stat else stat if expression then stat expression identifier + identifier
   (b) Grammar: S formula formula formula expr formula expr expr expr expr term term term term factor factor factor factor (formula)
5. For the following grammars (with S as start symbol):
   Compute the SLR(1) parsing table. If there are any shift-reduce conflicts, resolve them with the listed assumptions.
   Show that the given expression can be parsed correctly with your SLR(1) parsing table.
   (a) Grammar: S S # S S S % S S char
       Assumptions: # has a stronger binding than %. Equivalent operators are left associative.
       Expression: char % char # char
   (b) Grammar: S S # S S S % S S? S S bn
       Assumptions: ? has a stronger binding than #. # has a stronger binding than %. Equivalent operators are left associative.
       Expression: bn % bn #?
   (c) Grammar: S S # S S S : E S E E a E b E E E
       Assumptions: bn : has a stronger binding than #. Equivalent operators are left associative.
       Expression: a # b : b
   (d) Grammar: S L L L C L C C cmd AL AL str AL AL ɛ
       Assumptions: Operators are left associative.
       Expression: cmd str str cmd
3 Attributed grammars

1. Infix, postfix and prefix notation
   (a) Extend the following grammar with attributes that allow prefix expressions to be translated into postfix expressions. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar.

       S E E ( O E E ) E id E num O + -

       Prefix        Postfix
       (+(+ 2)3)     (( 2+)3+)
       (+(+2 3))     ((2 3+)+)

   (b) Write an attributed grammar that translates expressions in postfix notation into infix notation. Define a proper grammar for the postfix notation. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar.

       Postfix       Infix
       2 3-4 *       (2-3) * 4

2. Given the following grammar:

   W [ W ] W ɛ

   (a) Write an attributed grammar that returns the number of pairs of parentheses. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar.
   (b) Build the SLR parsing table. Indicate where you have to apply the rules defined in (a).
   (c) Implement the attributed grammar as a recursive function which returns the number of pairs of parentheses.
   (d) Show for both implementations (b) and (c) how the number of pairs of parentheses is computed for the term [[]].

3. The following grammar, with S as start symbol, defines binary numbers. Write an attributed grammar that computes the decimal number of a binary number. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further,
show an example of a translation using your attributed grammar.

   S OP BB OB OE OP + ɛ BB 0 R BB R R 0 R R R R ɛ OB. BB OB ɛ OE E OP BB OE ɛ

   Binary number    Decimal number
   0.0000           5.0325
   +E-              0.5
   -0.0             -0.25

4. For the following grammar G: Write an attributed grammar that translates expressions of the grammar G into Assembler code. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. The following Assembler instructions are available:

   ADD R1, R2    R1 := R1 + R2
   MUL R1, R2    R1 := R1 * R2
   MOV M, R      R := memory(M)
   LOA N, R      R := N (directly sets the value of a register)

   Ri names a register, M names a location in the main memory, and N is a number. Assume that you have an infinite number of registers. The function nextregister() returns the next register. Show the transformation of the expression E into Assembler code.

   (a) Grammar G: E (E) E E + E E E E E id E num
       Expression E: (+(x*5))
   (b) Grammar G: E ( E E ) E ( E num ) E id E num
       The operator ** denotes the power function, e.g. (2 3 **) means 2^3.
       Expression E: (2(2 3**)*)
   (c) Grammar G: P E : AL E + E E E id E num AL AE ; AL AL ɛ AE id := num
       A program (P) consists of an expression (E) and a list of value assignments (AL) for all variables in the expression. Note that the values of the variables have to be defined before the expression can be evaluated.
       Expression E: + a + b 3; a=2; b=3;

5. Nested lists
   (a) Given the following grammar with S as start symbol. Write an attributed grammar that returns the length of the outermost list. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar.

       S ( E ) L ( E ) L atom E L R E ɛ R, L R R ɛ

       List                      Result
       ()                        0
       (atom,(atom),(atom))      3
       (atom,(atom,atom))        2

   (b) Given the following grammar with L as start symbol. The empty list is defined as nil. atom names any element of the list that cannot be further divided.

       L nil L atom L [ LR ] LR L, LR LR L

       Write an attributed grammar that returns the list of atoms that are elements at the given depth. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar. Examples:
       Original list                 N    Result list
       [nil]                         0    [nil]
       [atom]                        0    [atom]
       [atom]                        1    [nil]
       [[atom]]                      1    [atom]
       [[atom,[atom,atom]]]          1    [atom]
       [[atom,[atom,nil],atom]]      1    [atom,atom]
       [[nil,[atom,[nil]],nil]]      2    [atom]

   (c) Write a grammar for the following type of lists: Each list starts with a number followed by : and the actual list. A list starts with ( and ends with ). The elements of a list can be both atoms (atom) and lists. Ensure with attributes that the number at the beginning of the list definition is equal to the number of atoms in the list and its sub-lists. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar. Examples:

       List                        Result
       3:((atom)((atom)atom))      Valid
       :()                         Invalid
       0:()                        Valid
   (d) Extend the following grammar with attributes so that given expressions are evaluated according to the following examples. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar. The given grammar describes an operator that should be applied to all elements of a list. The list may contain sublists (without a preceding operator). The result should also be a list.

       E O L L ( LE LR ) LR, LE LR LR ɛ LE num LE L O inc dec

       Expression           Evaluated expression
       inc(1,(1,2),3)       (2,(2,3),4)
       dec(1,(1,2),3)       (0,(0,1),2)

6. Given the following grammar with S as start symbol, which defines binary trees:

   S T L T char tree T int tree L char L int L tree ( L, L )

   Write an attributed grammar that returns the number of char or int according to the tree type. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar. Examples:

   Tree                                      Result
   char_tree tree(int,char)                  1
   int_tree tree(int,char)                   1
   char_tree tree(int,int)                   0
   int_tree tree(int,tree(int,char))         2

7. Given the following simple programming language P, which allows only sequences of assignments:

   P S ; newline P P ɛ S id := E E id E ( E op E ) E ( uop E )

   Write an attributed grammar that prints the set of dependencies as triples (defined variable, referenced variable, line number) for a program written in P enriched by line numbers. Given the program in Listing 1, the grammar produces the output {(x,y,1), (y,x,2), (y,z,2)}.
   1  x:=y;
   2  y:=(x+z);

   Listing 1: Program in P

   Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar.

8. Given the grammar G, which defines a simple programming language. Write an attributed grammar which returns the control flow graph (CFG) for a program written in G. Give your attributes meaningful names and state which attributes are synthesized and which are inherited. Further, show an example of a translation using your attributed grammar. You are allowed to define your own data structure for the CFG.

   P S ; P P ɛ S id := E S if E then { P } Oelse S while E do { P } Oelse else { P } Oelse ɛ E id E ( E op E ) E ( uop E )