Question 1. Regular expressions (20 points)

In the Ada programming language an integer constant contains one or more digits, but it may also contain embedded underscores. Any underscores must be preceded and followed by at least one digit, which implies that a number may not begin or end with an underscore, and numbers may not contain two or more adjacent underscores. Underscores have no significance other than readability. Examples: 0, 17, 017, 0_01_7, 1_048_576.

Integers may also be written in bases other than 10 by preceding the number with the base and a # character. For example, all of these constants represent the decimal number 15231: 15231, 10#15231, 16#3B7F, 16#03B_7F, 2#00111011_01111111, and 8#35577. Other bases besides powers of 2 are also allowed, for example, 2#00101, 7#5, and 5#10 all represent the decimal number 5.

The base of a number (preceding the # sign) must be a number that is at least 2 and at most 16, and is written in decimal. The digits in a number consist of the decimal digits 0 through 9, and the letters A, B, C, D, E, F, written only in upper case. The language specifies that the digits must be less than the base if one is given, but for this problem we'll ignore that requirement and allow any string of digits, including A through F, to appear following a base if one is present. If no base is present, the digits may only be 0 through 9.

(a) (10 points) Give a regular expression for Ada integer constants as described above. You may only use basic regular expression operators (concatenation rs, choice r|s, Kleene star r*) and the additional operators r+ and r?. You may also specify character sets using the notation [abw-z], and sets excluding specified characters [^aeiou]. Finally, you can name parts of the regular expression, like vowel=[aeiou].

    decint   = [0-9](_?[0-9])*
    basedint = (1[0-6] | [2-9])#[0-9A-F](_?[0-9A-F])*
    integer  = decint | basedint
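
The named sub-expressions in the answer carry over almost unchanged to Python's `re` syntax, which makes the answer easy to spot-check. The following is a minimal sketch, not part of the exam solution; the function name `is_ada_integer` is chosen here for illustration, and whitespace around `|` is dropped because Python would treat it as a literal character.

```python
import re

# Same structure as the answer: decint | basedint.
DECINT = r"[0-9](_?[0-9])*"
BASEDINT = r"(1[0-6]|[2-9])#[0-9A-F](_?[0-9A-F])*"
INTEGER = re.compile(rf"(?:{DECINT}|{BASEDINT})")


def is_ada_integer(text):
    """Return True if the whole string is an integer constant under the exam's rules."""
    return INTEGER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["0", "1_048_576", "16#03B_7F", "7#5", "_1", "1__2", "17#5", "16#3b7f"]:
        print(sample, is_ada_integer(sample))
```

Using `fullmatch` rather than `match` matters here: `match` would accept a string such as `17#5` by matching only the leading `17`.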
Question 1. (cont.)

(b) (10 points) Draw a DFA (Deterministic Finite Automaton) that recognizes Ada integer constants as described in the problem and generated by the regular expression in your answer to part (a).

[DFA diagram not reproduced. Edge labels in the diagram: 0, 1, 2-9, 0-6, 7-9, 0-9, _, #, 0-9A-F.]
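
One DFA that accepts the language described in the problem can be written as a transition table. This is a sketch of a possible answer, not a transcription of the exam's diagram: the state names (`start`, `one`, `base`, `dec`, `dec_us`, `hash`, `hex`, `hex_us`) are invented here. The key idea is that a leading `1` or a leading `2`-`9` must be remembered, because only those prefixes can turn out to be a base when `#` follows.

```python
DIGITS = "0123456789"
HEX = DIGITS + "ABCDEF"


def edges(chars, target):
    """Build transitions to `target` for every character in `chars`."""
    return {c: target for c in chars}


TRANSITIONS = {
    # "one": saw just "1"; "base": saw 2-9 or 10-16, so "#" may follow.
    "start": {**edges("0", "dec"), **edges("1", "one"), **edges("23456789", "base")},
    "one": {**edges("0123456", "base"), **edges("789", "dec"), "_": "dec_us"},
    "base": {**edges(DIGITS, "dec"), "_": "dec_us", "#": "hash"},
    "dec": {**edges(DIGITS, "dec"), "_": "dec_us"},
    "dec_us": edges(DIGITS, "dec"),      # an underscore must be followed by a digit
    "hash": edges(HEX, "hex"),           # at least one digit after "#"
    "hex": {**edges(HEX, "hex"), "_": "hex_us"},
    "hex_us": edges(HEX, "hex"),
}
ACCEPTING = {"one", "base", "dec", "hex"}


def accepts(text):
    """Run the DFA; a missing transition means the input is rejected."""
    state = "start"
    for ch in text:
        state = TRANSITIONS[state].get(ch)
        if state is None:
            return False
    return state in ACCEPTING
```

The missing transitions play the role of an implicit dead state, which keeps the table small.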
Question 2. Ambiguity (14 points)

The syntax used to specify regular expressions can itself be defined by a context-free grammar. Here is one possible grammar for regular expressions with the operators concatenation, choice ( | ), Kleene star ( * ), and parenthesized subexpressions over the alphabet { a, b }.

    re ::= re re      (concatenation)
    re ::= re | re    (the | here is the literal regular expression choice operator)
    re ::= re *       (Kleene star)
    re ::= ( re )
    re ::= a
    re ::= b

Show that this grammar for specifying the syntax of regular expressions is ambiguous.

There are a huge number of possibilities. Here are two parse trees for one such string:

[Parse tree diagrams not reproduced.]
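
A grammar is ambiguous when some string has more than one parse tree, so one way to explore the "huge number of possibilities" is to count parse trees directly. The sketch below does this with a memoized recursion over substrings, with one case per production of the grammar. The function name `count_parse_trees` and the sample strings are chosen here for illustration and are not taken from the exam's answer.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Count the parse trees of `text` under the regular-expression grammar above."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving text[i:j] from the single nonterminal.
        if j - i < 1:
            return 0
        total = 0
        if j - i == 1 and text[i] in "ab":
            total += 1                                   # literal a or b
        for k in range(i + 1, j):
            total += count(i, k) * count(k, j)           # concatenation
            if text[k] == "|":
                total += count(i, k) * count(k + 1, j)   # choice
        if text[j - 1] == "*":
            total += count(i, j - 1)                     # Kleene star
        if text[i] == "(" and text[j - 1] == ")":
            total += count(i + 1, j - 1)                 # parentheses
        return total

    return count(0, len(text))


if __name__ == "__main__":
    for sample in ["a", "aaa", "a|a|a", "ab*"]:
        print(sample, count_parse_trees(sample))
```

Any string with a count above 1 is a witness to ambiguity. For example, `aaa` can group as (aa)a or a(aa), and `ab*` can apply the star to `b` alone or to the concatenation `ab`.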
Question 3. (30 points) The you're-probably-not-surprised-to-see-it LR-parsing question.

Here is a tiny grammar.

    0. S' ::= S $
    1. S ::= x S S
    2. S ::= y

(a) (12 points) Draw the LR(0) state machine for this grammar.

    State 1:  S' ::= . S $
              S ::= . x S S
              S ::= . y
    State 2:  S' ::= S . $
    State 3:  S ::= y .
    State 4:  S ::= x . S S
              S ::= . x S S
              S ::= . y
    State 5:  S ::= x S . S
              S ::= . x S S
              S ::= . y
    State 6:  S ::= x S S .

    Transitions:
      1 --S--> 2    1 --x--> 4    1 --y--> 3
      4 --S--> 5    4 --x--> 4    4 --y--> 3
      5 --S--> 6    5 --x--> 4    5 --y--> 3
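
The state machine can also be derived mechanically with the usual closure and goto operations. The sketch below is one possible implementation under stated assumptions: items are represented as `(rule_index, dot_position)` pairs, the helper names `closure`, `goto`, and `build_states` are chosen here, and states are numbered from 0 in discovery order, so the numbering will not match the hand-drawn answer.

```python
# Grammar from the question; "S'" is the added start symbol.
GRAMMAR = [
    ("S'", ("S", "$")),      # rule 0
    ("S", ("x", "S", "S")),  # rule 1
    ("S", ("y",)),           # rule 2
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add A ::= . rhs for every nonterminal A that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(r, d + 1) for r, d in state
             if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol}
    return closure(moved)


def build_states():
    """Return (states, edges) of the LR(0) machine; $ is treated as accept, not a shift."""
    states = [closure({(0, 0)})]
    edges = {}
    for state in states:  # the list grows while we iterate over it
        symbols = sorted({GRAMMAR[r][1][d] for r, d in state if d < len(GRAMMAR[r][1])})
        for symbol in symbols:
            if symbol == "$":
                continue
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges
```

Calling `build_states()` yields six item sets and nine transitions for this grammar, including the self-loop on `x` out of the state containing `S ::= x . S S`.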
Question 3. (cont.) Grammar repeated from previous page for reference.

    0. S' ::= S $
    1. S ::= x S S
    2. S ::= y

(b) (12 points) Construct the LR(0) parse table for this grammar based on the state machine in your answer to part (a).

    State   x     y     $     S
    1       s4    s3          g2
    2                   acc
    3       r2    r2    r2
    4       s4    s3          g5
    5       s4    s3          g6
    6       r1    r1    r1

(c) (3 points) Is this grammar LR(0)? Why or why not?

Yes. There are no shift-reduce or reduce-reduce conflicts in the table.

(d) (3 points) Is this grammar SLR? Why or why not?

Yes. All LR(0) grammars are also SLR.
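
To see the table in action, it can be fed to a small table-driven shift-reduce driver. The sketch below transcribes the table entries and state numbers from the answer; the driver itself, the name `parse`, and the choice to return the list of rule numbers used in reductions are assumptions made here for illustration.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("S", 3), 2: ("S", 1)}

ACTION = {
    1: {"x": ("s", 4), "y": ("s", 3)},
    2: {"$": ("acc", None)},
    3: {"x": ("r", 2), "y": ("r", 2), "$": ("r", 2)},
    4: {"x": ("s", 4), "y": ("s", 3)},
    5: {"x": ("s", 4), "y": ("s", 3)},
    6: {"x": ("r", 1), "y": ("r", 1), "$": ("r", 1)},
}
GOTO = {1: {"S": 2}, 4: {"S": 5}, 5: {"S": 6}}


def parse(text):
    """Return the rule numbers reduced, in order, or None if `text` is rejected."""
    tokens = list(text) + ["$"]
    stack = [1]                      # stack of state numbers, start state is 1
    pos = 0
    reductions = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[pos])
        if entry is None:
            return None              # blank table cell: syntax error
        kind, arg = entry
        if kind == "s":              # shift: push the new state, consume the token
            stack.append(arg)
            pos += 1
        elif kind == "r":            # reduce: pop the handle, then follow goto
            lhs, length = RULES[arg]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(arg)
        else:                        # accept only if $ is the real end of input
            return reductions if pos == len(tokens) - 1 else None
```

For the input `xyy` the driver shifts `x` and `y`, reduces by rule 2 twice (with a shift of the second `y` in between), reduces by rule 1, and then accepts on `$`.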
Question 4. First/Follow/Nullable (20 points)

Consider the following grammar:

    S ::= A C B
    A ::= B C B A
    B ::= b C
    C ::= C b | C c | ε

Complete the following table to give the FIRST and FOLLOW sets and NULLABLE attribute for each of the non-terminals in the grammar.

    Symbol   NULLABLE   FIRST       FOLLOW
    S        F          { b }       $
    A        F          { b }       { b, c }
    B        F          { b }       { a, b, c }
    C        T          { b, c }    { a, b, c }
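
The three attributes are normally computed together by iterating to a fixed point. The sketch below shows that computation on a small hypothetical grammar chosen here for illustration (S ::= A B, A ::= a A | ε, B ::= b); it is not the exam's grammar. It also assumes the convention that `$` is seeded into FOLLOW of the start symbol, and represents ε as an empty right-hand side.

```python
# Hypothetical example grammar; an empty tuple stands for an ε production.
GRAMMAR = {
    "S": [("A", "B")],
    "A": [("a", "A"), ()],
    "B": [("b",)],
}


def analyze(grammar, start):
    """Return (nullable, first, follow) computed by fixed-point iteration."""
    nullable = set()
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    follow[start].add("$")           # assumed convention: end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                # NULLABLE: every symbol on the right-hand side is nullable.
                if lhs not in nullable and all(s in nullable for s in rhs):
                    nullable.add(lhs)
                    changed = True
                # FIRST: scan left to right until a non-nullable symbol.
                for sym in rhs:
                    add = first[sym] if sym in grammar else {sym}
                    if not add <= first[lhs]:
                        first[lhs] |= add
                        changed = True
                    if sym not in nullable:
                        break
                # FOLLOW: scan right to left, tracking what can come next.
                trailer = set(follow[lhs])
                for sym in reversed(rhs):
                    if sym in grammar:
                        if not trailer <= follow[sym]:
                            follow[sym] |= trailer
                            changed = True
                        if sym in nullable:
                            trailer = trailer | first[sym]
                        else:
                            trailer = set(first[sym])
                    else:
                        trailer = {sym}
    return nullable, first, follow
```

For the hypothetical grammar, only A is nullable, FIRST(S) is { a, b }, and FOLLOW(A) is { b } because B can start only with b.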
Question 5. Parsing tools (16 points)

For this problem we would like to translate arithmetic expressions from ordinary infix notation to postfix using CUP. In postfix notation a binary expression like a*b is written as ab*, with the operands appearing in the same order as they did originally, followed by the operator. Operators act on the two expressions or subexpressions immediately to their left, and operators are executed as soon as they are reached scanning left to right. Postfix expressions are unambiguous and do not require parentheses or operator precedence rules. Here are several examples:

    Infix      Postfix    Evaluation order
    a-b+c      ab-c+      a-b is computed, then c is added to the result
    a-(b+c)    abc+-      a, b, and c are scanned, then b+c is computed first, then that result is the right operand of the subtraction from a
    a+b*c      abc*+      multiply b*c, then add that result to a
    (a+b)*c    ab+c*      add a+b first, then multiply by c
    a*b+c      ab*c+      multiply a*b first, then add c
    (a*b)+c    ab*c+      same
    a*b+c*d    ab*cd*+    multiply a*b, multiply c*d, then add the two products

On the next page, complete the CUP specification so the resulting parser will accept infix arithmetic expressions involving identifiers; the operators +, -, and *; and parenthesized subexpressions, and print the corresponding postfix expression. (Note: you are not building a tree as you did for the parser part of the project. Just print things at appropriate places.)

You should assume that the semantic actions can call a function print(s) that will print the contents of string s to the output immediately following any previous output. The semantic action for identifier is supplied as an example. (We are taking some liberties with types here and are assuming that when an identifier token is printed the corresponding identifier string appears in the output. That is not true for the operators PLUS, TIMES, etc. You need to print appropriate literal strings in the right places.)

You will also need to add precedence declarations to handle the ambiguities in the given grammar.

Write your answer on the next page (you can detach this page if you like).
Question 5 (cont.) Complete the CUP specification below to read infix expressions and print the corresponding postfix. (Hint: the answers may be quite short and/or simple; don't be alarmed if that happens.)

    /* Terminals (tokens returned by the scanner) */
    terminal PLUS, MINUS, TIMES, LPAREN, RPAREN, IDENTIFIER;

    /* Nonterminals */
    nonterminal void Expr;

    /* Precedence declarations: add anything appropriate here */
    precedence left PLUS, MINUS;
    precedence left TIMES;

    /* Productions */
    Expr ::= IDENTIFIER:id             {: print(id); :}
           | Expr:exp1 PLUS Expr:exp2  {: print("+"); :}
           | Expr:exp1 MINUS Expr:exp2 {: print("-"); :}
           | Expr:exp1 TIMES Expr:exp2 {: print("*"); :}
           | LPAREN Expr:exp RPAREN    {: /* nothing needed */ :}
           ;
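
The effect of the semantic actions can be checked without running CUP. The sketch below is a stand-in, not the CUP-generated parser: it swaps the LALR parser for a small precedence-climbing recursive parser in Python, with the same precedence and left associativity as the declarations above, and emits each operator after both of its operands have been emitted. The name `to_postfix` and the restriction to single-character identifiers are assumptions made here to keep the example short.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2}   # mirrors: left PLUS, MINUS; then left TIMES


def to_postfix(text):
    """Translate an infix expression with single-character identifiers to postfix."""
    tokens = [c for c in text if not c.isspace()]
    out = []
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            expr(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing )")
            pos += 1                     # parentheses produce no output
        elif token.isalnum():
            out.append(token)            # identifier: emitted as soon as it is seen
        else:
            raise ValueError(f"unexpected token {token!r}")

    def expr(min_prec):
        nonlocal pos
        primary()
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            expr(PRECEDENCE[op] + 1)     # +1 makes the operator left associative
            out.append(op)               # operator is emitted after both operands

    expr(1)
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return "".join(out)


if __name__ == "__main__":
    for infix in ["a-b+c", "a-(b+c)", "a+b*c", "(a+b)*c", "a*b+c*d"]:
        print(infix, "->", to_postfix(infix))
```

As in the CUP solution, the parenthesized case needs no action of its own: grouping only changes the order in which the inner operators are emitted.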