Total No. of Questions : 6                                        P4890
B.E / Insem. - 74
B.E (Computer Engg)
PRINCIPLES OF MODERN COMPILER DESIGN
(2012 Pattern) (Semester I)
Time : 1 Hour                                            Max Marks : 30

Q.1 a) Explain need of symbol table with compiler. List different data structures for symbol table. [4]
Ans. Symbol table information is used by the analysis and synthesis phases:
1) To verify that used identifiers have been defined (declared)
2) To verify that expressions and assignments are semantically correct (type checking)
3) To generate intermediate or target code
Different data structures for symbol table:
1) Unordered list
2) Ordered list
3) Hash table
4) Binary search tree

Q.1 b) What is Garbage Collection? [2]
Ans. Garbage collection is the systematic recovery of pooled computer storage that is being used by a program when that program no longer needs the storage. There are three main techniques for automatic memory management: reference counting, mark-and-sweep, and copying.

Q.1 c) What is LEX? Give format of LEX specification file. [4]
Ans. Lex: It is a scanner generator that helps write programs whose control flow is directed by instances of regular expressions in the input stream.
Format of LEX specification file:
    ...definitions...
    %%
    ...rules...
    %%
    ...code...
1) Definitions section: Three kinds of things can go in the definitions section (C code, definitions, and start-state declarations); the first two are described here:
   C code: Any code enclosed between %{ and %} is copied verbatim to the generated C file. This is typically used for defining file-scope variables and for prototypes of routines that are defined in the code section.
   Definitions: A definition is very much like a #define cpp directive. For example:
       letter    [a-zA-Z]
       digit     [0-9]
       punct     [,.:;!?]
       nonblank  [^ \t]
   These definitions can be used in the rules section: one could start a rule
       {letter}+ {...
2) Rules section: The rules section has a number of pattern-action pairs. The patterns are regular expressions. If more than one rule matches the input, the longest match is taken. If two matches have the same length, the rule listed earlier is taken.
3) User code section: If the lex program is to be used on its own, this section will contain a main program. If you leave this section empty, you get the default main from the lex library:
       int main()
       {
           yylex();
           return 0;
       }
   where yylex is the scanner (lexical analyzer) that is built from the rules.

OR

Q.2 a) Compare single pass and multipass design for compiler. [4]
Ans. a)
1. A one-pass compiler is a compiler that passes through the source code of each compilation unit only once. A multi-pass compiler is a type of compiler that processes the source code or abstract syntax tree of a program several times.
2. A one-pass compiler is faster than a multi-pass compiler.
3. A one-pass compiler sees only the part of the program translated so far, so its scope for analysis and optimization is limited; a multi-pass compiler can examine the whole program and therefore has a wider scope.
4. Multi-pass compilers are sometimes called wide compilers, whereas one-pass compilers are sometimes called narrow compilers.
5. Not every programming language can be compiled in a single pass. For example, Pascal can be implemented with a single-pass compiler, whereas languages like Java require a multi-pass compiler.

[Figure: Single-pass compiler: the Compiler Driver calls the Syntactic Analyzer, which in turn calls the Contextual Analyzer and the Code Generator. Multi-pass compiler: the Compiler Driver calls the Syntactic Analyzer, the Contextual Analyzer and the Code Generator one after another; the data passed between them is Source text -> AST -> Decorated AST -> M/c Code.]

Q.2 b) What are lexeme, pattern and token in lexical analysis? [3]
Ans. b)
Token: A token is a sequence of characters that can be treated as a single logical entity. Typical tokens are:
1) Identifiers 2) Keywords 3) Operators 4) Special symbols 5) Constants
Pattern: A set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
Lexeme: A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.

Q.2 c) Explain static Vs dynamic storage allocation. [3]
Ans. c) Static allocation means that the memory for your variables is allocated automatically, either on the stack or in other sections of your program. You do not have to reserve extra memory for them yourself, but you also have no control over the lifetime of this memory. E.g., a variable in a function exists only until the function finishes:
    void func()
    {
        int i; /* `i` only exists during `func` */
    }
Dynamic memory allocation is a bit different. Allocation of memory at the time of execution (run time) is known as dynamic memory allocation. Here we can control the exact size and the lifetime of these memory locations.

Static Allocation                                        | Dynamic Allocation
---------------------------------------------------------+---------------------------------------------------------
Memory is allocated before the execution of the program  | Memory is allocated during the execution of the program.
begins (during compilation).                             |
No memory allocation or deallocation actions are         | Memory bindings are established and destroyed during
performed during execution.                              | execution.
Variables remain permanently allocated.                  | Allocated only when the program unit is active.
Implemented using data segments.                         | Implemented using stacks and heaps.
No pointer is needed to access variables.                | A pointer is needed to access dynamically allocated
                                                         | variables.
Faster execution than dynamic.                           | Slower execution than static.
More memory space required.                              | Less memory space required.

Q.3 a) What are problems/issues associated with top-down parser? [2]
Ans. a) Problems with the top-down parser:
1. Only judges grammaticality.
2. Stops when it finds a single derivation.
3. No semantic knowledge employed.
4. No way to rank the derivations.
5. Problems with left-recursive rules.
6. Problems with ungrammatical sentences.

Q.3 b) What is type checking? [2]
Ans. b) Type checking is a program analysis that verifies something about the types that are used in the program. The type checker verifies that the type of a construct (constant, variable, array, list, object) matches what is expected in its usage context. E.g., C's % (modulo) operator expects two integer operands, so 3 % 4.5 is a type error.

Q.3 c) Generate LR(1) parsing table for following grammar: [6]
    S -> BB
    B -> cB
    B -> d
Ans. c) Augment the grammar with S' -> S and number the productions (1) S -> BB, (2) B -> cB, (3) B -> d. The initial item is
    I : S' -> .S, $
Now produce the LR(1) sets of items.

Closure(I):
I0 :             S' -> .S, $
                 S -> .BB, $
                 B -> .cB, c/d
                 B -> .d, c/d
goto(I0, S) I1 : S' -> S., $
goto(I0, B) I2 : S -> B.B, $
                 B -> .cB, $
                 B -> .d, $
goto(I0, c) I3 : B -> c.B, c/d
                 B -> .cB, c/d
                 B -> .d, c/d
goto(I0, d) I4 : B -> d., c/d
goto(I2, B) I5 : S -> BB., $
goto(I2, c) I6 : B -> c.B, $
                 B -> .cB, $
                 B -> .d, $
goto(I2, d) I7 : B -> d., $
goto(I3, B) I8 : B -> cB., c/d
goto(I3, c) : I3
goto(I3, d) : I4
goto(I6, B) I9 : B -> cB., $
goto(I6, c) : I6
goto(I6, d) : I7
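The item sets above can be reproduced mechanically. The following is a minimal Python sketch of the closure and goto computation for this particular grammar; the function names (`first_of`, `closure`, `goto`, `canonical_collection`) and the item encoding (production index, dot position, lookahead) are illustrative choices and are not part of the question paper. It relies on the fact that this grammar has no epsilon productions.

```python
# Grammar from Q.3 c); "S'" is the added start symbol of the augmented grammar.
GRAMMAR = [("S'", ("S",)), ("S", ("B", "B")), ("B", ("c", "B")), ("B", ("d",))]
NONTERMINALS = {"S'", "S", "B"}


def first_of(symbols, lookahead):
    """FIRST of a symbol string followed by a lookahead terminal.

    Assumes no epsilon productions, so only the first symbol matters.
    """
    if not symbols:
        return {lookahead}
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == head and rhs[0] != head:
            result |= first_of(rhs, lookahead)
    return result


def closure(items):
    """Add [B -> .gamma, b] for every item [A -> alpha.B beta, a], b in FIRST(beta a)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot, la = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot]:
                    for b in first_of(rhs[dot + 1:], la):
                        item = (i, 0, b)
                        if item not in items:
                            items.add(item)
                            work.append(item)
    return frozenset(items)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1, la) for (p, d, la) in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def canonical_collection():
    states = [closure({(0, 0, "$")})]
    transitions = {}
    for state in states:  # the list grows while we iterate over it
        for symbol in ("S", "B", "c", "d"):
            target = goto(state, symbol)
            if target is None:
                continue
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "item sets")
for (src, symbol), dst in sorted(transitions.items()):
    print(f"goto(I{src}, {symbol}) = I{dst}")
```

Because the symbols are tried in the order S, B, c, d, the states are discovered in the same order as in the answer, so the state numbers printed by the sketch line up with I0 to I9.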
Parsing Table

         |      Action       |   Goto
States   |  c      d      $  |  S    B
---------+-------------------+----------
   0     |  S3     S4        |  1    2
   1     |                acc|
   2     |  S6     S7        |       5
   3     |  S3     S4        |       8
   4     |  R3     R3        |
   5     |                R1 |
   6     |  S6     S7        |       9
   7     |                R3 |
   8     |  R2     R2        |
   9     |                R2 |

(Sn = shift and go to state n; Rn = reduce by production n, where 1: S -> BB, 2: B -> cB, 3: B -> d; acc = accept.)

OR

Q.4 a) Explain in brief: Recursive Descent parser [2]
Ans. a) A recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure usually implements one of the productions of the grammar.
Or
Recursive descent parsing associates a procedure with each nonterminal in the grammar; it may require backtracking over the input string.

Q.4 b) Differentiate between syntax and semantic analysis by giving example [2]
Ans. b) Syntax is about the structure or the grammar of the language. Syntax rules define whether or not a sentence is properly constructed. Here are some C language syntax rules:
- separate statements with a semi-colon
- enclose the conditional expression of an IF statement inside parentheses
- group multiple statements into a single statement by enclosing them in curly braces
- data types and variables must be declared before the first executable statement (this restriction was dropped in C99; C99 and later allow declarations to be mixed with statements)
Semantics is about the meaning of the sentence. For example:
    x++;                   // increment
    foo(xyz, --b, &qrs);   // call foo
Consider the ++ operator in the first statement. First of all, is it even valid to attempt this?
If x is a struct, this statement has no meaning (according to the C language rules) and thus it is an error even though the statement is syntactically correct.
If x is a pointer to some data type, the meaning of the statement is to "add sizeof(some data type) to the value at address x and store the result into the location at address x".
If x is an arithmetic variable (integer or floating point), the meaning of the statement is "add one to the value at address x and store the result into the location at address x".

Q.4 c) Check if the following grammar is LL(1) [6]
    S  -> iCtSS' | a
    S' -> eS | Ɛ
    C  -> b
Ans. c) The grammar is not left recursive and need not be left factored. Now, computing FIRST and FOLLOW sets:
    FIRST(S)  = { i, a }      FOLLOW(S)  = { e, $ }
    FIRST(S') = { e, Ɛ }      FOLLOW(S') = { e, $ }
    FIRST(C)  = { b }         FOLLOW(C)  = { t }
Parsing table construction:

NT \ T |  i            |  t  |  a       |  e                  |  b       |  $
-------+---------------+-----+----------+---------------------+----------+-----------
S      |  S -> iCtSS'  |     |  S -> a  |                     |          |
S'     |               |     |          |  S' -> eS, S' -> Ɛ  |          |  S' -> Ɛ
C      |               |     |          |                     |  C -> b  |

The entry M[S', e] contains two productions: S' -> eS because e is in FIRST(eS), and S' -> Ɛ because e is in FOLLOW(S'). Since the table has a multiply defined entry, the grammar is not LL(1), even though it is not left recursive and needs no left factoring. The conflict reflects the dangling-else ambiguity of the grammar.
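The LL(1) check can also be carried out mechanically. The sketch below, written in Python purely for illustration, computes FIRST and FOLLOW for the grammar of Q.4 c), fills the predictive parsing table, and reports every cell that receives more than one production. All names in it (`first_of_string`, `compute_first`, `compute_follow`, `build_table`) are hypothetical helpers, not part of the question paper.

```python
EPS = "ε"
GRAMMAR = {
    "S":  [["i", "C", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], [EPS]],
    "C":  [["b"]],
}
START = "S"


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of the nonterminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish (or the string is empty)
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, prods in GRAMMAR.items():
            for rhs in prods:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for rhs in prods:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of_string(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    """M[A, a] gets A -> alpha for a in FIRST(alpha), plus FOLLOW(A) if alpha =>* ε."""
    table = {}
    for nt, prods in GRAMMAR.items():
        for rhs in prods:
            rhs_first = first_of_string(rhs, first)
            targets = rhs_first - {EPS}
            if EPS in rhs_first:
                targets |= follow[nt]
            for terminal in targets:
                table.setdefault((nt, terminal), []).append(rhs)
    return table


first = compute_first()
follow = compute_follow(first)
table = build_table(first, follow)
conflicts = {cell: prods for cell, prods in table.items() if len(prods) > 1}
for (nt, terminal), prods in sorted(conflicts.items()):
    print(f"conflict at M[{nt}, {terminal}]:", [" ".join(p) for p in prods])
```

A grammar is LL(1) exactly when `conflicts` is empty. For this grammar the only cell with two entries is M[S', e], which holds both S' -> eS and S' -> Ɛ.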
Q.5 a) Explain advantages of intermediate code [2]
Ans. a) Advantages of intermediate code:
1. Target code can be generated for any machine just by attaching a new back end for that machine. This is called retargeting.
2. It is possible to apply machine-independent code optimization. This helps produce more efficient target code.

Q.5 b) Compare quadruple, triple and indirect triple [4]
Ans. b)
1. Quadruples
Quadruples consist of four fields in the record structure: one field to store the operator op, two fields to store the operands or arguments arg1 and arg2, and one field to store the result res.
    res = arg1 op arg2
Example: in a = b + c, b is represented as arg1, c as arg2, + as op and a as res.
Consider the statement
    a = -b * d + c + (-b) * d
Three-address code for the above statement is as follows:
    t1 = - b
    t2 = t1 * d
    t3 = t2 + c
    t4 = - b
    t5 = t4 * d
    t6 = t3 + t5
    a = t6
Quadruples for the above example are as follows:

    Op  | Arg1 | Arg2 | Res
    ----+------+------+-----
    -   | b    |      | t1
    *   | t1   | d    | t2
    +   | t2   | c    | t3
    -   | b    |      | t4
    *   | t4   | d    | t5
    +   | t3   | t5   | t6
    =   | t6   |      | a

2. Triples
Triples use only three fields in the record structure: one field for the operator and two fields for the operands, named arg1 and arg2. The value of a temporary
variable is referred to by the position of the statement that computes it, and not by a named location as in quadruples.
Triples for the above example are as follows:

    StmtNo | Op  | Arg1 | Arg2
    -------+-----+------+------
    (0)    | -   | b    |
    (1)    | *   | (0)  | d
    (2)    | +   | (1)  | c
    (3)    | -   | b    |
    (4)    | *   | (3)  | d
    (5)    | +   | (2)  | (4)
    (6)    | =   | a    | (5)

3. Indirect Triples
Indirect triples add a level of indirection: instead of listing the triples themselves in execution order, a separate statement list holds pointers to the triples.

    Stmt No | Pointer
    --------+---------
    (0)     | (10)
    (1)     | (11)
    (2)     | (12)
    (3)     | (13)
    (4)     | (14)
    (5)     | (15)
    (6)     | (16)

    StmtNo | Op  | Arg1 | Arg2
    -------+-----+------+------
    (10)   | -   | b    |
    (11)   | *   | (10) | d
    (12)   | +   | (11) | c
    (13)   | -   | b    |
    (14)   | *   | (13) | d
    (15)   | +   | (12) | (14)
    (16)   | =   | a    | (15)

Indirect triples and quadruples give almost the same performance with respect to space and reordering of code. However, indirect triples can save space if temporary variables are reused.

Q.5 c) Generate intermediate code for the following statement [4]
    a = b + c;
(Specify syntax directed translation)
Ans. c) Syntax directed translation scheme (|| denotes concatenation of code):

    Production      Semantic rules
    S -> id := E    S.code := E.code || gen(id.place ':=' E.place)
    E -> E1 + E2    E.place := newtemp();
                    E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
    E -> E1 * E2    E.place := newtemp();
                    E.code := E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)
    E -> - E1       E.place := newtemp();
                    E.code := E1.code || gen(E.place ':=' 'uminus' E1.place)
    E -> ( E1 )     E.place := E1.place;
                    E.code := E1.code
    E -> id         E.place := id.place;
                    E.code := '' (empty)

Three-address code for the above statement: since E -> id generates no code and simply passes on id.place, applying the rules to a = b + c yields
    t1 = b + c
    a = t1

OR

Q.6 a) Explain need for intermediate code [2]
Ans. a) If a compiler translates the source language directly to its target machine language without the option of generating intermediate code, then a full native compiler is required for each new machine.
Intermediate code eliminates the need for a new full compiler for every unique machine by keeping the analysis portion the same for all the compilers. Only the second part of the compiler, synthesis, is changed according to the target machine.
It also becomes easier to improve code performance, because code optimization techniques can be applied to the intermediate code.

Q.6 b) Define L-attributed grammar [2]
Ans. b) L-attributed grammars are a special type of attribute grammars. They allow the attributes to be evaluated in one depth-first left-to-right traversal of the abstract syntax tree. As a result, attribute evaluation in L-attributed grammars can be incorporated conveniently in top-down parsing.
A syntax-directed definition is L-attributed if each inherited attribute of Xj on the right side of A -> X1 X2 ... Xn depends only on
1. the attributes of the symbols X1, X2, ..., Xj-1
2. the inherited attributes of A
Every S-attributed syntax-directed definition is also L-attributed.
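To make the L-attributed definition concrete, here is a small Python sketch that evaluates an inherited attribute in a single depth-first, left-to-right traversal. The grammar in the comments (a type declaration such as `float a, b, c`), the dictionary-based tree encoding and the function names are hypothetical and chosen only for illustration; they are not part of the question paper.

```python
# Hypothetical L-attributed definition used for this sketch:
#   D -> T L        L.inh  := T.type
#   L -> id , L1    addtype(id, L.inh) ; L1.inh := L.inh
#   L -> id         addtype(id, L.inh)
# L.inh depends only on the left sibling T (rule 1 above) or on the
# inherited attribute of the parent L (rule 2 above).


def eval_T(node):
    """T -> int | float: T.type is a synthesized attribute."""
    return node["name"]


def eval_L(node, inh, symtab):
    """Visit an L node; `inh` is the inherited attribute L.inh."""
    symtab[node["id"]] = inh          # addtype(id, L.inh)
    if node["rest"] is not None:
        eval_L(node["rest"], inh, symtab)   # L1.inh := L.inh


def eval_D(node):
    """D -> T L: children are visited left to right, exactly once."""
    symtab = {}
    t_type = eval_T(node["T"])        # left child first
    eval_L(node["L"], t_type, symtab) # its value flows to the right sibling
    return symtab


# Parse tree for the declaration: float a, b, c
tree = {
    "T": {"name": "float"},
    "L": {"id": "a", "rest": {"id": "b", "rest": {"id": "c", "rest": None}}},
}
print(eval_D(tree))
```

Because every inherited value is available before the node that needs it is visited, no second pass over the tree is required; this is the property that lets such definitions be evaluated during top-down parsing.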
Q.6 c) Generate intermediate code for the following statement [4]
    p<q or a>b
(Specify syntax directed translation)
Ans. c) Translation of Boolean Expressions (numerical representation: true = 1, false = 0)
The SDD for Boolean expressions is shown below. Here stmt denotes the number of the statement being generated.

    Production             Semantic rules
    E -> E1 or E2          { E.place := newtemp;
                             gen(E.place := E1.place or E2.place) }
    E -> E1 and E2         { E.place := newtemp;
                             gen(E.place := E1.place and E2.place) }
    E -> not E1            { E.place := newtemp;
                             gen(E.place := not E1.place) }
    E -> ( E1 )            { E.place := E1.place }
    E -> id1 relop id2     { E.place := newtemp;
                             gen(if id1.place relop id2.place goto stmt+3);
                             gen(E.place := 0);
                             gen(goto stmt+2);
                             gen(E.place := 1) }
    E -> true              { E.place := newtemp; gen(E.place := 1) }
    E -> false             { E.place := newtemp; gen(E.place := 0) }

Three-address code for p<q or a>b, starting at statement 100:
    100: if p<q goto 103
    101: t1 = 0
    102: goto 104
    103: t1 = 1
    104: if a>b goto 107
    105: t2 = 0
    106: goto 108
    107: t2 = 1
    108: t3 = t1 or t2
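The translation can be traced with a few lines of Python. This is only a sketch under stated assumptions: the `Emitter` class and the helper functions `relop` and `bool_or` are hypothetical stand-ins for gen, newtemp and the semantic rules of the relop and or productions, and statement numbering is assumed to start at 100 as in the answer.

```python
class Emitter:
    """Collects numbered three-address statements (hypothetical helper)."""

    def __init__(self, start=100):
        self.start = start
        self.code = []
        self.temp_count = 0

    def next_stmt(self):
        # Number of the statement that the next gen() call will emit.
        return self.start + len(self.code)

    def new_temp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def gen(self, text):
        self.code.append(f"{self.next_stmt()}: {text}")


def relop(em, left, op, right):
    """E -> id1 relop id2: set E.place to 1 if the test holds, else 0."""
    place = em.new_temp()
    em.gen(f"if {left} {op} {right} goto {em.next_stmt() + 3}")
    em.gen(f"{place} = 0")
    em.gen(f"goto {em.next_stmt() + 2}")
    em.gen(f"{place} = 1")
    return place


def bool_or(em, left_place, right_place):
    """E -> E1 or E2: combine the two numerical results."""
    place = em.new_temp()
    em.gen(f"{place} = {left_place} or {right_place}")
    return place


em = Emitter()
t1 = relop(em, "p", "<", "q")
t2 = relop(em, "a", ">", "b")
t3 = bool_or(em, t1, t2)
print("\n".join(em.code))
```

Each relational test expands to four statements, so the two tests occupy statements 100 to 107, and the or rule then emits statement 108, which combines t1 and t2 into t3.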