4. Compilation Basics Semantic Analysis Code Generation Introduction to Optimization

Outline

4. Compilation Basics
- Semantic Analysis
- Code Generation
- Introduction to Optimization

Semantic Analysis

- After scanning and parsing have been done (including construction of the parse tree and the symbol and constant tables), the next order of business is semantic analysis.
- The semantic analyzer ensures that the program makes sense beyond simple syntax.
- This does not mean the program does what you want, merely that it is consistent with respect to the semantics of the language.

Semantic Analysis

What kinds of compile-time semantic checks can we perform, and how complex are they to do?

Mathematical checks
- Divide by zero: the zero must be determinable at compile time, either as a constant zero (e.g. X / 0.0) or as an expression which symbolically must evaluate to zero at run time (e.g. X / (Y - Y)).
- Overflow:
  - a constant which exceeds the representation ability of the target machine (recall that the grammar does not restrict the number of digits or their values in a numeric constant)
  - arithmetic which obviously leads to overflow (a constant expression or ...)
- Underflow: as for overflow.
- etc.

Semantic Analysis

Uniqueness checks
- In certain situations, it is important that particular constructs occur only once.
- Declarations: within any given scope, each identifier must be declared only once.

    VAR i, j : INTEGER;
        x, y, i : REAL;   /* error: i is multiply defined in the scope */

- The applicable scope may change based on what sort of thing is being declared. E.g. labels must be unique within the entire program unit (program or module).
- Case statements: each case constant must occur only once in the switch.
- etc.

Semantic Analysis

Consistency checks
- It may also be necessary to check that a symbol that occurs in one place occurs in others as well.
- Example: in Ada you must end each program unit by specifying its name (which must match the name specified at its start).
- Such consistency checks are required whenever matching is required and what must be matched is not specified as a terminal in the grammar. Thus, the check cannot be done by the parser.

Type checks
- These checks form the bulk of semantic checking and certainly account for the majority of the overhead of this phase of compilation.

Semantic Analysis

- In general, the types across any given operator must be compatible.
- The meaning of compatible may be:
  - the same type
  - two different sizes of the same basic type (e.g. short, int, long)
  - some other pre-defined compatibility (e.g. int op float)
- Having determined the types of each operand in an expression, we can type check the expression according to a type system.
  - The type system may be a formal definition (à la some type of calculus), but quite often it is just an ad hoc specification.

Semantic Analysis

- In a language supporting the definition of new first-class types, the type system must be extensible (e.g. in C++ a class definition may include specific routines for converting between other types/classes and the current one).
- To type check, pretend to evaluate the expression and determine the compatibility of the resulting subexpressions applied to the operators.
- Save the types for later use in code generation, when code must be generated to perform explicit type promotions and/or conversions.

Semantics: Unique Identifiers

- Each identifier must be declared once and only once for (not within) a given scope.
- Verifying the uniqueness of identifier declarations may be incorporated into syntax analysis, although it is a semantic analysis.
- It can't be done before syntax analysis, since the context of an identifier occurrence is unknown:
  - what scope you are in
  - whether this identifier occurs within a declaration or an executable statement
- Checking for uniqueness of declaration is most easily accomplished in the symbol table maintenance routines.
- When an identifier is defined, call a routine Add_Ident(<ident>).
  - If <ident> doesn't already exist, add it to the identifier table.
  - If it does exist, then this is an attempt to declare it a second time and an error should be reported.

Semantics: Unique Identifiers

- When an identifier is referred to (e.g. in an executable statement), call a routine Check_Declared(<ident>).
  - If the identifier has been declared (i.e. is found), then everything is OK (maybe return some information about it for future use).
  - If not, then this is a reference to an undeclared variable and the error should be reported.
- In a scoped language, a second argument to each routine might be a pointer into a scope table to say which scope the operation applies to.
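
The two routines described above can be sketched in C as follows. This is only a minimal illustration: the fixed-size linear table, the integer scope number, and the lower-case names add_ident, check_declared, and find_ident are assumptions made for the example, and the lookup deliberately searches only the given scope rather than any enclosing scopes.

```c
#include <stdio.h>
#include <string.h>

#define MAX_IDENTS 64

/* One identifier-table entry: a name plus the scope it was declared in. */
struct ident { char name[32]; int scope; };

static struct ident table[MAX_IDENTS];
static int n_idents = 0;

/* Return the index of (name, scope) in the table, or -1 if absent. */
static int find_ident(const char *name, int scope) {
    for (int i = 0; i < n_idents; i++)
        if (table[i].scope == scope && strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Declaration: returns 0 on success, -1 on a duplicate, a full table,
   or a name too long for the entry. */
int add_ident(const char *name, int scope) {
    if (find_ident(name, scope) >= 0) {
        fprintf(stderr, "error: '%s' is multiply defined in scope %d\n", name, scope);
        return -1;
    }
    if (n_idents == MAX_IDENTS || strlen(name) >= sizeof table[0].name)
        return -1;
    strcpy(table[n_idents].name, name);
    table[n_idents].scope = scope;
    n_idents++;
    return 0;
}

/* Reference: returns 1 if the name is declared in the given scope, else 0. */
int check_declared(const char *name, int scope) {
    if (find_ident(name, scope) >= 0)
        return 1;
    fprintf(stderr, "error: '%s' is undeclared in scope %d\n", name, scope);
    return 0;
}
```

Declaring i twice in scope 0 makes the second add_ident call fail, while declaring i again in scope 1 succeeds because the scope number is part of the lookup key.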

Attribute Grammars

- A general strategy for syntax-directed operations (semantic checking, code generation, etc.).
- Context-free grammars are augmented with rules to attribute certain parse tree nodes with information derived from surrounding parts of the tree.
- There are two types of attributes: inherited and synthesized.
  - Inherited attributes are those whose values are determined by attributes of nodes above or beside them in the parse tree (the parent and siblings), never by their descendants.
  - Synthesized attributes are those whose values are determined by the values of the attributes of only descendant parse tree nodes.

Attribute Grammars

- Logically, inherited attributes are determined by top-down propagation of attribute values, while synthesized attributes are determined by bottom-up propagation.
  - In YACC, assigning to $$ is synthesizing an attribute.
- A more formal definition (courtesy of the dragon book [ASU86]):

In an attribute grammar, each production A → α has associated with it a set of side-effect-free semantic rules of the form

    b := f(c1, c2, ..., ck)

where f is a function, and either

1. b is a synthesized attribute of A and c1, c2, ..., ck are attributes belonging to the grammar symbols on the right-hand side of the production, or

2. b is an inherited attribute of one of the grammar symbols on the right-hand side of the production, and c1, c2, ..., ck are attributes belonging to A or any grammar symbols on the right-hand side of the production.

Attribute Grammars

- The desk calculator example in YACC was an example of an attribute grammar using only synthesized attributes.
  - Each production rule effectively had one attribute (the value of the corresponding subexpression).
  - The value of the attribute for a given production was synthesized from the values of the attributes for its RHS symbols.

Attribute Grammars

- Example of inherited attributes in processing declarations.
  - Applicable in languages where the type specification precedes the list of variables declared.

    Production          Semantic Rules
    D ::= T L           L.in := T.type
    T ::= int           T.type := INTEGER
    T ::= float         T.type := REAL
    L ::= L1 , ID       L1.in := L.in ; addtype(ID.entry, L.in)
    L ::= ID            addtype(ID.entry, L.in)

Type Checking

- Type checking must execute the same steps as expression evaluation.
  - Effectively we are executing the expression at compile time for type information only.
- This is a bottom-up procedure in the parse tree.
  - We know the types of things at the leaves of a parse tree corresponding to an expression.

Type Checking

- Literals have an associated type (stored in the literal table).
- Identifiers have an associated type (stored in the symbol table).
- When we encounter a parse tree node corresponding to some operator, if the operand sub-trees are leaves, we know their types and can check that the types are valid for the given operator.
- Furthermore, we can determine, according to our type system, the type resulting from the application of the operator to the two operands of known types.
  - This resulting type may then be associated with the parse tree node being type checked, so that nodes which use it as an operand may also be type checked.

Type Checking

- In general, each parse tree node (corresponding to an expression) may be type checked once its operand nodes (subexpression parse trees) have been type checked.
- Consider the following example of type checking the expression X + Y * Z:

    Symbol Table
      X   INT
      Y   INT
      Z   REAL

            +        int + real -> real
           / \
          X   *      int * real -> real
             / \
            Y   Z

- Type is a synthesized attribute.
- The type checking process just sketched assumes that there exists an expression tree.
  - The expression tree must reflect the evaluation order of expressions specified in the source program (i.e. precedence).
  - The expression tree should not contain anything except identifiers and literals at the leaves, and operators and links to expression sub-trees in the internal nodes.
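
The bottom-up procedure can be sketched as a recursive walk over such an expression tree. The node layout, the enum names, and the rule that "int op real" yields real (taken from the example above) are assumptions made for illustration, not a complete type system.

```c
#include <stddef.h>

enum type { T_INT, T_REAL, T_ERROR };

/* Expression-tree node: a leaf has op == 0 and a type taken from the symbol
   or literal table; an internal node has a binary operator and two subtrees. */
struct expr {
    char op;                   /* '+', '*', ... or 0 for a leaf */
    enum type type;            /* leaf: known; internal: synthesized below */
    struct expr *left, *right;
};

/* Result type of applying a binary arithmetic operator to two operand types. */
static enum type result_type(enum type a, enum type b) {
    if (a == T_ERROR || b == T_ERROR)
        return T_ERROR;
    return (a == T_REAL || b == T_REAL) ? T_REAL : T_INT;
}

/* Check the operands first, then record and return this node's type. */
enum type type_check(struct expr *e) {
    if (e == NULL)
        return T_ERROR;
    if (e->op == 0)
        return e->type;
    e->type = result_type(type_check(e->left), type_check(e->right));
    return e->type;
}
```

Because each internal node stores its synthesized type, a later code generation pass could consult it to decide where to emit promotions or conversions.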

Type Checking

- This expression tree must somehow be generated from the parse tree or the input source program.
- The parse tree reflects the grammar rules used to recognize the expression in question and does not reflect operation precedence in any way.
- Special parsing techniques must be used if expression trees are to be produced directly from the source code.

Generating Expression Trees

- An expression tree may be generated from the correct expression (or from a depth-first walk of the corresponding parse tree considering only leaf nodes and operators) using a simple algorithm (possibly seen in ...).
- The algorithm essentially describes what an operator precedence parser does via shifts and reductions.

Generating Expression Trees

- Stacks of operators and operands are needed.
- All operators (including parentheses) have priorities assigned to them (which reflect precedence of operations).
- The comparative priorities of the current operator and the one on top of the stack determine whether evaluation should be done or whether additional stacking should occur.
  - "Evaluation" may be the generation of part of an expression tree, the generation of code, or the actual evaluation of an expression or its type.

Generating Expression Trees

Example operator precedence table:

    Operator:  (    **    *,/    +,-    )    Λ
    Priority:  (values lost in transcription)

The algorithm is as follows:
- Initially the operator stack has only the lowest precedence symbol (Λ) on it.
- Scan input expressions left to right, and whenever an operand (identifier or literal) is encountered, push it onto the operand stack.
- When an operator is encountered, if it is of lower priority than the operator on top of the stack, then evaluate sub-expressions (1 operator, 2 operands) off the stacks until the condition is no longer true. (For left-associative operators, equal priority is treated the same way.)
- If it is of higher priority, then push it onto the operator stack.

Generating Expression Trees

- If a pair of parentheses is ever on top of the stack (with no expression between), remove them.
- At the end of the expression, evaluate subexpressions off the stacks until Λ is found.
- At this point, the result of the evaluation is on top of the operand stack.

Expression: a + b * ( c - f )

    Step   Operands (bottom to top)   Operators (bottom to top)
     1                                Λ
     2     a                          Λ
     3     a                          Λ +
     4     a b                        Λ +
     5     a b                        Λ + *
     6     a b                        Λ + * (
     7     a b c                      Λ + * (
     8     a b c                      Λ + * ( -
     9     a b c f                    Λ + * ( -
    10     a b c f                    Λ + * ( - )
    11     a b (c-f)                  Λ + * ( )
    12     a b (c-f)                  Λ + *

This is just another example of shift-reduce parsing! What made the LR parse we did LR was how the parse table was built, not how the parse was performed!
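
The two-stack algorithm can be sketched in C for the case where "evaluation" means computing an integer value. The priorities in prio() are assumed for the example (the table values are not available here), '$' stands in for the bottom marker Λ, and the input is assumed to be a non-empty, well-formed expression over non-negative integers with +, -, *, / and parentheses that fits in the fixed-size stacks.

```c
#include <ctype.h>

/* Assumed priorities; '(' and the bottom marker '$' are lowest. */
static int prio(char op) {
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;
    }
}

static long vals[64]; static int nv;   /* operand stack  */
static char ops[64];  static int no;   /* operator stack */

/* Pop one operator and two operands, then push the result. */
static void reduce(void) {
    char op = ops[--no];
    long r = vals[--nv], l = vals[--nv];
    vals[nv++] = op == '+' ? l + r : op == '-' ? l - r
               : op == '*' ? l * r : l / r;
}

long eval(const char *s) {
    nv = 0; no = 0;
    ops[no++] = '$';
    for (; *s; s++) {
        if (isspace((unsigned char)*s))
            continue;
        if (isdigit((unsigned char)*s)) {            /* operand: push it */
            long v = 0;
            while (isdigit((unsigned char)*s))
                v = v * 10 + (*s++ - '0');
            s--;                                     /* the loop header advances again */
            vals[nv++] = v;
        } else if (*s == '(') {
            ops[no++] = '(';
        } else if (*s == ')') {                      /* reduce back to the matching '(' */
            while (ops[no - 1] != '(')
                reduce();
            no--;
        } else {                                     /* binary operator */
            while (prio(ops[no - 1]) >= prio(*s))    /* >= gives left associativity */
                reduce();
            ops[no++] = *s;
        }
    }
    while (ops[no - 1] != '$')                       /* end of input: drain the stack */
        reduce();
    return vals[nv - 1];
}
```

Replacing the arithmetic in reduce() with code that builds a tree node, emits an instruction, or computes a result type gives the other forms of "evaluation" mentioned above.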

Compilation Passes

- In some compilers, semantic analysis (and subsequent phases of compilation) are integrated with the parser.
- These systems are referred to as single-pass compilers.
  - They make a single pass over the source code (or its representation).
  - The quality of the resulting code is limited, but they are quick and dirty to build (good for student and prototype compilers, etc.).
- Multiple-pass compilers go over the source code / parse tree / intermediate form multiple times.
  - They typically produce more efficient results.
  - They require greater compilation effort (and hence time).
  - Multiple passes are mandatory for some languages (e.g. if there is no declaration before use) and for compiler optimizations.

Code Generation

- Once the source code has been scanned, parsed, and semantically analyzed, code generation may be performed.
- Code generation is the process of creating assembly/machine language statements which will perform the operations specified by the source program when they are run.
  - It is not the process of actually doing what the source code says (that's interpretation).
- The process of code generation involves producing assembly or machine language code for (typically) each internal node in the parse tree.
  - We will assume assembly code is being generated, for readability purposes.
  - E.g. a while statement (<whilestmt>) might generate a lot of code, while an addition (<plusop>) might generate a single instruction.

Code Generation

- In addition, other code is also produced.
  - Typically assembler directives are produced, e.g. storage allocation statements for each variable and literal in the program.
- Unoptimized code generation is relatively straightforward.
  - It uses simple mappings of high-level language (HLL) constructs to assembly/machine code sequences.
  - The resulting code is pretty poor, though (compared to manual coding).

Handling Declarations

- Space must be allocated for each variable declared in the source program.
  - Also for literals such as strings and integer and real constants.
- Where that space will reside in memory is determined by the memory model employed and the storage class of the variable.

Handling Declarations

- The memory model typically specifies where in the program's address space global, statically allocated data should be placed.
  - We assume storage beginning at location zero.
- Dynamic data (created for each subroutine invocation) is created on the stack.
  - This is done at run time. At compile time, the space needed must be determined so an instruction sequence may be generated to allocate the space.
  - Addresses assigned to dynamic data are relative to some point on the run-time stack.

Handling Declarations

- The amount of space allocated is determined by the variable's type.
  - E.g. 4 bytes for an integer, 2 bytes for a short, 1 byte for a char, 4 bytes for a float, 8 bytes for a double, ...
- Space allocation on an ideal machine could thus proceed by starting at the lowest possible address and allocating enough bytes for each variable in turn.

Consider:

    Sizes:  short 2, int 4, float 4, double 6, char 1

    Declarations:
        main () {
            int x, y;
            char a, b, c;
            short z;
            double s, t;
            float ar[7];
            int i;
            ...
        }

    Symbol Table
    Name   Type     Kind       Storage   Address
    x      int      scalar     static     0
    y      int      scalar     static     4
    a      char     scalar     static     8
    b      char     scalar     static     9
    c      char     scalar     static    10
    z      short    scalar     static    11
    s      double   scalar     static    13
    t      double   scalar     static    19
    ar     float    array[7]   static    25
    i      int      scalar     static    53

Handling Declarations

- In the real world, allocation is complicated by machine requirements for data alignment.
  - E.g. doubles might have to be aligned to addresses which are multiples of 8.
  - If a double is to be allocated to an address A = 1 (mod 8), then that double must actually be assigned the address A+7.
- This adds complexity and wastes space (especially in large arrays of structures).

Consider:

    Sizes:  short 2, int 4, float 4, double 6, char 1
    Alignments match data sizes.

    Declarations:
        main () {
            int x, y;
            char a, b, c;
            short z;
            double s, t;
            float ar[7];
            int i;
            ...
        }

    Symbol Table
    Name   Type     Kind       Storage   Address
    x      int      scalar     static     0
    y      int      scalar     static     4
    a      char     scalar     static     8
    b      char     scalar     static     9
    c      char     scalar     static    10
    z      short    scalar     static    12
    s      double   scalar     static    18
    t      double   scalar     static    24
    ar     float    array[7]   static    32
    i      int      scalar     static    60
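
The aligned allocation above amounts to rounding the next free address up to a multiple of the alignment before placing each variable. A minimal sketch, assuming (as in the example) that each variable's alignment equals its element size; struct var, align_up, and allocate are hypothetical names:

```c
#include <stddef.h>

/* Round addr up to the next multiple of align (align must be > 0). */
static size_t align_up(size_t addr, size_t align) {
    return (addr + align - 1) / align * align;
}

struct var {
    const char *name;
    size_t elem_size;   /* size of one element; also used as the alignment */
    size_t count;       /* 1 for a scalar, N for array[N] */
    size_t offset;      /* filled in by allocate() */
};

/* Give each variable the next suitably aligned offset, starting at 0.
   Returns the total number of bytes reserved. */
size_t allocate(struct var *vars, size_t n) {
    size_t next = 0;
    for (size_t i = 0; i < n; i++) {
        vars[i].offset = align_up(next, vars[i].elem_size);
        next = vars[i].offset + vars[i].elem_size * vars[i].count;
    }
    return next;
}
```

Fed the ten declarations from the example with the sizes listed there, this reproduces the addresses in the second symbol table (z at 12, s at 18, t at 24, ar at 32, i at 60).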

Handling Declarations

- The space is uninitialized, so all we have to do is reserve it.
  - Use the RSB (Reserve Storage Bytes) directive, e.g. RSB 4 to reserve four bytes at the current location for an integer.
  - RSB might be called DSB or DS.W or ... on a particular assembler.
- Once the address of a variable is known, it can be stored in its symbol table entry.
- Storing addresses is not necessary for compilers which generate assembly language.
  - They can use symbolic names.
- If generating machine code:
  - There are no symbolic names, so when referring to a variable we must specify the actual address.
  - This is awkward when debugging generated code, but otherwise not difficult.

Handling Literals

- Similar to handling declarations.
  - Space is allotted as for global static variables.
- The difference between literals and variables is that the space must be initialized to hold the expected values.
- This is normally accomplished by using a different assembler directive, e.g. DC.W 27.
- If an assembler does not support this, an alternative (albeit one with run-time overhead) is to generate code to store the needed values into the reserved storage areas as initial processing in the main routine.
  - This is very unlikely to be needed, as most assemblers support initialized storage.
- Again, once the address is known, it may be recorded (this time) in the literal table.

Handling Control Structures

- Let's consider the kind of code that should be generated for each control structure.
- We will use a hypothetical assembly language (AL). In the sequences below, TST Rx followed by BNEQ is used to branch when <expr> evaluated to false.

The WHILE statement

    HLL code sequence:
        Si
        WHILE (<expr>) DO BEGIN
            <stmts>
        END
        Sj

    Corresponding AL code sequence:
                    code for Si
        lab_again:  code to evaluate <expr> into Rx
                    TST Rx
                    BNEQ lab_exit
                    code for <stmts>
                    BRA lab_again
        lab_exit:   code for Sj

The REPEAT statement

    HLL code sequence:
        Si
        REPEAT
            <stmts>
        UNTIL (<expr>);
        Sj

    Corresponding AL code sequence:
                    code for Si
        lab_again:  code for <stmts>
                    code to evaluate <expr> into Rx
                    TST Rx
                    BNEQ lab_again
                    code for Sj

Handling Control Structures

The FOR statement

    HLL code sequence:
        Si
        FOR i := LB TO UB DO BEGIN
            <stmts>
        END;
        Sj

    Corresponding AL code sequence:
                    code for Si
                    MOVE #LB,i
        lab_again:  CMP i,#UB
                    BGT lab_exit
                    code for <stmts>
                    INC i
                    BRA lab_again
        lab_exit:   code for Sj

    (The loop exits only once i exceeds UB, so the iteration with i = UB is still executed.)

The IF-THEN construct

    HLL code sequence:
        Si
        IF (<expr>) THEN BEGIN
            <stmts>
        END;
        Sj

    Corresponding AL code sequence:
                    code for Si
                    code to evaluate <expr> into Rx
                    TST Rx
                    BNEQ lab_exit
                    code for <stmts>
        lab_exit:   code for Sj

Handling Control Structures

The IF-THEN-ELSE construct

    HLL code sequence:
        Si
        IF (<expr>) THEN BEGIN
            <stmts1>
        END
        ELSE BEGIN
            <stmts2>
        END;
        Sj

    Corresponding AL code sequence:
                    code for Si
                    code to evaluate <expr> into Rx
                    TST Rx
                    BNEQ lab_else
                    code for <stmts1>
                    BRA lab_exit
        lab_else:   code for <stmts2>
        lab_exit:   code for Sj

The CASE statement

    HLL code sequence:
        Si
        CASE (<expr>) OF
            cond1: <stmts1>
            cond2: <stmts2>
            OTHERWISE <stmtso>
        END;
        Sj

    Corresponding AL code sequence:
                    code for Si
                    code to evaluate <expr> into Rx
                    CMP Rx,#cond1
                    BNEQ lab_next1
                    code for <stmts1>
                    BRA lab_exit
        lab_next1:  CMP Rx,#cond2
                    BNEQ lab_next2
                    code for <stmts2>
                    BRA lab_exit
        lab_next2:  code for <stmtso>
        lab_exit:   code for Sj

Handling Control Structures

- This is all great when each control structure occurs in isolation, but what if they occur in a sequence or are nested?
- The problem in this case is really generating unique labels and keeping track of which ones to use in each case.
- This is the only real trick to generating simple code for control structures; generating the code is otherwise quite trivial.
- Really, it is quite easy to deal with too:
  - Have a routine that returns unique labels on successive calls.
  - As each control structure node is encountered in the parse tree (e.g. <while>), call the routine to generate as many labels as are required.
  - Use them as needed (you know which labels to pair up based on which instance of the control structure you are generating code for).
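
A label routine of the kind described can be very small. The sketch below assumes labels of the form L000, L001, ... and the name newlabel; because the result lives in a static buffer, the caller is expected to copy it before asking for the next label.

```c
#include <stdio.h>

/* Return a fresh label ("L000", "L001", ...) on each call.
   The string is held in a static buffer that the next call overwrites,
   so callers should copy it if they need to keep it. */
const char *newlabel(void) {
    static int next = 0;
    static char buf[16];
    snprintf(buf, sizeof buf, "L%03d", next++);
    return buf;
}
```

A code generation routine for a nested construct would copy each label into a local array on entry, so that inner constructs can request further labels without disturbing the outer ones.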

Handling Control Structures

Consider the following:

    void cg_ifthenelse(pt_ptr nd)
    {
        char LabElse[16], LabExit[16];
        int RegNum;

        strcpy(LabElse, newlabel());
        strcpy(LabExit, newlabel());
        RegNum = cg_expr(nd->child);                /* condition into a register */
        printf("TST %d\n", RegNum);
        printf("BNEQ %s\n", LabElse);
        cg_stmts(nd->child->sibling);               /* THEN part */
        printf("BRA %s\n", LabExit);
        printf("%s:\n", LabElse);
        cg_stmts(nd->child->sibling->sibling);      /* ELSE part */
        printf("%s:\n", LabExit);
    }

Handling Assignments

- An assignment statement has a variable (possibly an array element) on the LHS and an expression on the RHS.
- Processing assignment statements consists of evaluating the expression on the RHS and leaving the result at a known location.
- Once this is done, the value must be stored in the location determined by the LHS.

Handling Assignments

Consider the following:

    void cg_assign(TreePtr nd)
    {
        SymPtr  LHS;
        TreePtr RHS;
        int     RegNum;

        LHS = (SymPtr) nd->child;                   /* symbol being assigned to */
        RHS = nd->child->sibling;                   /* expression tree */
        RegNum = cg_expr(RHS);                      /* result register */
        printf("STORE %d,%s\n", RegNum, LHS->symname);
    }

- The preceding code works for scalar variables on the LHS, but not for array elements.
- Since assemblers typically do not support indexing operations (after all, it's a 1:1 mapping between assembler and machine code), the compiler must generate code to determine the address of the element being stored to.

Handling Assignments

- This requires some address arithmetic.
  - We know the base address of the array (either explicitly or symbolically), and we must add an offset to this corresponding to the index value.
  - In general, the calculation is:

        EA = base_address + index * element_size

    assuming the array is indexed 0..upper.
  - To use this formula, you must normalize array references so they begin at element zero.
  - It generalizes to multi-dimensional arrays.
- The LHS of an assignment specifies both an array name and an expression specifying the array subscript (e.g. myarray[i+7]).

    LHS = nd->child;
    ArSym = (SymPtr) LHS->child;
    RegNum = cg_expr(LHS->child->sibling);      /* subscript expression */
    /* note: scaling the subscript by element_size is not shown here */
    printf("ADD %d,%s,%d\n", RegNum, ArSym->symname, RegNum);
    /* effective address is now in register RegNum */
    RegNum2 = cg_expr(RHS);                     /* RHS expression */
    printf("STORE %d,%d\n", RegNum2, RegNum);
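
As a worked form of the formula, the sketch below computes the effective address for an array whose lowest index is `lower`, which is where the normalization to element zero happens. The two-dimensional variant assumes row-major layout with zero-based indices; that layout, and the function names, are assumptions made for the example.

```c
#include <stddef.h>

/* Effective address of a[index] for an array declared a[lower..upper].
   Subtracting lower normalizes the reference to begin at element zero. */
size_t element_address(size_t base, long index, long lower, size_t element_size) {
    return base + (size_t)(index - lower) * element_size;
}

/* Two-dimensional case, assuming row-major storage with ncols columns
   and zero-based row and column indices. */
size_t element_address_2d(size_t base, size_t row, size_t col,
                          size_t ncols, size_t element_size) {
    return base + (row * ncols + col) * element_size;
}
```

For a 4-byte-element array based at address 32, element 3 is at 32 + 3 * 4 = 44.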

Handling Expressions

- Much of the material in the parse tree for an expression is often unnecessary and exists solely to provide a means by which the grammar for expressions may be specified.
  - Consider the <term> and <factor> grammar.
- It makes little sense to build the parse tree this way, even though expressions may have to be parsed this way (according to the grammar or due to parser limitations).
- A parse tree with this structure fails to consider precedence of operations, which determines what an expression is meant to calculate.
  - Remember operator grammars and YACC's precedence rules.
- We will assume that a precedence-reflecting expression tree exists within the parse tree, rooted wherever a node corresponding to an expression occurs.

Handling Expressions

- How do we generate code for expressions, given such an expression tree which reflects precedence?
  - It's not too hard.
- Some general things:
  - If we have a reference to an identifier, we generate code which either symbolically or absolutely refers to its address.
  - Similarly, we generate an address when we are referencing a constant (that way we only store constants once; that's what the literal table was all about).
  - We handle references to array elements as described for processing assignment statements.
  - We have to be able to distinguish between unary and binary operators; assume this is encoded in the expression tree (as per previous discussions).

Handling Expressions

- Generating code for an expression (cg_expr) is an exercise in recursion.
- If we have a unary operator, we call cg_expr recursively for the single operand and then generate code to apply the unary operator to what is produced.

    int cg_unaryminus(pt_ptr nd)
    {
        int RegNum;

        RegNum = cg_expr(nd->child);
        printf("NEG %d\n", RegNum);
        return RegNum;                  /* register holding the result */
    }

- If we have a binary operator, we call cg_expr to evaluate the left operand, then we call it to evaluate the right operand, then we generate code to apply the binary operator to the two operands.

    int cgtimes(pt_ptr nd)
    {
        int RegNum1, RegNum2;

        RegNum1 = cg_expr(nd->child);
        RegNum2 = cg_expr(nd->child->sibling);
        printf("MUL %d,%d\n", RegNum1, RegNum2);
        return RegNum1;                 /* assumes MUL leaves its result in the first register */
    }

Handling Expressions

- Throughout this tour of code generation we have simply assumed that we got back register numbers as needed.
  - This really simplifies things but is unrealistic.
- Managing registers, as is necessary in most real-world compilers, is both hard and very important to do.
  - Think about it! Which register to use? When to use it? What if we run out of registers?

Optimization

- This is just an overview!
- Two basic types of optimization:
  - machine independent
  - machine dependent
- Other taxonomies of optimization divide things up differently:
  - global optimization (considering the whole program or routine)
  - local optimization (within a basic block, discussed later)

Optimization

  - peephole optimization (considering only a small sequence of instructions or statements)
- Much optimization is done to compensate for compiler rather than programmer deficiencies.
  - It is convenient to let the compiler do stupid things early on and then fix them up later, i.e. generate unoptimized code first.

Machine Independent Optimization

- Machine independent optimization is typically done using the intermediate form as a base (as opposed to assembly or machine code).
- It does not consider any details of the target architecture in making optimization decisions.
- Such optimizations tend to be very general in nature.

Machine Independent Optimization

- E.g. determining common sub-expressions so they only have to be evaluated once:

    root1 = (-b + sqrt(b*b - 4*a*c))/(2*a);
    root2 = (-b - sqrt(b*b - 4*a*c))/(2*a);

  becomes

    subexpr1 = sqrt(b*b - 4*a*c);
    subexpr2 = 2*a;
    root1 = (-b + subexpr1)/subexpr2;
    root2 = (-b - subexpr1)/subexpr2;

Machine Dependent Optimization

- Machine dependent optimizations are performed on assembly or machine code.
  - They are specific to the target machine architecture.
- Such optimizations are extremely specific.
- Example: integer multiplication by a power of two is often more efficient to accomplish by generating shift-left instructions than multiply instructions.
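
The decision behind the shift example can be sketched as a helper that a code generator might call when one operand of a multiplication is a known constant. The function name is hypothetical, and whether a shift is actually cheaper depends on the target machine.

```c
/* If n is a power of two, return the shift count that replaces "multiply
   by n" (that is, log2 of n); otherwise return -1 to keep the multiply. */
int shift_for_multiplier(unsigned n) {
    if (n == 0 || (n & (n - 1)) != 0)   /* zero, or more than one bit set */
        return -1;
    int s = 0;
    while ((n >>= 1) != 0)
        s++;
    return s;
}
```

For a multiplier of 8 this returns 3, so the generator could emit a shift left by 3 in place of the multiply instruction.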

Machine Dependent Optimization

- Some machines support special instructions for implementing counting loops (DBxx on M680x0).
- On /370 machines such an instruction is commonly used to decrement by one (never branching).
- These special sequences are called idioms.
- There are other, more complex, optimizations too.
  - They are often register related.

Program Analysis

- To perform optimization, it is necessary to analyze the program code to derive enough information to know if the optimization to be performed is valid.
- This is especially true of machine independent optimization, where the scope of application of the optimizations is commonly larger.
- Optimization is tricky, so you must be careful!

Program Analysis

- E.g. common sub-expressions revisited:

    X := A + B;
    if (C > 27) THEN
        A++;
    else
        D--;
    Y := Z * (A + B)

  Any common subexpressions? NO, THERE ARE NOT! A may have been changed on one path between the two occurrences of A+B.
- Clearly, control flow affects optimization.

Basic Blocks

- A first step in analyzing programs is to divide up the statements within a routine into a collection of basic blocks.
- A basic block is a sequence of statements where, if the first statement is executed, all the statements will be.
  - I.e. it is a block of code devoid of control flow.

Basic Blocks

- We determine the basic blocks by finding the leader statements in a routine.
- A statement is a leader if it is the first statement, if it is the target of a branch statement, or if it immediately follows a branch statement (conditional or unconditional).
- A basic block begins with a leader and consists of the leader and all statements up to but not including the next leader.

Consider the following example:

    Original code:
        Si
        Sj
        IF (expr1) THEN
            Sx
        ELSE
            WHILE (expr2) DO
                Sy
            ENDWHILE
        ENDIF
        Sk
        Sm

    Basic blocks:
        BB1:          Si
                      Sj
                      TST expr1
                      BNEQ L000
        BB2:          Sx
                      BRA L001
        BB3:  L000:   TST expr2
                      BNEQ L001
        BB4:  L001:   Sk
                      Sm
        BB5:          Sy
                      BRA L000

    (In the generated code BB5 sits directly after BB3, which falls through into it.)
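
Finding leaders can be sketched as a single pass over the instruction list. The instruction representation below (a branch flag plus a target index) is an assumption made for the example, and the sketch treats the statement after any branch, conditional or not, as a leader.

```c
#include <string.h>

/* A simplified instruction: is it a branch, and if so, to which
   instruction index? (target is ignored when is_branch == 0) */
struct instr { int is_branch; int target; };

/* Set leader[i] = 1 for every leader among the n instructions, else 0. */
void find_leaders(const struct instr *code, int n, int *leader) {
    memset(leader, 0, (size_t)n * sizeof *leader);
    if (n > 0)
        leader[0] = 1;                           /* the first statement */
    for (int i = 0; i < n; i++) {
        if (!code[i].is_branch)
            continue;
        if (code[i].target >= 0 && code[i].target < n)
            leader[code[i].target] = 1;          /* the target of a branch */
        if (i + 1 < n)
            leader[i + 1] = 1;                   /* the statement after a branch */
    }
}
```

Each basic block then runs from one leader up to, but not including, the next.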

Control Flow Graphs

- The basic blocks in a routine may be threaded to reflect the possible flow of control through a program.
- The resulting structure is a control flow graph (CFG for short).
- Threading is easily done if labels can be mapped to basic blocks.
- If there is a branch statement from BBi to a label in BBj, then add an edge in the CFG from BBi to BBj.
- Edges must also be created where control flow may fall through to another basic block, i.e. after any conditional branch.
- The CFG is the basic data structure for many optimizations.

Control Flow Graph Example

Consider the control flow graph for the previous (basic blocks) example:

    BB1:  Si; Sj; TST expr1; BNEQ L000
    BB2:  Sx; BRA L001
    BB3:  L000: TST expr2; BNEQ L001
    BB4:  L001: Sk; Sm
    BB5:  Sy; BRA L000

    Edges (by the rules above):
        BB1 -> BB2 (fall through), BB1 -> BB3 (branch to L000)
        BB2 -> BB4 (branch to L001)
        BB3 -> BB5 (fall through), BB3 -> BB4 (branch to L001)
        BB5 -> BB3 (branch to L000)

Data Flow Analysis

- Once we have constructed the control flow graph, we can use it to solve a number of optimization problems.
- Many of these may be solved using data flow analysis, including determining common subexpressions in the presence of control flow.

Data Flow Analysis

- As the name suggests, data flow analysis is concerned with the flow of data through a program.
- This is determined in part by the control flow of the program, and hence it uses the CFG for implementation purposes.
- Data flow problems are formulated as sets of equations solved iteratively until convergence.
- The easiest way to understand data flow analysis is to consider a real problem.
- The problem we will examine is the reaching definitions problem.
- A definition of a variable is an assignment to it, and we are interested in knowing which possible definitions may reach a given point in the program.

Data Flow Analysis

- A point in the program is typically a use of the variable.
- In other words, we want to know where a value we are about to use may have come from (computationally).

    X := expr1;
    IF (expr2) THEN
        X := expr3;
    ENDIF
    Y := X + 7;      <- Question: which definitions of X may reach this use?
                        Answer: either one

Reaching Definitions Equations

- The following data flow equations specify a solution to the reaching definitions problem:

    gen[B]  = { the set of definitions generated in block B }
    kill[B] = { the set of previous definitions killed in block B }

- Now we must define equations for each basic block which use the CFG information to propagate the definitions from one basic block to another.

Reaching Definitions Equations

- This is done based on the type of control flow existing between basic blocks.
- For programs using only structured control constructs, this is relatively straightforward; for others it can be very difficult.
  - E.g. analyzing spaghetti FORTRAN code from dusty decks.
- Consider some basic rules:

    S = d: a := b + c            (a single definition d; Da is the set of all definitions of a)
        gen[S]  = {d}
        kill[S] = Da - {d}
        out[S]  = gen[S] ∪ (in[S] - kill[S])

    S = S1 ; S2                  (S1 followed by S2)
        gen[S]  = gen[S2] ∪ (gen[S1] - kill[S2])
        kill[S] = kill[S2] ∪ (kill[S1] - gen[S2])
        in[S1]  = in[S]
        in[S2]  = out[S1]
        out[S]  = out[S2]

Reaching Definitions Equations

    S = S1 or S2                 (two alternative branches, as in IF-THEN-ELSE)
        gen[S]  = gen[S1] ∪ gen[S2]
        kill[S] = kill[S1] ∩ kill[S2]
        in[S1]  = in[S]
        in[S2]  = in[S]
        out[S]  = out[S1] ∪ out[S2]

    S = loop around S1
        gen[S]  = gen[S1]
        kill[S] = kill[S1]
        in[S1]  = in[S] ∪ gen[S1]
        out[S]  = out[S1]

- We can apply these equations iteratively to compute the needed reaching definitions information.
- The iteration is needed because of the presence of loops.
- Effectively, we continue re-calculating the outs and ins until the system stabilizes, i.e. we reach a fixed point in the computation.

Reaching Definitions Implementation

- We need a way of representing sets within our optimizer.
- A simple and efficient representation of sets uses bit strings.
  - For reaching definitions, we will have one bit per definition.
- Consider the following algorithm for computing reaching definitions:

    initialize in[B] to the empty set for all B
    FOR each block B DO out[B] := gen[B]
    change := TRUE
    WHILE change DO BEGIN
        change := FALSE
        FOR each block B DO BEGIN
            in[B] := union of out[P] over every P that is a predecessor of B
            oldout := out[B]
            out[B] := gen[B] ∪ (in[B] - kill[B])
            IF out[B] <> oldout THEN change := TRUE
        END
    END
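
With one bit per definition, the algorithm above maps directly onto bitwise operations. The sketch below assumes at most as many definitions as there are bits in an unsigned int and at most MAX_PREDS predecessors per block; the struct layout and names are hypothetical.

```c
#define MAX_PREDS 4

struct block {
    unsigned gen, kill;        /* bit i set means definition d(i) is in the set */
    unsigned in, out;          /* computed by reaching_definitions() */
    int npreds;
    int preds[MAX_PREDS];      /* indices of predecessor blocks in the CFG */
};

/* Recompute in/out for all blocks until nothing changes.
   Returns the number of passes made over the blocks. */
int reaching_definitions(struct block *b, int n) {
    int passes = 0, change = 1;
    for (int i = 0; i < n; i++) {
        b[i].in = 0;
        b[i].out = b[i].gen;
    }
    while (change) {
        change = 0;
        passes++;
        for (int i = 0; i < n; i++) {
            unsigned in = 0;
            for (int p = 0; p < b[i].npreds; p++)
                in |= b[b[i].preds[p]].out;           /* union over predecessors */
            unsigned oldout = b[i].out;
            b[i].in = in;
            b[i].out = b[i].gen | (in & ~b[i].kill);  /* gen union (in minus kill) */
            if (b[i].out != oldout)
                change = 1;
        }
    }
    return passes;
}
```

As a small check, take three blocks: B0 holding d0 (X := expr1), B1 holding d1 (X := expr3) and reached only from B0, and B2 holding the use, reached from both. Then in[B2] comes out as {d0, d1}: either definition of X may reach the use.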

Reaching Definitions Example

Consider the following example:

    B1:  d1: i := m - 1          gen[B1]  = {d1, d2, d3}
         d2: j := n              kill[B1] = {d4, d5, d6, d7}
         d3: a := u1

    B2:  d4: i := i + 1          gen[B2]  = {d4, d5}
         d5: j := j - 1          kill[B2] = {d1, d2, d7}

    B3:  d6: a := u2             gen[B3]  = {d6}
                                 kill[B3] = {d3}

    B4:  d7: i := u3             gen[B4]  = {d7}
                                 kill[B4] = {d1, d4}

    [Table: in[B] and out[B] for blocks B1 to B4 at initialization, after pass 1, and after pass 2; the values were lost in transcription.]

Continue this process until the system stabilizes!

Some Sample Optimizations

Elimination of Common Sub-expressions: find sub-expressions which have the same value and calculate them only once.

Removing Loop-Invariant Code: any code within a loop which does not depend on the loop variable (directly or indirectly) may be moved out of the loop.

Dead Code Elimination: code which will never be executed is discarded.

Strength Reduction: use less expensive operation sequences (e.g. X^3 = X*X*X).

Algebraic Optimization: exploit mathematical properties to optimize (e.g. X*1 = X).

Constant Expression Evaluation: evaluate constant expressions at compile time, using the arithmetic of the target architecture.
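Three of these optimizations (constant expression evaluation, algebraic optimization and strength reduction) can be sketched on a tiny expression tree. The node types `Num`, `Var` and `BinOp` and the function `simplify` are hypothetical and exist only for this illustration. The sketch also folds constants with Python's unbounded integers, whereas a real compiler would have to reproduce the target machine's arithmetic.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # one of "+", "*", "^"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]


def simplify(e):
    """Rewrite an expression bottom-up using three simple rules."""
    if not isinstance(e, BinOp):
        return e
    left, right = simplify(e.left), simplify(e.right)
    # Constant expression evaluation (ASSUMPTION: unbounded Python integers).
    if isinstance(left, Num) and isinstance(right, Num):
        if e.op == "+":
            return Num(left.value + right.value)
        if e.op == "*":
            return Num(left.value * right.value)
        if e.op == "^" and right.value >= 0:
            return Num(left.value ** right.value)
    # Algebraic optimization: X*1 = X and 1*X = X.
    if e.op == "*":
        if right == Num(1):
            return left
        if left == Num(1):
            return right
    # Strength reduction: X^3 = X*X*X.
    if e.op == "^" and right == Num(3):
        return BinOp("*", BinOp("*", left, left), left)
    return BinOp(e.op, left, right)


x = Var("x")
print(simplify(BinOp("+", Num(2), Num(3))))
print(simplify(BinOp("*", x, Num(1))))
print(simplify(BinOp("^", x, Num(3))))
```

Because sub-expressions are simplified first, one rule can enable another: in `(2 + -1) * y`, the sum is folded to 1, and the product then reduces to `y`.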

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

More information

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions Agenda CS738: Advanced Compiler Optimizations Data Flow Analysis Amey Karkare karkare@cse.iitk.ac.in http://www.cse.iitk.ac.in/~karkare/cs738 Department of CSE, IIT Kanpur Static analysis and compile-time

More information

Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization

Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES (2018-19) (ODD) Code Optimization Prof. Jonita Roman Date: 30/06/2018 Time: 9:45 to 10:45 Venue: MCA

More information

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology exam Compiler Construction in4303 April 9, 2010 14.00-15.30 This exam (6 pages) consists of 52 True/False

More information

Semantic actions for declarations and expressions

Semantic actions for declarations and expressions Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

Semantic actions for declarations and expressions. Monday, September 28, 15

Semantic actions for declarations and expressions. Monday, September 28, 15 Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

COMPILER CONSTRUCTION Seminar 02 TDDB44

COMPILER CONSTRUCTION Seminar 02 TDDB44 COMPILER CONSTRUCTION Seminar 02 TDDB44 Martin Sjölund (martin.sjolund@liu.se) Adrian Horga (adrian.horga@liu.se) Department of Computer and Information Science Linköping University LABS Lab 3 LR parsing

More information

CS 406/534 Compiler Construction Putting It All Together

CS 406/534 Compiler Construction Putting It All Together CS 406/534 Compiler Construction Putting It All Together Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy

More information

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Crafting a Compiler with C (II) Compiler V. S. Interpreter Crafting a Compiler with C (II) 資科系 林偉川 Compiler V S Interpreter Compilation - Translate high-level program to machine code Lexical Analyzer, Syntax Analyzer, Intermediate code generator(semantics Analyzer),

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

A main goal is to achieve a better performance. Code Optimization. Chapter 9

A main goal is to achieve a better performance. Code Optimization. Chapter 9 1 A main goal is to achieve a better performance Code Optimization Chapter 9 2 A main goal is to achieve a better performance source Code Front End Intermediate Code Code Gen target Code user Machineindependent

More information

Why Global Dataflow Analysis?

Why Global Dataflow Analysis? Why Global Dataflow Analysis? Answer key questions at compile-time about the flow of values and other program properties over control-flow paths Compiler fundamentals What defs. of x reach a given use

More information

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis?

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis? Semantic analysis and intermediate representations Which methods / formalisms are used in the various phases during the analysis? The task of this phase is to check the "static semantics" and generate

More information

Semantic actions for declarations and expressions

Semantic actions for declarations and expressions Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

SEMANTIC ANALYSIS TYPES AND DECLARATIONS SEMANTIC ANALYSIS CS 403: Type Checking Stefan D. Bruda Winter 2015 Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination now we move to check whether

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

CSCI 171 Chapter Outlines

CSCI 171 Chapter Outlines Contents CSCI 171 Chapter 1 Overview... 2 CSCI 171 Chapter 2 Programming Components... 3 CSCI 171 Chapter 3 (Sections 1 4) Selection Structures... 5 CSCI 171 Chapter 3 (Sections 5 & 6) Iteration Structures

More information

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem: Expression evaluation CSE 504 Order of evaluation For the abstract syntax tree + + 5 Expression Evaluation, Runtime Environments + + x 3 2 4 the equivalent expression is (x + 3) + (2 + 4) + 5 1 2 (. Contd

More information

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g., C++) to low-level assembly language that can be executed by hardware int a,

More information

ECE220: Computer Systems and Programming Spring 2018 Honors Section due: Saturday 14 April at 11:59:59 p.m. Code Generation for an LC-3 Compiler

ECE220: Computer Systems and Programming Spring 2018 Honors Section due: Saturday 14 April at 11:59:59 p.m. Code Generation for an LC-3 Compiler ECE220: Computer Systems and Programming Spring 2018 Honors Section Machine Problem 11 due: Saturday 14 April at 11:59:59 p.m. Code Generation for an LC-3 Compiler This assignment requires you to use recursion

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Objective PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Explain what is meant by compiler. Explain how the compiler works. Describe various analysis of the source program. Describe the

More information

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design i About the Tutorial A compiler translates the codes written in one language to some other language without changing the meaning of the program. It is also expected that a compiler should make the target

More information

Semantic actions for expressions

Semantic actions for expressions Semantic actions for expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate representations

More information

More On Syntax Directed Translation

More On Syntax Directed Translation More On Syntax Directed Translation 1 Types of Attributes We have productions of the form: A X 1 X 2 X 3... X n with semantic rules of the form: b:= f(c 1, c 2, c 3,..., c n ) where b and the c s are attributes

More information

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology exam Compiler Construction in4020 July 5, 2007 14.00-15.30 This exam (8 pages) consists of 60 True/False

More information

Compilers. Compiler Construction Tutorial The Front-end

Compilers. Compiler Construction Tutorial The Front-end Compilers Compiler Construction Tutorial The Front-end Salahaddin University College of Engineering Software Engineering Department 2011-2012 Amanj Sherwany http://www.amanj.me/wiki/doku.php?id=teaching:su:compilers

More information

CS5363 Final Review. cs5363 1

CS5363 Final Review. cs5363 1 CS5363 Final Review cs5363 1 Programming language implementation Programming languages Tools for describing data and algorithms Instructing machines what to do Communicate between computers and programmers

More information

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Spring 2015 1 Handling Overloaded Declarations Two approaches are popular: 1. Create a single symbol table

More information

Semantic Analysis. CSE 307 Principles of Programming Languages Stony Brook University

Semantic Analysis. CSE 307 Principles of Programming Languages Stony Brook University Semantic Analysis CSE 307 Principles of Programming Languages Stony Brook University http://www.cs.stonybrook.edu/~cse307 1 Role of Semantic Analysis Syntax vs. Semantics: syntax concerns the form of a

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 6 Decaf Language Wednesday, September 7 The project for the course is to write a

More information

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation Comp 204: Computer Systems and Their Implementation Lecture 22: Code Generation and Optimisation 1 Today Code generation Three address code Code optimisation Techniques Classification of optimisations

More information

3.5 Practical Issues PRACTICAL ISSUES Error Recovery

3.5 Practical Issues PRACTICAL ISSUES Error Recovery 3.5 Practical Issues 141 3.5 PRACTICAL ISSUES Even with automatic parser generators, the compiler writer must manage several issues to produce a robust, efficient parser for a real programming language.

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler

More information

UNIT-4 (COMPILER DESIGN)

UNIT-4 (COMPILER DESIGN) UNIT-4 (COMPILER DESIGN) An important part of any compiler is the construction and maintenance of a dictionary containing names and their associated values, such type of dictionary is called a symbol table.

More information

Introduction. Inline Expansion. CSc 553. Principles of Compilation. 29 : Optimization IV. Department of Computer Science University of Arizona

Introduction. Inline Expansion. CSc 553. Principles of Compilation. 29 : Optimization IV. Department of Computer Science University of Arizona CSc 553 Principles of Compilation 29 : Optimization IV Introduction Department of Computer Science University of Arizona collberg@gmail.com Copyright c 2011 Christian Collberg Inline Expansion I Inline

More information

Syntactic Directed Translation

Syntactic Directed Translation Syntactic Directed Translation Translation Schemes Copyright 2016, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission

More information

Symbol Tables. ASU Textbook Chapter 7.6, 6.5 and 6.3. Tsan-sheng Hsu.

Symbol Tables. ASU Textbook Chapter 7.6, 6.5 and 6.3. Tsan-sheng Hsu. Symbol Tables ASU Textbook Chapter 7.6, 6.5 and 6.3 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Definitions Symbol table: A data structure used by a compiler to keep track

More information

Summary: Direct Code Generation

Summary: Direct Code Generation Summary: Direct Code Generation 1 Direct Code Generation Code generation involves the generation of the target representation (object code) from the annotated parse tree (or Abstract Syntactic Tree, AST)

More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

Programming Languages Third Edition. Chapter 7 Basic Semantics

Programming Languages Third Edition. Chapter 7 Basic Semantics Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol

More information

Context-sensitive Analysis. Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.

Context-sensitive Analysis. Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Context-sensitive Analysis Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Beyond Syntax There is a level of correctness that is deeper than grammar fie(a,b,c,d) int

More information

COP5621 Exam 4 - Spring 2005

COP5621 Exam 4 - Spring 2005 COP5621 Exam 4 - Spring 2005 Name: (Please print) Put the answers on these sheets. Use additional sheets when necessary. Show how you derived your answer when applicable (this is required for full credit

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

Principle of Complier Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Principle of Complier Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Principle of Complier Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 20 Intermediate code generation Part-4 Run-time environments

More information

Programming Lecture 3

Programming Lecture 3 Programming Lecture 3 Expressions (Chapter 3) Primitive types Aside: Context Free Grammars Constants, variables Identifiers Variable declarations Arithmetic expressions Operator precedence Assignment statements

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007

More information

B.V. Patel Institute of Business Management, Computer & Information Technology, Uka Tarsadia University

B.V. Patel Institute of Business Management, Computer & Information Technology, Uka Tarsadia University Unit 1 Programming Language and Overview of C 1. State whether the following statements are true or false. a. Every line in a C program should end with a semicolon. b. In C language lowercase letters are

More information

Introduction to Compilers and Language Design Copyright (C) 2017 Douglas Thain. All rights reserved.

Introduction to Compilers and Language Design Copyright (C) 2017 Douglas Thain. All rights reserved. Introduction to Compilers and Language Design Copy (C) 2017 Douglas Thain. All s reserved. Anyone is free to download and print the PDF edition of this book for personal use. Commercial distribution, printing,

More information

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target. COMP 412 FALL 2017 Computing Inside The Parser Syntax-Directed Translation Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights

More information

ELEC 876: Software Reengineering

ELEC 876: Software Reengineering ELEC 876: Software Reengineering () Dr. Ying Zou Department of Electrical & Computer Engineering Queen s University Compiler and Interpreter Compiler Source Code Object Compile Execute Code Results data

More information

Semantic Analysis Attribute Grammars

Semantic Analysis Attribute Grammars Semantic Analysis Attribute Grammars Martin Sulzmann Martin Sulzmann Semantic Analysis Attribute Grammars 1 / 18 Syntax versus Semantics Syntax Analysis When is a program syntactically valid? Formalism:

More information

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1 CSE P 501 Compilers Static Semantics Hal Perkins Winter 2008 1/22/2008 2002-08 Hal Perkins & UW CSE I-1 Agenda Static semantics Types Attribute grammars Representing types Symbol tables Note: this covers

More information

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram CS6660 COMPILER DESIGN Question Bank UNIT I-INTRODUCTION TO COMPILERS 1. Define compiler. 2. Differentiate compiler and interpreter. 3. What is a language processing system? 4. List four software tools

More information

Code Generation. Dragon: Ch (Just part of it) Holub: Ch 6.

Code Generation. Dragon: Ch (Just part of it) Holub: Ch 6. Code Generation Dragon: Ch 7. 8. (Just part of it) Holub: Ch 6. Compilation Processes Again Choice of Intermediate Code Representation (IR) IR examples Parse tree Three address code (e.g., x := y op z)

More information

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans Syntax-Directed Translation CS3300 - Compiler Design Syntax Directed Translation V. Krishna Nandivada IIT Madras Attach rules or program fragments to productions in a grammar. Syntax directed definition

More information

CS 415 Midterm Exam Spring SOLUTION

CS 415 Midterm Exam Spring SOLUTION CS 415 Midterm Exam Spring 2005 - SOLUTION Name Email Address Student ID # Pledge: This exam is closed note, closed book. Questions will be graded on quality of answer. Please supply the best answer you

More information

Qualifying Exam in Programming Languages and Compilers

Qualifying Exam in Programming Languages and Compilers Qualifying Exam in Programming Languages and Compilers University of Wisconsin Fall 1991 Instructions This exam contains nine questions, divided into two parts. All students taking the exam should answer

More information

COSE312: Compilers. Lecture 20 Data-Flow Analysis (2)

COSE312: Compilers. Lecture 20 Data-Flow Analysis (2) COSE312: Compilers Lecture 20 Data-Flow Analysis (2) Hakjoo Oh 2017 Spring Hakjoo Oh COSE312 2017 Spring, Lecture 20 June 6, 2017 1 / 18 Final Exam 6/19 (Mon), 15:30 16:45 (in class) Do not be late. Coverage:

More information

CSCI Compiler Design

CSCI Compiler Design CSCI 565 - Compiler Design Spring 2010 Final Exam - Solution May 07, 2010 at 1.30 PM in Room RTH 115 Duration: 2h 30 min. Please label all pages you turn in with your name and student number. Name: Number:

More information

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis? Anatomy of a Compiler Program (character stream) Lexical Analyzer (Scanner) Syntax Analyzer (Parser) Semantic Analysis Parse Tree Intermediate Code Generator Intermediate Code Optimizer Code Generator

More information

Introduction to Programming Using Java (98-388)

Introduction to Programming Using Java (98-388) Introduction to Programming Using Java (98-388) Understand Java fundamentals Describe the use of main in a Java application Signature of main, why it is static; how to consume an instance of your own class;

More information

CODE GENERATION Monday, May 31, 2010

CODE GENERATION Monday, May 31, 2010 CODE GENERATION memory management returned value actual parameters commonly placed in registers (when possible) optional control link optional access link saved machine status local data temporaries A.R.

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT C Review MaxMSP Developers Workshop Summer 2009 CNMAT C Syntax Program control (loops, branches): Function calls Math: +, -, *, /, ++, -- Variables, types, structures, assignment Pointers and memory (***

More information

COMPILER CONSTRUCTION Seminar 03 TDDB

COMPILER CONSTRUCTION Seminar 03 TDDB COMPILER CONSTRUCTION Seminar 03 TDDB44 2016 Martin Sjölund (martin.sjolund@liu.se) Mahder Gebremedhin (mahder.gebremedhin@liu.se) Department of Computer and Information Science Linköping University LABS

More information

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; } Ex: The difference between Compiler and Interpreter The interpreter actually carries out the computations specified in the source program. In other words, the output of a compiler is a program, whereas

More information

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer. The Compiler So Far CSC 4181 Compiler Construction Scanner - Lexical analysis Detects inputs with illegal tokens e.g.: main 5 (); Parser - Syntactic analysis Detects inputs with ill-formed parse trees

More information

Grammars. CS434 Lecture 15 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Grammars. CS434 Lecture 15 Spring 2005 Department of Computer Science University of Alabama Joel Jones Grammars CS434 Lecture 5 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled

More information

MIDTERM EXAM (Solutions)

MIDTERM EXAM (Solutions) MIDTERM EXAM (Solutions) Total Score: 100, Max. Score: 83, Min. Score: 26, Avg. Score: 57.3 1. (10 pts.) List all major categories of programming languages, outline their definitive characteristics and

More information

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University Lecture 7: Type Systems and Symbol Tables CS 540 George Mason University Static Analysis Compilers examine code to find semantic problems. Easy: undeclared variables, tag matching Difficult: preventing

More information

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona CSc 453 Compilers and Systems Software 13 : Intermediate Code I Department of Computer Science University of Arizona collberg@gmail.com Copyright c 2009 Christian Collberg Introduction Compiler Phases

More information

C Language Programming

C Language Programming Experiment 2 C Language Programming During the infancy years of microprocessor based systems, programs were developed using assemblers and fused into the EPROMs. There used to be no mechanism to find what

More information

UNIT-V. Symbol Table & Run-Time Environments Symbol Table

UNIT-V. Symbol Table & Run-Time Environments Symbol Table 1 P a g e UNIT-V Symbol Table & Run-Time Environments Symbol Table Symbol table is a data structure used by compiler to keep track of semantics of variable. i.e. symbol table stores the information about

More information

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1 Table of Contents About the Authors... iii Introduction... xvii Chapter 1: System Software... 1 1.1 Concept of System Software... 2 Types of Software Programs... 2 Software Programs and the Computing Machine...

More information

CSCI Compiler Design

CSCI Compiler Design CSCI 565 - Compiler Design Spring 2015 Midterm Exam March 04, 2015 at 8:00 AM in class (RTH 217) Duration: 2h 30 min. Please label all pages you turn in with your name and student number. Name: Number:

More information

Project 2 Interpreter for Snail. 2 The Snail Programming Language

Project 2 Interpreter for Snail. 2 The Snail Programming Language CSCI 2400 Models of Computation Project 2 Interpreter for Snail 1 Overview In this assignment you will use the parser generator yacc to construct an interpreter for a language called Snail containing the

More information

Compiler Optimization Techniques

Compiler Optimization Techniques Compiler Optimization Techniques Department of Computer Science, Faculty of ICT February 5, 2014 Introduction Code optimisations usually involve the replacement (transformation) of code from one sequence

More information

Language Basics. /* The NUMBER GAME - User tries to guess a number between 1 and 10 */ /* Generate a random number between 1 and 10 */

Language Basics. /* The NUMBER GAME - User tries to guess a number between 1 and 10 */ /* Generate a random number between 1 and 10 */ Overview Language Basics This chapter describes the basic elements of Rexx. It discusses the simple components that make up the language. These include script structure, elements of the language, operators,

More information

G Programming Languages - Fall 2012

G Programming Languages - Fall 2012 G22.2110-003 Programming Languages - Fall 2012 Lecture 4 Thomas Wies New York University Review Last week Control Structures Selection Loops Adding Invariants Outline Subprograms Calling Sequences Parameter

More information

Summary: Semantic Analysis

Summary: Semantic Analysis Summary: Semantic Analysis 1 Basic Concepts When SA is performed: Semantic Analysis may be performed: In a two-pass compiler: after syntactic analysis is finished, the semantic analyser if called with

More information

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Module No. # 10 Lecture No. # 16 Machine-Independent Optimizations Welcome to the

More information

Programming for Engineers Iteration

Programming for Engineers Iteration Programming for Engineers Iteration ICEN 200 Spring 2018 Prof. Dola Saha 1 Data type conversions Grade average example,-./0 class average = 23450-67 893/0298 Grade and number of students can be integers

More information
