Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES (2018-19) (ODD) Code Optimization Prof. Jonita Roman Date: 30/06/2018 Time: 9:45 to 10:45 Venue: MCA Ground Floor Class Room

Presentation Outline
Introduction
From Definition to Code
Need for Code Optimization
Schematic Diagram
Code Optimization at Compiler Level
Code Optimization from the Developer's End
A Few Optimization Techniques
Examples
Other References

Introduction Code optimization aims at improving the execution efficiency of a program. This can be achieved by:
1. Eliminating redundancies from the code
2. Rearranging or rewriting the computations in a program to make it execute efficiently

What is an Algorithm? An algorithm (pronounced AL-go-rith-um) is a procedure or formula for solving a problem, based on conducting a sequence of specified actions. A computer program is nothing but an elaborate algorithm. In computer science, an algorithm usually means a small procedure that solves a problem.

Properties of an Algorithm (Donald Knuth) An algorithm must possess the following properties:
1. Finiteness: The algorithm must always terminate after a finite number of steps.
2. Definiteness: Each step must be precisely defined; the actions to be carried out must be rigorously and unambiguously specified for each case.
3. Input: An algorithm has zero or more inputs, taken from a specified set of objects.
4. Output: An algorithm has one or more outputs, which have a specified relation to the inputs.
5. Effectiveness: All operations to be performed must be sufficiently basic that they can be done exactly and in a finite length of time.

Parameters Measuring Performance How can we talk precisely about the "cost" of running an algorithm? What do you understand by the term "cost"?
Time? (Execution speed)
Space? (Memory usage and its management)
Both? (Speed and space)
Something else? (Beyond these: parallelism using multiple processes/threads)

Program??? A computer program is a list of instructions that tell a computer what to do. Everything a computer does is done by using a computer program. A computer program is written in a programming language [Wikipedia]. Some examples are:
Finding the factorial of a given number
Sorting a given array of elements
Operating systems
Web browsers
Office suites (documents/spreadsheets etc.)
Video games
Malware...

A journey of an HLL program to an executable:

Source Program → Preprocessor → Compiler → Target Assembly Program → Assembler → Relocatable Machine Code → Loader/Linker (combining libraries and relocatable object files) → Absolute Machine Code

THE PHASES OF A COMPILER

Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program

The symbol-table manager and the error handler interact with all of these phases.

EXECUTION OF SOURCE PROGRAM The execution of a program is divided into two main phases, an analysis phase and a synthesis phase:
Analysis phase
1. Linear analysis (lexical analysis)
2. Hierarchical analysis (syntax analysis)
3. Semantic analysis
Synthesis phase
1. Creation of data structures
2. Code generation
For example, position := initial + rate * 60 would be grouped into the following tokens: the identifier position, the assignment symbol :=, the identifier initial, the plus sign, the identifier rate, the multiplication sign, and the number 60.

Parse tree for position := initial + rate * 60:

assignment statement
├── identifier: position
├── :=
└── expression
    ├── expression
    │   └── identifier: initial
    ├── +
    └── expression
        ├── expression
        │   └── identifier: rate
        ├── *
        └── expression
            └── number: 60

In the expression initial + rate * 60, the phrase rate * 60 is a logical unit because the usual conventions of arithmetic expressions tell us that multiplication is performed before addition. Because the expression initial + rate is followed by a *, it is not grouped into a single phrase by itself in the above parse tree. The hierarchical structure of a program is usually expressed by recursive rules. For example, we might have the following rules as part of the definition of expressions:
1. Any identifier is an expression.
2. Any number is an expression.
3. If expression1 and expression2 are expressions, then so are
   expression1 + expression2
   expression1 * expression2
   ( expression1 )

Rules (1) and (2) are (non-recursive) basic rules, while rule (3) defines expressions in terms of operators applied to other expressions. Thus, by rule (1), initial and rate are expressions; by rule (2), 60 is an expression; and by rule (3), rate * 60 is an expression, and finally initial + rate * 60 is an expression. Many languages define statements recursively by rules such as:
1. If identifier1 is an identifier and expression2 is an expression, then
   identifier1 := expression2
   is a statement.
2. If expression1 is an expression and statement2 is a statement, then
   while ( expression1 ) do statement2
   if ( expression1 ) then statement2
   are statements.
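These recursive rules translate almost directly into a recursive-descent parser. The sketch below (Python; the tuple representation for trees and the token list format are invented for illustration) implements the expression rules, with * binding tighter than + as described above:

```python
# Recursive-descent sketch of the expression rules:
#   expr   -> term { '+' term }       (additions, applied last)
#   term   -> factor { '*' factor }   (multiplications bind tighter)
#   factor -> identifier | number | '(' expr ')'
# Returns a syntax tree as nested tuples (op, left, right).
def parse_expr(tokens):
    def expr(i):
        node, i = term(i)
        while i < len(tokens) and tokens[i] == "+":
            right, i = term(i + 1)
            node = ("+", node, right)
        return node, i

    def term(i):
        node, i = factor(i)
        while i < len(tokens) and tokens[i] == "*":
            right, i = factor(i + 1)
            node = ("*", node, right)
        return node, i

    def factor(i):
        if tokens[i] == "(":
            node, i = expr(i + 1)
            return node, i + 1          # skip the closing ')'
        return tokens[i], i + 1         # identifier or number leaf

    tree, _ = expr(0)
    return tree

print(parse_expr(["initial", "+", "rate", "*", "60"]))
# ('+', 'initial', ('*', 'rate', '60'))
```

Note that rate * 60 comes out as a single subtree, exactly the grouping the parse tree above shows.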

Lexical constructs do not require recursion, while syntactic constructs do. For example, recursion is not required to recognize identifiers, which are typically strings of letters and digits beginning with a letter. We would normally recognize identifiers by a simple scan of the input stream, waiting until a character that is neither a letter nor a digit is found, and then grouping all the letters and digits found up to that point into an identifier token. The characters so grouped are recorded in a table, called a symbol table, and removed from the input so that processing of the next token can begin. On the other hand, this kind of linear scan is not powerful enough to analyze expressions or statements. For example, we cannot properly match parentheses in expressions, or begin and end in statements, without imposing some kind of hierarchical or nesting structure on the input. Context-free grammars are a formalization of recursive rules that can be used to guide syntactic analysis.

The parse tree of figure 1.3 describes the syntactic structure of the input. A more common internal representation of this syntactic structure is given by the syntax tree in figure 1.4(A). A syntax tree is a compressed representation of the parse tree in which the operators appear as the interior nodes, and the operands of an operator are the children of the node for that operator.

          :=
        /    \
  position     +
             /   \
      initial      *
                 /   \
             rate     60

          :=
        /    \
  position     +
             /   \
      initial      *
                 /   \
             rate   inttoreal
                       |
                       60

Semantic analysis: The semantic analysis phase checks the source program for semantic errors and gathers the type information for the subsequent code-generation phase. It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements.

An important component of semantic analysis is type checking. Here the compiler checks that each operator has operands that are permitted by the source language specification. For example, when a binary arithmetic operator is applied to an integer and a real, the compiler may need to convert the integer to a real. Suppose, for example, that all identifiers in the above figure have been declared to be real and that 60 by itself is assumed to be an integer. Type checking then reveals that * is applied to a real, rate, and an integer, 60. This is resolved in the tree above by creating an extra node for the operator inttoreal that explicitly converts an integer into a real.

Symbol Table Management An essential function of a compiler is to record the identifiers used in the source program and collect information about various attributes of each identifier. These attributes may provide information about the storage allocated for an identifier, its type, its scope (where in the program it is valid), and, in the case of procedure names, such things as the number and types of its arguments, the method of passing each argument (e.g., by reference), and the type returned, if any. A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. The data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly.
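The record-per-identifier idea can be sketched as follows (a minimal Python sketch; the attribute names type, scope and storage are illustrative, not taken from any particular compiler):

```python
# Minimal symbol-table sketch: a dict maps each identifier to a record of
# attributes. Attributes accumulate as different phases learn more about
# the identifier; lookup retrieves the whole record quickly.
class SymbolTable:
    def __init__(self):
        self.entries = {}

    def insert(self, name, **attrs):
        # Record the identifier on first sight; later calls add attributes.
        if name not in self.entries:
            self.entries[name] = {}
        self.entries[name].update(attrs)

    def lookup(self, name):
        return self.entries.get(name)

table = SymbolTable()
table.insert("rate", type="real", scope="global")   # from declarations
table.insert("rate", storage=8)                     # from storage allocation

print(table.lookup("rate"))
# {'type': 'real', 'scope': 'global', 'storage': 8}
```

A production compiler would use more elaborate structures (hash tables with scope stacks, for instance), but the operations are the same: insert once, then store and retrieve attributes per identifier.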

Error Detection and Reporting Each phase can encounter errors. However, after detecting an error, a phase must somehow deal with it, so that compilation can proceed and further errors in the source program can be detected. The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler. The lexical phase can detect errors where the characters remaining in the input do not form any token of the language. Errors where the token stream violates the structure rules (syntax) of the language are determined by the syntax analysis phase. During semantic analysis the compiler tries to detect constructs that have the right syntactic structure but no meaning for the operation involved.

The Analysis Phases As translation progresses, the compiler's internal representation of the source program changes. We illustrate these representations by considering the translation of the statement position := initial + rate * 60. The lexical analysis phase reads the characters in the source program and groups them into a stream of tokens in which each token represents a logically cohesive sequence of characters, such as an identifier, a keyword (if, while), a punctuation character, or a multi-character operator like :=. The character sequence forming a token is called the lexeme for the token. When an identifier like rate is found, the lexical analyzer not only generates a token, say id, but also enters the lexeme rate into the symbol table, if it is not already there. Here, we will use id1, id2 and id3 for position, initial and rate to emphasize that the internal representation of an identifier is different from the character sequence forming the identifier. The representation of the above statement after lexical analysis is therefore

suggested by: id1 := id2 + id3 * 60. We also make up tokens for the multi-character operator := and the number 60 to reflect their internal representation. The second and third phases, syntax and semantic analysis, have been introduced in the previous sections.
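The grouping performed by the lexical analyzer can be sketched as a toy scanner (Python; the regular-expression tokenizer and the token names are illustrative, not the lecture's notation):

```python
import re

# Toy lexer sketch: groups the characters of a source statement into tokens
# and enters identifier lexemes into a symbol table, numbering them id1,
# id2, ... in order of first appearance.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    symbol_table, tokens = [], []
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "ID":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)     # enter the lexeme once
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        else:
            tokens.append((kind, lexeme))
    return tokens, symbol_table

tokens, table = tokenize("position := initial + rate * 60")
print(tokens)
# [('id', 1), ('ASSIGN', ':='), ('id', 2), ('OP', '+'), ('id', 3), ('OP', '*'), ('NUMBER', '60')]
```

The output is exactly the id1 := id2 + id3 * 60 stream described above, with position, initial and rate recorded in the symbol table.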

Intermediate Code Generation After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. We can think of this intermediate representation as a program for an abstract machine. This intermediate representation should have two important properties. It should be easy to produce, and easy to translate into the target program. We consider an intermediate form called three-address code which is like the assembly language for a machine in which every memory location can act like a register. Three-address code consists of a sequence of instructions, each of which has at most three operands. The source program position := initial + rate * 60 might appear in three-address code as

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

Code optimization The code optimization phase attempts to improve the intermediate code, so that faster-running machine code will result. For example, the above code, after optimization, may become:

temp1 := id3 * 60.0
id1 := id2 + temp1

CODE OPTIMIZATION (Revisited) Code optimization aims at improving the execution efficiency of a program. This is achieved in two ways:
1. Redundancies in a program are eliminated.
2. Computations in a program are rearranged or rewritten to make it execute efficiently.

Unoptimized Code → [Code Optimization, through optimization techniques] → Optimized Code

Two points concern the scope of optimization:
1. Optimization seeks to improve a program rather than the algorithm used in the program. Thus replacement of an algorithm by a more efficient algorithm is beyond the scope of optimization.
2. Efficient code generation for a specific target machine is also beyond its scope; it belongs in the back end of a compiler.
The optimization techniques are thus independent of both the programming language and the target machine.

Optimizing Transformations: An optimizing transformation is a rule for rewriting a segment of a program to improve its execution efficiency without affecting its meaning. Optimizing transformations are classified into local and global transformations depending on whether they are applied over small segments of a program consisting of a few source statements, or over larger segments consisting of loops or function bodies.
1. Compile time evaluation: Execution efficiency can be improved by performing certain actions specified in a program during compilation itself. For example, a := 3.14157 / 2 can be replaced by a := 1.570785, thereby eliminating a division operation.
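Compile-time evaluation (constant folding) can be sketched as a small pass over an expression tree (Python; the tuple-based representation (op, left, right) is invented for illustration):

```python
# Constant-folding sketch: expressions are nested tuples (op, left, right),
# numbers, or variable names. Any subtree whose operands are both constants
# is evaluated at "compile time"; everything else is left untouched.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b}

def fold(expr):
    if not isinstance(expr, tuple):
        return expr                        # leaf: number or variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)        # evaluate during compilation
    return (op, left, right)

# a := 3.14157 / 2 becomes a := 1.570785, eliminating the division.
print(fold(("/", 3.14157, 2)))        # 1.570785
print(fold(("+", "b", ("*", 2, 3))))  # ('+', 'b', 6)
```

Only the constant part is folded in the second example; the variable b survives, just as a real optimizer would leave it.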

2. Elimination of common subexpressions: Common subexpressions are occurrences of expressions yielding the same value; they are also called equivalent expressions. For example:

Original:                 Optimized:
a := b * c;               t := b * c;
                          a := t;
- - - - -                 - - - - -
x := b * c + 5.2;         x := t + 5.2;

Here the code contains two occurrences of b * c. The second occurrence can be eliminated because the first occurrence of b * c is always evaluated before the second occurrence is reached during execution of the program. The value computed at the first occurrence is saved in t, and this value is used in the assignment to x.

3. Dead code elimination: Code which can be omitted from a program without affecting its results is called dead code. Dead code is detected by checking whether the value assigned in an assignment statement is used anywhere in the program. For example, an assignment x := <expr> constitutes dead code if the value assigned to x is not used in the program.
4. Frequency reduction: Execution time of a program can be reduced by moving code from a part of a program which is executed very frequently to another part which is executed fewer times. For example, the loop optimization transformation moves loop-invariant code out of a loop and places it prior to loop entry.

For example:

Original:                     Optimized:
for i := 1 to 100 do          x := 25 * a;
begin                         for i := 1 to 100 do
  z := i;                     begin
  x := 25 * a;                  z := i;
  y := x + z;                   y := x + z;
end;                          end;

Here x := 25 * a; is loop invariant, so in the optimized program it is computed only once, before entering the for loop. y := x + z; is not loop invariant, so it cannot be subjected to frequency reduction.
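The effect of the transformation can be demonstrated with a sketch (Python; the multiply helper and its counter exist only to count how often the invariant operation runs):

```python
# Frequency-reduction sketch: the loop-invariant computation 25 * a is moved
# out of the loop. A counter verifies that the multiplication count drops
# from 100 to 1 while the computed results stay identical.
def multiply(p, q, counter):
    counter[0] += 1
    return p * q

def original(a):
    counter, ys = [0], []
    for i in range(1, 101):
        z = i
        x = multiply(25, a, counter)   # evaluated on every iteration
        ys.append(x + z)
    return ys, counter[0]

def optimized(a):
    counter, ys = [0], []
    x = multiply(25, a, counter)       # hoisted: computed once before the loop
    for i in range(1, 101):
        z = i
        ys.append(x + z)               # y := x + z is not invariant; it stays
    return ys, counter[0]

ys1, n1 = original(7)
ys2, n2 = optimized(7)
print(ys1 == ys2, n1, n2)   # True 100 1
```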

5. Strength reduction: The strength reduction optimization replaces the occurrence of a time-consuming operation (a high-strength operation) by an occurrence of a faster operation (a low-strength operation), e.g. replacement of a multiplication by an addition:

Original:                     Optimized:
for i := 1 to 10 do           itemp := 5;
begin                         for i := 1 to 10 do
  - - - - -                   begin
  k := i * 5;                   - - - - -
  - - - - -                     k := itemp;
end;                            - - - - -
                                itemp := itemp + 5;
                              end;
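A Python stand-in for the pseudocode above shows that the two loops compute the same sequence of k values:

```python
# Strength-reduction sketch: the multiplication i * 5 inside the loop is
# replaced by repeated addition on the auxiliary variable itemp.
def with_multiplication():
    ks = []
    for i in range(1, 11):
        ks.append(i * 5)        # high-strength operation on each iteration
    return ks

def with_addition():
    ks = []
    itemp = 5                   # value of i * 5 for i = 1
    for i in range(1, 11):
        ks.append(itemp)        # low-strength: reuse the running value
        itemp = itemp + 5       # advance by the loop stride
    return ks

print(with_multiplication() == with_addition())   # True
```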

Local and global optimization: Optimization of a program is structured into the following two phases: 1. Local optimization: The optimizing transformations are applied over small segments of a program consisting of a few statements. 2. Global optimization: The optimizing transformations are applied over a program unit, i.e. over a function or a procedure.

Local Optimization Basic block: A basic block is a sequence of program statements (s1, s2, ..., sn) such that only sn can be a transfer of control statement and only s1 can be the destination of a transfer of control statement. A basic block b is thus a program segment with a single entry point. If control reaches statement s1 during program execution, then all the statements s1, s2, ..., sn will be executed. The essentially sequential nature of a basic block simplifies optimization.

Example:

Original:                 Optimized:
a := x*y;                 t := x*y;
                          a := t;
- - - - -                 - - - - -
b := x*y;                 b := t;
lab_i: c := x*y;          lab_i: c := x*y;

Local optimization is performed for the expression x*y. It is applied to the first two statements in the block, but not to the last statement, since that statement belongs to another basic block. If the label lab_i did not exist, c := x*y could also be replaced by c := t, because the statement would then belong to the same basic block.

Value numbers Value numbers provide a simple means to determine whether two occurrences of an expression in a basic block are equivalent. The value numbering technique is applied on the fly while identifying basic blocks in a source program. A value number vn_alpha is associated with each variable alpha; it identifies the last assignment to alpha processed so far.

stmt no.   Statement
14         g := 25.2;
15         x := z+2;
16         h := x*y+d;
...
34         w := x*y;

Local optimization using value numbering:

Symbol table
Symbol    Value number
y         0
x         15
g         14
z         0
d         5
w         0

Quadruples table

No.   Operator   Operand1   Value no.   Operand2   Value no.   Result   Use flag
20    :=         g          -           25.2       -           t20      f
21    +          z          0           2          -           t21      f
22    :=         x          0           t21        -           t22      f
23    *          x          15          y          0           t23      f
24    +          t23        -           d          5           t24      f
...
57    :=         w          0           t23        -           t57      f
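The technique can be sketched as follows (Python; a simplified model in which a variable's value number is the statement number of its last assignment, and an expression matches an earlier occurrence when the operator and the operand value numbers coincide):

```python
# Value-numbering sketch over a basic block. Each assignment bumps the value
# number of its target; an expression is a common subexpression when the same
# (operator, operand value numbers) key was seen earlier in the block.
def value_number(block):
    vn = {}          # variable -> value number (stmt no. of last assignment)
    seen = {}        # (op, vn1, vn2) -> stmt no. of first computation
    reuses = []      # (later stmt, earlier stmt): detected common subexprs
    for stmt_no, (target, op, a, b) in block:
        key = (op, vn.get(a, 0), vn.get(b, 0))
        if op is not None:
            if key in seen:
                reuses.append((stmt_no, seen[key]))
            else:
                seen[key] = stmt_no
        vn[target] = stmt_no             # new value number for the target
    return reuses

block = [
    (14, ("g", None, "25.2", None)),    # g := 25.2
    (15, ("x", "+", "z", "2")),         # x := z + 2
    (16, ("h", "*", "x", "y")),         # the x*y part of h := x*y + d
    (34, ("w", "*", "x", "y")),         # w := x*y: same value numbers as 16
]
print(value_number(block))   # [(34, 16)]
```

Statement 34 reuses x*y from statement 16 because x still has value number 15 and y value number 0 at both occurrences, so the second computation can be replaced by the earlier result.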

Global Optimization Compared to local optimization, global optimization requires more analysis effort to establish the feasibility of an optimization. Consider global common subexpression elimination. If some expression x*y occurs in a set of basic blocks SB of program P, its occurrence in a block b_j ∈ SB can be eliminated if the following two conditions are satisfied for every execution of P:
1. Basic block b_j is executed only after some block b_k ∈ SB has been executed one or more times.
2. No assignments to x or y have been executed after the last (or only) evaluation of x*y in block b_k. (**)

Program Representation: A program is represented in the form of a program flow graph. Definition: A program flow graph for a program P is a directed graph G_P = (N, E, n0) where
N: the set of basic blocks in P
E: the set of directed edges (b_i, b_j) indicating the possibility of control flow from the last statement of b_i (the source node) to the first statement of b_j (the destination node)
n0: the start node of P.

Control and Data flow analysis Control flow analysis analyses a program to collect information concerning its structure, e.g. the presence and nesting of loops in the program. Information concerning program structure is used to answer specific questions of interest (**). The control flow concepts of interest are:
1. Predecessors and successors: If (b_i, b_j) ∈ E, b_i is a predecessor of b_j and b_j is a successor of b_i.
2. Paths: A path is a sequence of edges such that the destination node of one edge is the source node of the following edge.
3. Ancestors and descendants: If a path exists from b_i to b_j, b_i is an ancestor of b_j and b_j is a descendant of b_i.
4. Dominators and post-dominators: Block b_i is a dominator of block b_j if every path from n0 to b_j passes through b_i. b_i is a post-dominator of b_j if every path from b_j to an exit node passes through b_i.
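Dominators can be computed by the standard iterative set algorithm; a sketch (Python, on a small hypothetical flow graph, not one from the lecture):

```python
# Dominator sketch on a program flow graph G = (N, E, n0):
# dom(n0) = {n0}; for every other node n,
# dom(n) = {n} ∪ (intersection of dom(p) over all predecessors p),
# iterated until a fixed point is reached.
def dominators(nodes, edges, n0):
    preds = {n: {a for a, b in edges if b == n} for n in nodes}
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[n0] = {n0}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == n0:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# Hypothetical diamond-shaped flow graph: b1 -> b2, b1 -> b3, b2 -> b4, b3 -> b4
nodes = ["b1", "b2", "b3", "b4"]
edges = [("b1", "b2"), ("b1", "b3"), ("b2", "b4"), ("b3", "b4")]
dom = dominators(nodes, edges, "b1")
print(sorted(dom["b4"]))   # ['b1', 'b4']
```

Every path from b1 to b4 passes through b1 but may avoid either b2 or b3, so b1 dominates b4 while b2 and b3 do not, matching the definition above.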

Global Optimization (summary) Program representation; control and data flow analysis (control flow analysis, data flow analysis).

Data flow concept        Optimization in which it is used
Available expressions    Common sub-expression elimination
Live variables           Dead code elimination
Reaching definitions     Constant and variable propagation
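As an illustration of how live-variable information drives dead code elimination, here is a sketch restricted to straight-line code (Python; the (target, uses) statement format is invented, and branching is deliberately left out):

```python
# Live-variable sketch on straight-line code: scan backwards, keeping the set
# of variables whose values may still be used. An assignment whose target is
# not live immediately after it is dead code and can be removed.
def eliminate_dead(stmts, live_at_exit):
    live, kept = set(live_at_exit), []
    for target, uses in reversed(stmts):
        if target in live:
            kept.append((target, uses))
            live.discard(target)       # this assignment kills the old value
            live.update(uses)          # its operands are live before it
        # else: target is dead here, so the assignment is dropped
    return list(reversed(kept))

code = [
    ("t", {"b", "c"}),    # t := b * c
    ("x", {"t"}),         # x := t + 5.2
    ("u", {"a"}),         # u := a + 1  (u is never used afterwards: dead)
]
print([t for t, _ in eliminate_dead(code, live_at_exit={"x"})])   # ['t', 'x']
```

The assignment to u is removed because u is not live at exit, while t survives because the later assignment to x uses it.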

Build a program flow graph for the following program:

z := 5;
w := z;
for i := 1 to 100 do
  x := a*b;
  y := c+d;
  if y < 0 then
    a := 25;
    f := c+d;
  else
    g := w;
  h := a*b+f;
  d := z+10;
end;
g := c+d;
print g, h, d, x, y;

Thank You