Available expressions

Suppose we want to perform common-subexpression elimination: given a program that computes x ⊕ y more than once, can we eliminate one of the duplicate computations? To find places where such an optimization is possible, the notion of available expressions is helpful.

An expression x ⊕ y is available at a node n in the flow graph if, on every path from the entry node of the graph to node n, x ⊕ y is computed at least once and there are no definitions of x or y since the most recent occurrence of x ⊕ y on that path.

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Gen: any node that computes x ⊕ y generates {x ⊕ y}.
Kill: any definition of x or y kills {x ⊕ y}.

Figure 8 summarizes the generate and kill effects. Two entries deserve comment:

1. t ← b + c generates the expression b + c, but b ← b + c does not generate b + c, because after b + c is computed there is a subsequent definition of b. This is why the table gives gen[s] = {b ⊕ c} − kill[s].
2. A store instruction (M[a] ← b) might modify any memory location, so it kills any fetch expression (M[x]). If we were sure that a ≠ x, we could be less conservative and say that M[a] ← b does not kill M[x]. Establishing such facts is called alias analysis.

Given gen and kill, we compute in and out almost as for reaching definitions, except that we compute the intersection of the out sets of the predecessors instead of the union. This reflects the fact that an expression is available only if it is computed on every path into the node.
Statement s                      gen[s]               kill[s]
t ← b ⊕ c                        {b ⊕ c} − kill[s]    expressions containing t
t ← M[b]                         {M[b]} − kill[s]     expressions containing t
M[a] ← b                         {}                   expressions of the form M[x]
if a > b goto L1 else goto L2    {}                   {}
goto L                           {}                   {}
L:                               {}                   {}
f(a1, ..., an)                   {}                   expressions of the form M[x]
t ← f(a1, ..., an)               {}                   expressions containing t, and expressions of the form M[x]

Figure 8: Table: Gen and kill for available expressions.

\[ \mathit{in}[n] = \bigcap_{p \in \mathit{pred}[n]} \mathit{out}[p] \qquad \text{if } n \text{ is not the start node} \]
\[ \mathit{out}[n] = \mathit{gen}[n] \cup (\mathit{in}[n] - \mathit{kill}[n]) \]

To solve the dataflow equations by iteration,

1. we define the in set of the start node as empty, and
2. we initialize all other sets to full (the set of all expressions), not empty.

The sets must start full because the intersection operator makes sets smaller, not bigger as the union operator does in the computation of reaching definitions. The algorithm therefore finds the greatest fixed point of the equations.

    in[start node] ← {}
    for each n other than the start node
        in[n] ← {all expressions}; out[n] ← {all expressions}
    repeat
        for each n
            in′[n] ← in[n]; out′[n] ← out[n]
            in[n] ← ⋂ out[p] over p ∈ pred[n]    (skipped for the start node, whose in set stays empty)
            out[n] ← gen[n] ∪ (in[n] − kill[n])
    until in′[n] = in[n] and out′[n] = out[n] for all n
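Original program                    After common-subexpression elimination
1 : x ← a + b                       1 : t ← a + b
                                    1′: x ← t
2 : if c > d goto L1                2 : if c > d goto L1
3 : a ← p                           3 : a ← p
4 : y ← a + b                       4 : t ← a + b
                                    4′: y ← t
5 : goto L2                         5 : goto L2
6 : L1 : if e < f goto L1           6 : L1 : if e < f goto L1
7 : L2 : z ← a + b                  7 : L2 : z ← t

Figure 9: Program

We take the program of Figure 9 as an example. The only expression of interest is a + b; in the table below, a dash marks the empty set.

                       Init             Iter. 1          Iter. 2
n   gen[n]  kill[n]    in[n]   out[n]   in[n]   out[n]   in[n]   out[n]
1   a+b     -          -       a+b      -       a+b      -       a+b
2   -       -          a+b     a+b      a+b     a+b      a+b     a+b
3   -       a+b        a+b     a+b      a+b     -        a+b     -
4   a+b     -          a+b     a+b      -       a+b      -       a+b
5   -       -          a+b     a+b      a+b     a+b      a+b     a+b
6   -       -          a+b     a+b      a+b     a+b      a+b     a+b
7   a+b     -          a+b     a+b      a+b     a+b      a+b     a+b

Since a + b is in in[7], the expression is available at statement 7. As a result, we can eliminate the computation of a + b at statement 7 by replacing it with the temporary t introduced for statements 1 and 4. At statement 4, in contrast, in[4] is empty because statement 3 redefines a, so a + b must be recomputed there.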
Liveness analysis

Two temporaries a and b can fit into the same register if a and b are never in use at the same time. We say a variable is live if it holds a value that may be needed in the future. The analysis that determines this is called liveness analysis.

Defs and uses of variable x

Defs: assignments to the variable (or temporary) x.
Uses: statements in which x occurs as an operand, for example on the right-hand side of an assignment.

A variable is live on an edge if there is a directed path from that edge to a use of the variable that does not go through any def.

Gen and kill are defined as shown in Figure 10.

Gen: any use of a variable generates liveness.
Kill: any definition kills liveness.

Statement s                      gen[s]             kill[s]
t ← b ⊕ c                        {b, c}             {t}
t ← M[b]                         {b}                {t}
M[a] ← b                         {a, b}             {}
if a > b goto L1 else goto L2    {a, b}             {}
goto L                           {}                 {}
L:                               {}                 {}
f(a1, ..., an)                   {a1, ..., an}      {}
t ← f(a1, ..., an)               {a1, ..., an}      {t}

Figure 10: Table: Gen and kill for liveness analysis.

The equations for in and out are similar to the ones for reaching definitions, but run backward, because liveness is a backward dataflow analysis: information flows from the successors of a node to the node itself.

\[ \mathit{in}[n] = \mathit{gen}[n] \cup (\mathit{out}[n] - \mathit{kill}[n]) \]
\[ \mathit{out}[n] = \bigcup_{s \in \mathit{succ}[n]} \mathit{in}[s] \]
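To solve the dataflow equations by iteration,

    for each n
        in[n] ← {}; out[n] ← {}
    repeat
        for each n
            in′[n] ← in[n]; out′[n] ← out[n]
            out[n] ← ⋃ in[s] over s ∈ succ[n]
            in[n] ← gen[n] ∪ (out[n] − kill[n])
    until in′[n] = in[n] and out′[n] = out[n] for all n

We take Program 7 as an example. In the table below, a dash marks the empty set. The values are consistent with visiting the nodes in reverse order, from 7 down to 1, which suits a backward analysis.

                       Iter. 1          Iter. 2          Iter. 3
n   gen[n]  kill[n]    in[n]   out[n]   in[n]   out[n]   in[n]   out[n]
1   -       a          -       a        -       a        -       a
2   -       c          a       c,a      a       c,a      a       c,a
3   c,a     -          c,a     c,a      c,a     c,a      c,a     c,a
4   c       c          c       -        c,a     c,a      c,a     c,a
5   -       -          -       -        c,a     c,a      c,a     c,a
6   c,a     a          c,a     -        c,a     -        c,a     -
7   -       c          -       -        -       -        -       -

The out sets of statements 6 and 7 are empty, so the variables they define are not live afterwards. As a result, we can remove statements 6 and 7, because their results are never used.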
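The backward iteration can be sketched in Python as follows. Program 7 is not reproduced in this section, so the successor lists below are an assumption: a seven-statement flow graph chosen to agree with the gen and kill columns of the table. The names SUCC, GEN, KILL and liveness are likewise illustrative.

```python
# Assumed flow graph (node -> successors); statement 3 is taken to be
# a conditional branch and statement 5 a jump back to it.
SUCC = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: [7], 7: []}
# gen/kill as listed in the table above.
GEN = {1: set(), 2: set(), 3: {"c", "a"}, 4: {"c"},
       5: set(), 6: {"c", "a"}, 7: set()}
KILL = {1: {"a"}, 2: {"c"}, 3: set(), 4: {"c"},
        5: set(), 6: {"a"}, 7: {"c"}}


def liveness(succ, gen, kill):
    """Iterate live-in/live-out sets to the least fixed point."""
    # Visit nodes in reverse order, which tends to converge faster
    # for a backward analysis.
    nodes = sorted(succ, reverse=True)
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}

    changed = True
    while changed:
        changed = False
        for n in nodes:
            # out[n] is the union of in[s] over all successors s.
            new_out = set()
            for s in succ[n]:
                new_out |= live_in[s]
            new_in = gen[n] | (new_out - kill[n])
            if new_in != live_in[n] or new_out != live_out[n]:
                live_in[n], live_out[n] = new_in, new_out
                changed = True
    return live_in, live_out


live_in, live_out = liveness(SUCC, GEN, KILL)
for n in sorted(SUCC):
    print(n, sorted(live_in[n]), sorted(live_out[n]))
```

Under the assumed graph, live_out[6] and live_out[7] come out empty, which is the fact used to justify removing statements 6 and 7.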
2.4 Transformations using dataflow analysis

Using the results of dataflow analysis, the optimizing compiler can improve the program in several ways.

Common-subexpression elimination

Given a flow-graph statement s : t ← x ⊕ y, where the expression x ⊕ y is available at s, the computation within s can be eliminated.

Elimination steps of s:

1. Choose a new temporary w, and rewrite each statement n : v ← x ⊕ y that computes the same expression as s and whose value reaches s as follows:
       n  : w ← x ⊕ y
       n′ : v ← w
2. Finally, modify statement s to be
       s : t ← w

We will rely on copy propagation to remove some or all of the extra assignment quadruples.

Constant propagation

Suppose that

1. we have a statement d : t ← c, where c is a constant, and
2. another statement n that uses t, such as n : y ← t ⊕ x.

We know that t is constant in n if d reaches n and no other definition of t reaches n. In that case we can rewrite n as y ← c ⊕ x.
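The constant-propagation rule can be sketched in Python as follows. This is an illustration under stated assumptions: quadruples are encoded as (dest, op, args) tuples, a constant assignment uses the made-up op name "const", and the reaching-definitions sets are written by hand rather than computed. The function name constant_propagate is hypothetical.

```python
def constant_propagate(stmts, reaching):
    """Replace an operand t by c when the only definition of t that
    reaches the statement is a constant assignment t <- c."""
    result = {}
    for n, (dest, op, args) in stmts.items():
        new_args = []
        for arg in args:
            # All definitions of this operand that reach statement n.
            defs = [d for d in reaching[n] if stmts[d][0] == arg]
            if (isinstance(arg, str) and len(defs) == 1
                    and stmts[defs[0]][1] == "const"):
                new_args.append(stmts[defs[0]][2][0])   # the constant c
            else:
                new_args.append(arg)                    # leave unchanged
        result[n] = (dest, op, tuple(new_args))
    return result


# Statement id -> (dest, op, args). Statement 3 is assumed to lie on
# only one branch, so both definitions of t can reach statement 4.
stmts = {
    1: ("t", "const", (4,)),
    2: ("y", "+", ("t", "x")),
    3: ("t", "const", (7,)),
    4: ("z", "+", ("t", "x")),
}
# Hand-written reaching definitions for each statement (an assumption).
reaching = {1: set(), 2: {1}, 3: {1}, 4: {1, 3}}

rewritten = constant_propagate(stmts, reaching)
print(rewritten[2])
print(rewritten[4])
```

Statement 2 is rewritten to use the constant 4 because definition 1 is the only definition of t reaching it. Statement 4 is left alone because two definitions of t reach it, even though both happen to be constants.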
Copy propagation

This is like constant propagation, but instead of a constant c, we have a variable z. Suppose that

1. we have a statement d : t ← z, and
2. another statement n that uses t, such as n : y ← t ⊕ x.

If d reaches n, no other definition of t reaches n, and there is no definition of z on any path from d to n (including a path that goes through n one or more times), then we can rewrite n as n : y ← z ⊕ x.

Copy propagation may enable the recognition of other optimizations, such as common-subexpression elimination. For example, in the program

    a ← y + z
    u ← y
    c ← u + z

the two +-expressions are not recognized as common subexpressions until after the copy propagation of u ← y is performed.

Dead-code elimination

If there is a quadruple s : a ← b ⊕ c or s : a ← M[x] such that a is not live-out of s, then the quadruple can be deleted.

In some cases, applying dead-code elimination to a statement is not allowed because of its implicit side effects. For example, if the computer is configured to raise an exception on arithmetic overflow or division by zero, then deleting an exception-causing instruction will change the result of the computation. The optimizer should never make a change that changes program behavior.
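The dead-code rule, including the side-effect caveat, can be sketched in Python as follows. The encoding is an assumption made for illustration: quadruples are (dest, op, args) tuples, the live-out sets are written by hand, PURE_OPS lists the operations treated as deletable, and "/" is deliberately left out of it to model a machine that traps on division by zero. None of these names come from the text.

```python
# Operations assumed to have no side effects on the target machine.
# "/" is excluded here to model a trap on division by zero.
PURE_OPS = {"+", "-", "*", "load"}


def eliminate_dead_code(stmts, live_out):
    """Drop quadruples whose destination is not live-out, unless the
    operation might have a side effect."""
    kept = {}
    for n, (dest, op, args) in stmts.items():
        is_dead = dest is not None and dest not in live_out[n]
        if is_dead and op in PURE_OPS:
            continue                      # safe to delete
        kept[n] = (dest, op, args)
    return kept


# Statement id -> (dest, op, args); a store has no destination temporary.
stmts = {
    1: ("a", "+", ("b", "c")),
    2: ("d", "/", ("b", "c")),            # d is never used again
    3: ("e", "-", ("a", "b")),            # e is never used again
    4: (None, "store", ("a", "b")),       # M[a] <- b
}
# Hand-written live-out sets for this straight-line code.
live_out = {1: {"a", "b", "c"}, 2: {"a", "b"}, 3: {"a", "b"}, 4: set()}

kept = eliminate_dead_code(stmts, live_out)
print(sorted(kept))
```

Statement 3 is removed because e is dead and subtraction is assumed pure. Statement 2 is kept although d is dead, because removing a division could suppress an exception. A single pass like this can expose further dead code, so in practice liveness would be recomputed and the pass repeated.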