CONTROL FLOW ANALYSIS


CONTROL FLOW ANALYSIS

PROGRAM CONTROL FLOW
Control flow: the sequence of operations in a program
Representations: control flow graph, control dependence, call graph
Control flow analysis: analyzing a program to discover its control structure

CONTROL FLOW GRAPH
A CFG models the flow of control in the program (procedure) as a directed graph G = (N, E)
Node n ∈ N: a basic block — a maximal sequence of statements with a single entry point, a single exit point, and no internal branches
For simplicity, we assume a unique entry node n0 and a unique exit node nf in later discussions
Edge e = (ni, nj) ∈ E: a possible transfer of control from block ni to block nj
Example: if (x==y) { ... } else { ... } yields a branch node for the test with one edge to each arm

BASIC BLOCKS
Definition: a basic block is a maximal sequence of consecutive statements with a single entry point, a single exit point, and no internal branches
The basic unit in control flow analysis
The scope of local code optimizations: redundancy elimination, register allocation

BASIC BLOCK EXAMPLE
How many basic blocks are in this code fragment? What are they?
 (1)  i := m
 (2)  j := n
 (3)  t1 := 4 * n
 (4)  v := a[t1]
 (5)  i := i + 1
 (6)  t2 := 4 * i
 (7)  t3 := a[t2]
 (8)  if t3 < v goto (5)
 (9)  j := j - 1
 (10) t4 := 4 * j
 (11) t5 := a[t4]
 (12) if t5 > v goto (9)
 (13) if i >= j goto (23)
 (14) t6 := 4 * i
 (15) x := a[t6]
Answer: five blocks — (1)-(4), (5)-(8), (9)-(12), (13), (14)-(15)

IDENTIFY BASIC BLOCKS
Input: a sequence of intermediate code statements
Determine the leaders, the first statements of basic blocks:
The first statement in the sequence (entry point) is a leader
Any statement that is the target of a branch (conditional or unconditional) is a leader
Any statement immediately following a branch (conditional or unconditional) or a return is a leader
For each leader, its basic block consists of the leader and all statements up to, but not including, the next leader or the end of the program

EXAMPLE: LEADERS
 (1)  i := m                 (16) t7 := 4 * i
 (2)  j := n                 (17) t8 := 4 * j
 (3)  t1 := 4 * n            (18) t9 := a[t8]
 (4)  v := a[t1]             (19) a[t7] := t9
 (5)  i := i + 1             (20) t10 := 4 * j
 (6)  t2 := 4 * i            (21) a[t10] := x
 (7)  t3 := a[t2]            (22) goto (5)
 (8)  if t3 < v goto (5)     (23) t11 := 4 * i
 (9)  j := j - 1             (24) x := a[t11]
 (10) t4 := 4 * j            (25) t12 := 4 * i
 (11) t5 := a[t4]            (26) t13 := 4 * n
 (12) if t5 > v goto (9)     (27) t14 := a[t13]
 (13) if i >= j goto (23)    (28) a[t12] := t14
 (14) t6 := 4 * i            (29) t15 := 4 * n
 (15) x := a[t6]             (30) a[t15] := x
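The leader rules above can be sketched in a few lines of Python. The instruction representation (dicts with an "op" field and an optional "target" index) is an assumption for illustration, not something the slides prescribe.

```python
# Sketch of leader identification and basic-block partitioning.
# Instructions are dicts; branch targets are 0-based instruction indices.

def find_leaders(instrs):
    """Return sorted leader indices: the entry point, every branch
    target, and every instruction that follows a branch or return."""
    leaders = {0}                              # rule 1: entry point
    for i, ins in enumerate(instrs):
        if ins["op"] in ("goto", "if", "return"):
            if "target" in ins:
                leaders.add(ins["target"])     # rule 2: branch target
            if i + 1 < len(instrs):
                leaders.add(i + 1)             # rule 3: after a branch
    return sorted(leaders)

def partition_blocks(instrs):
    """Each basic block runs from a leader up to the next leader."""
    bounds = find_leaders(instrs) + [len(instrs)]
    return [instrs[bounds[k]:bounds[k + 1]] for k in range(len(bounds) - 1)]

# Tiny fragment:  0: x := 1   1: if ... goto 0   2: y := 2
code = [{"op": "assign"}, {"op": "if", "target": 0}, {"op": "assign"}]
blocks = partition_blocks(code)
```

Note that instruction 1 is not itself a leader: it is neither a target nor preceded by a branch, so it joins instruction 0's block.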

EXAMPLE: LEADERS (ANSWER)
Leaders: (1) entry point; (5) target of (8) and (22); (9) target of (12), follows (8); (13) follows (12); (14) follows (13); (23) target of (13), follows (22)

EXAMPLE: BASIC BLOCKS (ANSWER)
B1 = (1)-(4), B2 = (5)-(8), B3 = (9)-(12), B4 = (13), B5 = (14)-(22), B6 = (23)-(30)

GENERATING CFGS
Partition the intermediate code into basic blocks
Add edges corresponding to control flow between blocks:
Unconditional goto → one edge; conditional branch → multiple edges
Sequential flow → control passes to the next block (if no branch at the end)
If there is no unique entry node n0 or exit node nf, add dummy nodes and insert the necessary edges
Ideally no edges enter n0 and no edges exit nf
This simplifies many analysis and transformation algorithms
(Input: the 30-statement leaders example from the previous slides)
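The edge-adding step can be sketched as follows, continuing the hypothetical dict-based instruction representation from the earlier sketch: one edge per explicit branch target, plus a fall-through edge when the block does not end in an unconditional goto.

```python
# Sketch of CFG edge construction over already-partitioned blocks.

def build_cfg(blocks, leader_of):
    """blocks: list of blocks (lists of instruction dicts); the last
    instruction of a block may carry a 'target' leader index and an op
    of 'goto' or 'if'.  leader_of maps a leader index to its block
    number.  Returns a successor map {block#: [block#, ...]}."""
    succ = {i: [] for i in range(len(blocks))}
    for i, blk in enumerate(blocks):
        last = blk[-1]
        if "target" in last:
            succ[i].append(leader_of[last["target"]])   # branch edge
        if last["op"] != "goto" and i + 1 < len(blocks):
            succ[i].append(i + 1)                       # fall-through
    return succ

# Blocks: B0 = [assign], B1 = [if goto B0], B2 = [assign]
blocks = [[{"op": "assign"}],
          [{"op": "if", "target": 0}],
          [{"op": "assign"}]]
succ = build_cfg(blocks, {0: 0})
```

Here B1 gets two successors (the branch target B0 and the fall-through B2), while B0 only falls through.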

EXAMPLE: CFG
Blocks: B1 = (1)-(4), B2 = (5)-(8), B3 = (9)-(12), B4 = (13), B5 = (14)-(22), B6 = (23)-(30)
Edges: B1→B2 (fall-through), B2→B2 (if t3 < v goto (5)), B2→B3, B3→B3 (if t5 > v goto (9)), B3→B4, B4→B6 (goto (23)), B4→B5, B5→B2 (goto (5)), B6→exit

CFG AND HL CODE
I = J = K = L = 1
do
  if (P) {
    J = I
    if (Q) L = 1 else L = 3
    K = K + 1
  }
  else K = K + 2
  print (I, J, K, L)
  do
    if (R) then L = L + 4
  while (S)
  I = I + 6
while (T)

COMPLICATIONS IN CFG CONSTRUCTION
Function calls: instruction scheduling may prefer function calls as basic block boundaries
Special functions such as setjmp() and longjmp(); exception handling
Ambiguous jumps: jump r  // target stored in register r
Static analysis may generate edges that never occur at runtime
Record potential targets if possible

NODES IN CFG
Given a CFG = (N, E), if there is an edge ni → nj ∈ E:
ni is a predecessor of nj; nj is a successor of ni
For any node n ∈ N:
Pred(n): the set of predecessors of n
Succ(n): the set of successors of n
A branch node is a node that has more than one successor
A join node is a node that has more than one predecessor
[Figure: a four-node example CFG with nodes A, B, C, D]

DEPTH-FIRST TRAVERSAL
A CFG is a rooted, directed graph, with the entry node as the root
Depth-first traversal (depth-first search)
Idea: start at the root and explore as far/deep as possible along each branch before backtracking
Can build a spanning tree for the graph
A spanning tree of a directed graph G contains all nodes of G such that:
there is a path from the root to any node reachable in the original graph, and
there are no cycles

DFS SPANNING TREE ALGORITHM
procedure span(v)  /* v is a node in the graph */
  InTree(v) = true
  for each w that is a successor of v do
    if (!InTree(w))
      add edge v → w to the spanning tree
      span(w)
end span
Initial call: span(n0)
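The span() procedure translates almost directly into Python; this is a minimal sketch using a successor map, with the slide's InTree flag kept as a set.

```python
# Direct transcription of span() from the slide.

def dfs_spanning_tree(succ, root):
    """succ: dict node -> list of successors.
    Returns the set of spanning-tree edges."""
    in_tree = {root}
    tree_edges = set()

    def span(v):
        for w in succ.get(v, []):
            if w not in in_tree:
                in_tree.add(w)
                tree_edges.add((v, w))   # edge v -> w joins the tree
                span(w)

    span(root)
    return tree_edges

# Diamond CFG: A -> B, A -> C, B -> D, C -> D
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
tree = dfs_spanning_tree(cfg, "A")
```

D is reached first through B, so C → D does not become a tree edge — it will later be classified as a cross edge.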

DFST EXAMPLE
Nodes are numbered in the order visited during the search — this is the depth-first pre-order numbering
[Figure: a ten-node example CFG (A-J); the second copy labels each node with its visit number, A=1, B=2, C=3, D=4, E=5, G=6, H=7, I=8, F=9, J=10]

CFG EDGE CLASSIFICATION
An edge x → y in a CFG is:
an advancing edge if x is an ancestor of y in the tree —
  a tree edge if it is part of the spanning tree
  a forward edge if it is not part of the spanning tree and x is an ancestor of y in the tree
a retreating edge if it is not part of the spanning tree and y is an ancestor of x in the tree
a cross edge if it is not part of the spanning tree and neither node is an ancestor of the other

DFST EXAMPLE
[Figure: the ten-node example CFG with its edges classified as tree, forward, retreating, and cross edges]

BACK EDGES AND REDUCIBILITY
An edge x → y in a CFG is a back edge if every path from the entry node of the flow graph to x goes through y
(y dominates x — more details later)
Every back edge is a retreating edge. Vice versa? Not necessarily.
A flow graph is reducible if all its retreating edges in any DFST are also back edges
Flow graphs that occur in practice are almost always reducible

NON-REDUCIBLE GRAPHS
Testing reducibility: take any DFST for the flow graph, remove the back edges, and check that the result is acyclic
[Figure: the classic irreducible triangle A → B, A → C, B → C, C → B; in any DFST, whichever of the edges between B and C is traversed second will be a retreating edge, but it is not a back edge]
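The reducibility test on this slide can be sketched directly: compute dominators (with the iterative algorithm covered later in these notes), delete every back edge (x → y where y dominates x), and check that what remains is acyclic.

```python
# Reducibility test sketch: remove back edges, then look for cycles.

def dominators(succ, entry):
    nodes = set(succ) | {w for ws in succ.values() for w in ws}
    preds = {n: set() for n in nodes}
    for v, ws in succ.items():
        for w in ws:
            preds[w].add(v)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for n in nodes - {entry}:
            if not preds[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def is_reducible(succ, entry):
    dom = dominators(succ, entry)
    # Keep only non-back edges (w dominating v makes v -> w a back edge).
    fwd = {v: [w for w in ws if w not in dom[v]] for v, ws in succ.items()}
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def has_cycle(v):                    # DFS cycle detection
        color[v] = GREY
        for w in fwd.get(v, []):
            c = color.get(w, WHITE)
            if c == GREY or (c == WHITE and has_cycle(w)):
                return True
        color[v] = BLACK
        return False

    nodes = set(succ) | {w for ws in succ.values() for w in ws}
    return not any(color.get(n, WHITE) == WHITE and has_cycle(n)
                   for n in nodes)

simple = {"A": ["B"], "B": ["B", "C"], "C": []}        # self-loop: reducible
triangle = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}   # slide's counterexample
```

On the triangle, neither B nor C dominates the other, so neither edge between them is removed and the cycle survives — the graph is irreducible.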

NODES ORDERING WRT DFST
Enhanced depth-first spanning tree algorithm:
time = 0;
procedure span(v)  /* v is a node in the graph */
  InTree(v) = true;
  d[v] = ++time;
  for each w that is a successor of v do
    if (!InTree(w)) then
      add edge v → w to spanning tree
      span(w)
  f[v] = ++time;
end span
Associates two numbers with each node v in the graph:
d[v]: discovery time of v in the traversal
f[v]: finish time of v in the traversal

NODES ORDERING WRT DFST
Pre-ordering: ordering of vertices based on discovery time
Post-ordering: ordering of vertices based on finish time
Reverse post-ordering: the reverse of a post-ordering, i.e., ordering of vertices in the opposite order of their finish times
Not the same as pre-ordering
Commonly used in forward data-flow analysis
Backward data-flow analysis: RPO on the reverse CFG
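A sketch of the timestamped DFS and the three orderings derived from it, matching the d[v]/f[v] pseudocode above:

```python
# Timestamped DFS: d[v] on discovery, f[v] on finish, then derive
# pre-order, post-order, and reverse post-order.

def dfs_orderings(succ, root):
    d, f = {}, {}
    time = [0]                           # mutable counter shared by span()

    def span(v):
        time[0] += 1
        d[v] = time[0]                   # discovery time
        for w in succ.get(v, []):
            if w not in d:
                span(w)
        time[0] += 1
        f[v] = time[0]                   # finish time

    span(root)
    pre = sorted(d, key=lambda v: d[v])
    post = sorted(f, key=lambda v: f[v])
    rpo = post[::-1]
    return pre, post, rpo

# The four-node example used on the next slide: D -> E, D -> F, E -> G, F -> G
g = {"D": ["E", "F"], "E": ["G"], "F": ["G"]}
pre, post, rpo = dfs_orderings(g, "D")
```

This reproduces the slide's orderings: pre-order D E G F, post-order G E F D, reverse post-order D F E G — note RPO differs from pre-order.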

ORDERING EXAMPLE
[Figure: D → E, D → F, E → G, F → G; discovery/finish times d/f: D=1/8, E=2/5, G=3/4, F=6/7]
Pre-ordering: D E G F
Post-ordering: G E F D
Reverse post-ordering: D F E G

BIG PICTURE
Why care about orderings / back edges?
CFGs are commonly used to propagate information between nodes (basic blocks) — data-flow analysis
The existence of back edges / cycles in a flow graph indicates that we may need to traverse the graph more than once
Iterative algorithms: when can we stop? How quickly can we stop?
A proper ordering of nodes during an iterative algorithm ensures the number of passes is limited by the number of nested back edges

REGIONS IN CFG
Extended basic block (EBB)
An EBB is a maximal set of nodes in a CFG that contains no join nodes other than the entry node
A single entry and possibly multiple exits
Some optimizations, like value numbering and instruction scheduling, are more effective when applied to EBBs
Natural loop
A loop is a collection of nodes in a CFG such that:
all nodes in the collection are strongly connected, and
the collection has a unique entry — the only way to reach the loop from outside
A loop that contains no other loops is an inner loop
The main target of program optimizations

EBB EXAMPLE
[Figure: the ten-node example CFG]
Max-size EBBs: {A,B}, {C,J}, {D,E,F}, {G,H,I}
Loops? Not that obvious — can use dominator-based loop detection

DOMINANCE
Node d of a CFG dominates node n if every path from the entry node of the graph to n passes through d (d dom n)
Dom(n): the set of dominators of node n
Every node dominates itself: n ∈ Dom(n)
Node d strictly dominates n if d ∈ Dom(n) and d ≠ n
Dominance-based loop recognition: the entry of a loop dominates all nodes in the loop
Each node n has a unique immediate dominator m, which is the last dominator of n on any path from the entry to n (m idom n), m ≠ n
The immediate dominator m of n is the strict dominator of n that is closest to n

DOMINATOR EXAMPLE
Block  Dom                IDom
1      {1}                -
2      {1,2}              1
3      {1,3}              1
4      {1,3,4}            3
5      {1,3,4,5}          4
6      {1,3,4,6}          4
7      {1,3,4,7}          4
8      {1,3,4,7,8}        7
9      {1,3,4,7,8,9}      8
10     {1,3,4,7,8,10}     8

DOMINATOR TREES
In a dominator tree, a node's parent is its immediate dominator

OTHER SETS OF INTEREST
Block  SDom           Dom⁻¹ (nodes dominated by the block)
1      {}             {1,2,3,4,5,6,7,8,9,10}
2      {1}            {2}
3      {1}            {3,4,5,6,7,8,9,10}
4      {1,3}          {4,5,6,7,8,9,10}
5      {1,3,4}        {5}
6      {1,3,4}        {6}
7      {1,3,4}        {7,8,9,10}
8      {1,3,4,7}      {8,9,10}
9      {1,3,4,7,8}    {9}
10     {1,3,4,7,8}    {10}

EXAMPLE
Block  Dom                 IDom
1      {1}                 -
2      {1,2}               1
3      {1,2,3}             2
4      {1,2,3,4}           3
5      {1,2,3,5}           3
6      {1,2,3,6}           3
7      {1,2,7}             2
8      {1,2,8}             2
9      {1,2,8,9}           8
10     {1,2,8,9,10}        9
11     {1,2,8,9,11}        9
12     {1,2,8,9,11,12}     11

ALGORITHM: COMPUTING DOM
An iterative fixed-point calculation; N is the set of nodes in the CFG
DOM(n0) = {n0}  (n0 is the entry)
For all nodes x ≠ n0: DOM(x) = N
Until no more changes to the dominator sets:
  for all nodes x ≠ n0:
    DOM(x) = {x} ∪ (∩ DOM(P) over all predecessors P of x)
At termination, d ∈ DOM(n) iff d dominates n

DOMINATOR EXAMPLE
Block  initial  iteration 1
0      {0}      {0}
1      N        {1} ∪ (Dom(0) ∩ Dom(9)) = {0,1}
2      N        {2} ∪ Dom(1) = {0,1,2}
3      N        {3} ∪ (Dom(1) ∩ Dom(2) ∩ Dom(8) ∩ Dom(4)) = {0,1,3}
4      N        {4} ∪ (Dom(3) ∩ Dom(7)) = {0,1,3,4}
5      N        {5} ∪ Dom(4) = {0,1,3,4,5}
6      N        {6} ∪ Dom(4) = {0,1,3,4,6}
7      N        {7} ∪ (Dom(5) ∩ Dom(6) ∩ Dom(10)) = {0,1,3,4,7}
8      N        {8} ∪ Dom(7) = {0,1,3,4,7,8}
9      N        {9} ∪ Dom(8) = {0,1,3,4,7,8,9}
10     N        {10} ∪ Dom(8) = {0,1,3,4,7,8,10}
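The fixed-point calculation above can be sketched compactly; this version takes a predecessor map rather than a successor map, which matches the equation directly.

```python
# Iterative fixed-point DOM computation from the slide.

def compute_dom(preds, entry):
    """preds: dict node -> list of predecessors. Returns DOM sets."""
    nodes = set(preds)
    dom = {n: set(nodes) for n in nodes}    # all x != n0 start at N
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Diamond with a join: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
dom = compute_dom(preds, 0)
```

At the join node 3, the intersection of the two incoming DOM sets leaves only the entry, so Dom(3) = {0, 3}.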

DOMINATOR EXAMPLE
Block  initial  iteration 1         iteration 2
0      {0}      {0}                 {0}
1      N        {0,1}               {0,1}
2      N        {0,1,2}             {0,1,2}
3      N        {0,1,3}             {0,1,3}
4      N        {0,1,3,4}           {0,1,3,4}
5      N        {0,1,3,4,5}         {0,1,3,4,5}
6      N        {0,1,3,4,6}         {0,1,3,4,6}
7      N        {0,1,3,4,7}         {0,1,3,4,7}
8      N        {0,1,3,4,7,8}       {0,1,3,4,7,8}
9      N        {0,1,3,4,7,8,9}     {0,1,3,4,7,8,9}
10     N        {0,1,3,4,7,8,10}    {0,1,3,4,7,8,10}

COMPUTING IDOM FROM DOM
1. For each node n, initially set IDOM(n) = DOM(n) − {n} (SDOM — the strict dominators)
2. For each node p in IDOM(n), see if p has dominators other than itself also included in IDOM(n); if so, remove them from IDOM(n)
The immediate dominator m of n is the strict dominator of n that is closest to n
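The two-step IDOM extraction above is short enough to sketch directly; it assumes DOM sets have already been computed (for example by the iterative algorithm).

```python
# IDOM extraction: start from the strict dominators of n, then discard
# every node that is itself strictly dominated by another candidate.

def compute_idom(dom, entry):
    idom = {}
    for n, ds in dom.items():
        if n == entry:
            continue                       # the entry has no IDOM
        cand = set(ds) - {n}               # SDOM(n)
        for p in list(cand):
            cand -= (dom[p] - {p})         # drop p's own strict dominators
        idom[n] = cand.pop()               # exactly one node remains
    return idom

# DOM sets for a small tree-shaped example: 0 -> 1 -> {2, 3, 4}
dom = {0: {0}, 1: {0, 1}, 2: {0, 1, 2}, 3: {0, 1, 3}, 4: {0, 1, 4}}
idom = compute_idom(dom, 0)
```

For node 2, the candidates start as {0, 1}; since 0 strictly dominates candidate 1, node 0 is removed, leaving 1 as the immediate dominator.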

I-DOMINATOR EXAMPLE
Block  initial (SDOM)   IDom
0      {}               -
1      {0}              {0}
2      {0,1}            {1}   // 0 is 1's dominator
3      {0,1}            {1}   // 0 is 1's dominator
4      {0,1,3}          {3}   // 0, 1 are 3's dominators
5      {0,1,3,4}        {4}   // 0, 1, 3 are 4's dominators
6      {0,1,3,4}        {4}   // 0, 1, 3 are 4's dominators
7      {0,1,3,4}        {4}   // 0, 1, 3 are 4's dominators
8      {0,1,3,4,7}      {7}   // 0, 1, 3, 4 are 7's dominators
9      {0,1,3,4,7,8}    {8}   // 0, 1, 3, 4, 7 are 8's dominators
10     {0,1,3,4,7,8}    {8}   // 0, 1, 3, 4, 7 are 8's dominators

POST-DOMINANCE
A related concept (we will look at it more later)
Node d of a CFG post-dominates node n if every path from n to the exit node passes through d (d pdom n)
Pdom(n): the set of post-dominators of node n
Every node post-dominates itself: n ∈ Pdom(n)
Each node n has a unique immediate post-dominator m

POST-DOMINATOR EXAMPLE
Block  Pdom                       IPdom
1      {1,3,4,7,8,10,exit}        3
2      {2,3,4,7,8,10,exit}        3
3      {3,4,7,8,10,exit}          4
4      {4,7,8,10,exit}            7
5      {5,7,8,10,exit}            7
6      {6,7,8,10,exit}            7
7      {7,8,10,exit}              8
8      {8,10,exit}                10
9      {1,3,4,7,8,9,10,exit}      1
10     {10,exit}                  exit
[Figure: the example CFG, with the unique exit node reached from block 10]

NATURAL LOOPS
Natural loops that are suitable for improvement have two essential properties:
A loop must have a single entry point, called the header
There must be at least one way to iterate the loop, i.e., at least one path back to the header
Identifying natural loops:
Search for back edges (n → d) in the CFG, edges whose heads dominate their tails
For an edge a → b, b is the head and a is the tail
A back edge flows from a node n to one of n's dominators d
The natural loop for that edge is {d} ∪ the set of nodes that can reach n without going through d
d is the header of the loop

BACK EDGE EXAMPLE
Block  Dom                IDom
1      {1}                -
2      {1,2}              1
3      {1,3}              1
4      {1,3,4}            3
5      {1,3,4,5}          4
6      {1,3,4,6}          4
7      {1,3,4,7}          4
8      {1,3,4,7,8}        7
9      {1,3,4,7,8,9}      8
10     {1,3,4,7,8,10}     8
Back edges?

IDENTIFYING NATURAL LOOPS
Given a back edge n → d, the natural loop of the edge includes:
node d, and
any node that can reach n without going through d
Loop construction:
set loop = {d}
add n to loop if n ≠ d
for each node m ≠ d that we know is in loop, make sure that m's predecessors are also inserted into loop

NATURAL LOOPS EXAMPLE
Back edge    Natural loop
10 → 7       {7, 10, 8}
7 → 4        {4, 7, 5, 6, 10, 8}
8 → 3        {3, 4, 7, 5, 6, 10, 8}
9 → 1        {1, 9, 8, 7, 5, 6, 10, 4, 3, 2}
Why is neither {3,4} nor {4,5,6,7} a natural loop? Other nodes can reach the loop's tail without passing through its header (e.g., 8 and 10 reach 7 without going through 4), so a natural loop must include them.
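The loop-construction steps above amount to a small worklist walk over predecessors; a minimal Python sketch:

```python
# Natural loop of a back edge n -> d: {d} plus everything that can
# reach n without passing through d (worklist over predecessors).

def natural_loop(preds, n, d):
    loop = {d}
    worklist = []
    if n != d:
        loop.add(n)
        worklist.append(n)
    while worklist:
        m = worklist.pop()
        for p in preds.get(m, []):     # predecessors must join the loop
            if p not in loop:
                loop.add(p)
                worklist.append(p)
    return loop

# Loop 3 -> 4 -> 5 -> 3, entered from node 1; back edge is 5 -> 3.
preds = {3: [1, 5], 4: [3], 5: [4]}
loop = natural_loop(preds, 5, 3)
```

Note that node 1 (the header's predecessor from outside) is never added: the header d is placed in the loop without ever being put on the worklist, so the walk never continues past it.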

INNER LOOPS
A useful property of natural loops: unless two loops have the same header, they are either disjoint or one is entirely contained in (nested within) the other
[Figure: nested loops over blocks B0-B3]
An inner loop is a loop that contains no other loops — a good optimization candidate
The inner loop of the previous example: {7, 8, 10}

DOMINANCE FRONTIERS
For a node n in the CFG, DF(n) denotes the dominance frontier set of n
DF(n) contains all nodes x such that n dominates an immediate predecessor of x but does not strictly dominate x
For this to happen, there is some path from node n to x, n → … → y → x, where (n DOM y) but !(n SDOM x)
Informally, DF(n) contains the first nodes reachable from n that n does not strictly dominate, on each CFG path leaving n
Used in SSA construction and redundancy elimination

DOMINANCE FRONTIER FOR A NODE
[Figure: paths of interest leaving node 7 in the example CFG]
DF(7) = {1, 3, 4, 7}

DOMINANCE FRONTIER FOR A NODE
[Figure: paths of interest leaving node 4 in the example CFG]
DF(4) = {1, 3, 4}

COMPUTING DOMINANCE FRONTIERS
Easiest way: DF(x) = SUCC(Dom⁻¹(x)) − SDom⁻¹(x), where SUCC(x) is the set of successors of x in the CFG
But this is not the most efficient way
Observations:
Nodes in a DF must be join nodes
The predecessor of any join node j must have j in its DF, unless it dominates j
The dominators of j's predecessors must have j in their DF sets, unless they also dominate j

COMPUTING DOMINANCE FRONTIERS
for all nodes n, initialize DF(n) = Ø
for all nodes n
  if n has multiple predecessors, then
    for each predecessor p of n
      runner = p
      while (runner ≠ IDom(n))
        DF(runner) = DF(runner) ∪ {n}
        runner = IDom(runner)
First identify the join nodes j in the CFG
Starting with j's predecessors, walk up the dominator tree until we reach the immediate dominator of j
Node j is included in the DF set of every node we pass, except for j's immediate dominator
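The runner-based algorithm above translates directly; it assumes the IDom map has already been computed.

```python
# Dominance frontiers via the runner walk from the slide.

def dominance_frontiers(preds, idom):
    """preds: dict node -> list of predecessors; idom: immediate
    dominator of each non-entry node. Returns DF sets."""
    df = {n: set() for n in preds}
    for n, ps in preds.items():
        if len(ps) < 2:
            continue                       # only join nodes contribute
        for p in ps:
            runner = p
            while runner != idom[n]:       # walk up the dominator tree
                df[runner].add(n)
                runner = idom[runner]
    return df

# Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3 (idom of 1, 2, 3 is 0)
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
idom = {1: 0, 2: 0, 3: 0}
df = dominance_frontiers(preds, idom)
```

Both runners for join node 3 stop as soon as they reach IDom(3) = 0, so only nodes 1 and 2 pick up 3 in their frontier — the entry itself does not.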

COMPUTING DOMINANCE FRONTIER
Join node 1 (predecessors 0 and 9):
runner = 0 = IDom(1): stop
runner = 9: DF(9) += {1}
runner = 8: DF(8) += {1}
runner = 7: DF(7) += {1}
runner = 4: DF(4) += {1}
runner = 3: DF(3) += {1}
runner = 1: DF(1) += {1}
runner = 0 = IDom(1): stop

COMPUTING DOMINANCE FRONTIER
Join node 3 (predecessors 1, 2, 4, 8):
runner = 1 = IDom(3): stop
runner = 2: DF(2) += {3}
runner = 4: DF(4) += {3}
runner = 3: DF(3) += {3}
runner = 8: DF(8) += {3}
runner = 7: DF(7) += {3}

COMPUTING DOMINANCE FRONTIER
Join node 4 (predecessors 3 and 7):
runner = 3 = IDom(4): stop
runner = 7: DF(7) += {4}
runner = 4: DF(4) += {4}

Join node 7 (predecessors 5, 6, 10):
runner = 5: DF(5) += {7}
runner = 6: DF(6) += {7}
runner = 10: DF(10) += {7}
runner = 8: DF(8) += {7}
runner = 7: DF(7) += {7}

DOMINANCE FRONTIER EXAMPLE
Block  DF
0      {}
1      {1}
2      {3}
3      {1,3}
4      {1,3,4}
5      {7}
6      {7}
7      {1,3,4,7}
8      {1,3,7}
9      {1}
10     {7}

EXAMPLE
[Figure: an exercise CFG; fill in the DF set for each block]

DOMINATOR-BASED ANALYSIS
Idea: use dominators to discover loops for optimization
Advantages:
Sufficient for use by iterative data-flow analysis and optimizations
Least time-intensive to implement
Favored by most current optimizing compilers
Alternative approaches: interval-based analysis / structural analysis

REDUNDANT EXPRESSION ELIMINATION

REDUNDANT EXPRESSIONS
Definition: an expression x op y is redundant at a point p if it has already been computed and no intervening operations redefine x or y
Optimization for a redundant expression:
Preserve the result of the earlier computation
Replace subsequent evaluations with references to the saved value
Safety: need to prove that x op y is redundant
Today: redundancy elimination at different levels

REDUNDANT EXPRESSION EXAMPLE
An expression x op y is redundant at a point p if it has already been computed and no intervening operations redefine x or y

m = 2*y*z        t0 = 2*y          t0 = 2*y
n = 3*y*z        m = t0*z          m = t0*z
o = 2*y - z      t1 = 3*y          t1 = 3*y
                 n = t1*z          n = t1*z
                 t2 = 2*y  (redundant)   t2 = t0
                 o = t2 - z        o = t0 - z

Redundancy elimination + copy propagation + dead code elimination then remove t2 = t0 entirely

REDUNDANCY ELIMINATION
Tasks:
Need to prove that x op y is redundant
Need to rewrite the code
In basic blocks: using DAGs, using value numbering
Beyond basic blocks: must consider all paths between the occurrences

DAG REPRESENTATIONS
A DAG (directed acyclic graph) for a basic block labels its nodes as follows:
Leaves are labeled by atomic operands (unique identifiers/numbers)
Interior nodes are labeled by operators/ids, with edges pointing to their operands
Nodes can have multiple labels, since they represent computed values
Example: x := y op z; a := y op z — a single interior node "op" labeled x,a pointing to leaves y and z

GENERATING DAGS FROM IR
Process the statements in a basic block sequentially. For statement i: x := y op z
1. If a node for y op z exists, add x to the labels for that node; otherwise add a new node for op (the hardest part!). If y or z already exist in the DAG, point to the existing locations; otherwise add leaves for y and/or z and have the op node point to them. Label the op node with x
2. If x existed previously as a leaf, subscript that previous entry
3. If x was previously associated with another interior node, remove that previous entry
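A minimal DAG builder along the lines of rules 1-3 above. This is a simplified sketch: the node/tuple representation is an assumption, and the leaf-subscripting of rule 2 (for reassigned operands) is omitted.

```python
# Basic-block DAG construction: repeated computations share one node,
# and a reassigned destination's label moves to the new node (rule 3).

def build_dag(block):
    """block: list of (dst, op, src1, src2) tuples.
    Returns (nodes, cur): nodes is the DAG, cur maps each name to the
    node id currently holding its value."""
    nodes = []        # node id = index into this list
    cur = {}          # name -> node id currently holding its value
    cache = {}        # (op, arg ids) -> node id, for redundancy detection

    def leaf(name):
        if name not in cur:
            nodes.append({"op": None, "name": name, "labels": [name]})
            cur[name] = len(nodes) - 1
        return cur[name]

    for dst, op, a, b in block:
        key = (op, leaf(a), leaf(b))
        if key in cache:
            nid = cache[key]                 # redundant: reuse the node
        else:
            nodes.append({"op": op, "args": key[1:], "labels": []})
            nid = cache[key] = len(nodes) - 1
        if dst in cur and dst in nodes[cur[dst]]["labels"]:
            nodes[cur[dst]]["labels"].remove(dst)   # rule 3: move label
        nodes[nid]["labels"].append(dst)
        cur[dst] = nid
    return nodes, cur

block = [
    ("t0", "*", "2", "y"),
    ("m",  "*", "t0", "z"),
    ("t1", "*", "2", "y"),   # same value as t0: shares its node
]
nodes, cur = build_dag(block)
```

After the third statement, t1 is simply an extra label on the node already labeled t0 — exactly the sharing the DAG example slides illustrate.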

DAG EXAMPLE
Code: t0 := 2*y;  m := t0*z;  t1 := 3*y;  n := t1*z;  t1 := 2*y;  o := t1*z
[Figure: the DAG built step by step]
t0 := 2*y creates node (* 2 y), labeled t0
m := t0*z creates node (* t0 z), labeled m
t1 := 3*y creates node (* 3 y), labeled t1
n := t1*z creates node (* t1 z), labeled n
t1 := 2*y finds the existing node (* 2 y): the label t1 moves to that node, and its old association is removed
o := t1*z finds the existing node (* t0 z): o is added to its labels — the node is now labeled m,o

GENERATING CODE FROM DAGS
Generate code by graph traversal:
t0 = 2 * y
t1 = 3 * y
n = t1 * z
m = t0 * z
o = m
t1 = t0
[Figure: the final DAG from the previous example]

DAG GENERATION PROBLEMS
Arrays
Must equate all references to a given array, since for a[i] and a[j], i may or may not equal j
  x = a[i]; a[j] = y; z = a[i]   /* multiple references to array a */
Pointers
How can we determine where a pointer is pointing in memory?
  *p = 0;   /* increment the subscript of every variable that may be modified */

ARRAY
x = a[i]; a[j] = y; z = a[i]
[Figure: DAG with array-access nodes []= and =[] over leaves a0, i0, y0, j0, labeled x and z]

BLOCK OPTIMIZATION AND ARRAY REFERENCES
Array references behave differently:
  x = a[i]; a[j] = y; z = a[i]   vs.   x = a[i]; a[j] = y; z = x
If i ≠ j, the two fragments are equivalent
If i = j, they are not equivalent
Different logic is needed when building a DAG for an array reference

BLOCK OPTIMIZATION AND ARRAY REFERENCES
x = a[i]; a[j] = y; z = a[i]
[Figure: DAG in which the assignment a[j] = y kills the node for the first a[i], so z = a[i] must build a new access node over leaves a0, i0, j0, y0]

POINTER ASSIGNMENTS AND PROCEDURE CALLS
The =* operator must take as arguments all nodes associated with an identifier
It must kill all DAG nodes built up to that point

REDUNDANCY ELIMINATION
Tasks:
Need to prove that x op y is redundant
Need to rewrite the code
In basic blocks: using DAGs, using value numbering
Beyond basic blocks: must consider all paths between the occurrences

LOCAL VALUE NUMBERING (LVN)
Goal: group expressions that provably have the same value
Associate a unique value number with each distinct value created or used within a block
Two expressions have the same value number if and only if they are provably identical for all possible operands
The classical way to fold constants and eliminate redundant expressions

LVN ALGORITHM
Construct a value table for a basic block: a hash table that maps variables, constants, and computed values to their value numbers
Start with an empty value table
For each operation o = o1 operator o2 in the block:
1. Get value numbers for the operands via hash lookup in the value table
2. Hash <operator, VN(o1), VN(o2)> to get a value number for o
3. If o already had a value number, replace o with a reference; otherwise generate a new value number for o
4. Record o's value number in the value table
If hashing behaves, the algorithm runs in linear time
Minor issues: commutative operations; look at the VN of an operand, not its name

EXAMPLE
a = i + 1
b = i + 1
i = j
if i + 1 goto L1

For variable a: hash <+, VN(i), VN(1)> — get a new number
Name     VN
i        1
1        2
(1)+(2)  3
a        3
((#) means the entry associated with number #)
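The four steps above can be sketched in Python. The tuple-based IR and the "copy" rewrite form are assumptions for illustration; commutativity is handled by sorting operand value numbers, and, as the later "naming issues" slides discuss, the rewrite is only safe if the name holding a value is not later overwritten (SSA-like names).

```python
# Local value numbering sketch: hash <op, VN(a), VN(b)> and replace
# repeated computations with copies.

def lvn(block):
    """block: list of (dst, op, a, b). Returns the rewritten block."""
    vn = {}          # name or constant -> value number
    table = {}       # (op, vn1, vn2) -> value number
    home = {}        # value number -> a name holding that value
    next_vn = [0]

    def number_of(x):
        if x not in vn:
            vn[x] = next_vn[0]
            next_vn[0] += 1
        return vn[x]

    out = []
    for dst, op, a, b in block:
        key = (op, *sorted((number_of(a), number_of(b))))  # commutative
        if key in table:
            v = table[key]
            out.append((dst, "copy", home[v], None))       # redundant
        else:
            v = next_vn[0]; next_vn[0] += 1
            table[key] = v
            home[v] = dst
            out.append((dst, op, a, b))
        vn[dst] = v                                        # record dst's VN
    return out

code = [("a", "+", "i", "1"),
        ("b", "+", "i", "1"),    # same VN key: becomes a copy of a
        ("c", "+", "1", "i")]    # commuted operands: also a copy of a
new = lvn(code)
```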

EXAMPLE
For variable b: hash <+, VN(i), VN(1)> — get the existing number 3
Name     VN
i        1
1        2
(1)+(2)  3
a        3
b        3

EXAMPLE
After i = j, VN(i) is changed to 4
Name     VN
i        4
1        2
(1)+(2)  3
a        3
b        3
j        4

EXAMPLE
For the condition i + 1: hash <+, VN(i), VN(1)> = <+, (4), (2)> — a new number
Name     VN
i        4
1        2
(1)+(2)  3
a        3
b        3
j        4
(2)+(4)  5

EXAMPLE
a^3 = i^1 + 1^2          a = i + 1
b^3 = i^1 + 1^2          b = a
i^4 = j^4                i = j
if (i^4 + 1^2)^5 goto L1   if i + 1 goto L1
a and b are given the same value number, but not the condition expression of the if statement

NAMING ISSUES
Original code   With VNs            Rewritten
a ← x + y       a^3 ← x^1 + y^2     a^3 ← x^1 + y^2
b ← x + y       b^3 ← x^1 + y^2     b^3 ← a^3
a ← 17          a^4 ← 17^4          a^4 ← 17^4
c ← x + y       c^3 ← x^1 + y^2     c^3 ← a^3 (?? — a no longer holds VN 3)
Options:
Use c^3 ← b^3
Save a^3 in t^3
Give each value a unique name

NAMING ISSUES
Original code   With VNs            Rewritten
a ← x + y       a^3 ← x^1 + y^2     a0^3 ← x0^1 + y0^2
b ← x + y       b^3 ← x^1 + y^2     b0^3 ← a0^3
a ← 17          a^4 ← 17^4          a1^4 ← 17^4
c ← x + y       c^3 ← x^1 + y^2     c0^3 ← a0^3
Give each value a unique name — no value is ever killed
These are SSA names (static single assignment)
We will see how to construct SSA later in this course

LVN LIMITATIONS
a^3 = i^1 + 1^2          a = i + 1
b^3 = i^1 + 1^2          b = a
i^4 = j^4                i = j
if i + 1 goto L1         t = i + 1
c = i + 1                if t goto L1
                         c = t
LVN cannot eliminate the redundant expression c = i + 1. Why? The branch ends the basic block, so c = i + 1 lies in a different block from the earlier evaluation of i + 1 — out of LVN's reach.

EXTENSIONS TO LVN
Constant folding
Add a bit that records when a value is constant
Evaluate constant values at compile time
Replace with a load immediate or an immediate operand
No stronger local algorithm exists
Algebraic identities: x+0, x−0, x×1, x÷1, x−x, x×0, x÷x, x∨0, x∧0xFF…FF, max(x,MAXINT), min(x,MININT), max(x,x), min(y,y), and so on
Must check (many) special cases
Replace the result with a copy operation
Build a decision tree on the operation
Work with values, not names

FINDING/FOLDING CONSTANTS
i = 2
j = i * 2
k = i + 1
Expression  Value number  Constant value
2           1             const = 2
i           1             const = 2

FINDING/FOLDING CONSTANTS
i = 2
j^2 = i * 2
k = i + 1
Expression  Value number  Constant value
2           1             const = 2
i           1             const = 2
(1)*(1)     2             const = 4
j           2             const = 4

FINDING/FOLDING CONSTANTS
i = 2            i = 2
j^2 = i * 2      j = 4
k^4 = i + 1      k = 3
Expression  Value number  Constant value
2           1             const = 2
i           1             const = 2
(1)*(1)     2             const = 4
j           2             const = 4
1           3             const = 1
(1)+(3)     4             const = 3
k           4             const = 3

LOCAL VALUE NUMBERING
Safety
x op y has been computed: the hash table starts empty
Operands not redefined: the mapping uses VN(x) and VN(y), not x and y
With SSA, no value is ever redefined
Profitability
Assumes a copy is cheaper than an operation
Loading a constant is cheaper than an operation
Algebraic identities: do not include non-profitable ones
Opportunity
Linear scan of the basic block — an exhaustive search for optimization opportunities

REDUNDANCY ACROSS BLOCKS
[CFG: A → B, A → C; C → D, C → E; D → F, E → F; B → G, F → G]
A: m = a + b; n = a + b        (LVN catches n)
B: p = c + d; r = c + d        (LVN catches r)
C: q = a + b; r = c + d        ← missed by LVN
D: e = b + 8; s = a + b; u = e + f   ← s missed by LVN
E: e = a + 7; t = c + d; u = e + f   ← t missed by LVN
F: v = a + b; w = c + d; x = e + f   ← missed by LVN
G: y = a + b; z = c + d        ← missed by LVN
Need to consider regions larger than basic blocks in the CFG

REDUNDANCY ACROSS BLOCKS
Expressions evaluated in some predecessor/ancestor block are missed
LVN: each block starts with an empty hash table — can we change this?
Superlocal value numbering (SVN): work on EBBs

SUPERLOCAL VALUE NUMBERING
EBBs for this CFG: {A,B,C,D,E}, {F}, {G}
Consider each path through an EBB: AB, ACD, ACE
EBB: only the entry node can be a join node

SUPERLOCAL VALUE NUMBERING
Idea
Apply the local method to each path in the EBB as if the blocks on a path were a single block: AB, ACD, ACE
Results from ancestors can be propagated to descendants: A → C → D, A → C → E
Use A's hash table to initialize B's and C's
Efficiency
Avoid re-analyzing common ancestors
Use a stack-like scoped hash table: A, AB, A, AC, ACD, AC, ACE

REDUNDANCY ACROSS BLOCKS
With SVN applied to the EBB paths, the redundancies in C, D, and E are caught as well; the opportunities in blocks F and G are still missed by both LVN and SVN
Rewriting: need to map value numbers to unique names — names cross block boundaries

NAMING ISSUES
Need a VN-to-name mapping to handle kills: use the SSA name space
Add subscripts to variable names for uniqueness
Insert φ-functions at merge points to reconcile the name spaces
x ← …                x0 ← …
  x ← …      becomes   x1 ← …
x + …                x2 ← φ(x0, x1)
                     x2 + …

SSA FORM
A: m0 = a0 + b0; n0 = a0 + b0                  (LVN)
B: p0 = c0 + d0; r0 = c0 + d0                  (LVN)
C: q0 = a0 + b0; r1 = c0 + d0                  (SVN)
D: e0 = b0 + 8; s0 = a0 + b0; u0 = e0 + f0     (SVN)
E: e1 = a0 + 7; t0 = c0 + d0; u1 = e1 + f0     (SVN)
F: e2 = φ(e0, e1); u2 = φ(u0, u1); v0 = a0 + b0; w0 = c0 + d0; x0 = e2 + f0
G: r2 = φ(r0, r1); y0 = a0 + b0; z0 = c0 + d0
SVN does not help block F or G — how do we process join nodes?

LARGER REGIONS
Problem with join nodes: multiple predecessors
For block F, combine the VN tables of D and E? Merging states is expensive
Fall back on what is known: both paths to F share the common prefix {A, C}
Dominator-based value numbering
Use the VN table of IDOM(J) as the initial state for processing any join node J
Start with the VN table produced by processing C for block F
Will block D or E interfere? SSA ensures that D and E can add information to the value table, but they cannot invalidate it

DOMINATOR-BASED VALUE NUMBERING
For join node F:
DOM(F) = {A, C}, IDOM(F) = C
Perform value numbering for F with the table obtained after processing C as the initial state
Join node G? IDOM(G) = A

DOMINATOR-BASED VALUE NUMBERING
DVN features:
Discovers more redundancy (+)
Little additional cost (+)
Still misses some opportunities (−)
No values flow along back edges (−)
Some opportunities are missed by LVN, SVN, and DVN alike

GLOBAL REDUNDANCY ELIMINATION
A global algorithm processes an entire cycle of blocks and can potentially find more redundant operations
Neither SVN nor DVN can propagate information backward along back edges
May need to process a block more than once
Cannot rewrite the code before the analysis phase has finished
Classic method: use data-flow analysis to compute the set of expressions available on entry to each block — AVAIL analysis

AVAIL ANALYSIS
An expression e is defined at point p if its value is computed at p
  p is called a definition site for e
An expression e is killed at point p if one or more of its operands is defined at p
  p is called a kill site for e
An expression e is available at a point p if every path leading to p contains a definition of e, and e is not killed between that definition and p
An expression x op y is redundant at a point p if it has already been computed and no intervening operations redefine x or y

REDUNDANT EXPRESSIONS
Candidates: a + b, a * c, d * d, c * 2, i + 1
B1: c = a + b; d = a * c; i = 1       ← definition site for a + b
B2: f[i] = a + b; c = c * 2; if c > d   ← a + b is available here: redundant!
B3: g = a * c       B4: g = d * d
B5: i = i + 1; if i > 0

REDUNDANT EXPRESSIONS
The assignment c = c * 2 in B2 is a kill site for a * c: it redefines the operand c
So at g = a * c in B3, a * c is not available — not redundant

FINDING GLOBAL REDUNDANCY
1. Build the CFG
2. For each basic block b, compute local information:
   DEExpr(b) — downward-exposed expressions: e ∈ DEExpr(b) if b evaluates e and none of e's operands is redefined after that evaluation
   ExprKill(b) — expressions killed by definitions in the block
3. Using the local information, compute AVAIL_IN(b) and AVAIL_OUT(b) over the entire CFG
   AVAIL_IN(b) is the set of expressions available on entry to block b

COMPUTING LOCAL INFORMATION
Assume a block B with operations o1, o2, …, ok
VARKILL[B] = {}; DEExpr[B] = {}
for i = k down to 1          // backward through the block
  assume oi is x = y op z
  add x to VARKILL[B]
  if (y ∉ VARKILL[B] && z ∉ VARKILL[B])
    add y op z to DEExpr[B]
// O(k) steps
EXPRKILL[B] = {}
for each expression e in the procedure
  for each variable v ∈ operands(e)
    if (v ∈ VARKILL[B])
      EXPRKILL[B] = EXPRKILL[B] ∪ {e}
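The backward pass for the local sets can be sketched as follows, using simple (dst, op1, op2) triples as an assumed IR and (op1, op2) pairs standing in for the expression "op1 op op2":

```python
# Local AVAIL sets: one backward pass gives VARKILL and the
# downward-exposed expressions; EXPRKILL is derived from VARKILL.

def local_sets(block, all_exprs):
    """block: list of (dst, y, z) triples; all_exprs: all (y, z)
    expression pairs in the procedure.
    Returns (de_expr, var_kill, expr_kill)."""
    var_kill, de_expr = set(), set()
    for dst, y, z in reversed(block):      # backward through the block
        var_kill.add(dst)
        if y not in var_kill and z not in var_kill:
            de_expr.add((y, z))            # no later redefinition: exposed
    expr_kill = {e for e in all_exprs
                 if e[0] in var_kill or e[1] in var_kill}
    return de_expr, var_kill, expr_kill

# Block D from the running example: e = b + 8; s = a + b; u = e + f
block = [("e", "b", "8"), ("s", "a", "b"), ("u", "e", "f")]
exprs = {("a", "b"), ("c", "d"), ("b", "8"), ("e", "f"), ("a", "7")}
de, vk, ek = local_sets(block, exprs)
```

This matches the example slides: all three expressions in block D are downward exposed (e + f is evaluated after e's definition), while e + f is the only candidate expression the block kills.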

EXAMPLE
Set of expressions to be considered: {a+b, c+d, b+8, e+f, a+7}
Block  DEExpr           VARKILL
A      a+b              n, m
B      c+d              p, r
C      a+b, c+d         q, r
D      b+8, a+b, e+f    e, s, u
E      a+7, c+d, e+f    e, t, u
F      a+b, c+d, e+f    v, w, x
G      a+b, c+d         y, z

EXAMPLE
If we change the order of statements in D to: u = e + f; e = b + 8; s = a + b
then DEExpr(D) = {b+8, a+b}: e + f is no longer downward exposed, because e is redefined after it
Block  DEExpr           VARKILL
D      b+8, a+b         e, s, u

COMPUTING LOCAL INFORMATION

The same backward pass (with oi of the form x = y + z) computes VARKILL[B] and DEExpr[B]; the second loop over every expression in the procedure computes

EXPRKILL[B] = { e | some v ∈ operands(e) is in VARKILL[B] }

in O(N) steps, where N is the number of operations.

EXAMPLE

Block   DEExpr             EXPRKILL
A       a+b                {}
B       c+d                {}
C       a+b, c+d           {}
D       b+8, a+b, e+f      e+f
E       a+7, c+d, e+f      e+f
F       a+b, c+d, e+f      {}
G       a+b, c+d           {}
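The two passes above can be turned into a short sketch. This is illustrative code, not the lecture's: operations are encoded as (x, y, z) triples meaning x = y op z (the operator is omitted because it does not affect the analysis), expressions are (y, z) pairs, and the data are block D of the example.

```python
def local_info(block, all_exprs):
    """Compute VARKILL, DEExpr and EXPRKILL for one basic block.

    block: list of (x, y, z) triples meaning x = y op z (operator omitted).
    all_exprs: every expression (y, z) that occurs in the procedure.
    """
    varkill, deexpr = set(), set()
    for x, y, z in reversed(block):            # i = k down to 1
        varkill.add(x)                         # x is (re)defined in the block
        if y not in varkill and z not in varkill:
            deexpr.add((y, z))                 # y op z is downward exposed
    # an expression is killed if the block redefines any of its operands
    exprkill = {e for e in all_exprs if e[0] in varkill or e[1] in varkill}
    return varkill, deexpr, exprkill

# Block D: e = b + 8; s = a + b; u = e + f
D = [("e", "b", "8"), ("s", "a", "b"), ("u", "e", "f")]
exprs = {("a", "b"), ("c", "d"), ("b", "8"), ("e", "f"), ("a", "7")}
varkill, deexpr, exprkill = local_info(D, exprs)
```

This reproduces DEExpr(D) = {b+8, a+b, e+f} and EXPRKILL(D) = {e+f} from the tables; evaluating the reordered version of D (u = e + f first) drops e + f from DEExpr.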

FINDING GLOBAL REDUNDANCY

1. Build the CFG
2. For each basic block b, compute the local information DEExpr(b) and ExprKill(b)
3. Using the local information, compute AVAIL_IN(b) and AVAIL_OUT(b) over the entire CFG

COMPUTING AVAILABLE EXPRESSIONS

For each block b:
- ExprKill(b): set of expressions killed in b
- DEExpr(b): set of downward-exposed expressions
- AVAIL_IN(b): set of expressions available on entry to b

AVAIL_IN(b) = ∩_{x ∈ pred(b)} AVAIL_OUT(x)
AVAIL_OUT(b) = DEExpr(b) ∪ (AVAIL_IN(b) − ExprKill(b))
AVAIL_IN(b0) = Ø, where b0 is the entry node of the CFG

This system of simultaneous equations forms a data-flow problem; solve it with a data-flow algorithm.

ITERATIVE ALGORITHM FOR AVAIL

AVAIL_IN(b0) = Ø
for i = 0 to k
    AVAIL_OUT(bi) = DEExpr(bi)
changed = true
while (changed)
    changed = false
    for i = 0 to k
        OldValue = AVAIL_IN(bi)
        AVAIL_IN(bi) = ∩_{x ∈ pred(bi)} AVAIL_OUT(x)                  // available on ALL incoming edges
        AVAIL_OUT(bi) = DEExpr(bi) ∪ (AVAIL_IN(bi) − ExprKill(bi))    // computed locally, or incoming and not killed
        if AVAIL_IN(bi) ≠ OldValue then changed = true

EXAMPLE

AVAIL_IN[A] = Ø                AVAIL_OUT[A] = {a+b}
AVAIL_IN[B] = {a+b}            AVAIL_OUT[B] = {a+b, c+d}
AVAIL_IN[C] = {a+b}            AVAIL_OUT[C] = {a+b, c+d}
AVAIL_IN[D] = {a+b, c+d}       AVAIL_OUT[D] = {a+b, c+d, b+8, e+f}
AVAIL_IN[E] = {a+b, c+d}       AVAIL_OUT[E] = {a+b, c+d, a+7, e+f}
AVAIL_IN[F] = AVAIL_OUT[D] ∩ AVAIL_OUT[E] = {a+b, c+d, e+f}
AVAIL_OUT[F] = {a+b, c+d, e+f}
AVAIL_IN[G] = AVAIL_OUT[B] ∩ AVAIL_OUT[F] = {a+b, c+d}
AVAIL_OUT[G] = {a+b, c+d}
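The iterative algorithm can be sketched as a round-robin solver. This is a sketch under assumptions (not the lecture's code): expressions are plain strings, the A-G CFG is encoded as a predecessor map, and AVAIL_OUT is initialized to DEExpr as the slide prescribes.

```python
def solve_avail(blocks, preds, deexpr, exprkill, entry):
    """Iterate the AVAIL equations to a fixed point.

    blocks: block names; preds: block name -> list of predecessor names.
    """
    avail_in = {b: set() for b in blocks}
    avail_out = {b: set(deexpr[b]) for b in blocks}   # initial guess
    changed = True
    while changed:
        changed = False
        for b in blocks:
            ins = [avail_out[p] for p in preds[b]]
            new_in = set() if b == entry or not ins else set.intersection(*ins)
            new_out = deexpr[b] | (new_in - exprkill[b])
            if new_in != avail_in[b] or new_out != avail_out[b]:
                avail_in[b], avail_out[b] = new_in, new_out
                changed = True
    return avail_in, avail_out

# CFG of the running example: A->B, A->C, C->D, C->E, D->F, E->F, B->G, F->G
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"], "E": ["C"],
         "F": ["D", "E"], "G": ["B", "F"]}
deexpr = {"A": {"a+b"}, "B": {"c+d"}, "C": {"a+b", "c+d"},
          "D": {"b+8", "a+b", "e+f"}, "E": {"a+7", "c+d", "e+f"},
          "F": {"a+b", "c+d", "e+f"}, "G": {"a+b", "c+d"}}
exprkill = {"A": set(), "B": set(), "C": set(), "D": {"e+f"},
            "E": {"e+f"}, "F": set(), "G": set()}
avail_in, avail_out = solve_avail(list(preds), preds, deexpr, exprkill, "A")
```

Iterating to the fixed point reproduces the AVAIL_IN/AVAIL_OUT sets listed above.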

GLOBAL REDUNDANCY ELIMINATION

Redundancy elimination based on AVAIL. After computing AVAIL_IN[B]:
- For each expression e ∈ AVAIL_IN[B] for some block B, assign a unique name name(e) to e
- At a definition site of e where e is not available -- a new evaluation of e -- add a copy assignment: name(e) := e
- At a definition site of e where e is available, replace the computation of e by name(e)

EXAMPLE

For the running example, assign a+b → g1, c+d → g2, e+f → g3 (AVAIL_IN and AVAIL_OUT sets as computed above).
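The rewriting rules can be sketched for a single block. A sketch with assumed encodings (not the lecture's code): statements are (dest, y, z) triples, names maps each expression to its unique temporary (g1, g2 stand in for the slide's names), and expressions killed inside the block are pruned as the walk proceeds.

```python
def rewrite_block(block, avail_in, names):
    """Rewrite one block's statements using AVAIL_IN and name(e) temporaries.

    Returns a statement list mixing (dest, src_name) copies with
    (dest, (y, z)) evaluations, each new evaluation followed by the
    copy name(e) := dest that records the value for later reuse.
    """
    avail = set(avail_in)
    out = []
    for dest, y, z in block:
        e = (y, z)
        if e in avail:
            out.append((dest, names[e]))       # e is available: reuse name(e)
        else:
            out.append((dest, e))              # keep the evaluation
            out.append((names[e], dest))       # copy: name(e) := dest
            avail.add(e)
        avail = {x for x in avail if dest not in x}   # dest kills exprs using it
    return out

# Block C from the example with AVAIL_IN(C) = {a+b}
C = [("q", "a", "b"), ("r", "c", "d")]
names = {("a", "b"): "g1", ("c", "d"): "g2"}
result = rewrite_block(C, {("a", "b")}, names)
# -> [('q', 'g1'), ('r', ('c', 'd')), ('g2', 'r')]
```

That matches the rewritten block C on the next slide: q = g1; r = c + d; g2 = r.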

EXAMPLE

After rewriting, with a+b → g1, c+d → g2, e+f → g3:

A: m = a + b        B: p = c + d        C: q = g1
   g1 = m              g2 = p              r = c + d
   n = g1              r = g2              g2 = r

D: e = b + 8        E: e = a + 7        F: v = g1
   s = g1              t = g2              w = g2
   u = e + f           u = e + f           x = g3
   g3 = u              g3 = u

G: y = g1
   z = g2

COMPARING THE ALGORITHMS

LVN/SVN/DVN: local / superlocal / dominator-based value numbering; GRE: global redundancy elimination.

Which technique catches each redundant evaluation:
A: n = a + b -- LVN              B: r = c + d -- LVN
C: q = a + b -- GRE, SVN
D: s = a + b -- GRE, SVN         E: t = c + d -- GRE, SVN
F: v = a + b, w = c + d -- GRE, DVN;  x = e + f -- GRE only
G: y = a + b -- GRE, DVN;  z = c + d -- GRE only

N.B.: SVN subsumes LVN; DVN subsumes SVN; GRE and the value-numbering techniques are not directly comparable.

ANOTHER GRE EXAMPLE -- PASS 1

Universe U = {a+b, a*c, i+1, c*2, d*d}

B1: c = a + b
    d = a * c
    i = 1
B2: f[i] = a + b
    c = c * 2
    if c > d ...
B3: g = a * c
B4: g = d * d
B5: i = i + 1
    if ... (loops back to B2)

Edges: B1→B2, B2→B3, B2→B4, B3→B5, B4→B5, B5→B2

Blk   VarKill   DEExpr      ExprKill
1     i, d, c   a*c, a+b    a*c, d*d, c*2, i+1
2     c, f      a+b         a*c, c*2
3     g         a*c         {}
4     g         d*d         {}
5     i         {}          i+1

AVAIL_IN(b) = ∩_{x ∈ pred(b)} AVAIL_OUT(x)
AVAIL_OUT(b) = DEExpr(b) ∪ (AVAIL_IN(b) − ExprKill(b))
AVAIL_IN(b0) = Ø

Pass 1:
AVAIL_IN(1) = Ø,      AVAIL_OUT(1) = {a+b, a*c}
AVAIL_IN(2) = Ø,      AVAIL_OUT(2) = {a+b}      (AVAIL_OUT(5) is still Ø)
AVAIL_IN(3) = {a+b},  AVAIL_OUT(3) = {a+b, a*c}
AVAIL_IN(4) = {a+b},  AVAIL_OUT(4) = {a+b, d*d}
AVAIL_IN(5) = {a+b},  AVAIL_OUT(5) = {a+b}

ANOTHER GRE EXAMPLE -- PASS 2

Only AVAIL_IN(2) changes:
AVAIL_IN(2) = AVAIL_OUT(1) ∩ AVAIL_OUT(5) = {a+b, a*c} ∩ {a+b} = {a+b}

ANOTHER GRE EXAMPLE -- PASS 3

No further changes: we have reached a fixed point of the solution.
Final solution: AVAIL_IN(2) = AVAIL_IN(3) = AVAIL_IN(4) = AVAIL_IN(5) = {a+b}, so the evaluation of a + b in B2 is redundant, while a * c in B3 is not.

GRE BASED ON AVAIL

Safety
- Available expressions prove that the replacement value is current
- The transformation must ensure the right name-value mapping

Profitability
- Adds no new evaluations
- Adds some copy operations
  - Copies are inexpensive
  - Copies can shrink or stretch live ranges

PARTIAL REDUNDANCY

B1: z = a; if x > 3 ...
B2: z = x * y; if y < 5 ...     B3: if z < 7 ...
B4: (empty)                     B5: (empty)
B6: b = x * y                   B7: (empty)
B8: (empty)
B9: c = x * y
Exit

Edges: B1→B2, B1→B3, B2→B4, B2→B6, B3→B5, B3→B7, B4→B8, B5→B8, B8→B9, B6→Exit, B7→Exit, B9→Exit

Can we make this better? x * y in B6 is fully redundant (every path to B6 passes through the evaluation in B2), but x * y in B9 is only partially redundant: it is redundant along the path through B2 but not along the path through B3.

PARTIAL REDUNDANCY ELIMINATION

An expression is partially redundant if it is available on some, but not all, paths reaching it.
- Use standard data-flow techniques to figure out where to move the code
- Subsumes classical global common subexpression elimination and code motion of loop invariants
- Used by many optimizing compilers
- Traditionally applied to lexically equivalent expressions; with SSA support, applied to values as well

PARTIAL REDUNDANCY ELIMINATION

PRE may add a block to deal with critical edges. A critical edge is an edge leading from a block with more than one successor to a block with more than one predecessor.

Before: one predecessor of the join block computes a := d + e; the join block computes c := d + e. After: the computing block becomes t := d + e; a := t; the critical edge from the other predecessor is split with a new block containing t := d + e; the join block's computation becomes c := t.

Code duplication can also deal with such redundancy: duplicate the join block B4 (containing c := d + e) along each incoming path. The copy reached after a := d + e (with t := a inserted) uses c := t, while the other copy keeps c := d + e.

Can we find a way to deal with redundancy in general?

LAZY CODE MOTION

Redundancy handled: common subexpressions, loop-invariant expressions, and partially redundant expressions.

Desirable properties:
- All redundant computations of expressions that can be eliminated without code duplication are eliminated
- The optimized program performs no computation that is not in the original program's execution
- Expressions are computed at the latest possible time

LAZY CODE MOTION

Solve four data-flow problems that reveal the limits of code motion:
- AVAIL: available expressions
- ANT: anticipated expressions
- EARLIEST: earliest placement for expressions
- LATER: expressions whose evaluation can be postponed

Compute INSERT and DELETE sets for each basic block based on the data-flow solutions; they define how to move expressions between basic blocks.

LAZY CODE MOTION: EXAMPLE

The same CFG as before (B1: z = a; x > 3 ... B2: z = x * y ... B3: z < 7 ... B6: b = x * y ... B9: c = x * y ... Exit). Can we make this better?

The annotated CFG marks one evaluation point for x * y on each path out of B1: at the evaluation already in B2, and on the path through B3 before the join at B8. Placing computations at these points ensures the conditions above.

LOCAL INFORMATION

For each block b, compute the local sets:
- DEExpr(b): an expression is downward-exposed (locally generated) if it is computed in b and its operands are not modified after its last computation
- UEExpr(b): an expression is upward-exposed if it is computed in b and its operands are not modified before its first computation
- NotKilled(b): an expression is not killed if none of its operands is modified in b

Example block:
    f = b + d
    a = b + c
    d = a + e

DEExpr = {a + e, b + c}
UEExpr = {b + d, b + c}
NotKilled = {b + c}

What do they imply?
- e ∈ DEExpr(b): evaluating e at the end of b produces the same result as evaluating it at its original position in b
- e ∈ UEExpr(b): evaluating e at the entry of b produces the same result as evaluating it at its original position in b
- e ∈ NotKilled(b): evaluating e at either the start or the end of b produces the same result as evaluating it at its original position
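All three local sets can be computed in two linear passes. An illustrative sketch (not the lecture's code), encoding each operation x = y op z as a (x, y, z) triple and applying it to the block above; NotKilled is taken relative to the expressions occurring in the block.

```python
def local_sets(block):
    """Compute DEExpr, UEExpr and NotKilled for one block.

    block: list of (x, y, z) triples meaning x = y op z (operator omitted).
    """
    defined, ueexpr = set(), set()
    for x, y, z in block:                      # forward pass: upward exposed
        if y not in defined and z not in defined:
            ueexpr.add((y, z))                 # operands unmodified before this point
        defined.add(x)
    varkill, deexpr = set(), set()
    for x, y, z in reversed(block):            # backward pass: downward exposed
        varkill.add(x)
        if y not in varkill and z not in varkill:
            deexpr.add((y, z))
    exprs = {(y, z) for _, y, z in block}
    notkilled = {e for e in exprs
                 if e[0] not in varkill and e[1] not in varkill}
    return deexpr, ueexpr, notkilled

# f = b + d; a = b + c; d = a + e
block = [("f", "b", "d"), ("a", "b", "c"), ("d", "a", "e")]
de, ue, nk = local_sets(block)
```

This reproduces the sets shown above: DEExpr = {a+e, b+c}, UEExpr = {b+d, b+c}, NotKilled = {b+c}.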

EXAMPLE

Block   NotKilled   DEExpr   UEExpr
B1      {x*y}       {}       {}
B2      {x*y}       {x*y}    {x*y}
B3      {x*y}       {}       {}
B4      {x*y}       {}       {}
B5      {x*y}       {}       {}
B6      {x*y}       {x*y}    {x*y}
B7      {x*y}       {}       {}
B8      {x*y}       {}       {}
B9      {x*y}       {x*y}    {x*y}
Exit    {x*y}       {}       {}

GLOBAL INFORMATION

Availability
AvailIn(n0) = Ø
AvailIn(b) = ∩_{x ∈ pred(b)} AvailOut(x),  b ≠ n0
AvailOut(b) = DEExpr(b) ∪ (AvailIn(b) ∩ NotKilled(b))

Initialize AvailIn and AvailOut to the set of all expressions for every block except the entry block n0.

Interpreting Avail sets:
- e ∈ AvailOut(b): evaluating e at the end of b produces the same value for e as its most recent evaluation, whether that evaluation is inside b or not
- AvailOut tells the compiler how far forward e can move

EXAMPLE: AVAIL

AvailIn(b) = ∩_{x ∈ pred(b)} AvailOut(x)
AvailOut(b) = DEExpr(b) ∪ (AvailIn(b) ∩ NotKilled(b))

Block   NotKilled   DEExpr   AvailOut
B1      {x*y}       {}       {}
B2      {x*y}       {x*y}    {x*y}
B3      {x*y}       {}       {}
B4      {x*y}       {}       {x*y}
B5      {x*y}       {}       {}
B6      {x*y}       {x*y}    {x*y}
B7      {x*y}       {}       {}
B8      {x*y}       {}       {}
B9      {x*y}       {x*y}    {x*y}
Exit    {x*y}       {}       {}

GLOBAL INFORMATION

Anticipability (a backward problem!)
Expression e is anticipated at a point p if e is certain to be evaluated along every computation path leaving p before any re-computation of e's operands.

AntOut(nf) = Ø
AntOut(b) = ∩_{x ∈ succ(b)} AntIn(x),  b ≠ nf
AntIn(b) = UEExpr(b) ∪ (AntOut(b) ∩ NotKilled(b))

Initialize AntOut to the set of all expressions for every block except the exit block nf.

Interpreting Ant sets:
- e ∈ AntIn(b): evaluating e at the start of b produces the same value for e as evaluating it at its original position (later than the start of b), with no additional overhead
- AntIn tells the compiler how far backward e can move
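The ANT system is the same style of fixed-point computation run backward over successors. A sketch under assumptions (not the lecture's code): the nine-block CFG's edges are reconstructed from the example figures, "x*y" is the only tracked expression, and AntOut starts at the full universe everywhere except the exit, as the slide prescribes.

```python
def solve_ant(blocks, succs, ueexpr, notkilled, universe, exit_block):
    """Iterate the ANT equations (a backward data-flow problem)."""
    ant_in = {b: set(universe) for b in blocks}
    ant_out = {b: set(universe) for b in blocks}
    ant_out[exit_block] = set()
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == exit_block:
                new_out = set()
            else:                               # meet over successors
                new_out = set.intersection(*[ant_in[s] for s in succs[b]])
            new_in = ueexpr[b] | (new_out & notkilled[b])
            if new_in != ant_in[b] or new_out != ant_out[b]:
                ant_in[b], ant_out[b] = new_in, new_out
                changed = True
    return ant_in, ant_out

U = {"x*y"}
succs = {"B1": ["B2", "B3"], "B2": ["B4", "B6"], "B3": ["B5", "B7"],
         "B4": ["B8"], "B5": ["B8"], "B6": ["Exit"], "B7": ["Exit"],
         "B8": ["B9"], "B9": ["Exit"], "Exit": []}
ueexpr = {b: (set(U) if b in ("B2", "B6", "B9") else set()) for b in succs}
notkilled = {b: set(U) for b in succs}
ant_in, ant_out = solve_ant(list(succs), succs, ueexpr, notkilled, U, "Exit")
```

The solver reproduces the AntIn column of the next example: x * y is anticipated at B2, B4, B5, B6, B8 and B9, but not at B1, B3 or B7.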

EXAMPLE: ANT

AntOut(b) = ∩_{x ∈ succ(b)} AntIn(x)
AntIn(b) = UEExpr(b) ∪ (AntOut(b) ∩ NotKilled(b))

Block   NotKilled   UEExpr   AntIn
B1      {x*y}       {}       {}
B2      {x*y}       {x*y}    {x*y}
B3      {x*y}       {}       {}
B4      {x*y}       {}       {x*y}
B5      {x*y}       {}       {x*y}
B6      {x*y}       {x*y}    {x*y}
B7      {x*y}       {}       {}
B8      {x*y}       {}       {x*y}
B9      {x*y}       {x*y}    {x*y}
Exit    {x*y}       {}       {}

EXAMPLE: AVAIL AND ANT

The interesting spots are the points where x * y is anticipated but not available -- these are the candidates for moving or inserting an evaluation.

PLACING EXPRESSIONS

Earliest placement: for an edge <i,j> in the CFG, an expression e is in EARLIEST(i,j) if and only if the computation can legally move to <i,j> and cannot move to any earlier edge.

EARLIEST(i,j) = AntIn(j) ∩ ¬AvailOut(i) ∩ (Killed(i) ∪ ¬AntOut(i))

where Killed(i) is the complement of NotKilled(i) and ¬S denotes the complement of S.

- e ∈ AntIn(j): we can move e to the start of block j without introducing an unnecessary computation
- e ∉ AvailOut(i): no previous computation of e is available at the exit of i; if one were, it would make the computation on <i,j> redundant
- e ∈ Killed(i) ∪ ¬AntOut(i): we cannot move e further upward:
  - e ∈ Killed(i): e cannot be moved to an edge <x,i> with the same value
  - e ∉ AntOut(i): there is another path starting with some edge <i,x> along which e is not evaluated with the same value
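Once the AVAIL and ANT solutions are known, EARLIEST is a pure set computation per edge. A minimal sketch with assumed encodings: per-block sets for a few blocks of the example are supplied by hand, and complements are taken relative to the universe of tracked expressions.

```python
def earliest(i, j, ant_in, avail_out, notkilled, ant_out, universe, entry):
    """EARLIEST(i,j) = AntIn(j) minus AvailOut(i), further restricted to
    Killed(i) union (universe minus AntOut(i)) when i is not the entry."""
    e = ant_in[j] - avail_out[i]              # AntIn(j) ∩ ¬AvailOut(i)
    if i == entry:
        return e                              # nothing moves above the entry node
    killed = universe - notkilled[i]          # Killed(i) = ¬NotKilled(i)
    return e & (killed | (universe - ant_out[i]))

U = {"x*y"}
ant_in = {"B2": set(U), "B4": set(U), "B5": set(U), "B8": set(U)}
ant_out = {"B1": set(), "B2": set(U), "B3": set(), "B5": set(U)}
avail_out = {"B1": set(), "B2": set(U), "B3": set(), "B5": set()}
notkilled = {b: set(U) for b in ("B1", "B2", "B3", "B5")}

e12 = earliest("B1", "B2", ant_in, avail_out, notkilled, ant_out, U, "B1")
e35 = earliest("B3", "B5", ant_in, avail_out, notkilled, ant_out, U, "B1")
e58 = earliest("B5", "B8", ant_in, avail_out, notkilled, ant_out, U, "B1")
e24 = earliest("B2", "B4", ant_in, avail_out, notkilled, ant_out, U, "B1")
```

These four edges reproduce the example's results: x * y is earliest on (1,2) and (3,5) but not on (5,8) or (2,4).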

EARLIEST(i,j) = AntIn(j) ∩ ¬AvailOut(i) ∩ (Killed(i) ∪ ¬AntOut(i))

Block   NotKilled   AvailOut   AntIn    AntOut
B1      {x*y}       {}         {}       {}
B2      {x*y}       {x*y}      {x*y}    {x*y}
B3      {x*y}       {}         {}       {}
B4      {x*y}       {x*y}      {x*y}    {x*y}
B5      {x*y}       {}         {x*y}    {x*y}
B6      {x*y}       {x*y}      {x*y}    {}
B7      {x*y}       {}         {}       {}
B8      {x*y}       {}         {x*y}    {x*y}
B9      {x*y}       {x*y}      {x*y}    {}
Exit    {x*y}       {}         {}       {}

Edge        EARLIEST
(1,2)       {x*y} ∩ {x*y} ∩ ({} ∪ {x*y}) = {x*y}
(1,3)       {} ∩ {x*y} ∩ ({} ∪ {x*y}) = {}
(2,4)       {x*y} ∩ {} ∩ ({} ∪ {}) = {}
(2,6)       {x*y} ∩ {} ∩ ({} ∪ {}) = {}
(3,5)       {x*y} ∩ {x*y} ∩ ({} ∪ {x*y}) = {x*y}
(3,7)       {} ∩ {x*y} ∩ ({} ∪ {x*y}) = {}
(4,8)       {x*y} ∩ {} ∩ ({} ∪ {}) = {}
(5,8)       {x*y} ∩ {x*y} ∩ ({} ∪ {}) = {}
(6,exit)    {} ∩ {} ∩ ({} ∪ {x*y}) = {}
(7,exit)    {} ∩ {x*y} ∩ ({} ∪ {x*y}) = {}
(8,9)       {x*y} ∩ {x*y} ∩ ({} ∪ {}) = {}
(9,exit)    {} ∩ {} ∩ ({} ∪ {x*y}) = {}

x * y is in EARLIEST only on edges (1,2) and (3,5).

POSTPONING EVALUATIONS

We want to delay the evaluation of expressions as long as possible.
Motivation: save register usage.
There is a limit to this delay:
- not past a use of the expression
- not so far that we end up re-computing an expression that is already evaluated

PLACING EXPRESSIONS

Later (than earliest) placement:

An expression e is in LaterIn(k) if the evaluation of e can be moved through the entry of k without losing any benefit: e ∈ LaterIn(k) if and only if every path that reaches k includes an edge <p,q> such that e ∈ EARLIEST(p,q), and the path from q to k neither kills e nor uses e.

LaterIn(j) = ∩_{i ∈ pred(j)} LATER(i,j),  j ≠ n0
LaterIn(n0) = Ø
LATER(i,j) = EARLIEST(i,j) ∪ (LaterIn(i) ∩ ¬UEExpr(i)),  i ∈ pred(j)

An expression e is in LATER(i,j) if the evaluation of e can be postponed to CFG edge <i,j>: e ∈ LATER(i,j) if <i,j> is its earliest placement, or if it can be moved to the entry of i (e ∈ LaterIn(i)) and there is no evaluation (use) of e in block i.

EXAMPLE: LATER

LaterIn(j) = ∩_{i ∈ pred(j)} LATER(i,j),  j ≠ n0
LATER(i,j) = EARLIEST(i,j) ∪ (LaterIn(i) ∩ ¬UEExpr(i)),  i ∈ pred(j)

EARLIEST is {x*y} on edges (1,2) and (3,5) and empty elsewhere.
LATER is {x*y} on edges (1,2), (3,5) and (5,8) and empty elsewhere.

Block   UEExpr   LaterIn
B1      {}       {}
B2      {x*y}    {x*y}
B3      {}       {}
B4      {}       {}
B5      {}       {x*y}
B6      {x*y}    {}
B7      {}       {}
B8      {}       {}
B9      {x*y}    {}
Exit    {}       {}

REWRITING CODE

Insert set for each CFG edge -- the computations that LCM should insert on that edge:
Insert(i,j) = LATER(i,j) ∩ ¬LaterIn(j)
e ∈ Insert(i,j) means an evaluation of e should be added between block i and block j; there are three possible places to add it.

Delete set for each block -- the computations that LCM should delete from that block:
Delete(i) = UEExpr(i) ∩ ¬LaterIn(i),  i ≠ n0
The first computation of e in i is redundant.
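The LATER system plus the INSERT/DELETE sets can be sketched end to end on the example. A sketch under assumptions (not the lecture's code): EARLIEST is taken as {x*y} on edges (B1,B2) and (B3,B5), the CFG edges are reconstructed from the figures, and the acyclic example lets a simple round-robin iteration reach the fixed point.

```python
def lcm_place(blocks, preds, earliest, ueexpr, entry):
    """Solve LaterIn/LATER, then compute INSERT (per edge) and DELETE (per block)."""
    edges = [(i, j) for j in blocks for i in preds[j]]
    later_in = {b: set() for b in blocks}          # LaterIn(n0) = empty
    later = {e: set(earliest[e]) for e in edges}
    changed = True
    while changed:
        changed = False
        for j in blocks:
            if j == entry or not preds[j]:
                continue
            new_in = set.intersection(*[later[(i, j)] for i in preds[j]])
            for i in preds[j]:                     # LATER over incoming edges
                new_later = earliest[(i, j)] | (later_in[i] - ueexpr[i])
                if new_later != later[(i, j)]:
                    later[(i, j)] = new_later
                    changed = True
            if new_in != later_in[j]:
                later_in[j] = new_in
                changed = True
    insert = {e: later[e] - later_in[e[1]] for e in edges}
    delete = {b: ueexpr[b] - later_in[b] for b in blocks if b != entry}
    return insert, delete

U = {"x*y"}
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2"], "B6": ["B2"],
         "B5": ["B3"], "B7": ["B3"], "B8": ["B4", "B5"], "B9": ["B8"],
         "Exit": ["B6", "B7", "B9"]}
edges = [(i, j) for j in preds for i in preds[j]]
earliest = {e: (set(U) if e in [("B1", "B2"), ("B3", "B5")] else set())
            for e in edges}
ueexpr = {b: (set(U) if b in ("B2", "B6", "B9") else set()) for b in preds}
insert, delete = lcm_place(list(preds), preds, earliest, ueexpr, "B1")
```

The result matches the tables: the evaluations of x * y in B6 and B9 are deleted, the one in B2 stays, and a single evaluation is inserted on the edge from B5 to B8.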

EXAMPLE: INSERT & DELETE

Insert(i,j) = LATER(i,j) ∩ ¬LaterIn(j)
Delete(i) = UEExpr(i) ∩ ¬LaterIn(i),  i ≠ n0

Block   UEExpr   LaterIn   Delete
B1      {}       {}        {}
B2      {x*y}    {x*y}     {}
B3      {}       {}        {}
B4      {}       {}        {}
B5      {}       {x*y}     {}
B6      {x*y}    {}        {x*y}
B7      {}       {}        {}
B8      {}       {}        {}
B9      {x*y}    {}        {x*y}
Exit    {}       {}        {}

Insert is {x*y} only on edge (5,8): b = x * y in B6 and c = x * y in B9 are deleted, and a single evaluation of x * y is inserted on the edge from B5 to B8.

REWRITING CODE

Insert set for each CFG edge -- the computations that LCM should insert on that edge:
Insert(i,j) = LATER(i,j) ∩ ¬LaterIn(j)
- If i has only one successor, insert the computations at the end of i
- If j has only one predecessor, insert the computations at the entry of j
- Otherwise, split the edge and insert the computations in a new block between i and j

Delete set for each block -- the computations that LCM should delete from that block:
Delete(i) = UEExpr(i) ∩ ¬LaterIn(i),  i ≠ n0
The first computation of e in i is redundant: remove it.

INSERTING CODE

Evaluation placement for x ∈ INSERT(i,j) -- three cases:
- |succs(i)| = 1: insert at the end of i
- |succs(i)| > 1, but |preds(j)| = 1: insert at the start of j
- |succs(i)| > 1 and |preds(j)| > 1: create a new block on edge <i,j> for x

On the example this yields the INSERT and DELETE sets above: Delete(B6) = Delete(B9) = {x*y}, with the single insertion on edge (5,8).

Control Flow Analysis

Control Flow Analysis COMP 6 Program Analysis and Transformations These slides have been adapted from http://cs.gmu.edu/~white/cs60/slides/cs60--0.ppt by Professor Liz White. How to represent the structure of the program? Based

More information

Lazy Code Motion. Comp 512 Spring 2011

Lazy Code Motion. Comp 512 Spring 2011 Comp 512 Spring 2011 Lazy Code Motion Lazy Code Motion, J. Knoop, O. Ruthing, & B. Steffen, in Proceedings of the ACM SIGPLAN 92 Conference on Programming Language Design and Implementation, June 1992.

More information

Introduction to Optimization. CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama

Introduction to Optimization. CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama Introduction to Optimization CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.

More information

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code Traditional Three-pass Compiler Source Code Front End IR Middle End IR Back End Machine code Errors Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce

More information

Intermediate representation

Intermediate representation Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing HIR semantic analysis HIR intermediate code gen.

More information

Introduction to Optimization Local Value Numbering

Introduction to Optimization Local Value Numbering COMP 506 Rice University Spring 2018 Introduction to Optimization Local Value Numbering source IR IR target code Front End Optimizer Back End code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights

More information

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler Compiler Passes Analysis of input program (front-end) character stream Lexical Analysis Synthesis of output program (back-end) Intermediate Code Generation Optimization Before and after generating machine

More information

Control Flow Analysis & Def-Use. Hwansoo Han

Control Flow Analysis & Def-Use. Hwansoo Han Control Flow Analysis & Def-Use Hwansoo Han Control Flow Graph What is CFG? Represents program structure for internal use of compilers Used in various program analyses Generated from AST or a sequential

More information

Control Flow Analysis

Control Flow Analysis Control Flow Analysis Last time Undergraduate compilers in a day Today Assignment 0 due Control-flow analysis Building basic blocks Building control-flow graphs Loops January 28, 2015 Control Flow Analysis

More information

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code COMP 412 FALL 2017 Local Optimization: Value Numbering The Desert Island Optimization Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon,

More information

Challenges in the back end. CS Compiler Design. Basic blocks. Compiler analysis

Challenges in the back end. CS Compiler Design. Basic blocks. Compiler analysis Challenges in the back end CS3300 - Compiler Design Basic Blocks and CFG V. Krishna Nandivada IIT Madras The input to the backend (What?). The target program instruction set, constraints, relocatable or

More information

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection An Overview of GCC Architecture (source: wikipedia) CS553 Lecture Control-Flow, Dominators, Loop Detection, and SSA Control-Flow Analysis and Loop Detection Last time Lattice-theoretic framework for data-flow

More information

Control-Flow Analysis

Control-Flow Analysis Control-Flow Analysis Dragon book [Ch. 8, Section 8.4; Ch. 9, Section 9.6] Compilers: Principles, Techniques, and Tools, 2 nd ed. by Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jerey D. Ullman on reserve

More information

CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion

CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip

More information

Lecture 9: Loop Invariant Computation and Code Motion

Lecture 9: Loop Invariant Computation and Code Motion Lecture 9: Loop Invariant Computation and Code Motion I. Loop-invariant computation II. III. Algorithm for code motion Partial redundancy elimination ALSU 9.5-9.5.2 Phillip B. Gibbons 15-745: Loop Invariance

More information

Module 14: Approaches to Control Flow Analysis Lecture 27: Algorithm and Interval. The Lecture Contains: Algorithm to Find Dominators.

Module 14: Approaches to Control Flow Analysis Lecture 27: Algorithm and Interval. The Lecture Contains: Algorithm to Find Dominators. The Lecture Contains: Algorithm to Find Dominators Loop Detection Algorithm to Detect Loops Extended Basic Block Pre-Header Loops With Common eaders Reducible Flow Graphs Node Splitting Interval Analysis

More information

CS577 Modern Language Processors. Spring 2018 Lecture Optimization

CS577 Modern Language Processors. Spring 2018 Lecture Optimization CS577 Modern Language Processors Spring 2018 Lecture Optimization 1 GENERATING BETTER CODE What does a conventional compiler do to improve quality of generated code? Eliminate redundant computation Move

More information

CS293S Redundancy Removal. Yufei Ding

CS293S Redundancy Removal. Yufei Ding CS293S Redundancy Removal Yufei Ding Review of Last Class Consideration of optimization Sources of inefficiency Components of optimization Paradigms of optimization Redundancy Elimination Types of intermediate

More information

Goals of Program Optimization (1 of 2)

Goals of Program Optimization (1 of 2) Goals of Program Optimization (1 of 2) Goal: Improve program performance within some constraints Ask Three Key Questions for Every Optimization 1. Is it legal? 2. Is it profitable? 3. Is it compile-time

More information

CS 406/534 Compiler Construction Putting It All Together

CS 406/534 Compiler Construction Putting It All Together CS 406/534 Compiler Construction Putting It All Together Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy

More information

Intermediate Representations. Reading & Topics. Intermediate Representations CS2210

Intermediate Representations. Reading & Topics. Intermediate Representations CS2210 Intermediate Representations CS2210 Lecture 11 Reading & Topics Muchnick: chapter 6 Topics today: Intermediate representations Automatic code generation with pattern matching Optimization Overview Control

More information

Compiler Construction 2009/2010 SSA Static Single Assignment Form

Compiler Construction 2009/2010 SSA Static Single Assignment Form Compiler Construction 2009/2010 SSA Static Single Assignment Form Peter Thiemann March 15, 2010 Outline 1 Static Single-Assignment Form 2 Converting to SSA Form 3 Optimization Algorithms Using SSA 4 Dependencies

More information

Lecture 3 Local Optimizations, Intro to SSA

Lecture 3 Local Optimizations, Intro to SSA Lecture 3 Local Optimizations, Intro to SSA I. Basic blocks & Flow graphs II. Abstraction 1: DAG III. Abstraction 2: Value numbering IV. Intro to SSA ALSU 8.4-8.5, 6.2.4 Phillip B. Gibbons 15-745: Local

More information

Compiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz

Compiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz Compiler Design Fall 2015 Control-Flow Analysis Sample Exercises and Solutions Prof. Pedro C. Diniz USC / Information Sciences Institute 4676 Admiralty Way, Suite 1001 Marina del Rey, California 90292

More information

Control flow and loop detection. TDT4205 Lecture 29

Control flow and loop detection. TDT4205 Lecture 29 1 Control flow and loop detection TDT4205 Lecture 29 2 Where we are We have a handful of different analysis instances None of them are optimizations, in and of themselves The objective now is to Show how

More information

Example (cont.): Control Flow Graph. 2. Interval analysis (recursive) 3. Structural analysis (recursive)

Example (cont.): Control Flow Graph. 2. Interval analysis (recursive) 3. Structural analysis (recursive) DDC86 Compiler Optimizations and Code Generation Control Flow Analysis. Page 1 C.Kessler,IDA,Linköpings Universitet, 2009. Control Flow Analysis [ASU1e 10.4] [ALSU2e 9.6] [Muchnick 7] necessary to enable

More information

Calvin Lin The University of Texas at Austin

Calvin Lin The University of Texas at Austin Loop Invariant Code Motion Last Time Loop invariant code motion Value numbering Today Finish value numbering More reuse optimization Common subession elimination Partial redundancy elimination Next Time

More information

Why Global Dataflow Analysis?

Why Global Dataflow Analysis? Why Global Dataflow Analysis? Answer key questions at compile-time about the flow of values and other program properties over control-flow paths Compiler fundamentals What defs. of x reach a given use

More information

Data Flow Analysis and Computation of SSA

Data Flow Analysis and Computation of SSA Compiler Design 1 Data Flow Analysis and Computation of SSA Compiler Design 2 Definitions A basic block is the longest sequence of three-address codes with the following properties. The control flows to

More information

Lecture 23 CIS 341: COMPILERS

Lecture 23 CIS 341: COMPILERS Lecture 23 CIS 341: COMPILERS Announcements HW6: Analysis & Optimizations Alias analysis, constant propagation, dead code elimination, register allocation Due: Wednesday, April 25 th Zdancewic CIS 341:

More information

Compiler Optimization Techniques

Compiler Optimization Techniques Compiler Optimization Techniques Department of Computer Science, Faculty of ICT February 5, 2014 Introduction Code optimisations usually involve the replacement (transformation) of code from one sequence

More information

Lecture 10: Lazy Code Motion

Lecture 10: Lazy Code Motion Lecture : Lazy Code Motion I. Mathematical concept: a cut set II. Lazy Code Motion Algorithm Pass : Anticipated Expressions Pass 2: (Will be) Available Expressions Pass 3: Postponable Expressions Pass

More information

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013 A Bad Name Optimization is the process by which we turn a program into a better one, for some definition of better. CS 2210: Optimization This is impossible in the general case. For instance, a fully optimizing

More information

Compiler Construction 2010/2011 Loop Optimizations

Compiler Construction 2010/2011 Loop Optimizations Compiler Construction 2010/2011 Loop Optimizations Peter Thiemann January 25, 2011 Outline 1 Loop Optimizations 2 Dominators 3 Loop-Invariant Computations 4 Induction Variables 5 Array-Bounds Checks 6

More information

Loop Optimizations. Outline. Loop Invariant Code Motion. Induction Variables. Loop Invariant Code Motion. Loop Invariant Code Motion

Loop Optimizations. Outline. Loop Invariant Code Motion. Induction Variables. Loop Invariant Code Motion. Loop Invariant Code Motion Outline Loop Optimizations Induction Variables Recognition Induction Variables Combination of Analyses Copyright 2010, Pedro C Diniz, all rights reserved Students enrolled in the Compilers class at the

More information

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 Compiler Optimizations Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 2 Local vs. Global Optimizations Local: inside a single basic block Simple forms of common subexpression elimination, dead code elimination,

More information

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto Data Structures and Algorithms in Compiler Optimization Comp314 Lecture Dave Peixotto 1 What is a compiler Compilers translate between program representations Interpreters evaluate their input to produce

More information

Lecture 8: Induction Variable Optimizations

Lecture 8: Induction Variable Optimizations Lecture 8: Induction Variable Optimizations I. Finding loops II. III. Overview of Induction Variable Optimizations Further details ALSU 9.1.8, 9.6, 9.8.1 Phillip B. Gibbons 15-745: Induction Variables

More information

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts Compiler Structure Source Code Abstract Syntax Tree Control Flow Graph Object Code CMSC 631 Program Analysis and Understanding Fall 2003 Data Flow Analysis Source code parsed to produce AST AST transformed

More information

Computer Science 160 Translation of Programming Languages

Computer Science 160 Translation of Programming Languages Computer Science 160 Translation of Programming Languages Instructor: Christopher Kruegel Code Optimization Code Optimization What should we optimize? improve running time decrease space requirements decrease

More information

Compiler Construction 2016/2017 Loop Optimizations

Compiler Construction 2016/2017 Loop Optimizations Compiler Construction 2016/2017 Loop Optimizations Peter Thiemann January 16, 2017 Outline 1 Loops 2 Dominators 3 Loop-Invariant Computations 4 Induction Variables 5 Array-Bounds Checks 6 Loop Unrolling

More information

Flow Graph Theory. Depth-First Ordering Efficiency of Iterative Algorithms Reducible Flow Graphs

Flow Graph Theory. Depth-First Ordering Efficiency of Iterative Algorithms Reducible Flow Graphs Flow Graph Theory Depth-First Ordering Efficiency of Iterative Algorithms Reducible Flow Graphs 1 Roadmap Proper ordering of nodes of a flow graph speeds up the iterative algorithms: depth-first ordering.

More information

Tour of common optimizations

Tour of common optimizations Tour of common optimizations Simple example foo(z) { x := 3 + 6; y := x 5 return z * y } Simple example foo(z) { x := 3 + 6; y := x 5; return z * y } x:=9; Applying Constant Folding Simple example foo(z)

More information

Statements or Basic Blocks (Maximal sequence of code with branching only allowed at end) Possible transfer of control

Statements or Basic Blocks (Maximal sequence of code with branching only allowed at end) Possible transfer of control Control Flow Graphs Nodes Edges Statements or asic locks (Maximal sequence of code with branching only allowed at end) Possible transfer of control Example: if P then S1 else S2 S3 S1 P S3 S2 CFG P predecessor

More information

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 Compiler Optimizations Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 2 Local vs. Global Optimizations Local: inside a single basic block Simple forms of common subexpression elimination, dead code elimination,

More information

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions Agenda CS738: Advanced Compiler Optimizations Data Flow Analysis Amey Karkare karkare@cse.iitk.ac.in http://www.cse.iitk.ac.in/~karkare/cs738 Department of CSE, IIT Kanpur Static analysis and compile-time

More information

Languages and Compiler Design II IR Code Optimization

Languages and Compiler Design II IR Code Optimization Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010 PSU CS322 HM 1 Agenda IR Optimization

More information

Topic 9: Control Flow

Topic 9: Control Flow Topic 9: Control Flow COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 The Front End The Back End (Intel-HP codename for Itanium ; uses compiler to identify parallelism)

More information

Principles of Compiler Design

Principles of Compiler Design Principles of Compiler Design Intermediate Representation Compiler Lexical Analysis Syntax Analysis Semantic Analysis Source Program Token stream Abstract Syntax tree Unambiguous Program representation

More information

A main goal is to achieve a better performance. Code Optimization. Chapter 9

A main goal is to achieve a better performance. Code Optimization. Chapter 9 1 A main goal is to achieve a better performance Code Optimization Chapter 9 2 A main goal is to achieve a better performance source Code Front End Intermediate Code Code Gen target Code user Machineindependent

More information

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators. Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators Comp 412 COMP 412 FALL 2016 source code IR Front End Optimizer Back

More information

Control Flow Graphs. (From Matthew S. Hetch. Flow Analysis of Computer Programs. North Holland ).

Control Flow Graphs. (From Matthew S. Hetch. Flow Analysis of Computer Programs. North Holland ). Control Flow Graphs (From Matthew S. Hetch. Flow Analysis of Computer Programs. North Holland. 1977. ). 38 Control Flow Graphs We will now discuss flow graphs. These are used for global optimizations (as

More information

CODE GENERATION Monday, May 31, 2010

CODE GENERATION Monday, May 31, 2010 CODE GENERATION memory management returned value actual parameters commonly placed in registers (when possible) optional control link optional access link saved machine status local data temporaries A.R.

More information

Topic I (d): Static Single Assignment Form (SSA)

Topic I (d): Static Single Assignment Form (SSA) Topic I (d): Static Single Assignment Form (SSA) 621-10F/Topic-1d-SSA 1 Reading List Slides: Topic Ix Other readings as assigned in class 621-10F/Topic-1d-SSA 2 ABET Outcome Ability to apply knowledge

More information

EECS 583 Class 2 Control Flow Analysis LLVM Introduction

EECS 583 Class 2 Control Flow Analysis LLVM Introduction EECS 583 Class 2 Control Flow Analysis LLVM Introduction University of Michigan September 8, 2014 - 1 - Announcements & Reading Material HW 1 out today, due Friday, Sept 22 (2 wks)» This homework is not

More information

Control Flow Analysis. Reading & Topics. Optimization Overview CS2210. Muchnick: chapter 7

Control Flow Analysis. Reading & Topics. Optimization Overview CS2210. Muchnick: chapter 7 Control Flow Analysis CS2210 Lecture 11 Reading & Topics Muchnick: chapter 7 Optimization Overview Control Flow Analysis Maybe start data flow analysis Optimization Overview Two step process Analyze program

More information

Overview Of Op*miza*on, 2

Overview Of Op*miza*on, 2 OMP 512 Rice University Spring 2015 Overview Of Op*miza*on, 2 Superlocal Value Numbering, SE opyright 2015, Keith D. ooper & Linda Torczon, all rights reserved. Students enrolled in omp 512 at Rice University

More information

CS 432 Fall Mike Lam, Professor. Data-Flow Analysis

CS 432 Fall Mike Lam, Professor. Data-Flow Analysis CS 432 Fall 2018 Mike Lam, Professor Data-Flow Analysis Compilers "Back end" Source code Tokens Syntax tree Machine code char data[20]; int main() { float x = 42.0; return 7; } 7f 45 4c 46 01 01 01 00

More information

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compiler Design Fall 2015 Data-Flow Analysis Sample Exercises and Solutions Prof. Pedro C. Diniz USC / Information Sciences Institute 4676 Admiralty Way, Suite 1001 Marina del Rey, California 90292 pedro@isi.edu

More information

Data Flow Analysis. CSCE Lecture 9-02/15/2018

Data Flow Analysis. CSCE Lecture 9-02/15/2018 Data Flow Analysis CSCE 747 - Lecture 9-02/15/2018 Data Flow Another view - program statements compute and transform data So, look at how that data is passed through the program. Reason about data dependence

More information

Example. Example. Constant Propagation in SSA

Example. Example. Constant Propagation in SSA Example x=1 a=x x=2 b=x x=1 x==10 c=x x++ print x Original CFG x 1 =1 a=x 1 x 2 =2 x 3 =φ (x 1,x 2 ) b=x 3 x 4 =1 x 5 = φ(x 4,x 6 ) x 5 ==10 c=x 5 x 6 =x 5 +1 print x 5 CFG in SSA Form In SSA form computing

More information

Lecture 2: Control Flow Analysis

Lecture 2: Control Flow Analysis COM S/CPRE 513 x: Foundations and Applications of Program Analysis Spring 2018 Instructor: Wei Le Lecture 2: Control Flow Analysis 2.1 What is Control Flow Analysis Given program source code, control flow

More information

Compiler Optimisation

Compiler Optimisation Compiler Optimisation 4 Dataflow Analysis Hugh Leather IF 1.18a hleather@inf.ed.ac.uk Institute for Computing Systems Architecture School of Informatics University of Edinburgh 2018 Introduction This lecture:

More information

Static Single Assignment (SSA) Form

Static Single Assignment (SSA) Form Static Single Assignment (SSA) Form A sparse program representation for data-flow. CSL862 SSA form 1 Computing Static Single Assignment (SSA) Form Overview: What is SSA? Advantages of SSA over use-def

More information

ELEC 876: Software Reengineering

ELEC 876: Software Reengineering ELEC 876: Software Reengineering () Dr. Ying Zou Department of Electrical & Computer Engineering Queen s University Compiler and Interpreter Compiler Source Code Object Compile Execute Code Results data

More information

Lecture 4. More on Data Flow: Constant Propagation, Speed, Loops

Lecture 4. More on Data Flow: Constant Propagation, Speed, Loops Lecture 4 More on Data Flow: Constant Propagation, Speed, Loops I. Constant Propagation II. Efficiency of Data Flow Analysis III. Algorithm to find loops Reading: Chapter 9.4, 9.6 CS243: Constants, Speed,

More information

CS553 Lecture Profile-Guided Optimizations 3

CS553 Lecture Profile-Guided Optimizations 3 Profile-Guided Optimizations Last time Instruction scheduling Register renaming alanced Load Scheduling Loop unrolling Software pipelining Today More instruction scheduling Profiling Trace scheduling CS553

More information

CS5363 Final Review. cs5363 1

CS5363 Final Review. cs5363 1 CS5363 Final Review cs5363 1 Programming language implementation Programming languages Tools for describing data and algorithms Instructing machines what to do Communicate between computers and programmers

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

Contents of Lecture 2

Contents of Lecture 2 Contents of Lecture Dominance relation An inefficient and simple algorithm to compute dominance Immediate dominators Dominator tree Jonas Skeppstedt (js@cs.lth.se) Lecture / Definition of Dominance Consider

More information

What If. Static Single Assignment Form. Phi Functions. What does φ(v i,v j ) Mean?

What If. Static Single Assignment Form. Phi Functions. What does φ(v i,v j ) Mean? Static Single Assignment Form What If Many of the complexities of optimization and code generation arise from the fact that a given variable may be assigned to in many different places. Thus reaching definition

More information

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Traditional Three-pass Compiler

More information

Loop Invariant Code Motion. Background: ud- and du-chains. Upward Exposed Uses. Identifying Loop Invariant Code. Last Time Control flow analysis

Loop Invariant Code Motion. Background: ud- and du-chains. Upward Exposed Uses. Identifying Loop Invariant Code. Last Time Control flow analysis Loop Invariant Code Motion Loop Invariant Code Motion Background: ud- and du-chains Last Time Control flow analysis Today Loop invariant code motion ud-chains A ud-chain connects a use of a variable to

More information

CONTROL FLOW ANALYSIS. The slides adapted from Vikram Adve

CONTROL FLOW ANALYSIS. The slides adapted from Vikram Adve CONTROL FLOW ANALYSIS The slides adapted from Vikram Adve Flow Graphs Flow Graph: A triple G=(N,A,s), where (N,A) is a (finite) directed graph, s N is a designated initial node, and there is a path from

More information

CSE P 501 Compilers. SSA Hal Perkins Spring UW CSE P 501 Spring 2018 V-1

CSE P 501 Compilers. SSA Hal Perkins Spring UW CSE P 501 Spring 2018 V-1 CSE P 0 Compilers SSA Hal Perkins Spring 0 UW CSE P 0 Spring 0 V- Agenda Overview of SSA IR Constructing SSA graphs Sample of SSA-based optimizations Converting back from SSA form Sources: Appel ch., also

More information

MIT Introduction to Program Analysis and Optimization. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT Introduction to Program Analysis and Optimization. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology MIT 6.035 Introduction to Program Analysis and Optimization Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Program Analysis Compile-time reasoning about run-time behavior

More information

Lecture 5. Partial Redundancy Elimination

Lecture 5. Partial Redundancy Elimination Lecture 5 Partial Redundancy Elimination I. Forms of redundancy global common subexpression elimination loop invariant code motion partial redundancy II. Lazy Code Motion Algorithm Mathematical concept:

More information

Partial Redundancy Analysis

Partial Redundancy Analysis Partial Redundancy Analysis Partial Redundancy Analysis is a boolean-valued data flow analysis that generalizes available expression analysis. Ordinary available expression analysis tells us if an expression

More information

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210 Redundant Computation Elimination Optimizations CS2210 Lecture 20 Redundancy Elimination Several categories: Value Numbering local & global Common subexpression elimination (CSE) local & global Loop-invariant

More information

Data Flow Analysis. Program Analysis

Data Flow Analysis. Program Analysis Program Analysis https://www.cse.iitb.ac.in/~karkare/cs618/ Data Flow Analysis Amey Karkare Dept of Computer Science and Engg IIT Kanpur Visiting IIT Bombay karkare@cse.iitk.ac.in karkare@cse.iitb.ac.in

More information

Review; questions Basic Analyses (2) Assign (see Schedule for links)

Review; questions Basic Analyses (2) Assign (see Schedule for links) Class 2 Review; questions Basic Analyses (2) Assign (see Schedule for links) Representation and Analysis of Software (Sections -5) Additional reading: depth-first presentation, data-flow analysis, etc.

More information

Introduction to Machine-Independent Optimizations - 4

Introduction to Machine-Independent Optimizations - 4 Introduction to Machine-Independent Optimizations - 4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler Design Outline of

More information

Control flow graphs and loop optimizations. Thursday, October 24, 13

Control flow graphs and loop optimizations. Thursday, October 24, 13 Control flow graphs and loop optimizations Agenda Building control flow graphs Low level loop optimizations Code motion Strength reduction Unrolling High level loop optimizations Loop fusion Loop interchange

More information

Anticipation-based partial redundancy elimination for static single assignment form

Anticipation-based partial redundancy elimination for static single assignment form SOFTWARE PRACTICE AND EXPERIENCE Softw. Pract. Exper. 00; : 9 Published online 5 August 00 in Wiley InterScience (www.interscience.wiley.com). DOI: 0.00/spe.68 Anticipation-based partial redundancy elimination

More information

Computing Static Single Assignment (SSA) Form. Control Flow Analysis. Overview. Last Time. Constant propagation Dominator relationships

Computing Static Single Assignment (SSA) Form. Control Flow Analysis. Overview. Last Time. Constant propagation Dominator relationships Control Flow Analysis Last Time Constant propagation Dominator relationships Today Static Single Assignment (SSA) - a sparse program representation for data flow Dominance Frontier Computing Static Single

More information

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space Code optimization Have we achieved optimal code? Impossible to answer! We make improvements to the code Aim: faster code and/or less space Types of optimization machine-independent In source code or internal

More information

When we eliminated global CSE's we introduced copy statements of the form A:=B. There are many

When we eliminated global CSE's we introduced copy statements of the form A:=B. There are many Copy Propagation When we eliminated global CSE's we introduced copy statements of the form A:=B. There are many other sources for copy statements, such as the original source code and the intermediate

More information

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano Building a Runnable Program and Code Improvement Dario Marasco, Greg Klepic, Tess DiStefano Building a Runnable Program Review Front end code Source code analysis Syntax tree Back end code Target code

More information

Code Placement, Code Motion

Code Placement, Code Motion Code Placement, Code Motion Compiler Construction Course Winter Term 2009/2010 saarland university computer science 2 Why? Loop-invariant code motion Global value numbering destroys block membership Remove

More information

Induction Variable Identification (cont)

Induction Variable Identification (cont) Loop Invariant Code Motion Last Time Uses of SSA: reaching constants, dead-code elimination, induction variable identification Today Finish up induction variable identification Loop invariant code motion

More information

CSE Section 10 - Dataflow and Single Static Assignment - Solutions

CSE Section 10 - Dataflow and Single Static Assignment - Solutions CSE 401 - Section 10 - Dataflow and Single Static Assignment - Solutions 1. Dataflow Review For each of the following optimizations, list the dataflow analysis that would be most directly applicable. You

More information

Calvin Lin The University of Texas at Austin

Calvin Lin The University of Texas at Austin Loop Invariant Code Motion Last Time SSA Today Loop invariant code motion Reuse optimization Next Time More reuse optimization Common subexpression elimination Partial redundancy elimination February 23,

More information

Combining Analyses, Combining Optimizations - Summary

Combining Analyses, Combining Optimizations - Summary Combining Analyses, Combining Optimizations - Summary 1. INTRODUCTION Cliff Click s thesis Combining Analysis, Combining Optimizations [Click and Cooper 1995] uses a structurally different intermediate

More information

An Implementation of Lazy Code Motion for Machine SUIF

An Implementation of Lazy Code Motion for Machine SUIF An Implementation of Lazy Code Motion for Machine SUIF Laurent Rolaz Swiss Federal Institute of Technology Processor Architecture Laboratory Lausanne 28th March 2003 1 Introduction Optimizing compiler

More information

Global Register Allocation via Graph Coloring

Global Register Allocation via Graph Coloring Global Register Allocation via Graph Coloring Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission

More information

Code generation for modern processors

Code generation for modern processors Code generation for modern processors Definitions (1 of 2) What are the dominant performance issues for a superscalar RISC processor? Refs: AS&U, Chapter 9 + Notes. Optional: Muchnick, 16.3 & 17.1 Instruction

More information

Code generation for modern processors

Code generation for modern processors Code generation for modern processors What are the dominant performance issues for a superscalar RISC processor? Refs: AS&U, Chapter 9 + Notes. Optional: Muchnick, 16.3 & 17.1 Strategy il il il il asm

More information

CS202 Compiler Construction

CS202 Compiler Construction S202 ompiler onstruction pril 15, 2003 S 202-32 1 ssignment 11 (last programming assignment) Loop optimizations S 202-32 today 2 ssignment 11 Final (necessary) step in compilation! Implement register allocation

More information

4/1/15 LLVM AND SSA. Low-Level Virtual Machine (LLVM) LLVM Compiler Infrastructure. LL: A Subset of LLVM. Basic Blocks

4/1/15 LLVM AND SSA. Low-Level Virtual Machine (LLVM) LLVM Compiler Infrastructure. LL: A Subset of LLVM. Basic Blocks 4//5 Low-Level Virtual Machine (LLVM) LLVM AND SSA Slides adapted from those prepared by Steve Zdancewic at Penn Open-Source Compiler Infrastructure see llvm.org for full documntation Created by Chris

More information

Compiler Optimization and Code Generation

Compiler Optimization and Code Generation Compiler Optimization and Code Generation Professor: Sc.D., Professor Vazgen Melikyan 1 Course Overview Introduction: Overview of Optimizations 1 lecture Intermediate-Code Generation 2 lectures Machine-Independent

More information