

Lecture 37: Global Optimization
[Adapted from notes by R. Bodik and G. Necula]
Last modified: Wed Apr 20 22:55:29 2011. CS164: Lecture #37

Topics

- Global optimization refers to program optimizations that encompass multiple basic blocks in a function.
- (I have used the term galactic optimization to refer to going beyond function boundaries, but it hasn't caught on; we call it just interprocedural optimization.)
- Since we can't use the usual assumptions about basic blocks, global optimization requires global flow analysis to see where values can come from and get used.
- The overall question is: When can local optimizations (from the last lecture) be applied across multiple basic blocks?

A Simple Example: Copy Propagation

[Figure: control-flow graph. The top block (X := 3; B > 0) branches to the blocks Y := Z + W and Y := 0, which both lead to the bottom block A := 2 * X, where the use of X is shown replaced by 3. A variant adds X := 4 to one of the middle blocks.]

- Without other assignments to X, it is valid to treat the red parts as if they were in the same basic block.
- But as soon as one other block on the path to the bottom block assigns to X, we can no longer do so.
- It is correct to apply copy propagation to a variable x from an assignment statement A: x := ... to a given use of x in statement B only if the last assignment to x on every path to B is A.

Issues

- This correctness condition is not trivial to check.
- "All paths" includes paths around loops and through branches of conditionals.
- Checking the condition requires global analysis: an analysis of the entire control-flow graph for one method body.
- This is typical for optimizations that depend on some property P at a particular point in program execution.
- Indeed, property P is typically undecidable, so program optimization is all about making conservative (but not cowardly) approximations of P.
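The copy-propagation condition above can be checked directly on a small control-flow graph by walking backward from the use. The following is only a sketch under assumptions that are not in the slides: each statement is a node, `preds` maps a node to its predecessors, `assigns_x` is the set of nodes that assign to x, and the names `last_assignments`, `can_propagate`, and `s1` to `s6` are hypothetical.

```python
def last_assignments(preds, assigns_x, use, entry):
    """Return the assignments to x that are last on some path to `use`.

    None in the result means some path from `entry` reaches `use`
    without assigning x at all.
    """
    found, seen = set(), set()
    work = list(preds.get(use, []))
    while work:
        n = work.pop()
        if n in seen:
            continue
        seen.add(n)
        if n in assigns_x:
            found.add(n)          # the search along this path stops here
            continue
        if n == entry:
            found.add(None)       # reached the entry, x never assigned
        work.extend(preds.get(n, []))
    return found


def can_propagate(preds, assigns_x, a, b, entry):
    """True only if `a` is the last assignment to x on every path to `b`."""
    return last_assignments(preds, assigns_x, b, entry) == {a}


# Hypothetical encoding of the example: s1 is X := 3, s2 is B > 0,
# s3 is Y := Z + W, s4 is Y := 0, s5 is A := 2 * X.
preds = {"s1": [], "s2": ["s1"], "s3": ["s2"], "s4": ["s2"], "s5": ["s3", "s4"]}
ok = can_propagate(preds, {"s1"}, "s1", "s5", "s1")

# Variant: s6 is X := 4, placed after s4 on one branch.
preds_variant = dict(preds, s6=["s4"], s5=["s3", "s6"])
ok_variant = can_propagate(preds_variant, {"s1", "s6"}, "s1", "s5", "s1")
print(ok, ok_variant)
```

In the first graph the only assignment reaching the bottom block is `s1`, so propagation is allowed; in the variant both `s1` and `s6` reach it, so it is not. A production compiler would normally get this information from a dataflow analysis rather than a per-use graph search.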

Undecidability of Program Properties

- Rice's theorem: Most interesting dynamic properties of a program are undecidable. E.g.,
  - Does the program halt on all (some) inputs? (Halting Problem)
  - Is the result of a function F always positive? Consider

        def F(x):
            H(x)
            return 1

    The result is positive iff H halts.
- Syntactic properties are typically decidable (e.g., "How many occurrences of x are there?").
- The theorem does not apply in the absence of loops.

Conservative Program Analyses

- If a certain optimization requires P to be true, then
  - If we know that P is definitely true, we can apply the optimization.
  - If we don't know whether P is true, we simply don't do the optimization. Since optimizations are not supposed to change the meaning of a program, this is safe.
- In other words, in analyzing a program for properties like P, it is always correct (albeit non-optimal) to say "don't know."
- The trick is to say it as seldom as possible.
- Global dataflow analysis is a standard technique for solving problems with these characteristics.

Example: Global Constant Propagation

- Global constant propagation is just the restriction of copy propagation to constants.
- In this example, we'll consider doing it for a single variable (X).
- At every program point (i.e., before or after any instruction), we associate one of the following values with X:

    Value                  Interpretation
    # (aka bottom)         No value has reached here (yet).
    c (for c a constant)   X definitely has the value c.
    * (aka top)            Don't know what, if any, constant value X has.

Example of Result of Constant Propagation

[Figure: control-flow graph annotated with the value of X at each program point, including X = 4 after the assignment X := 4.]

Using Analysis Results

- Given global constant information, it is easy to perform the optimization:
  - If the point immediately before a statement using x tells us that x = c, then replace x with c.
  - Otherwise, leave it alone (the conservative option).
- But how do we compute these properties x = ...?

Transfer Functions

- Basic Idea: Express the analysis of a complicated program as a combination of simple rules relating the change in information between adjacent statements.
- That is, we "push" or transfer information from one statement to the next.
- For each statement s, we end up with information about the value of x immediately before and after s:

    Cin(X, s) = value of x before s
    Cout(X, s) = value of x after s

- Here, the values of x we use come from an abstract domain, containing the values we care about (#, *, k): values computed statically by our analysis.
- For the constant propagation problem, we'll compute Cout from Cin, and we'll get Cin from the Couts of predecessor statements, Cout(X, p1), ..., Cout(X, pn).

Constant Propagation: Rule 1

[Figure: statement s with predecessors p1, p2, p3, ..., pn.]

If Cout(X, pi) = * for some i, then Cin(X, s) = *.

Constant Propagation: Rule 2

[Figure: statement s with predecessors p1, ..., pn, where one predecessor gives X = c and another gives X = d.]

If Cout(X, pi) = c and Cout(X, pj) = d with constants c ≠ d, then Cin(X, s) = *.

Constant Propagation: Rule 3

[Figure: statement s with predecessors p1, ..., pn, where two predecessors give X = c.]

If Cout(X, pi) = c for some i and Cout(X, pj) = c or Cout(X, pj) = # for all j, then Cin(X, s) = c.

Constant Propagation: Rule 4

[Figure: statement s with predecessors p1, ..., pn.]

If Cout(X, pj) = # for all j, then Cin(X, s) = #.

Constant Propagation: Computing Cout

- Rules 1-4 relate the out of one statement to the in of the successor statements, thus propagating information forward across CFG edges.
- Now we need local rules relating the in and out of a single statement to propagate information across statements.

Constant Propagation: Rule 5

Cout(X, s) = # if Cin(X, s) = #.

The value # means "so far, no value of X gets here, because we don't (yet) know that this statement ever gets executed."

Constant Propagation: Rule 6

[Figure: statement X := c, with X = c on the edge leaving it.]

Cout(X, X := c) = c if c is a constant and ? (the value of X before the statement) is not #.

Constant Propagation: Rule 7

[Figure: statement X := f(...).]

Cout(X, X := f(...)) = * for any function call, if ? is not #.

Constant Propagation: Rule 8

[Figure: statement Y := ..., with X = α on the edges entering and leaving it.]

Cout(X, Y := ...) = Cin(X, Y := ...) if X and Y are different variables.

Propagation Algorithm

To use these rules, we employ a standard technique: iteration to a fixed point:

- Mark all points in the program with current approximations of the variable(s) of interest (X in our example).
- Set the initial approximations to * for the program entry point and # everywhere else.
- Repeatedly apply rules 1-8 every place they are applicable until nothing changes, that is, until the program is at a fixed point with respect to all the transfer rules.
- We can be clever about this, keeping a list of all nodes any of whose predecessors' Cout values have changed since the last rule application.
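The propagation algorithm can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's implementation: the encodings `"#"` and `"*"`, the statement kinds `"const"`, `"call"`, and `"other"`, and all function and node names are hypothetical, and the entry statement is assumed to have no predecessors. The loop simply revisits every statement until nothing changes, without the worklist refinement.

```python
BOT, TOP = "#", "*"   # assumed encodings of bottom and top; constants are ints


def merge(values):
    """Rules 1-4: combine the Cout values of all predecessors into Cin."""
    result = BOT
    for v in values:
        if v == BOT:
            continue                  # contributes nothing (rules 3 and 4)
        if v == TOP or (result != BOT and result != v):
            return TOP                # rule 1, or two different constants (rule 2)
        result = v                    # a single agreed constant so far (rule 3)
    return result


def transfer(stmt, cin):
    """Rules 5-8: compute Cout of one statement from its Cin."""
    if cin == BOT:
        return BOT                    # rule 5
    if stmt[0] == "const":
        return stmt[1]                # rule 6: X := c
    if stmt[0] == "call":
        return TOP                    # rule 7: X := f(...)
    return cin                        # rule 8: the statement does not assign X


def propagate(stmts, preds, entry):
    """Iterate to a fixed point; returns the (cin, cout) maps for X."""
    cin = {s: BOT for s in stmts}
    cout = {s: BOT for s in stmts}
    changed = True
    while changed:
        changed = False
        for s in stmts:
            new_in = TOP if s == entry else merge(cout[p] for p in preds[s])
            new_out = transfer(stmts[s], new_in)
            if (new_in, new_out) != (cin[s], cout[s]):
                cin[s], cout[s] = new_in, new_out
                changed = True
    return cin, cout


# Hypothetical encoding: s1 is X := 3, s2 is B > 0, s3 is Y := Z + W,
# s4 is Y := 0, s5 is A := 2 * X.
stmts = {"s1": ("const", 3), "s2": ("other",), "s3": ("other",),
         "s4": ("other",), "s5": ("other",)}
preds = {"s1": [], "s2": ["s1"], "s3": ["s2"], "s4": ["s2"], "s5": ["s3", "s4"]}
cin, cout = propagate(stmts, preds, "s1")

# Variant: s6 is X := 4, placed after s4 on one branch.
stmts2 = dict(stmts, s6=("const", 4))
preds2 = dict(preds, s6=["s4"], s5=["s3", "s6"])
cin2, cout2 = propagate(stmts2, preds2, "s1")
print(cin["s5"], cin2["s5"])
```

In the first graph `cin["s5"]` is 3, so the use of X in the last statement may be replaced. In the variant, the value at `s5` is first tentatively 3 and then rises to `*` once the second assignment is seen, which shows why # is needed as a starting value.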

An Example of the Algorithm

[Figure: control-flow graph containing the test A < B, annotated with the computed value of X at each program point.]

So we can replace X with 3 in the bottom block.

Another Example of the Propagation Algorithm

[Figure: control-flow graph containing the test A < B and the assignment X := 4, annotated with the value 4 after the assignment and * at several other points.]

Here, we cannot replace X in two of the basic blocks.

A Third Example

[Figure: control-flow graph containing X := 4 and the test A < B, annotated with the values 4 and * at various points.]

Likewise, we cannot replace X.

Comments

- The examples used a depth-first approach to considering possible places to apply the rules, starting from the entry point.
- In fact, the order in which one looks at statements is irrelevant. We could have changed the Cout values after the assignments to X first, for example.
- The # value is necessary to avoid deciding on a final value too soon. In effect, it allows us to tentatively propagate constant values through before finding out what happens in paths we haven't looked at yet.

Ordering the Abstract Domain

We can simplify the presentation of the analysis by ordering the values # < c < *. Or pictorially, with lower meaning less than,

                 *
    ...  -1   0   1   2  ...
                 #

...a mathematical structure known as a lattice. With this, our rule for computing Cin is simply a least upper bound:

    Cin(x, s) = lub { Cout(x, p) such that p is a predecessor of s }.

Termination

- Simply saying "repeat until nothing changes" doesn't guarantee that eventually nothing changes.
- But the use of lub explains why the algorithm terminates:
  - Values start as # and only increase.
  - By the structure of the lattice, therefore, each value can only change twice.
- Thus the algorithm is linear in program size. The number of steps
  = 2 × Number of Cin and Cout values computed
  = 4 × Number of program statements.

Liveness Analysis

Once constants have been globally propagated, we would like to eliminate dead code.

[Figure: control-flow graph containing the test A < B.]

After constant propagation, X := 3 is dead code (assuming this is the entire CFG).

Terminology: Live and Dead

- In the program

      X := 3; /*(1)*/ X := 4; /*(2)*/ Y := X /*(3)*/

  the variable X is dead (never used) at point (1), live at point (2), and may or may not be live at point (3), depending on the rest of the program.
- More generally, a variable x is live at statement s if
  - There exists a statement s' that uses x;
  - There is a path from s to s'; and
  - That path has no intervening assignment to x.
- A statement x := ... is dead code (and may be deleted) if x is dead after the assignment.
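The three-part definition of liveness can be tested literally with a forward search from a program point. The sketch below assumes a hypothetical encoding that is not in the slides: `succs` maps each statement to its successors, `uses_x` and `assigns_x` are sets of statements, and `is_live_after` asks whether x is live at the point just after a statement. A statement that both uses and assigns x (such as X := X * X) counts as a use, since it reads the old value first.

```python
def is_live_after(succs, uses_x, assigns_x, s):
    """True if some path leaving s reaches a use of x before any assignment to x."""
    seen = set()
    work = list(succs.get(s, []))
    while work:
        n = work.pop()
        if n in seen:
            continue
        seen.add(n)
        if n in uses_x:
            return True           # a use is reached with no intervening assignment
        if n in assigns_x:
            continue              # x is overwritten here; this path is cut off
        work.extend(succs.get(n, []))
    return False


# Hypothetical encoding of the three-point example: s0 is the statement
# before point (1), s1 is X := 4, s2 is Y := X, and nothing follows s2.
succs = {"s0": ["s1"], "s1": ["s2"], "s2": []}
uses_x = {"s2"}
assigns_x = {"s1"}

live_at_1 = is_live_after(succs, uses_x, assigns_x, "s0")
live_at_2 = is_live_after(succs, uses_x, assigns_x, "s1")
live_at_3 = is_live_after(succs, uses_x, assigns_x, "s2")
print(live_at_1, live_at_2, live_at_3)
```

Here X is dead at point (1) and live at point (2). Point (3) comes out dead only because this sketch assumes the program ends there. Searching from every point separately is wasteful, which is one reason to prefer a dataflow formulation.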

Computing Liveness

- We can express liveness as a function of information transferred between adjacent statements, just as in copy propagation.
- Liveness is simpler than constant propagation, since it is a boolean property (true or false).
- That is, the lattice has two values, with false < true.
- It also differs in that liveness depends on what comes after a statement, not before: we propagate information backwards through the flow graph, from Lout (liveness information at the end of a statement) to Lin.

Liveness Rule 1

[Figure: statement p with successors s1, s2, s3, ..., sn. L(X) = true before at least one successor, and L(X) = ? before the others.]

    Lout(x, p) = lub { Lin(x, s) such that s is a successor of p }.

Here, least upper bound (lub) is the same as "or".

Liveness Rule 2

[Figure: statement ... := ...X..., with L(X) = true before it and L(X) = ? after it.]

Lin(X, s) = true if s uses the previous value of X.

The same rule applies to any other statement that uses the value of X, such as tests (e.g., X < 0).

Liveness Rule 3

[Figure: statement X := e, with L(X) = false before it and L(X) = ? after it.]

Lin(X, X := e) = false if e does not use the previous value of X.

Liveness Rule 4

[Figure: statement s, with L(X) = α both before and after it.]

Lin(X, s) = Lout(X, s) if s does not mention X.

Propagation Algorithm for Liveness

- Initially, let all Lin and Lout values be false.
- Set the Lout value at the program exit to true iff x is going to be used elsewhere (e.g., if it is global and we are analyzing only one procedure).
- As before, repeatedly pick s where one of rules 1-4 does not hold and update using the appropriate rule, until there are no more violations.
- When we're done, we can eliminate assignments to X if X is dead at the point after the assignment.

Example of Liveness Computation

[Figure: control-flow graph containing X := 4, the test A < B, and X := X * X, with L(X) = false marked at several points.]

Termination

- As before, a value can only change a bounded number of times: the bound being 1 in this case.
- Termination is guaranteed.
- Once the analysis is computed, it is simple to eliminate dead code, but having done so, we must recompute the liveness information.
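The liveness propagation algorithm can be sketched as a backward fixed-point iteration for a single variable X. Here Lin is the value immediately before a statement and Lout the value immediately after it. The encoding is an assumption made for illustration: each statement is a pair `(uses_x, assigns_x)`, `succs` lists successors, a statement with no successors is treated as a program exit, and the names `liveness`, `dead_assignments`, and `s1` to `s5` are hypothetical.

```python
def liveness(stmts, succs, live_at_exit=False):
    """stmts maps a name to (uses_x, assigns_x). Returns (lin, lout) for X."""
    lin = {s: False for s in stmts}
    lout = {s: False for s in stmts}
    changed = True
    while changed:
        changed = False
        for s, (uses_x, assigns_x) in stmts.items():
            if succs[s]:
                new_out = any(lin[t] for t in succs[s])   # rule 1: lub is "or"
            else:
                new_out = live_at_exit                    # program exit
            if uses_x:
                new_in = True                             # rule 2
            elif assigns_x:
                new_in = False                            # rule 3
            else:
                new_in = new_out                          # rule 4
            if (new_in, new_out) != (lin[s], lout[s]):
                lin[s], lout[s] = new_in, new_out
                changed = True
    return lin, lout


def dead_assignments(stmts, succs, live_at_exit=False):
    """Assignments to X after which X is dead; these may be deleted."""
    _, lout = liveness(stmts, succs, live_at_exit)
    return [s for s, (_, assigns_x) in stmts.items() if assigns_x and not lout[s]]


succs = {"s1": ["s2"], "s2": ["s3", "s4"], "s3": ["s5"], "s4": ["s5"], "s5": []}

# Before constant propagation: s1 is X := 3 and s5 is A := 2 * X (a use of X).
before = {"s1": (False, True), "s2": (False, False), "s3": (False, False),
          "s4": (False, False), "s5": (True, False)}
# After constant propagation: s5 has become A := 2 * 3 and no longer uses X.
after = dict(before, s5=(False, False))

dead_before = dead_assignments(before, succs)
dead_after = dead_assignments(after, succs)
print(dead_before, dead_after)
```

Before constant propagation the assignment in `s1` is still needed; afterwards it is reported as dead. Each value starts at false and can only flip to true, so the loop cannot run forever.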

SSA and Global Analysis

- For local optimizations, the single static assignment (SSA) form was useful.
- But applying it to a full CFG requires a trick.
- E.g., how do we avoid two assignments to the temporary holding x after this conditional?

      if a > b:
          x = a
      else:
          x = b
      # where is x at this point?

- Answer: a small kludge known as φ functions.
- Turn the previous example into this:

      if a > b:
          x1 = a
      else:
          x2 = b
      x3 = φ(x1, x2)

φ Functions

- An artificial device to allow SSA notation in CFGs.
- In a basic block, each variable is associated with one definition.
- φ functions in effect associate each variable with a set of possible definitions.
- In general, one tries to introduce them in strategic places so as to minimize the total number of φs.
- Although this device increases the number of assignments in IL, register allocation can remove many by assigning related IL registers to the same real register.
- Their use enables us to extend such optimizations as CSE elimination in basic blocks to Global CSE Elimination.
- With SSA form, it is easy to tell (conservatively) if two IL assignments compute the same value: just see if they have the same right-hand side. The same variables indicate the same value.

Summary

- We've seen two kinds of analysis:
  - Constant propagation is a forward analysis: information is pushed from inputs to outputs.
  - Liveness is a backward analysis: information is pushed from outputs back towards inputs.
- But both make use of essentially the same algorithm.
- Numerous other analyses fall into these categories, and allow us to use a similar formulation:
  - An abstract domain (abstract relative to actual values);
  - Local rules relating information between consecutive program points around a single statement; and
  - Lattice operations like least upper bound (or join) or greatest lower bound (or meet) to relate inputs and outputs of adjoining statements.
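The SSA remark about recognizing equal values by comparing right-hand sides can be sketched as follows. The IL representation is an assumption made for illustration: each assignment is a hypothetical triple `(target, op, args)` already in SSA form, and φ functions are left out. The sketch only groups candidates; actually reusing one value in place of another also requires that the first definition has been computed on every path to the second, which is not checked here.

```python
from collections import defaultdict


def same_value_groups(assignments):
    """Group SSA targets whose right-hand sides are textually identical."""
    groups = defaultdict(list)
    for target, op, args in assignments:
        groups[(op, tuple(args))].append(target)
    return [targets for targets in groups.values() if len(targets) > 1]


# Hypothetical IL: t1 and t3 have the same right-hand side; t4 does not,
# because a2 is a different SSA name (a different definition of a).
il = [
    ("t1", "+", ["a1", "b1"]),
    ("t2", "*", ["t1", "c1"]),
    ("t3", "+", ["a1", "b1"]),
    ("t4", "+", ["a2", "b1"]),
]
groups = same_value_groups(il)
print(groups)
```

Because every SSA name has exactly one definition, identical operand names denote identical values, so a purely textual comparison is safe, if conservative: it will not notice, for example, that `a1 + b1` and `b1 + a1` are equal.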