Lazy Code Motion. Jens Knoop FernUniversität Hagen. Oliver Rüthing University of Dortmund. Bernhard Steffen University of Dortmund

RETROSPECTIVE: Lazy Code Motion

Jens Knoop, FernUniversität Hagen
Oliver Rüthing, University of Dortmund
Bernhard Steffen, University of Dortmund

20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation ( ): A Selection, Copyright 2003 ACM $5.00.

1. CODE MOTION: THE HISTORY

Compiler techniques for eliminating redundant computations or moving invariant code out of loops have been known and widely used since the early seventies [4]. In 1979, Morel and Renvoise came up with an exciting new technique, the suppression of partial redundancies [20]. Their technique uniformly subsumed loop-invariant code motion, common subexpression elimination, and the elimination of redundant computations. Probably the most appealing facet of their technique was its structural purity: the transformation was based solely on data-flow analysis and did not require any specific knowledge of the control flow of the programs under investigation.

On the other hand, their original proposal was conceptually complex. It involved an equation system of four highly interacting global properties while still suffering from three deficiencies. First, too few partial redundancies were removed. Intuitively, this was due to Morel's and Renvoise's design decision to insert code at the nodes of the underlying flow graph rather than at its edges, which unnecessarily restricted the possible computation points. Second, code was moved too far, which led to unnecessary register pressure. And third, the node placement was based on bidirectional data-flow equations, which are conceptually difficult to comprehend and computationally more costly to solve.

Up to the early nineties, a number of modifications to the Morel-Renvoise algorithm had been proposed in order to address these deficiencies [2, 5]. While the first deficiency could be resolved by means of edge placement techniques, bidirectionality and register pressure were only partially addressed by means of heuristics.

2. LCM: OUR CONTRIBUTION

When we started our work on lazy code motion, we were convinced that research efforts based on modifying the Morel/Renvoise-style equations had led into a dead end. Partial redundancy elimination (PRE) is a beautiful technique with a simple underlying idea: expressions are hoisted to earlier program points, thereby increasing their potential to make the original computations fully redundant, which can then be eliminated. We had the strong belief that it must be possible to construct a PRE technique composed solely of simple and well-understood components.

Decomposing the problem. A key idea for attacking the problem was its decomposition based on a clean separation of concerns. We noticed that there are two optimization goals with a natural hierarchy: the primary goal is to reduce the number of computations to a minimum (computational optimality); the secondary goal is to avoid unnecessary code movement in order to minimize the lifetimes of temporaries and hence the register pressure (lifetime optimality).

Solving the problem. Fortunately, we already had an off-the-shelf solution for the first optimization goal. Investigating the relationship between model checking and data-flow analysis had led to a modal logic specification of a computationally optimal PRE following an as-early-as-possible code placement strategy [27]. We called the resulting transformation Busy Code Motion (BCM), as it hoists code as far as possible. Technically, it required only two simple unidirectional data-flow analyses. This simplicity revealed the solution to our secondary goal, the avoidance of unnecessary code motion: the code only had to sink back from the BCM insertion points as far as computational optimality was preserved, which can be realized simply by adding another unidirectional data-flow analysis.
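The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' original formulation: it uses an edge-based set of equations of the kind found in textbook presentations, handles a single expression, and assumes that the entry and exit blocks are empty. The function name `lazy_code_motion`, the predicate names (`antloc`, `transp`, `comp`), and the block names in the example are all hypothetical. The example graph is a diamond in which `a + b` is computed on one branch (`A`) and again after the join (`J`), so the computation in `J` is partially redundant.

```python
def lazy_code_motion(nodes, edges, antloc, transp, comp, entry, exit_):
    """Edge-based lazy code motion for a single expression (illustrative only).

    antloc[n]: block n evaluates the expression before changing its operands
    transp[n]: block n changes none of the operands
    comp[n]:   block n evaluates the expression and leaves its operands alone afterwards
    Assumes entry and exit_ are empty blocks.
    """
    preds = {n: [i for i, j in edges if j == n] for n in nodes}
    succs = {n: [j for i, j in edges if i == n] for n in nodes}

    # Analysis 1 (forward): availability at block exits.
    avout = dict.fromkeys(nodes, True)  # start optimistic, shrink to a fixpoint
    changed = True
    while changed:
        changed = False
        for n in nodes:
            avin = n != entry and all(avout[p] for p in preds[n])
            new = comp[n] or (avin and transp[n])
            if new != avout[n]:
                avout[n], changed = new, True

    # Analysis 2 (backward): anticipability (down-safety) at block entries.
    antin = dict.fromkeys(nodes, True)

    def antout(n):
        return n != exit_ and all(antin[s] for s in succs[n])

    changed = True
    while changed:
        changed = False
        for n in nodes:
            new = antloc[n] or (transp[n] and antout(n))
            if new != antin[n]:
                antin[n], changed = new, True

    # Busy placement: the earliest edges where an insertion is safe and needed.
    earliest = {}
    for i, j in edges:
        if i == entry:
            earliest[i, j] = antin[j]
        else:
            earliest[i, j] = (antin[j] and not avout[i]
                              and (not transp[i] or not antout(i)))

    # Analysis 3 (forward): how far insertions can sink without adding computations.
    later = dict.fromkeys(edges, True)
    laterin = {}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            laterin[n] = n != entry and all(later[p, n] for p in preds[n])
        for i, j in edges:
            new = earliest[i, j] or (laterin[i] and not antloc[i])
            if new != later[i, j]:
                later[i, j], changed = new, True

    return {
        "earliest": {e for e in edges if earliest[e]},
        "insert": {(i, j) for i, j in edges if later[i, j] and not laterin[j]},
        "delete": {n for n in nodes if n != entry and antloc[n] and not laterin[n]},
    }


# Diamond-shaped flow graph: a + b is computed in A and again in J, but not in B.
nodes = ["entry", "A", "B", "J", "exit"]
edges = [("entry", "A"), ("entry", "B"), ("A", "J"), ("B", "J"), ("J", "exit")]
computes = {"A", "J"}                      # blocks that evaluate a + b
antloc = {n: n in computes for n in nodes}
comp = dict(antloc)
transp = dict.fromkeys(nodes, True)        # no block assigns to a or b
result = lazy_code_motion(nodes, edges, antloc, transp, comp, "entry", "exit")
```

On this graph the busy (as-early-as-possible) placement marks both edges leaving `entry`, whereas the lazy placement inserts the computation only on the edge from `B` to `J` and marks the computation in `J` for replacement by the temporary. The number of evaluations per path is the same in both cases; the lazy variant simply keeps the temporary live over a shorter range.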
The resulting transformation, which solves the problem of unnecessary register pressure, hoists code just far enough to ensure computationally optimal results, which is why it is called Lazy Code Motion (LCM). This successful way of playing with simple analysis components was later extended to also control and minimize code size [26]. Here, the natural trade-off between the optimization goals led to different solutions depending on the chosen priority between size and speed.

3. THE IMPACT

How can one determine and measure the impact of a paper? As a first indication, a zeitgeisty approach might be to consult some pertinent Internet search engines. Doing so for the phrase "lazy code motion", a Google search yields around 670 results, Scirus comes up with about 235 Web references, and CiteSeer with around 75 citations of both the PLDI '92 paper on LCM and its 1994 TOPLAS journal version, all this in October. Of course, not all of the links resulting from these searches are truly or meaningfully related to LCM. Following some of the remaining ones suggests that the PLDI 1992 paper on LCM has had some impact on teaching, research, and industrial practice. To give a few examples, LCM is the subject of class assignments (UC Berkeley), is considered in courses on classic and "maybe-will-be classic" compiler papers (Rice), and is part of the body of knowledge for qualifying examinations in the area of programming languages and compilers (Georgia Tech). LCM has been incorporated in recent textbooks on compiler construction such as those of Muchnick [22] and Morgan [21].

ACM SIGPLAN 460 Best of PLDI

Browsing the PLDI proceedings of 1993 through 2002, there are 18 papers citing the PLDI 1992 paper or its TOPLAS version. Most notable are the PLDI 1997 and 1998 papers of Chow, Kennedy, et al. In [3, 19] they develop SSA-based algorithms for PRE and register promotion based on the LCM idea, which influenced the PRE implementations in several commercial and academic compilers. Examples are production compilers such as the Sun SPARCompiler language systems, SGI's and Intel's IA-64 compilers, and IBM's JIT compiler. Furthermore, LCM impacted the PRE optimization in open-source compilers such as the GNU compilers and the Open Research Compiler (ORC), a descendant of SGI's Pro64 compiler, which was recently released by Intel to the open-source community. Similarly, LCM influenced research compiler projects such as SUIF, Soot, and the CoSy compilation system. In the meantime, the latter has also turned into a commercial product.

Besides immediate adaptations [6, 23], LCM also stimulated research in other areas. To varying degrees, it influenced the development of profile-based refinements of PRE [8], the development of techniques for register promotion [19, 1] and array bounds check elimination [18], as well as extensions to other optimization goals such as constant propagation and strength reduction [7, 9], to name a few. As indicated above, LCM also broadly influenced our own research.
Its structural clarity paved the way to systematically extend PRE to different settings and paradigms such as the interprocedural, explicitly parallel, and predicated ones [10, 17, 11], and it provided the key to developing related techniques such as lazy strength reduction, partial dead-code elimination, and assignment motion [16, 14, 15], as well as to the redesign of previous algorithms for global value numbering [13]. Moreover, it opened the gate to new applications such as communication optimization in data-parallel languages [12], and to thoroughly investigating the impact of interdependent program transformations [24] and of the interdependencies of bidirectionality and critical edges in PRE [25]. For a detailed account of PRE see [...].

Last but not least, the LCM paper won the Most Influential PLDI Paper Award 2002 (for 1992).

Acknowledgements

We would like to thank Hans Boehm, Michael Burke, Sun C. Chan, Fred Chow, Jim Dehnert, Michael Hind, Roy Ju, Robert Kennedy, Jim Larus, Shin-Ming Liu, and Peng Tu for sharing with us their knowledge of the impact of LCM on related techniques and of its usage in commercial compilers.

REFERENCES

[1] R. Bodík, R. Gupta, and M. L. Soffa. Load-reuse analysis: Design and evaluation. In Proc. ACM SIGPLAN PLDI '99, ACM SIGPLAN Not., 34(5):64-76,
[2] F. Chow. A Portable Machine Independent Optimizer Design and Measurements. PhD thesis, Stanford Univ., Dept. of Electrical Eng., Stanford, CA,
[3] F. Chow, S. Chan, R. Kennedy, S. Liu, R. Lo, and P. Tu. A new algorithm for partial redundancy elimination based on SSA form. In Proc. ACM SIGPLAN PLDI '97, ACM SIGPLAN Not., 32(5): ,
[4] J. Cocke and J. T. Schwartz. Programming languages and their compilers. Courant Inst. Math. Sciences, NY,
[5] D. M. Dhamdhere. Practical adaptation of the global optimization algorithm of Morel and Renvoise. ACM Trans. Prog. Lang. Syst., 13(2): , Tech. Corr.
[6] K.-H. Drechsler and M. P. Stadel. A variation of Knoop, Rüthing and Steffen's LAZY CODE MOTION. ACM SIGPLAN Not., 28(5):29-38,
[7] M. Hailperin. Cost-optimal code motion. ACM Trans. Prog. Lang. Syst., 20(6): ,
[8] R. N. Horspool and H. C. Ho. Partial redundancy elimination driven by a cost-benefit analysis. In Proc. 8th Israeli Conf. on Computer Systems and Software Engineering (ICSSE '97), pages ,
[9] R. Kennedy, F. Chow, P. Dahl, S.-M. Liu, R. Lo, and M. Streich. Strength reduction via SSAPRE. In Proc. 7th Conf. Comp. Construction (CC '98), LNCS 1383, ,
[10] J. Knoop. Optimal Interprocedural Program Optimization: A new Framework and its Application. LNCS Tutorial 1428, Springer-V.,
[11] J. Knoop, J.-F. Collard, and R. D. Ju. Partial redundancy elimination on predicated code. In Proc. 7th Static Analysis Symp. (SAS 2000), LNCS 1824, ,
[12] J. Knoop and E. Mehofer. Distribution assignment placement: Effective optimization of redistribution costs. IEEE Trans. Parallel and Distributed Systems (TPDS), 13(6): ,
[13] J. Knoop, O. Rüthing, and B. Steffen. Code motion and code placement: Just synonyms? In Proc. 7th European Symp. on Prog. (ESOP '98), LNCS 1381, ,
[14] J. Knoop, O. Rüthing, and B. Steffen. Partial dead code elimination. In Proc. ACM SIGPLAN PLDI '94, ACM SIGPLAN Not., 29(6): ,
[15] J. Knoop, O. Rüthing, and B. Steffen. The power of assignment motion. In Proc. ACM SIGPLAN PLDI '95, ACM SIGPLAN Not., 30(6): ,
[16] J. Knoop, O. Rüthing, and B. Steffen. Lazy strength reduction. J. Prog. Lang., 1(1):71-91,
[17] J. Knoop and B. Steffen. Code motion for explicitly parallel programs. In Proc. 7th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPoPP '99), ACM SIGPLAN Not., 34(8):13-24,
[18] P. Kolte and M. Wolfe. Elimination of redundant array subscript range checks. In Proc. ACM SIGPLAN PLDI '95, ACM SIGPLAN Not., 30(6): ,
[19] R. Lo, F. C. Chow, R. Kennedy, S. M. Liu, and P. Tu. Register promotion by sparse partial redundancy elimination of loads and stores. In Proc. ACM SIGPLAN PLDI '98, ACM SIGPLAN Not., 33(5):26-37,
[20] E. Morel and C. Renvoise. Global optimization by suppression of partial redundancies. Comm. ACM, 22(2):96-103,
[21] R. Morgan. Building an Optimizing Compiler. Digital Press,
[22] S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann,
[23] V. K. Paleri, Y. N. Srikant, and P. Shankar. A simple algorithm for partial redundancy elimination. ACM SIGPLAN Not., 33(12):35-43,
[24] O. Rüthing. Interacting Code Motion Transformations: Their Impact and Their Complexity. LNCS Springer-V.,
[25] O. Rüthing. Code motion in the presence of critical edges without bidirectional data flow analysis. Science of Computer Programming, 39:3-29,
[26] O. Rüthing, J. Knoop, and B. Steffen. Sparse code motion. In Conf. Rec. 27th Symp. Principles of Prog. Lang. (POPL 2000), pages ACM, NY,
[27] B. Steffen. Data flow analysis as model checking. In Proc. 1st Int. Conf. Theoretical Aspects of Computer Software (TACS '91), LNCS 526, pages Springer-V.,


E-path PRE Partial Redundancy Elimination Made Easy

Dhananjay M. Dhamdhere
dmd@cse.iitb.ac.in
Department of Computer Science and Engineering
Indian Institute of Technology, Mumbai 400 076 (India)

Abstract