Interprocedural Dependence Analysis and Parallelization

RETROSPECTIVE: Interprocedural Dependence Analysis and Parallelization

Michael G Burke
IBM T.J. Watson Research Labs, P.O. Box 704, Yorktown Heights, NY USA
mgburke@us.ibm.com

Ron K. Cytron
Department of Computer Science and Engineering, Washington University, Campus Box 1045, St Louis, MO USA
cytron@acm.org

ABSTRACT

The area of dependence analysis has served as grounds for fruitful research as well as practical implementation. Compilers and tools that utilize dependence information can generate code that takes advantage of parallel resources and storage hierarchies on modern architectures. Here, we offer some historical background on the context and thinking that fostered our 1986 paper. We also attempt to summarize the direction research in this area has taken since the paper's appearance.

Background

In 1985, when this paper was submitted to PLDI, the authors of this paper were members of the PTRAN (Parallel TRANslation) group at IBM T. J. Watson Research Labs in Yorktown Heights. Fran Allen, now Research Staff Member Emerita, directed the group, whose research included program optimizations and transformations for parallel architectures. Fran asked us to think about a compile-time test for array overlap that would be appropriate for Fortran, where arrays are statically declared but can overlap in nonobvious ways across different compilation units. The problem thus posed was interprocedural in nature, but it was complicated by Fortran COMMON blocks and other such structures by which a given location in memory could be known by different names.

We surveyed literature on dependence analysis and concluded that a subscript test, of the kind formulated by Banerjee-Wolfe, would be appropriate. That test, however, proceeds subscript-by-subscript, and holds only if array indices do not violate their declared bounds.
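
The dependence of subscript-by-subscript reasoning on declared bounds can be illustrated with a minimal Python sketch. It is not the Banerjee-Wolfe test, which handles affine subscripts over loop bounds; it is a constants-only stand-in. The names `linear_offset` and `subscriptwise_may_overlap` and the declaration `A(10,10)` are assumptions made for illustration and do not come from the paper. The sketch relies only on Fortran's column-major storage order.

```python
def linear_offset(subscripts, extents):
    """Column-major (Fortran-style) offset of a 1-based subscript tuple.

    No bounds checking is performed, so an out-of-range subscript
    simply lands on some other element's storage.
    """
    offset, stride = 0, 1
    for sub, extent in zip(subscripts, extents):
        offset += (sub - 1) * stride
        stride *= extent
    return offset


def subscriptwise_may_overlap(ref_a, ref_b):
    """Naive per-dimension check on constant subscripts (illustrative only):
    two references overlap only if they agree in every dimension."""
    return all(a == b for a, b in zip(ref_a, ref_b))


extents = (10, 10)        # hypothetical declaration: REAL A(10,10)
in_bounds = (1, 2)        # A(1,2)
out_of_bounds = (11, 1)   # A(11,1): first subscript exceeds its declared bound

# The per-dimension check reports no overlap ...
print(subscriptwise_may_overlap(in_bounds, out_of_bounds))
# ... yet both references address the same storage location.
print(linear_offset(in_bounds, extents), linear_offset(out_of_bounds, extents))
```

In this sketch the two references differ in every dimension but share one linear offset, so a per-dimension conclusion of independence is sound only when every subscript stays within its declared bounds.
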
Fortran offered no mechanism to determine the size of an array dimension at runtime, nor were runtime violations of declared bounds cause for terminating a Fortran program. Thus, Fortran programmers violated declared array bounds with abandon. It occurred to us that the lower-level view of an array subscript is simply a linear index in memory, and that all Fortran compilers eventually generate code to treat arrays of higher dimension as a one-dimensional vector. By applying Banerjee-Wolfe to the linearized subscript form, Fran's problem could be solved conservatively: the compile-time test is reliable concerning array independence, but the test may flag some arrays as overlapping when in fact they are independent. Fortunately, this approach is appropriate for a compile-time test. Further, it turned out that for some higher-dimension arrays, tests on the linearized form could prove independence where subscript-by-subscript testing could not.

Our paper is perhaps better known for its hierarchical reformulation of Wolfe's direction vectors. A direction vector shows the direction of a data dependence in terms of an iteration space. In a single-loop environment, Wolfe's "<" dependence (called a true dependence by Kennedy and Allen) moves forward through the iteration space, while Wolfe's ">" dependence moves backward. We showed that dependence testing could proceed first by testing for dependences over all direction vectors (which we called "*"). If the test is positive, then further refinement of "*" into Wolfe's direction vectors can provide more information about the nature of the dependence. This reformulation was useful, especially for the problem posed by Fran, because many array expressions that might appear to overlap have absolutely no overlap once the linearized references are obtained. The "*" test quickly obviates the need for further dependence testing between the arrays.

Subsequent Developments

At the time of our paper's publication in 1986, we had implemented the dependence-testing aspects of the paper and verified the efficiency on the Perfect benchmarks, a popular suite of Fortran benchmarks at that time. While our paper suggested that interprocedural subscript analysis might uncover more parallelism, we did not implement the work to that extent, and so that hypothesis remained open. Michael Hind added the full interprocedural subscript analysis described in our paper. His experiments showed that the extra analysis did not in fact expose much more parallelism than did the intraprocedural version we had implemented [2].

Subsequently, Mary Hall, continuing work she had begun at Rice but now at Stanford working with Monica Lam and others, showed that exposing significant parallelism on the Perfect benchmarks required powerful transformations like array privatization. The Stanford experiments [1] were performed against the PTRAN measurements on the Perfect benchmarks that Hind et al. had described in our paper. In a subsequent paper [3], the Stanford group cited our approach as the standard one for computing direction vectors. In his book [4], Michael Wolfe acknowledged and adopted our framework for computing dependence relations hierarchically.

While dependence testing of the form described in our paper does not see much use these days for uncovering parallelism in dusty-deck Fortran programs, sophisticated analysis of this form is present in tools and in compilers that restructure programs for advanced architectures, including those that feature elements of parallelism as well as deep storage hierarchies. At IBM, our dependence test found its way into the IBM XL Fortran product, which first shipped in 1996, ten years after the publication of our paper.

Dependence analysis is a specialized area of computer science, but it has served as a fertile ground for theoretical and practical research. We are pleased to have been part of its noble history and we thank the selection committee for this honor.

1. ACKNOWLEDGEMENTS

This work builds on the work of two groups who pioneered the area of dependence analysis: from Illinois, David Kuck, Utpal Banerjee, and Michael Wolfe; from Rice, Ken Kennedy and Randy Allen. The authors thank Fran Allen for inspiring and supporting this work and Vivek Sarkar for advocating our work for this recognition.

REFERENCES

[1] M. W. Hall, S. Amarasinghe, B. Murphy, S. Liao, and M. Lam. Detecting coarse-grain parallelism using an interprocedural parallelizing compiler. Proceedings of Supercomputing 95.
[2] Michael Hind, Michael Burke, Paul Carini, and Sam Midkiff. An Empirical Study of Precise Interprocedural Array Analysis. Scientific Programming, 3(3).
[3] Maydan, Hennessy, and Lam. Efficient and exact data dependence analysis. PLDI.
[4] Michael J. Wolfe. Optimizing Supercompilers for Supercomputers. Pitman, London and The MIT Press, Cambridge, Massachusetts. In the series Research Monographs in Parallel and Distributed Computing. This monograph is a revised version of the author's Ph.D. dissertation published as Technical Report UIUCDCS-R , U. Illinois at Urbana-Champaign.

20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation ( ): A Selection, Copyright 2003 ACM $5.00.
ACM SIGPLAN 139 Best of PLDI

