Automating Big Refactorings for Componentization and the Move to SOA
IBM Programming Languages and Development Environments Seminar 2008
Aharon Abadi, Ran Ettinger and Yishai Feldman
Software Asset Management Group, IBM Haifa Research Lab
Background: What is Refactoring?
The process of gradually improving the design of an existing software system by performing source code transformations that improve its quality, so that the system becomes easier to maintain and its parts easier to reuse, while preserving the behavior of the original system.

For example: Extract Method

    void printOwing(double amount) {
        printBanner();
        // print details
        print("name: " + _name);
        print("amount: " + amount);
    }

becomes

    void printOwing(double amount) {
        printBanner();
        printDetails(amount);
    }

    void printDetails(double amount) {
        print("name: " + _name);
        print("amount: " + amount);
    }

Source: Martin Fowler's online refactoring catalog
Prior Art: Refactoring to Design Patterns
Our Interest: Big Refactorings
Enterprise Architecture Patterns
The Gap: Techniques and Tools for Enterprise Refactoring?
Automating Big Refactorings

Enterprise Refactorings
    Separate Presentation Code from Business Logic (introduce the MVC pattern), Extract Reusable Services (implementing SOA), etc.
Composition
Small Refactorings
    Rename Paragraph, Split/Merge Paragraphs, Extract/Inline Paragraph/Section/Program, Extract Slice, Swap Consecutive (Independent) Statements/Sentences/Paragraphs/Sections, Split/Merge (Consecutive) Conditionals, Loop-Invariant Code Motion, etc.
Required Quality
    Flexibility: allow (user-determined) choice between alternatives
    Applicability: avoid unnecessary rejections by precise identification of (weak) preconditions
    Reliability*: guarantee behavior preservation
Enabling Technology: Deep Program Analysis
    Program analysis infrastructure for legacy enterprise software systems: a powerful static analysis infrastructure using the plan-calculus intermediate representation

* See http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports for examples of bugs in modern IDEs
Example: Introduce the MVC Pattern
As-Is Version: Photo Album Web Application
Source: Alex Chaffee, draft Refactoring to MVC online article, 2002
To-Be Version: Photo Album Web Application
[Figure: the application divided into View, Presentation Model, Controller and Model]
Source: Alex Chaffee, draft Refactoring to MVC online article, 2002
Small Refactorings on the Move to MVC

All kinds of renaming
    Variables, fields, methods, etc.
Extracting program entities
    Constants, local (temp) variables, parameters, methods (Extract Method, Replace Temp with Query, Decompose Conditional), classes (Extract Class, Extract Superclass, Extract Method Object)
    Some reverse refactorings too, to inline program entities
Moving program entities
    Constants, fields, methods (Move Method, Pull-Up Method), statements (Swap Statements), classes
Replace Algorithm
Shortcomings of Eclipse on the Move to MVC

Missing implementation for key transformations
    Extract Class, Extract Method Object
Buggy implementation of some refactorings
    Extract/Inline Local Variable: ignores potential modification of parameters (on any path from source to target location)
    See http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports for examples of bugs in (earlier releases of) modern IDEs
Restricted implementation of existing refactorings
    Extract Method: contiguous code only; weak control over parameters
    Move Method: source class must have a field with the type of the target class
    Extract Local Variable: no control over location of declaration
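The Extract/Inline Local Variable problem listed above can be illustrated with a small sketch. The class and method names below are hypothetical and are not taken from the bug reports; the example only shows why a modification of a parameter between the declaration and the use of a temp makes a naive inlining unsafe.

```java
// Hypothetical illustration; names are invented, not taken from any bug report.
final class InlineTempPitfall {

    // Original: the temp captures x * 2 before the parameter x is modified.
    static int original(int x) {
        int doubled = x * 2;
        x = x + 1;              // parameter modified between declaration and use
        return doubled + x;
    }

    // Naive Inline Local Variable: the initializer is copied to the use site,
    // where x already holds its new value, so the behavior changes.
    static int naivelyInlined(int x) {
        x = x + 1;
        return x * 2 + x;
    }

    public static void main(String[] args) {
        System.out.println(original(3));        // prints 10
        System.out.println(naivelyInlined(3));  // prints 12
    }
}
```

A tool that checks every path from the declaration to the use for assignments to `x` would reject this request, or offer an alternative, instead of silently changing the result.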
Thanks!
Backup
Internal Representation: The Plan Calculus

Wide-spectrum
    Specification to implementation
Canonical
    Abstracts away from syntactic variations
Language independent
    All legacy languages have similar capabilities
Expressive
    Directly expresses program semantics in terms of dataflow and control-flow
Convenient for machine manipulation
    Naturally expresses semantic transformations

Rich, C. 1986. A formal representation for plans in the Programmer's Apprentice. In Readings in Artificial Intelligence and Software Engineering, C. Rich and R. C. Waters, Eds. Morgan Kaufmann Publishers, San Francisco, CA, 491-506.
Program Sliding

A family of provably correct code-motion untangling transformations
    Automates slice extraction: sequential composition of a selected slice with its complement (i.e., co-slice)
    Useful for refactoring, componentization, the move to SOA, obfuscation, etc.
    Combines statement reordering with code duplication, including duplication of assignments
Sliding is particularly strong in
    Preserving behavior
    Maximizing reuse (of the extracted computation's results, in the complement)
    Minimizing code duplication, i.e., yielding a smaller, more desirable complement
    Improving applicability, i.e., fewer reasons to reject a request
Benefits from the best of leading earlier solutions without suffering some of their respective deficiencies

    MOVE 0 TO TOTAL-SALE
    MOVE 0 TO TOTAL-PAY
    PERFORM VARYING i FROM 1 BY 1 UNTIL i > DAYS
        ADD SALE(i) TO TOTAL-SALE
        COMPUTE TOTAL-PAY = TOTAL-PAY + 0.1*SALE(i)
        IF SALE(i)>1000
            ADD 50 TO TOTAL-PAY
        END-IF
    END-PERFORM
    COMPUTE PAY = TOTAL-PAY / DAYS + 100
    COMPUTE PROFIT = 0.9*TOTAL-SALE - COST

Example source: Lakhotia and Deprez (rewritten in COBOL)
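To make the idea of composing a slice with its complement concrete, here is a hand-written Java sketch of what sliding the computation of PAY out of the fragment above might look like. It is an illustration under assumptions, not output of the tool described in the slides: the class and method names are hypothetical, the COBOL table SALE is modeled as a `double[]`, and DAYS is taken to be its length.

```java
// Hand-written sketch; names are hypothetical and the code is not tool output.
final class SlidingSketch {

    // Transliteration of the COBOL fragment: one loop feeds both results.
    static double[] tangled(double[] sale, double cost) {
        int days = sale.length;
        double totalSale = 0;
        double totalPay = 0;
        for (int i = 0; i < days; i++) {
            totalSale += sale[i];
            totalPay += 0.1 * sale[i];
            if (sale[i] > 1000) {
                totalPay += 50;
            }
        }
        double pay = totalPay / days + 100;
        double profit = 0.9 * totalSale - cost;
        return new double[] { pay, profit };
    }

    // The slice: every statement that contributes to PAY, and nothing else.
    static double computePay(double[] sale) {
        int days = sale.length;
        double totalPay = 0;
        for (int i = 0; i < days; i++) {
            totalPay += 0.1 * sale[i];
            if (sale[i] > 1000) {
                totalPay += 50;
            }
        }
        return totalPay / days + 100;
    }

    // The slice followed by its complement (co-slice). The loop header is
    // duplicated, but the complement no longer carries any PAY computation.
    static double[] slid(double[] sale, double cost) {
        double pay = computePay(sale);
        double totalSale = 0;
        for (int i = 0; i < sale.length; i++) {
            totalSale += sale[i];
        }
        double profit = 0.9 * totalSale - cost;
        return new double[] { pay, profit };
    }

    public static void main(String[] args) {
        double[] sale = { 500, 1500, 2000 };
        double[] a = tangled(sale, 1000);
        double[] b = slid(sale, 1000);
        System.out.println(a[0] == b[0] && a[1] == b[1]);  // same PAY and PROFIT
    }
}
```

Each variable is updated by the same operations in the same order in both versions, which is why the two results can be compared for exact equality here.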
Towards a COBOL Refactoring Catalog

Rename Paragraph
    This refactoring might look trivial, but as with the renaming of variables, it must be done with care: the new name must be valid, it must not conflict with existing names, and it must be replaced correctly in each call (PERFORM, GO TO, etc.), without violating any column restrictions.
Split/Merge Paragraphs
    When merging two consecutive paragraphs, one must check that the second is not referenced, or if it is, that each reference always follows a call to the first paragraph, such that the two calls can be merged. Similarly, one must verify that any call to the first paragraph is either followed by a call to the second, or is a non-returning call that implies fall-through to the second paragraph.
Extract/Inline Paragraph/Section/Program
    Could support clone detection too, such that upon extraction, the tool will identify (at least exact) clones of the selected code, and suggest replacing them too with a call to the newly introduced program.
Extract Slice (through Sliding)
    First support the extraction of the code for computing a set of variables in a selected compound statement (or sentence); later add support for extraction from internal program points, i.e., the slicing criterion involves pairs of a program point and (sets of) variables of interest (at that particular point); and finally support arbitrary method extraction, i.e., the slicing criterion involves a set of statements (or sentences), not necessarily contiguous, for extraction.
Swap Consecutive (Independent) Executable Program Entities
    Such as compound statements, sentences, and even paragraphs or sections.
Split/Merge (Consecutive) Conditionals
    So long as the two instances of the conditional's predicate are guaranteed to evaluate to the same value.
Loop-Invariant Code Motion
    Computation flavor: a loop-invariant computation is moved inside/outside that loop, as in optimizing compilers.
    Conditional flavor: instead of a computation, it is a loop-invariant conditional being moved. If moved out, the loop itself is duplicated, once for each branch of the conditional, with each copy simplified based on the known result of the conditional.
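The conditional flavor of Loop-Invariant Code Motion can be sketched as follows. The example is in Java rather than COBOL, and the class, method and parameter names are hypothetical; it only illustrates moving a loop-invariant conditional out of a loop, duplicating the loop for each branch, and simplifying each copy.

```java
// Hypothetical illustration of moving a loop-invariant conditional out of a loop.
final class InvariantConditionalSketch {

    // Before: the test on `discount` does not change inside the loop,
    // yet it is evaluated on every iteration.
    static int before(int[] prices, boolean discount) {
        int total = 0;
        for (int p : prices) {
            if (discount) {
                total += p - p / 10;
            } else {
                total += p;
            }
        }
        return total;
    }

    // After: the conditional is moved out, the loop is duplicated for each
    // branch, and each copy keeps only the statements of its branch.
    static int after(int[] prices, boolean discount) {
        int total = 0;
        if (discount) {
            for (int p : prices) {
                total += p - p / 10;
            }
        } else {
            for (int p : prices) {
                total += p;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[] prices = { 100, 250, 40 };
        System.out.println(before(prices, true) == after(prices, true));
        System.out.println(before(prices, false) == after(prices, false));
    }
}
```

The transformation is only behavior-preserving when nothing in the loop body can change the value of the predicate, which is the kind of precondition the program analysis has to establish.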