Evaluating Software Maintenance Cost Using Functional Redundancy Metrics

Size: px

Start display at page:

Download "Evaluating Software Maintenance Cost Using Functional Redundancy Metrics"

Phoebe Warren
5 years ago
Views:

1 1 Evaluating Software Maintenance Cost Using Functional Redundancy Metrics Takeo Imai, Yoshio Kataoka, Tetsuji Fukaya System Engineering Laboratory Corporate Research & Development Center Toshiba Corp. a Abstract Source code copying for reuse (code cloning) is often observed in software implementations. Such code cloning causes difficulty when software functionalities are modified: i.e, cloned codes increase the maintenance cost of software. We aim to estimate the maintenance cost caused by clones. We propose a novel approach, which evaluates influence of cloned codes over the maintenance cost. The basic idea is to measure Functional Redundancy(FR): A degree of propagation of clone-potential functions. FR is measured as follows: First, we cluster functions in the software according to similarities between them. Second, we make n-ary weighted tree(fr tree) based on the cluster. Finally, we measure FR by weight of each node in FR-tree. In this paper, we describe the details of our proposal. We also apply the approach to 17K-ELOC C code to demonstrate its effectiveness. Keywords FR, Functional Redundancy, Code Cloning, Software Maintenance, Similarity, Cost I. INTRODUCTION In most software projects, quite a large proportion of the cost is used in maintenance [13]. We believe some effective method is needed to reduce the cost on software maintenance. Our primal objective is to construct a cost-effective method for maintenance management, such as the following: 1. Cost estimation on software maintainability 2. Problem detection using the estimation result 3. Activity control on the maintenance implementation In order to achieve our purpose, the achievement of the first step is the most important issue of our study. We therefore aim to evaluate the maintenance cost with quantitative metrics. There are several factors that affect the maintenance cost. One of the factors is code cloning, which occurs quite often. It also has a severe impact on the cost. We discuss the detail of this problem below. Code cloning is a software reuse technique using source code copying. Developers t to use this technique commonly for various reasons. One of the reasons is that it is easier to use existing software than to program from scratch. Another is that the code which developers want to reuse is usually not designed to be reused as a black-box. Therefore, many similar segments of a code are commonly observed throughout a software or a software project. Such scattered segments of codes (we call them simply as cloned codes or clones) however cause adverse effects, some of which are listed in [5]. We cite several such items: All authors are with System Engineering Laboratory, Corporate Research & Development Center, Toshiba Corporation, 1 Komukai-Cho, Saiwai-Ku, Kawasaki , Japan. Tel: Fax: E- mail:{takeo.imai, yoshio.kataoka, tetsuji.fukaya}@toshiba.co.jp The same software bug reoccurs throughout software despite many local fixes. Code reviews and inspections are needlessly exted. It becomes difficult to locate and fix all instances of a particular mistake. Software defects are replicated through the system. Developers create multiple unique fixes for bugs with no method of resolving the variations into a standard fix. The items above principally cause problems when a software is modified. If developers wish to modify a code segment, all clones of the segment potentially need to be modified, or the same software bug (or enhancement) reoccurs. That is why we focus on the code cloning problem as a significant cost-intensive issue. In this paper, our purpose is to construct a metric to estimate the software modification cost caused by code clones. Our first candidate for such metric was clone percentage, the ratio of cloned codes to the whole source code the in size; in LOC (Lines of Code), for example. We initially thought this way for the following reasons: In a previous work, Bosch [4] estimated the modification cost with the following equation. cost = requests LOC/request hour/loc where request is a modification request for the source code. Bosch also introduced the study of Henry et al.[6], which took statistics of software productivity with LOC/hour. Therefore, if LOC/request is calculated, we can estimate the cost per change request. Hence we thought clone percentage as the estimation of LOC caused by clones, and attempted to apply the equation above as our cost-estimation model. We, however, found an oversight in such a model. As we cited above, it becomes difficult to locate and fix all clones: i.e., clones cause an increased cost on identification: identifying the segments to be modified in the source code. Many of previous works dealing with clones [3][7][9][8][1] have focused on code detection techniques. This fact indicates the difficulty of identifying clones in a software. Here we assume a simple model of software modification with clones. The NS chart in Fig. 1 shows the flow of the model. When a request occurs, the modified place and its clones are identified, manually or with a clone-detection method. After

2 2 True True (requested) identify design implement another clone? False another change? False The second feature is to use similarity between two functions. Here, we do not indicate the definition of similarity. We simply assume that similarity is described in the real number, and the range is [0..1], such as percentage. Clones (clone-potential functions) rarely match each other exactly. i.e., clones match in some similarity. As we described in Section I, clones may increase the number of modified times. Therefore, as the similarity becomes larger, the chance of required modification may become greater. We must note here that the method for calculating similarity is not given in this paper, because the method is out of our concern. We can use various methods here for similarity estimation. Fig. 1. software modification model with clones that fixed codes are designed and implemented onto identified places. We must note the implement phase are repeated by the number of clones. Moreover, there may be another pattern of clones or a leak in clone detection: They can make other requests of modification. In this case the whole modification steps are iterated again, and the maintenance cost increases drastically. That is to say, there are two major factors that increase the maintenance cost: the modified size and the number of modified times. And the second factor is increased by clone propagation: the number of clones and the number of patterns of clones. In this paper, we focus on the clone propagation, and propose a metric named Functional Redundancy (FR), which stands for a degree of clone-potential function propagation. This paper is organized as follows: First, we discuss the detail of FR in Section II. We then apply our FR evaluation approach to 17K-ELOC sized practical software written in C in Section III. In Section IV we introduce some additional uses of FR for clone management. We discuss related works in comparison with ours in Section V. Finally we conclude this paper in Section VI. II. FUNCTIONAL REDUNDANCY A. Basic idea The FR metric has two features. First, the metric focuses on clone-potential functions: functions that possibly contain clones. One of the reasons of our focusing on clone-potential functions is a nature of clones: a segment of code and its clones basically have the same functionality, while a function in a source code is an explicit unit of functionality in a software. In addition, our experience alludes that many functions are cloned entirely. Another reason is, as we described above, that we take clone propagation into consideration in evaluating FR. In light of our objective, it is sufficient to estimate the number of functions containing clones. Such a feature, however, has a demerit. The estimation result might be inaccurate when the size of each function is uneven because we have to normalize the size of a source code by function. The suitability of the feature will be evaluated later. B. Measurement of FR B.1 similarity between functions Functions and their similarities can be illustrated as in Fig.2. Circles which contain numbers mean different functions (from 1 to 8), and values on each lines stand for the similarity ([0..1] in this figure) between two functions Fig. 2. functions and their similarity With the similarity, functions form some classified groups, which are shown as ovals in Fig. 2. The classification heuristic is as in the following, for example: 1. Set a threshold as the minimum (i.e, 0). 2. Raise the threshold. 3. If a value of the similarity becomes lower than the threshold, connect the relation associated with the value, and group functions that are connected with one another. 4. Return to 2 until the threshold comes up to the maximum (i.e, 1). Otherwise quit. In the example of Fig. 2, functions are grouped into two groups when the threshold is equal to 0.1: {1,2,3,4,5} and {6,7,8}. When the threshold is 0.4, there are 4 groups, {1,2,3}, {4, 5}, {6}, and {7,8}. C. FR tree An FR tree is an n-ary tree with which functions and their similarities are illustrated. The example of Fig.2 with a FR tree is shown in Fig. 3.

3 3 0.1(8) 0.4(5) 0.8(3) 1 0.6(2) 0.4(3) 0.7(2) Node: Fig. 3. FR tree Similarity (NL) procedure MakeTree(pairs: array of Pair) var T: Tree; N: Node; begin sort_by_similarity(pairs, decreasing); for i := 1 to sizeof(pairs) do begin if T contains pairs[i].node[1] and T contains pairs[i].node[2] then return; for j := 1 to 2 do begin new(n); register(n, T); if T contains pairs[i].node[j] then add tree_top(pairs[i].node[j]) to N.child; else add pairs[i].node[j] to N.child; An FR tree consists of nodes and their parent-child relations, like an ordinary n-ary tree. The tree also has a top (a node that does not have the parent node), external nodes (nodes that do not have the child node) and internal nodes (nodes that are not external ones). An external node in a tree corresponds to a function in one software. Every function has one corresponding external node. An internal node has a node (or nodes) whether internal or external. Additionally, an internal node has two extra weights: The number of leaves (NL) and Similarity. These weights are defined as follows. 1. Given any node N, NL of N (NL(N)) is the number of leaves N contains. Here, a leaf contained in a node N corresponds to (a) an external node that is a child of N, or (b) a leaf that is contained in a child of N. The number of leaves a node K contains can be calculated recursively, as follows. leaf(k) = leaf(k c ) + child ext (K) where K c child int(k) leaf(n) is a set of all leaves N contains, child ext (N) is a set of all external child nodes of N, child int (N) is a set of all internal child nodes of N, and S be the size of a set S. NL(N) corresponds to leaf(n). 2. Prior to defining similarity weight, we introduce someleaf(k) and conleaf(k, s). Let a node K have n children. We define a set someleaf(k) ={k 1,k 2..., k n }, where every element k i is K s child node when k i is external, or one of the leaves K s child has when k i is internal. Given the similarity s, we define conleaf(k, s) that satisfies the condition below: conleaf(k, s) def = {someleaf(k) return; Fig. 4. Algorithm of Making FR Tree k someleaf(k), k i someleaf(k), similar(k, k i )=s(1 i n, k k i )} where similar(k, k i ) equals the similarity between k and k i. Here we describe the similarity attribute of a node N as s(n). s(k) is equivalent to s iff both of the conditions (a) and (b) below are met: (a) There exists at least one conleaf(k, s). (b) For all s 1 s, there is no conleaf(k, s 1 ). In addition to the definitions above, an FR tree is formed with the following rule: 3. For all two nodes P and C whose relation is a parent-child one, the two similarities s(p ) and s(c) are related as follows: s(c) s(p ) D. Measurement process D.1 making FR tree Fig. 4 describes the algorithm to construct an FR tree, written in pseudo-pascal code. As an input, all function pairs and the similarity of every pair are given. First, the pairs are sorted by the similarity in the decreasing order. Then, each pair is added to the tree in order. D.2 measuring FR Here we measure FR based on an FR tree. The concept of FR is given below: As similarity between functions becomes greater, the possibility increases that these functions are clones, i.e., FR should be greater. FR should also be greater when there are more similar functions in a group.

4 4 Along with such concept, we measure FR from similarity and the number of leaves (or the number of children) each node of the tree has as weights. We propose two metrics, one of which can be an alternative of the other. We call them as FR1 and FR2. FR1 FR1 of a program P (fr 1 (P )) is defined as follows. fr 1 (P ) def = ( child(n) 1) s(n) +1 N nodes int(p ) where nodes int (P ) is a set of all internal nodes contained in FR tree of P. child(n) is a set of all the child nodes of N, whether internal or external. It should be noted that we can normalize the value of FR1. In the right side of the definition above, the number of product terms (( child(n) 1) s(n), and 1 ) is equivalent to the number of functions P has. Hence as for the range of the similarity [0..1] such as percentage, fr 1 (P ) and the number of functions ( func(p ) ) are related as follows. fr 1 (P ) func(p ) When fr 1 (P ) and func(p ) are equal, all similarities are equal to 1. Therefore, we can use the following normalizing operation to FR1 and calculate normalized FR1(fr 1 norm (P )). fr 1 norm = fr 1(P ) func(p ) In the example of Fig. 3, fr 1 (P ) and fr 1 norm (P ) are measured as follows. fr 1 (P ) = = 4.8 fr 1 norm (P )=4.8/8 =0.6 FR2 FR2 of a program P (fr 2 (P )) is defined as follows. fr 2 (P ) def = child(n) 2 s(n) N nodes(p ) FR2 evaluates the impact of child(n) more than FR1. As is mentioned above, the more children a node has in FR tree, the more clones there can be. Hence this metric is effective when many copies of the same code occur in a program in the same way. In the example of Fig. 3, fr 2 (P ) is measured as follows. fr 2 (P ) = = 16 In our study so far, we have not focused on FR2 as much as FR1. Therefore we discuss this method only briefly in this paper. III. EXPERIMENT AND EXAMINATION We applied FR metric FR1 to the source code of a system for telecommunication control. This code is written in C and it has approximately 17K ELOC (Executable Lines of Code). This software consists of 5 modules, which are between 2K and 4K ELOC size, and have about functions (Table. I). We call the modules A to E respectively. ELOC functions ELOC/func A B C D E TABLE I SAMPLE SOURCE CODE Here, we also applied clone percentage (CP) as our comparative approach. As we mentioned in Section I, CP was the first candidate of our metric. We believed this metric can evaluate the size of clones but cannot reflect the clone propagation. We compared our metric and CP, and then verified whether our belief was correct. In addition, another concern of ours was the issue described in Section II-A: Our metric has an inaccuracy in evaluating the size of clones. We verified this inaccuracy here. In this section, after introducing the detail of experiment method, we show the experiment result, and then move to their comparison. A. Experiment Detail A.1 Measurement of Similarity We measured similarity between two functions by using simple diff, and calculated the following equation, given a and b as functions. ( similarity(a, b) =max 1 diff(a) LOC(a), 1 diff(b) ) LOC(b) i.e., we calculated the number of common lines between a and b first, and then measured the ratio of common lines per all lines of each function. Using diff is actually an inefficient method, but the efficiency of similarity measurement is not our concern in this paper. There have been many effective approaches so far, and we can use them alternatively, if needed. We used the maximum of the ratio between two functions because we attempted to measure conservatively. A.2 Measurement of CP We also measured CP by using diff. CP = cloned(p ) LOC(P )

5 5 Where P is the measured program, LOC(P ) is all the ELOCs of P, and cloned(p ) is a cloned LOC of P, i.e. the number of not diff ed codes. In fact, accurate calculation cannot be carried out in a timeefficient manner, so we used some simpler approximation. The detail is not described here because it is out of our concern. A.3 Measurement of FR As an FR metric, we used normalized FR1(fr 1 norm ), which is shown in II-D.2. There are two reasons to select this metric. First, fr 1 norm is normalized so that we can compare the results with each other. Moreover, fr 1 norm represents an approximate percentage of clones; that is, we can compare the metric with CP. B. Experiment Result The experiment result is shown in Fig A B C D E Fig. 5. Experiment Result: CP and FR CP FR_norm Table II shows the statistical comparison between CP and FR: average, standard deviation, and correlation coefficient. CP FR average std dev correlation TABLE II COMPARISON WITH LINE-LEVEL REDUNDANCY We must note that Fig.5 shows the biggest difference on the measurement of module B, where CP was 21% while FR was 59%. With more detailed investigation, we found that there were over 50 similar functions in B (that is, over 55% of the functions were clones!). They looked almost entirely correspondent with each other, but the size of each function was about 10 lines. Therefore this great clone group affected CP much less than FR. This example shows that FR evaluated the clone propagation while CP did not. In this point of view, our purpose is achieved with the FR metric. On the other hand, CP and FR were strongly correlated as a whole; the correlation coefficient was between CP and FR. This seems to indicate that the demerit of FR had a less influence in our result. But we believe further evaluation and discussion is needed for appraising the influence of the demerit. Fig. 5 also shows another feature that FR exceeded CP in every module. This may be because of our FR measurement policy. As we mentioned, we used the maximum between two ratios as similarity. It may be why FR was considered to become relatively larger. We must also note that the algorithm of forming an FR tree make FR exaggerated. For example, let us consider the following situation. Given the three functions a, b and c, their similarity are: similarity(a, b) = 60% similarity(b, c) = 30% similarity(c, a) = 20% According to the algorithm described in Fig. 4, first, a and b make a node whose similarity is 60%. Second, b and c make a node of 30% similarity. Finally c and a is evaluated, but the FR tree already has both of the two functions. Therefore c and a are treated as their similarity being 30% in the tree. In actual situations, similarity consists (n 1)-dimension network during functions, not a tree. However it costs more to treat such network. We believe it is efficient to substitute an FR tree for the network. IV. ADDITIONAL USES OF FR In this section, we introduce additional uses of FR for clone management. We mentioned in Section I that our objective was to construct a maintenance-management method, such as in the following: 1. Cost estimation 2. Problem detection 3. Activity control In the previous sections, we described the first step of the method. Here we will describe the others in order. A. Detection of Influential Clones on FR Firstly we describe the approach along with FR metric to detect the most influential clones on maintenance cost. We achieve this aim with an FR tree. Our policy here is to detect clones which are more similar, and larger in number. Fig. 6 shows such algorithm. We introduce a criterion number u in advance. It is used to estimate how many clones should be detected. First, from the top of an FR tree, NL (See Section II- C) of each child node are compared. Second, the node whose NL is the largest is selected. Then the same operation is applied to the selected node. Finally, the larger NL node is traced from parent to child. It should be noted that the NL becomes smaller as this process is repeated. When NL becomes smaller than u, or the selected node has no internal node, we finally select the functions that the finally selected node has as its leaves, and this process s. We interpret these functions as the most influential clones.

6 6 function influentialcodedetection (tree: Tree) : set of Nodes var N: Node; begin N := top(tree); while (child(n) contains internal node) or ( leaf(n) is not less than $u$) do N := search_the_max_nl(child(n)); return leaf(n); Fig. 6. Influential Clone Detection: Algorithm This approach is based on a simple searching algorithm of n- ary tree. Therefore the complexity is O(log(n)) where n is the number of functions. In the example of Fig. 3, we start from the top 0.1(8), and select 0.4(5), 0.8(3) in order. If u is smaller than 5, the functions 1, 2, 3 as the leaves of the node 0.8(3) are finally chosen. If u is equal or larger than 5, the functions 1, 2, 3, 4, 5 of 0.4(5) are chosen. B. Modification Activity Control Secondly we describe the method to check the existence of clones when some modification request has taken place. This method is also an application of an FR tree. The process is described as the following: 1. An FR tree is given as an input in advance. 2. A function to be modified (f) is selected. 3. The FR tree is traced from the leaf associated with f, in the child-to-parent order. 4. As the tree is traced, the node s similarity becomes smaller, and NL becomes larger. The node is evaluated with these two weights. 5. According to the evaluation, one node N is chosen at the and the functions related to leaf(n) are determined as clones of f. To evaluate the nodes in the process above, we introduce Partial Functional Redundancy(PFR). Before the definition of PFR, we would like to redefine FR. With the definition here, the following equations are formed. N = top(p ) fr(n) = fr(p ) where top(p) is the top node of FR tree constructed from P. Given any number n, PFR of a function f (pfr(f,n)) is defined as follows: pfr(f,n) def = fr(p n (f)) where p n (f) is the nth parent node from the external node associated with f. In the process above, n starts with 1 and increases as the recursion of trace occurs. We can determine whether to finish the process with PFR. When the process is finished, functions of leaf(p n (f)) is determined as potential clones. V. RELATED WORK Mayrand et al.[11][12] aim at evaluating the quality of software, and focus on functional clones like our study. The main feature of their approach is to classify the clones according to their modified pattern: from ExactCopy to DistinctControlFlow. They use their classification for maintenance management and attempt to improve the maintenability. Balazinska et al. [1] follow Mayrands approach and classify clones in 18 patterns, but use the classification to suggest reengineering methods. The main advantage of our study over theirs will be the scope of evaluation: We evaluate the quality of the whole software with one metric, while their evaluation methods are based on the relation between two functions. As for the understandability of these evaluation results, we believe our approach is much easier to be understood. Moreover, our metric can be used to estimate the maintenance cost. This feature promotes the understandability and utility of our metric. In addition, our metric (FR1) can be normalized and therefore we can compare the estimation result with other softwares or projects. In return, our method does not deal with the semantics of clones. Therefore our method can detect which functions to be modified, but cannot suggest how to modify them. This feature is where their works have an advantage over ours. VI. CONCLUSION AND FUTURE WORK We have proposed a metric Functional Redundancy (FR) metric for estimating the maintenance cost caused by code clones, especially by propagation of clones throughout a software. We described the basic concept of FR metric, its measurement process. Then we applied our metric to 17K-ELOC source code written in C. In this context we verified our metric indubitably reflected the clone propagation, while clone percentage (CP, the ratio of cloned codes to the whole source code in LOC) did not. Moreover, we showed the demerit of FR, inaccuracy in evaluating the size of clones, had a little influence on our estimation result, and FR and CP had a strong correlation as a whole: the correlation coefficient with CP was We have also introduced the additional uses of FR, clonemanagement method based on FR: a detection of influential clones over the maintenance cost, and a management method when a modification request occurs. We believe FR is a novel approach to evaluate the quality of the whole software as to code clones. FR is especially good for analysis of the impact of clone propagation, which very few people have explored before. On the other hand, our metric might have the weakness in evaluating the size of clones. It is debatable which is more important issue for the maintenance cost estimation, and we have only limited information on the issue. We will inspect the relation between the size and the propagation of clones, and the impact each of them has on the maintenance cost in our future work. In addition, we will study some size-oriented approach, and finally merge FR and the size-oriented one. Such study will bring us more accurate and effective cost-estimation method.

7 7 REFERENCES [1] Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lagüe, and Kostas Kontogiannis. Measuring clone based reengineering opportunities. In Metrics 99, [2] Victor Basili, Lionel Briand, Steven Condon, Yong-Mi Kim, Walceĺio L. Melo, and Jon D. Valett. Understanding and predicting the process of software maintenance release. In Proceedings of the 1996 International Conference on Software Engineering (ICSE 96). ACM Press, [3] Ira D. Baxter, Andrew Yahin, Leonardo Moura, and Marcelo Sant Anna Lorraine Bier. Clone detection using abstract syntax trees. In Proceedings of the 1997 International Conference on Software Maintenance (ICSM 97). IEEE Computer Society Press, [4] Jan Bosch. Design & Use of Software Architectures. Addison Wesley, [5] William H. Brown, Raphael C. Malveau, Hays W. Skip McCormickIII, and Thomas J. Mowbray. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. Wiley Computer Publishing, [6] J.E. Henry and J.P.Cain. A quantitative comparison of perfective and corrective software maintenance. Journal of Software Maintenance, Oct [7] J Howard Johnson. Identifying redundancy in source code using fingerprints. In CASCON 93, [8] Toshihiro Kamiya, Fumiaki Ohata, Kazuhiro Kondou, Shinji Kusumoto, and Katuro Inoue. Maintenance support tools for java programs: Ccfinder and jaat. In Proceedings of the 23rd International Conference on Software Engineering (ICSE 2001). ACM Press, [9] Jens Krinke. Identifying similar code with program depence graphs. In Proceedings of The Eighth Working Conference On Reverse Engineering 2001, [10] Bruno Laguë, Daniel Proulx, Ettore M. Merlo, Jean Mayrand, and John Hudepohl. Assessing benefits of incorporating function clone detection in a development process. In Proceedings of the 1997 International Conference on Software Maintenance (ICSM 97). IEEE Computer Society Press, [11] Jean Mayrand, Bruno Laguë, and John Hudepohl. Evaluating the benefits of clone detection in the software maintenance activities in large scale systems. In WESS 96, [12] Jean Mayrand, Claude Leblanc, and Ettore M. Merlo. Experiment on the automatic detection of function clones in a software system using metrics. In Proceedings of the International Conference on Software Management IEEE Computer Society Press, [13] Stephen R. Schach. Software Engineering. Asken Associates, 1993.

On Refactoring for Open Source Java Program

On Refactoring for Open Source Java Program Yoshiki Higo 1,Toshihiro Kamiya 2, Shinji Kusumoto 1, Katsuro Inoue 1 and Yoshio Kataoka 3 1 Graduate School of Information Science and Technology, Osaka University