Regression Test Case Prioritization using Genetic Algorithm

9International Journal of Current Trends in Engineering & Research (IJCTER) e-issn 2455 1392 Volume 2 Issue 8, August 2016 pp. 9 16 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com Regression Test Case Prioritization using Genetic Algorithm Ms. N. Sangeetha 1, Mr. K. Radhakrishnan 2 1,2 Department of Computer Science, Dr. Ambedkar Government Arts College, Vyasarpadi, Chennai-39. Abstract Regression Testing is conducted for the purpose of evaluating whether or not a change to the system has introduced a new failure. Retest all is both expensive and impractical due to fact that software which is under test is not changed 100%, therefore it is not required to test whole software. These limits force to think about prioritization in the test cases. In this paper, to prioritize the test case a Genetic algorithm is used. Prioritization based on complete code coverage. Effectiveness is calculated with the help of Average Percentage of Code Covered (APCC). Keywords Regression testing, Genetic Algorithm, APCC, Code Coverage, Test Case, Prioritization I. INTRODUCTION Regression testing is expensive but an essential activity in software maintenance. Regression testing attempts to validate modified software and ensure that the modified parts of the program do not introduce unexpected errors. The time used for regression testing can be assumed approximately half of the software maintenance activities [10]. Due to time and cost limitations of software development, retesting software using the entire suites is obviously expensive and inefficient. Thus, prioritization of the test cases becomes essential. The priority criteria can be set accordingly e.g. to increase rate of fault detection, to achieve maximum code coverage, and so on [2]. Several methods have been developed to reduce the cost of regression testing [1]. These include retest all, test selection, test case minimization and prioritization techniques. Retest all technique in which all the tests in the existing test suites are performed again. Test case Selection techniques select a subset of test cases. Test case minimization techniques [3] [4] [5], reduce a test suite that still maintains the coverage. Test case prioritization techniques organize the test cases in a test suite by ordering such that the most beneficial are executed first thus allowing for an increase in the effectiveness of testing. It is better than other techniques as it doesn t discard or permanently remove the test cases from test suite. Effectiveness will be measured by the rate of faults detected [12]. Several prioritization techniques are used some of them are Greedy algorithms, nonevolutionary algorithms and evolutionary algorithms. Evolutionary algorithms (EA) have been chosen as they are global optimization methods and scale well to higher dimensional problems. A genetic algorithm is one of the heuristics widely used to solve problems in various domains including test case prioritization. The main benefit of a Genetic Algorithm technique is that it can be easily varied and customized to suit the problem. Additionally, genetic operators also contribute test case generation. This paper focuses on test case prioritization technique for total code coverage. Section I gives Introduction of the Regression testing and its techniques. Section II has background and related work. Section III about Genetic algorithm. Section IV explains research work. Section V shows the performance of proposed technique and the last section VI concludes the paper and followed by the references @IJCTER-2016, All rights Reserved 9

II. BACKGROUND AND RELATED WORK In this section, a brief overview of the previously proposed methods to prioritize the regression test cases is discussed. Kaur and Goyal [6], proposed a genetic algorithm for test case prioritization techniques using code coverage. The prioritized test cases give high value of APCC but not as high as the optimal order. The final test suite is responsible for achieving almost 90% code coverage. Interestingly, the control approaches which are Random and Reverse order gain higher APCC values than the GA order. A new test case reduction technique called TestFilter was proposed by Khan and Nadeem [3]. Weight criteria are used in the reduction algorithm. Similar to other reduction techniques, TestFilter eliminates unnecessary test cases, and, hence, reduces related costs such as test case storage and execution costs. The effectiveness of test cases generated was not yet studied. Nirpal and Kale [4], proposed a Genetic Algorithm to test case generation for path testing. The Genetic Algorithm is praised for its simplicity. The modification of their algorithm is required to get rid of unwanted paths and correctly locate the desired path. This indicates that some kind of preprocessing and/or post processing for Genetic Algorithms is needed to enhance the effectiveness of the technique employing Genetic Algorithms. Akarunisa and colleagues [7], proposed a metric for assessing the effectiveness of prioritization techniques in terms of fault detection rate and coverage. The new metric APFD assesses the rate of fault detection by considering varying test cases and fault costs. Moreover, other new metrics are also presented including Average Percentage of Statement Coverage (APSC), Average Percentage of Branch Coverage (APBC), Average Percentage of Loop Coverage (APLC) and Average Percentage of Condition Coverage (APCC) based on the coverage criterion of various prioritization techniques. Athar and Ahmad [8], presented a test suite optimization technique using a Genetic Algorithm. The test suite formed gives 100% code coverage. As the chromosome size increases, the number of iterations required to get the optimized test suite is also increased. This kind of overhead could be reduced via preprocessing. Srivastava and Kim [9], presented a method to identify the most critical path clusters in a program using Genetic Algorithm with the main aim to increase software testing efficiency. The method is compared with a random test generation method. By identifying the most critical path, the rate of fault detection could be increased. However, fault detection cannot be guaranteed. The background research illustrates that Genetic Algorithms [12] [13], are widely used for optimization. The simplicity of Genetic Algorithm makes it a suitable candidate to implement prioritization techniques. However, the Genetic Algorithm should be modified. III. GENETIC ALGORITHM Genetic algorithm is a population-based search method. Genetic algorithms are acknowledged as good solvers for tough problems. Genetic algorithms require several parameters including the following [2]: 1. Maximum number of generations, 2. Population size, 3. Crossover rate, 4. Mutation rate, 5. Convergence criterion. There are two main concepts in genetic algorithm viz: crossover and mutation. A. Crossover A binary variation operator is called recombination or crossover. This operator merges information from two parent genotypes into one or two offspring genotypes. Similarly to mutation, @IJCTER-2016, All rights Reserved 10

crossover is a stochastic operator: the choice of what parts of each parent are combined, and the way these parts are combined, depends on random drawings. The principle behind crossover is simple: by mating two individuals with different but desirable features, we can produce an offspring which combines both of those features. Child1 = c*parent1 + (1-c)*parent2 Child2 = (1-c)*parent1 + c*parent2 B. Mutation A unary variation operator is called mutation. It is applied to one genotype and delivers a modified mutant, the child or offspring of it. In general, mutation is supposed to cause a random unbiased change. Mutation has a theoretical role: it can guarantee that the space is connected. A simple piece of code: child = generatenewchild(); The optimization problems are solved by GA s recombination and replacement operators, where recombination is key operator and frequently used, whereas, replacement is optional and applied for solving optimization of problem [1]. IV. EXPERIMENTAL RESEARCH To illustrate the prioritization technique using genetic algorithm, the triangle problem is choosed because in software testing it is a well known example. First step in this experiment is graph generation. The decision graph from a control graph is generated for the selected problem. Independent Path 1. ABCEJKLM 2. ABCFGKLM 3. ABDHJKLM 4. ABDIJKLM 5. ABCFJKLM 6. ABDIM 7. ABCELM Figure 1: Decision Graph @IJCTER-2016, All rights Reserved 11

The second step is test case generation. The population size is set to 20 and the test cases are generated. Below table shows number of test cases, independent path, condition covered, expected output and number of node it covered Table I: Test Case Generation Test Path Condition Node Case Covered Covered T1 ABDIM 2 1,2 T2 ABCELM 2 1,2 T3 ABCFGKLM 2 1,2 T4 ABCEFGLM 4 1,2,3,4 T5 ABCDLM 2 1,2 T6 ABCEFJLM 4 1,2,3,4 T7 ABCEHIKLM 5 1,2,3,5,6 T8 ABCEHIKLM 5 1,2,3,5,6 T9 ABCEHIJLM 5 1,2,3,5,6 T10 ABCDLM 2 1,2 T11 ABCDLM 2 1,2 T12 ABCDLM 2 1,2 T13 ABCEHIKLM 5 1,2,3,5,6 T14 ABCEHIKLM 5 1,2,3,5,6 T15 ABDLM 1 1 T16 ABDLM 1 1 T17 ABCDLM 2 1,2 T18 ABCEHIKLM 5 1,2,3,5,6 T19 ABCEHJLM 4 1,2,3,5 T20 ABCEHIKLM 5 1,2,3,5,6 The third step deals with test suite generation. Here the algorithm checks for test case duplication. If any duplication found it randomly pick another test case number to replace. The replaced test case must also satisfy the number of condition covered by the test suit Table II: Test Suite Generation Test Suites Observation for first of Iteration TS1 T7 T6 T4 T9 T20 T15 T1 T3 T10 T19 T12 TS2 T19 T9 T6 T8 T4 T2 T18 T16 T5 T11 TS3 T13 T4 T9 T15 T3 T6 T14 T19 T10 T17 TS4 T5 T9 T4 T6 T7 T19 T16 T11 T8 T12 T17 TS5 T12 T13 T9 T19 T4 T15 T10 T6 T20 T1 TS6 T6 T12 T7 T20 T16 T1 T4 T9 TS7 T18 T11 T9 T19 T6 T15 T17 T14 T4 T1 TS8 T8 T9 T4 T15 T6 T12 T19 T20 T17 T11 T2 TS9 T17 T6 T9 T14 T16 T19 T4 T7 T2 T3 TS10 T3 T13 T6 T9 T4 T15 T1 T19 T18 T5 T10 TS11 T9 T4 T16 T8 T2 T6 T19 T18 T5 T10 TS12 T13 T6 T15 T14 T10 T4 T19 T2 TS13 T7 T9 T6 T18 T4 T3 T19 T16 T10 T11 TS14 T20 T19 T9 T4 T6 T15 T5 T13 T12 T11 T3 TS15 T9 T4 T19 T16 T7 T10 T6 T12 T13 T8 TS16 T16 T19 T4 T9 T11 T6 T14 T1 T8 T17 T2 TS17 T12 T15 T13 T4 T6 T8 T9 T1 T19 T17 TS18 T2 T20 T6 T19 T16 T9 T17 T4 T14 T3 T1 TS19 T8 T6 T1 T15 T19 T7 T3 T9 T2 T4 TS20 T6 T20 T5 T19 T2 T16 T4 T8 T9 T10 @IJCTER-2016, All rights Reserved 12

Calculating fitness value is fourth step. Fitness value is calculated based on summation of total code coverage of each test case in the test suite. The total coverage is then used to rank the test suites. Ranking denotes the suite with greater coverage is at top. They also used to select the parents for genetic operator. Test suite TS6 and TS12 were selected as parents which has greater coverage with minimum test cases. Final step is applying genetic operator. Test cases with high fitness value are selected to yield optimized result. (i) Cross Over Cross over is performed on both parents to produce a new offspring which covers the entire independent path. By recombining portions of good individuals, this process is likely to create even better individuals Figure 2: Crossover is performed on two test suites Figure 3: New offspring resulted from crossover T6 T12 T7 T14 T10 T4 T19 T2 Figure 4: Selected offspring (ii) Next, mutation is performed on a bit-by-bit basis to alter value of bits in a chromosome. The result is a new gene (test suite). Every bit in a chromosome of an offspring may change or mutate depending on mutation probability (pm) For example, T12 and T19 satisfy the condition. So they are swapped. Figure 5: The result of mutation T6 T12 T10 T14 T4 T2 Figure 6: Final off spring @IJCTER-2016, All rights Reserved 13

V. EVALUATION To analyze code coverage based testing effectively the Average Percentage of Condition Coverage (APCC) [2] approach has been used where average percentage of test suite to be executed with respect to average condition s covered. APCC= Where T= test suite been executed n= number of test cases, m= number of conditions to be covered, TC i = First test case covering i th condition. Table III: APCC values for prioritization Prioritization APCC Approach Value (%) No 90.3 Reverse 92.4 Random 98.5 Optimum 97.5 GA 100 The above table shows APCC values of prioritization approaches. The prioritized test suite obtained from the Genetic Algorithm demonstrates complete code coverage (100%). In this paper five prioritization techniques are studied they are: 1) No prioritization, 2) Reverse prioritization, 3) Random prioritization, 4) Optimal prioritization, 5) Genetic algorithm prioritization. Table IV: Test cases for various prioritization techniques No Reverse Random Optimum GA T1 T20 T6 T9 T6 T2 T19 T10 T7 T8 T3 T18 T8 T4 T10 T4 T17 T5 T2 T2 T5 T16 T4 T18 T20 T6 T15 T2 T1 T15 T7 T14 T1 T13 T13 T8 T13 T9 T3 T9 T9 T12 T7 T5 T7 T10 T11 T3 T8 T1 T11 T10 T16 T6 T3 T12 T9 T13 T10 T12 T13 T8 T19 T19 T16 T14 T7 T20 T11 T14 T15 T6 T17 T15 T11 T16 T5 T15 T12 T17 T17 T4 T12 T16 T19 T18 T3 T18 T14 T5 T19 T2 T11 T17 T18 T20 T1 T14 T20 T4 @IJCTER-2016, All rights Reserved 14

Figure 7: APCC value for prioritization approaches In APCC value calculation, other prioritization technique require more number of test cases to achieve the code coverage but genetic algorithm can get the APCC value 100% from 15 test cases. Genetic algorithm can effectively prioritize the test case from test suite. VI. CONCLUSION AND FUTURE WORK In this paper, prioritization using Genetic algorithm for regression testing is proposed. Experiments have been conducted to measure the effectiveness of proposed technique. Five approaches to prioritize test case are studied and their performances are calculated. An APCC metric are used to determine the effectiveness of prioritization technique. Further work is additional prioritization approaches should be included. Implementing complex program and compare the effectiveness specified in this paper. REFERENCES [1] K. R. Walcott, Prioritizing regression test suites for time-constrained execution using a genetic algorithm [online] available at www.cs.virginia.edu/~krw7c/gaprioritizations.pdf [2] Arivnder kaur, shubhra Goyal, A Genetic algorithm for regression Test Case prioritization using code coverage, International Journal of Computer Science and Engineering, May 2011 [3] S.R. Khan, A. Nadeem, TestFilter: A Statement-Coverage Based Test Case Reduction Technique. IEEE Conference on multitopic, (INMC 06), pp.275-280, December 2009. [4] P.B. Nirpal and K.V. Kale, Using Genetic Algorithm for Automated Efficient Software Test Case Generation for Path Testing, International Journal of Advanced Networking and Applications, Volume: 02, pp. 911-915, 2011 [5] G. Fraser and A. Arcuri, Whole test suite generation, Software Engineering, IEEE Transactions on 39.2, pp. 276-291, 2013 [6] A. Kaur, S. Goyal. A genetic algorithm for regression test case prioritization using code coverage, International journal on computer science and engineering 3.5 (2011) pp. 1839-1847. [7] Metrics for Measuring the Effectiveness of Test Case Prioritization Technique, INFOCOMP Journal of Computer Science, pp. 1-10, 2009 [8] M. Athar, I. Ahmad, Maximize the Code Coverage for Test Suit by Genetic Algorithm, International Journal of Computer Science and Information Technologies, Vol.5, pp. 431-435, 2014 [9] P.R. Srivastava and T-h. Kim, Application of Genetic Algorithm in Software Testing, Internationl Journal of Software Engineering and Its Applications, Vol.3, October 2009 [10] Suman, Seema, A Genetic Algorithm for Regression Test Sequence Optimization,nternational Journal of Advanced Research in Computer and Communication Engineering, Vol. 1, September 2012 [11] L. Shanmugapriya, A. Askarunisa, and N. Ramaraj, Cost and Coverage Metrics for Measuring the Effectiveness of Test Case Prioritization Techniques, INFOCOMP Journal of Computer Science, pp. 1-10, 2009. [12] M. Srinivas and L. M. Patnaik, Genetic algorithms: A survey, IEEE Computer, 27(6), pp.17 26, June 1994. [13] J. L. R. Filho, P. C. Treleaven, and C. Alippi, Genetic-algorithm programming environments, IEEE Computer, 27(6), pp.28 43, June 1994. @IJCTER-2016, All rights Reserved 15

[14] Mohammad Rava, Wan M.N. Wan-Kadir, A Review on Prioritization Techniques in Regression Testing, International Journal of Software Engineering and Its Applications Vol. 10, No. 1 (2016). @IJCTER-2016, All rights Reserved 16