Automatic Generation of Test Case based on GATS Algorithm *

Xiajiong Shen and Qian Wang
Institute of Data and Knowledge Engineering
Henan University
Kaifeng, Henan Province 475001, China
shenxj@henu.edu.cn

Peipei Wang and Bo Zhou
Computer and Information Engineering Department
Henan University
Kaifeng, Henan Province 475004, China
Wangqiansusan1984@163.com

Abstract

A method for automatically generating software test data based on a genetic algorithm and a tabu search algorithm is proposed. The resulting tabu genetic algorithm combines the local search capability of tabu search with the global search capability of the genetic algorithm. The experimental results show that the tabu genetic algorithm, which uses tabu search as its mutation operator, is effective at generating test cases and that its optimization performance is superior to that of the simple genetic algorithm.

1. Introduction

In software testing, one is often interested in judging how well a series of test inputs tests a piece of code; good testing means uncovering as many faults as possible with a potent set of tests. Thus, a test series that has the potential to uncover many faults is better than one that can uncover only a few. However, it is almost impossible to say quantitatively how many faults are potentially uncovered by a given test set. Since constructing such test sets by hand is difficult, there is an obvious need for automatic test data generation.

Automatic test data generation is a basic problem of software testing. It can be stated as follows: given a program $P$ and a path $W$ in $P$, and letting $D$ be the input domain of $P$, find an input $x \in D$ such that executing $P$ on $x$ follows the path $W$. Several approaches have been proposed to meet practical needs: symbolic execution, the iterative relaxation method, and genetic algorithms. With the development of artificial intelligence, the advantages of genetic algorithms have become apparent.
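To make the path-oriented formulation above concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation: the triangle classifier `classify_triangle`, the branch labels, and the helper names `follows_path` and `find_input` are all assumptions introduced here, and the exhaustive search stands in for any real generation method.

```python
from itertools import product

def classify_triangle(a, b, c, trace):
    """Hypothetical program P under test; appends each branch taken to `trace`."""
    if a + b <= c or a + c <= b or b + c <= a:
        trace.append("not_triangle")
        return "not a triangle"
    trace.append("triangle")
    if a == b and b == c:
        trace.append("equilateral")
        return "equilateral"
    if a == b or b == c or a == c:
        trace.append("isosceles")
        return "isosceles"
    trace.append("scalene")
    return "scalene"

def follows_path(x, target_path):
    """True if running P on input x takes exactly the branch sequence W."""
    trace = []
    classify_triangle(*x, trace)
    return trace == target_path

def find_input(target_path, domain):
    """Brute-force search of D x D x D; only feasible for very small domains."""
    for x in product(domain, repeat=3):
        if follows_path(x, target_path):
            return x
    return None

# W: the path through the "isosceles" branch (an assumed labelling of P's paths).
W = ["triangle", "isosceles"]
x = find_input(W, range(1, 8))
print(x, follows_path(x, W))
```

Exhaustive enumeration grows with the cube of the domain size here, which is why heuristic search methods are of interest for larger input domains.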
Reference [1] proposed applying a genetic algorithm to generate minimal test case sets, and [3] showed how to transform the testing goal into a coverage criterion. However, the genetic algorithm has several problems, such as blind search and slow convergence. It can find an approximate solution close to the globally optimal solution, but it cannot guarantee convergence to the globally optimal solution. An effective remedy is to combine another optimization algorithm with the genetic algorithm. In this paper, we present a hybrid method based on the genetic algorithm and the tabu search algorithm for generating test cases automatically. The experimental results show that the hybrid algorithm is effective at generating test cases.

Genetic algorithms (GAs) are adaptive search algorithms based on genetics and the biological mechanisms of natural selection; they generate populations of individuals (i.e., solutions) that become fitter and fitter. This population evolution approach is composed of three different operators that use probabilistic rules. A general description of the mechanisms underlying GAs is given in Section 2. Tabu search (TS) algorithms, on the other hand, draw on notions from artificial intelligence. TS imitates human behavior by applying learning rules to direct the search properly and to avoid undesirable loops in the random walk through the solution space. The basic ideas of TS algorithms are sketched in Section 3.

* This work is supported by the National 863 Project (2007AA04Z148).

2. Genetic algorithms

Genetic algorithms (GAs) [2] represent a class of adaptive search techniques and procedures based on the processes of natural genetics and Darwin's principle of the survival of the fittest. There is a randomized exchange of structured information among a population of artificial chromosomes. GAs are a computer model of biological evolution. When GAs
are used to solve optimization problems, good results are obtained surprisingly quickly. In the context of software testing, the basic idea is to search the input domain for values of the input variables that satisfy the goal of testing. GAs have been used as an adaptive search method in many different applications, e.g. combinatorial optimization and the prisoner's dilemma problem. Each individual in each generation is evaluated with a fitness function. The following fitness function evaluates the unacceptability of a feasible test case $S$:

$$f(S) = \sum_{i \in C_b} w_i f_i(S)$$

$f(S)$ is defined through a set of weights $w_i$ giving relative importance to each relaxed constraint. $C_b$ is a category in the set $C$ of all the constraints. Constraints in $C_b$ are termed relaxed.

A GA is generally composed of three operators [13]:

(a) Reproduction: This operation assigns a reproduction probability to each individual based on the output of the fitness function. An individual with a higher ranking is given a greater probability of reproduction. As a result, fitter individuals have a better chance of surviving from one generation to the next.

(b) Crossover: This operation is used to produce the descendants that make up the next generation. It involves the following crossbreeding procedure:
(i) Randomly select two individuals from the parent generation as a couple.
(ii) Randomly select a gene position, common to this couple, as the crossover point. Thus, each gene is divided into two parts.
(iii) Exchange the first parts of both genes corresponding to the couple.
(iv) Add the two resulting individuals to the next generation.

(c) Mutation: This operation picks a gene at random and changes its state according to the mutation probability. The purpose of the mutation operation is to maintain diversity within a generation and so prevent premature convergence to a locally optimal solution.
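The weighted fitness and the three operators above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code: individuals are assumed to be bit strings, reproduction is assumed to be roulette-wheel selection over a non-negative, higher-is-better score, and all function names are hypothetical. Note that $f(S)$ measures unacceptability (lower is better), so a caller would have to convert it into a higher-is-better score before passing it to `reproduce`.

```python
import random

def unacceptability(test_case, relaxed):
    """f(S) = sum of w_i * f_i(S) over relaxed constraints.
    `relaxed` is an assumed list of (w_i, f_i) pairs."""
    return sum(w * f(test_case) for w, f in relaxed)

def reproduce(population, score, rng):
    """(a) Reproduction: fitness-proportionate (roulette-wheel) selection.
    Assumes `score` is non-negative and higher is better."""
    # The small offset keeps the weights valid even if every score is zero.
    weights = [score(ind) + 1e-9 for ind in population]
    return rng.choices(population, weights=weights, k=len(population))

def crossover(parent_a, parent_b, rng):
    """(b) Crossover: choose one cut point and exchange the first parts."""
    point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_b[:point] + parent_a[point:]
    child_b = parent_a[:point] + parent_b[point:]
    return child_a, child_b

def mutate(individual, p_m, rng):
    """(c) Mutation: with probability p_m, flip one randomly chosen gene."""
    if rng.random() >= p_m:
        return individual
    i = rng.randrange(len(individual))
    flipped = "1" if individual[i] == "0" else "0"
    return individual[:i] + flipped + individual[i + 1:]

# Usage: one reproduction, crossover, and mutation on a toy population.
rng = random.Random(0)
population = ["".join(rng.choice("01") for _ in range(8)) for _ in range(6)]
pool = reproduce(population, lambda ind: ind.count("1"), rng)
a, b = crossover(pool[0], pool[1], rng)
m = mutate(a, 1.0, rng)
```

Representing individuals as immutable strings keeps each operator free of side effects, at the cost of some copying.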
The mutation probability is chosen intuitively, since there is no definite way to determine it. Figure 1 shows an individual before and after mutation. The genetic search process is iterative: strings in the population are evaluated, selected, and recombined until some termination condition is reached.

Figure 1. Before and after mutation

The pseudo code is displayed in Figure 2, where t is the generation number and P the population.

    t = 1
    initialize P(t)
    while not finished
        evaluate P(t)
        select P(t+1) from P(t)
        recombine P(t+1) using crossover and mutation
        survive
        t = t + 1
    end

Figure 2. Pseudo code of GA

A GA can perform a global search quickly and stochastically. However, it suffers from problems such as blind search and slow convergence.

3. Tabu search algorithms

The tabu search (TS) algorithm is a memory-based search strategy that guides a local search descent method to continue its search beyond local optimality [8]. When a local optimum is encountered, a move to the best neighbor is made to explore the solution space, even though this may cause a deterioration in the objective function value. TS seeks the best available move that can be determined in a reasonable amount of time. If the neighborhood is large or its elements are expensive to evaluate, candidate list strategies are used to restrict the number of solutions examined on a given iteration.

Like GAs, TS is a general heuristic devised for solving large combinatorial optimization problems. However, the principles underlying the two algorithms are fundamentally different. GAs deal with a population of solutions evolving naturally, whereas TS consists of an iterative search procedure on individual solutions.
TS is an adaptive procedure with the ability to make use of many other methods, such as linear programming algorithms and specialized heuristics, which it directs to overcome the limitations of local optimality. To describe the workings of tabu search, we represent a combinatorial optimization problem in the following form [8].

$$(P) \quad \text{Minimize } c(x) : x \in X \subseteq \mathbb{R}^n$$
The objective function $c(x)$ may be linear or nonlinear, and the condition $x \in X$ is assumed to constrain specified components of $x$ to discrete values. In some settings $(P)$ may represent a modified form of some original problem, for example where $X$ is a superset of the vectors that normally qualify as feasible and $c(x)$ is a penalty function designed to ensure that optimal solutions to $(P)$ are likewise optimal for the problem from which it is derived.

A wide range of procedures, heuristic and optimal, for solving problems that can be written in the form $(P)$ can be characterized conveniently by reference to sequences of moves that lead from one trial solution (a selected $x \in X$) to another. We define a move $s$ to consist of a mapping defined on a subset $X(s)$ of $X$:

$$s : X(s) \to X.$$

Associated with $x \in X$ is the set $S(x)$, which consists of those moves $s \in S$ that can be applied to $x$; i.e., $S(x) = \{ s \in S : x \in X(s) \}$ (and we may thus also write $X(s) = \{ x \in X : s \in S(x) \}$). The set $S(x)$ can be viewed as a neighborhood function.

The procedure may be described as follows [8].

(1) Select an initial $x \in X$ and let $x^* := x$. Set the iteration counter $k = 0$ and begin with $T$ empty.
(2) If $S(x) - T$ is empty, go to Step 4. Otherwise, set $k := k + 1$ and select $s_k \in S(x) - T$ such that $s_k(x) = \mathrm{OPTIMUM}(s(x) : s \in S(x) - T)$.
(3) Let $x := s_k(x)$. If $c(x) < c(x^*)$, where $x^*$ denotes the best solution currently found, let $x^* := x$.
(4) If a chosen number of iterations has elapsed, either in total or since $x^*$ was last improved, or if $S(x) - T = \emptyset$ upon reaching this step directly from Step 2, stop. Otherwise, update $T$ (as subsequently identified) and return to Step 2.

To provide a basis for understanding the extensions of ideas to be developed here, we briefly comment on the character of the foregoing process, which rests on the way the tabu set $T$ is defined and treated.
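Steps (1) to (4) above can be sketched in Python as shown below. This is a simplified illustration, not Glover's formulation in full: it assumes the tabu set $T$ is a fixed-length list of recently left solutions (rather than move attributes), it stops only on a total iteration limit or an empty $S(x) - T$, and the names `tabu_search`, `neighbors`, `tenure`, and the toy cost function are introduced here for illustration.

```python
from collections import deque

def tabu_search(x0, cost, neighbors, max_iters=100, tenure=5):
    """Minimise `cost` starting from x0.
    `neighbors(x)` is assumed to return the solutions s(x) for all moves s in S(x)."""
    x = best = x0                    # Step 1: x* := x
    tabu = deque(maxlen=tenure)      # Step 1: T starts empty; old entries expire
    for _ in range(max_iters):       # Step 4: stop after a chosen number of iterations
        candidates = [y for y in neighbors(x) if y not in tabu]   # S(x) - T
        if not candidates:           # Steps 2 and 4: nothing admissible, stop
            break
        nxt = min(candidates, key=cost)   # Step 2: best admissible move, even if worse
        tabu.append(x)               # Step 4: update T with the solution just left
        x = nxt                      # Step 3: x := s_k(x)
        if cost(x) < cost(best):     # Step 3: record an improved x*
            best = x
    return best

# Toy problem (an assumption): integers 0..63 with a local minimum at 10
# (cost 5) and the global minimum at 30 (cost 0).
cost = lambda x: min(abs(x - 10) + 5, abs(x - 30))
neighbors = lambda x: [y for y in (x - 1, x + 1) if 0 <= y <= 63]
best = tabu_search(8, cost, neighbors)
print(best, cost(best))
```

In this toy problem a plain descent from 8 would stop at 10, because both of its neighbors cost more; the tabu list forbids stepping back, so the search accepts worsening moves and continues past the local minimum.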
A key concept in the management of $T$ is to constrain the search in a manner that allows latitude in selecting the best (highest evaluation) moves while ensuring that the method will not revisit a previous solution except by following a trajectory not traveled before. This is accomplished by introducing tabu restrictions (or penalties) that discourage the reversal, and in some cases the repetition, of selected moves. In the simplest implementations, an attribute or set of attributes is identified which, if prevented from occurring in a future move, will ensure that the present move cannot be reversed. These attributes are classified as forbidden (tabu). They are recorded on a tabu list, where they reside for a specified number of iterations and are then removed, freeing them from their tabu status.

TS is also an iterative search algorithm, and it can perform a local search quickly. Its defects are that it depends strongly on the initial solution and that its search process is serial, moving from one solution to the next. The tabu search operator uses the standard tabu search algorithm. Because TS has a memory function and may accept a worse solution during the search, it has a strong hill-climbing ability. A GA that uses tabu search as its mutation operator can therefore avoid the tendency of the genetic algorithm to fall into a local minimum.

4. An improved hybrid method

Since both algorithms have defects, we combine GAs with TS for generating test cases, using TS as the mutation operator, so that each individual undergoes an independent optimization before the population breeds. The hybrid method embeds the tabu search algorithm in the genetic algorithm (as shown in Figure 3). The hill-climbing ability of tabu search effectively avoids the premature convergence of genetic algorithms, and the genetic algorithm supplies good initial solutions for tabu search, which improves the quality of the solutions.

Figure 3. GAs combined with TS

Steps of the hybrid algorithm to generate test cases:

Step 1 (Initialization): Set the number of generations ($N_{gen}$), the population size ($N_{pop}$), the crossover probability ($P_c$), and the mutation probability ($P_m$);
Step 2 (Initial solution): Set gen = 0 and generate the initial population;
Step 3 (Evaluation of the individuals): Calculate the fitness of each chromosome in the current population;
Step 4 (Reproduction): Use roulette-wheel selection to select $N_{pop}$ chromosomes into the mating pool;
Step 5 (Crossover): Apply crossover with probability $P_c$, using the PMX crossover method;
Step 6 (Mutation): Apply mutation with probability $P_m$, using tabu search as the mutation operator;
Step 7 (Final solution): Set gen = gen + 1. If gen < $N_{gen}$, go to Step 3; otherwise, output the optimal solution and terminate the algorithm.

The largest difference between the hybrid algorithm and the genetic algorithm is that a tabu mutation operator is used for the mutation operation.

5. Experiment

To demonstrate the effectiveness of the approach, we designed the following simulation experiment. We analyze the performance of the algorithm using an isosceles triangle problem as the example. We also compare the GATS algorithm with the GA, analyze how the performance of the two algorithms changes in the simulation results, and verify the stability of the algorithm. The GATS model is implemented with a Java package for the hybrid algorithm (GATS), and the programs under test are also written in Java.

The initial parameter settings of the GATS algorithm are as follows. The three input parameters are encoded by cascading their binary codes. Each parameter takes six bits, so the total code length is 18 bits. Each input parameter ranges from 0 to 63, and the coding precision is 1. The crossover probability is 0.8 and the mutation probability is 0.09. The initial population size is 50 and the maximum number of iterations is 100.

The proposed hybrid algorithm, which combines the local search and hill-climbing ability of tabu search with the parallel and global search ability of the GA, solves the problem of generating test cases well. Reference [12] proves that tabu search converges; the hybrid algorithm proposed in this paper therefore also converges, so the entire algorithm can converge to the globally optimal solution.
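The seven steps and the parameter settings described in this section can be sketched in Python as follows. This is a rough illustration under several assumptions, not the authors' Java implementation: the fitness function for the isosceles target is hypothetical, single-point crossover is swapped in for the PMX crossover named in Step 5 to keep the sketch short, the tabu mutation forbids recently flipped bit positions, the population size is assumed to be even, and all names (`decode`, `fitness`, `tabu_mutate`, `gats`) are introduced here.

```python
import random
from collections import deque

BITS = 6  # six bits per input parameter, 18 bits in total, values 0..63

def decode(bits):
    """Split the cascaded 18-bit code into three integer side lengths."""
    return tuple(int(bits[i:i + BITS], 2) for i in range(0, len(bits), BITS))

def fitness(bits):
    """Hypothetical higher-is-better score: 1.0 when two sides are equal."""
    a, b, c = sorted(decode(bits))
    if a + b <= c:                        # not a triangle: small positive score
        return 0.1 / (1 + c - a - b)
    return 1.0 / (1 + min(b - a, c - b))  # gap 0 means at least two equal sides

def flip(bits, i):
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

def tabu_mutate(bits, score, steps=5, tenure=3):
    """Tabu search as mutation: take the best non-tabu single-bit flip each step."""
    tabu = deque(maxlen=tenure)           # recently flipped positions are forbidden
    current = best = bits
    for _ in range(steps):
        moves = [i for i in range(len(current)) if i not in tabu]
        i = max(moves, key=lambda j: score(flip(current, j)))
        current = flip(current, i)        # accepted even if it is worse
        tabu.append(i)
        if score(current) > score(best):
            best = current
    return best

def gats(n_gen=100, n_pop=50, p_c=0.8, p_m=0.09, seed=0):
    rng = random.Random(seed)                                       # Step 1
    pop = ["".join(rng.choice("01") for _ in range(3 * BITS))
           for _ in range(n_pop)]                                   # Step 2
    best = max(pop, key=fitness)
    for _ in range(n_gen):                                          # Step 7
        weights = [fitness(ind) for ind in pop]                     # Step 3
        pool = rng.choices(pop, weights=weights, k=n_pop)           # Step 4
        nxt = []
        for a, b in zip(pool[::2], pool[1::2]):                     # Step 5
            if rng.random() < p_c:
                cut = rng.randint(1, len(a) - 1)
                a, b = b[:cut] + a[cut:], a[:cut] + b[cut:]
            nxt += [a, b]
        pop = [tabu_mutate(ind, fitness) if rng.random() < p_m else ind
               for ind in nxt]                                      # Step 6
        best = max(pop + [best], key=fitness)
    return best

best = gats()
print(decode(best), fitness(best))
```

Keeping the best individual seen so far is one simple way to "output the optimal solution" in Step 7; the source does not say whether elitism is used.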
The function coverage of the two algorithms, GA and GATS, is shown in Figure 4.

Figure 4. Comparison of performance using GA and GATS

Figure 4 compares the performance of GA and GATS. The x-axis represents the number of the problem under test and the y-axis the function coverage of the test cases. At the beginning, the GA tends to achieve higher function coverage, so its curve lies above that of the hybrid algorithm proposed in this paper. However, the GA cannot reach the global minimum because it is easily trapped in a local minimum, so its function coverage cannot reach 100 percent. The curve of the GATS algorithm is lower than that of the GA at the beginning, but as the algorithm proceeds, GATS becomes superior to the GA. This is due to the added tabu search operator.

6. Conclusion

Among automatic test data generation methods, the GA has achieved good results. However, the GA has flaws and limitations, and the hybrid method generates test cases more efficiently than the genetic algorithm. Therefore, the hybrid method can serve as a suitable algorithm for the automatic generation of test data.

7. References

[1] Sthamer H, The automatic generation of software test data using genetic algorithms [D], Pontypridd, Wales, Great Britain: University of Glamorgan, 1996.
[2] Harmen-Hinrich Sthamer, The Automatic Generation of Software Test Data Using Genetic Algorithms, http://www.systematic-testing.com/documents/sthamer_thesis.pdf
[3] Pargas R, Harrold M J, Peck R, Test-data generation using genetic algorithms [J], Journal of Software Testing, Verification, and Reliability, 1999(9):263-282.
[4] Jones B F, Eyres D E, Sthamer H H, A strategy for using genetic algorithms to automate branch and fault-based testing [J], The Computer Journal, 1998.
[5] Ling Liu and Huaikou Miao, Axiomatic Assessment of Logic Coverage Software Testing Criteria, Journal of Software, Vol. 15, No. 9, 2004, pp. 1301-1310.
[7] Christoph C. Michael, Gary E. McGraw, Michael A. Schatz, and Curtis C. Walton, Genetic Algorithms for Dynamic Test Data Generation, http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=632858, 1997.
[8] Fred Glover, Tabu Search-Part I, II, http://www.cds.caltech.edu/~shiling/tabu%20search%20part%20i.pdf, 1989.
[9] Korel B, Automated software test data generation [J], IEEE Transactions on Software Engineering, 1990, 16(8):870-879.
[10] Wegener J, Baresel A, Sthamer H, Evolutionary test environment for automatic structural testing [J], Information and Software Technology, 2001, 43(4):841-854.
[11] Yao Yao, New Test Case Generation Method Based on Genetic Algorithm, Computer & Digital Engineering, 2009.
[12] Ling Wang, Intelligent Optimization Algorithm and Its Application, Tsinghua University Press, 2001.
[13] G. Syswerda, Uniform crossover in genetic algorithms, Proceedings of the Third International Conference on Genetic Algorithms, 1989, pp. 2-9.