An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem

Jong-Won Yoon and Sung-Bae Cho
Dept. of Computer Science, Yonsei University, Seoul, Korea
jwyoon@sclab.yonsei.ac.kr, sbcho@cs.yonsei.ac.kr

Abstract

Genetic algorithms (GA) are one of the effective approaches to solving the traveling salesman problem (TSP). When applying GA to the TSP, it is necessary to use a large number of individuals in order to increase the chance of finding optimal solutions. However, this incurs high evaluation costs, which make it difficult to obtain the fitness values of all the individuals. To overcome this limitation, we propose an efficient genetic algorithm based on fuzzy clustering which reduces evaluation costs while minimizing the loss of performance. It works by evaluating only one representative individual for each cluster of a given population, and estimating the fitness values of the others indirectly from the representatives. A fuzzy c-means algorithm is used for grouping the individuals, and the fitness of each individual is estimated according to its membership values. The experiments were conducted with randomly generated cities, and the performance of the method was evaluated by comparison with other GAs. The results showed the usefulness of the proposed method on the TSP.

Keywords-genetic algorithm, fitness evaluation, fuzzy clustering, fuzzy c-means algorithm

I. INTRODUCTION

The traveling salesman problem (TSP) [1] is a well-known non-deterministic polynomial-time (NP)-hard problem in combinatorial optimization. Given a list of $N$ cities $V = \{v_1, v_2, \ldots, v_N\}$ and distances between them $W = \{w_{11}, w_{12}, \ldots, w_{1N}, \ldots, w_{NN}\}$, where $w_{ij}$ indicates the distance from $v_i$ to $v_j$, the goal of the TSP is to find a shortest tour path that visits each city only once. Fig. 1 shows examples of the TSP.

Figure 1. Examples of the traveling salesman problem (a: Cities to be visited, b: A possible tour path but not optimal, c: The shortest possible tour path)

The TSP has been applied to various real-world problems, such as logistics, data clustering, and computing genome sequences [1].
Since it is an NP-hard optimization problem, large instances are in practice solved with approximation algorithms. A genetic algorithm (GA) is a well-known effective approach to the TSP [2], since it is known to have a higher chance of reaching a global optimum than other search or optimization methods [3]. When using GA to solve the TSP, it is important to keep the population as large as possible in order to search for the globally optimal solution. This causes problems because it costs too much to evaluate the fitness of a large number of individuals in the population. The problem is related not only to the size of the population but also to the cost of evaluating a single individual. To surmount these limitations, many researchers have suggested evaluating individuals partially; for example, a modified interactive GA has been proposed in which the users evaluate only a part of the individuals and the rest are evaluated by the computer [4]. How to estimate the fitness values as precisely as possible while reducing the cost of evaluation remains an important issue.

In this paper, we apply to the TSP an efficient GA that partially evaluates and estimates fitness values by using the fuzzy clustering technique, in order to overcome the limitations of the conventional GA. The proposed method aims to keep performance similar to that of the standard GA. In addition, we conducted experiments with randomly generated TSP data to verify that the method can be applied to the TSP successfully.

II. EFFICIENT GENETIC ALGORITHMS

In general, it is difficult to define the fitness evaluation function, and in the case of the interactive GA it is almost impossible for a human to evaluate individuals as the size of the population grows. Similarly, for inverse problems in engineering, the fitness evaluation takes much time and cost [5]. In order to reduce the costs of the fitness evaluation in the GA, many researchers have proposed efficient GAs that evaluate only a part of the population and estimate the fitness of the remaining individuals.
These methods originate in the characteristic of the GA that optimization is achieved over successive generations based on individuals. One of the proposed works was a GA that evaluates only a few individuals directly and estimates the remainder by examining their similarities to the selected individuals [6]. In addition, hybrid GAs based on clustering have been developed. These GAs can considerably reduce the number of evaluations by evaluating only one representative of each cluster, at the cluster's center, after clustering all the individuals in the given population [7, 8]. Graening et al. suggested that the best individual of each cluster be re-evaluated using the real fitness function [7]. Jin and Sendhoff evaluated only the cluster centers with the real fitness function and the other individuals with a neural network ensemble [8]. However, such hard clustering techniques, which distribute the fitness value to each individual linearly from only the one representative individual to which it belongs, do not provide accurate fitness values unless the clustering forms an ideal cluster partition.

For efficient GAs, it is important to show that the algorithms can be applied to practical problems successfully. Even though some efficient GAs have been proposed, they have been verified with only some simple benchmark functions, including our previous works [9, 10], without any applications to complex and practical problems. In this paper, we propose to apply an efficient GA with the fuzzy clustering technique, which can estimate fitness values more precisely with fuzzy membership-based soft cluster boundaries, to solve the TSP at lower cost.

978-1-4244-7833-0/11/$26.00 ©2011 IEEE

III. FUZZY CLUSTERING-BASED EFFICIENT GA FOR TSP

In this paper, we propose an efficient GA for the TSP using the fuzzy clustering technique, which shows performance comparable to that of the conventional GA even though it incurs fewer costs for fitness evaluations. Fig. 2 shows the flow of the proposed method.

Figure 2. The flow of the proposed method

It is constructed with three parts. In the first part, the individuals are grouped by their similarities and membership values are obtained. In the second part, the centroid of each cluster is evaluated by the original fitness function and the fitness values of the other individuals are estimated with the membership values. Finally, the general GA operations are performed. This produces a new generation, and the whole process is repeated until the maximum number of generations is reached.

A. Designing GA for TSP

Prior to applying the proposed efficient GA to the TSP, it is necessary to design a GA for the problem. First of all, we applied the random key encoding (RKE) scheme [11] to encode chromosomes. RKE is a strategy available for problems involving permutation evolution [12], and it can also represent the TSP [11]. Moreover, in RKE the genes are represented by real numbers, which suits the proposed method since it uses the clustering technique (see the next section). In RKE, a real number is assigned to each gene, and each gene is mapped to a city. In decoding, the order of visiting cities is generated as the ascending order of their corresponding numbers. Fig. 3 shows an example of RKE for the TSP. As shown in the figure, the third city, which has 0.16 in the chromosome, is visited first after decoding because 0.16 is the smallest number in the chromosome. In contrast, the second city, which has 0.75, is visited last since it has the highest value among all the cities.

Figure 3. An example of RKE for the TSP

Since standard crossover techniques can be used with RKE, we used one-point crossover. The fitness value of the $i$th individual, $f_i$, was defined as below:

$$f_i = \frac{1}{w_{c_i(1)c_i(2)} + w_{c_i(2)c_i(3)} + \cdots + w_{c_i(N)c_i(1)}} \qquad (1)$$

where $c_i(k)$ represents the $k$th visited city in the solution of the $i$th individual. The denominator of Eq. (1) is the length of the tour path. The shorter the path is, the greater the corresponding fitness value is.

B. Fuzzy Clustering

In order to separate individuals into several groups, a fuzzy clustering algorithm is used for grouping the population instead of a hard clustering algorithm. A fuzzy clustering approach is less likely to get stuck in a local minimum than a hard clustering approach since it makes soft decisions in each iteration through the use of membership values. The most widely used fuzzy clustering algorithm is the fuzzy c-means algorithm, proposed by Bezdek [13]. It generates a fuzzy partition that provides each piece of data with a degree of membership to a given cluster. The values of the degrees of membership lie between 0 and 1.
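As a minimal sketch of how these pieces fit together, the following Python code decodes a random-key chromosome, scores the resulting tour with Eq. (1), and estimates the fitness of a non-representative individual as a membership-weighted sum of centroid fitness values (the estimate developed in Section III.C). The function names, the default fuzziness value m = 2.0, and the use of squared Euclidean distance between chromosomes are assumptions made for illustration; the membership formula is the standard fuzzy c-means update rather than anything specific to this paper.

```python
import math


def decode_random_keys(keys):
    """Return zero-based city indices in ascending order of their random keys."""
    return sorted(range(len(keys)), key=lambda city: keys[city])


def distance_matrix(coords):
    """Hypothetical helper: Euclidean distances between city coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def tour_fitness(order, dist):
    """Eq. (1): reciprocal of the closed tour length."""
    length = sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))
    return 1.0 / length


def fcm_memberships(x, centroids, m=2.0):
    """Standard fuzzy c-means membership of chromosome x in each cluster (assumed form)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centroids]
    if 0.0 in d2:
        # x coincides with a centroid: give it full membership there
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    weights = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_fitness(x, centroids, centroid_fitness, m=2.0):
    """Membership-weighted sum of the directly evaluated centroid fitness values."""
    mu = fcm_memberships(x, centroids, m)
    return sum(u * f for u, f in zip(mu, centroid_fitness))
```

With the hypothetical chromosome `[0.46, 0.75, 0.16, 0.33]`, `decode_random_keys` returns `[2, 3, 0, 1]` (zero-based): the third city is visited first and the second city last, mirroring the example in Section III.A. Because cluster centroids are also real-valued key vectors, they can be decoded and scored in the same way.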
Values close to 0 indicate the absence of association with the corresponding cluster, while values close to 1 indicate strong association with the cluster. Fig. 4 shows the procedure of the fuzzy c-means algorithm, with the update formulas written in their standard form [13].

1) Determine the number of clusters $c$ and the fuzziness parameter $m$.
2) Initialize the membership matrix $U = [\mu_{ij}]$ satisfying the condition: $\sum_{j=1}^{c} \mu_{ij} = 1$, $i = 1, \ldots, n$.
3) Compute the centroids $v_j$, $j = 1, 2, \ldots, c$: $v_j = \frac{\sum_{i=1}^{n} \mu_{ij}^{m} x_i}{\sum_{i=1}^{n} \mu_{ij}^{m}}$
4) Compute the membership values matrix $U$: $\mu_{ij} = \frac{\left(1 / d^2(x_i, v_j)\right)^{1/(m-1)}}{\sum_{k=1}^{c} \left(1 / d^2(x_i, v_k)\right)^{1/(m-1)}}$
5) Compute the objective function $J$: $J = \sum_{i=1}^{n} \sum_{j=1}^{c} \mu_{ij}^{m} d^2(x_i, v_j)$
6) Repeat 3) through 5) until $J$ has stabilized: $|J^{(t)} - J^{(t-1)}| < \varepsilon$

Figure 4. The fuzzy c-means clustering algorithm

C. Fitness Estimation

Since we evaluate only the centroids with the original fitness function, the fitness values of the remaining individuals should be estimated. Fitness estimation is rather important, and it is necessary to use an appropriate and efficient fitness estimation method because performance depends on accurate fitness estimation of the individuals. Fig. 5 shows an instance of the fitness estimation process with the centers of the clusters constructed by the clustering algorithm.

Figure 5. Fitness estimation of individuals with each cluster centroid

As shown in Fig. 5, suppose that $S = \{s_1, s_2, \ldots, s_n\}$ is a set of individuals in the population, $C = \{C_1, C_2, \ldots, C_c\}$ is a set of clusters, and the fitness values of the cluster centers are $F = \{f_1, f_2, \ldots, f_c\}$. The fitness value of an individual can be estimated based on the similarity between the individual and all the centers of the clusters. $m_{ij}$ indicates the degree of similarity between the $i$th individual and the $j$th cluster center.

For optimal fitness estimation, a fuzzy integral [14] is used to calculate the similarity between the individuals. Fuzzy integrals are integrals of a real function with respect to a fuzzy measure, in contrast to the Lebesgue integral, which is defined with respect to an ordinary measure. Hence fuzzy integration constitutes a vast family of aggregation operators, including many widely used ones suitable for this kind of aggregation. We have adopted the fuzzy integral to calculate the individual's similarity measure $m_{ij}$ over the centers of all the clusters. Let $h_j : S \to [0,1]$ be the degree of belongingness of an individual to the $j$th cluster, where 1 indicates absolute certainty that the individual is in the $j$th cluster and 0 implies absolute certainty that the individual is not in the $j$th cluster. The similarity measure $m_{ij}$ between the $i$th individual and the $j$th cluster center is as follows:

$$m_{ij} = h_j(s_i) \qquad (2)$$

and the distributed fitness value of $s_i$ is as follows:

$$f(s_i) = \sum_{j=1}^{c} m_{ij} f_j = \sum_{j=1}^{c} h_j(s_i) f_j \qquad (3)$$

Since the number of clusters is discrete, the fuzzy integral can be replaced by the sum of the values. Also, $h_j(s_i)$ can be replaced by $m_{ij}$, as in Eq. (3).

D. Computational Complexity

The main advantage of the proposed method is that it reduces computational complexity. In this section, we analyze the reduction of the complexity with big-O notation. Since the conventional GA and the proposed method share the same GA procedures except for fitness evaluation, only the complexity of the fitness evaluation step is treated. The complexity of the fitness evaluation for a single generation of the conventional GA can be defined as follows:

$$GA = n \cdot O(f) = O(nf) \qquad (4)$$

where $n$ is the number of individuals and $O(f)$ is the complexity of the fitness evaluation function. On the other hand, the complexity of the same part of the proposed method is defined as below:

$$GA_{FCM} = O(ndc^2 i) + O(cf) + O(n) = O(\max(ndc^2 i, cf, n)) = O(\max(ndc^2 i, cf)) \qquad (5)$$

where $d$ is the length of the chromosome (in the case of the TSP, it is related to the number of cities), $c$ is the number of clusters, and $i$ is the required number of iterations for the clustering. Eq. (5) consists of three terms. The first term is the computational complexity of the fuzzy c-means clustering algorithm [15]. The second term is the complexity of evaluating the $c$ cluster centroids. The third term represents the fitness estimation of the $n$ individuals, which can be completed in $O(n)$ without any extra computation since the fitness values of the centroids and the membership values of the individuals have already been obtained. By the sum rule of big-O notation, the complexity is determined by the maximum among the three terms. However, because $d$, $c$, and $i$ are
greater than 0, $n$ cannot be greater than $ndc^2 i$. Therefore, only the first and the second terms are chosen as the computational complexity of the proposed method.

Figure 6. Total running steps of GA and FCM depending on the length of chromosomes ($n = 100$, $c = 10$, $i = 50$, $O(f) = O(d^4)$)

Figure 7. Total running steps of GA and FCM depending on the complexity of the fitness function ($n = 100$, $c = 10$, $i = 50$, $d = 10$)

Since the number of clusters $c$ is smaller than the number of individuals $n$, the proposed method is faster than the conventional GA if $O(ndc^2 i) \le O(cf)$. Even if $O(ndc^2 i) > O(cf)$, the proposed method can reduce complexity if $O(ndc^2 i) < O(nf)$, which means that the computational complexity of the fitness evaluations of the conventional method is greater than the complexity of the fuzzy clustering.

Fig. 6 shows the total running steps depending on the length of chromosomes when the complexity of the fitness function depends on it. We fixed $n$, $c$, and the number of clustering iterations $i$ at 100, 10, and 50, respectively, and changed $d$ from 10 to 350. The complexity of the fitness function was set to $O(d^4)$. The conventional GA is presented as GA, and FCM is the proposed method. If $d$ is small, GA requires fewer total running steps than FCM. However, as $d$ grows, FCM runs faster than GA since the fitness function, which depends on $d$, requires more running steps.

Fig. 7 shows the steps depending on the complexity of the fitness function. We fixed $n$, $c$, $i$, and $d$ at 100, 10, 50, and 10, respectively. When the fitness function is rather simple, GA performs faster than FCM. On the other hand, FCM requires fewer steps than GA if the fitness function is complex enough that $O(ndc^2 i) < O(nf)$ (in this case, $O(f) \ge O(d^5)$).

IV. EXPERIMENTAL RESULTS

We conducted experiments to demonstrate the usefulness of the proposed method on the TSP. In order to assess the performance of the method, we compared several GA methods, including existing efficient GAs with partial evaluation.

A. Experimental Settings

For the experiments, a total of 30 cities were used and the distances between them were set randomly. The crossover rate and the mutation rate were 0.75 and 0.0005, respectively.
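The comparison in Section III.D can be checked with simple arithmetic. The sketch below counts abstract running steps for Eq. (4) and for the three terms of Eq. (5), taking the clustering cost as n·d·c²·i as reported for fuzzy c-means in [15] and ignoring constant factors. The function names are illustrative assumptions, not part of the original study.

```python
def ga_steps(n, f):
    """Eq. (4): n direct fitness evaluations, each costing f steps."""
    return n * f


def fcm_steps(n, d, c, i, f):
    """Eq. (5) before taking the maximum: clustering + centroid evaluation + estimation."""
    clustering = n * d * c ** 2 * i   # assumed fuzzy c-means cost, constants ignored
    centroid_evaluation = c * f       # only the c centroids are evaluated directly
    estimation = n                    # one membership-weighted sum per individual
    return clustering + centroid_evaluation + estimation


if __name__ == "__main__":
    n, c, i, d = 100, 10, 50, 10      # values quoted for Fig. 7
    for power in (4, 5):
        f = d ** power
        print(power, ga_steps(n, f), fcm_steps(n, d, c, i, f))
```

With n = 100, c = 10, i = 50 and d = 10, these counts favor GA when f = d^4 and favor FCM when f = d^5, in line with the threshold discussed for Fig. 7.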
Several GA methods, including some methods which evaluate individuals partially, were used to compare performances. Simple GAs with population sizes of 100 (Pop100) and 10 (Pop10), partially evaluated GAs with hard clustering techniques including the single linkage (S-L), the hard c-means (HCM), and the k-means (KM), and the proposed method (FCM) were compared. The number of clusters $c$ was set to 10. All experiments were conducted for 20 runs and the average results were used. The general parameters of the GA are shown in Table 1.

TABLE I. PARAMETERS IN THE EXPERIMENTS

                        Pop100   Pop10   FCM
Population size         100      10      100
# of evaluations        100      10      10
# of clusters           -        -       10
Length of chromosome    150
Crossover rate          0.75
Mutation rate           0.0005
# of generations        300
Fuzziness parameter     1.
Terminal condition      0.0001

B. Experimental Results

Table 2 shows the distance of the shortest tour path obtained from Pop100, Pop10, the hard clustering algorithms (S-L, HCM, and KM), and the proposed method (FCM). FCM showed better performance than the alternative methods except Pop100. Among the hard clustering algorithms, only HCM performed better than Pop10 by estimating the fitness of individuals, even though they evaluated the same number of individuals; however, it performed worse than FCM. The result implies that the proposed method estimates the fitness more accurately.

TABLE II. THE DISTANCE OF THE SHORTEST TOUR PATH FROM POP100, POP10, THREE HARD CLUSTERING ALGORITHMS AND THE FCM (AVERAGE OF 20 RUNS)

         Pop100   Pop10   S-L      HCM     KM       FCM
Mean     484.6    98.8    934.4    90.75   94.35    756.6
Error    ±10.08   ±.30    ±13.99   ±1.3    ±15.00   ±10.3

Fig. 8 shows the evolution process for the TSP with several methods. Pop10 evolved slowly and never reached the optimal solution. Among the hard clustering algorithms, only KM produced better solutions than. HCM showed better performance than Pop10 for a moment, but only at the beginning of the evolution. The FCM showed the best performance among the alternative methods except Pop100, which implies that the proposed method is more accurate owing to the fuzzy membership-based estimation process described in Section III.B. Even though the FCM evolved more
slowly than Pop100, it showed the most similar result with reduced complexity cost.

Figure 8. Evolution processes for the TSP problem

Table 3 shows the results of the paired t-test between the conventional method (Pop100) and the alternative methods. The t-value represents how different the performance of each comparison method is from that of Pop100. The statistics support the hypothesis that the results of Pop100 differ from those of all the alternative methods, since the absolute t-value of each method is greater than the two-sided critical value of t. Nevertheless, the FCM showed the smallest difference between its t-value and the critical value, which implies that the FCM produced the result most similar to Pop100 among the alternative methods. It turns out to be very efficient in terms of time and cost to obtain similar results without evaluating all the 100 individuals.

TABLE III. THE T-TEST RESULTS OF THE STANDARD GA (POP100) AND POP10, THREE HARD CLUSTERING ALGORITHMS AND THE FCM (SIGNIFICANCE LEVEL 0.005, DOF 19, TWO-SIDED CRITICAL VALUE OF T 2.09)

           Pop10    S-L      HCM     KM      FCM
Mean       98.8     934.4    90.75   94.35   756.6
Std        99.49    65.56    55.10   67.09   45.76
t-value    -19.0    -35.85   -6.13   -7.95   -17.86

Table 4 shows the comparison of the evaluation time required for the two methods, Pop100 and FCM. The time was reduced to almost one eighth by the proposed method. This result implies the usefulness of the proposed method, which can provide accurate fitness estimation with fewer evaluations in the population.

TABLE IV. COMPARISON OF EVALUATION TIME OF THE CONVENTIONAL GA (POP100) AND THE PROPOSED METHOD (FCM) (UNIT: MILLISECONDS)

         Pop100   FCM
Time     40.57    49.17
Error    ±5.17    ±0.34

V. CONCLUDING REMARKS

We proposed an efficient genetic algorithm (GA) for the traveling salesman problem (TSP). The method requires fewer fitness evaluations owing to the fuzzy clustering process. This process divides the whole population into several clusters and evaluates one individual for each cluster.
The fitness values of the others are estimated indirectly from the fitness values of the representative individuals through their membership functions. Results from the experiments confirm that the algorithm performs best among the methods that evaluate only 10 individuals per generation and that it reduces computational complexity. However, there still exist gaps in performance between the conventional GA and the proposed method. In order to narrow these gaps, enhanced estimation methods should be investigated in the future. The method should also be applied to extended benchmark data sets of the TSP and to more real-world problems in which fitness evaluation is actually costly.

REFERENCES

[1] D. L. Applegate, R. E. Bixby, V. Chvatal, and W. J. Cook, The Traveling Salesman Problem: A Computational Study, Princeton University Press, 2006.
[2] F. Liu and G. Zeng, "Study of genetic algorithm with reinforcement learning to solve the TSP," Expert Systems with Applications, vol. 36, no. 3, pp. 6995-7001, 2009.
[3] S.-B. Cho and J.-Y. Lee, "A human-oriented image retrieval system using interactive genetic algorithm," IEEE Trans. Systems, Man, and Cybernetics, vol. 32, no. 3, pp. 452-458, 2002.
[4] M. Shibuya, H. Kita and S. Kobayashi, "Integration of multi-objective and interactive genetic algorithms and its application to animation design," Proc. of '99 Int'l Conf. on Systems, Man, and Cybernetics, vol. 3, pp. 646-651, 1999.
[5] R. Shonkwiler, F. Mendivil and A. Deliu, "Genetic algorithms for 1-D fractal inverse problem," Proc. of 4th Int'l Conf. on Genetic Algorithms, pp. 495-501, 1991.
[6] F. Sugimoto and M. Yoneyama, "Hybrid fitness assignment strategy in IGA," Proc. IEEE Workshop on Multimedia Signal Processing, pp. 84-87, Dec. 2002.
[7] L. Graening, Y. Jin and B. Sendhoff, "Efficient evolutionary optimization using individual-based evolution control and neural networks: A comparative study," Proc. of European Symposium on Artificial Neural Networks, pp. 73-78, 2005.
[8] Y. Jin and B. Sendhoff, "Reducing fitness evaluations using clustering techniques and neural network ensembles," Lecture Notes in Computer Science, vol. 3102, pp. 688-699, 2004.
[9] H.-S. Kim and S.-B. Cho, "An efficient genetic algorithm with less fitness evaluation by clustering," Proc. of 2001 IEEE Congress on Evolutionary Computation, pp. 887-894, May 2001.
[10] J.-W. Yoon and S.-B. Cho, "Fitness approximation for genetic algorithm using combination of approximation model and fuzzy clustering technique," World Congress on Computational Intelligence (WCCI 2010), pp. 1-6, Jul. 2010.
[11] L. V. Snyder and M. S. Daskin, "A random-key genetic algorithm for the generalized traveling salesman problem," European Journal of Operational Research, vol. 174, no. 1, pp. 38-53, 2006.
[12] P. Kromer, J. Platos, and V. Snasel, "Modeling permutations for genetic algorithms," Proc. of 2010 Int'l Conf. of Soft Computing and Pattern Recognition, pp. 100-105, 2009.
[13] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, 1981.
[14] M. Sugeno, Theory of Fuzzy Integrals and its Applications, PhD Thesis, Tokyo Institute of Technology, 1974.
[15] V. Sreenivasarao and D. S. Vidyavathi, "Comparative analysis of fuzzy c-mean and modified fuzzy possibilistic c-mean algorithms in data mining," International Journal of Computer Science and Technology, vol. 1, no. 1, pp. 104-106, 2010.
[16] Y. Jin, "A comprehensive survey of fitness approximation in evolutionary computation," Soft Computing, vol. 9, no. 1, pp. 3-12, 2005.