K-means Optimization Clustering Algorithm Based on Hybrid PSO/GA Optimization and CS validity index


Original Article | Print ISSN: 2321-6379 | Online ISSN: 2321-595X

K. Jahanbin1*, F. Rahmanian2, H. Rezaei3, Y. Farhang4, A. Afroozeh5
1Department of Computer, Kerman Branch, Islamic Azad University, Kerman, Iran; 2Department of Computer, Kerman Branch, Islamic Azad University, Kerman, Iran; 3Department of Management, Firozkuh Branch, Islamic Azad University, Firozkuh, Iran; 4Faculty of Computer, Khoy Branch, Islamic Azad University, Khoy, Iran; 5University of Larestan, Lar, Iran

Abstract

K-means is the most widely used partition-based clustering algorithm; however, it is highly dependent on the initial cluster centers and on the number of clusters chosen at the start, and hence it is rapidly trapped in local optima. This paper presents a new framework, called Hybrid PSO/GA (HPG), based on the combination of the Genetic and Particle Swarm Optimization algorithms, taking advantage of the CS validity index as the cost function. In the proposed method, the locations and number of clusters are first found by the HPG algorithm and the cost function is calculated for each iteration; finally, the optimal solution, with the best cost value and the locations and number of cluster centers, is sent to the K-means algorithm. Because it uses both evolutionary and swarm intelligence approaches, the proposed approach not only solves the problems of random initialization and of K-means falling into local optima, but also, in comparison to optimizing K-means with genetic algorithms or PSO alone, converges faster and more accurately to the global optimum, performs better in terms of the number of function evaluations, and achieves more effective clustering results.

Keywords: K-means, Hybrid PSO/GA optimization, Clustering

INTRODUCTION

Clustering is one of the most important issues in data mining. Unlike classification, clustering is an unsupervised learning method; its ultimate goal is to maximize the separation between clusters and to minimize the cohesion (the within-cluster distance).
Hence clustering can also be regarded as an optimization problem, and evolutionary and swarm intelligence methods can be used to solve and optimize the clustering criteria. In general, clustering methods can be divided into four groups:
- Partition-based methods
- Hierarchical methods
- Grid-based methods
- Model-based methods

www.ijss-sn.com | Month of Submission: 06-2017 | Month of Peer Review: 06-2017 | Month of Acceptance: 07-2017 | Month of Publishing: 07-2017

K-means is a partition-based method and, due to its simple structure, fast implementation, and rapid convergence rate, has numerous applications in data mining and machine learning, including clustering, image segmentation, and pattern recognition. However, partition-based clustering algorithms such as K-means, K-medoids, and Fuzzy C-Means (FCM) have serious problems, such as extreme sensitivity to the initial starting locations and fast convergence to local optima [1]. In this paper, a hybrid approach based on the PSO and GA algorithms, called Hybrid PSO/GA (HPG), is used to deal with these problems of the K-means algorithm and other partition-based clustering algorithms. Because of the high power of genetic algorithms in searching the solution space, GA is an appropriate choice for optimizing the objective function; however, evolutionary algorithms take a purely optimization-driven approach, while swarm intelligence algorithms like PSO produce the optimal solution based on the experiences and observations of each particle. This paper therefore combines both methods to exploit their benefits, aiming at an effective clustering method.

Corresponding Author: K. Jahanbin, Department of Computer, Kerman Branch, Islamic Azad University, Kerman, Iran. E-mail: Mohagheghna7877@gmail.com

International Journal of Scientific Study, July 2017, Vol 5, Issue 4

In the proposed approach, the K cluster centers with the minimum cost in terms of the CS cost function are produced by the HPG algorithm and sent to the K-means algorithm (or any other partition-based algorithm); that algorithm then conducts the clustering based on the positions of the cluster centers, and the results are analyzed with the criteria of separation, cohesion, NFE, and the value of the CS cost function per iteration. The analysis results show that the HPG algorithm solves the problems of random initialization and of K-means falling into local optima, while retaining the simple structure, fast implementation, and rapid convergence of partition-based algorithms.

A REVIEW OF PREVIOUS LITERATURE

Lu et al. provided the fast genetic K-means algorithm (FGKA) and the incremental genetic K-means algorithm, inspired by the GKA algorithm of Krishna and Murty [2]; the features of these two algorithms are higher speed and better convergence to the global optimum [3, 4]. In [5], an automatic clustering method based on Differential Evolution is provided for the classification of large data without knowledge of their labeling; automatic image segmentation is also demonstrated by the authors. According to the authors, the proposed method has appropriate speed, stability, and convergence to the global optimum. In [1], an improved K-means technique called PM-Kmeans is provided that sends the optimal cluster centers to K-means using PSO; K-means conducts the clustering process and the two solutions are then combined. As can be seen, all these methods are based solely on evolutionary algorithms or on swarm intelligence, but the proposed HPG algorithm uses the advantages of both kinds of methods, together with the CS cost function, for automatic clustering.

SCIENTIFIC BACKGROUNDS

K-means Algorithm

K-means [6] is a vector quantization method widely used in cluster analysis and data mining. K-means clustering is an NP-hard problem, though it can be improved by meta-heuristic algorithms and converges rapidly toward a local optimum.
K-means takes a set of observations X = {x_1, x_2, ..., x_n}, x_i ∈ R^d, and assigns them to K clusters S = {S_1, S_2, ..., S_K}, where K << N. The aim of K-means is to minimize the within-cluster sum of squared distances (WCSS); in other words:

(1)  arg min_S  Σ_{i=1}^{K} Σ_{x ∈ S_i} ‖x − μ_i‖²

where μ_i is the mean of the points in S_i.

PSO Algorithm

PSO [7] is essentially a continuous algorithm in the field of swarm intelligence (SI). From another perspective, because it implements an optimization process in each run and can obtain a solution with minimal information, PSO can also be considered an evolutionary algorithm (EA) and a meta-heuristic. The PSO algorithm works by having a population (called a swarm) of candidate solutions (called particles). These particles move around the search space with a velocity determined by the best positions experienced. Each particle i saves its best experienced position, denoted x_best,i, in its memory and compares it with the best position experienced by all particles, x_Gbest; all particles then set their new positions according to the values of x_best,i and x_Gbest to reach new solutions. If x_i, v_i, and x_best,i are the position, velocity, and best experienced position of particle i, respectively, ω is the inertia coefficient, and x_Gbest is the best position of the whole swarm, the equations describing the behavior of the particles can be written as follows:

(2)  v_i[t+1] = ω·v_i[t] + c_1 r_1 (x_best,i[t] − x_i[t]) + c_2 r_2 (x_Gbest[t] − x_i[t])
     x_i[t+1] = x_i[t] + v_i[t+1]

where x_i[t] is the current position at iteration t, v_i[t+1] is the new velocity at iteration t+1, x_i[t+1] is the new position of the particle at iteration t+1, r_1, r_2 ~ U(0, 1), and c_1, c_2 are the personal and collective learning factors, set according to the constriction-coefficient method.

Constriction Coefficients [8]: Given two constants φ_1, φ_2 > 0, define:

(3)  φ = φ_1 + φ_2,  φ > 4

Then the number ξ is:

(4)  ξ = 2 / ( φ − 2 + √(φ² − 4φ) )

and the coefficients are defined as:

(5)  ω = ξ,  C_1 = ξ·φ_1,  C_2 = ξ·φ_2
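As a concrete illustration, the update rules of Equation (2) together with the constriction coefficients of Equations (3)-(5) can be sketched in Python; the sphere cost function, swarm size, and iteration count here are illustrative assumptions, not values from the paper:

```python
import random

def constriction(phi1=2.05, phi2=2.05):
    # Equations (3)-(5): xi = 2 / (phi - 2 + sqrt(phi^2 - 4*phi)), phi > 4
    phi = phi1 + phi2
    xi = 2.0 / (phi - 2.0 + (phi * phi - 4.0 * phi) ** 0.5)
    return xi, xi * phi1, xi * phi2   # omega, C1, C2

def pso(cost, dim=2, n_particles=20, iters=200, lo=-5.0, hi=5.0, seed=1):
    rng = random.Random(seed)
    omega, c1, c2 = constriction()
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [row[:] for row in x]                        # x_best,i
    pcost = [cost(p) for p in pbest]
    g = min(range(n_particles), key=lambda i: pcost[i])  # index of x_Gbest
    gbest, gcost = pbest[g][:], pcost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Equation (2): velocity update, then position update
                v[i][d] = (omega * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            c = cost(x[i])
            if c < pcost[i]:
                pbest[i], pcost[i] = x[i][:], c
                if c < gcost:
                    gbest, gcost = x[i][:], c
    return gbest, gcost

sphere = lambda p: sum(t * t for t in p)
best, best_cost = pso(sphere)
```

With φ_1 = φ_2 = 2.05, `constriction()` reproduces the values given in the text (ω ≈ 0.7298, C_1 = C_2 ≈ 1.496).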

For the proven optimal setting of φ_1 = φ_2 = 2.05, this gives ω = 0.7298 and C_1 = C_2 = 1.496.

Genetic Algorithm

The genetic algorithm [9] is an optimization and randomized search technique inspired by the principles of evolution and natural genetics. GA can search complex, large, and multi-dimensional spaces to find a near-optimal solution for the objective function of an optimization problem. In GA, the parameters of the search space are encoded in a string called a chromosome, and a set of chromosomes forms a population. In general, whether binary or continuous, the genetic algorithm can be written as follows:

1: t ← 0;
2: Initialize Population [P(t)];
3: Evaluate the Population [P(t)];
4: While (not termination) do
5:   P'(t) = Variation [P(t)];   {create new solutions}
6:   Evaluate Population [P'(t)];   {evaluate the new solutions}
7:   P(t+1) ← Apply Genetic Operators [P'(t) ∪ Q];   {new generation}
8:   t ← t+1;
9: End While

Mutation & Arithmetic Crossover: If nvar indicates the number of unknowns and X = {x_1, x_2, ..., x_nvar}, X ∈ [x_min, x_max], represents an individual of the population, then each pair of parents and their offspring are defined as follows (two parents are shown):

Parents:   X¹ = {x¹_1, x¹_2, ..., x¹_n},  X² = {x²_1, x²_2, ..., x²_n}
Offspring: Y¹ = α·X¹ + (1 − α)·X²,  Y² = α·X² + (1 − α)·X¹

α is the crossover/mutation parameter, but α alone is not enough, because in the best case the offspring are similar to one of their parents and are not able to search the space further than their parents. If the offspring are generated with α in the range [0, 1], in the best case they will be similar to one of their parents; but if the range is slightly extended to [−γ, 1 + γ], the offspring can obtain a more optimal solution than their parents (for example, γ = 0.05).

CS Clustering Validity Index

The CS clustering validity index [10] is a recently proposed measure for assessing the validity of clusterings and serving as the objective function of clustering algorithms.
Before applying the CS measure, the cluster centers are calculated as the average of the data vectors of each cluster:

(6)  m_i = (1/N_i) Σ_{x_j ∈ C_i} x_j

The distance between points x_i and x_j is denoted d(x_i, x_j). The CS measure can then be defined as follows:

(7)  CS(K) = [ (1/K) Σ_{i=1}^{K} (1/N_i) Σ_{X_i ∈ C_i} max_{X_q ∈ C_i} d(X_i, X_q) ] / [ (1/K) Σ_{i=1}^{K} min_{j ≠ i} d(m_i, m_j) ]
           = [ Σ_{i=1}^{K} (1/N_i) Σ_{X_i ∈ C_i} max_{X_q ∈ C_i} d(X_i, X_q) ] / [ Σ_{i=1}^{K} min_{j ≠ i} d(m_i, m_j) ]

It can be seen that CS divides the longest distances between cluster members by the shortest distances between the clusters; consequently, CS captures the two factors of cohesion and separation, which are the characteristics of a suitable clustering.

Exception: a divide-by-zero error may occur in the CS calculation; this error occurs when one of the cluster centers is outside the boundary of the distribution of the dataset. To avoid this problem, if a cluster center has no data, or a very small number of data points (for example, 2 or 3), this cluster center is penalized by assigning it a large value:

(8)  10 × norm(Max(X), Min(X))

HYBRID PSO/GA METHOD

The PSO algorithm often has a strong tendency to fall into local optima because the swarm lacks diversity. On the other hand, the genetic algorithm is merely an optimizer and does not use the properties of swarm intelligence, that is, the relationship between particles and the keeping of the best memories; hence, the HPG algorithm tries to take advantage of the properties of both the evolutionary and swarm intelligence approaches.

As mentioned above, the CS cost function clearly captures the cohesion and separation criteria; hence, in the HPG algorithm, PSO first finds the global best, and then genetic operators such as crossover and mutation are applied to the particles; the goal is to expand the search space and achieve more optimal answers.

Particle Population Encoding and Clustering Cost

The answers of the HPG algorithm are encoded as a particle population; the objective function is designed in accordance with [5], with the CS clustering validity index attached to it.

Particle population encoding: Suppose the dataset is X = {x_1, x_2, ..., x_n}, X ∈ R^{n×d}; the cluster centers are represented by M = {m_1, m_2, ..., m_K}, m_i ∈ R^d, K << N; and to activate the center of each cluster an activation array A = {a_1, a_2, ..., a_K} is used. The particle population matrix is then K × (d + 1)-dimensional, each row holding a candidate center and its activation value. We define X_max = Max(X) and X_min = Min(X), and a_i ~ U(0, 1); if a_i ≥ 0.5, the i-th cluster center is active.

The following pseudocode shows the CS-index function used to validate the cluster centers:

1: Function CS_index (X, M):
2: {
3:   d_max,i = max_{x_p, x_q ∈ C_i} d(x_p, x_q);   % maximum pairwise distance within cluster i
4:   IF Numel(C_i) > a few points
5:     keep d_max,i;
6:   Else
7:     d_max,i = 10 × norm(Max(X), Min(X));   % penalty of Equation (8)
8:   CS = Σ_i d_max,i / Σ_i min_{j ≠ i} d(m_i, m_j);   % minimum pairwise distance between cluster centers
9:   Output: CS;
10: }

and the cost function that uses the CS index:

1: Function Clustering_Cost (X_particle):
2: {
3:   M = X_particle(:, 1:end−1);   % all columns except the last
4:   A = X_particle(:, end);   % the last column (activations)
5:   IF Numel(a_i ≥ 0.5) < 2   % fewer than two active cluster centers
6:     activate the two centers with the biggest a_i;
7:   Else
8:     activate all centers with a_i ≥ 0.5;
9:   CS = CS_index(M, X);
10:  Output: CS_index.M, CS_index.CS;
11: }

One of the benefits of the HPG method is that the CS function can be replaced by any other clustering validity index; the cost function can then be considered as:

Clustering_Cost = CS

and the fitness function value is calculated as Fitness = CS + ε, with a small ε guarding the division in Equation (7).

HPG Algorithm

As mentioned, the HPG algorithm is a combination of the PSO and GA algorithms. In HPG, the velocity parameter of the PSO algorithm is applied to the GA population in the crossover and mutation parts, so that the GA algorithm can search more of the space, increasing exploration; on the other hand, by adding GA operations such as sorting the population, crossover, and mutation to the PSO particles, more accurate optimization occurs, which is one of the tasks of evolutionary algorithms and leads to increased exploitation.
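A minimal Python sketch of the CS index of Equation (7), with the near-empty-cluster penalty of Equation (8); the Euclidean distance and the minimum-cluster-size threshold are assumptions for illustration:

```python
import math

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def cs_index(X, centers, min_size=2):
    # Assign each point to its nearest center
    clusters = [[] for _ in centers]
    for x in X:
        i = min(range(len(centers)), key=lambda k: dist(x, centers[k]))
        clusters[i].append(x)
    # Penalty of Equation (8) for (nearly) empty clusters
    lo = [min(col) for col in zip(*X)]
    hi = [max(col) for col in zip(*X)]
    penalty = 10.0 * dist(hi, lo)
    # Numerator of Equation (7): per cluster, the mean over its points of the
    # distance to the farthest point in the same cluster
    num = 0.0
    for c in clusters:
        if len(c) < min_size:
            num += penalty
        else:
            num += sum(max(dist(p, q) for q in c) for p in c) / len(c)
    # Denominator of Equation (7): per center, distance to nearest other center
    den = sum(min(dist(m, m2) for j, m2 in enumerate(centers) if j != i)
              for i, m in enumerate(centers))
    return num / den

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
     (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
cs = cs_index(X, [(0.33, 0.33), (10.33, 10.33)])
```

Two tight, well-separated clusters give a small CS value; adding a far-away center that captures no points triggers the penalty and inflates CS, which is exactly how invalid centers are discouraged.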

In addition to the benefits of evolutionary algorithms and swarm intelligence, the proposed method delivers the final clustering to the K-means algorithm because of its higher execution speed; on the other hand, the HPG algorithm solves the problem of PSO falling into local optima and eliminates the severe dependence of partition-based methods such as K-means on the initial starting centers.

In all the algorithms it is assumed that sp_i.best is the best local value of the objective function for member i of the swarm population, sp_Gbest is the best value obtained by the whole population of particles up to the i-th round, M indicates the positions and number of all cluster centers, and X represents the tested dataset.

Algorithm 1: Automatic PSO Clustering

Input: Dataset X = {x_1, x_2, ..., x_n}; cluster centroids M = {m_1, m_2, ..., m_K}; Max-Sub-Iteration-PSO; Sp_population = 100; Max-Iteration
Output: Sp_best.cost = {sp_1.best, sp_2.best, ..., sp_n.best}; sp_Gbest

Procedure Automatic PSO Clustering
1:  assign ω, C_1, C_2 using Equation (5);
2:  For i = 1 to Max-Sub-Iteration-PSO
3:    For j = 1 to Sp_population
4:      UPDATE the velocity of sp_j using Equation (2);
5:      clamp sp_j.velocity to [Velocity min, Velocity max];   % apply velocity limit
6:      UPDATE the position of sp_j using Equation (2);
7:      IF (sp_j.position < min(X) OR sp_j.position > max(X))   % velocity mirror effect
8:        sp_j.velocity = −sp_j.velocity; clamp sp_j.position to [min(X), max(X)];
9:      END IF
10:     [sp_j.cost, sp_j.index] = Clustering_Cost(sp_j);
11:     IF (sp_j.cost < sp_j.best)   % update sp_j.best
12:       UPDATE sp_j.best and sp_j.position;
13:     END IF
14:     IF (sp_j.best < sp_Gbest)   % update sp_Gbest
15:       sp_Gbest = sp_j.best;
16:     END IF
17:   END FOR
18: END FOR
END Procedure

Algorithm 2: Automatic GA Clustering

Procedure Automatic GA Clustering
1:  n_crossover = P_crossover × swarm_population;
2:  For i = 1 to Max-Sub-Iteration-GA
3:    For j = 1 to n_crossover
4:      Parent_1 = RouletteWheelSelection(e^{−β·cost});
5:      Parent_2 = RouletteWheelSelection(e^{−β·cost});
6:      generate offspring by arithmetic crossover and mutation; UPDATE sp_j.best and sp_j.velocity;
7:    END FOR
8:    Merge & sort the population;
9:    IF (sp_j.best < sp_best.cost)
10:     sp_best.cost = sp_j.best;
11:   END IF
12:   IF (sp_best.cost < sp_Gbest)
13:     sp_Gbest = sp_best.cost;
14:   END IF
15: END FOR
END Procedure
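The variation step of Algorithm 2, i.e. roulette-wheel selection under selection pressure β followed by the arithmetic crossover with the extended range [−γ, 1 + γ] described earlier, can be sketched as follows; the exponential weighting e^{−β·cost/worst} and all concrete values are illustrative assumptions:

```python
import math
import random

def roulette_wheel(costs, beta, rng):
    # Selection weight proportional to exp(-beta * cost / worst_cost):
    # lower cost -> larger weight (minimization)
    worst = max(costs)
    w = [math.exp(-beta * c / worst) for c in costs]
    r = rng.random() * sum(w)
    acc = 0.0
    for i, wi in enumerate(w):
        acc += wi
        if acc >= r:
            return i
    return len(w) - 1

def arithmetic_crossover(p1, p2, gamma=0.05, rng=random):
    # Y1 = alpha*X1 + (1-alpha)*X2, with alpha drawn from [-gamma, 1+gamma]
    # so offspring may step slightly outside the segment between the parents
    y1, y2 = [], []
    for a, b in zip(p1, p2):
        alpha = rng.uniform(-gamma, 1.0 + gamma)
        y1.append(alpha * a + (1.0 - alpha) * b)
        y2.append(alpha * b + (1.0 - alpha) * a)
    return y1, y2

rng = random.Random(0)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
costs = [sum(x * x for x in p) for p in pop]
i = roulette_wheel(costs, beta=8.0, rng=rng)   # beta = 8 as in Table 1
j = roulette_wheel(costs, beta=8.0, rng=rng)
child1, child2 = arithmetic_crossover(pop[i], pop[j], rng=rng)
```

Each offspring gene stays within the interval spanned by the two parent genes, widened by γ·|difference| on each side, which is what lets the offspring search slightly beyond their parents.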

Termination condition: the HPG algorithm has two combined sub-programs, PSO and GA; in each iteration of the algorithm, PSO is executed once and GA once. If the global best does not change over 10 runs, it can be concluded that the algorithm has reached a stable level and need not be continued further (this result is clearly seen in Figures 3 and 6 of the Results section). The termination condition is as follows:

1: IF (iteration >= 30)
2:   IF (sp_Gbest(i) == sp_Gbest(i−10))
3:     Break;
4:   END IF
5: END IF

With the two algorithms of Automatic PSO and GA Clustering and the termination condition defined above, the main body of the HPG algorithm is as follows:

Algorithm 3: Automatic HPG Clustering

Procedure Automatic HPG Clustering
1: For i = 1 to Max-Iteration
2:   Automatic PSO Clustering;
3:   Automatic GA Clustering;
4: END FOR
5: K = Numel(sp_Gbest.m);
6: Best_Solution = sp_Gbest.cost;

RESULTS

Two standard datasets, one linear and one non-linear, named X_1 and X_2, have been used for the analysis. X_1 is associated with 000 samples and five cluster centers in positions [- 0], [3, 5], [9 ], [- 0], and [8, 0]. X_2 is a non-linear dataset containing 9000 samples and 3 cluster centers in positions [0, 0], [3, 3], and [3, −3]; the non-linear dataset is tested because of the weakness of partition-based methods in identifying such patterns. In this dataset, if r represents the radius and θ ∈ [0, 2π] the angle from the X-axis, the coordinates (X, Y) are x = r·cos(θ), y = r·sin(θ). X_1 and X_2 are shown in Figure 1.

Table 1 indicates the coefficients of PSO and GA.

Number of Function Evaluations (NFE): NFE is a much better measure than CPU time for understanding the operating speed of an algorithm, because many recently introduced algorithms, such as quantum algorithms, have operators that are not defined in common CPUs, and applying them requires extra calculation; the high runtime of this kind of algorithm therefore cannot be attributed to the execution complexity and slowness of the algorithm itself.
NFE is defined as follows:

NFE = size of the original population + (offspring {number of children + number of mutants}) × number of iterations

The MATLAB programming environment, version R2013b, has been used for comparing the algorithms.

Algorithm 3 then passes the centers to K-means (lines 6-13):

6:  K-means(M, X)   % K-medoids or Fuzzy C-Means could be substituted
7:  { Repeat until convergence:
8:     For each i: c_i = arg min_j ‖x_i − m_j‖²;   % assignment step
9:     For each j: m_j = Σ_i 1{c_i = j}·x_i / Σ_i 1{c_i = j};   % update step
10: }
11: END Procedure

Figure 1: Structure of datasets X_1 and X_2

The results of the clustering and of the cost function versus the number
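The K-means refinement of Algorithm 3 (assignment step, then centroid update, starting from the centers delivered by HPG) can be sketched in Python; this is a minimal illustration, not the paper's MATLAB implementation:

```python
def kmeans(X, centers, iters=100):
    """Lloyd iterations from the given starting centers (Algorithm 3, lines
    6-13): assign each point to its nearest center, then move each center to
    the mean of its assigned points; stop when no center moves."""
    centers = [list(m) for m in centers]
    for _ in range(iters):
        # Assignment step: c_i = argmin_j ||x_i - m_j||^2
        assign = [min(range(len(centers)),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(x, centers[j])))
                  for x in X]
        moved = False
        # Update step: m_j = mean of the points assigned to cluster j
        for j in range(len(centers)):
            members = [x for x, c in zip(X, assign) if c == j]
            if members:
                new = [sum(col) / len(members) for col in zip(*members)]
                if new != centers[j]:
                    centers[j], moved = new, True
        if not moved:
            break
    return centers, assign

X = [(0.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 9.0)]
centers, labels = kmeans(X, [(0.5, 0.5), (9.5, 9.0)])
```

Starting from reasonable centers, as HPG provides, the loop converges in a couple of iterations; with random starting centers it can instead settle in a poor local optimum, which is exactly the weakness the paper addresses.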

of run cycles for the K-means optimizations based on GA and PSO and the proposed K-means optimization based on HPG with CS are shown, respectively, in Figures 2, 3, and 4 and Table 2 for the X_1 dataset.

Table 1: Values of the PSO and GA parameters

PSO parameters:
- Population: 100
- Max sub-iterations: —
- Inertia weight (ω): ξ
- Personal learning coefficient (C_1): ξ·φ_1
- Global learning coefficient (C_2): ξ·φ_2
- Damp (ω = ω × damp): 0.99
- Velocity max: 0.1 × (max(X_1) − min(X_1))
- Velocity min: −Velocity max
- Clustering validity index: CS

GA parameters:
- Population: 100
- Max sub-iterations: —
- Crossover percentage: 0.8
- Mutation percentage: 0.3
- γ: 0.05
- Velocity max: 0.1 × (max(X_1) − min(X_1))
- Velocity min: −Velocity max
- Selection pressure (β): 8
- Clustering validity index: CS

Figure 4 shows the assignment of data to each cluster by the HPG algorithm, based on the confusion matrix criterion.

Cluster Cohesion and Separation: cohesion is a measure of the similarity or proximity (in terms of the distance criterion) between the members of a cluster, and the aim of the cost function is to reduce this amount; separation, on the other hand, represents the discrimination, that is, the lowest similarity, between different clusters. In the CS cost function of Equation (7), the numerator is the cohesion criterion and the denominator calculates the separation criterion. Table 2 reports the average of these two measures and the CS cost function value for the Automatic Clustering GA and PSO algorithms and for the HPG algorithm. The NFE measure in Table 2 has been calculated at the point where each algorithm reaches stability (for example, 30 or 40 runs). The results show that the proposed HPG algorithm performs much better than the other optimization methods on the X_1 dataset and achieves very good results. Figures 2 and 3 and Table 2 show the results of the execution and comparison of the K-means optimizations based on GA and PSO and the proposed K-means optimization based on HPG with CS.

Figure 2: Clustering of the X_1 dataset by the HPG, PSO, and GA algorithms, respectively
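How the cohesion and separation columns of Tables 2 and 3 relate to the CS value, that is, the numerator and denominator of Equation (7), can be illustrated on a toy clustering (all coordinates are made-up):

```python
import math

def euclid(a, b):
    return math.dist(a, b)

# Toy clustering: two clusters with known centers (illustrative only)
clusters = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
            [(8.0, 8.0), (9.0, 8.0)]]
centers = [(1 / 3, 1 / 3), (8.5, 8.0)]

# Cohesion: mean (over clusters) of the average farthest-neighbor distance,
# i.e. the per-cluster terms of the numerator of Equation (7)
cohesion = sum(
    sum(max(euclid(p, q) for q in c) for p in c) / len(c) for c in clusters
) / len(clusters)

# Separation: mean (over centers) of the distance to the nearest other
# center, i.e. the per-cluster terms of the denominator of Equation (7)
separation = sum(
    min(euclid(m, m2) for j, m2 in enumerate(centers) if j != i)
    for i, m in enumerate(centers)
) / len(centers)

cs = cohesion / separation   # smaller CS = tighter, better-separated clusters
```

Since the 1/K factors in Equation (7) cancel, the CS value is exactly the cohesion divided by the separation, which is why Tables 2 and 3 can report all three quantities side by side.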
Fgure : Clusterng of X dataset respectvely by algorthms HPG, PSO and GA Internatonal Journal of Scentfc Study July 07 Vol 5 Issue 4 38

Table 2: Final output of the algorithms on X_1 (mean of 40 independent runs) in terms of the CS cost function and the average NFE

Algorithm | Cohesion | Separation | Clusters found | Mean NFE required | CS measure
K-means opt. based on HPG | 3.0685 | 8.35 | 5 | 400 | 0.3674
K-means opt. based on PSO | 5.646 | 9.4773 | 4 | 6600 | 0.5449
K-means opt. based on GA | 4.0898 | 7.0 | 3 | 4600 | 0.575

Table 3: Final output of the algorithms on X_2 (mean of 40 independent runs) in terms of CS and the average NFE

Algorithm | Cohesion | Separation | Clusters found | Mean NFE required | CS measure
K-means opt. based on HPG | .53 | .5 | 3 | 050 | 0.6036
K-means opt. based on PSO | .087 | .94 | 4 | 0400 | 0.5577
K-means opt. based on GA | . | .8 | 4 | 3960 | 0.563

Figure 3: The CS cost function value per 100 cycles for the HPG, PSO, and GA algorithms

Figures 5 and 6 show the tendency toward mere optimization, trying only to minimize the cost function, in the purely optimizing evolutionary algorithms. The HPG algorithm, unlike the two previous methods, does not only try to optimize and reduce the cost function but also stops once the best clustering has been found. Figure 7 shows the assignment of data to each cluster based on the confusion matrix by the HPG algorithm. Table 3 compares the K-means optimizations based on GA and PSO with the proposed K-means optimization based on HPG with CS in terms of the criteria cohesion, separation, CS, and NFE.

CONCLUSIONS

In this paper, optimization of K-means based on the HPG algorithm was presented. The basic idea of the proposed approach is the combination of evolutionary and swarm intelligence algorithms to identify the primary cluster centers, using the CS clustering validity index to activate the cluster centers. In the HPG algorithm, at first the optimal

Figure 4: Confusion matrix calculation for the clustering output of the HPG algorithm on the X_1 dataset

Figure 5: Clustering of the X_2 dataset by the HPG, PSO, and GA algorithms, respectively

Figure 6: The CS cost function value per 100 cycles for the HPG, PSO, and GA algorithms

Figure 7: Calculation of the confusion matrix for the HPG clustering output on the X_2 dataset

centers of the clustering are found effectively, based on the CS cost function output for each particle population; finally, the most optimal solution, with the positions of the cluster centers, is sent to K-means. This solves the problem of random initialization and of fast convergence toward local optima in the K-means algorithm and other partition-based algorithms. The combination of the PSO and GA algorithms lets the HPG algorithm make use both of optimization and of the connections between particles; in addition, by applying the velocity parameter to the population created by the GA algorithm, the search range and the final answer of the algorithm are improved. The results on linear and non-linear crisp datasets show that the HPG algorithm converges at an appropriate speed to the global optimum and, in contrast to K-means optimization based on PSO or GA alone, is not trapped in local optima. Moreover, the comparison of HPG with these methods showed high accuracy in finding the cluster centers, a much more optimal CS cost function value, and faster execution of the algorithm according to the NFE measure. On the other hand, the traditional K-means algorithm combined with HPG clustering is able to cluster complex, large, and non-linear datasets with high precision and speed.

In future work, the main purpose will be the extension of the HPG algorithm into a comprehensive framework for data clustering; hence, HPG will be extended with pre-processing steps for outlier detection and dimensionality reduction on high-dimensional datasets. In addition, more accurate clustering validity indices, or combinations of clustering validity indices, will be used as the cost function.

REFERENCES

1. Lin, Y., et al. K-means optimization clustering algorithm based on particle swarm optimization and multiclass merging. In: Advances in Computer Science and Information Engineering. Springer, 2012, pp. 569-578.
2. Krishna, K. and Murty, M.N. Genetic K-means algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 1999, 29(3): 433-439.
3. Lu, Y., et al. FGKA: A fast genetic K-means clustering algorithm. In: Proceedings of the 2004 ACM Symposium on Applied Computing. ACM, 2004.
4. Lu, Y., et al. Incremental genetic K-means algorithm and its application in gene expression data analysis. BMC Bioinformatics, 2004, 5(1).
5. Das, S., Abraham, A., and Konar, A. Automatic clustering using an improved differential evolution algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2008, 38(1): 218-237.
6. Hartigan, J.A. and Wong, M.A. Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 1979, 28(1): 100-108.
7. Shi, Y. Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 Congress on Evolutionary Computation. IEEE, 2001.
8. Chatterjee, A. and Siarry, P. Nonlinear inertia weight variation for dynamic adaptation in particle swarm optimization. Computers & Operations Research, 2006, 33(3): 859-871.
9. Whitley, D. A genetic algorithm tutorial. Statistics and Computing, 1994, 4(2): 65-85.
10. Chou, C.-H., Su, M.-C., and Lai, E. A new cluster validity measure and its application to image compression. Pattern Analysis and Applications, 2004, 7(2): 205-220.