Genetic-PSO Fuzzy Data Mining With Divide and Conquer Strategy

Amin Jourabloo
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
E-mail: jourabloo@ce.sharif.edu

Abstract - The discovery of association rules is an important and actively debated area of data mining research. These rules describe noteworthy association relationships among different attributes. Most studies have focused on binary-valued transaction data, whereas data in real-world applications usually consist of quantitative values. With that in mind, this paper proposes a fuzzy data mining algorithm for extracting membership functions from quantitative transactions. It is a hybrid genetic-PSO algorithm that finds membership functions suitable for mining problems through a close cooperation of GA and PSO. The algorithm keeps the two techniques integrated over the entire simulation run: in each iteration, one part of the population is replaced by new individuals generated by GA, while the other part is carried over from the previous generation and moved through the solution space by PSO. At the end, the best final sets of membership functions in all the populations are gathered and used for mining fuzzy association rules. According to the experimental results, the proposed genetic-PSO fuzzy data mining algorithm has a good effect on the fitness of the membership functions.

Keywords: data mining, fuzzy sets, genetic algorithms (GA), Particle Swarm Optimization (PSO), membership functions.

1 Introduction

An important area of data mining research deals with the discovery of association rules, which describe interesting association relationships among different attributes []. Association rule techniques are generally applied to databases of transactions where each transaction consists of a set of items [6]. Let us consider a database of customer transactions T, where each transaction is a set of items.
The objective is to find all rules of the form X => Y, which correlate the presence of one set of items X with another set of items Y. An example of such a rule is: 98% of people who purchase diapers and baby food also buy baby soap [].

Most previous studies have focused on binary-valued transaction data. Transaction data in real-world applications, however, usually consist of quantitative values. Designing a sophisticated data mining algorithm able to deal with various types of data presents a challenge to workers in this research field. Recently, fuzzy set theory has been used more and more frequently in intelligent systems because of its simplicity and similarity to human reasoning. The theory has been applied in fields such as manufacturing, engineering, diagnosis, and economics, among others. Several fuzzy learning algorithms for inducing rules from given sets of data have been designed and used to good effect in specific domains [7].

Evolutionary computing (EC) is an exciting development in mining algorithms. It amounts to building, applying, and studying algorithms based on Darwinian principles of natural selection [1, 2]. Genetic algorithms (GAs) are a family of computational models developed by Holland [3, 4]. GAs and Particle Swarm Optimization (PSO) are both population-based algorithms that have proven successful in solving a variety of difficult problems. However, both models have strengths and weaknesses. Eberhart and Angeline each compared GAs with PSO, and both concluded that a hybrid of the standard GA and PSO models could lead to further advances [9, 10].

Several hybrid GA-PSO algorithms have been designed. Settles and Soule [8] proposed an algorithm that combines the standard velocity and position update rules of PSO with the ideas of selection, crossover, and mutation from GAs. The algorithm is designed so that the GA facilitates a global search and the PSO performs a local search [4]. In 2006, Esmin et al. [11] proposed a new model called Hybrid Particle Swarm Optimizer with Mutation (HPSOM), which integrates the mutation process often used in GA into PSO. This process allows the search to escape from local optima and to explore different zones of the search space.

This paper thus proposes a fuzzy data mining algorithm for extracting both association rules and membership functions from quantitative transactions. The proposed hybrid genetic-PSO algorithm for finding membership functions suitable for mining problems relies on a close cooperation of GA and PSO, since it keeps the two techniques integrated for the entire simulation run. In each iteration, some of the individuals are replaced by new ones generated by GA, while the remaining ones are carried over from the previous generation and moved through the solution space by PSO. Of the two algorithms, PSO usually converges faster than GA at first, but it is often outperformed by GA over long simulation runs.
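To make the idea of fuzzy regions over quantitative values concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes triangular membership functions and takes the fuzzy support of a region to be the average membership degree over all transactions, which is a common convention in genetic-fuzzy mining. The function names, the region parameters in `REGIONS`, and the six milk quantities are all hypothetical values chosen for illustration.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangle with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical membership functions for one item: (left foot, peak, right foot).
REGIONS = {"Low": (0, 3, 6), "Middle": (3, 6, 9), "High": (6, 9, 12)}


def fuzzy_support(quantities, region):
    """Average membership degree of the purchased quantities in one fuzzy region."""
    a, b, c = REGIONS[region]
    return sum(triangular(q, a, b, c) for q in quantities) / len(quantities)


# Six made-up transactions, each recording a purchased quantity of one item.
milk = [2, 5, 7, 3, 9, 6]
supports = {name: fuzzy_support(milk, name) for name in REGIONS}

for name, value in supports.items():
    print(f"Milk.{name}: fuzzy support = {value:.3f}")
```

A region whose fuzzy support reaches the support threshold α counts as a large 1-itemset. With these made-up numbers, all three regions would pass a threshold of 0.04. A genetic or swarm search then adjusts the triples in `REGIONS` rather than the data.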

2 GA-PSO Mining framework based on the divide-and-conquer strategy

In this section, the fuzzy and GA-PSO concepts are used to discover both useful association rules and suitable membership functions from quantitative values. A GA-PSO framework with the divide-and-conquer strategy is proposed for searching for membership functions suitable for the mining problems. The final best sets of membership functions in all the populations are then gathered together to be used for mining fuzzy association rules. The proposed framework is shown in Fig. 1.

Fig. 1. The proposed GA-PSO flowchart for fuzzy data mining.

The framework in Fig. 1 is divided into two phases: mining membership functions and mining fuzzy association rules. Assume the number of items is m. In the phase of mining membership functions, the framework maintains m populations of membership functions, one population for each item. Each chromosome in a population represents a possible set of membership functions for that item. The chromosomes in the same population are of the same length. The proposed mechanism then chooses appropriate strings for mating, gradually creating good offspring sets of membership functions. The offspring sets of membership functions undergo recursive evolution until a good set of membership functions has been obtained. Next, in the phase of mining fuzzy association rules, the sets of membership functions for all the items are gathered together and used to mine the interesting fuzzy association rules from the given quantitative database.

2.1 Genetic Algorithm

Genetic algorithms were first introduced by Holland in the early 1970s [3] and have been widely successful in optimization problems. The genetic operators used in this paper are described in [7] and [17]. In the representation step, each set of membership functions is encoded as a chromosome and handled as an individual with a real-number schema. Genetic operators are very important to the success of a specific GA application. The crossover and mutation operators chosen in this paper are the max-min-arithmetical (MMA) crossover proposed in [17] and the one-point mutation proposed in [7].

2.1.1 Fitness

The fitness value of each set of membership functions is determined according to two factors: the suitability of the membership functions and the fuzzy supports of the large 1-itemsets, both described in [7]. The fitness value of a chromosome C_q is defined as

[Equation (1) is missing from the source text]    (1)

where L_1 is the set of large 1-itemsets obtained by using the set of membership functions in C_q.

2.2 PSO

The Particle Swarm Optimization (PSO) algorithm is a new optimization algorithm inspired by social behavior in nature. Like genetic algorithms, PSO is a population-based optimization method that searches multiple solutions in parallel. However, PSO employs a cooperative strategy, unlike GA, which uses a competitive strategy. During each generation, each particle is accelerated toward its own previous best position and toward the global best position. At each iteration, a new velocity value for each particle is calculated based on its current velocity, the distance from its previous best position, and the distance from the global best position. The new velocity value is then used to calculate the next position of the particle in the search space. This process is iterated a set number of times or until a minimum error is achieved. In the inertia version of the algorithm, an inertia weight, reduced linearly each generation, is multiplied by the current velocity, and the other two components are weighted randomly to produce a new velocity value for the particle. This in turn affects the next position of the particle during the next generation. Thus, the governing equations are:

$$V_i(t) = \omega V_i(t-1) + c_1 \phi_1 \,(P_i - X_i(t-1)) + c_2 \phi_2 \,(P_g - X_i(t-1))$$
$$X_i(t) = X_i(t-1) + V_i(t) \qquad (2)$$

where, in this paper, t is the generation number, X_i is particle i's membership function, V_i is particle i's velocity, P_i is particle i's previous best membership function, and P_g is the global best particle's membership function. The parameter ω is the inertia weight, c_1 and c_2 are social parameters, and ϕ_1 and ϕ_2 are random numbers in the range [0.0, 1.0].

2.3 GA-PSO

GA and PSO are very similar in their inherent parallel characteristics, whereas experiments show that each has specific advantages when solving different problems. The aim here is to exploit the strengths of both by synthesizing the two algorithms.

3 The proposed mining algorithm

The input is a body of n quantitative transaction data, a set of m items, each with a number of predefined linguistic terms, a support threshold α, and a population size P. The output is a set of membership functions for extracting association rules. The algorithm proceeds as follows.

Step 1: Randomly generate m populations, one for each item. Each individual in a population represents a possible set of membership functions for that item, and each set of membership functions is encoded into a string representation.
Step 2: Calculate the fitness value of each chromosome in each population.
Step 3: Randomly divide each population into two parts.
Step 4: Execute the GA on part one. Use the selection operation to choose membership functions; any selection operation, such as the elitism selection strategy or the roulette selection strategy, may be used here. Then execute crossover operations followed by mutation operations in each population.
Step 5: Execute PSO on part two. Find the best membership function in each population, update the global best membership function, and then update the velocities and the membership functions.
Step 6: Merge the two parts into a new population and calculate the fitness value of each chromosome in each population.
Step 7: If the termination criterion is not satisfied, go back to Step 3 with the new population. Otherwise, gather the sets of membership functions, each of which has the highest fitness value in its population.

Note that the termination criterion may be the number of iterations, the allowed execution time, or the convergence of the fitness values. The proposed mining algorithm is illustrated in Fig. 1.

3.1 An Example

In this section, an example is given to show the advantage of the proposed mining algorithm. This simple example shows how the proposed algorithm can be used to mine membership functions from quantitative data. Assume that the transaction database contains one item, Milk, and that the data set includes six transactions. Assume the item has three fuzzy regions: Low, Middle, and High. Thus, three fuzzy membership functions must be derived for the item. The population is randomly generated. Assume the population size is ten in this example. To compare the proposed algorithm with GA and PSO, the best fitness of each generation for GA, PSO, and GA-PSO is shown in Fig. 2.

Fig. 2. Comparison of the proposed algorithm with GA and PSO.

For statistical analysis of the proposed mining algorithm, the results of 32 runs of GA and of GA-PSO (minimum, maximum, and average) are shown in Fig. 3 and Fig. 4, respectively.

Fig. 3. Statistical analysis for 32 runs of GA.

4 Experimental Results

In this section, experiments conducted to show the performance of the proposed approach are described. They were implemented in Matlab on an Intel Core 2 personal computer with 2.00 GHz and GB RAM. The initial population size P was set at 0, the crossover rate Pc was set at 0.8, and the mutation rate Pm was set at 0.0 according to [8]. The parameter d of the crossover operator was set at 0.3 according to [7], the parameter of the mutation operator was set at 3, the minimum support α was set at 0.04 (4%), the inertia weight ω was set at 0.86, and the social parameters c_1 and c_2 were set at 0.2.
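The hybrid procedure of Section 3 (split the population, evolve one part with GA, move the other part with PSO, then merge) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the author's Matlab code. The `fitness` function is a stand-in objective rather than the fuzzy-support-based fitness of [7], genes are plain real numbers instead of membership-function parameters, truncation selection stands in for elitism or roulette selection, and the mutation step size and rate are invented. `OMEGA`, `PC`, and `D` reuse values as printed in the text, while `C1` and `C2` should be read as placeholders. The MMA crossover follows the usual description of the operator (four candidates, keep the best two).

```python
import random

OMEGA, C1, C2 = 0.86, 0.2, 0.2  # inertia weight as printed; C1, C2 are placeholders
PC, D = 0.8, 0.3                # crossover rate and MMA parameter d as printed


def fitness(x):
    # Stand-in objective (NOT the paper's fitness): equals 1.0 when every gene is 0.5.
    return 1.0 / (1.0 + sum((g - 0.5) ** 2 for g in x))


def mma_crossover(p, q, d=D):
    # Max-min-arithmetical crossover: build four candidates, keep the two fittest.
    candidates = [
        [d * a + (1 - d) * b for a, b in zip(p, q)],
        [d * b + (1 - d) * a for a, b in zip(p, q)],
        [min(a, b) for a, b in zip(p, q)],
        [max(a, b) for a, b in zip(p, q)],
    ]
    return sorted(candidates, key=fitness, reverse=True)[:2]


def ga_step(part, rng, pm=0.05):
    # Truncation selection (an assumption), then crossover and one-gene mutation.
    ranked = sorted(part, key=lambda ind: fitness(ind["x"]), reverse=True)
    pool = ranked[: max(2, len(ranked) // 2)]
    children = []
    while len(children) < len(part):
        p, q = rng.sample(pool, 2)
        if rng.random() < PC:
            kids = mma_crossover(p["x"], q["x"])
        else:
            kids = [p["x"][:], q["x"][:]]
        for k in kids:
            if rng.random() < pm:
                i = rng.randrange(len(k))
                k[i] = min(1.0, max(0.0, k[i] + rng.uniform(-0.1, 0.1)))
            children.append({"x": k, "v": [0.0] * len(k), "best": k[:]})
    return children[: len(part)]


def pso_step(part, gbest, rng):
    # Velocity and position update of Equation (2), applied gene by gene.
    for ind in part:
        for i in range(len(ind["x"])):
            ind["v"][i] = (OMEGA * ind["v"][i]
                           + C1 * rng.random() * (ind["best"][i] - ind["x"][i])
                           + C2 * rng.random() * (gbest[i] - ind["x"][i]))
            ind["x"][i] += ind["v"][i]
        if fitness(ind["x"]) > fitness(ind["best"]):
            ind["best"] = ind["x"][:]
    return part


def run(pop_size=10, dim=3, generations=30, seed=0):
    rng = random.Random(seed)
    pop = []
    for _ in range(pop_size):
        x = [rng.random() for _ in range(dim)]
        pop.append({"x": x, "v": [0.0] * dim, "best": x[:]})
    gbest = max((ind["best"] for ind in pop), key=fitness)[:]
    history = []
    for _ in range(generations):
        rng.shuffle(pop)                      # random division into two parts
        half = pop_size // 2
        pop = ga_step(pop[:half], rng) + pso_step(pop[half:], gbest, rng)
        candidate = max((ind["best"] for ind in pop), key=fitness)
        if fitness(candidate) > fitness(gbest):
            gbest = candidate[:]
        history.append(fitness(gbest))        # best-so-far fitness per generation
    return gbest, history
```

Calling `run()` returns the best individual found and the best-so-far fitness per generation, which is the kind of curve plotted in Fig. 2. In the full framework one such search would run per item, in line with the divide-and-conquer strategy.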

Simulated datasets with 3 items and with dataset sizes of up to 20 k transactions were used in the experiments. Table 1 gives the numerical results of running GA, PSO, and GA-PSO on databases of different sizes up to 20 k; the results show that GA-PSO has the best performance. The relationship between the average fitness and the database size is shown in Fig. 5, and the relationship between the execution time and the database size is shown in Fig. 6.

Fig. 4. Statistical analysis for 32 runs of GA-PSO.

Table 1
Algorithm (size) | Generation | Fitness1 | Fitness2 | Fitness3 | Average Fitness | Time (minute)
GA( KB) 0.40 0.7 0.49 0.4
PSO( KB) 0.70 3
GA-PSO( KB) 0.98 0.90 0.8 0.9 4
GA(0 KB) 0.30 0.8 0.39 0.42 4
PSO(0 KB) 0.87
GA-PSO(0 KB) 0.9 0.9 0.96 7
GA( KB) 0.07 0.3 0.46 0.28 2
PSO( KB) 0.47 0.82 0.93 8
GA-PSO( KB) 0.9 0.94 0.94 32
GA(20 KB) 0.2 0.0 0.4 0.49 24
PSO(20 KB) 0.79 0.86 0.8 0.9 0.9 0.93
GA-PSO(20 KB)

Fig. 5. The relationship between the average fitness and the database size.

Fig. 6. The relationship between the execution time and the database size.

5 Conclusions

In this paper, we have proposed a GA-PSO fuzzy data mining algorithm for extracting both association rules and membership functions from quantitative transactions. The experimental results have shown that the proposed genetic-PSO fuzzy mining algorithm has a good effect on the fitness of the membership functions. In the future, we will continue to enhance the GA-PSO based mining framework for more complex problems.

6 References

[1] A.E. Eiben, M. Schoenauer, Evolutionary computing, Inform. Process. Lett. 82 (1) (2002) 1-6.
[2] A.E. Eiben, J.E. Smith, Introduction to Evolutionary Computing, Springer, Berlin, 2003.
[3] J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, MI, 1975.
[4] D.E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley, Reading, MA, 1989.
[5] Sushmita Mitra, Tinku Acharya, Data Mining: Multimedia, Soft Computing, and Bioinformatics, John Wiley & Sons, Inc., 2003 (pages 64 & 267).
[6] Olivia Parr Rud, Data Mining Cookbook: Modeling Data for Marketing, Risk, and Customer Relationship Management, 2001, page 3.
[7] Tzung-Pei Hong, Chun-Hao Chen, Yeong-Chyi Lee, Yu-Lung Wu, Genetic-Fuzzy Data Mining With Divide-and-Conquer Strategy, IEEE Transactions on Evolutionary Computation, vol. 12, no. 2, April 2008.
[8] Matthew Settles, Terence Soule, Breeding Swarms: A GA-PSO Hybrid, GECCO, ACM, 2005, 1-59593-010.
[9] R. Eberhart and Y. Shi, Comparison between genetic algorithms and particle swarm optimization, in V. William Porto et al., editors, Evolutionary Programming, volume 1447 of Lecture Notes in Computer Science, pages 611-616, Springer, 1998.
[10] P. Angeline, Evolutionary optimization versus particle swarm optimization: Philosophy and performance differences, in V. W. Porto et al., editors, Evolutionary Programming, volume 1447 of Lecture Notes in Computer Science, pages 601-610, Springer, 1998.
[11] A. A. A. Esmin, G. Lambert-Torres, G. B. Alvarenga, Hybrid Evolutionary Algorithm Based on PSO and GA Mutation, Proceedings of the Sixth International Conference on Hybrid Intelligent Systems (HIS'06), 0-7695-2662-4/06, IEEE, 2006.
[12] X.H. Shi, Y.C. Liang, H.P. Lee, C. Lu, L.M. Wang, An improved GA and a novel PSO-GA-based hybrid algorithm, Information Processing Letters 93 (2005) 255-261.
[13] O. Cordón, F. Herrera, and P. Villar, Generating the knowledge base of a fuzzy rule-based system by the genetic learning of the data base, IEEE Trans. Fuzzy Systems, vol. 9, no. 4, 2001.
[14] A. Parodi and P. Bonelli, A new approach of fuzzy classifier systems, in Proc. 5th Int. Conf. Genetic Algorithms, 1993, pp. 223-230.
[15] C. H. Wang, T. P. Hong, and S. S. Tseng, Integrating fuzzy knowledge by genetic algorithms, IEEE Trans. Evol. Comput., vol. 2, no. 4, pp. 138-149, 1998.
[16] C. H. Wang, T. P. Hong, and S. S. Tseng, Integrating membership functions and fuzzy rule sets from multiple knowledge sources, Fuzzy Sets Syst., vol. 112, pp. 141-154, 2000.
[17] F. Herrera, M. Lozano, and J. L. Verdegay, Fuzzy connectives based crossover operators to model genetic algorithms population diversity, Fuzzy Sets Syst., vol. 92, no. 1, pp. 21-30, 1997.
[18] M. Srinivas and L. M. Patnaik, Genetic algorithms: A survey, Computer, vol. 27, no. 6, pp. 17-26, 1994.