A PSO-based Generic Classifier Design and Weka Implementation Study

Similar documents
Open Access Research on the Prediction Model of Material Cost Based on Data Mining

Study on GA-based matching method of railway vehicle wheels

Argha Roy* Dept. of CSE Netaji Subhash Engg. College West Bengal, India.

Index Terms PSO, parallel computing, clustering, multiprocessor.

Chapter 8 The C 4.5*stat algorithm

Introduction to Artificial Intelligence

Modified Particle Swarm Optimization

PARALLEL PARTICLE SWARM OPTIMIZATION IN DATA CLUSTERING

Discrete Particle Swarm Optimization With Local Search Strategy for Rule Classification

Improving Tree-Based Classification Rules Using a Particle Swarm Optimization

A Network Intrusion Detection System Architecture Based on Snort and. Computational Intelligence

Traffic Signal Control Based On Fuzzy Artificial Neural Networks With Particle Swarm Optimization

Particle Swarm Optimization applied to Pattern Recognition

CHAPTER 5 OPTIMAL CLUSTER-BASED RETRIEVAL

Performance Analysis of Data Mining Classification Techniques

Comparative Study of Clustering Algorithms using R

An improved PID neural network controller for long time delay systems using particle swarm optimization algorithm

Feature weighting using particle swarm optimization for learning vector quantization classifier

A Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data

A Classifier with the Function-based Decision Tree

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-"&"3 -"(' ( +-" " " % '.+ % ' -0(+$,

Using CODEQ to Train Feed-forward Neural Networks

Handling Multi Objectives of with Multi Objective Dynamic Particle Swarm Optimization

PSOk-NN: A Particle Swarm Optimization Approach to Optimize k-nearest Neighbor Classifier

An Optimization of Association Rule Mining Algorithm using Weighted Quantum behaved PSO

Meta- Heuristic based Optimization Algorithms: A Comparative Study of Genetic Algorithm and Particle Swarm Optimization

The Application Research of Neural Network in Embedded Intelligent Detection

Dynamic Balance Design of the Rotating Arc Sensor Based on PSO Algorithm

A New Meta-heuristic Bat Inspired Classification Approach for Microarray Data

Using a genetic algorithm for editing k-nearest neighbor classifiers

Discrete Particle Swarm Optimization for TSP based on Neighborhood

Image Edge Detection Based on Cellular Neural Network and Particle Swarm Optimization

A HYBRID ALGORITHM BASED ON PARTICLE SWARM OPTIMIZATION

LEARNING WEIGHTS OF FUZZY RULES BY USING GRAVITATIONAL SEARCH ALGORITHM

A Branch and Bound-PSO Hybrid Algorithm for Solving Integer Separable Concave Programming Problems 1

Research Article An Improved Kernel Based Extreme Learning Machine for Robot Execution Failures

Feature Selection Algorithm with Discretization and PSO Search Methods for Continuous Attributes

Novel Initialisation and Updating Mechanisms in PSO for Feature Selection in Classification

A Naïve Soft Computing based Approach for Gene Expression Data Analysis

FUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP

Takagi-Sugeno Fuzzy System Accuracy Improvement with A Two Stage Tuning

A Hybrid Fireworks Optimization Method with Differential Evolution Operators

IMPROVING THE PARTICLE SWARM OPTIMIZATION ALGORITHM USING THE SIMPLEX METHOD AT LATE STAGE

Application of Improved Discrete Particle Swarm Optimization in Logistics Distribution Routing Problem

Landslide Monitoring Point Optimization. Deployment Based on Fuzzy Cluster Analysis.

A NEW APPROACH TO SOLVE ECONOMIC LOAD DISPATCH USING PARTICLE SWARM OPTIMIZATION

Parameter Selection of a Support Vector Machine, Based on a Chaotic Particle Swarm Optimization Algorithm

QUANTUM BASED PSO TECHNIQUE FOR IMAGE SEGMENTATION

KTH ROYAL INSTITUTE OF TECHNOLOGY. Lecture 14 Machine Learning. K-means, knn

An Optimization Algorithm of Selecting Initial Clustering Center in K means

Robust Descriptive Statistics Based PSO Algorithm for Image Segmentation

Knowledge Discovery using PSO and DE Techniques

Hybrid PSO-SA algorithm for training a Neural Network for Classification

Hamming Distance based Binary PSO for Feature Selection and Classification from high dimensional Gene Expression Data

A Modified PSO Technique for the Coordination Problem in Presence of DG

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART B: CYBERNETICS /$ IEEE

Inertia Weight. v i = ωv i +φ 1 R(0,1)(p i x i )+φ 2 R(0,1)(p g x i ) The new velocity update equation:


A Modified Particle Swarm Optimization with Neural Network via Euclidean Distance

Particle Swarm Optimization

Particle Swarm Optimization

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a

Comparison of Some Evolutionary Algorithms for Approximate Solutions of Optimal Control Problems

High-speed Interconnect Simulation Using Particle Swarm Optimization

A HYBRID APPROACH FOR DATA CLUSTERING USING DATA MINING TECHNIQUES

A Kind of Wireless Sensor Network Coverage Optimization Algorithm Based on Genetic PSO

Cell-to-switch assignment in. cellular networks. barebones particle swarm optimization

Tact Optimization Algorithm of LED Bulb Production Line Based on CEPSO

A Novel Probabilistic-PSO Based Learning Algorithm for Optimization of Neural Networks for Benchmark Problems

Improving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm

Simplifying Handwritten Characters Recognition Using a Particle Swarm Optimization Approach

Seismic regionalization based on an artificial neural network

THREE PHASE FAULT DIAGNOSIS BASED ON RBF NEURAL NETWORK OPTIMIZED BY PSO ALGORITHM

Machine Learning: Algorithms and Applications Mockup Examination

An ICA-Based Multivariate Discretization Algorithm

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Small World Network Based Dynamic Topology for Particle Swarm Optimization

Clustering Analysis for Malicious Network Traffic

FITTING PIECEWISE LINEAR FUNCTIONS USING PARTICLE SWARM OPTIMIZATION

Intuitionistic Fuzzy Petri Nets for Knowledge Representation and Reasoning

Research on Quality Reliability of Rolling Bearings by Multi - Weight Method (Part Ⅱ: Experiment)

Wild Mushrooms Classification Edible or Poisonous

Data Mining With Weka A Short Tutorial

A Maximal Margin Classification Algorithm Based on Data Field

Simulation of Back Propagation Neural Network for Iris Flower Classification

A Lazy Approach for Machine Learning Algorithms

Design of student information system based on association algorithm and data mining technology. CaiYan, ChenHua

Reconfiguration Optimization for Loss Reduction in Distribution Networks using Hybrid PSO algorithm and Fuzzy logic

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

Time Series Clustering Ensemble Algorithm Based on Locality Preserving Projection

DATA MINING INTRODUCTION TO CLASSIFICATION USING LINEAR CLASSIFIERS

Clustering of datasets using PSO-K-Means and PCA-K-means

Aero-engine PID parameters Optimization based on Adaptive Genetic Algorithm. Yinling Wang, Huacong Li

Citation for the original published paper (version of record):

Road Network Matching Method Based on Particle Swarm Optimization Algorithm

SwarmOps for Matlab. Numeric & Heuristic Optimization Source-Code Library for Matlab The Manual Revision 1.0

Performance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms

Tracking Changing Extrema with Particle Swarm Optimizer

Designing of Optimized Combinational Circuits Using Particle Swarm Optimization Algorithm

Tumor Detection and classification of Medical MRI UsingAdvance ROIPropANN Algorithm

Transcription:

International Forum on Mechanical, Control and Automation (IFMCA 16) A PSO-based Generic Classifier Design and Weka Implementation Study Hui HU1, a Xiaodong MAO1, b Qin XI1, c 1 School of Economics and Management, East China Jiaotong University, P.R.China a hh489@163.com, bmaoxd@ecjtu.jx.cn cxiqin@ecjtu.jx.cn Keywords: Classifier; PSO algorithm; Weka platform; instance-based learning. Abstract. Aiming at dataset with different feature attributes and multi nominal decision categories, a new design algorithm using PSO for instance-based learning classifier is put forward which can facilitate to make a better classification decision. On the basis of classes and interfaces in Weka platform, this improved algorithm is implemented and added in the platform. Taking two datasets as for test cases, not only is the classifier compared with optimizing different objective functions, but also is evaluated with other classifiers. Comparisons demonstrate that the proposed algorithm can get more effective classifying results and maybe considered a generic classifier. Introduction As an important analysis approach in Data Mining, classifying is to build a classifier or classification model which can be used to evaluate model and category prediction for test cases or new data instances[1]. There re a great many classifying algorithms [], such as Decision Tree, Bayes Network, Supported Vector Machine, Neural Network, Lay Learning and Statistical Learning etc. As for these algorithms, advantages and disadvantages exist simultaneously in them [3,4]. For instance, some classifying algorithm can gain a high computation speed at the expense of data type restriction and sensitive to noise data while other algorithm may get a more accurate prediction by longer modeling and terrible explanation. Among them, instance-based learning is considered to be a lazy learning algorithm, whose fundamental is to minimize the distance between identical category instances and maximize the distance between different category instances by taking an instance as a point in N-dimension space and a feature attribute as a dimension[]. Particle Swarm Optimization (abbreviated to PSO) is put forward to search global best value by tracking current optimum by J.Kennedy and R.Eberhart in 199 as a kind of swarm evolutionary algorithm[6]. Owing to its easy implementation, high accuracy and fast convergence, PSO is receiving more and more attention in academia and industry and widely applied in various fields. So is for data mining, like feature selection, data partition, classifying, optimization[7] etc. In this paper, in order to solve a classifying problem only for binary decision attribute, an improved PSO-based classifying algorithm is proposed to design a generic classifier for multiple attribute decision on the basis of different adaptive objective functions firstly. Then, the implementation of this algorithm in Weka platform is introduced and the classification results are compared and demonstrate the algorithm s effectiveness by testing two typical datasets. PSO-BASED CLASSIFYING ALGORITHM PSO Algorithm The principle of PSO algorithm can be depicted as followed. A solution of the optimal problem is considered to a particle in multi-dimension space and each particle has a corresponding adaptive value which is decided by objection function; flying at a certain speed in the searching space, a particle can adjust its velocity and position by tracking its current local best value and the populations global optimal value so that the optimal solution of the problem may be found. Classifier Design based on PSO Particle code representation is a prerequisite for PSO-based algorithm. In this study, an instance in dataset is considered to be a particle while feature attributes constitute of multi-dimension space except for decision attributes. For continuous feature attribute, the distance between particles is Copyright 17, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4./). 7

defined the adaptive value of a particle, which can be computed by Euclidean distance, Minkowsky distance etc. While for discrete feature attribute, the distance can be transformed into simple matching coefficient, Jaccard coefficient etc. On the basis of basic PSO algorithm described above, an improved PSO-based classifier s algorithm is proposed as below: Input: PSO algorithm basic parameters include maximum iteration number, maximum inertia weighted coefficient, minimum inertia weighted coefficient and learning factors; dataset to be classified, the number of datasets with different categories (assumed to be equal) etc. Output: Classified results of the dataset. The detailed procedures for this algorithm is as followed: Step1: According to different decision attribute, datasets with different categories are got from the given dataset. Step: Produce initial velocity matrix of particles. Step3: Dataset with different categories are sampled from the given datasets with different categories. Step4: A population dataset is composed of datasets with different categories according to the given number of datasets. Step: For different category dataset, compute each sample s adaptive value and the decision attribute s value. Step.1: Compute the distance between each instance in the dataset with different category and the distance between each instance in the population dataset. Step.: Compare the distance between each instance in the dataset with different category and population dataset. If it is minimal for some category and identical with the given category, classifying is correct and recorded. Step.3: Save each instance s accuracy and can be got through classifying correct number divided by total sample number. Step6: Set initial samples as the local best value for different category datasets. Select the maximum accuracy datasets with different categories as the global best values of the different categories. Step7: Compute the flying velocity of each instance with different category. Step8: Compute the position of each instance with different category and keep the category unchanged. Step9: Compute the local best position of each instance. Step1: Using equation (4) to compute the global best position of each instance. Step11: Check the termination condition. If not ended, return to Step7. Otherwise, go to next step1. Step1: Compute the global best value and product output. IMPLEMENTATION AND TEST ANALYSIS PSO-based Classifying Algorithm Implementation in Weka As an open-source platform for data mining, Weka covers a wide range of algorithms and tools in data preprocessing and data modeling and can fulfill various data mining tasks which include attribute selection, regression, classifying, clustering and associate rule etc. One advantage of Weka is that it s very convenient to improve existed algorithms or add new algorithms once following Weka s programming rules such as inheriting classes and interfaces. On the basis of the steps described above, this improved PSO-based classifying algorithm is implemented in Weka using Eclipse6. development software. This new classifying class which is set in weka.classifiers.functions inherits from abstract Classifier and implements OptionHandler, Observer interfaces and its main methods are designed as below: (1) PSOClassfier( ): defaulted constructed function 8

() BuildClassfier(Instances data): for classifying (3) void InitDataset(Instances data): initialize data from the given datasets (4) double Calculation_Fitness(Instances data,instances swarm): compute sampe data adaption value () double Calculate_GBest(Instances data): get the global best value (6) double Calculate_LBest(Instances data): get the local best value (7) double Calculate_Velocity_Nest(Instance data,gbest): get the particle s velocity according to global best value) (8) double Calculate_Position(Instance data,velocity): get the particle s position according to it s velocity. This new improved PSO-based classifier is implemented as Fig.1. Fig. 1 A PSO-based classifier implantation result in Weka Case Test Analysis using different objective functions In order to analyze the algorithm parameters influences on classifying effectiveness, classifying results are compared among different objective functions. The case test datasets are originated from reference[1] which is concerned with breast cancer diagnosis. There are 9 feature attributes and breast cancer class is decision attribute which has two values (one is and the other is ). We choose two hundred data cases as the test data. The number of tumor is 1 and so is for tumor. Other data is similar to the reference[7]. The fitness functions are set Euclidean distance, cosine distance, Minkowsky distance(cube root). The classification results are got using the PSO-based algorithm as Table 1 shows. It can be seen that different objective functions have distinct influences on the classification results which demonstrate that the best is Minkowsky distance function, the worst is cosine distance function and Euclidean distance function falls between these two functions. Table 1 Classification results base on different distance function Malign Objective function Breast cancer 1 Euclidean distance 9 98 Cosine distance 9 1 Minkowsky distance 4 96 Case Test for Multiple Decision Attributes The above test cases decision attribute has only two values. To further confirm this improved PSO-based classifier s performance, we select UCI dataset Iris to make a study. There re 3 values for decision attribute: setosa, versicolor and virginica. The parameters in our algorithm are set as followed. Maximum iteration number is set to ; maximum weighted inertia coefficient is.9; minimum weighted inertia coefficient is.1; learning factors c1, c are set to. 9

The classification results are compared with other algorithms such as J48 in Weka and BayesNet in Table. Table Classification results base on different classifier algorithms Classifier algorithm setosa versicolor virginica setosa 49 1 J48 Versicolor 47 3 Virginica 48 Setosa BayesNet Versicolor 44 6 Virginica 4 Setosa PSO-based Malign 48 virginica 3 47 As can be seen from Table, although different classifiers have different classification results, PSO-based classifier is the best one. CONCLUSIONS AND DISCUSSIONS Instance-based learning is one of important classifier design approaches in data mining. As a widely used evolutionary algorithm, PSO is receiving more and more attention in data mining processes. Aimed at classifier design for instance-based learning, an improved PSO-based algorithm is proposed to optimize multiple objective fitness functions in classification decision problem with multiple attribute values. Furthermore, implementation of this PSO-based generic classifier is briefly described in Weka by introducing main methods and screenshot result. Then comparison analysis is carried out using two test cases and results demonstrate that this PSO-based algorithm has a better performance compared to existing classifiers. What s more, it s necessary to point out that this algorithm is based on basic PSO in our paper. Though a great many improved PSO are put forward such as hybrid PSO, prey-escape PSO etc., basic PSO algorithm has the almost same classification results as those other improved PSO algorithms as long as suitable parameters are selected. Acknowledgements This work is supported PhD Start-up Fund of East China Jiaotong University 64411. References [1] Luan Li-Hua, Ji Gen-lin. Study on Decision Tree Classification Technology, Computer Engineering, 4,3(9): 94-96. [] Wang Miao, Chai Rui-min. An Improved feature selection method for Decision Tree, Computer Engineering and Application, 1,46(8):17-19. [3] Wang Hui-qing, Chen Jun-Jie and Hou Xiao-jing. Study on feature selection method on Decision Tree Classification, Journal of TaiYuan Technology College,11,4(8):7-19. [4] David W., Dennis Kibler, Markc and K. Albert. Instance-Based Learning Algorithms, Maching Learning, 199,6(1):37-66. [] Kennedy J, Eberhart R Eberhart. Particle Swarm Optimization, Proceeding of IEEE International Conference on Neural Networks, Perth, Western Australia, 199:194-1947. [6] A. Erdeljan, D. Capko, S. Vukmirovic, D. Bojanic, V. Congradac. Distributed PSO Algorithm for Data Model Partitioning in Power Distribution Systems, Journal of Applied Research and Technology, 14,1():947-97. 3

[7] Alejandro Cernantes, Ines marla Galvan, Pedro Isasi. AMPSO : A New Particle Swarm Method for Nearest Neighborhood Classification, IEEE Transactions on Systems, Man and Cybernetics- Part B: Cybernetics. 9,39():18-191. 31