Research Article, International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: X, Volume 7, Issue 6, June 2017


Artificial Neural Network in Classification: A Comparison

Dr. J. Jegathesh Amalraj*
Assistant Professor, Department of Computer Science, Thiruvalluvar University Constituent College, Cuddalore, Tamilnadu, India
DOI: 1.395/ijarcsse/V7I/19

S. Sivagowry
Research Scholar, Department of Computer Science, Bharathidasan University, Tamilnadu, India

Abstract: The most critical task in Data Mining is classification, together with the performance analysis needed to identify the most suitable classifier. Every data set must be classified in order to obtain knowledge from it, and the Artificial Neural Network tends to be among the best classifiers for this purpose. In this paper, two different ANNs, the Multilayer Perceptron Network (MLPN) and the Radial Basis Function Network (RBFN), are studied and compared through experimental results. It was observed that both networks perform well, but the RBFN computes its results in less time than the MLPN. Metrics such as accuracy, RAE, RRSE, RMSE, TPR, FPR and Kappa are used to analyse the performance of the networks, and the results obtained are tabulated accordingly.

Keywords: Artificial Neural Network, Classification, Data Mining, Multilayer Perceptron Network, Radial Basis Function Network

I. INTRODUCTION
Data Mining refers to extracting or mining knowledge from large amounts of data. Data collection and storage technology has made it possible for organizations to accumulate huge amounts of data at low cost. Exploiting this stored data in order to extract useful and actionable information is the overall goal of the generic activity termed Data Mining. Data Mining is an interdisciplinary subfield of computer science that involves the computational discovery of patterns in large data sets. The goal of this analysis process is to extract information from a data set and transform it into an understandable structure for further use. The methods used lie at the juncture of artificial intelligence, machine learning, statistics, database systems and business intelligence. Data Mining is about solving problems by analysing data already present in databases [1].

The purpose of a Data Mining effort is normally either to create a descriptive model or a predictive model. A descriptive model presents, in concise form, the main characteristics of the data set; it is essentially a summary of the data points, making it possible to study important aspects of the data set. Undirected Data Mining finds patterns in the data set but leaves the interpretation of the patterns to the data miner. The purpose of a predictive model is to allow the data miner to predict an unknown (often future) value of a specific variable, the target variable. If the target value is one of a predefined number of discrete (class) labels, the Data Mining task is called classification; if the target variable is a real number, the task is regression. The predictive model is thus created from given known values of variables, possibly including previous values of the target variable.

The paper is organized as follows: Section 2 explains the basics of the Artificial Neural Network, Sections 3 and 4 brief the Multilayer Perceptron Network (MLPN) and the Radial Basis Function Network (RBFN), Section 5 discusses the results obtained using the two networks along with graphs and tables, and Section 6 concludes the paper.
II. ARTIFICIAL NEURAL NETWORK
A neural network model belonging to the branch of artificial intelligence is generally referred to as an Artificial Neural Network (ANN). Instead of programming a computational system to carry out definite tasks, an ANN teaches the system to perform the task itself. To do so, an Artificial Intelligence (AI) system is built: a pragmatic model that can quickly and precisely find the patterns buried in data that represent useful knowledge, and neural networks are one instance of such AI models. AI systems must learn from data on a continual basis; in areas such as medical diagnosis, where relationships must be found across dissimilar data, AI techniques are among the most readily available tools. An artificial neural network is made up of many artificial neurons connected together according to an explicit network architecture. The objective of the neural network is to transform the inputs into meaningful outputs. The learning mode can be supervised or unsupervised, and neural networks can learn in the presence of noise.

Figure 1: Artificial Neural Network

Artificial Neural Networks consider classification one of their most dynamic research and application areas. The major difficulty in using an ANN is finding the most appropriate combination of training, learning and transfer functions for classifying data sets with a growing number of features and class labels. The different combinations of these functions and their effect when using the ANN as a classifier are studied here, and the correctness of these functions is analysed for various kinds of datasets.
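For concreteness, the following minimal Python sketch (not part of the original paper, which reports its experiments in Matlab) shows a single artificial neuron: it forms a weighted sum of its inputs plus a bias and passes the result through a sigmoid activation. The input and weight values are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation squashes the weighted sum into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # A single artificial neuron: weighted sum of inputs plus bias,
    # passed through a nonlinear activation function.
    return sigmoid(np.dot(weights, inputs) + bias)

# Illustrative values only (not taken from the paper).
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
b = 0.2
print(neuron(x, w, b))
```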

III. MULTILAYER PERCEPTRON NETWORK
The Multilayer Perceptron (MLP) is a neural network based on supervised learning; the network is trained using the back-propagation algorithm [2], which is the most popularly used neural network training algorithm. The feed-forward neural network, or Multilayer Perceptron, is the most widely studied network for classification purposes. The MLP uses a non-linear activation function, and its hidden neurons make the network flexible enough for highly complex tasks [3], [4]. The aim of the training process is to find the set of weight values that causes the output of the neural network to match the actual target values as closely as possible. Several issues are involved in designing and training a Multilayer Perceptron network:
- selecting how many hidden layers to use in the network;
- deciding how many neurons to use in each hidden layer;
- finding a globally optimal solution that avoids local minima;
- converging to an optimal solution in a reasonable period of time;
- validating the neural network to test for overfitting.
Figure 2 gives the architecture of the MLP network. One of the most important characteristics of a perceptron network is the number of neurons in the hidden layer(s). If an inadequate number of neurons is used, the network will be unable to model complex data and the resulting fit will be poor. If too many neurons are used, the training time may become excessively long and, worse, the network may overfit the data. When overfitting occurs, the network begins to model the random noise in the data: the model fits the training data extremely well but generalizes poorly to new, unseen data. Validation must be used to test for this.

Figure 2: Multilayer Perceptron Network
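The design issues above can be made concrete with a short, hedged sketch. The Python code below (scikit-learn is assumed here; the paper itself reports using Matlab) trains an MLP with a single hidden layer by back-propagation and compares training and validation accuracy, which is the usual way to check for the overfitting discussed above. The dataset, hidden-layer size and other settings are illustrative choices, not the paper's configuration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Any labelled dataset works here; iris is only a stand-in for the UCI data.
X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# One hidden layer with 10 neurons, trained by back-propagation;
# scaling the inputs first is standard practice for MLP training.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(10,), activation="relu",
                  max_iter=2000, random_state=42))
mlp.fit(X_train, y_train)

# A large gap between training and validation accuracy indicates overfitting.
print("training accuracy:  ", mlp.score(X_train, y_train))
print("validation accuracy:", mlp.score(X_val, y_val))
```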
IV. RADIAL BASIS FUNCTION NETWORK
The Radial Basis Function Network (RBFN) is an ANN that uses radial basis functions as its activation functions. Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a nonlinear RBF activation function, and a linear output layer. The input can be modelled as a vector of real numbers $x \in \mathbb{R}^n$. The output of the network is then a scalar function of the input vector, $\varphi : \mathbb{R}^n \to \mathbb{R}$, given by

$$\varphi(x) = \sum_{i=1}^{N} a_i \, \rho\left(\lVert x - c_i \rVert\right)$$

where $N$ is the number of neurons in the hidden layer, $c_i$ is the centre vector for neuron $i$, and $a_i$ is the weight of neuron $i$ in the linear output neuron. Functions that depend only on the distance from a centre vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. Figure 3 shows the architecture of the Radial Basis Function Network.

The main features of the RBFN are that it is a two-layered feed-forward network, its hidden nodes implement a set of radial basis functions, and its output nodes implement linear summation functions as in the MLP. The network training is divided into two stages: in the first stage, the weights from the input to the hidden layer are determined, and in the second stage, the weights from the hidden to the output layer are determined. These networks are very good at interpolation.

Figure 3: Architecture of the RBF network
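A minimal sketch of the two-stage RBFN training described above is given below in Python (an illustrative reconstruction, not the paper's Matlab implementation). Stage one chooses the hidden-layer centres c_i with k-means clustering; stage two solves for the output weights a_i by linear least squares, assuming a Gaussian radial function rho. The number of centres and the width gamma are arbitrary illustrative values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

def rbf_features(X, centres, gamma):
    # Gaussian radial basis: rho(||x - c||) = exp(-gamma * ||x - c||^2).
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

X, y = load_iris(return_X_y=True)
Y = np.eye(y.max() + 1)[y]          # one-hot targets for the linear output layer

# Stage 1: choose the hidden-layer centres (input-to-hidden "weights").
n_centres, gamma = 10, 1.0          # illustrative values
centres = KMeans(n_clusters=n_centres, n_init=10,
                 random_state=0).fit(X).cluster_centers_

# Stage 2: solve for the hidden-to-output weights a_i by linear least squares.
H = rbf_features(X, centres, gamma)
A, *_ = np.linalg.lstsq(H, Y, rcond=None)

# Predict by taking the class with the largest linear output.
pred = np.argmax(H @ A, axis=1)
print("training accuracy:", (pred == y).mean())
```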

V. RESULTS AND DISCUSSION
Two different data sets are taken from the UCI repository [5]. The instances and attributes of the data sets are listed in Table 1. The experimentation is carried out in Matlab. Parameters such as Accuracy, Kappa, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE), True Positive Rate (TPR) and False Positive Rate (FPR) are used to validate the performance.

Table 1: Datasets taken for experimentation
Dataset                  Instances   Attributes
Chronic Kidney Disease   400         25
Zoo data                 101         18
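For reference, the sketch below shows one common way these Weka-style measures can be computed in Python from predicted labels and class probabilities. The relative errors (RAE, RRSE) are taken against a baseline that always predicts the mean of the observed targets, which is an assumption rather than a detail given in the paper, and the example values are placeholders.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Illustrative binary ground truth, predicted labels and class-1 probabilities
# (placeholder values, not results from the paper).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.3, 0.6, 0.95, 0.2])

accuracy = (y_pred == y_true).mean()
kappa = cohen_kappa_score(y_true, y_pred)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)                     # True Positive Rate
fpr = fp / (fp + tn)                     # False Positive Rate

mae = np.abs(y_prob - y_true).mean()
rmse = np.sqrt(((y_prob - y_true) ** 2).mean())

# Relative errors compare against a baseline that always predicts the
# mean of the observed targets (one common convention, assumed here).
baseline = np.full_like(y_prob, y_true.mean())
rae = 100.0 * np.abs(y_prob - y_true).sum() / np.abs(baseline - y_true).sum()
rrse = 100.0 * np.sqrt(((y_prob - y_true) ** 2).sum()
                       / ((baseline - y_true) ** 2).sum())

print(accuracy, kappa, mae, rmse, rae, rrse, tpr, fpr)
```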

Table 2 describes the results obtained for the Zoo dataset. It was observed that both the MLPN and the RBFN provide the same result, but the time taken for computation is lower for the RBFN than for the MLPN. Figures 4, 5 and 6 depict the results obtained for the MLPN and the RBFN on the Zoo dataset, and Figure 7(a) & (b) shows the confusion matrix obtained in each case.

Table 2: Experimental Results for Zoo Dataset
Classifier   Time (sec)   Accuracy   Kappa   MAE   RMSE   RAE    RRSE   TPR   FPR
MLPN         1.51         9.1        .93     .     .7     7.75   .91    .9    .
RBFN         3.1          9.1        .93     .     .13    7.58   38.5   .9    .

Figure 4: Comparison of MLPN and RBFN, Time (seconds)
Figure 5: Comparison of MLPN and RBFN, Accuracy and RRSE
Figure 6: Comparison of MLPN and RBFN, Kappa, MAE, RMSE, RAE, TPR and FPR
Figure 7(a)

Figure 7(b)
Figure 7(a) & (b): Confusion Matrix for MLPN and RBFN

Table 3 describes the results obtained for the Chronic Kidney Disease dataset. It was observed that the RBFN performs better than the MLPN: its accuracy is higher, its computational time is much lower, and all the other parametric values also favour the RBFN for classification. Figures 8, 9 and 10 depict the results obtained for the MLPN and the RBFN on the Chronic Kidney Disease dataset, and Figure 11(a) & (b) shows the confusion matrix obtained in each case.

Table 3: Experimental Results for Chronic Kidney Disease Dataset
Classifier   Time (sec)   Accuracy   Kappa   MAE   RMSE   RAE    RRSE    TPR   FPR
MLPN         8.89         97.        .9      .     .11    5.     .1      .97   .3
RBFN         .8           99.        .98     .     .8     3.8    15.85   .99   .1

Figure 8: Comparison of MLPN and RBFN, Time (seconds)
Figure 9: Comparison of MLPN and RBFN, Accuracy and RRSE
Figure 10: Comparison of MLPN and RBFN, Kappa, MAE, RMSE, RAE, TPR and FPR
Figure 11(a)
Figure 11(b)
Figure 11(a) & (b): Confusion Matrix for Chronic Kidney Disease Dataset
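The time-versus-accuracy comparison reported in Tables 2 and 3 could be reproduced along the following lines. This Python sketch is only an illustration of the experimental procedure (the paper used Matlab on the UCI datasets listed in Table 1), so the dataset, model sizes, Gaussian width and the numbers it prints are placeholders, not the paper's results.

```python
import time
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the UCI datasets used in the paper.
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# --- MLPN: one hidden layer trained by back-propagation ---
start = time.perf_counter()
mlp = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
mlp.fit(X_tr, y_tr)
mlp_time = time.perf_counter() - start
mlp_acc = mlp.score(X_te, y_te)

# --- RBFN: k-means centres, Gaussian features, linear least-squares output ---
def rbf_feats(X, centres, gamma):
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

start = time.perf_counter()
centres = KMeans(n_clusters=20, n_init=10,
                 random_state=0).fit(X_tr).cluster_centers_
gamma = 1.0 / X_tr.shape[1]              # crude width heuristic (assumption)
A, *_ = np.linalg.lstsq(rbf_feats(X_tr, centres, gamma),
                        np.eye(2)[y_tr], rcond=None)
rbf_time = time.perf_counter() - start
rbf_acc = (np.argmax(rbf_feats(X_te, centres, gamma) @ A, axis=1) == y_te).mean()

print(f"MLPN: {mlp_time:.2f} s, test accuracy {mlp_acc:.3f}")
print(f"RBFN: {rbf_time:.2f} s, test accuracy {rbf_acc:.3f}")
```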

VI. CONCLUSION
Data classification is an important step in Data Mining, and classifying the data without disturbing the basic knowledge gained from the classifiers is a difficult task. The Artificial Neural Network is suggested to be among the best classifiers of all the Data Mining algorithms. Two different classifiers were explained and tested, and the results obtained in each case were tabulated and represented in graphs. It was observed that the RBFN is better than the MLPN in terms of time. Different data sets have been used to test the classifiers. The future direction of this work is to test the classifiers on a larger number of datasets and to improve the classifiers where needed.

REFERENCES
[1] Nikita Jain and Vishal Srivastava, "Data Mining Techniques: A Survey Paper", International Journal of Research in Engineering and Technology, Vol (11), pp 11-1, 13.
[2] Jyothi Soni, Uzma Ansari and Dipesh Ansari, "Intelligent and Effective Heart Disease Prediction System using Weighted Associative Classifier", IJCSE, Vol 3(), pp 385-39, June 11.
[3] Shantakumar B. Patil, "Intelligent and Effective Heart Attack Prediction System using Data Mining and Artificial Neural Network", European Journal of Scientific Research, Vol 31(), pp -5, 9.
[4] Shanthakumar B. Patil, "Extraction of Significant Patterns from Heart Disease Warehouses for Heart Attack Prediction", IJCSNS, Vol 9(), pp 8-35, Feb 9.
[5] www.uci.edu

AUTHORS PROFILE
Dr. J. JEGATHESH AMALRAJ holds a Doctoral Degree in Computer Science from Bharathidasan University, Tiruchirappalli, Tamilnadu, India. He has more than four years of teaching and research experience, and has published a number of research articles and presented research papers in international journals and conferences. He is currently working as an Assistant Professor in the Department of Computer Science at Thiruvalluvar University Constituent College, Cuddalore, Tamilnadu, India. His present research covers the Wireless Ad hoc Networks and Data Mining domains.

S. SIVAGOWRY is pursuing her Doctoral Degree in the Department of Computer Science, Bharathidasan University. She has published a number of research articles and presented papers at international and national conferences. Her areas of interest are Data Mining, Networks, Security, Artificial Intelligence and Machine Learning.