A Compensatory Wavelet Neuron Model

Sinha, M., Gupta, M. M. and Nikiforuk, P. N.
Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, CANADA
guptam@sask.usask.ca

Abstract

This paper proposes a compensatory wavelet neuron model, which is based on a wavelet activation function. Here, the basis function comprises both summation and multiplicative functions. It is shown in [1] that, for a spectrum of functional mapping and classification problems, a neural network based on compensatory neurons performs better than one based on ordinary neurons, in terms of both prediction accuracy and computational time. The wavelet neuron, on the other hand, is obtained by modifying an ordinary neuron with non-orthogonal wavelet bases [2]. The performances of neural networks based on the different neuron models are also analyzed in this paper.

1 Introduction

Robust performance and quick convergence of a neural network (NN) of small complexity are vital for its wide application. The architectural complexity, which governs the size of a NN, depends on the number of neurons and connections [3]: the larger the number of neurons and connections, the more complex the architecture. Similarly, the learning complexity depends on the learning algorithm. Any NN designed for real-life applications must not be complex, and it must have adequate functional mapping, classification and generalization capabilities. The present investigation explores the feasibility of constructing higher order neuron models which may serve as the basis for the formulation of some powerful neural network architectures. Some benchmark classification and functional mapping problems are addressed to validate the neuron and neural network models developed and reported in this paper. It will be shown that even a simple feedforward neural network can predict a chaotic nonlinear time series, contrary to the conclusion drawn by Yamakawa et al. [2].

2 Neuron Models

The neuron model affects the classification and functional mapping power of a neural network. In the following sections we investigate different existing neuron models and formulate some new neuron models to improve upon the capability of the existing ones.

Basic Neuron Model: The neuron model due to McCulloch and Pitts is given by Equations (1), (2) and (3):

u = \sum_{i=0}^{N} w_i x_i    (1)

y = \phi(u)    (2)

\phi(u) = \gamma \frac{e^{\lambda u/2} - e^{-\lambda u/2}}{e^{\lambda u/2} + e^{-\lambda u/2}}    (3)

where \lambda is a steepness factor and \gamma is a multiplication factor.

Compensatory Neuron Model: Sinha et al. [1] proposed a compensatory neuron model in which each neuron consists of two nonlinearities. Here, we propose a compensatory neuron model with one nonlinearity, as shown in Fig. 1. This forms the basis for formulating the compensatory neural network architecture (CNNA) shown in Fig. 4. It not only reduces the number of neurons required to solve some of the benchmark classification and mapping problems, but also improves the convergence speed and reduces the computational burden. The compensatory neuron combines the summation and product aggregations of its weighted inputs through the single nonlinearity \phi(u) defined in Equation (3).

Figure 1. Compensatory Neuron Model
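As a minimal illustration of these two models, the sketch below implements the basic neuron of Equations (1)-(3) and a compensatory aggregation in the spirit of Fig. 1. The function names and the mixing rule (a convex combination of the summation and product aggregations controlled by a compensatory parameter c) are assumptions made for illustration, not the paper's exact compensatory equations.

```python
import numpy as np

def phi(u, lam=1.0, gamma=1.0):
    """Bipolar sigmoid of Eq. (3): gamma * tanh(lam * u / 2)."""
    return gamma * np.tanh(lam * u / 2.0)

def basic_neuron(x, w, lam=1.0, gamma=1.0):
    """McCulloch-Pitts style neuron, Eqs. (1)-(2)."""
    u = np.dot(w, x)             # Eq. (1): summation aggregation
    return phi(u, lam, gamma)    # Eqs. (2)-(3)

def compensatory_neuron(x, w_s, w_p, c=0.5, lam=1.0, gamma=1.0):
    """Illustrative compensatory neuron: one nonlinearity applied to a mix of
    summation and product aggregations. The convex combination with the
    parameter c is an assumption, not the paper's formulation."""
    u_sum = np.dot(w_s, x)             # summation part
    u_prod = np.prod(w_p * x)          # multiplicative (compensatory) part
    u = c * u_sum + (1.0 - c) * u_prod
    return phi(u, lam, gamma)

x = np.array([0.5, -0.2, 0.1])
w = np.array([0.3, 0.8, -0.5])
print(basic_neuron(x, w), compensatory_neuron(x, w, w))
```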

Wavelet Neuron Model: Yamakawa et al. [2] proposed an over-complete system of non-orthogonal smooth wavelet bases in order to approximate a nonlinear function with a smooth function. The shape of the bases is characterized by a shifting parameter b, whose maximum value equals the corresponding scaling parameter a, and the neuron aggregates its inputs as

u = \sum_{i=0}^{N} w_i x_i    (4)

Figure 2 depicts the wavelet neuron model.

Figure 2. Wavelet Neuron Model

Compensatory Wavelet Neuron Model: If, in the compensatory neuron model, the sigmoid function is replaced by a wavelet function, the result is a compensatory wavelet neuron model. Figure 3 presents the schematic of the compensatory wavelet neuron model. This model is defined by Equations (6) and (7), where u is defined by Equation (4).

Figure 3. Compensatory Wavelet Neuron Model

3 Formulation of Various Architectures

The neuron models described in the previous section can be arranged to form neural network architectures for solving different problems. The architecture based on the basic neuron model is referred to as the standard feedforward neural network (STD). The neural network architectures based on compensatory, wavelet and compensatory wavelet neurons are termed the compensatory neural network architecture (CNNA) (Fig. 4), the wavelet neural network architecture (WNNA) (Fig. 5) and the compensatory wavelet neural network architecture (CWNNA), respectively. A modified form of the STD in which only a summation function is used in the output layer is referred to as the modified standard neural network (MSTD).

Figure 4. Compensatory Neural Network Architecture
Figure 5. Wavelet Neural Network Architecture
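To make the wavelet neuron underlying the WNNA and CWNNA concrete, the following sketch expands each input over a small set of shifted and scaled wavelet bases and combines the responses by a weighted summation in the style of Equation (4). This is a sketch under stated assumptions: the Mexican-hat mother wavelet and the function names are illustrative choices, not the compactly supported bases of Yamakawa et al. [2].

```python
import numpy as np

def mexican_hat(t):
    """Assumed mother wavelet (second derivative of a Gaussian)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def wavelet_neuron(x, w, a, b):
    """Illustrative wavelet neuron: each input x_i is expanded over shifted (b)
    and scaled (a) wavelet bases, then aggregated by a weighted summation.
    Shapes: x (N,), a and b (N, M) for M bases per input, w (N, M)."""
    z = mexican_hat((x[:, None] - b) / a)  # wavelet basis responses
    return np.sum(w * z)                   # Eq. (4)-style summation

x = np.array([0.4, 0.9])
a = np.full((2, 3), 0.5)                          # scaling parameters
b = np.tile(np.array([0.0, 0.25, 0.5]), (2, 1))   # shifts, kept below a as stated above
w = np.random.default_rng(0).normal(size=(2, 3))
print(wavelet_neuron(x, w, a, b))
```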

4 Learning Algorithm

The learning rules correlate the input and output values of the nodes by adjusting the weights of the network. The steepest descent algorithm requires a selection of user-defined parameters, sorted out by trial and error, and is slow to converge. The problem of poor convergence is combated using various acceleration techniques reported in the literature [4][5], but most of these techniques are ad hoc patches. We adopt here scaled conjugate gradient learning (SCG) [6] and self-scaling scaled conjugate gradient learning (SSCG) [1] to train the various neural network models. If the output layer of the neural network model has a summation function only, then the output-layer weight update can be carried out using a linear scheme such as matrix inversion or singular value decomposition, or using the usual backpropagation scheme. If the weights are updated with a backpropagation scheme in conjunction with SCG, this constitutes SSCG learning [1]. It has been shown that this method gives better accuracy for functional mapping and classification problems [1].

Below we give the error gradients for the STD and the CNNA, derived for the error function defined by Equation (9),

E(w(n)) = 0.5 \sum_{k=1}^{K} (d_k - o_k)^2    (9)

All the computations are done in off-line mode: the error gradients for all the patterns are obtained by summing and averaging the error gradients of the individual patterns. For the CNNA, the gradients are computed for the output layer and the input layer. For the standard feedforward neural network, the output layer, hidden layer and input layer weight updates follow by the chain rule from the output-layer delta

\delta_k = -(d_k - o_k) \lambda (\gamma^2 - o_k^2) / (2\gamma)

for a neuron using the activation of Equation (3).
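As a minimal numerical check of these expressions, the sketch below evaluates the batch error of Equation (9) and the output-layer delta above for a single layer using the activation of Equation (3), averaging the per-pattern gradients in off-line mode as described. It uses plain steepest descent rather than the SCG [6] or SSCG [1] schemes, and all names and data are illustrative.

```python
import numpy as np

lam, gamma = 1.0, 1.0

def phi(u):
    return gamma * np.tanh(lam * u / 2.0)          # Eq. (3)

def batch_error_and_grad(W, X, D):
    """Eq. (9) summed over patterns, plus the averaged output-layer gradient.
    X: (P, N) inputs, D: (P, K) desired outputs, W: (N, K) weights."""
    U = X @ W                                      # Eq. (1) for each pattern
    O = phi(U)                                     # actual outputs o_k
    E = 0.5 * np.sum((D - O) ** 2)                 # Eq. (9) summed over patterns
    delta = -(D - O) * lam * (gamma**2 - O**2) / (2 * gamma)  # output-layer deltas
    grad = X.T @ delta / X.shape[0]                # averaged over patterns (off-line mode)
    return E, grad

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(99, 3))
D = phi(X @ rng.normal(size=(3, 1)))               # synthetic targets
W = rng.normal(scale=0.1, size=(3, 1))
for _ in range(200):                               # steepest-descent baseline
    E, g = batch_error_and_grad(W, X, D)
    W -= 0.5 * g
print("final batch error:", E)
```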

Nomenclature: E(n), error at the nth iteration; K, number of neurons in the output layer; N, number of inputs; number of neurons in a layer; d_k, kth desired output; o_k, kth actual output; x, input; y, output of a neuron; w, weights; \lambda, steepness coefficient; \gamma, multiplication factor; n, iteration number; bias.

5 Simulation Studies

The essential components defining a NN are topology, size, functionality, learning algorithm, training/validation, and implementation/realization. The performance measure involves the selection of these features and quantifying, in some form, the success of that selection. The result of the performance evaluation depends significantly on the application. The main factors which decide the superiority of neuron models under supervised learning are the computational burden per iteration/epoch, the number of epochs to convergence, the NN size, generalization/test performance, and the benchmark problems used. Here we analyze first some functional mapping problems and then classification problems.

5.1 Functional Mapping

This may involve mapping from a lower dimensional to a higher dimensional system or vice versa. Essentially, the capability of mapping a function depends upon the neuron model and the architecture of the NN, together with the learning scheme being used. In the following sections we first test on z(x, y) = sin(x) sin(y) and then on a nonlinear time series problem. In all the figures depicting the convergence of the NN, the mean square error (the mean of the error function defined by Equation (9)) is plotted against the number of iterations.

Sin(x)·Sin(y) Problem: General function mapping problems have been used by different researchers to test a NN's capabilities, a learning algorithm's efficiency, and so on. A popular mapping function is z(x, y) = sin(x) sin(y). This function becomes more complex as the norm of the input vector (x, y) grows. We generated a training set of 2500 training patterns by varying the values of x and y in the range [0, 5\pi].

A Chaotic Nonlinear Time Series Problem: Here the training and test sets are generated using the following nonlinear time series equation:

x_{n+1} = \frac{x_n^2}{1 + x_n^2} + 0.5 x_n - 0.5 x_{n-1} + 0.5 x_{n-2}    (22)

with the initial values x_0 = 0.2, x_1 = 0.3 and x_2 = 1.0. The data set consists of 3 inputs and 1 output; the 3 inputs comprise 2 delayed values and the present value of the independent variable. The data set is constructed by deleting the oldest value and adding the newly predicted value. A time series of 101 points was used to construct the training data set, consisting of 99 patterns, as explained above.

5.2 Classification

Any new neuron model and learning algorithm must be tested for its classification capability on benchmark problems. Therefore, to verify the efficacy of the proposed neuron models, we examined them on a few classification problems, such as parity and XOR.

XOR Problem: The exclusive-or (XOR) problem is the classic problem requiring hidden units. The XOR problem, unlike the other logic operations, is not linearly separable. The NN models were trained on the XOR problem and their performance was analyzed in terms of the number of epochs required and the degree of accuracy achieved.

Parity Problem: The N-input parity problem has been a popular benchmark problem among NN researchers such as Minsky and Papert [7]. The problem consists in mapping an N-bit wide binary number to its parity, i.e. if the input pattern contains an odd number of 1s then the parity is 1, else it is 0. We have used it to determine the properties of the neuron models.
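The benchmark data sets of Section 5 can be generated along the following lines. This is a sketch in which the x_n^2/(1 + x_n^2) term of the time-series recursion is an assumed reading of Equation (22), and the variable names are illustrative.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# sin(x)*sin(y) mapping: 2500 training patterns with x, y in [0, 5*pi]
xy = rng.uniform(0.0, 5 * np.pi, size=(2500, 2))
z = np.sin(xy[:, 0]) * np.sin(xy[:, 1])

# Chaotic nonlinear time series (assumed reading of Eq. (22))
def next_value(xn, xnm1, xnm2):
    return xn**2 / (1 + xn**2) + 0.5 * xn - 0.5 * xnm1 + 0.5 * xnm2

series = [0.2, 0.3, 1.0]                 # x0, x1, x2
while len(series) < 101:                 # time series of 101 points
    series.append(next_value(series[-1], series[-2], series[-3]))
series = np.asarray(series)
inputs = np.stack([series[:-3], series[1:-2], series[2:-1]], axis=1)  # 2 delays + present value
targets = series[3:]                                                  # value to predict

# N-bit parity: output 1 if the input pattern has an odd number of 1s, else 0
N = 4
patterns = np.array(list(product([0, 1], repeat=N)))
parity = patterns.sum(axis=1) % 2

print(inputs.shape, targets.shape, patterns.shape, parity[:4])
```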
6 Results and Discussion

The simulation results for the NNs based on the different neuron models are presented in Figs. 6 to 10. Fig. 6 presents the error decay during training for the sin(x)·sin(y) problem. CNNA-4, STD-3-5-1, MSTD-3-5-1, WNNA-36 and CWNNA-15 refer to the CNNA with 4 neurons, the STD with 3, 5 and 1 neurons in the input, hidden and output layers respectively, the MSTD with 3, 5 and 1 neurons in the input, hidden and output layers respectively, the WNNA with 36 neurons (generated out of 8 complete bases) and the CWNNA with 15 neurons (generated out of 5 complete bases), respectively.

Figure 6. M.S. error decay during training for the sin(x)·sin(y) problem
Figure 8. Prediction error for different NN architectures

It may be observed that the convergence of the CWNNA and the CNNA was the best, but the number of neurons in the CNNA was only 4 while that in the CWNNA was 15. This resulted in a computational saving; moreover, fewer parameters (weights) were used to approximate the mapping. The STD and the MSTD have equal numbers of neurons, but the convergence of the MSTD is better except for a small interval where its convergence was slow.

Figs. 7 and 8 present the results for the chaotic time series problem discussed earlier. Here the legends have the same meaning as explained earlier. The CWNNA-21 was generated out of 6 complete wavelet bases, while the WNNA-28 was generated out of 7 complete wavelet bases. It can be observed that the convergence of the STD was better than that of both wavelet models. This conclusion is contrary to the conclusion drawn earlier by Yamakawa et al. [2], because the STD model does not require as many neurons as were used to solve this problem by Yamakawa et al. [2]. The wavelet models showed early convergence, but the ultimate convergence was better for the other models. It should be noted that the error decay of the wavelet models can be made faster and better, but only at the cost of increased computation. It is evident from Fig. 7 that the wavelet models were computationally costly, due to their large numbers of weights and neurons compared with the other models; a further increase in the number of neurons would make the wavelet models even costlier. The prediction was best for the CNNA, which was also computationally cheaper than the wavelet models, the STD and the MSTD.

Figure 7. M.S. error decay during training of the STD, MSTD, CNNA, WNNA and CWNNA for the time series problem

Similarly, it can be observed that for the classification problems (XOR and parity) the performance of the compensatory wavelet model was on a par with that of the wavelet model, while the amount of computation involved in the former was less than that in the latter (Figs. 9 and 10). The compensatory neural network performed best while involving the least amount of computation.

Figure 9. M.S. error decay during training for the XOR problem (CNNA-1, CWNNA-6, WNNA-6, STD-2-1-1, MSTD-2-1-1)

Figure 10. M.S. error decay during training for the parity problem

7 Conclusion

A compensatory neuron model and a compensatory wavelet neuron model were proposed in this paper. These models serve as the basis for the formulation of the compensatory neural network and compensatory wavelet neural network architectures. It is concluded that the compensatory models are much superior to the other models. Moreover, the modified standard neural network (MSTD) is also much superior to the wavelet model.

References

[1] Sinha, M., Kumar, K., and Kalra, P. K., "Some New Neural Network Architectures with Improved Learning Schemes," to appear in Soft Computing, Springer-Verlag.
[2] Yamakawa, T., Uchino, E., and Samatsu, T., "Wavelet Neural Network Employing Over-Complete Number of Compactly Supported Non-orthogonal Wavelets and their Applications," Proceedings of the IEEE International Conference on Neural Networks, June 28-July 2, 1994, pp. 1391-1396.
[3] Hassoun, M. H., Fundamentals of Artificial Neural Networks, MIT Press, Cambridge, MA, 1995.
[4] Jacobs, R. A., "Increased Rates of Convergence Through Learning Rate Adaptation," Neural Networks, Vol. 1, 1988, pp. 295-307.
[5] Hush, D. R., and Salas, J. M., "Improving the Learning Rate of Back-Propagation with the Gradient Reuse Algorithm," Proceedings of the IEEE International Conference on Neural Networks, Vol. I, 1988, pp. 441-447.
[6] Møller, M. F., "A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning," Neural Networks, Vol. 6, 1993, pp. 525-533.
[7] Minsky, M. L., and Papert, S., Perceptrons: An Introduction to Computational Geometry, MIT Press, Cambridge, MA, 1969.