Design and application of neurocomputers

Loughborough University Institutional Repository

Design and application of neurocomputers

This item was submitted to Loughborough University's Institutional Repository by the author.

Additional Information: A doctoral thesis submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy at Loughborough University.

Metadata Record:

Publisher: © David Naylor

Rights: This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 2.5 Generic (CC BY-NC-ND 2.5) licence. Full details of this licence are available at:

Please cite the published version.

DESIGN AND APPLICATION OF NEUROCOMPUTERS

by

David C.J. Naylor, B.Eng.

A Doctoral Thesis

Submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy of the Loughborough University of Technology.

January 1994

© by David Naylor 1994

ABSTRACT

This thesis aims to understand how to design high performance, flexible and cost effective neural computing systems and apply them to a variety of real-time applications. Systems of this type already exist for the support of a range of ANN models. However, many of these designs have concentrated on optimising the architecture of the neural processor and have generally neglected other important aspects. If these neural systems are to be of practical benefit to researchers and allow complex neural problems to be solved efficiently, all aspects of their design must be addressed.

This thesis investigates two particular areas of neural system design and application. The first is a study of the hardware characteristics of a neural processing architecture, to determine the most efficient and effective network structural mapping strategies. Initially, a model of hardware performance and utilisation characteristics is developed, describing the implementation of the Back Propagation learning algorithm in a linear array architecture. Using this model, a number of important structural relationships are discovered between the different layers in a network, which can influence the performance of the neural hardware. These are presented as a set of guidelines that can assist neural application designers in their choice of suitable network structures. This choice will then ensure that not only are the algorithmic performance requirements of an application achieved but also that the capabilities of the neural hardware are fully exploited.

The second investigation is a design study for a neural computing system with real-time, high bandwidth image processing application requirements. The effective integration of the neural system within an existing environment is a key consideration for designers, if the capabilities of the neural processing architecture are to be efficiently exploited by real applications. The study identifies and addresses a number of critical design issues including the control of the neural processors, the physical system construction, and the strategy for coupling to the non-neural environment.

ACKNOWLEDGEMENTS

I sincerely wish to thank my supervisor, Professor Simon Jones, for his support over the last 3 years. His criticism, guidance and encouragement have been invaluable to me, not only during my research but also in the writing of this thesis. I must also thank him for the opportunities I have had to attend many international conferences. These have been very rewarding experiences.

Secondly I wish to thank Mr. David Myers at British Telecom Research Laboratory for his support and welcomed criticism. I must also thank Dr. Mike Whybray, Dr. John Vincent, Mr. John Harbridge, Mr. Colin Williamson, Mr. Tony Briers and Mr. Dave Orrey for their help and advice. It has been greatly appreciated.

I must also thank all the members of the Electronic System Design Group at Loughborough and previously, those of the Parallel and Novel Architectures Group at the University of Nottingham. They have been the sources of many useful and stimulating discussions. I particularly wish to acknowledge Mr. Mark Gooch and Dr. Andrew Spray in this respect.

To my parents, I send my deepest gratitude. I wish to thank them for their support and encouragement over the last 3 years - I will always be indebted to them for the education they have afforded me. Also, I must send a special thank you to Lisa for always being there to help me through.

For my financial support I wish to thank the Science and Engineering Research Council, without whom this work would not have been possible. Furthermore, I wish to thank British Telecom for their generous sponsorship of this project.

Finally, I must send a special thank you to Dr. Karl Sammut. His guidance, support and friendship throughout the last 3 years have been deeply appreciated.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
STATEMENT OF ORIGINALITY
TABLE OF CONTENTS

CHAPTER ONE - INTRODUCTION
    Background
    Motivation
        Applications of Artificial Neural Networks
        The Hardware Implementation of Neural Networks
    Objectives of the Thesis
    Structure of the Thesis

CHAPTER TWO - REVIEW
    Objectives of the Review
    Characteristics of Artificial Neural Networks
    Neural Learning Algorithms
        Hopfield
        Bidirectional Associative Memory
        Back Propagation
        Restricted Coulomb Energy
        The Adaptive Resonance Theory
        The Self-Organising Feature Map
        Discussion
    Hardware Architectures for Neural Networks
        CNAPS
        BACCHUS
        Torrent
        MA
        ETANN
        ANNA
        Discussion
    Neural Systems
        The CNAPS System
        PAN IV
        CNS-1
        SYNAPSE-1
        Mod
        The ANNA System
    Analysis and Comparison of Neural Systems
    Conclusions from the Review

CHAPTER THREE - OUTLINE OF INVESTIGATION
    Objectives of the Chapter
    Objectives of the Research
        Identification of Research Topics
        Statement of Research Objectives
    Experimental Vehicles
        HANNIBAL
        Image Processing Application
        System Environment
    Investigation Summaries
        Neural Algorithm Characterisation
        Network Structural Mapping
        Hardware Design Study

CHAPTER FOUR - HANNIBAL
    Objectives of the Chapter
    Array Architecture
    Processor Architecture
    Back Propagation Implementation
    Physical Characteristics
    The HANNIBAL Simulator
    Summary

CHAPTER FIVE - THE APPLICATION AND ENVIRONMENT
    Objectives of the Chapter
    The Application
        Motivation
        Basic Requirements Definition
        The Solution
    System Environment
        Hardware
        Software
    Summary

CHAPTER SIX - NEURAL ALGORITHM CHARACTERISATION
    Objectives of the Chapter
    Objectives of the Investigation
    Back Propagation Implementation Characteristics
        Processor Level Characterisation
        Array Level Characterisation
    Characteristics Modelling
        Recall Stage Model
        Learning Stage Model
    Conclusions

CHAPTER SEVEN - NETWORK STRUCTURAL MAPPING
    Objectives of the Chapter
    Objectives of the Investigation
    Methodology
    Recall Stage Results
        Analysis
    Learning Stage Results
        Analysis
    Network Structural Mapping Guidelines
    Conclusions

CHAPTER EIGHT - HARDWARE DESIGN STUDY
    Objectives of the Chapter
    Objectives of the Design Study
    Design Specification Summary
    Critical Design Issues
        Operating Frequency
        System Communications
        Interfacing Hardware
        Controller
        Software Integration
    Design Implementation
        Overview
        Implementing the Feature Location Application
        System Architecture
        Host Software Architecture
        Design Status
    Performance Assessment
        Recall
        Learning
    Conclusions

CHAPTER NINE - CONCLUSIONS
    Objectives of the Chapter
    Review of Objectives
    Conclusions
        Measurement of Success
        Limitations of the Work
    Further Work
        Extensions of Current Investigations
        Further Investigations
    Summary

REFERENCES

PUBLICATIONS

CHAPTER ONE

INTRODUCTION

1.1 Background

The development of new, more complex and larger applications is placing ever increasing demands on computing hardware technology. Many of these applications are related to the fields of vision and speech processing, and are characterised by high processing and communication bandwidth requirements. Furthermore, their complexity is such that it is often difficult to formulate a complete set of rules to govern their response. Therefore, employing conventional algorithmic modelling techniques can result in ill-defined output characteristics. It is therefore desirable and necessary to develop alternative approaches for the implementation of these applications.

Artificial Neural Networks (ANNs) employ architectural structures that are inspired by the brain and algorithms that attempt to mimic its ability for determining complex input-output associations that cannot be easily quantified. ANNs achieve this by processing sample data and storing sufficient, appropriate information within the network's structure to be able to recall the association at any time, when presented with similar input data. Hence, the ANN learns by example the required output response. Furthermore, it has the ability to generalise and by that, can respond in an informed manner to previously unseen input data.

Neurons are the primitive processing elements of ANNs. Input stimuli or signals are received from any number of other neurons along connecting synapses. The output of the neuron is determined by the total strength of its inputs, and is itself distributed as an input to many other neurons. The strength of the connection, or the synaptic weight value, between two neurons determines how influential the output of one neuron is, when calculating the output of the other. It is these weight values that represent the information relating to the input-output associations that a network has learnt. Hence, the ANN learning algorithm is used to determine the strengths of all the inter-neuron connections for a particular set of input-output relationships.

The work of a few key researchers has been the basis for the development of many, widely applied ANN models. McCulloch and Pitts are recognised as the early pioneers (circa 1943) in many algebraic aspects of neural networks. In 1949, Hebb proposed a learning law for ANNs and postulated that "... repeated activation of one neuron by another, across a particular synapse, increases its conductance ..." - a statement that embodies the principles of neuron learning or neurodynamics. The next significant step forward came in 1958 when Rosenblatt introduced his work on a neuron model called the Perceptron. This was followed two years later by the Adaptive Linear or Adaline model from Widrow and Hoff [1]. In 1969, Minsky and Papert published research that showed there were limits to the learning capabilities of both models and that suitable learning schemes for so-called 'hidden layer' Perceptrons did not exist. Many researchers then began to turn their attention elsewhere. Further details of this and other early work in the field can be found in the literature [2].

A resurgence of interest occurred in the 1980s when solutions to the learning problems of Perceptrons were published [3]. A number of proposals were also put forward for new ANN models, including the Hopfield network [4] and the Adaptive Resonance Theory of Carpenter and Grossberg [5].

These ANN models chiefly contrast in terms of their neuron interconnection strategies and the complexity of their learning algorithm.

1.2 Motivation

1.2.1 Applications of Artificial Neural Networks

The use of ANNs is now becoming widespread in many commercial and industrial fields as their capabilities are beginning to be discovered. Applications that particularly benefit from implementation in an ANN are often characterised by complex input-output relationships, noisy or corrupted input data, and/or continually adaptable input data. Typical applications which exhibit these characteristics include:

- Control systems for monitoring and adjusting unpredictable or complex industrial processes in noisy environments [6,7].

- Financial forecasting [6] models use neural networks to analyse the complex interaction between several independent variables such as employment figures, trade balance and gross national product. The neural model then offers short or long term financial advice. The complexity of the input-output relationships in such an application is very high.

- Image processing and pattern recognition. This is a major field in which neural networks have found many practical uses. Applications such as translation, rotation and scale invariant pattern recognition [8], character recognition [6,9], image segmentation and compression [10,11], and motion detection and tracking [12] have all been able to exploit the characteristics of ANNs.

1.2.2 The Hardware Implementation of Neural Networks

The combination of high image resolution and a real-time operation requirement can demand a processing bandwidth in excess of 100 Mbyte/s. Achieving this throughput rate for image processing applications therefore requires the support of high performance hardware. However, the predominant calculation occurring in a neural algorithm, particularly during the recall stage, is the matrix-vector operation. This involves the accumulation and nonlinear thresholding of a neuron's input-synaptic weight products to generate the neuron's output. Hence, while the number of neural calculations increases with O(N^2 + N) for N neurons, the nature of the operations that are being performed remains of the same complexity.

State of the art microprocessor technology is one option for the implementation of these algorithms, since this can accommodate the processing bandwidth requirements of typical image applications. However, these devices are very general, complex designs and their high functionality goes beyond the requirements of typical neural algorithms. Hence, an optimised hardware architecture that can exploit the characteristics of the neural algorithm in its design is more likely to offer a cost effective solution.
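
To make the dominant recall-stage computation concrete, the following sketch performs the matrix-vector accumulation and non-linear thresholding for one layer of neurons. It is an illustration only; the layer size, weight values and the choice of a sigmoid threshold are assumptions for the example and are not taken from any particular neural processor discussed in this thesis.

```python
import numpy as np

def recall_layer(weights, inputs, bias):
    """Recall stage for one layer: accumulate each neuron's input-weight
    products, then apply a non-linear threshold (sigmoid here).
    For N neurons with N inputs this is O(N^2) multiply-accumulates."""
    activations = weights @ inputs + bias        # matrix-vector accumulation
    return 1.0 / (1.0 + np.exp(-activations))    # non-linear thresholding

# Illustrative sizes only: a 256-input, 256-neuron layer.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(256, 256))
x = rng.random(256)
print(recall_layer(W, x, np.zeros(256))[:4])
```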

Specific hardware architectures have therefore been developed for the implementation of many neural applications, including image processing. The approaches that have been adopted by designers of neural specific hardware are diverse, and depend upon the range of ANN models and applications being implemented.

In many applications, the neural hardware will not represent the complete solution. Instead, it will simply accelerate the neural processing stage of an application that was previously implemented in software. Hence, the neural computing hardware is often a component in a larger system and consequently, it is generally considered an enhancing rather than a replacing technology. Some key considerations for the design and application of neural specific hardware include:

- The architecture and interconnection strategy of the neural processors.
- Control of the neural processors.
- The hardware and software integration of neural and non-neural systems.
- The physical construction/organisation of the neural system.
- Strategies for mapping neural applications into the hardware.

These issues have been investigated to a greater or lesser extent by researchers. However, if ANNs are to offer a practical and efficient approach to the solution of complex, computationally intensive problems then the full capabilities of neural specific architectures must be exploited by the designers of both system hardware and applications. All these aspects must therefore be carefully considered.

1.3 Objectives of the Thesis

This thesis aims to understand how to design and apply high performance, flexible and cost effective neural computing systems, or neurocomputers, for a variety of real-time applications. Systems of this type already exist for the support of a range of ANN models. However, many of these designs have concentrated on optimising the architecture of the neural processor and have generally neglected other important aspects. This thesis examines two of these areas, namely suitable strategies for the structural mapping of neural networks into a neural architecture to ensure an efficient and effective exploitation of hardware resources, and the engineering design issues that are associated with the control of a neurocomputer, its physical construction and its coupling to other, non-neural systems.

The field of neural computation has now evolved from a theoretical to a practical level. Many ANN algorithms have been developed over recent years, along with specific hardware architectures for their implementation. If this technology is to be usefully and efficiently exploited for the solution of complex and computationally intensive problems, both of the above issues must be addressed. In the first instance, system hardware designers must ensure that the neural subsystem is an integral part of a tightly coupled processing environment. Secondly, application designers must tailor their neural networks to capitalise fully on all the hardware resources available in the neural system.

1.4 Structure of the Thesis

Chapter One introduces the field and discusses the development of artificial neural networks. It explains the motivation for utilising specialised hardware support in ANN applications and introduces some of the issues that are associated with the implementation of a complete neural system.

A review of related work is provided in Chapter Two. First, the characteristics of ANN models are summarised to assist in the study of a cross section of ANN learning algorithms. The review then discusses the characteristics of a range of special neural processors that have been developed for the implementation of these models. Finally, it examines how they have been incorporated into complete neural systems. Comparisons of these systems and their neural architectures help to identify the aspects of their design and application which lack coherent strategies and provide the basis for this research.

Chapter Three proposes 3 investigations. These are the neural algorithm characterisation, the network structural mapping and the hardware design study. It also explains the experimental assumptions used in these investigations. The linear array neural processor, HANNIBAL, is used as the research vehicle in the investigations. Its architecture is detailed in Chapter Four. Chapter Five explains the requirements of the image processing application and features of the system environment that are utilised in the hardware design study.

Chapter Six is the first investigation and provides a detailed methodological approach to the characterisation of a neural algorithm implemented in the HANNIBAL processor. The model produced is used in Chapter Seven for an investigation into the strategies for structuring multi-layer neural networks when mapping them into the linear array architecture.

The results are presented as a set of mapping guidelines which application designers may use to improve hardware performance.

The hardware design study in Chapter Eight aims to identify and address the design of the components of a neural computing subsystem that are critical to achieving its goals. After detailing the specification of the system, a range of issues is identified and discussed, before presenting the selected design strategy. An assessment of performance is also included.

Chapter Nine draws together the conclusions from each investigation and discusses whether the objectives have been achieved. It examines the limitations of the work and outlines possible extensions to the existing investigations, as well as further research areas that may be of interest. Finally it summarises the main points of the thesis.

CHAPTER TWO

REVIEW

2.1 Objectives of the Review

This chapter presents a review of the field of ANNs as a basis for the work presented in this thesis. The objectives of the chapter are:

- To define the common characteristics of ANN models and review the learning and recall procedures of a representative sample.
- To outline a number of hardware architectures that have been developed for the implementation of these models. Then, to examine how these architectures have been integrated at the system level in the development of neurocomputers.
- To identify the key issues for research relating to the design and application of these systems.

2.2 Characteristics of Artificial Neural Networks

A variety of ANN models exist that differ in their physical and algorithmic structure, and can be classified according to a number of characteristics. The main features that distinguish each model are outlined below.

Network Topology. Neurons are arranged in a layer-wise manner in most ANN models, but the number of layers and their interconnection strategy differentiate between them. Two common structures are shown in Figure 2.1. The multilayer feedforward network shown in (a) consists of several layers of neurons with unidirectional connections between adjacent layers. This example shows a sparse network in which not all the possible connections are made. Any layers that are not directly connected to the outside environment are called hidden layers. The aim of a sparse network interconnection strategy is to create localised decision regions that perform specific neural tasks. By connecting several layers together it is possible to create complex decision surfaces, capable of characterising any arbitrary input-output relationship.

The single layer recurrent network shown in (b) introduces a time related response to an input stimulus and hence, is a form of dynamic network (as compared to the static network in (a)). The outputs from each neuron are fed back via weighted connections in an iterative scheme that will reach a steady state over time. This class of network is important for the modelling of nonlinear dynamic control systems or can simply provide a form of short term memory.

Learning Algorithm. Learning is the process of determining the strengths or weight values of the synaptic connections between the neurons in a network. This may be performed in a single-shot or recursive manner. The single-shot method involves the straightforward calculation of the weights given the complete set of input vectors. Therefore the weight values store a direct representation of the input vectors. When an input is presented the resulting output is simply the best matching stored pattern. This is known as an autoassociative learning algorithm.

Recursive learning algorithms are either supervised or unsupervised. Both of these adapt their weights in an iterative process until a steady or 'acceptable' state is reached.

In supervised learning the weights are modified to reflect the difference between the expected and actual output vector for a particular input. When this error decreases below a preset threshold the network is considered to have converged and the learning process is halted. Unsupervised learning does not require example output patterns to guide the network to a solution. These learning algorithms examine each input vector in turn and formulate their own output representations that differentiate between dissimilar inputs. Both supervised and unsupervised networks can be used to map a particular input pattern set into a different output set - a heteroassociative learning algorithm. A classifier is a simple form of a heteroassociative network.

There are a number of ways to determine if the solution to which the network has converged is acceptable. The most common method is to examine the size of weight modifications for a pass of the learning data set. If all the weight updates during this operation are below a certain threshold value then the network has converged. This method does not guarantee that the solution is correct however, as a local minimum in the decision surface could have been found. In this situation, the weights will be stable but the output will not always be correct. To verify that a global solution has been obtained the output must be analysed. Data that the network has not seen before - a test set - is often applied for this purpose. How accurately the network's output matches the expected result can be used as a measure of the learning success.

Recall Strategy. Regardless of the network topology or the learning algorithm, the output of a neuron can generally be defined in a simple form as

    Neuron Output = Non-linear function ( Σ_{all synapses} Input × Weight )    (2.1)

In multilayered feedforward networks, a single pass of this equation is required for each layer to generate the output.
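
Before moving on to recurrent recall, the sketch below illustrates the two convergence tests described above: stopping when every weight update in a pass of the learning data set falls below a threshold, and then verifying the solution on an unseen test set. The threshold value and the output-matching rule are illustrative assumptions.

```python
import numpy as np

def has_converged(weight_updates, threshold=1e-4):
    """True if every weight modification made during the last pass of the
    learning data set is below the threshold value."""
    return max(abs(u).max() for u in weight_updates) < threshold

def test_set_accuracy(network_outputs, expected_outputs):
    """Verify a (possibly local-minimum) solution on data the network has
    not seen before: fraction of test vectors matched after thresholding."""
    predicted = (np.asarray(network_outputs) > 0.5).astype(int)
    return float(np.mean(np.all(predicted == expected_outputs, axis=1)))
```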

However, in a recurrent network the recall procedure requires multiple iterations around the feedback path until the state of the output vector remains constant. In this case, the input data are the outputs from the previous iteration.

Figure 2.1. Typical neural network topologies: (a) multilayer feedforward network; (b) recurrent network.

Activation Function. The activation or threshold function provides the neuron with its non-linear mapping characteristics. A variety of functions are used, depending upon the learning algorithm and data type. Some typical examples are shown in Figure 2.2. The sigmoid and ramp functions allow a neuron to signal its degree of confidence that a particular feature is present in its input data. The '-1' output of the signum function allows an output to be inhibitory, as opposed to an output of '0' which is simply non-excitatory.
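
A minimal sketch of the three activation functions of Figure 2.2 follows; the ramp saturation limits are an illustrative assumption.

```python
import numpy as np

def ramp(x, lower=-1.0, upper=1.0):
    """Linear between the limits, saturating outside them."""
    return np.clip(x, lower, upper)

def sigmoid(x):
    """Smooth, differentiable squashing into (0, 1) - a confidence value."""
    return 1.0 / (1.0 + np.exp(-x))

def signum(x):
    """+1 for excitatory, -1 for inhibitory outputs."""
    return np.where(x >= 0, 1.0, -1.0)
```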

Figure 2.2. Typical activation functions: ramp, sigmoid and signum.

Data Type. The input data for a model can be discrete (binary) or continuous. Discrete data models will generally employ parallel updating of neuron output values, whereas neurons which operate on continuous data may also function asynchronously.

2.3 Neural Learning Algorithms

This section provides an overview of a cross section of learning algorithms for ANNs. Introductions to several of these are provided by Lippmann [13] and Wasserman [14]. A glossary of the notation used in the following subsections is presented below.

    l       index of the layer (l=0 input, l=1 1st hidden, etc.)
    L       number of layers (excluding input layer)
    N_l     number of neurons in layer l
    j, k    indices of neurons
    w_jk    synaptic weight value between neurons j and k
    x_k     neuron input element k
    y_j     output value of neuron j
    e_j     expected output value of (output layer) neuron j
    a_j     bias value for neuron j
    T_j     radius value for neuron j
    η       constant global learning gain factor, 0 < η < 1
    η(t)    learning gain factor on iteration t, 0 < η(t) < 1
    f(·)    activation function
    V       input vectors in the learning data set
    v       input vector v
    t       iteration number

2.3.1 Hopfield

The Hopfield model is composed of a single layer of neurons with fully interconnected synapses, as shown in Figure 2.1(b). The model is an autoassociative type that is capable of realising associative memories, pattern classifiers and optimisation circuits. Although many people had previously examined this recurrent structure, it was the work of John Hopfield that provided a resurgence of interest [4]. His original proposal was for a binary input neuron with a step activation function. Each neuron in the layer could respond asynchronously to changes in its input data. Restrictions were placed on the synapse weight values, requiring the weight matrix to be symmetrical and contain only zeros on its diagonal. Hence, there was no feedback connection from a neuron to itself. These restrictions were placed on the model to ensure stability.

Learning

The more commonly used version of this learning algorithm employs a sigmoid activation function with parallel updating of the outputs and non-zero diagonal weight matrix terms. The process requires a one-shot calculation of the weight matrix values using (2.2) and is known as Hebbian learning.

In this case, binary values are initially converted into a bipolar format before calculating the sum of outer-product matrices for each weight.

    w_jk = Σ_{v=1..V} ( 2x_j^v - 1 )( 2x_k^v - 1 )    (2.2)

Recall

When presented with an input vector the Hopfield model must perform several iterations of the feedback loop to generate the associated output. On each iteration every neuron calculates the sum of its weighted inputs and thresholds the result. The output is fed back to the input for the next iteration until the network stabilises - indicated by no further changes in the output.

Problems, Solutions and Alternative Methods

Stability is always an issue with recurrent networks. The use of non-zero diagonal terms in the weight matrix can help to reduce the number of oscillating states and make the model less sensitive to noisy inputs. However, this can also lead to the generation of an increased number of spurious stable states that were not contained in the learning data. Annealing is a technique which can alleviate the problem of erroneous states or local minima [15].

The limited information storage capacity of Hopfield networks is another problem. For a network with C binary connections there are 2^C possible states, but the storage capacity is limited to approximately 0.15C [4]. Removal of the zero diagonal terms increases this capacity but at the expense of lower error correction capabilities and an increased susceptibility to false matches. Higher order nonlinear functions can also improve the capacity of the model [16].
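
The sketch below follows the one-shot Hebbian weight calculation of (2.2) and the iterative recall loop described above. For simplicity it uses a step threshold with parallel output updates; the iteration limit is an illustrative assumption.

```python
import numpy as np

def hopfield_learn(patterns):
    """One-shot Hebbian learning (2.2): sum of outer products of the
    bipolar (+1/-1) versions of the binary training vectors."""
    bipolar = 2 * np.asarray(patterns) - 1
    return sum(np.outer(p, p) for p in bipolar)

def hopfield_recall(weights, x, max_iters=50):
    """Iterate the feedback loop until the output no longer changes."""
    state = 2 * np.asarray(x) - 1
    for _ in range(max_iters):
        new_state = np.where(weights @ state >= 0, 1, -1)  # sum and threshold
        if np.array_equal(new_state, state):               # network stabilised
            break
        state = new_state
    return (state + 1) // 2                                # back to binary
```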

2.3.2 Bidirectional Associative Memory

The Bidirectional Associative Memory (BAM) is a recurrent network model that is similar in its capabilities to Hopfield. However, it is hetero-associative and can therefore produce an output vector that is related to the input, but is not necessarily the same. This characteristic is a result of the dual layer structure of the recurrent model, illustrated in Figure 2.3. Kosko's work has been particularly valuable in the development of this model [17].

Learning

The basic learning paradigm for the BAM is a single-shot process, similar to the Hopfield model. Associations between input and output vector pairs are encoded in the weights using (2.3), where T denotes the transpose.

    W = Σ_{v=1..V} ( X^v )^T Y^v    (2.3)

Recall

The weight matrix can be regarded as a long term memory, while the outputs of both layers A and B are short term. The aim of the recall process is to make the short term memory output of B converge to the stored vector in the long term memory that is associated with the present input vector. The step thresholded outputs from layer A are fed into B which then calculates its own outputs. These are fed back into A whereupon the cycle repeats until neither output vector changes. The output vectors for each layer after an iteration are given by (2.4) and (2.5).

Figure 2.3. Bidirectional Associative Memory.

    y_j(t+1) = f ( Σ_{k=1..K} x_k(t) w_jk )    (2.4)

    x_k(t+1) = f ( Σ_{j=1..J} y_j(t) w_jk )    (2.5)

Problems, Solutions and Alternative Methods

As with the Hopfield model, the BAM has a limited information storage capacity. Kosko estimated that the maximum number of associations that could be stored did not exceed the number of neurons, N, in the smallest layer [17]. More realistic calculations put the figure at N/(4 log2 N). Alternative activation functions can be used to increase this capacity.
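
A sketch of the BAM recall cycle of (2.4) and (2.5) follows: layer A's thresholded output drives layer B through W, and B's output is fed back through the transpose until neither output vector changes. The step threshold and iteration limit are illustrative assumptions.

```python
import numpy as np

def bam_recall(W, x, max_iters=50):
    """Bidirectional recall. W has shape (J, K): layer B has J neurons,
    layer A has K neurons; outputs are step-thresholded to +/-1."""
    step = lambda v: np.where(v >= 0, 1, -1)
    x = np.asarray(x, dtype=float)
    y = step(W @ x)                   # (2.4): layer B output from layer A
    for _ in range(max_iters):
        new_x = step(W.T @ y)         # (2.5): fed back to layer A
        new_y = step(W @ new_x)
        if np.array_equal(new_x, x) and np.array_equal(new_y, y):
            break                     # neither output vector changed
        x, y = new_x, new_y
    return y
```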

In the non-homogeneous BAM, each neuron can set its own threshold point which theoretically provides a maximum of 2N stable states. In practice however, the number is much less. Advanced forms of the BAM exist that allow asynchronous neuron state changes, adapt their weights during actual operation and have intra-layer inhibitory connections.

2.3.3 Back Propagation

Back Propagation (BP) is a learning algorithm for Multi-Layered Perceptron (MLP) networks. It was proposed by Rumelhart et al. [3] as a solution to the training of hidden Perceptrons and is one of the most popular algorithms in use. A typical MLP network structure is shown in Figure 2.4. This network is capable of creating multiple open or closed convex decision regions. These regions can become arbitrarily complex and concave in shape, with a second hidden layer of Perceptrons.

Learning

The learning process is a gradient search technique designed to minimise the mean square difference between the expected and actual output. Hence, it is a supervised algorithm and is most commonly used in classification tasks. It uses the sigmoid non-linearity as shown in Figure 2.2, since it requires a differentiable, hence continuous, activation function. The procedure begins with the initialisation of the weights to small (±0.1) random values. Three stages then follow - activation feedforward, error back propagation and weight update.

The activation feedforward stage requires each neuron in the network to calculate its activation value using the outputs from the previous layer, as shown in (2.6).

    y_j^l = f ( a_j + Σ_{k=0..N_{l-1}-1} w_jk x_k )    (2.6)

Once the output layer activations have been calculated, the difference between the actual and expected output can be obtained using (2.7). The successful convergence of the algorithm can be measured by the size of this error.

    δ_j = ( e_j - y_j ) f'( a_j + Σ_k w_jk x_k )    (2.7)

Using a first approximation to the derivative of the sigmoid activation function, (2.7) can be rewritten as shown in (2.8).

    δ_j = y_j ( 1 - y_j )( e_j - y_j )    (2.8)

The output layer errors must be propagated back to the adjacent hidden layer, which calculates its own output errors using (2.9), where i denotes the index of neurons in the layer above. All hidden layers repeat this process.

    δ_j = y_j ( 1 - y_j ) Σ_i δ_i w_ij    (2.9)

The final stage requires the weight values associated with each neuron to be modified in proportion to the size of the error. The bias value associated with each neuron is updated in the same way - see (2.10) and (2.11). The 3 stages of the learning process are repeated until the network converges.

Figure 2.4. Multilayer Perceptron network.

    Δw_jk = η δ_j x_k    (2.10)

    Δa_j = η δ_j    (2.11)

Recall

The recall procedure is exactly the same as the activation feedforward stage of learning, as formulated in (2.6).
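
The sketch below ties equations (2.6) to (2.11) together for a network with one hidden layer: activation feedforward, error back propagation using the y(1-y) approximation to the sigmoid derivative, and the η-scaled weight and bias updates. The network shape and learning gain are illustrative assumptions, and the weight arrays are updated in place.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def bp_step(x, e, W1, a1, W2, a2, eta=0.25):
    """One learning iteration for input vector x and expected output e."""
    # Activation feedforward (2.6)
    h = sigmoid(W1 @ x + a1)                      # hidden layer outputs
    y = sigmoid(W2 @ h + a2)                      # output layer outputs
    # Error back propagation (2.8) and (2.9)
    delta_out = y * (1 - y) * (e - y)
    delta_hid = h * (1 - h) * (W2.T @ delta_out)
    # Weight and bias updates (2.10) and (2.11), applied in place
    W2 += eta * np.outer(delta_out, h)
    a2 += eta * delta_out
    W1 += eta * np.outer(delta_hid, x)
    a1 += eta * delta_hid
    return y
```

Recall reuses only the two feedforward lines, exactly as noted above.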

Problems, Solutions and Alternative Methods

The learning process can be shortened by accumulating individual neuron error values over the complete learning data set and only then updating the weights. This is known as epoch learning and has the advantage that the learning process is not influenced by the order of exemplars in the data set. The addition of another term to the weight update equation can sometimes increase the rate of descent towards a global minimum. This momentum term can also increase the stability of the convergence procedure.

The BP algorithm has successfully demonstrated its ability to learn complex representations, notably in the NetTalk application [18]. However, networks are characteristically very large and the number of iterations of the data set during learning are noticeably greater than for other algorithms. Therefore, several modified BP algorithms have been proposed that employ dynamic neuron creation during learning [19,20] or post-learning pruning techniques to remove duplicated or redundant neurons [21,22]. Both methods aim to match the information capacity of the network to the information content of the data set. Networks which are over- or under-sized can both exhibit diminished generalisation characteristics when presented with a test data set.

Techniques that adapt the value of η during learning have also been shown to improve convergence times [23]. Similarly, the selection of initial weight values and the activation function parameters can also influence the learning speed of networks using the BP algorithm [24].
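
A brief sketch of the momentum term mentioned above: a fraction of the previous weight change is added to the current update to smooth and accelerate the descent. The momentum factor of 0.9 is an illustrative assumption.

```python
def momentum_update(weight, error_term, prev_delta, eta=0.25, alpha=0.9):
    """Weight update with momentum: the new change is the usual eta-scaled
    error term plus alpha times the previous change. Returns the new weight
    and the change to carry forward to the next iteration."""
    delta = eta * error_term + alpha * prev_delta
    return weight + delta, delta
```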

2.3.4 Restricted Coulomb Energy

The Restricted Coulomb Energy (RCE) model departs from the use of standard non-linear activation functions and instead employs neurons with Radial Basis Functions (RBFs). Typical examples of these are shown in Figure 2.5. Unlike the sigmoid or ramp functions used in other models, RBFs form functions over a finite region of input space. Thus arbitrarily complex decision regions can be formed with simpler networks, resulting in faster training, particularly for classification problems. The structure of the RCE model is shown in Figure 2.6. Note that each hidden RBF neuron is connected via a unity weighted synapse to only one output neuron and these employ standard threshold functions such as the sigmoid.

Learning

The RBF neurons do not compute their output from the sum of their weighted inputs, but instead regard the weights as defining a point in the input space. The output is calculated as a function of the distance from this point to the point defined by the input vector. Mathematically the RBF neuron output is given by (2.12). Learning is a two-stage process which must establish not only the position of the neuron's activation in the weight space but also its radius.

    y_j = f ( || X - W_j || )    (2.12)

An unsupervised process, such as the K-Means clustering algorithm, is often used to calculate the weights and hence determine the centre of each RBF. Alternatively, the RBF layer can be dynamic and grow in response to the information storage requirements during learning. Assuming a block step RBF threshold function, the procedure is as follows.

Figure 2.5. RBF neuron activation functions: Gaussian and block step.

Apply an input vector and calculate the output using (2.12). Sum and threshold the output layer neurons. For each output:

- If '1' and should be '0', shrink the radius of ALL neurons in the RBF layer that are outputting '1' until T_j = || X - W_j ||.
- If '0' and should be '1', spawn a new RBF neuron centred on this input point.
- Otherwise do nothing.

Repeat the process for all input vectors.

Recall

The recall process requires a single forward pass through the network in which the outputs of the RBF layer neurons are calculated using (2.12) and the output layer thresholds the sum of its unity weighted inputs.
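
The sketch below follows the block-step RCE procedure listed above, treating each RBF neuron individually for simplicity: wrong-class neurons that fire have their radii shrunk to the distance of the offending input, and an input that activates no correct-class neuron spawns a new one. The initial radius for spawned neurons is an illustrative assumption.

```python
import numpy as np

def rce_present(x, target, centres, radii, classes, new_radius=1.0):
    """One presentation of input x (known class `target`) to a block-step
    RCE network. RBF neuron i has a centre, a radius and an output class."""
    x = np.asarray(x, dtype=float)
    dists = [np.linalg.norm(x - c) for c in centres]
    firing = [i for i, d in enumerate(dists) if d <= radii[i]]  # block step (2.12)
    # Shrink every wrong-class neuron that fired, down to the offending distance.
    for i in firing:
        if classes[i] != target:
            radii[i] = dists[i]
    # If no correct-class neuron fired, spawn a new RBF neuron on this point.
    if not any(classes[i] == target for i in firing):
        centres.append(x.copy())
        radii.append(new_radius)
        classes.append(target)
```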

Figure 2.6. RCE model network.

Problems, Solutions and Alternative Methods

There are a number of extensions available for the RCE model, particularly in the structure of its learning algorithm. These are too detailed to discuss here but are summarised in [25].

Problems that exist with the basic RCE learning algorithm discussed above include the inability to increase the radii of RBF neurons or move the position of the function once the weight has been set.

Therefore, new neurons are often introduced unnecessarily. Furthermore, it is not possible to determine the importance - how often the output is active - of particular neurons in relation to others. Hence, the information content of a particular neuron cannot be ascertained. Ideally, the more information that becomes encoded in a neuron the less responsive that neuron should be to noisy input data.

2.3.5 The Adaptive Resonance Theory

The Adaptive Resonance Theory (ART) is an ANN model developed by Carpenter and Grossberg for unsupervised classification tasks [5]. The problem with most models is their inability to adapt to new input data after the initial learning process has been completed. The only way to introduce further classifications or associations is to retrain the ANN with the expanded data set. This problem was termed the 'Stability-Plasticity Dilemma' by Grossberg, who developed the ART model as its solution.

The structure of the model is illustrated in Figure 2.7 and can be seen to differ considerably from those reviewed so far, not least due to the existence of two distinct weight matrices, T and W. The Comparison and Recognition layers both maintain the neuron-like functionality of other models but have a number of discerning features. The Gain units and Reset provide the special control features that make the model unsupervised and will be discussed in the context of the operating procedure.

Since the model does not have distinct learning and recall phases it is not appropriate to describe two separate processes. Therefore, the general operation of the model will be explained, which will incorporate both the storage and retrieval phases. The version of the model described here uses binary input vectors.

Figure 2.7. The ART network.

There are four phases of operation of the model: Initialisation, Recognition, Comparison and Search. The initialisation phase, at t=0, involves the calculation of the weight matrices and the resetting of the output and gain signals. The feedback weights are all initialised to 1 while the values of the feedforward weights are determined by the number of neurons in the Comparison layer, K, using (2.13). The outputs of the Gain units, G1 and G2, are initialised to zero.

    w_jk(0) = 1 / ( 1 + K )    (2.13)

During the Recognition phase an input vector is fed into the Comparison layer.

Each neuron in this layer receives three inputs: G1, x_k and the feedback from the Recognition layer, t_k. The output is then calculated using the 'two-thirds' rule - if any two of the three inputs are active then a '1' is output, otherwise the output is '0'.

When a valid input is presented to the network, G1 is set to '1'. t_k is initially zero since no Recognition layer neurons are active. Therefore at this stage, in accordance with the two-thirds rule, the input vector is passed directly through to the Recognition layer where each neuron compares it against its own 'pattern template', stored in the weight matrix, W. This layer determines which stored pattern is the best match for the new input and activates the output of the appropriate neuron. For this, each neuron must calculate the dot product of the weight and the input vector. The neuron with the strongest output signal then 'turns off' all the others. This is achieved using connections between all neighbouring neurons and a process known as lateral inhibition. This winner-takes-all competition between output neurons ensures that only one can be active in response to a particular input.

Achieving an active state simply means that this stored pattern or class exemplar is the closest to the input vector; there must still be a process to decide if it is close enough. The winning neuron, j*, passes its class exemplar, stored in the weight matrix T, back to the Comparison layer. The Comparison phase can now take place. G1 is now set to zero so that the neurons can calculate a comparison vector C, in accordance with the two-thirds rule. This is fed into the Reset unit where the quality of the match is determined using (2.14). The result, S, must then be compared against a preset vigilance threshold, ρ, which makes the final decision regarding the similarity between this input vector and the active output class, as shown in (2.15).

    S = ( number of matching ones in the input and feedback vectors ) / ( number of ones in the input vector )    (2.14)

    If S > ρ then X belongs to class j*.
    If S ≤ ρ then X has been misclassified.    (2.15)

If the pattern falls into the selected class, the weights must be adapted to reflect the new input pattern using (2.16) and (2.17). The process may then restart for the next input pattern.

    t_j*k(t+1) = t_j*k(t) x_k    (2.16)

    w_j*k(t+1) = t_j*k(t) x_k / ( 0.5 + Σ_{k=1..K} t_j*k(t) x_k )    (2.17)

If the pattern has been misclassified, the model has to check all the other possible output classes and create a new one if necessary. The Search phase performs this task by initially disabling the output of the 'bad' neuron using G2 and allowing the next strongest output, that was previously inhibited, to become active. The same process is repeated until an output class is found that meets the vigilance threshold. If none of the classes are close enough to the input vector, a new class exemplar must be created and the weights determined as above.
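
The following sketch illustrates the control flow just described - recognition in winner-takes-all order, the vigilance test of (2.14) and (2.15), and the search for, or creation of, a class - for binary vectors. It simplifies the weight handling (a single stored exemplar per class stands in for the T and W matrices), so it should be read as an outline of the procedure rather than a full ART-1 implementation.

```python
import numpy as np

def art_classify(x, exemplars, rho):
    """Assign binary input x to an output class whose stored exemplar passes
    the vigilance test, creating a new class if none does."""
    x = np.asarray(x)
    # Recognition: consider classes in order of dot-product strength
    # (the winner-takes-all ordering).
    order = sorted(range(len(exemplars)), key=lambda j: -np.dot(exemplars[j], x))
    for j in order:
        c = np.logical_and(exemplars[j], x)       # comparison vector C
        s = c.sum() / max(x.sum(), 1)             # match quality (2.14)
        if s > rho:                               # vigilance test (2.15)
            exemplars[j] = c.astype(int)          # adapt the stored exemplar
            return j
    exemplars.append(x.copy())                    # no class was close enough
    return len(exemplars) - 1
```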

Problems, Solutions and Alternative Methods

It is clear that the vigilance parameter will severely influence the number of classes that are created by the model. Too low a value of ρ will cause very dissimilar vectors to be assigned to the same class and too high a value will make the model too sensitive to minor differences. Furthermore, it is difficult to determine the optimum value for ρ due to its sensitivity to the context of the training data.

Two forms of learning are available - fast and slow. The fast process has been described above and is the most commonly used. Slow learning requires several presentations of each input vector and each time small modifications are made to the weights until convergence is achieved. This method results in weights that are less influenced by any one input and hence better classification characteristics are obtained.

More advanced forms of ART are available. ART-2 develops the model for continuous valued input data [26] and improves the capabilities of the model significantly. ART-3 [27] moves the model closer to a biological system representation by allowing it to respond to real-time, constantly varying inputs.

2.3.6 The Self-Organising Feature Map

The self-organising feature map, developed by Kohonen [28], relies on biological evidence which suggests that sensory pathways in the brain are arranged to reflect the characteristics of the input stimuli. Hence, the model makes the assumptions that input patterns with common features will define the class structure and that these features can be extracted from the input data. The structure of the network is illustrated in Figure 2.8.

It consists of a single layer, 2D grid arrangement of neurons that are connected to their nearest neighbours and fully connected to an input layer of K neurons. Each neuron on the grid produces an output which represents a particular class of input vector.

Learning

The unsupervised learning procedure selects a region of the 2D output space to represent a particular class of K-dimensional input vectors. The connections between adjacent neurons in the grid create a winner-takes-all network in which only the strongest output is active. Usually the weights and input data are normalised. This ensures that the training process uses the spatial orientation of vectors rather than their magnitudes to determine the region in the output space or class to which an input belongs. The learning process is therefore faster than for some other neural algorithms, as a degree of variability is removed from the weight space.

When the weights are initialised, they are randomly spread around the normalised weight space. This can cause learning difficulties if the input vectors are not evenly distributed, as neurons whose weight orientation is very different to that of the learning data may never become active, allowing regions of the output space to remain unused. To prevent this, the learning paradigm employs a 'shrinking neighbourhood' technique to localise the effects of weight modifications. This forces inputs that are too dissimilar to search for their own independent regions of the output space. The initial size of the neighbourhood around each neuron, M_j(0), is dependent upon the dimensions of the grid and of the input vector.

The winner-takes-all selection process involves the calculation of the Euclidean distance d_j between the input and weight vectors for each neuron, as shown in (2.18). The neuron with the minimum distance is chosen to be updated and is designated j*.

    d_j = Σ_{k=1..K} ( x_k - w_jk )²    (2.18)
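
A sketch of the winner selection of (2.18) follows: the distance between the input vector and each grid neuron's weight vector is calculated and the minimum-distance neuron j* is chosen. The neighbourhood update shown follows the usual Kohonen rule with a square, shrinking neighbourhood; the grid shape, learning gain and radius handling are illustrative assumptions rather than details taken from this chapter.

```python
import numpy as np

def winning_neuron(x, W):
    """Winner-takes-all selection (2.18): W holds one weight vector per grid
    neuron (one row each); returns the index j* of the closest neuron."""
    d = np.sum((W - np.asarray(x)) ** 2, axis=1)   # squared Euclidean distances
    return int(np.argmin(d))

def update_neighbourhood(x, W, j_star, grid_shape, radius, eta):
    """Move the winner and its grid neighbours within `radius` towards x."""
    rows, cols = np.unravel_index(np.arange(W.shape[0]), grid_shape)
    win_r, win_c = np.unravel_index(j_star, grid_shape)
    inside = np.maximum(np.abs(rows - win_r), np.abs(cols - win_c)) <= radius
    W[inside] += eta * (np.asarray(x) - W[inside])
    return W
```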


More information

Figure (5) Kohonen Self-Organized Map

Figure (5) Kohonen Self-Organized Map 2- KOHONEN SELF-ORGANIZING MAPS (SOM) - The self-organizing neural networks assume a topological structure among the cluster units. - There are m cluster units, arranged in a one- or two-dimensional array;

More information

Neural Networks. Neural Network. Neural Network. Neural Network 2/21/2008. Andrew Kusiak. Intelligent Systems Laboratory Seamans Center

Neural Networks. Neural Network. Neural Network. Neural Network 2/21/2008. Andrew Kusiak. Intelligent Systems Laboratory Seamans Center Neural Networks Neural Network Input Andrew Kusiak Intelligent t Systems Laboratory 2139 Seamans Center Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335

More information

Assignment # 5. Farrukh Jabeen Due Date: November 2, Neural Networks: Backpropation

Assignment # 5. Farrukh Jabeen Due Date: November 2, Neural Networks: Backpropation Farrukh Jabeen Due Date: November 2, 2009. Neural Networks: Backpropation Assignment # 5 The "Backpropagation" method is one of the most popular methods of "learning" by a neural network. Read the class

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Neural Computation : Lecture 14 John A. Bullinaria, 2015 1. The RBF Mapping 2. The RBF Network Architecture 3. Computational Power of RBF Networks 4. Training

More information

Machine Learning 13. week

Machine Learning 13. week Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of

More information

Neural Networks. Robot Image Credit: Viktoriya Sukhanova 123RF.com

Neural Networks. Robot Image Credit: Viktoriya Sukhanova 123RF.com Neural Networks These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these slides

More information

Supervised Learning with Neural Networks. We now look at how an agent might learn to solve a general problem by seeing examples.

Supervised Learning with Neural Networks. We now look at how an agent might learn to solve a general problem by seeing examples. Supervised Learning with Neural Networks We now look at how an agent might learn to solve a general problem by seeing examples. Aims: to present an outline of supervised learning as part of AI; to introduce

More information

Visual object classification by sparse convolutional neural networks

Visual object classification by sparse convolutional neural networks Visual object classification by sparse convolutional neural networks Alexander Gepperth 1 1- Ruhr-Universität Bochum - Institute for Neural Dynamics Universitätsstraße 150, 44801 Bochum - Germany Abstract.

More information

Extending reservoir computing with random static projections: a hybrid between extreme learning and RC

Extending reservoir computing with random static projections: a hybrid between extreme learning and RC Extending reservoir computing with random static projections: a hybrid between extreme learning and RC John Butcher 1, David Verstraeten 2, Benjamin Schrauwen 2,CharlesDay 1 and Peter Haycock 1 1- Institute

More information

II. ARTIFICIAL NEURAL NETWORK

II. ARTIFICIAL NEURAL NETWORK Applications of Artificial Neural Networks in Power Systems: A Review Harsh Sareen 1, Palak Grover 2 1, 2 HMR Institute of Technology and Management Hamidpur New Delhi, India Abstract: A standout amongst

More information

Assignment 2. Classification and Regression using Linear Networks, Multilayer Perceptron Networks, and Radial Basis Functions

Assignment 2. Classification and Regression using Linear Networks, Multilayer Perceptron Networks, and Radial Basis Functions ENEE 739Q: STATISTICAL AND NEURAL PATTERN RECOGNITION Spring 2002 Assignment 2 Classification and Regression using Linear Networks, Multilayer Perceptron Networks, and Radial Basis Functions Aravind Sundaresan

More information

Neural Network and Deep Learning. Donglin Zeng, Department of Biostatistics, University of North Carolina

Neural Network and Deep Learning. Donglin Zeng, Department of Biostatistics, University of North Carolina Neural Network and Deep Learning Early history of deep learning Deep learning dates back to 1940s: known as cybernetics in the 1940s-60s, connectionism in the 1980s-90s, and under the current name starting

More information

COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS

COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS Toomas Kirt Supervisor: Leo Võhandu Tallinn Technical University Toomas.Kirt@mail.ee Abstract: Key words: For the visualisation

More information

Seismic regionalization based on an artificial neural network

Seismic regionalization based on an artificial neural network Seismic regionalization based on an artificial neural network *Jaime García-Pérez 1) and René Riaño 2) 1), 2) Instituto de Ingeniería, UNAM, CU, Coyoacán, México D.F., 014510, Mexico 1) jgap@pumas.ii.unam.mx

More information

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu Natural Language Processing CS 6320 Lecture 6 Neural Language Models Instructor: Sanda Harabagiu In this lecture We shall cover: Deep Neural Models for Natural Language Processing Introduce Feed Forward

More information

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION 6.1 INTRODUCTION Fuzzy logic based computational techniques are becoming increasingly important in the medical image analysis arena. The significant

More information

Efficient Object Extraction Using Fuzzy Cardinality Based Thresholding and Hopfield Network

Efficient Object Extraction Using Fuzzy Cardinality Based Thresholding and Hopfield Network Efficient Object Extraction Using Fuzzy Cardinality Based Thresholding and Hopfield Network S. Bhattacharyya U. Maulik S. Bandyopadhyay Dept. of Information Technology Dept. of Comp. Sc. and Tech. Machine

More information

Artificial Neuron Modelling Based on Wave Shape

Artificial Neuron Modelling Based on Wave Shape Artificial Neuron Modelling Based on Wave Shape Kieran Greer, Distributed Computing Systems, Belfast, UK. http://distributedcomputingsystems.co.uk Version 1.2 Abstract This paper describes a new model

More information

New wavelet based ART network for texture classification

New wavelet based ART network for texture classification University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 1996 New wavelet based ART network for texture classification Jiazhao

More information

MATLAB representation of neural network Outline Neural network with single-layer of neurons. Neural network with multiple-layer of neurons.

MATLAB representation of neural network Outline Neural network with single-layer of neurons. Neural network with multiple-layer of neurons. MATLAB representation of neural network Outline Neural network with single-layer of neurons. Neural network with multiple-layer of neurons. Introduction: Neural Network topologies (Typical Architectures)

More information

Convex combination of adaptive filters for a variable tap-length LMS algorithm

Convex combination of adaptive filters for a variable tap-length LMS algorithm Loughborough University Institutional Repository Convex combination of adaptive filters for a variable tap-length LMS algorithm This item was submitted to Loughborough University's Institutional Repository

More information

Controlling the spread of dynamic self-organising maps

Controlling the spread of dynamic self-organising maps Neural Comput & Applic (2004) 13: 168 174 DOI 10.1007/s00521-004-0419-y ORIGINAL ARTICLE L. D. Alahakoon Controlling the spread of dynamic self-organising maps Received: 7 April 2004 / Accepted: 20 April

More information

SVM-based Filter Using Evidence Theory and Neural Network for Image Denosing

SVM-based Filter Using Evidence Theory and Neural Network for Image Denosing Journal of Software Engineering and Applications 013 6 106-110 doi:10.436/sea.013.63b03 Published Online March 013 (http://www.scirp.org/ournal/sea) SVM-based Filter Using Evidence Theory and Neural Network

More information

Artificial Neural Network based Curve Prediction

Artificial Neural Network based Curve Prediction Artificial Neural Network based Curve Prediction LECTURE COURSE: AUSGEWÄHLTE OPTIMIERUNGSVERFAHREN FÜR INGENIEURE SUPERVISOR: PROF. CHRISTIAN HAFNER STUDENTS: ANTHONY HSIAO, MICHAEL BOESCH Abstract We

More information

Dynamic Analysis of Structures Using Neural Networks

Dynamic Analysis of Structures Using Neural Networks Dynamic Analysis of Structures Using Neural Networks Alireza Lavaei Academic member, Islamic Azad University, Boroujerd Branch, Iran Alireza Lohrasbi Academic member, Islamic Azad University, Boroujerd

More information

Multilayer Feed-forward networks

Multilayer Feed-forward networks Multi Feed-forward networks 1. Computational models of McCulloch and Pitts proposed a binary threshold unit as a computational model for artificial neuron. This first type of neuron has been generalized

More information

Climate Precipitation Prediction by Neural Network

Climate Precipitation Prediction by Neural Network Journal of Mathematics and System Science 5 (205) 207-23 doi: 0.7265/259-529/205.05.005 D DAVID PUBLISHING Juliana Aparecida Anochi, Haroldo Fraga de Campos Velho 2. Applied Computing Graduate Program,

More information

Lecture #11: The Perceptron

Lecture #11: The Perceptron Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be

More information

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska Classification Lecture Notes cse352 Neural Networks Professor Anita Wasilewska Neural Networks Classification Introduction INPUT: classification data, i.e. it contains an classification (class) attribute

More information

m Environment Output Activation 0.8 Output Activation Input Value

m Environment Output Activation 0.8 Output Activation Input Value Learning Sensory-Motor Cortical Mappings Without Training Mike Spratling Gillian Hayes Department of Articial Intelligence University of Edinburgh mikes@dai.ed.ac.uk gmh@dai.ed.ac.uk Abstract. This paper

More information

Review: Final Exam CPSC Artificial Intelligence Michael M. Richter

Review: Final Exam CPSC Artificial Intelligence Michael M. Richter Review: Final Exam Model for a Learning Step Learner initially Environm ent Teacher Compare s pe c ia l Information Control Correct Learning criteria Feedback changed Learner after Learning Learning by

More information

Neural Networks (Overview) Prof. Richard Zanibbi

Neural Networks (Overview) Prof. Richard Zanibbi Neural Networks (Overview) Prof. Richard Zanibbi Inspired by Biology Introduction But as used in pattern recognition research, have little relation with real neural systems (studied in neurology and neuroscience)

More information

Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation

Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation C.J. Norsigian Department of Bioengineering cnorsigi@eng.ucsd.edu Vishwajith Ramesh Department of Bioengineering vramesh@eng.ucsd.edu

More information

An Integer Recurrent Artificial Neural Network for Classifying Feature Vectors

An Integer Recurrent Artificial Neural Network for Classifying Feature Vectors An Integer Recurrent Artificial Neural Network for Classifying Feature Vectors Roelof K Brouwer PEng, PhD University College of the Cariboo, Canada Abstract: The main contribution of this report is the

More information

Ensemble methods in machine learning. Example. Neural networks. Neural networks

Ensemble methods in machine learning. Example. Neural networks. Neural networks Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you

More information

Multi-Layered Perceptrons (MLPs)

Multi-Layered Perceptrons (MLPs) Multi-Layered Perceptrons (MLPs) The XOR problem is solvable if we add an extra node to a Perceptron A set of weights can be found for the above 5 connections which will enable the XOR of the inputs to

More information

IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM

IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM Annals of the University of Petroşani, Economics, 12(4), 2012, 185-192 185 IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM MIRCEA PETRINI * ABSTACT: This paper presents some simple techniques to improve

More information

Exercise 2: Hopeld Networks

Exercise 2: Hopeld Networks Articiella neuronnät och andra lärande system, 2D1432, 2004 Exercise 2: Hopeld Networks [Last examination date: Friday 2004-02-13] 1 Objectives This exercise is about recurrent networks, especially the

More information

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT PhD Summary DOCTORATE OF PHILOSOPHY IN COMPUTER SCIENCE & ENGINEERING By Sandip Kumar Goyal (09-PhD-052) Under the Supervision

More information

Artificial Neural Networks MLP, RBF & GMDH

Artificial Neural Networks MLP, RBF & GMDH Artificial Neural Networks MLP, RBF & GMDH Jan Drchal drchajan@fel.cvut.cz Computational Intelligence Group Department of Computer Science and Engineering Faculty of Electrical Engineering Czech Technical

More information

Deep Learning. Deep Learning. Practical Application Automatically Adding Sounds To Silent Movies

Deep Learning. Deep Learning. Practical Application Automatically Adding Sounds To Silent Movies http://blog.csdn.net/zouxy09/article/details/8775360 Automatic Colorization of Black and White Images Automatically Adding Sounds To Silent Movies Traditionally this was done by hand with human effort

More information

Neural Network Approach for Automatic Landuse Classification of Satellite Images: One-Against-Rest and Multi-Class Classifiers

Neural Network Approach for Automatic Landuse Classification of Satellite Images: One-Against-Rest and Multi-Class Classifiers Neural Network Approach for Automatic Landuse Classification of Satellite Images: One-Against-Rest and Multi-Class Classifiers Anil Kumar Goswami DTRL, DRDO Delhi, India Heena Joshi Banasthali Vidhyapith

More information

Channel Performance Improvement through FF and RBF Neural Network based Equalization

Channel Performance Improvement through FF and RBF Neural Network based Equalization Channel Performance Improvement through FF and RBF Neural Network based Equalization Manish Mahajan 1, Deepak Pancholi 2, A.C. Tiwari 3 Research Scholar 1, Asst. Professor 2, Professor 3 Lakshmi Narain

More information

Neural Network Learning. Today s Lecture. Continuation of Neural Networks. Artificial Neural Networks. Lecture 24: Learning 3. Victor R.

Neural Network Learning. Today s Lecture. Continuation of Neural Networks. Artificial Neural Networks. Lecture 24: Learning 3. Victor R. Lecture 24: Learning 3 Victor R. Lesser CMPSCI 683 Fall 2010 Today s Lecture Continuation of Neural Networks Artificial Neural Networks Compose of nodes/units connected by links Each link has a numeric

More information

COLLABORATIVE AGENT LEARNING USING HYBRID NEUROCOMPUTING

COLLABORATIVE AGENT LEARNING USING HYBRID NEUROCOMPUTING COLLABORATIVE AGENT LEARNING USING HYBRID NEUROCOMPUTING Saulat Farooque and Lakhmi Jain School of Electrical and Information Engineering, University of South Australia, Adelaide, Australia saulat.farooque@tenix.com,

More information

IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS

IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS BOGDAN M.WILAMOWSKI University of Wyoming RICHARD C. JAEGER Auburn University ABSTRACT: It is shown that by introducing special

More information

IN recent years, neural networks have attracted considerable attention

IN recent years, neural networks have attracted considerable attention Multilayer Perceptron: Architecture Optimization and Training Hassan Ramchoun, Mohammed Amine Janati Idrissi, Youssef Ghanou, Mohamed Ettaouil Modeling and Scientific Computing Laboratory, Faculty of Science

More information

FAST NEURAL NETWORK ALGORITHM FOR SOLVING CLASSIFICATION TASKS

FAST NEURAL NETWORK ALGORITHM FOR SOLVING CLASSIFICATION TASKS Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2012 FAST NEURAL NETWORK ALGORITHM FOR SOLVING CLASSIFICATION TASKS Noor Albarakati Virginia Commonwealth

More information

Neuro-Fuzzy Computing

Neuro-Fuzzy Computing CSE53 Neuro-Fuzzy Computing Tutorial/Assignment 3: Unsupervised Learning About this tutorial The objective of this tutorial is to study unsupervised learning, in particular: (Generalized) Hebbian learning.

More information

Parallel Evaluation of Hopfield Neural Networks

Parallel Evaluation of Hopfield Neural Networks Parallel Evaluation of Hopfield Neural Networks Antoine Eiche, Daniel Chillet, Sebastien Pillement and Olivier Sentieys University of Rennes I / IRISA / INRIA 6 rue de Kerampont, BP 818 2232 LANNION,FRANCE

More information

A Comparative Study of Conventional and Neural Network Classification of Multispectral Data

A Comparative Study of Conventional and Neural Network Classification of Multispectral Data A Comparative Study of Conventional and Neural Network Classification of Multispectral Data B.Solaiman & M.C.Mouchot Ecole Nationale Supérieure des Télécommunications de Bretagne B.P. 832, 29285 BREST

More information

Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions

Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions ENEE 739Q SPRING 2002 COURSE ASSIGNMENT 2 REPORT 1 Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions Vikas Chandrakant Raykar Abstract The aim of the

More information

Evolutionary form design: the application of genetic algorithmic techniques to computer-aided product design

Evolutionary form design: the application of genetic algorithmic techniques to computer-aided product design Loughborough University Institutional Repository Evolutionary form design: the application of genetic algorithmic techniques to computer-aided product design This item was submitted to Loughborough University's

More information

CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION

CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 75 CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 6.1 INTRODUCTION Counter propagation network (CPN) was developed by Robert Hecht-Nielsen as a means to combine an unsupervised Kohonen

More information

Artificial Neural Networks

Artificial Neural Networks The Perceptron Rodrigo Fernandes de Mello Invited Professor at Télécom ParisTech Associate Professor at Universidade de São Paulo, ICMC, Brazil http://www.icmc.usp.br/~mello mello@icmc.usp.br Conceptually

More information

Artificial Neural Networks Unsupervised learning: SOM

Artificial Neural Networks Unsupervised learning: SOM Artificial Neural Networks Unsupervised learning: SOM 01001110 01100101 01110101 01110010 01101111 01101110 01101111 01110110 01100001 00100000 01110011 01101011 01110101 01110000 01101001 01101110 01100001

More information

Design and Performance Analysis of and Gate using Synaptic Inputs for Neural Network Application

Design and Performance Analysis of and Gate using Synaptic Inputs for Neural Network Application IJIRST International Journal for Innovative Research in Science & Technology Volume 1 Issue 12 May 2015 ISSN (online): 2349-6010 Design and Performance Analysis of and Gate using Synaptic Inputs for Neural

More information

Keywords: ANN; network topology; bathymetric model; representability.

Keywords: ANN; network topology; bathymetric model; representability. Proceedings of ninth International Conference on Hydro-Science and Engineering (ICHE 2010), IIT Proceedings Madras, Chennai, of ICHE2010, India. IIT Madras, Aug 2-5,2010 DETERMINATION OF 2 NETWORK - 5

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Learning without a teacher No targets for the outputs Networks which discover patterns, correlations, etc. in the input data This is a self organisation Self organising networks An

More information

CHAPTER VI BACK PROPAGATION ALGORITHM

CHAPTER VI BACK PROPAGATION ALGORITHM 6.1 Introduction CHAPTER VI BACK PROPAGATION ALGORITHM In the previous chapter, we analysed that multiple layer perceptrons are effectively applied to handle tricky problems if trained with a vastly accepted

More information

A Data Classification Algorithm of Internet of Things Based on Neural Network

A Data Classification Algorithm of Internet of Things Based on Neural Network A Data Classification Algorithm of Internet of Things Based on Neural Network https://doi.org/10.3991/ijoe.v13i09.7587 Zhenjun Li Hunan Radio and TV University, Hunan, China 278060389@qq.com Abstract To

More information

Machine Learning Classifiers and Boosting

Machine Learning Classifiers and Boosting Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve

More information

Rough Set Approach to Unsupervised Neural Network based Pattern Classifier

Rough Set Approach to Unsupervised Neural Network based Pattern Classifier Rough Set Approach to Unsupervised Neural based Pattern Classifier Ashwin Kothari, Member IAENG, Avinash Keskar, Shreesha Srinath, and Rakesh Chalsani Abstract Early Convergence, input feature space with

More information

NeuroScale: Novel Topographic Feature Extraction using RBF Networks

NeuroScale: Novel Topographic Feature Extraction using RBF Networks NeuroScale: Novel Topographic Feature Extraction using RBF Networks David Lowe D.LoweOaston.ac.uk Michael E. Tipping H.E.TippingOaston.ac.uk Neural Computing Research Group Aston University, Aston Triangle,

More information

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-"&"3 -"(' ( +-" " " % '.+ % ' -0(+$,

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-&3 -(' ( +-   % '.+ % ' -0(+$, The structure is a very important aspect in neural network design, it is not only impossible to determine an optimal structure for a given problem, it is even impossible to prove that a given structure

More information