Neural Networks
Prof. Dr. Rudolf Kruse
Computational Intelligence Group, Faculty for Computer Science
kruse@iws.cs.uni-magdeburg.de

Radial Basis Function Networks

Properties of Radial Basis Function Networks (RBF Networks)

RBF networks are strictly layered feed-forward neural networks with exactly one hidden layer.

Radial basis functions are used as network input and activation functions. In this way, a catchment area is assigned to every hidden neuron: the weights of the connections from the input layer to a hidden neuron indicate the center of this catchment area.

Radial Basis Function Networks

A radial basis function network (RBFN) is a neural network with a graph G = (U, C) that satisfies the following conditions:

(i) U_in ∩ U_out = ∅,
(ii) C = (U_in × U_hidden) ∪ C', where C' ⊆ (U_hidden × U_out).

The network input function of each hidden neuron is a distance function of the input vector and the weight vector, i.e.

∀u ∈ U_hidden: f_net^{(u)}(w_u, in_u) = d(w_u, in_u),

where d: ℝ^n × ℝ^n → ℝ_0^+ is a function satisfying, for all x, y, z ∈ ℝ^n:

(i) d(x, y) = 0 ⇔ x = y,
(ii) d(x, y) = d(y, x) (symmetry),
(iii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

Distance Functions

Illustration of a family of distance functions:

d_k(x, y) = ( Σ_{i=1}^n (x_i − y_i)^k )^{1/k}

Well-known special cases from this family are:

k = 1: Manhattan or city-block distance,
k = 2: Euclidean distance,
k → ∞: maximum distance, i.e. d_∞(x, y) = max_{i=1}^n |x_i − y_i|.

[Figure: unit circles of the three distances for k = 1, k = 2 and k → ∞.]
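As a small illustration (a sketch in Python with numpy, not lecture code; the function names are my own, and absolute differences are used so that odd k behaves sensibly):

```python
import numpy as np

def minkowski_distance(x, y, k):
    """d_k(x, y) = (sum_i |x_i - y_i|^k)^(1/k); k = 1: Manhattan, k = 2: Euclidean."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** k) ** (1.0 / k))

def maximum_distance(x, y):
    """Limit k -> infinity: d_inf(x, y) = max_i |x_i - y_i|."""
    return float(np.max(np.abs(np.asarray(x) - np.asarray(y))))

x, y = [0.0, 0.0], [3.0, 4.0]
print(minkowski_distance(x, y, 1))  # 7.0 (city block)
print(minkowski_distance(x, y, 2))  # 5.0 (Euclidean)
print(maximum_distance(x, y))       # 4.0 (maximum distance)
```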

Radial Basis Function Networks

The network input function of the output neurons is the weighted sum of their inputs, i.e.

∀u ∈ U_out: f_net^{(u)}(w_u, in_u) = w_u · in_u = Σ_{v ∈ pred(u)} w_{uv} out_v.

The activation function of each hidden neuron is a so-called radial function, i.e. a monotonically decreasing function

f: ℝ_0^+ → [0, 1] with f(0) = 1 and lim_{x→∞} f(x) = 0.

The activation function of each output neuron is a linear function, namely

f_act^{(u)}(net_u, θ_u) = net_u − θ_u.

(The linear activation function is important for the initialization.)

Radial Activation Functions

rectangle function:
f_act(net, σ) = 0 if net > σ, 1 otherwise.

triangle function:
f_act(net, σ) = 0 if net > σ, 1 − net/σ otherwise.

cosine until zero:
f_act(net, σ) = 0 if net > 2σ, (cos(π/(2σ) · net) + 1) / 2 otherwise.

Gaussian function:
f_act(net, σ) = e^{−net² / (2σ²)}

[Figure: graphs of the four functions over net, with σ (and 2σ for the cosine) marked on the net axis and the values e^{-1/2} at σ and e^{-2} at 2σ marked for the Gaussian.]
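These four functions can be sketched directly (illustrative Python, vectorized over net; not from the slides):

```python
import numpy as np

def rectangle(net, sigma):
    return np.where(net > sigma, 0.0, 1.0)

def triangle(net, sigma):
    return np.where(net > sigma, 0.0, 1.0 - net / sigma)

def cosine_until_zero(net, sigma):
    return np.where(net > 2 * sigma, 0.0,
                    (np.cos(np.pi / (2 * sigma) * net) + 1.0) / 2.0)

def gaussian(net, sigma):
    return np.exp(-net ** 2 / (2 * sigma ** 2))

net = np.linspace(0.0, 3.0, 7)   # distances at which to evaluate
for f in (rectangle, triangle, cosine_until_zero, gaussian):
    print(f.__name__, np.round(f(net, 1.0), 4))
```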

Radial Basis Function Networks: Examples

Radial basis function network for the conjunction x_1 ∧ x_2:

(1, 1) is the center, 1/2 is the reference radius,
Euclidean distance,
rectangle function as the activation function,
bias of 0 in the output neuron.

[Figure: network diagram and input space; only the point (1, 1) lies inside the circular catchment area, so only it produces y = 1.]

Radial Basis Function Networks: Examples

A second radial basis function network for the conjunction x_1 ∧ x_2:

(0, 0) is the center, 6/5 is the reference radius,
Euclidean distance,
rectangle function as the activation function,
connection weight −1 and bias of −1 in the output neuron.

[Figure: network diagram and input space.] The hidden neuron now catches (0,0), (1,0) and (0,1), but not (1,1) (distance √2 > 6/5); the negative weight and bias invert its output, so again only (1,1) yields y = 1.

Radial Basis Function Networks: Examples

Radial basis function network for the biimplication x_1 ↔ x_2.

Idea: logical decomposition

x_1 ↔ x_2 ≡ (x_1 ∧ x_2) ∨ ¬(x_1 ∨ x_2)

[Figure: network with two hidden neurons, one catching (1,1) and one catching (0,0), and the two circular catchment areas in the input space.]
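A minimal sketch of this decomposition as an RBF network (the centers and the radius 1/2 follow the conjunction examples; using weight 1 for both hidden-to-output connections and bias 0 is my assumption, not verbatim from the slide):

```python
import numpy as np

def rectangle(net, sigma):
    return 1.0 if net <= sigma else 0.0

def rbf_biimplication(x1, x2):
    x = np.array([x1, x2], dtype=float)
    # neuron 1 catches (1,1), i.e. x1 AND x2; neuron 2 catches (0,0), i.e. NOT (x1 OR x2)
    z1 = rectangle(np.linalg.norm(x - np.array([1.0, 1.0])), 0.5)
    z2 = rectangle(np.linalg.norm(x - np.array([0.0, 0.0])), 0.5)
    return 1.0 * z1 + 1.0 * z2 - 0.0   # linear output neuron, bias 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), '->', rbf_biimplication(x1, x2))
# (0,0) -> 1.0, (0,1) -> 0.0, (1,0) -> 0.0, (1,1) -> 1.0
```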

Radial Basis Function Networks: Function Approximation

Approximation of the original function by a step function, each step being represented by one of the RBF network's neurons (compare the corresponding construction for MLPs).

[Figure: original function y(x), sampling points (x_1, y_1), ..., (x_4, y_4), and the resulting step function.]

Radial Basis Function Networks: Function Approximation

RBF network computing the step function shown on the previous slide and the piecewise linear function on the following slide: one hidden neuron per sampling point x_i, all with radius

σ = (1/2) Δx = (1/2)(x_{i+1} − x_i),

and connection weights y_1, ..., y_4 to the output neuron. The only change needed to switch between the two approximations is the activation function of the hidden neurons.
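A sketch of this construction (the sampling points xs and levels ys are made-up illustration values, not the lecture's):

```python
import numpy as np

xs = np.array([1.0, 2.0, 3.0, 4.0])   # sampling points x_1..x_4 (assumed values)
ys = np.array([1.0, 3.0, 2.0, 4.0])   # output weights y_1..y_4 (assumed values)
sigma = 0.5 * (xs[1] - xs[0])         # sigma = Delta x / 2, as on the slide

def rbf_step_approx(x):
    z = np.where(np.abs(x - xs) > sigma, 0.0, 1.0)  # rectangle activations
    return float(np.dot(ys, z))                     # linear output neuron, bias 0

print(rbf_step_approx(1.2))  # 1.0: x lies in the catchment area of x_1
print(rbf_step_approx(2.9))  # 2.0: x lies in the catchment area of x_3
```

Swapping the rectangle for the triangle function of the earlier slides turns the same network into the piecewise linear approximator shown next.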

Radial Basis Function Networks: Function Approximation

Representation of a piecewise linear function by the weighted sum of triangle functions with centers x_i.

[Figure: the weighted triangle functions y_1, ..., y_4 and their sum.]

Radial Basis Function Networks: Function Approximation

Approximation of a function by a sum of Gaussians with radius σ = 1 and weights w_1 = 1, w_2 = 3 and w_3 = −2.

[Figure: the three weighted Gaussians and the resulting sum over x.]
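A sketch reproducing this superposition (the weights and radius are from the slide; the three centers are assumptions read off the figure):

```python
import numpy as np

centers = np.array([2.0, 5.0, 6.0])   # Gaussian centers (assumed from the figure)
weights = np.array([1.0, 3.0, -2.0])  # w_1, w_2, w_3 from the slide
sigma = 1.0                           # radius from the slide

def rbf_sum(x):
    return float(np.sum(weights * np.exp(-(x - centers) ** 2 / (2 * sigma ** 2))))

for x in (0.0, 2.0, 5.0, 6.0, 8.0):
    print(x, round(rbf_sum(x), 4))
```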

Radial Basis Function Networks: Function Approximation

Radial basis function network computing the sum of the three Gaussian functions from the previous slide.

[Figure: network diagram with one input x, three Gaussian hidden neurons and a linear output neuron y carrying the weights w_1 = 1, w_2 = 3, w_3 = −2.]

Training Radial Basis Function Networks

Radial Basis Function Networks: Initialization

Let L_fixed = {l_1, ..., l_m} be a fixed learning task consisting of m training patterns l = (ı^{(l)}, o^{(l)}).

Simple radial basis function network: one hidden neuron v_k, k = 1, ..., m, for each training pattern:

∀k ∈ {1, ..., m}: w_{v_k} = ı^{(l_k)}.

If the activation function is the Gaussian function, the radii σ_k are chosen heuristically as

∀k ∈ {1, ..., m}: σ_k = d_max / √(2m),

where d_max = max_{l_j, l_k ∈ L_fixed} d(ı^{(l_j)}, ı^{(l_k)}).
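A sketch of this initialization (variable names are mine):

```python
import numpy as np

def init_simple_rbf(inputs):
    """inputs: array of shape (m, n) holding the m training input vectors."""
    centers = inputs.copy()                      # w_{v_k} = i^(l_k)
    # d_max: maximum pairwise Euclidean distance between training inputs
    diffs = inputs[:, None, :] - inputs[None, :, :]
    d_max = np.sqrt((diffs ** 2).sum(axis=2)).max()
    m = len(inputs)
    radii = np.full(m, d_max / np.sqrt(2 * m))   # sigma_k = d_max / sqrt(2m)
    return centers, radii

X = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
centers, radii = init_simple_rbf(X)
print(radii)  # d_max = sqrt(2), m = 4  ->  sigma_k = 0.5, as in the example below
```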

Radial Basis Function Networks: Initialization

Initializing the connections from the hidden to the output neurons:

∀u: Σ_{k=1}^m w_{uv_k} out_{v_k}^{(l_j)} − θ_u = o_u^{(l_j)} for j = 1, ..., m,

or abbreviated: A · w_u = o_u, where o_u = (o_u^{(l_1)}, ..., o_u^{(l_m)})ᵀ is the vector of desired outputs, θ_u = 0, and

A = [ out_{v_1}^{(l_1)}  out_{v_2}^{(l_1)}  ...  out_{v_m}^{(l_1)}
      out_{v_1}^{(l_2)}  out_{v_2}^{(l_2)}  ...  out_{v_m}^{(l_2)}
      ...
      out_{v_1}^{(l_m)}  out_{v_2}^{(l_m)}  ...  out_{v_m}^{(l_m)} ]

This is a linear equation system that can be solved by inverting the matrix A:

w_u = A⁻¹ · o_u.
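In code this initialization takes only a few lines; a sketch using the biimplication task of the following slides as data (numpy assumed):

```python
import numpy as np

X = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])  # training inputs = centers
o_u = np.array([1., 0., 0., 1.])                        # desired outputs (biimplication)
sigma = 0.5                                             # radius from the heuristic above

# A[j, k] = out_{v_k}^{(l_j)}: output of hidden neuron k on training pattern j
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
A = np.exp(-D ** 2 / (2 * sigma ** 2))

w_u = np.linalg.solve(A, o_u)   # equivalent to w_u = A^{-1} o_u
print(np.round(w_u, 4))         # approx [ 1.0567 -0.2809 -0.2809  1.0567]
```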

RBFN Initialization: Example

Simple radial basis function network for the biimplication x_1 ↔ x_2:

x_1  x_2 | y
 0    0  | 1
 1    0  | 0
 0    1  | 0
 1    1  | 1

[Figure: network with four hidden neurons, one centered at each of the four training patterns, and connection weights w_1, ..., w_4 to the output neuron y.]

RBFN Initialization: Example

Simple radial basis function network for the biimplication x_1 ↔ x_2, where

A = [ 1       e^{-2}   e^{-2}   e^{-4}
      e^{-2}  1        e^{-4}   e^{-2}
      e^{-2}  e^{-4}   1        e^{-2}
      e^{-4}  e^{-2}   e^{-2}   1      ]

A⁻¹ = (1/D) · [ a  b  b  c
                b  a  c  b
                b  c  a  b
                c  b  b  a ]

with

D = 1 − 4e^{-4} + 6e^{-8} − 4e^{-12} + e^{-16} ≈ 0.9287
a = 1 − 2e^{-4} + e^{-8} ≈ 0.9637
b = −e^{-2} + 2e^{-6} − e^{-10} ≈ −0.1304
c = e^{-4} − 2e^{-8} + e^{-12} ≈ 0.0177

w_u = A⁻¹ · o_u = (1/D) · (a + c, 2b, 2b, a + c)ᵀ ≈ (1.0567, −0.2809, −0.2809, 1.0567)ᵀ
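A quick numerical check of these closed-form entries (a sketch, not from the slides):

```python
import numpy as np

e = np.exp(1.0)
D = 1 - 4 * e**-4 + 6 * e**-8 - 4 * e**-12 + e**-16
a = 1 - 2 * e**-4 + e**-8
b = -e**-2 + 2 * e**-6 - e**-10
c = e**-4 - 2 * e**-8 + e**-12
print(round(D, 4), round(a, 4), round(b, 4), round(c, 4))
# 0.9287 0.9637 -0.1304 0.0177

# weights for o_u = (1, 0, 0, 1): rows of A^{-1} give (a+c)/D and 2b/D
print(round((a + c) / D, 4), round(2 * b / D, 4))   # 1.0567 -0.2809
```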

RBFN Initialization: Example

Simple radial basis function network for the biimplication x_1 ↔ x_2.

[Figure: activation of a single basis function, superposition of all four basis functions, and the resulting output y over the input space (x_1, x_2).]

The initialization already yields a perfect solution of the learning task. Subsequent training is not necessary.

Radial Basis Function Networks: Initialization

Normal radial basis function networks: select a subset of k training patterns as centers. Then

A = [ out_{v_1}^{(l_1)}  ...  out_{v_k}^{(l_1)}
      out_{v_1}^{(l_2)}  ...  out_{v_k}^{(l_2)}
      ...
      out_{v_1}^{(l_m)}  ...  out_{v_k}^{(l_m)} ],     A · w_u = o_u.

Compute the (Moore-Penrose) pseudoinverse:

A⁺ = (AᵀA)⁻¹ Aᵀ.

The weights can then be computed by

w_u = A⁺ · o_u = (AᵀA)⁻¹ Aᵀ · o_u.
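In code this is essentially a one-liner; a sketch (np.linalg.pinv is numerically safer than the explicit normal-equation formula, which requires AᵀA to be invertible):

```python
import numpy as np

def init_output_weights(A, o_u):
    """w_u = A^+ . o_u; np.linalg.pinv computes the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(A) @ o_u

def init_output_weights_explicit(A, o_u):
    """Explicit formula from the slide: w_u = (A^T A)^{-1} A^T o_u."""
    return np.linalg.solve(A.T @ A, A.T @ o_u)
```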

RBFN Initialization: Example

Normal radial basis function network for the biimplication x_1 ↔ x_2. Select two training patterns as centers:

l_1 = (ı^{(l_1)}, o^{(l_1)}) = ((0, 0), (1))
l_4 = (ı^{(l_4)}, o^{(l_4)}) = ((1, 1), (1))

[Figure: network with inputs x_1, x_2, two hidden neurons, connection weights w_1, w_2 and bias θ in the output neuron y.]

RBFN Initialization: Example

Normal radial basis function network for the biimplication x_1 ↔ x_2:

A = [ 1  1       e^{-4}
      1  e^{-2}  e^{-2}
      1  e^{-2}  e^{-2}
      1  e^{-4}  1      ]

(the first column is the constant input belonging to the bias term)

A⁺ = (AᵀA)⁻¹ Aᵀ = [ a  b  b  a
                    c  d  d  e
                    e  d  d  c ]

where a ≈ −0.1810, b ≈ 0.6810, c ≈ 1.1781, d ≈ −0.6688, e ≈ 0.1594.

Resulting weights:

w_u = (−θ, w_1, w_2)ᵀ = A⁺ · o_u ≈ (−0.3620, 1.3375, 1.3375)ᵀ.
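A sketch checking these numbers, with a constant column in A for the bias term −θ (Gaussian neurons with σ = 1/2 at (0,0) and (1,1)):

```python
import numpy as np

X = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])  # training inputs
centers = np.array([[0., 0.], [1., 1.]])                # the two selected centers
o_u = np.array([1., 0., 0., 1.])                        # desired outputs
sigma = 0.5

dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
A = np.hstack([np.ones((4, 1)), np.exp(-dist ** 2 / (2 * sigma ** 2))])
w = np.linalg.pinv(A) @ o_u
print(np.round(w, 4))   # approx [-0.362  1.3375  1.3375] = (-theta, w_1, w_2)
```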

RBFN Initialization: Example

Normal radial basis function network for the biimplication x_1 ↔ x_2.

[Figure: the two basis functions centered at (0,0) and (1,1) and the resulting output y over the input space.]

The initialization again yields a perfect solution of the learning task. This is an accident, however: the linear equation system is not over-determined, because it contains linearly dependent equations.

Radial Basis Function Networks: Initialization

Finding appropriate centers for the radial basis functions.

One approach: k-means clustering (see the sketch below)

Select randomly k training patterns as centers.
Assign to each center those training patterns that are closest to it.
Compute new centers as the center of gravity of the assigned training patterns.
Repeat the previous two steps until convergence, i.e., until the centers do not change anymore.
Use the resulting centers for the weight vectors of the hidden neurons.

Alternative approach: learning vector quantization.
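A compact k-means sketch along these lines (illustrative; no handling of edge cases such as empty clusters):

```python
import numpy as np

def kmeans_centers(X, k, rng=np.random.default_rng(0), max_iter=100):
    # select randomly k training patterns as initial centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # assign each pattern to its closest center
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        # recompute each center as the center of gravity of its assigned patterns
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):   # converged: centers stable
            break
        centers = new_centers
    return centers   # use as weight vectors of the hidden neurons
```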

Radial Basis Function Networks: Training

Training radial basis function networks: the derivation of the update rules is analogous to that for multilayer perceptrons.

Weights from the hidden to the output neurons. Gradient:

∇_{w_u} e_u^{(l)} = ∂e_u^{(l)} / ∂w_u = −2 (o_u^{(l)} − out_u^{(l)}) in_u^{(l)},

weight update rule:

Δw_u^{(l)} = −(η_3 / 2) ∇_{w_u} e_u^{(l)} = η_3 (o_u^{(l)} − out_u^{(l)}) in_u^{(l)}.

(Two more learning rates are needed for the center coordinates and the radii; a sketch of this output-weight update follows below.)
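A sketch of one such online update for a single linear output neuron (η_3 and all variable names are illustrative; the bias is omitted for brevity):

```python
import numpy as np

def update_output_weights(w_u, in_u, o_u, eta3=0.1):
    """Delta w_u = eta3 * (o_u - out_u) * in_u for a linear output neuron."""
    out_u = float(np.dot(w_u, in_u))        # weighted sum, linear activation
    return w_u + eta3 * (o_u - out_u) * in_u
```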

Radial Basis Function Networks: Training

Training radial basis function networks: center coordinates (weights from the input to the hidden neurons).

Gradient:

∇_{w_v} e^{(l)} = ∂e^{(l)} / ∂w_v = −2 Σ_{s ∈ succ(v)} (o_s^{(l)} − out_s^{(l)}) w_{sv} · (∂out_v^{(l)} / ∂net_v^{(l)}) · (∂net_v^{(l)} / ∂w_v)

Weight update rule:

Δw_v^{(l)} = −(η_1 / 2) ∇_{w_v} e^{(l)} = η_1 Σ_{s ∈ succ(v)} (o_s^{(l)} − out_s^{(l)}) w_{sv} · (∂out_v^{(l)} / ∂net_v^{(l)}) · (∂net_v^{(l)} / ∂w_v)

Radial Basis Function Networks: Training

Training radial basis function networks: center coordinates (continued).

Special case: Euclidean distance

net_v^{(l)} = ( Σ_{i=1}^n (w_{vp_i} − out_{p_i}^{(l)})² )^{1/2},

∂net_v^{(l)} / ∂w_v = ( Σ_{i=1}^n (w_{vp_i} − out_{p_i}^{(l)})² )^{−1/2} (w_v − in_v^{(l)}).

Special case: Gaussian activation function

∂out_v^{(l)} / ∂net_v^{(l)} = ∂f_act(net_v^{(l)}, σ_v) / ∂net_v^{(l)} = ∂/∂net_v^{(l)} e^{−(net_v^{(l)})² / (2σ_v²)} = −(net_v^{(l)} / σ_v²) e^{−(net_v^{(l)})² / (2σ_v²)}.

Radial Basis Function Networks: Training

Training radial basis function networks: radii of the radial basis functions.

Gradient:

∂e^{(l)} / ∂σ_v = −2 Σ_{s ∈ succ(v)} (o_s^{(l)} − out_s^{(l)}) w_{sv} · (∂out_v^{(l)} / ∂σ_v).

Weight update rule:

Δσ_v^{(l)} = −(η_2 / 2) ∂e^{(l)} / ∂σ_v = η_2 Σ_{s ∈ succ(v)} (o_s^{(l)} − out_s^{(l)}) w_{sv} · (∂out_v^{(l)} / ∂σ_v).

Special case: Gaussian activation function

∂out_v^{(l)} / ∂σ_v = ∂/∂σ_v e^{−(net_v^{(l)})² / (2σ_v²)} = ((net_v^{(l)})² / σ_v³) e^{−(net_v^{(l)})² / (2σ_v²)}.
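Putting the last three slides together for a single Gaussian hidden neuron v with one successor output neuron s, a sketch of the center and radius updates (η_1, η_2 and the names are illustrative, not lecture code):

```python
import numpy as np

def update_center_and_radius(center, sigma, x, w_sv, o_s, out_s,
                             eta1=0.05, eta2=0.01):
    net_v = np.linalg.norm(x - center)            # Euclidean network input
    out_v = np.exp(-net_v ** 2 / (2 * sigma ** 2))  # Gaussian activation
    delta = (o_s - out_s) * w_sv                  # backpropagated error term
    # d out_v / d net_v = -(net_v / sigma^2) out_v and
    # d net_v / d center = (center - x) / net_v combine to (out_v / sigma^2)(x - center)
    if net_v > 0:
        center = center + eta1 * delta * (out_v / sigma ** 2) * (x - center)
    # d out_v / d sigma = (net_v^2 / sigma^3) out_v
    sigma = sigma + eta2 * delta * (net_v ** 2 / sigma ** 3) * out_v
    return center, sigma
```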

Radial Basis Function Networks: Generalization

Generalization of the distance function. Idea: use an anisotropic distance function.

Example: Mahalanobis distance

d(x, y) = √( (x − y)ᵀ Σ⁻¹ (x − y) ).

Example: biimplication x_1 ↔ x_2 with

Σ = [ 9  8
      8  9 ]

[Figure: a single elliptical catchment area, stretched along the diagonal, covers (0,0) and (1,1) but not (1,0) and (0,1), so one hidden neuron suffices.]
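A sketch of this distance with the Σ from the example; placing the center at (1/2, 1/2) is my assumption from the figure:

```python
import numpy as np

def mahalanobis(x, y, Sigma):
    d = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(d @ np.linalg.inv(Sigma) @ d))

Sigma = np.array([[9., 8.], [8., 9.]])
center = np.array([0.5, 0.5])   # assumed center of the elliptical catchment area
for p in ([0., 0.], [1., 0.], [0., 1.], [1., 1.]):
    print(p, round(mahalanobis(p, center, Sigma), 4))
# (0,0) and (1,1) are close (approx 0.17), (1,0) and (0,1) far (approx 0.71):
# one anisotropic catchment area separates the biimplication's classes
```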

Radial Basis Function Networks: Application

Advantages: simple feed-forward architecture; easy adaptation; quick optimization and calculation.

Applications: continuous processes that adapt to changes rapidly; approximation; pattern recognition; control engineering.

Below: examples from [Schwenker et al. 2001].

Example: Handwriting Recognition

Recognition of handwritten digits.

Dataset: 20,000 handwritten digits (2,000 samples of each class), normalized in height and width, represented by 16 × 16 greyscale values G_ij ∈ {0, ..., 255}.

Data source: EU project StatLog [Michie et al. 1994].

Example: Results of RBF Networks

10,000 training samples and 10,000 validation samples (1,000 of each class per set); the training data are used for learning the classifiers, the validation data for evaluation; initialization by k-means clustering with 200 prototypes (20 for each class).

Method   Explanation                        Test accuracy
C4.5     decision tree                      91.2%
RBF      k-means for the RBF centers        96.94%
MLP      1 hidden layer with 200 neurons    97.59%

Here: median of the test accuracy over three training and validation runs.

Example: k-Means Clustering for RBF Initialization

Shown: 60 cluster centers for the digits 0-9 after k-means clustering. A separate run of the algorithm was performed for every digit, with k = 6. The centers were initialized with samples chosen randomly from the training data set.

[Figure: the resulting prototype images.]

Example: RBF Centers Found After 3 Training Runs

Shown: 60 RBF centers of the digits 0-9 after 3 backpropagation runs of the RBF network. Hardly any difference to the k-means prototypes can be seen, but ...

[Figure: the trained prototype images.]

Example: Comparison of k-Means and RBF Network

Here: Euclidean distances between the 60 RBF centers before and after training. The centers are sorted by class, the first 6 columns/rows representing the digit 0, the next 6 rows/columns the digit 1, etc. Distances are coded as greyscale values: the smaller, the darker.

Left (before training): many small distances between centers of different classes (e.g. digits 3 and 8, 9); these small distances cause misclassifications.

Right (after training): there are no small distances between centers of different classes anymore.