Assignment 2. Classification and Regression using Linear Networks, Multilayer Perceptron Networks, and Radial Basis Functions


ENEE 739Q: STATISTICAL AND NEURAL PATTERN RECOGNITION, Spring 2002
Assignment 2: Classification and Regression using Linear Networks, Multilayer Perceptron Networks, and Radial Basis Functions
Aravind Sundaresan, aravinds@glue.umd.edu


1. Pattern Classification using Linear Networks

A set of $N = 300$ training samples was used to train a $3 \times 3$ linear network, where the input is a 3-dimensional vector $X = [x \;\; y \;\; 100]^T$ (the bias is chosen to be 0.5). The LMS algorithm was used to train the weights iteratively. The output is a 3-dimensional vector, $Z$; in the desired output, the $i$-th element is set to 1 if the input is from the $i$-th class and to zero otherwise. The output of the linear network is calculated as follows.

$$Z = W X,$$

where $X$ is the input and $W$ is the $3 \times 3$ weight matrix. The class decision is

$$O = \arg\max_i Z_i.$$

Strategy: The learning rate needs to be chosen carefully, because large values cause the error to diverge and make the algorithm unstable. The learning rate is a function of the iteration index $t$ and is given by

$$\eta(t) = \frac{\eta_0}{t}.$$

It is a good idea to normalize the input so that input values lie in $[0, 1]$ or $[-1, 1]$. In the implementation the inputs have been scaled so that they lie in $[0, 1]^d$.

Figure 1.1: The performance of the network for different learning rates
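The LMS update and the arg-max decision described above can be sketched in a few lines. This is only an illustration, not the report's implementation: the function names (`forward`, `lms_step`, `train`, `classify`) are hypothetical, and the decaying schedule `eta0 / t` is an assumption about the form of the learning rate.

```python
def forward(W, X):
    """Linear network output Z = W X (W is a list of rows)."""
    return [sum(w * x for w, x in zip(row, X)) for row in W]

def lms_step(W, X, T, eta):
    """One LMS update in place; returns the energy 0.5 * ||Z - T||^2 before the update."""
    Z = forward(W, X)
    for i, row in enumerate(W):
        err = T[i] - Z[i]
        for j in range(len(row)):
            row[j] += eta * err * X[j]
    return 0.5 * sum((z - t) ** 2 for z, t in zip(Z, T))

def train(W, samples, targets, eta0, epochs):
    """Run LMS over the data with the assumed schedule eta(t) = eta0 / t."""
    t = 0
    history = []
    for _ in range(epochs):
        energy = 0.0
        for X, T in zip(samples, targets):
            t += 1
            energy += lms_step(W, X, T, eta0 / t)
        history.append(energy / len(samples))  # mean energy per epoch
    return history

def classify(W, X):
    """Index of the largest output, O = argmax_i Z_i."""
    Z = forward(W, X)
    return max(range(len(Z)), key=lambda i: Z[i])
```

Tracking the mean energy per epoch, as `train` does, is one way to reproduce the kind of convergence curve that Figure 1.1 reports for different learning rates.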

Results: The rate of convergence of the error for three different values of $\eta$ is illustrated in Figure 1.1. Convergence is faster and the error (the energy function, which is set equal to $\frac{1}{2}\|Z - T\|^2$) is lower for higher learning rates but, as observed earlier, the algorithm becomes unstable when the learning rate is too high, and the error function diverges. The original configuration and the classification achieved by the linear network after training with the chosen learning rate are illustrated in Figure 1.2.

Conclusions: The performance of the network is limited by its linearity. As can be observed from Figure 1.2, only linear discrimination can be performed. In this case, where the input is from a 2-dimensional space, the input space is split into regions (classes) separated by lines (hyperplanes in the general case).

Figure 1.2: The output of the network

2. Pattern Classification using Multi Layer Perceptrons

A set of $N = 2000$ training samples was used to train a $3$-$h$-$1$ network (multilayer perceptron network) using the back propagation algorithm. The input is a 3-dimensional vector $X = [x \;\; y \;\; 50]^T$. The desired output is a scalar which takes the value $+1$ if the input is in the foreground and the value $-1$ if it is from the background.

Strategy: Initial weights are uniformly (and independently) distributed in $[-0.5C, 0.5C]$, where $C$ is a scaling constant that is inversely proportional to the average magnitude of the input. The training rate, $\eta$, is calculated as follows.

$$\eta(t) = \frac{\eta_0}{1 + t/400},$$

where $t$ is the iteration index and $\eta_0$ is the initial training rate.

The tan sigmoid function is chosen as the activation function. The activation function and its derivative are calculated as

$$f(x) = 1.7 \tanh(0.7x), \qquad f'(x) = 1.19\,\bigl(1 - \tanh^2(0.7x)\bigr).$$

The error function is calculated as

$$J = \frac{1}{2} \sum_{n=1}^{N} (Z_n - T_n)^2,$$

where $T_n$ is the desired output and $Z_n$ is the actual output. The weights are updated for every sample input (online training) according to the back propagation algorithm. The input is not scaled, so the actual weight increment is multiplied by a scaling factor (inversely proportional to the average magnitude of the input vector) to obtain the modified weight increment.

The training strategy is to continue training the network until the training set error is below a predetermined threshold. Since the error functions of both the training set and the validation set may have multiple minima, deciding when to stop training becomes complicated if the decision is based on a minimum of the error function of either set. Here, since there is a very clear demarcation between the foreground and the background, the error of the validation set does not attain a minimum even after several iterations. Therefore a good stopping criterion would be based on the value of the training set error. In the MLP network implemented, training is stopped after 2000 iterations or when $J(t) < E_{threshold}$, whichever occurs first.

Results: Table 2.1 illustrates how the validation error varies with the number of hidden units, the stopping criterion being $J(t) < E_{threshold}$. Figure 2.1 shows the output of the network (without thresholding) for several values of $h$.

Table 2.1: Number of hidden units for optimal performance (columns: Hidden units, Stopping iteration, Error of training set, Error of validation set)

As both Table 2.1 and Figure 2.1 indicate, the optimal choice for the number of hidden units seems to be 25. Figure 2.2 and Figure 2.3 illustrate the performance of the network with 25 hidden units.
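As a rough illustration of the online training described in this section, the sketch below implements the activation $f(x) = 1.7\tanh(0.7x)$, its derivative $1.7 \cdot 0.7\,(1 - \tanh^2(0.7x))$, the uniform weight initialization in $[-0.5C, 0.5C]$, and one back-propagation update for a 3-h-1 network. The function names are hypothetical, and two details are assumptions rather than facts from the report: the output unit uses the same activation as the hidden units, and the hidden layer feeds the output without an extra bias unit.

```python
import math
import random

def f(x):
    """Tan sigmoid activation."""
    return 1.7 * math.tanh(0.7 * x)

def f_prime(x):
    """Derivative of f: 1.7 * 0.7 * (1 - tanh^2(0.7 x))."""
    return 1.19 * (1.0 - math.tanh(0.7 * x) ** 2)

def init_weights(h, n_in=3, C=1.0, seed=0):
    """Weights drawn independently and uniformly from [-0.5 C, 0.5 C]."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5 * C, 0.5 * C) for _ in range(n_in)] for _ in range(h)]
    W2 = [rng.uniform(-0.5 * C, 0.5 * C) for _ in range(h)]
    return W1, W2

def backprop_step(W1, W2, X, T, eta):
    """One online update in place; returns 0.5 * (Z - T)^2 measured before the update."""
    net1 = [sum(w * x for w, x in zip(row, X)) for row in W1]
    Y = [f(a) for a in net1]                      # hidden activations
    net2 = sum(w * y for w, y in zip(W2, Y))
    Z = f(net2)                                   # assumed: tanh output unit
    delta_out = (T - Z) * f_prime(net2)
    for j in range(len(W2)):
        # Hidden delta uses the output weight as it was before this update.
        delta_hidden = delta_out * W2[j] * f_prime(net1[j])
        W2[j] += eta * delta_out * Y[j]
        for k in range(len(X)):
            W1[j][k] += eta * delta_hidden * X[k]
    return 0.5 * (Z - T) ** 2
```

Summing the returned per-sample errors over one pass through the training set gives a running estimate of $J$ that can be compared against the threshold to decide when to stop.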

Figure 2.1: The performance of the MLP network for different values of h
Figure 2.2: Performance of MLP network for h = 25

Figure 2.3: The error of the MLP network with h = 25

Optimal Brain Damage: Because of the random nature of the initialization process, and possibly other factors, the optimal performance of the MLP network is obtained with a higher number of hidden units than may actually be necessary. Thus, some of the weights in the network with the optimal number of hidden units may be superfluous or redundant. These redundant weights or units may be removed by a process called Optimal Brain Damage, which sets to zero the weights that do not affect the output, or the performance, of the network. This has been implemented in the following manner.

1. Train the network using $h \ge h_{opt}$ hidden units, where $h_{opt}$ is the number required in the optimal case determined earlier (in this case the number of hidden units is chosen as 25).
2. Determine the saliency of each of the weights in the input-hidden layer and set to zero the three weights that have the smallest saliency.
3. Train the network (keeping the discarded weights equal to zero) until the training set error is less than the threshold or until 2000 iterations are completed. If the final error is less than the threshold, there is scope for further pruning: repeat step 2. If the final error is greater than the threshold, it can be concluded that the number of nonzero weights remaining is less than the number necessary: go to step 4.
4. Use the most recent weight vector that gave an error less than the threshold with the training set.

Using an initial value of $h = 25$ and pruning the weights with $E_{threshold} = 0.045$, we ended up with a network that had 45 nonzero weights and 20 hidden units. The performance of the pruned network is illustrated in Figure 2.4. The results of the pruning are summarized in Table 2.2. The number of weights has been reduced by 40% and 5 (20%) of the hidden units have been removed.

Table 2.2: Summary of the pruning (columns: Hidden units, Weights, Error of training set, Error of validation set; rows: Before pruning, After pruning)

Figure 2.4: Performance of MLP network after pruning

3. Function Approximation using Radial Basis Functions

The objective is to train a $3$-$h$-$1$ RBF network using $N = 1000$ sample points. Though the input is a 3-dimensional vector as before, the bias does not make any difference, because the bias of all the "function centres" is the same as the bias of the input.

Strategy: The function centres are selected randomly from the training set. The function used in the network is the inverse multiquadric basis function, defined as

$$\phi_i(x) = \frac{1}{\sqrt{1 + \|x - x_i\|^2 / \sigma^2}},$$

where $x$ is the input, $x_i$ is the function centre, and $\sigma$ is the spread.
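The hidden layer of such an RBF network can be sketched as follows. The basis function is assumed to take the form $1/\sqrt{1 + \|x - x_i\|^2/\sigma^2}$, with $\sigma$ the spread; the helper names (`inverse_multiquadric`, `pick_centres`, `hidden_layer`) and the fixed seed are hypothetical choices made for illustration.

```python
import math
import random

def inverse_multiquadric(x, centre, sigma):
    """Assumed form: 1 / sqrt(1 + ||x - centre||^2 / sigma^2)."""
    r2 = sum((a - b) ** 2 for a, b in zip(x, centre))
    return 1.0 / math.sqrt(1.0 + r2 / sigma ** 2)

def pick_centres(training_inputs, h, seed=0):
    """Select h function centres at random (without replacement) from the training set."""
    return random.Random(seed).sample(training_inputs, h)

def hidden_layer(x, centres, sigma):
    """Responses of all h basis functions to the input x."""
    return [inverse_multiquadric(x, c, sigma) for c in centres]
```

Because the centres are drawn from the training inputs and share the same constant bias component, that component cancels in $x - x_i$, which matches the remark that the bias makes no difference here.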

The "variance" or spread, $\sigma$, is set according to the number of function centres chosen (the hidden units). The experiment is repeated for different values of $h$, the number of hidden units. The value of $\sigma$ for a given value of $h$ is calculated so that $h$ is proportional to the ratio of the area of the domain of the mapping to $\sigma^2$. The weights $W$ are determined iteratively using the LMS algorithm. The weights are trained until the validation set error increases continuously for 3 epochs or the number of iterations exceeds 200. The network is trained and the results are compared for different values of $h$.

Figure 3.1: Performance of RBF network for different values of h

Table 3.1: Performance of RBF network for different values of h (columns: Hidden units, Error of training set, Error of validation set)
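The stopping rule for the RBF weights (stop once the validation error has risen for 3 consecutive epochs, or after 200 epochs) can be sketched independently of the training code. The callbacks `run_epoch` and `validation_error` are hypothetical placeholders for one LMS pass over the training set and one evaluation on the validation set.

```python
def train_until_stop(run_epoch, validation_error, patience=3, max_epochs=200):
    """Train until the validation error rises `patience` epochs in a row or max_epochs is hit.

    Returns the number of epochs run and the validation error history
    (the first entry is the error before any training).
    """
    previous = validation_error()
    history = [previous]
    rises = 0
    for _ in range(max_epochs):
        run_epoch()
        current = validation_error()
        history.append(current)
        # Count consecutive increases; any non-increase resets the counter.
        rises = rises + 1 if current > previous else 0
        previous = current
        if rises >= patience:
            break
    return len(history) - 1, history
```

A caller that wants the best weights rather than the last ones would also need to keep a copy of the weights from the epoch with the lowest validation error; that bookkeeping is left out of this sketch.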

Results: The results for different values of $h$ are listed in Table 3.1 and the respective outputs of the network are illustrated in Figure 3.1. The performance of the network for $h = 80$ is illustrated in Figure 3.2 and Figure 3.3. The RBF network performs rather poorly because we do not train the function centres or the "variance" of the radial basis functions. Training these parameters using the EM algorithm or the gradient descent algorithm should result in much better performance. Besides, the performance of the RBF network depends strongly on the choice of the radial basis function, and the network is better suited to (smooth) function approximation than to the current scenario. The RBF network is not able to sharply define the boundary regions because of the inherent smoothness of the basis function.

Figure 3.2: Performance of RBF network for h = 80

Figure 3.3: The error of the RBF network with h = 80

4. Optical Character Reader

To implement an OCR we require a multi-output multilayer network. The input is a 16x16 grayscale image and a bias. The simplest network architecture would have 257 input nodes, $h$ hidden units, and 10 output nodes: a $257$-$h$-$10$ MLP network.

Strategy: The training set can be obtained by using manufactured data that provides for translational, rotational, and scale invariance in the network. The target output is set as follows.

$$T_i = \begin{cases} +1, & \text{input} \in \text{class } i \\ -1, & \text{input} \notin \text{class } i \end{cases}$$

The network is trained using the manufactured data. The manufactured data has a translation (in pixels) that is uniformly distributed in $[-1.5, 1.5]$, a rotation (in degrees) that is uniformly distributed in $[-9, 9]$, and a scale factor that is uniformly distributed in $[0.9, 1.1]$. A subset of the training set is presented in Figure 4.1. The output of the neural network is chosen as follows.

$$O = \arg\max_i Z_i$$
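One way to manufacture such training data is to draw the translation, rotation, and scale from the stated uniform ranges and warp each 16x16 image accordingly. The sketch below is an assumption-laden illustration, not the report's code: the helper names are hypothetical, the horizontal and vertical translations are drawn independently, the transform is taken about the image centre, and nearest-neighbour sampling with zero fill is used for simplicity.

```python
import math
import random

def sample_params(rng):
    """Draw one random distortion from the ranges stated in the text."""
    return {
        "dx": rng.uniform(-1.5, 1.5),                    # pixels
        "dy": rng.uniform(-1.5, 1.5),                    # pixels (assumed independent of dx)
        "angle": math.radians(rng.uniform(-9.0, 9.0)),   # degrees converted to radians
        "scale": rng.uniform(0.9, 1.1),
    }

def warp(image, dx, dy, angle, scale):
    """Rotate and scale about the centre, then translate; nearest-neighbour, zero fill."""
    n = len(image)
    c = (n - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            # Inverse mapping: find the source pixel for each output pixel.
            u, v = col - c - dx, r - c - dy
            src_c = (cos_a * u + sin_a * v) / scale + c
            src_r = (-sin_a * u + cos_a * v) / scale + c
            i, j = int(round(src_r)), int(round(src_c))
            if 0 <= i < n and 0 <= j < n:
                out[r][col] = image[i][j]
    return out

def target_vector(digit, n_classes=10):
    """Target with +1 for the true class and -1 for every other class."""
    return [1.0 if i == digit else -1.0 for i in range(n_classes)]
```

Each original digit image can then be expanded into several training samples by calling `warp(image, **sample_params(rng))` with a seeded `random.Random` instance, all sharing the same `target_vector`.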

The training is continued for 1000 iterations or until the number of misclassified samples for the validation set remains consistently higher than the sum of the minimum value achieved and a threshold.

Figure 4.1: Manufactured data for rotational, translational, and scale invariance

Dimensionality reduction using PCA: In the previous case the input dimension is rather large, which increases the computation because the number of weights to be trained depends on the number of input nodes. If the image can be represented by a smaller vector, the training becomes much less computationally intensive. To this end, the input vector can be transformed using Principal Component Analysis. An estimate of the autocorrelation matrix is obtained from the training set data and, using this estimate, the $k$ principal eigenvectors (the eigenvectors corresponding to the largest eigenvalues) are obtained. The projections of the input vector on these $k$ components are packed into a $k$-dimensional vector, which retains as much information as is necessary to correctly identify the digit. This has the additional advantage that some noise (unnecessary information) is also filtered out, which results in better performance. In the implementation $k$ is set to 30. Thus, including the bias, the dimension of the input vector is 31.

Results: The results of the training for both the normal case and the PCA case are summarized in Table 4.1. The performances of the normal and PCA cases are also illustrated in Figure 4.2 and Figure 4.3 respectively.

Table 4.1: Summary of performances for Normal and PCA cases (columns: Type, Input dimension, Hidden units, Iterations, Misclassified samples, Error of training set, Error of validation set; rows: Normal, PCA)
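The PCA step described above (estimate the autocorrelation matrix, take the $k$ principal eigenvectors, project, append the bias) can be sketched with the standard library alone. Power iteration with deflation is swapped in here as a simple eigen-solver; the report does not say which solver was used. The function names, the seed, and the bias value of 1.0 are hypothetical.

```python
import math
import random

def autocorrelation(samples):
    """Estimate R = (1/N) * sum of x x^T over the training samples."""
    d, n = len(samples[0]), len(samples)
    R = [[0.0] * d for _ in range(d)]
    for x in samples:
        for i in range(d):
            for j in range(d):
                R[i][j] += x[i] * x[j] / n
    return R

def principal_vectors(R, k, iters=500, seed=0):
    """Top-k eigenvectors of a symmetric PSD matrix by power iteration with deflation."""
    d = len(R)
    R = [row[:] for row in R]          # work on a copy
    rng = random.Random(seed)
    vectors = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(R[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0.0:
                break
            v = [a / norm for a in w]
        lam = sum(v[i] * sum(R[i][j] * v[j] for j in range(d)) for i in range(d))
        vectors.append(v)
        for i in range(d):             # deflate: remove the component just found
            for j in range(d):
                R[i][j] -= lam * v[i] * v[j]
    return vectors

def project(x, vectors, bias=1.0):
    """k projections of x followed by a constant bias component (k + 1 values)."""
    return [sum(a * b for a, b in zip(x, v)) for v in vectors] + [bias]
```

With $k = 30$ the projected input has 31 components including the bias, so the first weight layer shrinks from $257h$ to $31h$ weights.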

Figure 4.2: Performance of the network: direct input
Figure 4.3: Performance of the network: PCA

As can be seen, using PCA to reduce the dimensions of the input leads to far better performance (both in terms of speed of convergence and validation set error), with the number of misclassified samples in the validation set falling as low as 0.20% (2 in 1000 samples). In a more general setting it may be a good idea to use a general transformation such as the DCT and select the low-frequency components to represent the image.

5. References

1. Yann Le Cun, John S. Denker and Sara A. Solla, Optimal Brain Damage. AT&T Bell Laboratories, NJ.
2. Richard Duda, Peter Hart, and David Stork, Pattern Classification. Wiley Interscience, New York.


Recitation Supplement: Creating a Neural Network for Classification SAS EM December 2, 2002 Recitation Supplement: Creating a Neural Network for Classification SAS EM December 2, 2002 Introduction Neural networks are flexible nonlinear models that can be used for regression and classification

More information

A Systematic Overview of Data Mining Algorithms

A Systematic Overview of Data Mining Algorithms A Systematic Overview of Data Mining Algorithms 1 Data Mining Algorithm A well-defined procedure that takes data as input and produces output as models or patterns well-defined: precisely encoded as a

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Image Data: Classification via Neural Networks Instructor: Yizhou Sun yzsun@ccs.neu.edu November 19, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining

More information

Face Recognition for Mobile Devices

Face Recognition for Mobile Devices Face Recognition for Mobile Devices Aditya Pabbaraju (adisrinu@umich.edu), Srujankumar Puchakayala (psrujan@umich.edu) INTRODUCTION Face recognition is an application used for identifying a person from

More information

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska Classification Lecture Notes cse352 Neural Networks Professor Anita Wasilewska Neural Networks Classification Introduction INPUT: classification data, i.e. it contains an classification (class) attribute

More information

A Novel Technique for Optimizing the Hidden Layer Architecture in Artificial Neural Networks N. M. Wagarachchi 1, A. S.

A Novel Technique for Optimizing the Hidden Layer Architecture in Artificial Neural Networks N. M. Wagarachchi 1, A. S. American International Journal of Research in Science, Technology, Engineering & Mathematics Available online at http://www.iasir.net ISSN (Print): 2328-3491, ISSN (Online): 2328-3580, ISSN (CD-ROM): 2328-3629

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Neural Computation : Lecture 14 John A. Bullinaria, 2015 1. The RBF Mapping 2. The RBF Network Architecture 3. Computational Power of RBF Networks 4. Training

More information

Lecture 8 Object Descriptors

Lecture 8 Object Descriptors Lecture 8 Object Descriptors Azadeh Fakhrzadeh Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University 2 Reading instructions Chapter 11.1 11.4 in G-W Azadeh Fakhrzadeh

More information

The Curse of Dimensionality

The Curse of Dimensionality The Curse of Dimensionality ACAS 2002 p1/66 Curse of Dimensionality The basic idea of the curse of dimensionality is that high dimensional data is difficult to work with for several reasons: Adding more

More information

Robustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification

Robustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification Robustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification Tomohiro Tanno, Kazumasa Horie, Jun Izawa, and Masahiko Morita University

More information

Motivation. Problem: With our linear methods, we can train the weights but not the basis functions: Activator Trainable weight. Fixed basis function

Motivation. Problem: With our linear methods, we can train the weights but not the basis functions: Activator Trainable weight. Fixed basis function Neural Networks Motivation Problem: With our linear methods, we can train the weights but not the basis functions: Activator Trainable weight Fixed basis function Flashback: Linear regression Flashback:

More information

Hand Written Digit Recognition Using Tensorflow and Python

Hand Written Digit Recognition Using Tensorflow and Python Hand Written Digit Recognition Using Tensorflow and Python Shekhar Shiroor Department of Computer Science College of Engineering and Computer Science California State University-Sacramento Sacramento,

More information

Experimental Data and Training

Experimental Data and Training Modeling and Control of Dynamic Systems Experimental Data and Training Mihkel Pajusalu Alo Peets Tartu, 2008 1 Overview Experimental data Designing input signal Preparing data for modeling Training Criterion

More information

GENDER CLASSIFICATION USING SUPPORT VECTOR MACHINES

GENDER CLASSIFICATION USING SUPPORT VECTOR MACHINES GENDER CLASSIFICATION USING SUPPORT VECTOR MACHINES Ashwin Swaminathan ashwins@umd.edu ENEE633: Statistical and Neural Pattern Recognition Instructor : Prof. Rama Chellappa Project 2, Part (a) 1. INTRODUCTION

More information

Univariate and Multivariate Decision Trees

Univariate and Multivariate Decision Trees Univariate and Multivariate Decision Trees Olcay Taner Yıldız and Ethem Alpaydın Department of Computer Engineering Boğaziçi University İstanbul 80815 Turkey Abstract. Univariate decision trees at each

More information

Opening the Black Box Data Driven Visualizaion of Neural N

Opening the Black Box Data Driven Visualizaion of Neural N Opening the Black Box Data Driven Visualizaion of Neural Networks September 20, 2006 Aritificial Neural Networks Limitations of ANNs Use of Visualization (ANNs) mimic the processes found in biological

More information

MSA220 - Statistical Learning for Big Data

MSA220 - Statistical Learning for Big Data MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture XV (04.02.08) Contents: Function Minimization (see E. Lohrmann & V. Blobel) Optimization Problem Set of n independent variables Sometimes in addition some constraints

More information

A Systematic Overview of Data Mining Algorithms. Sargur Srihari University at Buffalo The State University of New York

A Systematic Overview of Data Mining Algorithms. Sargur Srihari University at Buffalo The State University of New York A Systematic Overview of Data Mining Algorithms Sargur Srihari University at Buffalo The State University of New York 1 Topics Data Mining Algorithm Definition Example of CART Classification Iris, Wine

More information

Support Vector Machines.

Support Vector Machines. Support Vector Machines srihari@buffalo.edu SVM Discussion Overview 1. Overview of SVMs 2. Margin Geometry 3. SVM Optimization 4. Overlapping Distributions 5. Relationship to Logistic Regression 6. Dealing

More information

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017 Data Analysis 3 Support Vector Machines Jan Platoš October 30, 2017 Department of Computer Science Faculty of Electrical Engineering and Computer Science VŠB - Technical University of Ostrava Table of

More information

HW Assignment 3 (Due by 9:00am on Mar 6)

HW Assignment 3 (Due by 9:00am on Mar 6) HW Assignment 3 (Due by 9:00am on Mar 6) 1 Theory (150 points) 1. [Tied Weights, 50 points] Write down the gradient computation for a (non-linear) auto-encoder with tied weights i.e., W (2) = (W (1) )

More information

Handwritten Hindi Numerals Recognition System

Handwritten Hindi Numerals Recognition System CS365 Project Report Handwritten Hindi Numerals Recognition System Submitted by: Akarshan Sarkar Kritika Singh Project Mentor: Prof. Amitabha Mukerjee 1 Abstract In this project, we consider the problem

More information

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06 Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,

More information

CS 6501: Deep Learning for Computer Graphics. Training Neural Networks II. Connelly Barnes

CS 6501: Deep Learning for Computer Graphics. Training Neural Networks II. Connelly Barnes CS 6501: Deep Learning for Computer Graphics Training Neural Networks II Connelly Barnes Overview Preprocessing Initialization Vanishing/exploding gradients problem Batch normalization Dropout Additional

More information

Support Vector Machines

Support Vector Machines Support Vector Machines About the Name... A Support Vector A training sample used to define classification boundaries in SVMs located near class boundaries Support Vector Machines Binary classifiers whose

More information

Neural Networks for Classification

Neural Networks for Classification Neural Networks for Classification Andrei Alexandrescu June 19, 2007 1 / 40 Neural Networks: History What is a Neural Network? Examples of Neural Networks Elements of a Neural Network 2 / 40 Neural Networks:

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Unsupervised learning Daniel Hennes 29.01.2018 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Supervised learning Regression (linear

More information

Neural Network Neurons

Neural Network Neurons Neural Networks Neural Network Neurons 1 Receives n inputs (plus a bias term) Multiplies each input by its weight Applies activation function to the sum of results Outputs result Activation Functions Given

More information

Tangent Prop - A formalism for specifying selected invariances in an adaptive network

Tangent Prop - A formalism for specifying selected invariances in an adaptive network Tangent Prop - A formalism for specifying selected invariances in an adaptive network Patrice Simard AT&T Bell Laboratories 101 Crawford Corner Rd Holmdel, NJ 07733 Yann Le Cun AT&T Bell Laboratories 101

More information

Week 3: Perceptron and Multi-layer Perceptron

Week 3: Perceptron and Multi-layer Perceptron Week 3: Perceptron and Multi-layer Perceptron Phong Le, Willem Zuidema November 12, 2013 Last week we studied two famous biological neuron models, Fitzhugh-Nagumo model and Izhikevich model. This week,

More information

2. Basic Task of Pattern Classification

2. Basic Task of Pattern Classification 2. Basic Task of Pattern Classification Definition of the Task Informal Definition: Telling things apart 3 Definition: http://www.webopedia.com/term/p/pattern_recognition.html pattern recognition Last

More information

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Neural Networks Classifier Introduction INPUT: classification data, i.e. it contains an classification (class) attribute. WE also say that the class

More information

Data Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs)

Data Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs) Data Mining: Concepts and Techniques Chapter 9 Classification: Support Vector Machines 1 Support Vector Machines (SVMs) SVMs are a set of related supervised learning methods used for classification Based

More information

Convolution Neural Networks for Chinese Handwriting Recognition

Convolution Neural Networks for Chinese Handwriting Recognition Convolution Neural Networks for Chinese Handwriting Recognition Xu Chen Stanford University 450 Serra Mall, Stanford, CA 94305 xchen91@stanford.edu Abstract Convolutional neural networks have been proven

More information

Applied Neuroscience. Columbia Science Honors Program Fall Machine Learning and Neural Networks

Applied Neuroscience. Columbia Science Honors Program Fall Machine Learning and Neural Networks Applied Neuroscience Columbia Science Honors Program Fall 2016 Machine Learning and Neural Networks Machine Learning and Neural Networks Objective: Introduction to Machine Learning Agenda: 1. JavaScript

More information

( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components

( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components Review Lecture 14 ! PRINCIPAL COMPONENT ANALYSIS Eigenvectors of the covariance matrix are the principal components 1. =cov X Top K principal components are the eigenvectors with K largest eigenvalues

More information

CSE 481C Imitation Learning in Humanoid Robots Motion capture, inverse kinematics, and dimensionality reduction

CSE 481C Imitation Learning in Humanoid Robots Motion capture, inverse kinematics, and dimensionality reduction 1 CSE 481C Imitation Learning in Humanoid Robots Motion capture, inverse kinematics, and dimensionality reduction Robotic Imitation of Human Actions 2 The inverse kinematics problem Joint angles Human-robot

More information

Deep Learning. Volker Tresp Summer 2014

Deep Learning. Volker Tresp Summer 2014 Deep Learning Volker Tresp Summer 2014 1 Neural Network Winter and Revival While Machine Learning was flourishing, there was a Neural Network winter (late 1990 s until late 2000 s) Around 2010 there

More information

A Novel Pruning Algorithm for Optimizing Feedforward Neural Network of Classification Problems

A Novel Pruning Algorithm for Optimizing Feedforward Neural Network of Classification Problems Chapter 5 A Novel Pruning Algorithm for Optimizing Feedforward Neural Network of Classification Problems 5.1 Introduction Many researchers have proposed pruning algorithms in numerous ways to optimize

More information

Machine Learning : Clustering, Self-Organizing Maps

Machine Learning : Clustering, Self-Organizing Maps Machine Learning Clustering, Self-Organizing Maps 12/12/2013 Machine Learning : Clustering, Self-Organizing Maps Clustering The task: partition a set of objects into meaningful subsets (clusters). The

More information

This leads to our algorithm which is outlined in Section III, along with a tabular summary of it's performance on several benchmarks. The last section

This leads to our algorithm which is outlined in Section III, along with a tabular summary of it's performance on several benchmarks. The last section An Algorithm for Incremental Construction of Feedforward Networks of Threshold Units with Real Valued Inputs Dhananjay S. Phatak Electrical Engineering Department State University of New York, Binghamton,

More information

Neural Networks (pp )

Neural Networks (pp ) Notation: Means pencil-and-paper QUIZ Means coding QUIZ Neural Networks (pp. 106-121) The first artificial neural network (ANN) was the (single-layer) perceptron, a simplified model of a biological neuron.

More information