Neural Networks. CE-725: Statistical Pattern Recognition, Sharif University of Technology, Spring 2013, Soleymani
1 Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani
2 Outline
Biological and artificial neural networks
Feed-forward neural networks
Single-layer networks
Multi-Layer Perceptron (MLP)
Back-propagation
RBF networks
3 Biological Neural Network
How do humans recognize patterns?
Human brain:
Structure: neurons and the connections between them.
Learning: altering the strengths of the connections between neurons (including adding or deleting connections).
[Figures: a single neuron; connections between neurons]
4 Artificial Neural Networks
Artificial Neural Networks (ANNs): mathematical models inspired by biological neural networks.
[Figure: a unit forms the weighted sum $\sum_i w_i x_i$ of its inputs and passes it through an activation function $f$]
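5 Neuron
Neuron, unit, or processing element: the output is $f\left(\sum_{i=1}^{d} w_i x_i + w_0\right)$, where $w_0$ is the bias.
Equivalent to $f\left(\sum_{i=0}^{d} w_i x_i\right)$ with a constant input $x_0 = 1$.
Binary McCulloch-Pitts neuron: $f(a) = 1$ if $a \ge \theta$ and $f(a) = 0$ if $a < \theta$, where $\theta$ is the activation threshold.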
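A minimal sketch of a binary threshold unit, assuming the threshold is folded into the bias (so the unit fires when the weighted sum plus bias is non-negative). The name `threshold_unit` and the AND-gate weights are illustrative choices, not taken from the slides.

```python
def threshold_unit(x, w, w0):
    """Binary threshold unit: returns 1 if w.x + w0 >= 0, else 0."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if activation >= 0 else 0


def and_gate(x):
    # Illustrative weights: the unit fires only when both inputs are 1.
    return threshold_unit(x, w=[1.0, 1.0], w0=-1.5)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [and_gate(x) for x in inputs]
print(outputs)
```

With these weights the weighted sum exceeds the threshold only for the input (1, 1), so a single unit realizes logical AND.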
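6 Activation function
The most common activation functions:
Unit step.
Sigmoid: $f(a) = \frac{1}{1 + \exp(-\beta a)}$, which approaches the unit step as the slope $\beta$ grows.
[Figure: unit step and sigmoid curves for $\beta = 2$, $\beta = 1$, and $\beta = 0.5$]
Bipolar activation functions can be defined as well; these are usually more attractive: $\tanh(\beta a / 2) = \frac{1 - \exp(-\beta a)}{1 + \exp(-\beta a)}$.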
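The two activation functions can be sketched as follows, assuming a slope parameter (called `beta` here; the symbol is an assumption). The loop prints the bipolar form next to `tanh(beta * a / 2)` so the identity can be checked numerically.

```python
import math


def sigmoid(a, beta=1.0):
    """Binary sigmoid with slope beta; outputs lie in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-beta * a))


def bipolar_sigmoid(a, beta=1.0):
    """Bipolar sigmoid with slope beta; outputs lie in (-1, 1)."""
    e = math.exp(-beta * a)
    return (1.0 - e) / (1.0 + e)


for a in (-2.0, 0.0, 2.0):
    # The bipolar form should agree with tanh(beta * a / 2) for beta = 1.
    print(a, sigmoid(a), bipolar_sigmoid(a), math.tanh(a / 2.0))
```

The bipolar function is also a rescaled binary sigmoid, `2 * sigmoid(a, beta) - 1`, which is why the two share the same shape.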
7 Feed-Forward Neural Networks (FFNN)
Neurons are arranged in layers.
Each unit receives input only from units in the preceding layer.
Weights on the links can be adapted using training data and a learning algorithm.
[Figure: input layer (non-processing units), hidden layers, output layer]
8 Feed-Forward Neural Networks (FFNN)
The most commonly used networks for pattern classification tasks.
Expressive while efficient.
Data propagates from input to output: from the input nodes, through the hidden nodes (if any), to the output nodes.
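9 Single Layer FFNNs
A single-layer network can be used as a linear decision boundary: $f(\mathbf{w}^T \mathbf{x})$ shows the class of $\mathbf{x}$, where $\mathbf{x}$ includes a constant input $x_0 = 1$ whose weight $w_0$ is the bias.
Types of single-layer networks:
Hebb (Hebb, 1949)
Perceptron (Rosenblatt, 1962)
ADALINE (Widrow and Hoff, 1960)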
10 Training of Single Layer FFNNs
First initialize $\mathbf{w}$, then iteratively update it: $\mathbf{w}^{t+1} = \mathbf{w}^{t} + \Delta\mathbf{w}^{t}$.
The learning process goes through all training examples (an epoch) a number of times, until a stopping criterion is reached.
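11 Training of Single Layer FFNNs
Weight update for a training pair $(\mathbf{x}^{(n)}, y^{(n)})$ with learning rate $\eta$:
Hebb: $\Delta\mathbf{w} = \eta\, y^{(n)} \mathbf{x}^{(n)}$
Perceptron: if $\operatorname{sign}(\mathbf{w}^T \mathbf{x}^{(n)}) \ne y^{(n)}$ then $\Delta\mathbf{w} = \eta\, y^{(n)} \mathbf{x}^{(n)}$, else $\Delta\mathbf{w} = \mathbf{0}$
ADALINE: $\Delta\mathbf{w} = \eta\, (y^{(n)} - \mathbf{w}^T \mathbf{x}^{(n)})\, \mathbf{x}^{(n)}$ (Widrow-Hoff, LMS, or delta rule), i.e. $\Delta\mathbf{w} = -\eta \nabla J^{(n)}(\mathbf{w})$ with $J^{(n)}(\mathbf{w}) = \frac{1}{2} (y^{(n)} - \mathbf{w}^T \mathbf{x}^{(n)})^2$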
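The perceptron and delta-rule updates can be sketched as below. The labels are assumed to lie in {-1, +1}, a constant 1 is prepended to each input as the bias input, and the AND dataset, the learning rate, and the epoch count are illustrative choices rather than values from the slides.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(a):
    # Assumption: ties (a == 0) are assigned to the positive class.
    return 1 if a >= 0 else -1


def perceptron_update(w, x, y, eta):
    """Change w only when the example (x, y) is misclassified."""
    if sign(dot(w, x)) != y:
        return [wi + eta * y * xi for wi, xi in zip(w, x)]
    return list(w)


def delta_rule_update(w, x, y, eta):
    """ADALINE / LMS step: move w along the error (y - w.x) times x."""
    error = y - dot(w, x)
    return [wi + eta * error * xi for wi, xi in zip(w, x)]


# Logical AND with labels in {-1, +1}; the leading 1 is the constant bias input.
data = [((1, 0, 0), -1), ((1, 0, 1), -1), ((1, 1, 0), -1), ((1, 1, 1), 1)]

w = [0.0, 0.0, 0.0]
for epoch in range(20):
    for x, y in data:
        w = perceptron_update(w, x, y, eta=1.0)

predictions = [sign(dot(w, x)) for x, _ in data]
print(w, predictions)
```

Because AND is linearly separable, the perceptron loop stops changing the weights after a few epochs; the delta rule instead shrinks the error on every example it sees, whether or not the example is already classified correctly.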
12 Perceptron vs. Delta Rule
Perceptron learning rule: guaranteed to succeed if the training examples are linearly separable.
Delta rule: guaranteed to converge to the hypothesis with the minimum squared error, provided the learning rate is sufficiently small, even when the training data contain noise or are not separable by a hyperplane. It can also be used for regression problems.
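13 Training Mode in Gradient Descent
Batch mode (gradient descent): in each iteration, the weight update depends on the entire training data: $\Delta\mathbf{w} = -\eta \sum_{n=1}^{N} \nabla J^{(n)}(\mathbf{w})$, where $J^{(n)}$ is the error on the $n$-th training example.
Incremental or sequential mode (stochastic gradient descent): iteratively update the weight vector based on one data point at a time (cycling through the data points in sequence, or selecting them at random with replacement): $\Delta\mathbf{w} = -\eta \nabla J^{(n)}(\mathbf{w})$.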
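The two training modes can be compared on a small least-squares problem. This is a sketch under stated assumptions: a one-dimensional linear model, a squared-error cost per example, noise-free data generated from y = 2x + 1, and hand-picked step sizes and iteration counts.

```python
def predict(w, x):
    return w[0] + w[1] * x


# Noise-free samples of y = 2x + 1 (illustrative data).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def batch_gradient_descent(data, eta=0.05, iterations=2000):
    """Each update uses the gradient summed over the whole training set."""
    w = [0.0, 0.0]
    for _ in range(iterations):
        g0 = sum(-(y - predict(w, x)) for x, y in data)
        g1 = sum(-(y - predict(w, x)) * x for x, y in data)
        w = [w[0] - eta * g0, w[1] - eta * g1]
    return w


def stochastic_gradient_descent(data, eta=0.05, epochs=1000):
    """Each update uses one example; examples are cycled through in sequence."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            error = y - predict(w, x)
            w = [w[0] + eta * error, w[1] + eta * error * x]
    return w


w_batch = batch_gradient_descent(data)
w_sgd = stochastic_gradient_descent(data)
print(w_batch, w_sgd)
```

Both modes recover the generating weights here because the data are consistent; with noisy data the sequential mode would keep fluctuating around the minimum unless the step size is reduced.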
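14 Multi-Layer Perceptron (MLP)
Two-layer MLP (the number of layers of adaptive weights is counted):
Input: $x_i$, $i = 0, \dots, d$, with $x_0 = 1$.
Hidden units: $z_j = h\left(\sum_{i=0}^{d} w_{ji} x_i\right)$, $j = 1, \dots, M$, with $z_0 = 1$.
Output units: $o_k = f\left(\sum_{j=0}^{M} w_{kj} z_j\right)$, $k = 1, \dots, K$.
Usually, $f$ and $h$ are sigmoid activation functions: $\sigma(a) = \frac{1}{1 + \exp(-a)}$.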
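A forward pass through a two-layer MLP can be sketched as follows, assuming sigmoid activations in both layers and weight rows that store the bias weight first. The function name `mlp_forward` and the example weights are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def mlp_forward(x, w_hidden, w_output):
    """Forward pass of a two-layer MLP.

    x        : list of d inputs
    w_hidden : M rows of d + 1 weights (bias weight first)
    w_output : K rows of M + 1 weights (bias weight first)
    Returns the hidden activations z and the outputs o.
    """
    x_ext = [1.0] + list(x)  # prepend the constant bias input
    z = [sigmoid(sum(w * xi for w, xi in zip(row, x_ext))) for row in w_hidden]
    z_ext = [1.0] + z  # prepend the constant bias unit of the hidden layer
    o = [sigmoid(sum(w * zj for w, zj in zip(row, z_ext))) for row in w_output]
    return z, o


# Illustrative network: d = 2 inputs, M = 3 hidden units, K = 1 output.
w_hidden = [[0.1, 0.4, -0.2], [-0.3, 0.2, 0.5], [0.0, -0.6, 0.3]]
w_output = [[0.2, -0.5, 0.7, 0.1]]
z, o = mlp_forward([1.0, 2.0], w_hidden, w_output)
print(z, o)
```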
15 Multi-Layer Perceptron (MLP)
An MLP is a generalized linear model: $o_k(\mathbf{x}) = f\left(\sum_{j} w_{kj}\, \phi_j(\mathbf{x})\right)$.
For the classification problem, $f(\cdot)$ is a nonlinear activation function.
The form of the nonlinearity (the basis functions $\phi_j$) is adapted from the training data, not fixed in advance: $\phi_j$ is defined by parameters that can also be adapted during training (e.g. $\phi_j(\mathbf{x}) = h\left(\sum_i w_{ji} x_i\right)$).
The MLP is of greatest practical use.
Hidden units enable us to express complicated nonlinear functions.
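16 XOR Problem
$x_1 \text{ XOR } x_2 = (x_1 \text{ OR } x_2) \text{ AND NOT } (x_1 \text{ AND } x_2)$
[Figure: a two-layer network that solves XOR, with one hidden unit computing OR and the other computing AND] [Duda, Hart & Stork]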
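The XOR decomposition can be realized with three threshold units, as in the sketch below. The 0/1 input encoding and the specific weights and biases are assumptions chosen for illustration; other values that implement OR, AND, and AND-NOT work equally well.

```python
def step(a):
    """Unit step activation: 1 if a >= 0, else 0."""
    return 1 if a >= 0 else 0


def xor_network(x1, x2):
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1: x1 OR x2
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2: x1 AND x2
    return step(h_or - h_and - 0.5)  # output: h_or AND NOT h_and


truth_table = {(x1, x2): xor_network(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
print(truth_table)
```

No single threshold unit can produce this truth table, because the two classes are not linearly separable in the input space; the hidden layer maps them to points that are.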
17 MLP Outputs as Discriminant Functions
Classification problem: use an MLP with $K$ output units, where $K$ is the number of classes.
We can view the network as computing $K$ discriminant functions $o_k(\mathbf{x})$ ($k = 1, \dots, K$) and classify $\mathbf{x}$ according to the largest discriminant function.
18 MLP Universal Approximator
A feed-forward network with a single hidden layer and linear outputs can approximate any continuous function on a compact domain to arbitrary accuracy:
under mild assumptions on the activation function, e.g. sigmoid activation functions (Cybenko, 1989);
when a sufficiently large (but finite) number of hidden units is used.
This result is of greater theoretical than practical interest, since constructing such a network requires the nonlinear activation functions and the weight values, which are unknown.
19 MLP with Different Number of Layers: Separability Properties
MLP with unit step activation functions; decision region found by an output unit.
Structure | Type of decision regions | Interpretation
Single layer (no hidden layer) | Half space | Region found by a hyperplane
Two layers (one hidden layer) | Polyhedral (open or closed) region | Intersection of half spaces
Three layers (two hidden layers) | Arbitrary regions | Union of polyhedra
[Figure column: example of a region for each structure]
20 MLP Training
Backpropagation: a training algorithm used to adjust the weights in MLP networks (based on the training data).
The backpropagation algorithm is based on gradient descent.
It applies to MLPs with sigmoidal activation functions in the hidden layers, which are differentiable w.r.t. the parameters.
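21 Backpropagation
Sum-of-squares error cost function: $J(\mathbf{w}) = \sum_{n=1}^{N} J^{(n)}(\mathbf{w})$, with $J^{(n)}(\mathbf{w}) = \frac{1}{2} \|\mathbf{y}^{(n)} - \mathbf{o}^{(n)}\|^2 = \frac{1}{2} \sum_{k=1}^{K} (y_k^{(n)} - o_k^{(n)})^2$.
The backpropagation algorithm uses stochastic gradient descent to find weights minimizing the above cost function. In the following, the superscript $(n)$ is dropped.
It is a computationally efficient algorithm for learning multiple layers of weights.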
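22 Backpropagation
First step: forward propagation. Feed the input vector to the network and calculate the activations of all hidden and output nodes (the $z$'s and $o$'s):
Input: $x_i$, $i = 0, \dots, d$, with $x_0 = 1$.
Hidden units: $z_j = h\left(\sum_{i=0}^{d} w_{ji} x_i\right)$, $j = 1, \dots, M$, with $z_0 = 1$.
Output units: $o_k = f\left(\sum_{j=0}^{M} w_{kj} z_j\right)$, $k = 1, \dots, K$.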
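23 Backpropagation (hidden-to-output) Weight Adaptation
With $J = \frac{1}{2} \sum_{k} (y_k - o_k)^2$, $net_k = \sum_{j} w_{kj} z_j$ and $o_k = f(net_k)$: $\frac{\partial J}{\partial w_{kj}} = \frac{\partial J}{\partial net_k} \frac{\partial net_k}{\partial w_{kj}} = -\delta_k z_j$.
$\delta_k$ is the sensitivity of output unit $k$: $\delta_k = -\frac{\partial J}{\partial net_k} = -\frac{\partial J}{\partial o_k} \frac{\partial o_k}{\partial net_k} = (y_k - o_k) f'(net_k)$.
Weight update (or learning rule) for the hidden-to-output weights: $\Delta w_{kj} = -\eta \frac{\partial J}{\partial w_{kj}} = \eta\, \delta_k z_j$.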
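24 Backpropagation (input-to-hidden) Weight Adaptation
With $net_j = \sum_i w_{ji} x_i$ and $z_j = h(net_j)$: $\frac{\partial J}{\partial w_{ji}} = \frac{\partial J}{\partial z_j} \frac{\partial z_j}{\partial net_j} \frac{\partial net_j}{\partial w_{ji}} = \frac{\partial J}{\partial z_j}\, h'(net_j)\, x_i$.
$\frac{\partial J}{\partial z_j} = \frac{\partial}{\partial z_j} \left[ \frac{1}{2} \sum_{k=1}^{K} (y_k - o_k)^2 \right] = -\sum_{k=1}^{K} (y_k - o_k) f'(net_k)\, w_{kj} = -\sum_{k=1}^{K} \delta_k w_{kj}$, where $\delta_k = (y_k - o_k) f'(net_k)$.
Sensitivity for a hidden unit: $\delta_j = -\frac{\partial J}{\partial net_j} = h'(net_j) \sum_{k=1}^{K} w_{kj} \delta_k$.
Weight update (or learning rule) for the input-to-hidden weights: $\Delta w_{ji} = -\eta \frac{\partial J}{\partial w_{ji}} = \eta\, \delta_j x_i$.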
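25 Backpropagation of Errors
Output units: $\delta_k = (y_k - o_k) f'(net_k)$.
Hidden units: $\delta_j = h'(net_j) \sum_{k=1}^{K} w_{kj} \delta_k$.
Here $net$ denotes the weighted input sum of a unit.
[Figure: the output sensitivities $\delta_k$ are propagated backward through the weights $w_{kj}$ to give the hidden sensitivities $\delta_j$]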
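The sensitivities can be checked numerically on the smallest possible network. The sketch below assumes one input, one sigmoid hidden unit, one sigmoid output, no biases, and a squared-error cost with a factor of 1/2; it compares the backpropagated gradients with central finite differences. All names and numeric values are illustrative.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def cost(w1, w2, x, y):
    """Squared error 0.5 * (y - o)^2 of the chain x -> z -> o."""
    z = sigmoid(w1 * x)
    o = sigmoid(w2 * z)
    return 0.5 * (y - o) ** 2


def backprop_gradients(w1, w2, x, y):
    z = sigmoid(w1 * x)
    o = sigmoid(w2 * z)
    delta_k = (y - o) * o * (1.0 - o)          # output sensitivity
    delta_j = z * (1.0 - z) * w2 * delta_k     # hidden sensitivity
    return -delta_j * x, -delta_k * z          # dJ/dw1, dJ/dw2


def numeric_gradients(w1, w2, x, y, eps=1e-6):
    g1 = (cost(w1 + eps, w2, x, y) - cost(w1 - eps, w2, x, y)) / (2 * eps)
    g2 = (cost(w1, w2 + eps, x, y) - cost(w1, w2 - eps, x, y)) / (2 * eps)
    return g1, g2


analytic = backprop_gradients(0.3, -0.8, 1.5, 1.0)
numeric = numeric_gradients(0.3, -0.8, 1.5, 1.0)
print(analytic, numeric)
```

The sigmoid derivative is written as `o * (1 - o)` and `z * (1 - z)`, so the backward pass reuses the activations already computed in the forward pass.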
26 Stochastic Back-propagation
initialize M, w, η, t ← 0
do
    t ← t + 1
    x ← randomly chosen pattern among the training data
    apply x to the network (forward propagate) and find the activations of all the units
    evaluate δ_k for all the output units
    backpropagate the δ's to obtain δ_j for all hidden units
    w_ji ← w_ji + η δ_j x_i
    w_kj ← w_kj + η δ_k z_j
until J(w) < θ
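A runnable sketch of stochastic back-propagation for a two-layer network with sigmoid units is given below. Several choices are assumptions made for illustration: the AND dataset, `M = 2` hidden units, the learning rate `eta`, the stopping threshold `theta`, the step limit `max_steps`, the weight initialization range, and the fixed random seed.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w_ji, w_kj):
    """Return the extended input, extended hidden activations, and outputs."""
    x_ext = [1.0] + list(x)
    z = [sigmoid(sum(w * xi for w, xi in zip(row, x_ext))) for row in w_ji]
    z_ext = [1.0] + z
    o = [sigmoid(sum(w * zj for w, zj in zip(row, z_ext))) for row in w_kj]
    return x_ext, z_ext, o


def total_cost(data, w_ji, w_kj):
    """Sum over the training set of 0.5 * squared output error."""
    return sum(
        0.5 * sum((yk - ok) ** 2 for yk, ok in zip(y, forward(x, w_ji, w_kj)[2]))
        for x, y in data
    )


def train(data, M=2, eta=0.5, theta=0.01, max_steps=20000, seed=0):
    rng = random.Random(seed)
    d, K = len(data[0][0]), len(data[0][1])
    w_ji = [[rng.uniform(-0.5, 0.5) for _ in range(d + 1)] for _ in range(M)]
    w_kj = [[rng.uniform(-0.5, 0.5) for _ in range(M + 1)] for _ in range(K)]
    initial = total_cost(data, w_ji, w_kj)
    for _ in range(max_steps):
        x, y = rng.choice(data)  # randomly chosen training pattern
        x_ext, z_ext, o = forward(x, w_ji, w_kj)
        # Sensitivities of the output units, then of the hidden units.
        delta_k = [(yk - ok) * ok * (1.0 - ok) for yk, ok in zip(y, o)]
        delta_j = [
            z_ext[j + 1] * (1.0 - z_ext[j + 1])
            * sum(w_kj[k][j + 1] * delta_k[k] for k in range(K))
            for j in range(M)
        ]
        for j in range(M):
            for i in range(d + 1):
                w_ji[j][i] += eta * delta_j[j] * x_ext[i]
        for k in range(K):
            for j in range(M + 1):
                w_kj[k][j] += eta * delta_k[k] * z_ext[j]
        if total_cost(data, w_ji, w_kj) < theta:
            break
    return w_ji, w_kj, initial, total_cost(data, w_ji, w_kj)


and_data = [((0, 0), (0,)), ((0, 1), (0,)), ((1, 0), (0,)), ((1, 1), (1,))]
w_ji, w_kj, initial_cost, final_cost = train(and_data)
print(initial_cost, final_cost)
```

The hidden sensitivities are computed before any weight is changed, so they use the same hidden-to-output weights that produced the forward pass.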
27 Cost function in BP algorithm
Nonlinear activation functions yield a non-convex cost function in general.
The error surface (cost function) depends on the training data and may have many local minima.
Networks with multiple hidden layers are more prone to getting stuck in local minima.
Other training criteria (e.g., cross entropy) can also be used instead of the least-squares cost function.
28 Backpropagation: Training Mode
Advantages of stochastic gradient descent:
useful for training neural networks on large training sets;
possibility of escaping from local minima, owing to the higher degree of randomness during training;
can approximate gradient descent arbitrarily closely if the learning rate $\eta$ is small enough.
Advantages of batch gradient descent:
a better estimate of the gradient, and thus more well-behaved convergence.
29 Backpropagation & Local minima
To find a good minimum: we usually run the training algorithm multiple times with different random initializations of the weights and select one of the resulting networks according to its performance on a validation set.
To decide when to stop training: during training, the training set is used more than once until the algorithm converges. We do not want to overtrain the network (which decreases the generalization of the model), so we stop training at a minimum of the error on the validation set.
30 Back-propagation: MLP Activation Function
The sigmoid is the most widely used activation function.
Sigmoid properties: smooth, differentiable, nonlinear, monotonic, saturating.
A hidden layer of sigmoids affords a global representation of the input.
Sigmoid derivative:
Binary, $f(a) = \frac{1}{1 + \exp(-a)}$: $f'(a) = f(a)\,(1 - f(a))$
Bipolar, $f(a) = \tanh(a/2)$: $f'(a) = \frac{1}{2}\,(1 - f(a)^2)$
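The closed-form sigmoid derivatives can be verified against finite differences. The sketch assumes unit slope, i.e. the binary sigmoid 1 / (1 + exp(-a)) and the bipolar sigmoid tanh(a / 2); the helper names are illustrative.

```python
import math


def binary_sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def bipolar_sigmoid(a):
    return math.tanh(a / 2.0)


def binary_derivative(a):
    f = binary_sigmoid(a)
    return f * (1.0 - f)


def bipolar_derivative(a):
    f = bipolar_sigmoid(a)
    return 0.5 * (1.0 - f * f)


def numeric_derivative(func, a, eps=1e-6):
    """Central finite-difference estimate of func'(a)."""
    return (func(a + eps) - func(a - eps)) / (2.0 * eps)


for a in (-2.0, 0.0, 1.5):
    print(a, binary_derivative(a), numeric_derivative(binary_sigmoid, a),
          bipolar_derivative(a), numeric_derivative(bipolar_sigmoid, a))
```

Both derivatives are expressed through the function value itself, which is what makes them cheap to evaluate during back-propagation.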
31 Stopping Criteria
Elementary stopping criterion: $J(\mathbf{w}) < \theta$.
Other stopping criteria:
Reaching a minimum (i.e. the training error fails to improve) or an acceptable level of error.
When the rate of improvement drops below a certain level.
When a certain number of epochs have passed. The number of epochs is the number of presentations of the full training set and indicates the relative amount of training.
When the error on a separate validation set reaches a minimum.
Stopping the training before gradient descent completes may help avoid overfitting.
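Stopping at the minimum of the validation error can be sketched as below. The `patience` parameter (how many epochs to wait without improvement before stopping) is an assumption added for illustration, as are the function name and the example error values.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation errors.

    Training stops once `patience` epochs have passed without a new minimum;
    the weights from `best_epoch` would be the ones to keep.
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(validation_errors) - 1


# Illustrative validation curve: it falls, then rises as the network overtrains.
errors = [0.90, 0.70, 0.50, 0.45, 0.47, 0.50, 0.56, 0.60]
best_epoch, stop_epoch = early_stopping_epoch(errors)
print(best_epoch, stop_epoch)
```

Waiting a few epochs before stopping guards against halting on a small fluctuation of the validation error rather than on its true minimum.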
32 Number of Hidden Units
Determines the expressive power of the network.
Determines the total number of weights, i.e. the number of degrees of freedom.
Select among networks with different numbers of hidden units by training them and then evaluating them on a validation set. For large networks and large training sets, this is inefficient; alternatives are:
Constructive techniques
Pruning techniques
[Figure: training error and validation error versus the number of hidden units]
33 Training of MLP using BP: Many Parameters
Many parameters to tune:
Learning rate: $\eta$
Stopping parameter: the threshold $\theta$ or the number of epochs
Number of hidden layers and number of hidden units
34 Radial-Basis Function (RBF) Network
A Radial-Basis Function (RBF) network can also be considered a two-layer feed-forward NN.
Hidden layer: the input is mapped onto each RBF in the hidden layer. An RBF is a function with a built-in distance criterion with respect to a center; it is commonly taken to be Gaussian.
Output layer: in regression problems, a linear combination of the hidden-layer values; in classification problems, typically a sigmoid function of a linear combination of the hidden-layer values.
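35 RBF Network
Radial Basis Function (RBF) network: $o(\mathbf{x}) = w_0 + \sum_{j=1}^{M} w_j K(\mathbf{x}, \mathbf{c}_j)$
Transform the data into an $M$-dimensional space by representing the instances through a number of prototypes $\mathbf{c}_1, \dots, \mathbf{c}_M$: $\phi_j(\mathbf{x}) = K(\mathbf{x}, \mathbf{c}_j)$.
Then use a linear (i.e. single-layer) model to find the output.
This can be easily generalized to more than one output unit.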
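36 RBF Network
RBF kernel: $K(\mathbf{x}, \mathbf{c}) = \exp\left(-\frac{\|\mathbf{x} - \mathbf{c}\|^2}{2\sigma^2}\right)$
XOR problem example: $\mathbf{c}_1 = (0, 0)$, $\mathbf{c}_2 = (1, 1)$, $\sigma_1 = \sigma_2 = 1$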
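The XOR example can be worked through numerically. This sketch assumes a Gaussian kernel of the form exp(-||x - c||^2 / (2 sigma^2)) with sigma = 1 and centers at (0, 0) and (1, 1); the decision threshold 1.29 is a hand-picked value for illustration.

```python
import math


def gaussian_rbf(x, c, sigma=1.0):
    """Gaussian RBF of the distance between the point x and the center c."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


centers = [(0, 0), (1, 1)]
xor_data = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Map every input to the 2-dimensional space of RBF responses.
features = {x: tuple(gaussian_rbf(x, c) for c in centers) for x in xor_data}

# In the feature space a single line separates the classes:
# class 1 when the two responses sum to less than the threshold.
threshold = 1.29
predictions = {x: int(sum(phi) < threshold) for x, phi in features.items()}
print(features)
print(predictions)
```

The inputs (0, 1) and (1, 0) are mapped to the same feature point, and that point lies on the opposite side of the line from the images of (0, 0) and (1, 1), so a linear output layer suffices.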
38 RBF Network Training
RBF networks are trained by:
deciding on the number of hidden units;
deciding on their centers and the sharpnesses (standard deviations) of their Gaussians;
training the output layer.
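39 Selecting the centers
The first idea is to set a center on each training data point.
In practice, we usually use far fewer centers than training points ($M \ll N$), which gives:
higher generalization capability of the model (avoiding overfitting);
reduced computational complexity.
Selecting center locations:
Fixing the centers before weight adaptation, e.g. randomly selected training points, or centers found by a clustering algorithm.
Training the centers and standard deviations along with the weights by gradient descent on the cost $J(\mathbf{w}, \mathbf{c}, \boldsymbol{\sigma})$:
$\Delta w_j = -\eta \frac{\partial J}{\partial w_j}$, $j = 0, \dots, M$
$\Delta \mathbf{c}_j = -\eta \frac{\partial J}{\partial \mathbf{c}_j}$, $j = 1, \dots, M$
$\Delta \sigma_j = -\eta \frac{\partial J}{\partial \sigma_j}$, $j = 1, \dots, M$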
40 Kernel SVM vs. RBF
A kernel SVM automatically computes all the unknown parameters, including the number of centers.
In the SVM approach, the number of nodes and the centers are found by solving the optimization problem: $f(\mathbf{x}) = \operatorname{sign}\left(w_0 + \sum_{n \in SV} \alpha_n y^{(n)} K(\mathbf{x}, \mathbf{x}^{(n)})\right)$
Only the support vectors serve as centers; the number of hidden nodes equals the number of SVs.
41 RBF vs. MLP
The activation responses are of a local nature in RBF networks and of a global nature in MLP networks.
MLPs exhibit improved generalization properties, especially for regions that are not sufficiently represented in the training set.
In RBF networks, a large number of centers is required to fill the space (hence an exponential dependence on the input dimension: the curse of dimensionality).
RBF networks do not suffer from local minima (when the RBFs are fixed ahead of time): the only parameters adjusted in the learning process are those of the linear mapping from the hidden layer to the output layer. Linearity ensures that the error surface is quadratic and has a single minimum. The weights can be adapted with the delta rule or the perceptron rule; MLPs learn more slowly.
However, RBF networks (when the RBFs are fixed ahead of time) show less flexibility.
42 Neural Network Models: Properties
Properties:
ability to learn complex nonlinear input-output relationships;
sequential training procedures;
adapting the network parameters to the data.
Neural network models and statistical models:
Most neural network models are similar or equivalent to classical pattern recognition models [Jain et al. 2000].
"Statistics for amateurs": they conceal the statistics from the user [Anderson].
43 Other Neural Networks for Pattern Recognition
Self-Organizing Map (SOM) or Kohonen network: clustering and dimensionality reduction.
Auto-associative neural networks: can be used for dimensionality reduction (feature extraction).
Recurrent Neural Networks (RNNs): also propagate data from later processing stages to earlier stages; they are general sequence processors.
State-of-the-art networks: deep learning (Hinton, 2006) uses restricted Boltzmann machines (RBMs) to obtain an efficient learning procedure for deep models, which are effective feature extractors.
44 Neural Networks (MLP): Summary
Advantages:
can be used for huge data sets;
learn a feature extractor and a classifier simultaneously;
do not make any assumption regarding the underlying probability density;
resistant to outliers.
They should not be used when traditional methods are appropriate.
Disadvantages:
many parameters to be set;
usually a slow training process (many epochs are required), and very slow in networks with multiple hidden layers;
will find a local, not necessarily global, minimum of the error function;
the models may be very hard to interpret.
More informationLECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS
LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Neural Networks Classifier Introduction INPUT: classification data, i.e. it contains an classification (class) attribute. WE also say that the class
More informationAkarsh Pokkunuru EECS Department Contractive Auto-Encoders: Explicit Invariance During Feature Extraction
Akarsh Pokkunuru EECS Department 03-16-2017 Contractive Auto-Encoders: Explicit Invariance During Feature Extraction 1 AGENDA Introduction to Auto-encoders Types of Auto-encoders Analysis of different
More informationYuki Osada Andrew Cannon
Yuki Osada Andrew Cannon 1 Humans are an intelligent species One feature is the ability to learn The ability to learn comes down to the brain The brain learns from experience Research shows that the brain
More informationNotes on Multilayer, Feedforward Neural Networks
Notes on Multilayer, Feedforward Neural Networks CS425/528: Machine Learning Fall 2012 Prepared by: Lynne E. Parker [Material in these notes was gleaned from various sources, including E. Alpaydin s book
More informationCS 229 Midterm Review
CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationThe exam is closed book, closed notes except your one-page cheat sheet.
CS 189 Fall 2015 Introduction to Machine Learning Final Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right
More informationLearning via Optimization
Lecture 7 1 Outline 1. Optimization Convexity 2. Linear regression in depth Locally weighted linear regression 3. Brief dips Logistic Regression [Stochastic] gradient ascent/descent Support Vector Machines
More informationBasis Functions. Volker Tresp Summer 2017
Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)
More informationDeep Learning for Computer Vision II
IIIT Hyderabad Deep Learning for Computer Vision II C. V. Jawahar Paradigm Shift Feature Extraction (SIFT, HoG, ) Part Models / Encoding Classifier Sparrow Feature Learning Classifier Sparrow L 1 L 2 L
More informationAlex Waibel
Alex Waibel 815.11.2011 1 16.11.2011 Organisation Literatur: Introduction to The Theory of Neural Computation Hertz, Krogh, Palmer, Santa Fe Institute Neural Network Architectures An Introduction, Judith
More informationNeural Networks: What can a network represent. Deep Learning, Fall 2018
Neural Networks: What can a network represent Deep Learning, Fall 2018 1 Recap : Neural networks have taken over AI Tasks that are made possible by NNs, aka deep learning 2 Recap : NNets and the brain
More informationPerceptrons and Backpropagation. Fabio Zachert Cognitive Modelling WiSe 2014/15
Perceptrons and Backpropagation Fabio Zachert Cognitive Modelling WiSe 2014/15 Content History Mathematical View of Perceptrons Network Structures Gradient Descent Backpropagation (Single-Layer-, Multilayer-Networks)
More informationMultivariate Data Analysis and Machine Learning in High Energy Physics (V)
Multivariate Data Analysis and Machine Learning in High Energy Physics (V) Helge Voss (MPI K, Heidelberg) Graduierten-Kolleg, Freiburg, 11.5-15.5, 2009 Outline last lecture Rule Fitting Support Vector
More informationCOMP9444 Neural Networks and Deep Learning 5. Geometry of Hidden Units
COMP9 8s Geometry of Hidden Units COMP9 Neural Networks and Deep Learning 5. Geometry of Hidden Units Outline Geometry of Hidden Unit Activations Limitations of -layer networks Alternative transfer functions
More informationCharacter Recognition Using Convolutional Neural Networks
Character Recognition Using Convolutional Neural Networks David Bouchain Seminar Statistical Learning Theory University of Ulm, Germany Institute for Neural Information Processing Winter 2006/2007 Abstract
More informationTraffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers
Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane
More information.. Spring 2017 CSC 566 Advanced Data Mining Alexander Dekhtyar..
.. Spring 2017 CSC 566 Advanced Data Mining Alexander Dekhtyar.. Machine Learning: Support Vector Machines: Linear Kernel Support Vector Machines Extending Perceptron Classifiers. There are two ways to
More informationIntroduction to Neural Networks
Introduction to Neural Networks What are connectionist neural networks? Connectionism refers to a computer modeling approach to computation that is loosely based upon the architecture of the brain Many
More information11/14/2010 Intelligent Systems and Soft Computing 1
Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in
More informationA Dendrogram. Bioinformatics (Lec 17)
A Dendrogram 3/15/05 1 Hierarchical Clustering [Johnson, SC, 1967] Given n points in R d, compute the distance between every pair of points While (not done) Pick closest pair of points s i and s j and
More informationLogical Rhythm - Class 3. August 27, 2018
Logical Rhythm - Class 3 August 27, 2018 In this Class Neural Networks (Intro To Deep Learning) Decision Trees Ensemble Methods(Random Forest) Hyperparameter Optimisation and Bias Variance Tradeoff Biological
More informationIMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS
IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS BOGDAN M.WILAMOWSKI University of Wyoming RICHARD C. JAEGER Auburn University ABSTRACT: It is shown that by introducing special
More informationInstance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015
Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2015 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows K-Nearest
More informationNeural Nets. General Model Building
Neural Nets To give you an idea of how new this material is, let s do a little history lesson. The origins of neural nets are typically dated back to the early 1940 s and work by two physiologists, McCulloch
More informationPerceptron as a graph
Neural Networks Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University October 10 th, 2007 2005-2007 Carlos Guestrin 1 Perceptron as a graph 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0-6 -4-2
More informationLecture 17: Neural Networks and Deep Learning. Instructor: Saravanan Thirumuruganathan
Lecture 17: Neural Networks and Deep Learning Instructor: Saravanan Thirumuruganathan Outline Perceptron Neural Networks Deep Learning Convolutional Neural Networks Recurrent Neural Networks Auto Encoders
More informationIMPLEMENTING DEEP LEARNING USING CUDNN 이예하 VUNO INC.
IMPLEMENTING DEEP LEARNING USING CUDNN 이예하 VUNO INC. CONTENTS Deep Learning Review Implementation on GPU using cudnn Optimization Issues Introduction to VUNO-Net DEEP LEARNING REVIEW BRIEF HISTORY OF NEURAL
More information