Deep Belief Network for Clustering and Classification of a Continuous Data

Mostafa A. Salama (1), Aboul Ella Hassanien (2), Aly A. Fahmy (2)
(1) Department of Computer Science, British University in Egypt, Cairo, Egypt, Mostafa.salama@gmail.com
(2) Cairo University, Faculty of Computers and Information, aboitcairo.aly.fahmy@gmail.com

Abstract - A Deep Belief Network (DBN) is a deep architecture that consists of a stack of Restricted Boltzmann Machines (RBMs). The deep architecture has the benefit that each layer learns more complex features than the layers before it. A DBN or an RBM can be used as a feature extraction method, or as a neural network with initially learned weights. The proposed approach relies on the DBN for clustering and classification of continuous input data without using back-propagation in the DBN architecture. A DBN should perform better than a traditional neural network because its connecting weights are initialized by layer-wise pre-training rather than set to random values. Each RBM layer in the DBN relies on the Contrastive Divergence method for input reconstruction, which increases the performance of the network.

1. Introduction

Kernel machines such as Support Vector Machines are local kernel-based approaches, while non-local learning algorithms have the potential to generalize to regions not covered by the training set. Kernel machines are also shallow architectures, with only two levels of data-dependent computational elements; this is also true of feed-forward neural networks with a single hidden layer [1]. Recently, deep architectures trained in an unsupervised manner have been proposed as an automatic method for extracting useful features. Deep architectures consist of feature detector units arranged in layers: lower layers detect simple features and feed into higher layers, which in turn detect more complex features. Hinton et al. proposed a greedy layer-wise unsupervised learning procedure relying on the training algorithm of Restricted Boltzmann Machines (RBMs) to initialize the parameters of a Deep Belief Network (DBN), a generative model with many layers of hidden causal variables. In a DBN the bottom layer is observable, and the multiple hidden layers are created by stacking RBMs on top of each other. An RBM is a generative model that uses a layer of binary variables to explain its input data [2]. The top RBM has two layers with symmetric undirected connections; in certain cases this RBM is a Harmonium RBM with continuous Gaussian hidden nodes. The training is unsupervised, but it produces useful features which can later be tuned by back-propagation for classification or dimensionality reduction. Three aspects of this strategy are particularly important: 1) pre-training one layer at a time in a greedy way; 2) using unsupervised learning at each layer in order to preserve information from the input; 3) fine-tuning the whole network with respect to the ultimate criterion of interest [1]. There is a need to adapt the unsupervised learning algorithm to the nature of the inputs [3]. The proposed approach handles continuous-valued inputs by scaling them to the [0, 1] interval. Clustering and classification are then applied to the well-known Iris data (a continuous dataset) using the DBN architecture, without back-propagating the error from the last layer containing the class labels.
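As an illustration of this scaling step, a per-feature min-max rescaling to [0, 1] could look as follows (a minimal sketch; the paper does not specify the exact formula, so the implementation below is an assumption):

```python
import numpy as np

def scale_to_unit_interval(X):
    """Rescale every feature (column) of X to the [0, 1] interval.

    A minimal sketch of the preprocessing described in the text; the
    paper only states that continuous inputs are scaled to [0, 1].
    """
    X = np.asarray(X, dtype=float)
    mins = X.min(axis=0)
    ranges = X.max(axis=0) - mins
    ranges[ranges == 0] = 1.0   # constant features: avoid division by zero
    return (X - mins) / ranges
```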
The structure of this paper is as follows: Section 2 gives the background on the RBM and the DBN architecture, Section 3 describes the proposed DBN approach for clustering and classification, and Section 4 presents the clustering results on continuous datasets and the classification results on the Iris dataset.

2. Deep Belief Network

2.1. Restricted Boltzmann Machine

An RBM is an energy-based undirected generative model that uses a layer of hidden variables to model a distribution over visible variables [4]. The undirected model of the interactions between the hidden and visible variables ensures that the contribution of the likelihood term to the posterior over the hidden variables is approximately factorial, which greatly facilitates inference [5]. Energy-based means that the probability distribution over the variables of interest is defined through an energy function. The model is composed of a set of visible variables V = {v_i} and a set of hidden variables H = {h_j}, where i indexes nodes in the visible layer and j indexes nodes in the hidden layer. It is restricted in the sense that there are no visible-visible or hidden-hidden connections.
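Before the learning algorithm is detailed, a small NumPy sketch can make this bipartite structure concrete. It evaluates the energy function and the factorial conditionals that appear in the equations below; matrix shapes and function names are our assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def energy(v, h, W, b, c):
    """Energy of a joint configuration (v, h) of a binary RBM,
    E(v, h) = -h' W v - b' v - c' h, with W of shape (n_hidden, n_visible)."""
    return -(h @ W @ v) - (b @ v) - (c @ h)

def p_hidden_given_visible(v, W, c):
    """P(h_j = 1 | v) = sigmoid(c_j + sum_i w_ij v_i); factorial over j
    because there are no hidden-hidden connections."""
    return sigmoid(c + W @ v)

def p_visible_given_hidden(h, W, b):
    """P(v_i = 1 | h) = sigmoid(b_i + sum_j w_ij h_j); factorial over i
    because there are no visible-visible connections."""
    return sigmoid(b + W.T @ h)
```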

The steps of the RBM learning algorithm can be stated as follows. Because there are no connections between nodes of the same layer, the units of one layer are conditionally independent given the other layer, and the conditional distributions factorize:

P(H | V) = \prod_j P(h_j | v), with P(h_j = 1 | v) = \sigma(c_j + \sum_i w_{ij} v_i) and P(h_j = 0 | v) = 1 - P(h_j = 1 | v)    (1)

P(V | H) = \prod_i P(v_i | h), with P(v_i = 1 | h) = \sigma(b_i + \sum_j w_{ij} h_j) and P(v_i = 0 | h) = 1 - P(v_i = 1 | h)    (2)

for a binary data vector, where \sigma is the logistic sigmoid, \sigma(z) = 1 / (1 + e^{-z}), b_i are the visible biases and c_j are the hidden biases. The joint distribution (likelihood) over visible and hidden units is defined through the energy function:

P(v, h) = e^{-E(v, h)} / \sum_{v', h'} e^{-E(v', h')}, with E(v, h) = -h^T W v - b^T v - c^T h    (3)

where h^T denotes the transpose of h. Differentiating the average log-likelihood with respect to the parameters gives the exact update rules

\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model})    (4)

\Delta b_i = \epsilon (\langle v_i \rangle_{data} - \langle v_i \rangle_{model})    (5)

\Delta c_j = \epsilon (\langle h_j \rangle_{data} - \langle h_j \rangle_{model})    (6)

where \epsilon is a small learning rate. The \langle \cdot \rangle_{model} expectations take exponential time to compute exactly, so the Contrastive Divergence (CD) approximation to the gradient is used instead [6]. Contrastive Divergence runs the Gibbs sampler for a single iteration instead of until the chain converges; \langle \cdot \rangle_1 then denotes the expectation with respect to samples obtained by running the Gibbs sampler, initialized at the data, for one full step, and the update rules become

\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_1)    (7)

\Delta b_i = \epsilon (\langle v_i \rangle_{data} - \langle v_i \rangle_1)    (8)

\Delta c_j = \epsilon (\langle h_j \rangle_{data} - \langle h_j \rangle_1)    (9)

The Harmonium RBM is an RBM with continuous Gaussian hidden nodes [6], in which the conditional distribution of a hidden unit is the Normal distribution

p(h_j = h | v) = N(c_j + w_j \cdot v, 1)    (10)

where w_j is the j-th row of W. The Harmonium RBM is used for a discrete output in the last layer of a deep belief network for classification.
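A minimal NumPy sketch of one CD-1 update, corresponding to Equations (1), (2) and (7)-(9) above; the learning rate, shapes and sampling details are assumptions rather than settings taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, b, c, epsilon=0.1):
    """One Contrastive Divergence (CD-1) step for a binary RBM.

    v0: (n_visible,) data vector; W: (n_hidden, n_visible) weights;
    b: visible biases; c: hidden biases. Returns the updated parameters.
    """
    # Positive phase: hidden probabilities and a sample given the data (Eq. (1)).
    p_h0 = sigmoid(c + W @ v0)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # One Gibbs step: reconstruct the visible layer (Eq. (2)), then
    # recompute the hidden probabilities from the reconstruction.
    p_v1 = sigmoid(b + W.T @ h0)
    p_h1 = sigmoid(c + W @ p_v1)

    # CD-1 approximation to the log-likelihood gradient (Eqs. (7)-(9)).
    W = W + epsilon * (np.outer(p_h0, v0) - np.outer(p_h1, p_v1))
    b = b + epsilon * (v0 - p_v1)
    c = c + epsilon * (p_h0 - p_h1)
    return W, b, c
```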
2.2. Deep Belief Network Architecture

The key idea behind training a deep belief network by training a sequence of RBMs is that the model parameters \theta learned by an RBM define both p(v | h, \theta) and the prior distribution over hidden vectors p(h | \theta), so the probability of generating a visible vector v can be written as

p(v) = \sum_h p(h | \theta) p(v | h, \theta)    (11)

After learning \theta, p(v | h, \theta) is kept while p(h | \theta) is replaced by a better model, learned by treating the hidden activity vectors H = {h} as the training data (visible layer) of another RBM. This replacement improves a variational lower bound on the probability of the training data under the composite model. The study in [12] reports three observations: 1) once the number of hidden units in the top level crosses a threshold, the performance essentially flattens at a certain accuracy; 2) the performance tends to decrease as the number of layers increases; 3) the performance increases as each RBM is trained for a larger number of iterations. When class labels and back-propagation are not used in the DBN architecture (unsupervised training) [7], the DBN can serve as a feature extraction method for dimensionality reduction. When class labels are associated with the feature vectors, the DBN is used for classification. There are two general types of DBN classifier architectures: the Back-Propagation DBN (BP-DBN) and the Associative Memory DBN (AM-DBN) [8]. For both architectures, when the number of possible classes is very large and the distribution of class frequencies is far from uniform, it may sometimes be advantageous to use a different encoding of the class targets than the standard one-of-k softmax encoding.

Back-Propagation DBN. A final layer of variables representing the desired outputs (k outputs) is added, and a purely discriminative fine-tuning phase is then performed using back-propagation. Using back-propagation to fine-tune feature detectors that were initially learned as a generative model works much better than using back-propagation with random initial weights, as in a traditional neural network.

Associative Memory DBN. The top-level RBM is trained on data obtained by concatenating the high-level representation produced by unsupervised learning with a binary label vector that contains a 1 in the location representing the correct class; in other words, the top RBM models the joint distribution of the inputs and the associated target classes. When the top layer of weights (the ones in the associative memory) is trained, the labels are provided as part of the input, represented by turning on one unit in a "softmax" group of k visible units. Softmax converts an arbitrary real-valued vector into a multinomial probability vector; it is a generalization of the sigmoid function to k outcomes.
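As a small, generic illustration of the one-of-k target encoding and of the softmax function described above (not the authors' code; the names are ours):

```python
import numpy as np

def one_of_k(label, k):
    """Binary label vector with a single 1 in the position of the correct class."""
    target = np.zeros(k)
    target[label] = 1.0
    return target

def softmax(z):
    """Turn an arbitrary real-valued vector into a multinomial probability
    vector; a generalization of the sigmoid function to k outcomes."""
    z = z - z.max()          # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Example: the target for class 2 out of k = 3 classes, and a softmax output.
print(one_of_k(2, 3))                       # [0. 0. 1.]
print(softmax(np.array([1.0, 2.0, 3.0])))   # sums to 1
```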

3. Clustering and Classification using DBN

The target of this study is to use an undirected DBN for the classification of continuous datasets such as the Iris and Abalone datasets. RBMs were originally developed using binary stochastic units for both the visible and hidden layers. The available results on continuous-valued data and neurons indicate that training is much slower than with binary inputs; given that training on binary inputs is itself a fairly slow process, training directly on continuous inputs would be infeasible. Previous work on continuous-valued input to RBMs includes adding noise to sigmoid units [9]. In this work the input is scaled to the [0, 1] interval before it is clustered with the DBN. The DBN consists of three RBM layers: the first RBM takes the scaled input as its visible layer, its hidden layer becomes the visible layer of the second RBM, and the hidden layer of the final RBM, which consists of a single unit, is the output of the DBN. The first RBM is trained by running Gibbs sampling for 1000 iterations; its output is then passed to the second RBM, which is trained with another 1000 Gibbs iterations. The architecture of the DBN network is shown in Figure 1.

[Figure 1: The architecture of the used DBN network (a DBN of three RBM layers).]

The steps of the DBN classification approach can be summarized as follows: the first RBM receives the input at its visible nodes and models it on its hidden layer; the modelled input on the hidden layer is then passed to the visible nodes of the second RBM. This modelling and passing continues up to the last layer, which is composed of a single node. Feature selection is an optional step that depends on the data itself. The complete procedure is given below (a Python sketch of the same pipeline follows the pseudocode).

DBN Classifier
    Initialize epsilon = 0.1                      // learning rate
    Initialize gn = 1000                          // number of Gibbs iterations
    Read the input and scale it to the range [0, 1] into a two-dimensional array v[NI][NF]   // NI, NF: number of input objects and of features
    Select the most discriminative features (optional)
    Initialize n = 3                              // number of RBMs
    Initialize the number of hidden units of each RBM
    Initialize W1, W2, W3 randomly                // weights of the DBN network
    Define W'1, W'2, W'3                          // the trained weights resulting from the DBN
    Call DBN_train(n, v, gn)
    Cluster the DBN output and assign a class label to each cluster according to the input objects
    Run the objects of the testing dataset through the trained DBN; the output of the DBN determines each object's class according to the cluster ranges

DBN_train(n, v, gn)                               // the result of training is a one-dimensional array of length NI
    W'1, b1, c1 = RBM_Alg(v, epsilon, W1, b1, c1, n/2, gn)
    for all objects k, hidden units i: v1[k][i] = P(v1[k][i] = 1 | v[k]) = sigm(c1[i] + sum_j(W'1[i][j] * v[k][j]))
    W'2, b2, c2 = RBM_Alg(v1, epsilon, W2, b2, c2, n/4, gn)
    for all objects k, hidden units i: v2[k][i] = P(v2[k][i] = 1 | v1[k]) = sigm(c2[i] + sum_j(W'2[i][j] * v1[k][j]))
    W'3, b3, c3 = RBM_Alg(v2, epsilon, W3, b3, c3, 1, gn)
    for all objects k, the single hidden unit: v3[k][0] = P(v3[k][0] = 1 | v2[k]) = sigm(c3[0] + sum_j(W'3[0][j] * v2[k][j]))
    return v3                                     // the output of the DBN network

RBM_Alg(v, epsilon, W, b, c, l, gn)               // l is the number of hidden units
    repeat gn times (for every object k):
        for all hidden units i:  h[i] = P(h[i] = 1 | v[k]) = sigm(c[i] + sum_j(W[i][j] * v[k][j]))
        for all visible units j: v'[k][j] = P(v'[k][j] = 1 | h) = sigm(b[j] + sum_i(W[i][j] * h[i]))
        for all hidden units i:  h'[i] = P(h'[i] = 1 | v'[k]) = sigm(c[i] + sum_j(W[i][j] * v'[k][j]))
        W += epsilon * (h * v[k] - h' * v'[k])    // outer products, Eq. (7)
        b += epsilon * (v[k] - v'[k])             // Eq. (8)
        c += epsilon * (h - h')                   // Eq. (9)
    return W, b, c

DBN classification algorithm.
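The following compact NumPy sketch mirrors the pipeline above: greedy, unsupervised training of three stacked RBMs with CD-1 and a single output unit. The layer sizes, learning rate and all function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(V, n_hidden, epsilon=0.1, gn=1000):
    """Train one binary RBM on the rows of V with CD-1; returns (W, b, c)."""
    n_visible = V.shape[1]
    W = 0.01 * rng.standard_normal((n_hidden, n_visible))
    b = np.zeros(n_visible)              # visible biases
    c = np.zeros(n_hidden)               # hidden biases
    for _ in range(gn):
        for v0 in V:
            p_h0 = sigmoid(c + W @ v0)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            p_v1 = sigmoid(b + W.T @ h0)          # reconstruction
            p_h1 = sigmoid(c + W @ p_v1)
            W += epsilon * (np.outer(p_h0, v0) - np.outer(p_h1, p_v1))
            b += epsilon * (v0 - p_v1)
            c += epsilon * (p_h0 - p_h1)
    return W, b, c

def dbn_train(V, hidden_sizes=(2, 2, 1), epsilon=0.1, gn=1000):
    """Greedy layer-wise training of a stack of RBMs.

    The hidden activations of each trained RBM become the visible data of
    the next one; the last RBM has a single hidden unit whose activation
    is the one-dimensional DBN output used for clustering."""
    layers, X = [], V
    for n_hidden in hidden_sizes:
        W, b, c = train_rbm(X, n_hidden, epsilon, gn)
        layers.append((W, c))
        X = sigmoid(c + X @ W.T)          # propagate to the next layer
    return layers, X[:, 0]

def dbn_output(layers, X):
    """Propagate new (scaled) objects through the trained stack."""
    for W, c in layers:
        X = sigmoid(c + X @ W.T)
    return X[:, 0]
```

The one-dimensional output can then be split into intervals (clusters), each interval assigned the majority class of the training objects that fall inside it, and a test object classified by the interval into which its DBN output falls, which is how the clusters are turned into class predictions in Section 4.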

4. Experimental Results and Discussion

The DBN classification approach has been applied to the well-known Iris dataset [10] of 148 objects and 3 classes, using 90% of the objects for training and 10% for testing. The output of the DBN network falls into three distinct intervals (clusters): cluster 1 covers [0, 0.264] and cluster 3 covers [0.782, 1], with cluster 2 lying between them. From the class labels of the objects, classes 2, 1 and 0 correspond to clusters 1, 2 and 3, respectively. The remaining 10% of the dataset is then tested by passing each object through the trained DBN and finding the interval in which its output lies; that interval determines the predicted class, which is compared with the class label associated with the object. The resulting classification accuracy on the 10% test split is 93.3%. The clustering result is shown in Figure 2.

[Figure 2: DBN output of the single unit in the last hidden layer for the Iris dataset.]

Classification of the Abalone dataset. The Abalone dataset has also been tested, as shown in Figure 3. Its output divides into two intervals, each containing two classes: the first interval, [0.16, 0.44], contains the objects of classes 8 and 9, and the second interval, [0.5, 1], contains the objects of classes 6 and 7. A prior feature-selection step, keeping 4 of the 7 features, was applied to this dataset and proved very effective: as seen in Figure 4, before feature selection all four classes are concentrated in the same interval [0.2, 0.4].

[Figure 3: DBN output of the single unit in the last hidden layer for the Abalone dataset, after feature selection.]

[Figure 4: DBN output of the single unit in the last hidden layer for the Abalone dataset, before feature selection.]

The classification performance does not change when the number of RBM layers or the number of Gibbs iterations (1000) is increased. Table 1 compares the accuracy of the proposed classifier (with 80% and 90% training splits) with that of the classifiers available in the Weka software [11] (BN, SVM, MLP and DT) on the Iris dataset.

Table 1: Comparison of accuracy with Weka classifiers
Dataset name   DBN (80% training)   DBN (90% training)   BN   SVM   MLP   DT
Iris

5. Conclusion

The undirected deep architecture has provided good results in dimensionality reduction, which supports the idea of using DBNs for learning in artificial intelligence. For classification, two types of supervised training have been proposed for the DBN structure, BP-DBN and AM-DBN. A DBN used without class labels, i.e. with unsupervised learning, can lead to dimensionality reduction. In this study we have shown that the unsupervised learning of a DBN can also lead to clustering of the data and, subsequently, to its classification. The approach has been tested on two different, well-known datasets, Iris and Abalone. The continuous nature of these datasets posed an additional challenge, which was handled by scaling them to the interval between 0 and 1.

References

[1] Y. Bengio, P. Lamblin, P. Popovici and H. Larochelle, "Greedy Layer-Wise Training of Deep Networks", Advances in Neural Information Processing Systems 19, MIT Press, Cambridge, MA.
[2] G. E. Hinton, "A fast learning algorithm for deep belief nets", Neural Computation, July 2006, vol. 18(7).
[3] H. Larochelle, Y. Bengio, J. Louradour and P. Lamblin, "Exploring Strategies for Training Deep Neural Networks", Journal of Machine Learning Research, 2009, vol. 10.
[4] H. Larochelle and Y. Bengio, "Classification using discriminative restricted Boltzmann machines", Proceedings of the 25th International Conference on Machine Learning, 2008, vol. 307.
[5] I. Sutskever and G. E. Hinton, "Learning multilevel distributed representations for high-dimensional sequences", Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007.
[6] A. K. Noulas and B. J. A. Krose, "Deep Belief Networks for Dimensionality Reduction", Belgian-Dutch Conference on Artificial Intelligence, The Netherlands, 2008.
[7] I. Goodfellow, Q. Le, A. Saxe and A. Ng, "Measuring invariances in deep networks", Advances in Neural Information Processing Systems, 2009, vol. 22.
[8] A. R. Mohamed, G. Dahl and G. E. Hinton, "Deep belief networks for phone recognition", NIPS 22 Workshop on Deep Learning for Speech Recognition.
[9] H. Chen and A. Murray, "A continuous restricted Boltzmann machine with an implementable training algorithm", IEE Proceedings - Vision, Image and Signal Processing, 2003, vol. 150(3).
[10] UCI Machine Learning Repository.
[11] Weka: Data Mining Software in Java.
[12] L. McAfee, "Document Classification using Deep Belief Nets", CS224n, Spring.
