Deep Belief Network for Clustering and Classification of a Continuous Data
Mostafa A. Salama (1), Aboul Ella Hassanien (2), Aly A. Fahmy (2)
(1) Department of Computer Science, British University in Egypt, Cairo, Egypt, Mostafa.salama@gmail.com
(2) Faculty of Computers and Information, Cairo University, Egypt, aboitcairo.aly.fahmy@gmail.com

Abstract. A Deep Belief Network (DBN) is a deep architecture that consists of a stack of Restricted Boltzmann Machines (RBMs). The deep architecture has the benefit that each layer learns more complex features than the layers before it. A DBN or an RBM can be used as a feature extraction method, or as a neural network whose connecting weights have already been learned. The proposed approach uses a DBN for clustering and classification of continuous input data without applying back-propagation in the DBN architecture. A DBN should perform better than a traditional neural network because its connecting weights are initialized by learning rather than set at random. Each layer in the DBN (an RBM) relies on the contrastive divergence method for input reconstruction, which increases the performance of the network.

1. Introduction

Kernel machines such as Support Vector Machines are local-kernel-based approaches, whereas non-local learning algorithms have the potential to generalize to patterns not covered by the training set. Kernel machines are also shallow architectures, with only two levels of data-dependent computational elements; the same is true of feed-forward neural networks with a single hidden layer [1]. Recently, deep architectures trained in an unsupervised manner have been proposed as an automatic method for extracting useful features. Deep architectures consist of feature-detector units arranged in layers: lower layers detect simple features and feed into higher layers, which in turn detect more complex features. Hinton et al. recently proposed a greedy layer-wise unsupervised learning procedure relying on the training algorithm of restricted Boltzmann machines (RBMs) to initialize the parameters of a deep belief network (DBN), a generative model with many layers of hidden causal variables. In a DBN the bottom layer is observable, and the multiple hidden layers are created by stacking multiple RBMs on top of each other. An RBM is a generative model that uses a layer of binary variables to explain its input data [2]. The top RBM consists of two layers with symmetric undirected connections; in certain cases this top RBM is a Harmonium RBM with continuous Gaussian hidden nodes. The training is unsupervised, but it produces useful features which can later be tuned by back-propagation for classification or dimensionality reduction. Three aspects of this strategy are particularly important: (1) pre-training one layer at a time in a greedy way; (2) using unsupervised learning at each layer in order to preserve information from the input; (3) fine-tuning the whole network with respect to the ultimate criterion of interest [1]. Recently, there has been a need to adapt the unsupervised learning algorithm to the nature of the inputs [3]. The proposed approach handles continuous-valued inputs by scaling them to the [0, 1] interval. Clustering and classification are then applied to the well-known Iris data (a continuous dataset) using the DBN architecture, without back-propagating an error signal from a last layer containing the class labels.
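As a small illustration of the scaling step mentioned above, the sketch below rescales each continuous feature to the [0, 1] interval with a min-max transform. It is only a sketch of the assumed preprocessing; the function name and the use of NumPy are our own choices, not part of the original paper.

import numpy as np

def scale_to_unit_interval(X):
    # Min-max scale each column (feature) of X into [0, 1].
    # X has shape (n_objects, n_features), e.g. the four Iris measurements.
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0.0] = 1.0   # avoid division by zero for constant features
    return (X - col_min) / col_range

Applied to the Iris measurements, every attribute then lies in the same [0, 1] range that the binary-unit RBM expects.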
The structure of this paper is as follows: Section 2 presents the background on the RBM and DBN architectures, Section 3 describes the proposed DBN approach for clustering and classification, and Section 4 reports the clustering results on continuous datasets and the classification results on the Iris dataset.

2. Deep Belief Network

2.1. Restricted Boltzmann Machine

An RBM is an energy-based undirected generative model that uses a layer of hidden variables to model a distribution over visible variables [4]. The undirected model of the interactions between the hidden and visible variables ensures that the contribution of the likelihood term to the posterior over the hidden variables is approximately factorial, which greatly facilitates inference [5]. Energy-based means that the probability distribution over the variables of interest is defined through an energy function. The model is composed of a set of visible variables V = {v_i} and a set of hidden variables H = {h_j}, where i indexes nodes in the visible layer and j nodes in the hidden layer. It is restricted in the sense that there are no visible-visible or hidden-hidden connections.
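To make the structure just described concrete, the following sketch (hypothetical names, NumPy assumed; not code from the paper) holds the parameters of one RBM: a single weight matrix connecting visible and hidden units plus one bias vector per layer, with no visible-visible or hidden-hidden weights.

import numpy as np

class RBM:
    # Parameters and conditional distributions of one restricted Boltzmann machine.
    def __init__(self, n_visible, n_hidden, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        # W[j, i] couples hidden unit j to visible unit i; there are no
        # connections inside a layer, so W is the only coupling matrix.
        self.W = 0.01 * rng.standard_normal((n_hidden, n_visible))
        self.hidden_bias = np.zeros(n_hidden)
        self.visible_bias = np.zeros(n_visible)

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(hidden_bias[j] + sum_i W[j, i] * v[i])
        return 1.0 / (1.0 + np.exp(-(self.hidden_bias + v @ self.W.T)))

    def visible_probs(self, h):
        # p(v_i = 1 | h) = sigmoid(visible_bias[i] + sum_j W[j, i] * h[j])
        return 1.0 / (1.0 + np.exp(-(self.visible_bias + h @ self.W)))

The two methods anticipate the conditional distributions derived next; the later sketches reuse this class.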
The steps of the RBM learning algorithm can be stated as follows. Due to the conditional independence (no connections) between nodes in the same layer, the conditional distributions are:

P(h|v) = \prod_j p(h_j|v),   p(h_j = 1 | v) = f(c_j + \sum_i w_{ij} v_i),   p(h_j = 0 | v) = 1 - p(h_j = 1 | v)    (1)

and

P(v|h) = \prod_i p(v_i|h),   p(v_i = 1 | h) = f(b_i + \sum_j w_{ij} h_j),   p(v_i = 0 | h) = 1 - p(v_i = 1 | h)    (2)

for a binary data vector, where the function f is the sigmoid \sigma(z) = 1 / (1 + e^{-z}). The joint distribution (likelihood) over the hidden and visible units is defined through the energy function:

P(v,h) = e^{-E(v,h)} / \sum_{v',h'} e^{-E(v',h')},   E(v,h) = -h^T W v - b^T v - c^T h    (3)

where h^T denotes the transpose of h. The gradient of the average log-likelihood with respect to the parameters gives the update rules:

\Delta w_{ij} = \epsilon \, \partial \log p(v) / \partial w_{ij} = \epsilon ( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} )    (4)
\Delta b_i = \epsilon ( \langle v_i \rangle_{data} - \langle v_i \rangle_{model} )    (5)
\Delta c_j = \epsilon ( \langle h_j \rangle_{data} - \langle h_j \rangle_{model} )    (6)

where \epsilon is a small learning-rate parameter. The \langle \cdot \rangle_{model} term takes exponential time to compute exactly, so the Contrastive Divergence (CD) approximation to the gradient is used instead [6]. Contrastive divergence runs the Gibbs sampler for a single iteration, started at the data, instead of waiting until the chain converges. In this case the term \langle \cdot \rangle_1 denotes the expectation with respect to samples obtained from one full Gibbs step initialized at the data, and the update rules become:

\Delta w_{ij} = \epsilon ( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_1 )    (7)
\Delta b_i = \epsilon ( \langle v_i \rangle_{data} - \langle v_i \rangle_1 )    (8)
\Delta c_j = \epsilon ( \langle h_j \rangle_{data} - \langle h_j \rangle_1 )    (9)

The Harmonium RBM is an RBM with Gaussian continuous hidden nodes [6], where f is the normal density, taking the form shown in Equation (10):

p(h_j = h | x) = N(c_j + w_j \cdot x, 1)    (10)

The Harmonium RBM is used for a discrete output in the last layer of a deep belief network for classification.

2.2. Deep Belief Network Architecture

The key idea behind training a deep belief network by training a sequence of RBMs is that the model parameters \theta learned by an RBM define both p(v|h, \theta) and the prior distribution over hidden vectors p(h|\theta), so the probability of generating a visible vector v can be written as:

p(v) = \sum_h p(h|\theta) p(v|h, \theta)    (11)

After learning \theta, p(v|h, \theta) is kept while p(h|\theta) can be replaced by a better model that is learned by treating the hidden activity vectors H = {h} as the training data (visible layer) for another RBM. This replacement improves a variational lower bound on the probability of the training data under the composite model. The study in [12] supports the following three observations: (1) once the number of hidden units in the top level crosses a threshold, the performance essentially flattens at around a certain accuracy; (2) the performance tends to decrease as the number of layers increases; (3) the performance increases as each RBM is trained for an increasing number of iterations.

When class labels and back-propagation are not used in the DBN architecture (unsupervised training) [7], the DBN can serve as a feature extraction method for dimensionality reduction. On the other hand, when class labels are associated with the feature vectors, the DBN is used for classification. There are two general types of DBN classifier architectures: the Back-Propagation DBN (BP-DBN) and the Associative Memory DBN (AM-DBN) [8].
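The one-step contrastive divergence update of Equations (7)-(9) can be spelled out as in the sketch below. It is a minimal NumPy illustration for a single binary data vector, reusing the hypothetical RBM class from the previous sketch; it is not the authors' implementation.

import numpy as np

def cd1_update(rbm, v_data, epsilon=0.1, rng=None):
    # One contrastive-divergence (CD-1) step:
    #   delta W = epsilon * (<v h>_data - <v h>_1), plus the matching bias updates,
    # using a single Gibbs step started at the data instead of a converged chain.
    if rng is None:
        rng = np.random.default_rng(0)

    # Positive phase: hidden probabilities driven by the data.
    h_data = rbm.hidden_probs(v_data)

    # One full Gibbs step: sample hiddens, reconstruct visibles, re-infer hiddens.
    h_sample = (rng.random(h_data.shape) < h_data).astype(float)
    v_recon = rbm.visible_probs(h_sample)
    h_recon = rbm.hidden_probs(v_recon)

    # Updates from the difference between data and reconstruction statistics.
    rbm.W += epsilon * (np.outer(h_data, v_data) - np.outer(h_recon, v_recon))
    rbm.hidden_bias += epsilon * (h_data - h_recon)
    rbm.visible_bias += epsilon * (v_data - v_recon)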
For both architectures, when the number of possible classes is very large and the distribution of frequencies for the different classes is far from uniform, it may sometimes be advantageous to use a different encoding for the class targets than the standard one-of-k softmax encoding.

Back-Propagation DBN: adds a final layer of variables that represent the desired outputs (k outputs) and then performs a purely discriminative fine-tuning phase using back-propagation. Using back-propagation to fine-tune feature detectors that were initially learned as a generative model works much better than using back-propagation with random initial weights, as in a traditional neural network.

Associative Memory DBN: the top-level RBM is trained on data obtained by concatenating the high-level representation produced by unsupervised learning with a binary label vector that contains a 1 in the location representing the correct class. In other words, the top RBM models the joint distribution of the inputs and the associated target classes. When training the top layer of weights (the ones in the associative memory), the labels are provided as part of the input. The labels are represented by turning on one unit in a "softmax" group of k visible units. Softmax converts an arbitrary real-valued vector into a multinomial probability vector; it is a generalization of the sigmoid function to k outcomes.
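The "softmax group" of label units mentioned above is simply the softmax mapping from k real-valued scores to a multinomial probability vector, paired with a one-of-k target encoding. The short sketch below illustrates both; the function names are our own, not the paper's.

import numpy as np

def softmax(z):
    # Map a real-valued score vector z to a probability vector over k classes.
    z = z - z.max()          # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def one_hot(label, k):
    # One-of-k encoding: a 1 in the position of the correct class, 0 elsewhere.
    t = np.zeros(k)
    t[label] = 1.0
    return t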
3. Clustering and Classification using DBN

The aim of this study is to use an undirected DBN for the classification of continuous datasets such as the Iris and Abalone datasets. RBMs were originally developed using binary stochastic units for both the visible and hidden layers. What is known about continuous-valued data and neurons indicates that training is much slower than with binary inputs; given that training on binary inputs is itself a fairly slow process, training directly on continuous inputs would have been infeasible. Previous work on continuous-valued input to RBMs includes, for example, adding noise to sigmoid units [9]. In this work the input is scaled into the interval [0, 1] before it is clustered with the DBN. The DBN consists of three stacked RBMs: the first RBM takes the (scaled) input as its visible layer, and its hidden layer becomes the visible layer of the second RBM. The hidden layer of the final RBM, which consists of only one unit, is the output of the DBN. The first RBM is trained by running Gibbs sampling for 1000 iterations; its output is then passed to the second RBM, which is trained with another 1000 iterations of Gibbs sampling. The architecture of the DBN is shown in Figure 1.

The steps of the DBN classification approach can be summarized as follows: the first RBM receives the input at its visible nodes and models it on its hidden layer; the modelled input on the hidden layer is then passed to the visible nodes of the second RBM; this modelling and passing continues up to the last layer, which is composed of one unit. In this approach feature selection is an optional step that depends on the data itself. The procedure is given below (DBN classification algorithm); a code sketch of the same layer-wise pass follows Figure 1.

DBN Classifier
    Initialize epsilon = 0.1                 // epsilon: learning rate
    Initialize gn = 1000                     // gn: number of Gibbs sampling iterations
    Read and scale the input to the range [0, 1] into a two-dimensional array v[NI][NF]   // NI, NF: number of input objects and of features
    Select the most discriminative features (optional)
    Initialize n = 3                         // number of RBMs
    Initialize the number of hidden units of each RBM
    Initialize W randomly                    // W: weights of the DBN network (arrays W1, W2, W3)
    Define W'                                // trained weights produced by the DBN (arrays W'1, W'2, W'3)
    Call DBN_train(n, v, gn)
    Cluster the output; assign a class label to each cluster according to the input
    Run the objects of the testing dataset through the trained DBN; the output of the DBN determines each object's class according to the cluster range

DBN_train(n, v, gn)
    // Train on the input; the result of training is a one-dimensional array of length NI
    W1, b1 = RBM_Alg(v, epsilon, W'1, b, c, NF/2, gn)
    for all hidden units i: v1[k][i] = P(v1[k][i] = 1 | v[k]) = sigm(b1[i] + sum_j(W1[i][j] * v[k][j]))
    W2, b2 = RBM_Alg(v1, epsilon, W'2, b, c, NF/4, gn)
    for all hidden units i: v2[k][i] = P(v2[k][i] = 1 | v1[k]) = sigm(b2[i] + sum_j(W2[i][j] * v1[k][j]))
    W3, b3 = RBM_Alg(v2, epsilon, W'3, b, c, 1, gn)
    for the single hidden unit: v3[k][0] = P(v3[k][0] = 1 | v2[k]) = sigm(b3[0] + sum_j(W3[0][j] * v2[k][j]))
    return v3                                // the output of the DBN network

RBM_Alg(v, epsilon, W, b, c, l, gn)
    // l is the number of hidden units; b and c are the hidden and visible biases
    repeat gn times
        for all hidden units i:  h[i] = P(h[i] = 1 | v[k]) = sigm(b[i] + sum_j(W[i][j] * v[k][j]))
        for all visible units j: v'[k][j] = P(v[k][j] = 1 | h) = sigm(c[j] + sum_i(W[i][j] * h[i]))
        for all hidden units i:  h'[i] = P(h[i] = 1 | v'[k]) = sigm(b[i] + sum_j(W[i][j] * v'[k][j]))
        W += epsilon * (h * v[k]^T - h' * v'[k]^T)
        b += epsilon * (h - h')
        c += epsilon * (v[k] - v'[k])
    return W, b, c

Figure 1: The architecture of the DBN network used (a stack of three RBM layers).
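As referenced above, the following sketch shows the same greedy layer-wise pass in NumPy: each RBM is trained with CD-1 on the activations produced by the layer beneath it, and the single-unit top RBM yields one scalar output per object. It reuses the hypothetical RBM and cd1_update helpers from the earlier sketches; the per-vector training loop and the default layer sizes (NF/2, NF/4 and 1 for the four Iris features) are our reading of the pseudocode, not the authors' code.

import numpy as np

def train_dbn(X, layer_sizes=(2, 1, 1), epsilon=0.1, n_sweeps=1000, rng=None):
    # Greedy layer-wise training of a stack of RBMs on data X scaled to [0, 1].
    # layer_sizes lists the hidden-unit counts; the last entry is 1, so the DBN
    # produces a single value per input object. Reduce n_sweeps for a quick test.
    if rng is None:
        rng = np.random.default_rng(0)
    rbms, layer_input = [], np.asarray(X, dtype=float)
    for n_hidden in layer_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng=rng)
        for _ in range(n_sweeps):              # repeated CD-1 passes over the data
            for v in layer_input:
                cd1_update(rbm, v, epsilon, rng)
        # The hidden probabilities of this RBM become the visible layer of the next one.
        layer_input = np.array([rbm.hidden_probs(v) for v in layer_input])
        rbms.append(rbm)
    return rbms, layer_input.ravel()           # one scalar DBN output per object

The returned outputs can then be grouped into intervals and labelled, as described in the next section.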
4. Experimental Results and Discussion

The DBN classification approach has been applied to the well-known Iris dataset [10], consisting of 148 objects and 3 classes. The classification used 90% of the input for training and 10% for testing. The output of the DBN network falls into three distinct intervals (clusters) with the following ranges: cluster 1 [0, 0.264], cluster 2 (0.264, 0.782) and cluster 3 [0.782, 1]. From the class labels of the objects, classes 2, 1 and 0 correspond to clusters 1, 2 and 3 respectively. The remaining 10% of the dataset is then tested by passing it through the trained DBN and finding the range in which the DBN output of each object lies. The class of each object is determined from that range and compared with the class label associated with the object. The resulting classification accuracy on the 10% test portion of the dataset is 93.3%. The clustering result is shown in Figure 2.

Figure 2: DBN output of the single unit in the last hidden layer for the Iris dataset.

Classification of the Abalone dataset. Another dataset, the Abalone dataset, has also been tested (Figure 3). The output can be divided into two intervals, each containing two classes: the first is [0.16, 0.44] and includes the class 8 and class 9 objects, and the second is [0.5, 1] and includes the class 6 and class 7 objects. A prior step applied to this dataset selected 4 of the 7 features, and it was very effective: as seen in Figure 4, without feature selection all four classes appear concentrated in the same interval [0.2, 0.4].

Figure 3: DBN output of the single unit in the last hidden layer for the Abalone dataset after feature selection.

Figure 4: DBN output of the single unit in the last hidden layer for the Abalone dataset before feature selection.

The classification performance does not change when the number of RBM layers or the number of Gibbs sampling iterations (1000) is increased. Table 1 compares the performance of the proposed classifier with other classifiers from the Weka software [11].

Table 1: Comparison of accuracy with Weka classifiers.
    Dataset | DBN (80% training) | DBN (90% training) | Weka classification accuracy: BN | SVM | MLP | DT
    Iris    | (accuracy values not recoverable from the extracted text)
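The interval-based decision rule used for testing can be written as the small sketch below. The boundaries are the cluster ranges reported above for the Iris run (with the cluster 2 range inferred as the gap between clusters 1 and 3), and the cluster-to-class mapping follows the correspondence stated in the text; the function itself is an assumed helper, not the authors' code.

def predict_iris_class(dbn_output):
    # Assign a class label from the scalar DBN output using the reported ranges:
    # cluster 1 = [0, 0.264] -> class 2, cluster 2 = (0.264, 0.782) -> class 1,
    # cluster 3 = [0.782, 1] -> class 0.
    if dbn_output <= 0.264:
        return 2
    elif dbn_output < 0.782:
        return 1
    else:
        return 0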
5. Conclusion

The undirected deep architecture has provided good results in dimensionality reduction; this supports the idea of using DBNs for learning in artificial intelligence. For classification, two types of supervised training have been proposed for the DBN structure, BP-DBN and AM-DBN. Using a DBN without class labels, i.e. with unsupervised learning, can lead to dimensionality reduction. In this study we have shown that the unsupervised learning of a DBN can lead to clustering of the data and subsequently to its classification. The approach has been tested on two different, well-known datasets, Iris and Abalone. The continuous nature of these two datasets was a further challenge, which was handled by scaling them into the interval between 0 and 1.

References

[1] Y. Bengio, P. Lamblin, D. Popovici and H. Larochelle, "Greedy Layer-Wise Training of Deep Networks", Advances in Neural Information Processing Systems 19, MIT Press, Cambridge, MA.
[2] G. E. Hinton, "A fast learning algorithm for deep belief nets", Neural Computation, vol. 18(7), July 2006.
[3] H. Larochelle, Y. Bengio, J. Louradour and P. Lamblin, "Exploring Strategies for Training Deep Neural Networks", Journal of Machine Learning Research, vol. 10, 2009.
[4] H. Larochelle and Y. Bengio, "Classification using discriminative restricted Boltzmann machines", Proceedings of the 25th International Conference on Machine Learning, vol. 307, 2008.
[5] I. Sutskever and G. E. Hinton, "Learning multilevel distributed representations for high-dimensional sequences", Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007.
[6] A. K. Noulas and B. J. A. Krose, "Deep Belief Networks for Dimensionality Reduction", Belgian-Dutch Conference on Artificial Intelligence, Netherlands, 2008.
[7] I. Goodfellow, Q. Le, A. Saxe and A. Ng, "Measuring invariances in deep networks", Advances in Neural Information Processing Systems, vol. 22, 2009.
[8] A. R. Mohamed, G. Dahl and G. E. Hinton, "Deep belief networks for phone recognition", NIPS 22 Workshop on Deep Learning for Speech Recognition.
[9] H. Chen and A. Murray, "A continuous restricted Boltzmann machine with an implementable training algorithm", IEE Proceedings - Vision, Image and Signal Processing, vol. 150(3), 2003.
[10] UCI Machine Learning Repository.
[11] Weka: Data Mining Software in Java.
[12] L. McAfee, "Document Classification using Deep Belief Nets", CS224n, Spring.