Hierarchical Representation Using NMF


Hyun Ah Song 1 and Soo-Young Lee 1,2
1 Department of Electrical Engineering, KAIST, Daejeon, Republic of Korea
2 Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
hi.hyunah@gmail.com, sylee@kaist.ac.kr
Note: Hyun Ah Song is now at KIST (Korea Institute of Science and Technology).
M. Lee et al. (Eds.): ICONIP 2013, Part I, LNCS 8226, pp. 466-473. Springer-Verlag Berlin Heidelberg 2013

Abstract. In this paper, we propose a representation model that performs hierarchical feature learning using nsNMF. We stack a simple unit algorithm into several layers to take a step-by-step approach to learning. By using NMF as the unit algorithm, the proposed network provides an intuitive understanding of the feature development process: it is able to represent the underlying structure of the feature hierarchies present in complex data in an intuitively understandable manner. Experiments with document data successfully discovered feature hierarchies of concepts in the data. We also observed that the proposed method yields much better classification and reconstruction performance, especially for small numbers of features.

Keywords: Hierarchical representation, NMF, unsupervised feature learning, multi-layer, deep learning.

1 Introduction

Humans are efficient learning machines. We can easily extract features from complex data and understand them. How do we do this? We use a hierarchical feature extraction strategy: by breaking a complex problem into several simple problems, we solve them one by one over multiple stages [2]. By integrating simple solutions throughout the layers, an algorithm can solve a complex problem without involving complex mathematical functions. The visual cortex supports this hierarchical information processing mechanism well [6]. Motivated by this biological evidence, researchers have been paying attention to hierarchical feature extraction approaches. One of the best-known algorithms is the Deep Belief Network (DBN) introduced in 2006 [5], where Hinton showed the first success in training deep architectures of Restricted Boltzmann Machines (RBMs) in a greedy layer-wise manner. With the success of training deep architectures, several variants of deep learning have been introduced: auto-encoders stacked into several layers [3, 8], and NMF stacked into several layers [1, 4, 10].

Although these multi-layered algorithms take a hierarchical approach to feature extraction and provide efficient solutions to complex problems, they do not provide intuitive relationships among features, in the form of hierarchies, learned throughout the hierarchical structure: most deep learning networks allow both addition and subtraction of features in the hierarchical learning process, which results in an intricate representation of the feature development process that is hard to follow intuitively.

In this paper, we propose a hierarchical data representation model, hierarchical multi-layer non-negative matrix factorization (an extended version of [11]). We extend a variant of the NMF algorithm [7], nsNMF [9], into several layers for hierarchical learning. By stacking an algorithm that enforces non-negativity, we allow only the addition of features in the process of developing feature hierarchies. We demonstrate the intuitive feature development process along the layers, and display the hierarchies present in the data set by learning relationships between features across the layers. We also show that, compared to one-step learning, the hierarchical approach learns more meaningful and helpful features, which leads to better distributed representations and to better classification and reconstruction performance for small numbers of features.

The organization of the paper is as follows. In Section 2, we introduce the unit algorithm of our hierarchical network, nsNMF. We then look into the structure of the proposed network in Section 3. We explain the intuitive understanding of our hierarchical feature extraction process in Section 4. We demonstrate the experimental results of the proposed network on the Reuters document data set in Section 5, and close the paper in Section 6.

2 Non-smooth Non-negative Matrix Factorization (nsNMF)

The proposed network is constructed by stacking nsNMF [9] into several layers. Non-smooth non-negative matrix factorization (nsNMF) is a variant of NMF that imposes a sparsity constraint. Basic NMF decomposes non-negative input data X into non-negative W and H, which are the features and the corresponding coefficients (data representation), respectively. It aims to reduce the error between the original data X and its reconstruction WH:

C = \frac{1}{2}\|X - WH\|^2 = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\Bigl(X_{ij} - \sum_{k=1}^{f} W_{ik}H_{kj}\Bigr)^2.

To apply the sparsity constraint to standard NMF, a smoothing matrix S is introduced in [9]:

S = (1 - \theta)\,I(k) + \frac{\theta}{k}\,\mathrm{ones}(k),

where k is the number of features and \theta, in the range 0 to 1, is the parameter controlling the smoothing effect. I(k) is the identity matrix of size k x k, and ones(k) is a k x k matrix whose components are all 1. We smooth a matrix by multiplying it with S; the closer \theta is to 1, the stronger the smoothing effect. During the alternating updates, we smooth the H matrix by multiplying S and H, i.e., H = SH, at each iteration. To compensate for the loss of sparsity caused by smoothing, W becomes sparse.
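To make the alternating procedure concrete, here is a minimal NumPy sketch of nsNMF with multiplicative updates and the smoothing matrix S inserted on each side; the initialization, iteration count, and epsilon guard are our own assumptions, not details taken from the paper or from [9].

```python
import numpy as np

def nsnmf(X, k, theta=0.5, n_iter=200, eps=1e-9):
    """Minimal non-smooth NMF sketch: X (m x n) ~ W (m x k) @ S @ H (k x n)."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k))
    H = rng.random((k, n))
    # Smoothing matrix: S = (1 - theta) * I(k) + (theta / k) * ones(k)
    S = (1.0 - theta) * np.eye(k) + (theta / k) * np.ones((k, k))
    for _ in range(n_iter):
        Hs = S @ H                                   # smoothed H (H = SH)
        W *= (X @ Hs.T) / (W @ Hs @ Hs.T + eps)      # multiplicative update of W
        Ws = W @ S                                   # smoothed W for the H update
        H *= (Ws.T @ X) / (Ws.T @ Ws @ H + eps)      # multiplicative update of H
    return W, H
```

The same routine is reused below as the unit algorithm of each layer.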

3 Multi-layer Architecture

The proposed hierarchical multi-layer NMF structure consists of several layers of the unit algorithm. The structure is described in Fig. 1.

Fig. 1. Overall architecture of the hierarchical multi-layer NMF network.

We first train each layer separately. We process the outcome of each layer, H^(l), to obtain K^(l) as K^(l) = f(H^(l) / M^(l)), where the division is element-wise, M^(l) is a matrix of the mean values of H^(l) (defined below), f(·) is a nonlinear function, and the superscript denotes the layer index, l = 1, 2, ..., L. The processed data representation K^(l) is used as the input to the next layer: using nsNMF, K^(l) is decomposed into W^(l+1) and H^(l+1), i.e., K^(l) ≈ W^(l+1) H^(l+1). We repeat this process, extending the layers for l = 1, 2, ..., L.

After training layers 1 to L separately, we use the outcome of the separate training as initialization and train the whole network jointly. The cost function for joint training is

C = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\Bigl(X_{ij} - \sum_{k=1}^{f} W^{(1)}_{ik}\tilde{H}^{(1)}_{kj}\Bigr)^2,

where \tilde{H}^{(l)} is the reconstruction of H^(l), which can be computed via back-propagation of errors from the last layer to the l-th layer. The computation can be written in a simpler form, similar to [1], as in equation (1), where \otimes and the fractions denote element-wise multiplication and division:

W^{(l)} \leftarrow W^{(l)} \otimes \frac{Nu^{(l)}\,\tilde{H}^{(l)T}}{De^{(l)}\,\tilde{H}^{(l)T}}, \qquad H^{(l)} \leftarrow H^{(l)} \otimes \frac{W^{(l)T}\,Nu^{(l)}}{W^{(l)T}\,De^{(l)}},   (1a)

Nu^{(l)} = \begin{cases} X & \text{if } l = 1 \\ \bigl(W^{(l-1)T}\,Nu^{(l-1)}\bigr) \otimes \bigl(M^{(l-1)} \otimes f^{-1}(W^{(l)}H^{(l)})\bigr) & \text{otherwise,} \end{cases}   (1b)

De^{(l)} = \begin{cases} \tilde{X} & \text{if } l = 1 \\ \bigl(W^{(l-1)T}\,De^{(l-1)}\bigr) \otimes \bigl(M^{(l-1)} \otimes f^{-1}(W^{(l)}H^{(l)})\bigr) & \text{otherwise.} \end{cases}   (1c)

Here, \tilde{X} = W^{(1)}\tilde{H}^{(1)}, and \tilde{H}^{(l)} can be computed as shown in (2):

\tilde{H}^{(l)} = \begin{cases} H^{(l)} & \text{if } l = L \\ M^{(l)} \otimes f^{-1}\bigl(W^{(l+1)}\tilde{H}^{(l+1)}\bigr) & \text{if } l = L-1, \ldots, 1. \end{cases}   (2)

M^(l) is a matrix of the column-wise means of H^(l), and f^{-1}(·) is the inverse of the nonlinear function. After training up to the last layer, the final data representation H^(L) is acquired. This is the activation information of the complex features, i.e., the integration of features throughout the layers, W^(1) W^(2) ... W^(L).
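The greedy layer-wise stage can be sketched in a few lines by stacking the nsnmf routine above. The logistic choice for f, the per-feature mean used for M^(l), and the example layer sizes are our assumptions for illustration; the joint fine-tuning of equations (1)-(2) is omitted.

```python
import numpy as np

def f(Z):
    """Assumed nonlinearity (logistic); the exact choice of f is not pinned down here."""
    return 1.0 / (1.0 + np.exp(-Z))

def layerwise_pretrain(X, layer_sizes, theta=0.5, n_iter=200):
    """Greedy layer-wise training sketch of the hierarchical multi-layer NMF.

    X           : non-negative data matrix (m x n)
    layer_sizes : number of features per layer, e.g. [160, 20]
    Returns the per-layer factor lists Ws, Hs.
    """
    Ws, Hs = [], []
    K = X
    for k in layer_sizes:
        W, H = nsnmf(K, k, theta=theta, n_iter=n_iter)   # unit algorithm from the previous sketch
        Ws.append(W)
        Hs.append(H)
        M = H.mean(axis=1, keepdims=True) + 1e-9         # assumed form of M^(l): mean activation per feature
        K = f(H / M)                                     # K^(l) = f(H^(l) / M^(l)) feeds the next layer
    return Ws, Hs
```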

4 Intuition of Hierarchical NMF Feature Learning in Images

In this section, we provide an intuitive understanding of the hierarchical feature learning of the proposed network. The hierarchical feature learning shows what kind of features develop at each layer, and how features from the lower layer are combined to form higher-layer features. Other deep learning algorithms also learn hierarchical features that are combinations of lower-layer features. However, since they do not restrict the values to be non-negative, the combination of features involves subtraction as well; this makes the feature development process hard to follow, and the representation of the features is not intuitively understandable. In contrast, hierarchical NMF provides an intuitive understanding of feature hierarchies by allowing only weighted summations of features during the hierarchical learning process. This can be interpreted as simply adding Lego blocks to construct a complete structure.

To aid understanding, we demonstrate the feature learning process using image data, the MNIST digit data (footnote 1). In Fig. 2, the construction of the data from learned features is described as a feature hierarchy. The first layer learns very simple spot-like features W^(1), which can be seen as basic building blocks. In the second layer, W^(2), combinations of these first-layer features are learned in a distributed pattern. Integration through the two layers produces complex blocks by combining W^(1) according to W^(2). We can intuitively follow the process of building up features in a weighted-summation manner, thanks to the non-negativity constraint, which allows only simple add-ons. The combination of complex features is in turn combined to represent an original data sample. As this demonstration shows, the proposed hierarchical network learns features as the building blocks of the data and provides an intuitive hierarchical process, discovering the feature hierarchies present in complex data.

Fig. 2. Feature hierarchy of the MNIST database. Images in order of W^(1), W^(1)W^(2), X, from bottom to top.

5 Experiment with Document Data

We applied the proposed network to a document database. We used the Reuters-21578, Distribution 1.0 collection as input data (footnote 2). We selected the top 10 categories from the ModApte split and reduced the dimensionality to 1000 by removing stop-words.

Footnote 1: Available at:
Footnote 2: The Reuters-21578, Distribution 1.0 test collection is available from David D. Lewis' professional home page, currently:
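As a minimal sketch of this preprocessing, the snippet below builds the 1000-term count matrix with scikit-learn; load_reuters_documents is a hypothetical helper standing in for whatever loader is used for the ModApte split, and since the paper does not specify its term weighting, raw counts are assumed.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical helper: returns raw texts and labels of the top-10 ModApte categories.
docs, labels = load_reuters_documents()

# Keep the 1000 most frequent terms after English stop-word removal.
vectorizer = CountVectorizer(stop_words="english", max_features=1000)
X = vectorizer.fit_transform(docs).T.toarray().astype(float)   # term x document, non-negative

print(X.shape)   # (1000, number_of_documents)
```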

There are 5786 training samples and 2587 test samples. We constructed a two-layered network with the numbers of hidden neurons set to 160 and a, where a denotes the dimension of the final data representation.

5.1 Feature Hierarchies

In Fig. 3, an example of the top 10 words of the high-level features learned via the integration of two layers, W^(1)W^(2), is displayed. Simple observation reveals which topic each feature represents: grain, money, crude, interest, coffee, trade, earn, acq, grain, and ship.

Fig. 3. An example of W^(1)W^(2).
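To reproduce the kind of inspection shown in Fig. 3, the sketch below lists the top 10 vocabulary terms of each integrated feature W^(1)W^(2); it assumes the Ws list from the layer-wise sketch and the fitted vectorizer from the preprocessing sketch, and is only meant to show the bookkeeping, not the authors' exact procedure.

```python
import numpy as np

vocab = np.array(vectorizer.get_feature_names_out())   # vocabulary from the preprocessing sketch
W_integrated = Ws[0] @ Ws[1]                            # term x high-level-feature matrix W^(1) W^(2)

for j in range(W_integrated.shape[1]):
    top = np.argsort(W_integrated[:, j])[::-1][:10]     # indices of the 10 largest weights in column j
    print(f"feature {j}: {', '.join(vocab[top])}")
```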

As in Fig. 2, the features shown in Fig. 3 are part of a hierarchy of concepts in Reuters. An example of how concepts form hierarchies in Reuters is shown in Fig. 4. In Fig. 4(a), three first-layer features W^(1) are weighted and summed to form a second-layer feature W^(2). By observing the words in each feature, we see that the lower-layer features each cover a small scope of the topic and contain various words. However, as they proceed to the higher layer, they converge to represent one broad common concept, oil, with their top four words being synonyms of oil. Based on Fig. 4(a), we can construct a concept hierarchy in Reuters as shown in Fig. 4(b). From the hierarchical concept diagram, we observe that the broad topic oil (words indicated in red) contains various other oil-related words (colored in blue) at the low level. We can also observe how some words are extracted together to comprise a feature, showing their relations with each other in the first layer: the words exploration, ecuador, pipeline, well, and saudi form a distinct group, and the same holds for texas, increase, purchase, contract, effective, and for mln, stocks, fell, rose, reserves, refinery. This grouping information can be interpreted as sub-topics under the topic oil. If we used a single-layered network, all we could observe would be the red words that indicate the topic oil. With the hierarchical representation, however, we can look deeper into the data structure, since the contents of the blue words and their groupings at the low level are exposed.

Fig. 4. Concept hierarchy in Reuters. (a) Experimental results, and (b) diagram of the result in (a).

5.2 Classification and Reconstruction Property

The classification and reconstruction performance for various a (the dimension of H^(L)) is shown in Fig. 5 (a) and (b), respectively. For the classifier, we used an SVM. We calculated the reconstruction error as

Mean reconstruction error = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\bigl|X_{ij} - \tilde{X}_{ij}\bigr|.

The proposed hierarchical feature extraction method results in much better classification and reconstruction, especially for small numbers of features, compared to a standard network that consists of only one layer. Even at a dimension of 20, the proposed network already reaches the maximum performance it attains after convergence. This supports the view that taking a step-by-step approach, learning features in a hierarchical manner, provides a better chance of learning meaningful features from complex data: the first layer pre-processes the complex data by breaking it down into small units, lessening the burden on the second layer, which then only needs to learn how to combine these units.
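A minimal sketch of the two evaluation measures follows; the particular SVM (scikit-learn's LinearSVC with default settings) is our assumption, since the paper only states that an SVM was used.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

def mean_reconstruction_error(X, X_rec):
    """Mean absolute element-wise error between the data and its reconstruction."""
    m, n = X.shape
    return np.abs(X - X_rec).sum() / (m * n)

def classification_accuracy(H_train, y_train, H_test, y_test):
    """Train an SVM on the final representation H^(L); samples are columns, hence the transposes."""
    clf = LinearSVC()   # assumed SVM variant
    clf.fit(H_train.T, y_train)
    return accuracy_score(y_test, clf.predict(H_test.T))
```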

Fig. 5. (a) Reuters data classification, and (b) Reuters data reconstruction.

In Fig. 6, we show two examples of reconstruction; the same word is marked with the same color. In the first example, the reconstruction via the proposed network (c) recovers most of the significant words present in the original data (a). It also successfully learned the importance of the words, displaying a word order similar to the original. In contrast, the standard network (b) misses key words and fails to capture the importance of the words, showing them in a mixed order compared to the original. The second example shows a similar result: in (e), the single-layer feature confuses the subject of the topic by containing the words wheat, oil, corn, and sugar altogether, whereas the proposed network (f) correctly extracts key words related to the subject oil.

Fig. 6. Two examples of original data and their reconstructions by the standard (single-layered) and proposed networks, shown in (a), (b), (c) and (d), (e), (f), respectively.

6 Conclusion

In this paper, we proposed a hierarchical representation model based on nsNMF that takes a step-by-step approach to learning the features of complex data. The proposed network discovers the feature hierarchies present in complex data and demonstrates them in an intuitively understandable manner by learning feature relationships across the layers in a non-negative fashion. Through simple addition and accumulation of features, we are able to understand the data structure and construct a hierarchy based on the information learned by the network. We also showed that the proposed network provides better classification and reconstruction performance than a single-layered network when only a small number of dimensions is provided for the final data representation. As future work, we would like to apply the proposed network to various types of data to discover the underlying feature hierarchies in complex data.

Acknowledgments. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning.

References

1. Ahn, J.-H., Choi, S., Oh, J.-H.: A multiplicative up-propagation algorithm. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 3. ACM (2004)
2. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1) (2009)
3. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems 19, 153 (2007)
4. Cichocki, A., Zdunek, R.: Multilayer nonnegative matrix factorisation. Electronics Letters 42(16) (2006)
5. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Computation 18(7) (2006)
6. Hubel, D.H., Wiesel, T.N.: Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology 160(1), 106 (1962)
7. Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755) (1999)
8. Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. Advances in Neural Information Processing Systems 19 (2007)
9. Pascual-Montano, A., Carazo, J.M., Kochi, K., Lehmann, D., Pascual-Marqui, R.D.: Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence 28(3) (2006)
10. Rebhan, S., Eggert, J.P., Gross, H.-M., Körner, E.: Sparse and transformation-invariant hierarchical NMF. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. LNCS, vol. 4668. Springer, Heidelberg (2007)
11. Song, H.A., Lee, S.Y.: Hierarchical data representation model - multi-layer NMF. arXiv preprint (2013)
