Unsupervised Deep Learning for Scene Recognition

Akram Helou and Chau Nguyen

May 19

1 Introduction

Object and scene recognition are usually studied separately. However, research [2] shows that context from scene recognition can greatly improve object recognition performance. The performance of object detectors can be improved by adding information about the type of scene the object is embedded in; this helps disambiguate the class of an object. Gist is an example of a global image feature descriptor for characterizing the global properties of a scene [3]. The aim of our project is to begin exploring the possibility of learning global scene features which can later be used to improve the performance of standard object detectors. The main advantage of learning features automatically is that it is difficult to hand-engineer features which capture the full statistical properties of natural scenes.

2 Related Work

Gist is an example of a global image feature descriptor for characterizing the global properties of a scene [3]. [1, 3] use Gist features in order to classify images into the types of scenes they belong to and to bias an object recognition algorithm to look for an object only in the region of the image where the object is most likely to be located. [3] consider the task of classifying images into types of scenes: on a dataset containing images belonging to eight different scenes, Gist features are fed into a classifier to achieve an accuracy of 83.7%. In general, Gist is often used in object recognition systems that make use of context. Clever features like Gist pervade computer vision; however, such features have been carefully engineered over years using specialized knowledge. In order to accelerate research and advances in computer vision, it would be advantageous to learn good features automatically. Recently, deep learning techniques have been shown to learn image features which are competitive with hand-engineered image features such as Gabor filters [6] in a completely unsupervised setting.
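As a rough illustration of the Gist-based classification pipeline described above, the sketch below trains a linear SVM on precomputed Gist descriptors. The 960-dimensional descriptor size and the eight scene categories come from this report; the arrays here are random placeholders, and how the descriptors are actually computed is outside the scope of the sketch.

```python
# Minimal sketch: scene classification from precomputed Gist descriptors.
# The feature and label arrays below are random stand-ins for real data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_images, n_features = 2688, 960                 # 8 outdoor categories, 960-D Gist
gist_features = rng.normal(size=(n_images, n_features))   # placeholder descriptors
labels = rng.integers(0, 8, size=n_images)                # placeholder scene labels

X_train, X_test, y_train, y_test = train_test_split(
    gist_features, labels, test_size=400, stratify=labels, random_state=0)

clf = LinearSVC(C=1.0).fit(X_train, y_train)
print("scene classification accuracy:", clf.score(X_test, y_test))
```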

3 Algorithms and Implementations

3.1 Deep Learning

A deep learning algorithm is any algorithm that attempts to learn a hierarchical representation of the data, usually in an unsupervised way [4]. This hierarchical representation is then used for solving standard machine learning problems, namely prediction. Deep learning techniques, namely Deep Belief Nets (DBN) [5] and similar approaches, have many attractive properties, including:

1. Learning can be done in both unsupervised and supervised settings. [5]
2. A hierarchical non-linear representation of the data can boost performance on supervised learning tasks. [5]
3. Learning leads to a generative model of the data which can be efficiently sampled from. [5]
4. There are efficient algorithms for learning deep representations. [5]
5. A function represented using K layers may need an exponentially bigger representation in order to be accurately modeled with K-1 layers. [4]
6. The mammalian visual cortex has a hierarchical architecture. Furthermore, sparse DBNs have been shown to learn features similar to those found in the V1 and V2 cells of the visual cortex. [6]

3.2 Deep Belief Networks

A Deep Belief Network (DBN) is a generative multilayer directed graphical model with unrestricted weights between layers [5]. A graphical representation of a DBN can be seen in Figure 1. In a DBN, the top hidden layer and the hidden layer below it define a Restricted Boltzmann Machine (RBM). An RBM (Figure 1) is an undirected probabilistic generative model consisting of one hidden and one visible layer. There is a connection between every hidden and visible unit; these connections define a weight matrix W. Learning W via maximum likelihood is too slow in practice. Instead, the Contrastive Divergence (CD) algorithm is used to approximate learning in an RBM. CD approximates a different objective function than the one maximized by maximum likelihood, but in practice CD works well. Overall, learning a DBN model proceeds by learning the weights connecting every hidden layer with the one below it, starting from the lowest level and treating each pair of adjacent layers as an RBM. This greedy learning procedure offers no formal guarantees for converging to the maximum likelihood model; however, it works well in practice.
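For concreteness, the following is a minimal NumPy sketch of one Contrastive Divergence (CD-1) update for a binary RBM. The variable names, learning rate, and batch handling are illustrative assumptions and not taken from our Matlab code; stacking such RBMs greedily, using each layer's hidden probabilities as data for the next, gives the layer-wise DBN procedure described above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_vis, b_hid, lr=0.01, rng=np.random.default_rng(0)):
    """One CD-1 step for a binary RBM on a batch of visible vectors v0."""
    # Positive phase: infer and sample hidden units from the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step (reconstruct visibles, re-infer hiddens).
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Approximate gradient: data-driven minus reconstruction-driven statistics.
    batch = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return ((v0 - p_v1) ** 2).mean()   # reconstruction error, useful for monitoring

# Tiny usage example on random binary data.
rng = np.random.default_rng(0)
v = (rng.random((64, 100)) > 0.5).astype(float)   # 64 examples, 100 visible units
W = 0.01 * rng.standard_normal((100, 50))         # 100 visible x 50 hidden units
b_vis, b_hid = np.zeros(100), np.zeros(50)
for epoch in range(5):
    err = cd1_update(v, W, b_vis, b_hid)
print("reconstruction error:", err)
```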

Figure 1: Graphical representation of a Deep Belief Network.

Figure 2: Graphical representation of a Convolutional Restricted Boltzmann Machine.

3.3 Convolutional Deep Belief Networks

A Convolutional Deep Belief Network (CDBN) is a variant of the DBN. The building blocks of a CDBN are Convolutional Restricted Boltzmann Machines (CRBM) (Figure 2). A CRBM differs from an RBM in that the weights connecting hidden units to visible units are constrained to be equal over a set of visible units. In the case of an image, these weights define a filter which is convolved with the image. Additionally, a CRBM has a pooling layer on top of the hidden layer. A pooling layer compresses the hidden layer by choosing the maximal hidden activity over every image patch; this is called maximum pooling. Maximum pooling makes the learned features invariant to slight noise and translation. When training a CDBN, the pooling layer of the first CRBM is fed into the next CRBM. This procedure allows for learning higher-level features as we move up in the hierarchy of layers. Overall, CDBNs are attractive for computer vision tasks. They scale to higher-resolution images than DBNs because they have fewer parameters to learn. The convolutional structure of the network takes the structure of an image into account. Maximum pooling allows for learning shift-invariant features.

3.4 Implementations

We found Matlab code for training a DBN; however, it was poorly written and only worked on the MNIST handwritten digits data, so we heavily modified the code to make it more general and more tunable. We also added sparsity to the learning procedure because it was shown to help learn more edge-like features [6]. We were not able to find code for training a CDBN, so we implemented our own CDBN in Matlab as described in [7].
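To make the CRBM filtering and maximum pooling step of Section 3.3 concrete, here is a minimal NumPy sketch for a single filter, assuming valid-mode 2D convolution and non-overlapping pooling blocks. The filter values, block size, and helper names are illustrative assumptions rather than a reproduction of our Matlab implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def crbm_hidden_and_pool(image, filt, bias=0.0, block=2):
    """Hidden activation probabilities for one CRBM filter, followed by max pooling."""
    # Shared weights: the same filter is applied at every image location.
    pre_activation = convolve2d(image, filt, mode="valid") + bias
    hidden = 1.0 / (1.0 + np.exp(-pre_activation))       # sigmoid probabilities
    # Maximum pooling: keep the strongest activation in each non-overlapping block.
    h, w = hidden.shape
    h, w = h - h % block, w - w % block                   # crop to a multiple of the block size
    blocks = hidden[:h, :w].reshape(h // block, block, w // block, block)
    pooled = blocks.max(axis=(1, 3))
    return hidden, pooled

rng = np.random.default_rng(0)
image = rng.random((100, 100))               # a 100*100 grayscale image, as in our experiments
filt = 0.1 * rng.standard_normal((10, 10))   # one illustrative 10*10 filter
hidden, pooled = crbm_hidden_and_pool(image, filt)
print(hidden.shape, pooled.shape)            # (91, 91) -> (45, 45) after 2x2 pooling
```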

Implementing the DBN and CDBN was time-consuming. Training a DBN or CDBN is not straightforward because the models have upwards of eight parameters to tune and because the learning procedure, being an approximation, offers no formal guarantees on performance and convergence. Auxiliary functions for monitoring overfitting and divergence conditions need to be written. Finally, the most challenging aspect of using these models is that they remain slow: modeling images required more than a million parameters using the DBN in our case. Therefore, we optimized our code and parallelized parts of our CDBN to take advantage of multiple cores. The code is available, although it still needs to be cleaned up and some functions are not yet properly documented.

4 Experimental Results

4.1 Data

We used Torralba's scene recognition dataset consisting of 256*256 color images. The dataset contains 8 outdoor scene categories: coast, mountain, forest, open country, street, inside city, tall buildings, and highways.

4.2 Deep Belief Network Results

Running a DBN or CDBN is very time-consuming, so we were not able to properly cross-validate the models over a large number of parameter values. Instead, we used a training and testing subset of the scene recognition data for quickly evaluating the data reconstruction error of multiple DBN models over a range of parameters. (We eventually discovered that data reconstruction error alone is not a good metric for assessing the generative performance of a DBN.) We picked the best set of parameters and used it to train on 1800 images and test on the remaining images. After the DBN model has been learned, we use the activities of all the hidden layers as features for a classifier, which we used to classify the test data over the eight given scenes. (We used the hidden activities over the same 1800 images used during training of the DBN. This is clearly undesirable, but we did not find the time to rerun the experiments.) Figure 3 describes the experiments we ran using the best model and shows the confusion matrix for each model. Unsurprisingly, color images coupled with higher resolution yield better accuracy. As a baseline, we used the raw pixels of the images as features for the same classifier. (If we had more time, we would have used Gabor filters and/or other low-level vision features, which would constitute a more appropriate baseline.) The baseline achieves an accuracy of 30% and 47% for gray and color images respectively. Figure 4 shows four of the learned features from each hidden layer. Admittedly, the features are difficult to interpret. Also, the features do not appear to capture higher-level objects as we move up a layer, even though we restricted each layer to have sparse activities as in [6]. We are not very surprised by these results, as our model had more than a million parameters and was trained on a relatively minuscule dataset of only 1800 images. Figure 5 shows the generated image for a given testing image.
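As a rough illustration of the feature-extraction step described above, the sketch below propagates images through a stack of trained RBM weight matrices, concatenates the hidden activities of every layer, and feeds them to a linear SVM. The layer sizes, random weight matrices, and classifier choice are illustrative assumptions standing in for a trained model, not our actual experimental setup.

```python
import numpy as np
from sklearn.svm import LinearSVC

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dbn_features(images, weights, biases):
    """Concatenate the hidden activation probabilities of every DBN layer."""
    layer_input, features = images, []
    for W, b in zip(weights, biases):
        layer_input = sigmoid(layer_input @ W + b)   # activities of this hidden layer
        features.append(layer_input)
    return np.hstack(features)

# Illustrative stand-ins for a trained 4-layer model and labeled 50*50 gray images.
rng = np.random.default_rng(0)
sizes = [2500, 1000, 500, 250, 100]              # visible units followed by 4 hidden layers
weights = [0.01 * rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
train_imgs = rng.random((1800, 2500)); train_labels = rng.integers(0, 8, 1800)
test_imgs = rng.random((400, 2500));   test_labels = rng.integers(0, 8, 400)

clf = LinearSVC().fit(dbn_features(train_imgs, weights, biases), train_labels)
print("accuracy:", clf.score(dbn_features(test_imgs, weights, biases), test_labels))
```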

4.3 Convolutional Deep Belief Network Results

As we expected, the CDBN does significantly better than the DBN (Table 1). The CDBN allowed us to scale to 100*100 resolution images and to learn low-level features that look like edges and corners (Figure 6). Also, the higher-level filters appear to learn higher-level features (Figures 7 and 8); however, these higher-level features are not readily interpretable. This is somewhat unsurprising since our dataset contains images depicting very different scenes, it is small in size, and our CDBN model is simpler than the one used to learn to detect images containing a single centered object [7]. Unlike the DBN experiments, we made sure to train over CDBN features inferred from images not included in the training set used for learning the CDBN model. Our CDBN model achieves an accuracy of 86.5% over the 400 images in the test set. (We made sure to include an equal number of images from each class in the training and testing phases.) The features extracted from the CDBN actually manage to be more indicative of scene class than Gist features: a classifier trained with Gist features achieves an accuracy of 78.5%. Figures 9 and 10 indicate that our CDBN learns to generate reconstructions of the original images that are good enough to discern some of the scenes.

5 Discussion

Although the learned features from the CDBN manage to be more predictive of scene class than Gist, we cannot decisively conclude that the learned features are always better than Gist for scene classification. On one hand, our experiments are not exhaustive enough. On the other hand, we used many more features for each image to achieve the aforementioned performance, while Gist only used 960 features. However, our initial experiments show that the learned features are likely to compete with or outperform Gist, since the model we learned is relatively simple and used a small dataset. Furthermore, we did not use color information, and unlike Gist we do not pad the border of the image with zeros to avoid artifacts, we do not prefilter the image, and we do not whiten the image. Also, Gist uses Gabor filters as a preprocessing step. The biggest obstacle we encountered while working on this project for the past three weeks was the lack of sufficient computational resources (fast multi-core machines). This prevented us from systematically exploring parameters, learning more complex models, and using more data.

Figure 3: Confusion matrices and experiment details for Models 1 and 2 (Model 1: 50*50 gray images, linear kernel, sparse; Model 2: 75*75 color images, linear kernel, sparse). Scene classes: tall building, inside city, street, highway, coast, open country, mountain, forest.

Figure 4: Left to right: learned features from hidden units in layers 1, 2, 3, and 4 of experiment 1's DBN model.

Figure 5: Left: one of the original images from the scene dataset. Right: the corresponding generated image from experiment 1's DBN model.

Table 1: Confusion matrices and experiment details for the CDBN and Gist models. Scene classes: tall building, inside city, street, highway, coast, open country, mountain, forest.

  Model   Image size   Image type   Kernel   Sparse?   Accuracy
  CDBN    100*100      gray         Linear   Yes       86.5
  Gist    128*128      color        Linear   NA        78.5

Figure 6: 20 filters learned by the first layer of our CDBN model.

Figure 7: 10 of the 40 filters learned by the second layer of our CDBN model.

Figure 8: 10 of the 60 filters learned by the third layer of our CDBN model.

Figure 9: Original images from the 8 classes.

Figure 10: Corresponding generated images from our CDBN model.

6 Future Work

We would like to see how effectively CDBNs can be used to model complex scenes, including indoor scenes. In order to continue this investigation, we would need to port our code so that it can be run on a GPU [8]. Right now, inferring this many features would be infeasible for real-time applications, so it would also be important to investigate compressing this representation. Finally, after getting a better intuition into the limitations of CDBNs for modeling complex indoor and outdoor scenes, we would be very interested in investigating architectural modifications to the CDBN. One interesting extension would be to enforce scale invariance, since complex scenes will have similar parts at different scales.

References

[1] A. Torralba and P. Sinha. Statistical context priming for object detection. In Proceedings of the International Conference on Computer Vision.

[2] A. Torralba, K. Murphy, and W. T. Freeman. Using the forest to see the trees: object recognition in context. Communications of the ACM, Research Highlights.

[3] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision, 42:145-175.

[4] Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, Vol. 2.

[5] G. E. Hinton, S. Osindero, and Y. Teh. A fast learning algorithm for deep belief nets. Neural Computation.

[6] Honglak Lee, Chaitu Ekanadham, and Andrew Y. Ng. Sparse deep belief net model for visual area V2. In Advances in Neural Information Processing Systems (NIPS).

[7] Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the Twenty-Sixth International Conference on Machine Learning (ICML).

[8] Rajat Raina, Anand Madhavan, and Andrew Y. Ng. Large-scale deep unsupervised learning using graphics processors. In ICML.
