Deep Learning: The New Wave of Machine Learning. Kai Yu, Deputy Managing Director, Baidu IDL


1 Deep Learning: The New Wave of Machine Learning. Kai Yu, Deputy Managing Director, Baidu IDL

2 Machine Learning: the design and development of algorithms that allow computers to evolve behaviors based on empirical data.

3 All Machine Learning Models in One Page: Shallow Models vs. Deep Models

4 Shallow Models, Since the Late 80s: Neural Networks, Boosting, Support Vector Machines, Maximum Entropy. Given good features, how to do classification?

5 Since 2000, Learning with Structures: Kernel Learning, Transfer Learning, Semi-supervised Learning, Manifold Learning, Sparse Learning, Matrix Factorization, ... Still: given good features, how to do classification?

6 Mining for Structure, Mission Yet Accomplished. Massive increase in both computational power and the amount of data available from the web, video cameras, laboratory measurements. Images & Video, Text & Language, Speech & Audio, Gene Expression, Product Recommendation, Relational Data / Social Networks, Climate Change, Geological Data: mostly unlabeled. Develop statistical models that can discover underlying structure, cause, or statistical correlation from data in an unsupervised or semi-supervised way. Multiple application domains. Slide Courtesy: Russ Salakhutdinov

7 The pipeline of machine visual perception: low-level sensing → preprocessing → feature extraction → feature selection → inference (prediction, recognition). Most efforts in machine learning go into the inference stage, yet the feature stages are most critical for accuracy, account for most of the computation at test time, are the most time-consuming part of the development cycle, and are often hand-crafted in practice.

8 Computer vision features: SIFT, Spin image, HoG, RIFT, GLOH. Slide Courtesy: Andrew Ng

9 Deep Learning: learning features from data. The pipeline becomes: low-level sensing → preprocessing → feature learning (replacing hand-crafted feature extraction and selection) → inference (prediction, recognition).

10 Convolutional Neural Networks: alternating coding and pooling stages. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1989.
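To make the coding/pooling alternation concrete, here is a minimal NumPy sketch of one such stage; the random filter, tanh nonlinearity, and sizes are illustrative assumptions, not those of LeCun et al.'s zip-code network.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (implemented as cross-correlation) of one channel."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size blocks."""
    H, W = fmap.shape
    H, W = H - H % size, W - W % size
    blocks = fmap[:H, :W].reshape(H // size, size, W // size, size)
    return blocks.max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((28, 28))                  # e.g. a handwritten digit
kernel = rng.standard_normal((5, 5))          # one filter (random here, not learned)
coded = np.tanh(conv2d_valid(image, kernel))  # coding: filtering + nonlinearity
pooled = max_pool(coded)                      # pooling: spatial downsampling
print(coded.shape, pooled.shape)              # (24, 24) (12, 12)
```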

11 Winter of Neural Networks, Since the 90s: training is non-convex, needs a lot of tricks to get working, and is hard to analyze theoretically.

12 The Paradigm of Deep Learning

13 Deep Learning Since 2006: neural networks enjoy a second spring!

14 Race on ImageNet (Top-5 Hit Rate): 72%, ...

15 The best system on ImageNet (at the time): local gradients (e.g., SIFT, HOG) → pooling → variants of sparse coding (LLC, super-vector) → pooling → linear classifier. This is a moderately deep model, but the first two layers are hand-designed.

16 Challenge to Deep Learners. Key questions: What if we use no hand-crafted features at all? What if we use much deeper neural networks? -- Geoff Hinton

17 Answer from Geoff Hinton: a large jump in the ImageNet top-5 hit rate (2012).

18 The Architecture. Slide Courtesy: Geoff Hinton

19 The Architecture: 7 hidden layers, not counting max pooling. Early layers are convolutional; the last two layers are globally connected. Uses rectified linear units in every layer, and competitive normalization to suppress hidden activities. Slide Courtesy: Geoff Hinton

20 Revolution in Speech Recognition: word error rates from MSR, IBM, and the Google speech group. Slide Courtesy: Geoff Hinton

21 Deep Learning for NLP: "Natural Language Processing (Almost) from Scratch", Collobert et al., JMLR 2011.

22 Biological & Theoretical Justification

23 Why Hierarchy? Theoretical: the well-known depth-breadth tradeoff in circuit design [Hastad 1987] suggests that many functions can be represented much more efficiently with deeper architectures [Bengio & LeCun 2007]. Biological: the visual cortex is hierarchical (Hubel-Wiesel model, 1981 Nobel Prize) [Thorpe].

24 Deep Architecture in the Brain: retina (pixels) → Area V1 (edge detectors) → Area V2 (primitive shape detectors) → Area V4 (higher-level visual abstractions). A sparse DBN trained on face images learns an analogous hierarchy: edges → object parts (combinations of edges) → object models. Note: sparsity is important for these results. [Lee, Grosse, Ranganath & Ng, 2009]

25 Sensor representation in the brain: when visual input is rewired to the auditory cortex, the auditory cortex learns to see (the same rewiring process also works for the touch/somatosensory cortex); likewise, "seeing with your tongue". [Roe et al., 1992; BrainPort; Welsh & Blasch, 1997]

26 Deep Learning in Industry

27 Microsoft: first successful deep learning models for speech recognition, by MSR.

28 Google Brain Project: led by Google Fellow Jeff Dean. Published two papers (ICML 2012, NIPS 2012). Company-wide large-scale deep learning infrastructure. Significant success on images, speech, ...

29 Deep Baidu: started working on deep learning in summer 2012. Big success in speech recognition and image recognition; 6 products have been launched so far. Recently, very encouraging results on CTR. Meanwhile, efforts are under way in areas like LTR and NLP.

30 Deep Learning Ingredients. Building blocks: RBMs, autoencoder neural nets, sparse coding, ... Go deeper via layerwise feature learning: layer-by-layer unsupervised training or layer-by-layer supervised training, followed by fine-tuning via backpropagation (if data are big enough, direct fine-tuning is enough); see the sketch below. Sparsity on hidden layers is often helpful.
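A minimal NumPy sketch of greedy layer-by-layer unsupervised training with tied-weight sigmoid autoencoders; the toy data, layer sizes, and learning rate are illustrative assumptions. The pretrained weights would then initialize a deep net for supervised fine-tuning.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder_layer(X, n_hidden, lr=0.1, epochs=50, seed=0):
    """Train one tied-weight autoencoder layer; return weights, bias, and codes."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((d, n_hidden)) * 0.01
    b, c = np.zeros(n_hidden), np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)          # encode
        Xr = sigmoid(H @ W.T + c)       # decode with tied weights
        # backprop of the squared reconstruction error
        dXr = (Xr - X) * Xr * (1 - Xr)
        dH = (dXr @ W) * H * (1 - H)
        W -= lr * (X.T @ dH + dXr.T @ H) / n   # encoder + decoder contributions
        b -= lr * dH.mean(axis=0)
        c -= lr * dXr.mean(axis=0)
    return W, b, sigmoid(X @ W + b)

X = np.random.default_rng(1).random((256, 64))  # toy unlabeled data
W1, b1, H1 = train_autoencoder_layer(X, 32)     # layer 1 trained on raw input
W2, b2, H2 = train_autoencoder_layer(H1, 16)    # layer 2 trained on layer-1 codes
# (W1, b1, W2, b2) would now initialize a deep net for supervised fine-tuning.
```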

31 Building Blocks of Deep Learning

32 CVPR 2012 Tutorial: Deep Learning for Vision
09.00am: Introduction, Rob Fergus (NYU)
10.00am: Coffee Break
10.30am: Sparse Coding, Kai Yu (Baidu)
11.30am: Neural Networks, Marc'Aurelio Ranzato (Google)
12.30pm: Lunch
01.30pm: Restricted Boltzmann Machines, Honglak Lee (Michigan)
02.30pm: Deep Boltzmann Machines, Ruslan Salakhutdinov (Toronto)
03.00pm: Coffee Break
03.30pm: Transfer Learning, Ruslan Salakhutdinov (Toronto)
04.00pm: Motion & Video, Graham Taylor (Guelph)
05.00pm: Summary / Q & A, All
05.30pm: End

33 Building Block 1 - RBM

34 Restricted Boltzmann Machine. Slide Courtesy: Russ Salakhutdinov

35 Restricted Boltzmann Machine. Restricted: connections run only between the visible and hidden layers, with no connections within a layer. Slide Courtesy: Russ Salakhutdinov

36 Model Parameter Learning: given i.i.d. data, use approximate maximum-likelihood learning. Slide Courtesy: Russ Salakhutdinov

37 Contrastive Divergence. Slide Courtesy: Russ Salakhutdinov
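A minimal sketch of a CD-1 update for a binary-binary RBM in NumPy; the batch size, layer sizes, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(V, W, b, c, lr=0.05):
    """One contrastive-divergence (CD-1) step on a batch of visible vectors V."""
    # positive phase: hidden probabilities given the data
    ph0 = sigmoid(V @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden units
    # negative phase: one step of Gibbs sampling
    pv1 = sigmoid(h0 @ W.T + b)                       # reconstruct visibles
    ph1 = sigmoid(pv1 @ W + c)
    # update: <v h>_data - <v h>_reconstruction (the CD approximation)
    n = V.shape[0]
    W += lr * (V.T @ ph0 - pv1.T @ ph1) / n
    b += lr * (V - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)

V = (rng.random((64, 100)) < 0.3).astype(float)  # toy binary data
W = 0.01 * rng.standard_normal((100, 25))
b, c = np.zeros(100), np.zeros(25)
for _ in range(100):
    cd1_update(V, W, b, c)
```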

38 RBMs for Images. Slide Courtesy: Russ Salakhutdinov

39 Layerwise Pre-training. Slide Courtesy: Russ Salakhutdinov

40 DBNs for MNIST Classification. Slide Courtesy: Russ Salakhutdinov

41 Deep Autoencoders for Unsupervised Feature Learning. Slide Courtesy: Russ Salakhutdinov

42 Image Retrieval Using Binary Codes: map images into binary codes for fast retrieval in a large image database. Small Codes: Torralba, Fergus, Weiss, CVPR 2008; Spectral Hashing: Y. Weiss, A. Torralba, R. Fergus, NIPS 2008; Kulis and Darrell, NIPS 2009; Gong and Lazebnik, CVPR 2011; Norouzi and Fleet, ICML 2011. Slide Courtesy: Russ Salakhutdinov

43 Building Block 2 - Autoencoder Neural Net

44 Autoencoder Neural Network: input → encoder → code → decoder → prediction. The input is higher-dimensional than the code; the error is prediction minus input; training is by back-propagation. Slide Courtesy: Marc'Aurelio Ranzato

45 Sparse Autoencoder: input → encoder → code (with a sparsity penalty) → decoder → prediction. The sparsity penalty is applied to the code; the error is prediction minus input; the loss is the sum of the squared reconstruction error and the sparsity penalty; training is by back-propagation. Slide Courtesy: Marc'Aurelio Ranzato

46 Sparse Autoencoder, linear version (Le et al., "ICA with reconstruction cost", NIPS 2011): code h = Wᵀx, loss L(x; W) = ‖Wh − x‖² + λ Σ_j |h_j|. Slide Courtesy: Marc'Aurelio Ranzato
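A small numeric check of this linear sparse-autoencoder loss in NumPy; the data, dimensions, and λ below are made up for illustration.

```python
import numpy as np

def sparse_ae_loss(x, W, lam=0.1):
    """L(x; W) = ||W h - x||^2 + lam * sum_j |h_j|, with linear code h = W^T x."""
    h = W.T @ x                # encoder: linear code
    recon = W @ h              # decoder: linear reconstruction
    return np.sum((recon - x) ** 2) + lam * np.sum(np.abs(h))

rng = np.random.default_rng(0)
x = rng.random(16)                       # one toy input patch
W = rng.standard_normal((16, 32)) * 0.1  # overcomplete basis (code dim > input dim)
print(sparse_ae_loss(x, W))
```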

47 Building Block 3 - Sparse Coding

48 Sparse Coding (Olshausen & Field, 1996). Originally developed to explain early visual processing in the brain (edge detection). Training: given a set of random patches x, learn a dictionary of bases [φ_1, φ_2, ...]. Coding: for a data vector x, solve a LASSO problem to find the sparse coefficient vector a.

49 Sparse coding, training time. Input: images x_1, x_2, ..., x_m (each in R^d). Learn: a dictionary of bases f_1, f_2, ..., f_k (also in R^d). Alternating optimization: 1. Fix the dictionary f_1, ..., f_k and optimize the activations a (a standard LASSO problem). 2. Fix the activations a and optimize the dictionary f_1, ..., f_k (a convex QP problem). A sketch of this loop follows.
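A rough Python sketch of this alternating optimization, using scikit-learn's Lasso for the coding step; the dictionary step below uses plain least squares with renormalization as a stand-in for the slide's constrained convex QP, and all sizes and penalties are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.random((200, 64))           # 200 patches, each in R^64
k = 32                              # number of bases
D = rng.standard_normal((k, 64))    # dictionary: one basis per row
D /= np.linalg.norm(D, axis=1, keepdims=True)

for it in range(10):
    # Step 1: fix the dictionary, solve a LASSO problem for the sparse codes A
    coder = Lasso(alpha=0.01, fit_intercept=False, max_iter=5000)
    coder.fit(D.T, X.T)             # solves x_i ~ D^T a_i for every patch at once
    A = coder.coef_                 # codes, shape (200, k)
    # Step 2: fix the codes, update the dictionary by least squares + renormalize
    D = np.linalg.lstsq(A, X, rcond=None)[0]
    D /= np.linalg.norm(D, axis=1, keepdims=True) + 1e-12

print("reconstruction error:", np.linalg.norm(X - A @ D))
```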

50 Sparse coding, testing time. Input: a novel image patch x (in R^d) and the previously learned f_i's. Output: the representation [a_{i,1}, a_{i,2}, ..., a_{i,k}] of image patch x_i, e.g. a_i = [0, 0, ..., 0, 0.8, 0, ..., 0, 0.3, 0, ..., 0, 0.5, ...], so x_i is a weighted sum of just three bases.

51 Sparse coding illustration. Trained on natural images, the learned bases (f_1, ..., f_64) are edges. A test example x decomposes as a weighted sum of three bases (weights 0.8, 0.3, 0.5, one of them f_63), giving the feature representation [a_1, ..., a_64] = [0, 0, ..., 0, 0.8, 0, ..., 0, 0.3, 0, ..., 0, 0.5, 0]: compact and easily interpretable. Slide credit: Andrew Ng

52 RBMs & autoencoders also involve activation and reconstruction, but have an explicit encoder a = f(x) (with decoding x = g(a)). They do not necessarily enforce sparsity on a, but putting sparsity on a often gives improved results [e.g. sparse RBM, Lee et al., NIPS 08].

53 Sparse coding, a broader view: any feature mapping from x to a, i.e. a = f(x), where a is sparse (and often higher-dimensional than x), f(x) is nonlinear, and there is a reconstruction x' = g(a) such that x' ≈ x. Therefore sparse RBMs, sparse autoencoders, and even VQ can be viewed as forms of sparse coding.

54 Example of sparse activations (sparse coding): different x's have different dimensions activated. This is a locally-shared sparse representation: similar x's tend to have similar non-zero dimensions.

55 Example of sparse activations (sparse coding), another example: preserving manifold structure. Such activations are more informative, highlighting richer data structures such as clusters and manifolds.

56 Sparsity vs. Locality. Intuition: similar data should get similar activated features. Local sparse coding: data in the same neighborhood tend to have shared activated features; data in different neighborhoods tend to have different features activated.

57 Sparse coding is not always local: an example. Case 1, independent subspaces: each basis is a direction; sparsity means each datum is a linear combination of only several bases. Case 2, data manifold (or clusters): each basis is an anchor point; sparsity means each datum is a linear combination of neighboring anchors, so sparsity is caused by locality.

58 Two approaches to local sparse coding. Approach 1, coding via local anchor points (local coordinate coding): "Locality-constrained Linear Coding for Image Classification", Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, CVPR 2010; "Nonlinear Learning Using Local Coordinate Coding", Kai Yu, Tong Zhang, Yihong Gong, NIPS 2009. Approach 2, coding via local subspaces (super-vector coding): "Image Classification Using Super-Vector Coding of Local Image Descriptors", Xi Zhou, Kai Yu, Tong Zhang, Thomas Huang, ECCV 2010; "Large-scale Image Classification: Fast Feature Extraction and SVM Training", Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, LiangLiang Cao, Thomas Huang, CVPR 2011.

59 Two approaches to local sparse coding: Approach 1, coding via local anchor points (local coordinate coding); Approach 2, coding via local subspaces (super-vector coding). In both, sparsity is achieved by explicitly ensuring locality, with sound theoretical justifications; they are much simpler to implement and compute, and have strong empirical success. A rough coding sketch follows.
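A rough sketch of locality-constrained coding in the spirit of LLC's fast approximation: each point is coded only on its k nearest anchors via a local constrained least-squares fit. The anchor dictionary, k, and regularizer below are illustrative, and details of the full LLC objective are omitted.

```python
import numpy as np

def llc_code(x, anchors, k=5, reg=1e-4):
    """Code x on its k nearest anchor points; returns a sparse, local code."""
    d2 = np.sum((anchors - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]            # indices of the k nearest anchors
    B = anchors[idx] - x                # shift neighborhood to the data point
    G = B @ B.T + reg * np.eye(k)       # local Gram matrix, regularized
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                        # enforce the sum-to-one constraint
    code = np.zeros(len(anchors))
    code[idx] = w                       # only locally supported entries are nonzero
    return code

rng = np.random.default_rng(0)
anchors = rng.random((64, 8))           # dictionary of anchor points
x = rng.random(8)
print(np.nonzero(llc_code(x, anchors))[0])  # only 5 active dimensions
```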

60 Hierarchical Sparse Coding (Yu, Lin & Lafferty, CVPR 11). With data x_1, ..., x_n ∈ R^d, dictionary B ∈ R^{d×p}, first-layer codes W = (w_1, ..., w_n) ∈ R^{p×n}, and second-layer weights α ∈ R^q with α ≥ 0, the objective is roughly

min_{W, α ≥ 0}  L(W, α) + λ_1 ‖W‖_1 + γ ‖α‖_1,  with
L(W, α) = (1/2n) Σ_{i=1}^n ‖x_i − B w_i‖² + (λ_2/n) Σ_{i=1}^n w_iᵀ Ω(α) w_i,

where Ω(α) is derived from a second-layer covariance model Σ(α) = Σ_{k=1}^q α_k φ_k over the code statistics w_i w_iᵀ.

61 Hierarchical Sparse Coding on MNIST (Yu, Lin & Lafferty, CVPR 11). HSC vs. CNN: HSC provides even better performance than CNN; more amazingly, HSC learns its features in an unsupervised manner!

62 Second-layer dictionary (Yu, Lin & Lafferty, CVPR 11). A hidden unit in the second layer is connected to a unit group in the first layer, giving invariance to translation, rotation, and deformation.

63 Adaptive Deconvolutional Networks for Mid and High Level Feature Learning (Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 2011): hierarchical convolutional sparse coding, trained with respect to the image from all layers (L1-L4), pooling both spatially and among features, and learning invariant mid-level features. Architecture: image → L1 feature maps → L2 → L3 → L4, with feature (group) selection between layers.

64 Recap: Deep Learning Ingredients. Building blocks: RBMs, autoencoder neural nets, sparse coding, ... Go deeper via layerwise feature learning: layer-by-layer unsupervised training or layer-by-layer supervised training, followed by fine-tuning via backpropagation (if data are big enough, direct fine-tuning is enough). Sparsity on hidden layers is often helpful.

65 Layer-by-layer unsupervised training + fine-tuning: loss = supervised_error + unsupervised_error, combining the prediction-vs-label error with an unsupervised term on the input. Weston et al., "Deep learning via semi-supervised embedding", ICML 2008. Slide Courtesy: Marc'Aurelio Ranzato
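A toy rendering of this combined objective; the squared-error terms and the reconstruction stand-in for Weston et al.'s embedding loss are illustrative assumptions.

```python
import numpy as np

def total_loss(pred, label, x, x_recon, weight=0.5):
    """loss = supervised_error + weight * unsupervised_error."""
    supervised = np.mean((pred - label) ** 2)   # e.g. squared error vs. labels
    unsupervised = np.mean((x_recon - x) ** 2)  # e.g. reconstruction of the input
    return supervised + weight * unsupervised

rng = np.random.default_rng(0)
x = rng.random(8)
print(total_loss(pred=0.7, label=1.0, x=x, x_recon=x + 0.1))
```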

66 Layer-by-Layer Supervised Training

67 Geoff Hinton, ICASSP 2013

68 Large-scale training: Stochastic vs. batch

69 The Challenge. A large-scale problem has lots of training samples (>10M), lots of classes (>10K), and lots of input dimensions (>10K). The best optimizer in practice is online SGD, which is naturally sequential and hard to parallelize. Layers cannot be trained independently and in parallel, so training is hard to distribute. The model can have lots of parameters that may clog the network, so it is hard to distribute across machines. Slide Courtesy: Marc'Aurelio Ranzato

70 A solution: model parallelism plus data parallelism. The model is partitioned across machines (1st, 2nd, 3rd machine: model parallelism), and separate inputs (#1, #2, #3) are fed to separate model replicas (data parallelism). Le et al., "Building high-level features using large-scale unsupervised learning", ICML 2012. Slide Courtesy: Marc'Aurelio Ranzato

71 Model parallelism + data parallelism (continued). Le et al., "Building high-level features using large-scale unsupervised learning", ICML 2012. Slide Courtesy: Marc'Aurelio Ranzato

72 Asynchronous SGD: a parameter server holds the model parameters while several replicas (1st, 2nd, 3rd) train in parallel; the first replica computes a gradient of its loss L_1. Slide Courtesy: Marc'Aurelio Ranzato

73 Asynchronous SGD: the replica pushes its gradient to the parameter server, which applies the update and returns fresh parameters. Slide Courtesy: Marc'Aurelio Ranzato

74 Meanwhile another replica computes the gradient of its own loss L_2, asynchronously. Slide Courtesy: Marc'Aurelio Ranzato

75 ...and exchanges it with the parameter server in the same way; replicas never wait for one another. A toy sketch follows. Slide Courtesy: Marc'Aurelio Ranzato
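A toy threaded sketch of asynchronous SGD with a shared parameter server, on a linear-regression problem; the model, sharding, and step counts are illustrative, and real systems shard the server itself across machines.

```python
import threading
import numpy as np

class ParameterServer:
    """Holds the shared parameters; replicas pull and push asynchronously."""
    def __init__(self, dim):
        self.w = np.zeros(dim)
        self.lock = threading.Lock()
    def pull(self):
        with self.lock:
            return self.w.copy()
    def push(self, grad, lr=0.01):
        with self.lock:
            self.w -= lr * grad

def replica(server, X, y, steps=2000):
    for _ in range(steps):
        w = server.pull()                   # fetch (possibly stale) parameters
        i = np.random.randint(len(X))
        grad = (X[i] @ w - y[i]) * X[i]     # SGD gradient for linear regression
        server.push(grad)                   # send the update without waiting

rng = np.random.default_rng(0)
X, true_w = rng.random((1000, 10)), rng.standard_normal(10)
y = X @ true_w
server = ParameterServer(10)
threads = [threading.Thread(target=replica, args=(server, X[i::3], y[i::3]))
           for i in range(3)]               # three replicas, three data shards
for t in threads: t.start()
for t in threads: t.join()
print("parameter error:", np.linalg.norm(server.w - true_w))
```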

76 Unsupervised Learning: Training a Model with 1B Parameters. Deep net with 3 stages, each consisting of local filtering, L2 pooling, and LCN: 18x18 filters, 8 filters at each location, L2 pooling and LCN over 5x5 neighborhoods. The three layers are trained jointly by reconstructing the input of each layer, with sparsity on the code. Slide Courtesy: Marc'Aurelio Ranzato
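For reference, a 1-D NumPy sketch of the two per-stage operations named here, L2 pooling and local contrast normalization (LCN); the window sizes are illustrative, not the slide's 5x5 neighborhoods.

```python
import numpy as np

def l2_pool(a, size=5):
    """L2 pooling: square root of the sum of squares over a local window."""
    return np.sqrt(np.convolve(a ** 2, np.ones(size), mode="same"))

def lcn(a, size=5, eps=1e-2):
    """Local contrast normalization: subtract local mean, divide by local std."""
    kernel = np.ones(size) / size
    mu = np.convolve(a, kernel, mode="same")
    centered = a - mu
    sigma = np.sqrt(np.convolve(centered ** 2, kernel, mode="same"))
    return centered / np.maximum(sigma, eps)

a = np.random.default_rng(0).standard_normal(100)  # one filter's responses
print(lcn(l2_pool(a)).shape)
```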

77 Concluding Remarks. We see the promise of deep learning: big successes on speech, vision, and many other areas. Large-scale training is the key challenge. More success stories are coming.
