Learning-based Methods in Vision


1 Learning-based Methods in Vision Sparsity and Deep Learning

2 Motivation Multitude of hand-designed features currently in use in vision - SIFT, HoG, LBP, MSER, etc. Even the best approaches just capture low-level edge gradients [ Felzenszwalb, Girshick, McAllester and Ramanan, PAMI 2007 ] [ Yan & Huang ] (Winner of PASCAL 2010 classification competition) Slide adopted from Rob Fergus

3 Motivation Multitude of hand-designed features currently in use in vision - SIFT, HoG, LBP, MSER, etc. Even the best approaches just capture low-level edge gradients [ Felzenszwalb, Girshick, McAllester and Ramanan, PAMI 2007 ] [ Yan & Huang ] (Winner of PASCAL 2010 classification competition) Can we learn the features? Slide adopted from Rob Fergus

4 Visual cortex bottom-up/top-down V1: primary visual cortex simple cells complex cells [ Scientific American, 1999 ] Slide adopted from Ying Nian Wu

5 Simple V1 cells [ Daugman, 1985 ] Gabor wavelets: localized sine and cosine waves. [diagram: image pixels → local sum → V1 simple cells, which respond to edges] Slide adopted from Ying Nian Wu

6 Complex V1 cells [ Riesenhuber and Poggio, 1999 ] [diagram: image pixels → local sum → V1 simple cells (respond to edges) → local max → V1 complex cells] Slide adopted from Ying Nian Wu

7 Single Layer Architecture Input: Image Pixels / Features Slide from Rob Fergus

8 Single Layer Architecture Input: Image Pixels / Features Filter Slide from Rob Fergus

9 Single Layer Architecture Input: Image Pixels / Features Filter Normalize Slide from Rob Fergus

10 Single Layer Architecture Input: Image Pixels / Features Filter Normalize Pool Slide from Rob Fergus

11 Single Layer Architecture Input: Image Pixels / Features Filter Normalize Pool Output: Features / Classifier Slide from Rob Fergus

12 Single Layer Architecture Input: Image Pixels / Features Filter Pool Normalize Output: Features / Classifier Slide from Rob Fergus

13 SIFT Descriptor Image Pixels Apply Gabor filters Slide from Rob Fergus

14 SIFT Descriptor Image Pixels Apply Gabor filters Spatial pool (Sum) Slide from Rob Fergus

15 SIFT Descriptor Image Pixels Apply Gabor filters Spatial pool (Sum) Normalize to unit length Feature Vector Slide from Rob Fergus
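
The three-stage recipe above (filter, spatial pool, normalize) can be sketched as a toy SIFT-like descriptor. This is an illustrative sketch, not the actual SIFT implementation: the grid size and orientation binning are arbitrary choices, and Gaussian weighting and trilinear interpolation are omitted.

```python
import numpy as np

def sift_like_descriptor(patch, n_orient=8, grid=4):
    """Filter -> spatial pool (sum) -> normalize: the single-layer recipe behind SIFT."""
    # Filter: gradient magnitude and orientation at each pixel.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    bins = np.minimum((ori / (2 * np.pi) * n_orient).astype(int), n_orient - 1)

    # Spatial pool (sum): accumulate orientation histograms over a grid x grid layout.
    h, w = patch.shape
    desc = np.zeros((grid, grid, n_orient))
    for i in range(h):
        for j in range(w):
            desc[i * grid // h, j * grid // w, bins[i, j]] += mag[i, j]

    # Normalize to unit length.
    desc = desc.ravel()
    return desc / (np.linalg.norm(desc) + 1e-8)

d = sift_like_descriptor(np.random.rand(16, 16))
print(d.shape)  # (128,): 4 x 4 cells x 8 orientations, unit length
```

The 4×4×8 layout gives the familiar 128-dimensional feature vector.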

16 Feature Learning Architecture Pixels / Features Filter with Dictionary (patch/tiled/ convolutional) + Non-linearity Slide from Rob Fergus

17 Feature Learning Architecture Pixels / Features Filter with Dictionary (patch/tiled/ convolutional) + Non-linearity Normalization between feature responses (Group) Sparsity Max / Softmax Local Contrast Normalization (Subtractive / Divisive) Slide from Rob Fergus

18 Feature Learning Architecture Pixels / Features Filter with Dictionary (patch/tiled/ convolutional) + Non-linearity Normalization between feature responses (Group) Sparsity Max / Softmax Local Contrast Normalization (Subtractive / Divisive) Spatial/Feature (Sum or Max) Features Slide from Rob Fergus

19 Spatial Pyramid Matching SIFT Features Filter with Visual Words [ Lazebnik, Schmid, Ponce, CVPR 2006 ] Max Multi-scale spatial pool (Sum) Classifier Slide from Rob Fergus

20 Role of Normalization Lots of different mechanisms (e.g., max, sparsity, local contrast normalization). All induce local competition between features to explain the input (the "explaining away" property). [diagram: convolutional sparse coding with learned filters] Zeiler et al. [CVPR 10/ICCV 11], Kavukcuoglu et al. [NIPS 10], Yang et al. [CVPR 10] Slide from Rob Fergus

21 Role of Pooling Spatial Pooling - Invariance to small transformations (e.g., shifts) - Larger receptive fields Pooling Across Features - Gives and/or behavior (grammar) - Compositionality [ Zeiler, Taylor, Fergus, ICCV 2011 ] Pooling with latent variables/springs Slide from Rob Fergus

22 Role of Pooling Spatial Pooling - Invariance to small transformations (e.g., shifts) - Larger receptive fields Pooling Across Features - Gives and/or behavior (grammar) - Compositionality [ Zeiler, Taylor, Fergus, ICCV 2011 ] Pooling with latent variables/springs [ Felzenszwalb, Girshick, McAllester, Ramanan, PAMI 2009 ] [ Chen, Zhu, Lin, Yuille, Zhang, NIPS 2007 ] Slide from Rob Fergus

23 Image Restoration [ Mairal, Bach, Ponce, Sapiro, Zisserman, ICCV 2009 ] Image Pixels Feature Vector

24 Image Restoration [ Mairal, Bach, Ponce, Sapiro, Zisserman, ICCV 2009 ] Image Pixels Filter with Dictionary (patch) Feature Vector

25 Image Restoration [ Mairal, Bach, Ponce, Sapiro, Zisserman, ICCV 2009 ] Image Pixels Filter with Dictionary (patch) Sparsity Feature Vector

26 Image Restoration [ Mairal, Bach, Ponce, Sapiro, Zisserman, ICCV 2009 ] Image Pixels Filter with Dictionary (patch) Sparsity Spatial pool (Sum) Feature Vector

27 Sparse Representation for Image Restoration $\underbrace{y}_{\text{observed image}} = \underbrace{x_{\text{orig}}}_{\text{true image}} + \underbrace{w}_{\text{noise}}$ Slide adopted from Julien Mairal

28 Sparse Representation for Image Restoration $y = x_{\text{orig}} + w$ Can be cast as an energy minimization problem: $E(x) = \underbrace{\tfrac{1}{2}\|y - x\|_2^2}_{\text{reconstruction of observed image}} + \underbrace{E_{\text{prior}}(x)}_{\text{image prior } (-\log \text{prior})}$ Slide adopted from Julien Mairal

29 Sparse Representation for Image Restoration $y = x_{\text{orig}} + w$ Can be cast as an energy minimization problem: $E(x) = \tfrac{1}{2}\|y - x\|_2^2 + E_{\text{prior}}(x)$ or probabilistically: $p(y, x) = \underbrace{p(y \mid x)}_{\text{likelihood}} \underbrace{p(x)}_{\text{prior}}$ Slide adopted from Julien Mairal

30 Sparse Representation for Image Restoration $y = x_{\text{orig}} + w$ Can be cast as an energy minimization problem: $E(x) = \tfrac{1}{2}\|y - x\|_2^2 + E_{\text{prior}}(x)$ or probabilistically: $p(y, x) = p(y \mid x)\, p(x)$ Classical priors: Smoothness $\|Lx\|_2^2$, Total variation $\|\nabla x\|_1$ Slide adopted from Julien Mairal

31 Sparse Linear Model Let $x \in \mathbb{R}^m$ be an image (or a signal), and let $D = [d_1, \ldots, d_p] \in \mathbb{R}^{m \times p}$ be a set of normalized basis vectors (the dictionary). We can represent $x$ with few basis vectors, i.e., there exists a sparse vector $\alpha \in \mathbb{R}^p$ (the sparse code) such that $x \approx D\alpha$. Slide adopted from Julien Mairal
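
A minimal numerical sketch of the model $x \approx D\alpha$; the dimensions $m$, $p$ and the sparsity level $k$ below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, k = 64, 256, 5                      # signal dim, dictionary size, sparsity

# Dictionary of p normalized basis vectors (columns).
D = rng.standard_normal((m, p))
D /= np.linalg.norm(D, axis=0)

# A sparse code: only k of the p coefficients are non-zero.
alpha = np.zeros(p)
alpha[rng.choice(p, k, replace=False)] = rng.standard_normal(k)

x = D @ alpha                             # x is built from just a few atoms
print(np.count_nonzero(alpha), x.shape)   # 5 (64,)
```

Note that $p > m$: the dictionary is overcomplete, so sparsity is what makes the representation meaningful.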

32 Why sparsity? A dictionary can be good for representing a class of signals. We don't want to reconstruct noise! Any given patch looks like part of an image, and a sum of a few patches is likely to produce a reasonable image patch, while a sum of many patches can reconstruct anything, including noise.

33 Lateral Inhibition Visual neurons respond less when activated at the same time as their neighbors than when activated alone. So the fewer neighboring neurons are stimulated, the more strongly a neuron responds. Images from Ying Nian Wu

34 Sparse Representation for Image Restoration Hand-designed dictionaries - Wavelets, Curvelets, Wedgelets, Bandlets, ... - [Haar, 1910], [Zweig, Morlet, Grossman ~70s], [Meyer, Mallat, Daubechies, Coifman, Donoho, Candes ~80s-today]... (see [Mallat, 1999]) Learned dictionaries of patches - [Olshausen and Field, 1997], [Engan et al., 1999], [Lewicki and Sejnowski, 2000], [Aharon et al., 2006], [Roth and Black, 2005], [Lee et al., 2007] $\min_{\alpha, D} \sum_{i=1}^{N} \underbrace{\tfrac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}}$ The L1-norm induces sparsity. Slide from Julien Mairal
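
For a fixed dictionary, the L1-regularized objective above can be solved with iterative shrinkage-thresholding (ISTA). A minimal sketch; $\lambda$, the step size, the iteration count, and the problem sizes are illustrative choices:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the L1 norm: shrink toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(x, D, lam=0.1, n_iter=200):
    """ISTA for min_a 0.5 ||x - D a||_2^2 + lam ||a||_1 with D fixed."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the reconstruction term
        a = soft_threshold(a - grad / L, lam / L)
    return a

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(50); a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
a_hat = ista(D @ a_true, D)
print(np.count_nonzero(np.abs(a_hat) > 1e-3))  # only a handful of active atoms
```

The shrinkage step is exactly where the L1 penalty induces sparsity: small coefficients are zeroed out at every iteration.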

35 Optimization for Dictionary Learning $\min_{\alpha, D} \sum_{i=1}^{N} \tfrac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1$ Classical optimization does this in EM style, alternating between learning the dictionary and the sparse codes. Good results, but slow; [ Mairal et al., 2009a ] proposes online learning. The denoised image averages the reconstructed patches: $I_{\text{denoised}} = \tfrac{1}{M} \sum_{i=1}^{N} R_i D\alpha_i$ Slide adopted from Julien Mairal
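
The alternating scheme can be sketched as follows: a sparse coding step (ISTA with the dictionary fixed) alternating with a least-squares dictionary update (codes fixed). This is a toy batch version under illustrative sizes and parameters, not Mairal et al.'s online algorithm:

```python
import numpy as np

def learn_dictionary(X, p=32, lam=0.1, n_outer=10, n_inner=50):
    """Alternate between sparse coding (ISTA, D fixed) and a least-squares dictionary update."""
    rng = np.random.default_rng(0)
    m, N = X.shape
    D = rng.standard_normal((m, p))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((p, N))                       # one sparse code per column of X
    for _ in range(n_outer):
        # Sparse coding step: ISTA on all codes at once.
        L = np.linalg.norm(D, 2) ** 2
        for _ in range(n_inner):
            G = A - D.T @ (D @ A - X) / L
            A = np.sign(G) * np.maximum(np.abs(G) - lam / L, 0.0)
        # Dictionary step: least squares with the codes fixed, then renormalize atoms.
        D = X @ A.T @ np.linalg.pinv(A @ A.T + 1e-6 * np.eye(p))
        D /= np.linalg.norm(D, axis=0) + 1e-8
    return D, A

rng = np.random.default_rng(1)
X = rng.standard_normal((16, 100))             # 100 random 16-D "patches"
D, A = learn_dictionary(X)
err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
print(D.shape, err < 1.0)
```

The slowness the slide mentions is visible here: every outer iteration redoes a full pass of sparse coding over all patches, which is what the online variant avoids.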

36 Results Slide adopted from Julien Mairal

37 Image Classifications (Bag-of-Features) SIFT Features Filter with Visual Words Max Spatial pool (Sum) Classifier

38 Learning Codebooks for Image Classification The image is represented by a set of low-level (SIFT) descriptors $x_i$ at $N$ locations identified by their index $i$. Hard quantization (with $p$ visual words): $x_i \approx D\alpha_i$ with $\alpha_i \in \{0,1\}^p$ and $\sum_{j=1}^{p} \alpha_i[j] = 1$ Soft quantization: $\alpha_i[j] = \dfrac{\mathcal{N}(x_i; d_j, \sigma^2)}{\sum_{k=1}^{p} \mathcal{N}(x_i; d_k, \sigma^2)}$ Sparse coding: $\min_{\alpha, D} \sum_{i=1}^{N} \underbrace{\tfrac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}}$ Slide adopted from Julien Mairal
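
Hard vs. soft quantization can be contrasted in a few lines; the dictionary, descriptor, and $\sigma$ below are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((8, 5))
D /= np.linalg.norm(D, axis=0)                 # 5 visual words in 8 dimensions
x = rng.standard_normal(8)                     # one descriptor

# Squared distance from x to every visual word.
d2 = ((x[:, None] - D) ** 2).sum(axis=0)

# Hard quantization: one-hot assignment to the nearest word.
hard = np.zeros(5)
hard[np.argmin(d2)] = 1.0

# Soft quantization: Gaussian-weighted assignment to all words
# (the density normalization constants cancel in the ratio).
sigma = 1.0
w = np.exp(-d2 / (2 * sigma ** 2))
soft = w / w.sum()

print(hard, soft.round(2))                     # one-hot vs a distribution summing to 1
```

Both assignments peak at the same word; the soft code just spreads mass to nearby words, and sparse coding goes further by letting a few words jointly reconstruct the descriptor.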

39 Discriminative Learning of Dictionaries [ Mairal, Bach, Ponce, Sapiro, Zisserman, CVPR 2008 ] $\min_{\alpha, D} \sum_{i=1}^{N} \underbrace{\tfrac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}}$ (positive class) $\min_{\alpha, D} \sum_{i=1}^{N} \underbrace{\tfrac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}}$ (negative class)

40 Learning Codebooks for Image Classification [ Mairal, Bach, Ponce, Sapiro, Zisserman, CVPR 2008 ] Slide adopted from Julien Mairal

41 Visual cortex V1: primary visual cortex simple cells complex cells [ Scientific American, 1999 ] Slide adopted from Ying Nian Wu

42 Visual cortex V1: primary visual cortex simple cells complex cells What is beyond V1? [ Scientific American, 1999 ] Slide adopted from Ying Nian Wu

43 Visual cortex V1: primary visual cortex simple cells complex cells What is beyond V1? Hierarchical model [ Scientific American, 1999 ] Slide adopted from Ying Nian Wu

44 Mid-level features Beyond Edges Continuation Parallelism Junctions Corners High-level object parts Objects Scenes??? Slide adopted from Rob Fergus

45 Mid-level features Beyond Edges Continuation Parallelism Junctions Corners High-level object parts Objects Scenes??? Slide adopted from Rob Fergus

46 Challenges Grouping mechanism - Want edge structures to group into more complex forms - But it is hard to define explicit rules Invariance to local distortions - Under distortions, corners, T-junctions, parallel lines, etc. can look quite different Slide adopted from Rob Fergus

47 Deep Feature Learning Build hierarchy of feature extractors (layers) - All the way from pixels to classifiers - Homogeneous (simple) structure for all layers - Unsupervised training Image/Video Pixels Layer 1 Layer 2 Layer 3 Simple Classifier Slide from Rob Fergus

48 Deep Feature Learning Build hierarchy of feature extractors (layers) - All the way from pixels to classifiers - Homogeneous (simple) structure for all layers - Unsupervised training Image/Video Pixels Layer 1 Layer 2 Layer 3 Simple Classifier Numerous approaches: Restricted Boltzmann Machines [Hinton, Ng, Bengio, ...] Sparse coding [Yu, Fergus, LeCun] Auto-encoders [LeCun, Bengio] ICA variants [Ng, Cottrell] & many more. Slide from Rob Fergus

49 Hierarchical Vision Models [Jin & Geman, CVPR 2006] e.g. animals, trees, rocks e.g. contours, intermediate objects e.g. linelets, curvelets, T- junctions e.g. discontinuities, gradient animal head instantiated by bear head Slide adopted from Rob Fergus

50 Hierarchical Vision Models [Jin & Geman, CVPR 2006] e.g. animals, trees, rocks e.g. contours, intermediate objects e.g. linelets, curvelets, T- junctions e.g. discontinuities, gradient animal head instantiated by bear head Slide adopted from Rob Fergus

51 Single Layer Convolutional Architecture Input: Image Pixels / Features Filter Normalize Pool Output: Features / Classifier Slide from Rob Fergus

52 Single Deconvolutional Layer Convolutional form of sparse coding Slide from Rob Fergus

53 Single Deconvolutional Layer Slide from Rob Fergus

54 Single Deconvolutional Layer Slide from Rob Fergus

55 Single Deconvolutional Layer Slide from Rob Fergus

56 Toy Example Feature maps Filters Slide from Rob Fergus

57 Reversible Max Pooling Feature Map Slide from Rob Fergus

58 Reversible Max Pooling Pooling Feature Map Slide from Rob Fergus

59 Reversible Max Pooling Pooled Feature Maps Pooling Feature Map Slide from Rob Fergus

60 Reversible Max Pooling Max Locations Switches Pooled Feature Maps Pooling Feature Map Slide from Rob Fergus

61 Reversible Max Pooling Max Locations Switches Pooled Feature Maps Pooling Unpooling Feature Map Reconstructed Feature Map Slide from Rob Fergus
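
The pooling/unpooling pair above can be sketched directly: pool each region to its max while recording the argmax location (the "switch"), then unpool by placing each value back at its recorded location. Block size and the toy feature map are illustrative:

```python
import numpy as np

def max_pool_with_switches(fmap, k=2):
    """Pool k x k regions, remembering the argmax ('switch') in each region."""
    h, w = fmap.shape
    pooled = np.zeros((h // k, w // k))
    switches = np.zeros((h // k, w // k), dtype=int)
    for i in range(h // k):
        for j in range(w // k):
            block = fmap[i*k:(i+1)*k, j*k:(j+1)*k]
            switches[i, j] = int(np.argmax(block))       # flat index within the block
            pooled[i, j] = block.ravel()[switches[i, j]]
    return pooled, switches

def unpool(pooled, switches, k=2):
    """Place each pooled value back at its recorded switch location; zeros elsewhere."""
    h, w = pooled.shape
    out = np.zeros((h * k, w * k))
    for i in range(h):
        for j in range(w):
            r, c = divmod(switches[i, j], k)
            out[i*k + r, j*k + c] = pooled[i, j]
    return out

f = np.arange(16.0).reshape(4, 4)
p, s = max_pool_with_switches(f)
print(p)              # [[ 5.  7.] [13. 15.]]
print(unpool(p, s))   # maxima restored at their original positions, zeros elsewhere
```

The switches are what make the operation reversible for the maxima: pooling again after unpooling recovers the same pooled map.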

62 Overall Architecture (1 layer) Slide from Rob Fergus

63 Toy Example Pooled maps Feature maps Filters Slide from Rob Fergus

64 Overall Architecture (2 Layers) Slide from Rob Fergus

65 Model Parameters 7x7 filters at all layers Slide from Rob Fergus

66 Layer 1 Filters 15 filters/feature maps, showing max for each map Slide from Rob Fergus

67 Layer 2 Filters 50 filters/feature maps, showing max for each map projected down to image Slide from Rob Fergus

68 Layer 3 Filters 100 filters/feature maps, showing max for each map Slide from Rob Fergus

69 Layer 4 Filters 150 in total; receptive field is entire image Slide from Rob Fergus

70 Relative Size of Receptive Fields (to scale) Slide from Rob Fergus

71 Restricted Boltzmann Machines

72 Restricted Boltzmann Machines

73 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). Logistic function: $p(v_i = 1 \mid \{v_j\}, j \neq i) = \dfrac{1}{1 + \exp(-b_i - \sum_j W_{ij} v_j)}$ A unit is activated based on a linear combination of the other units plus a bias.

74 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). [plot: the logistic function $p(v_i = 1)$ as a function of $b_i + \sum_j W_{ij} v_j$] $p(v_i = 1 \mid \{v_j\}, j \neq i) = \dfrac{1}{1 + \exp(-b_i - \sum_j W_{ij} v_j)}$ A unit is activated based on a linear combination of the other units plus a bias.
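
The activation rule can be written as a one-line function; the weights and biases below are toy values:

```python
import numpy as np

def p_on(i, v, b, W):
    """p(v_i = 1 | other units) = sigmoid(b_i + sum_j W_ij v_j), with W_ii = 0."""
    return 1.0 / (1.0 + np.exp(-(b[i] + W[i] @ v)))

b = np.array([0.0, -1.0])
W = np.array([[0.0, 2.0],
              [2.0, 0.0]])            # symmetric weights, no self-connections
v = np.array([0.0, 1.0])
print(round(p_on(0, v, b, W), 3))     # sigmoid(0 + 2*1) = 0.881
```

A strong positive connection to an active neighbor pushes the unit toward firing; a negative bias or negative weights push it toward staying off.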

75 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). $p(v) = \dfrac{\exp(-E(v))}{\sum_{v} \exp(-E(v))}, \qquad E(v) = -\sum_i b_i v_i - \sum_{i \neq j} W_{ij} v_i v_j$ More probable configuration = lower energy

76 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). Learning amounts to estimating the parameters of the model: $\theta = \{b_i, W_{ij}\}$ $p(v) = \dfrac{\exp(-E(v))}{\sum_{v} \exp(-E(v))}, \qquad E(v) = -\sum_i b_i v_i - \sum_{i \neq j} W_{ij} v_i v_j$ More probable configuration = lower energy

77 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). Learning amounts to estimating the parameters of the model: $\theta = \{b_i, W_{ij}\}$ Maximum Likelihood $p(v) = \dfrac{\exp(-E(v))}{\sum_{v} \exp(-E(v))}, \qquad E(v) = -\sum_i b_i v_i - \sum_{i \neq j} W_{ij} v_i v_j$ More probable configuration = lower energy

78 Maximum Likelihood Learning Typically we assume independence of the $N$ samples: $L(x_1, x_2, \ldots, x_N) = \prod_{i=1}^{N} p(x_i)$ Take the log (which turns the product into a sum) and do gradient-based optimization with respect to the parameters. For a Boltzmann Machine that comes down to optimizing a sum of energy functions minus the normalizing constant: $p(v) = \dfrac{\exp(-E(v))}{\sum_{v} \exp(-E(v))}$

79 Boltzmann Machine Learning So, essentially we need to iteratively do: $W_{ij}^{(\text{iter})} = W_{ij}^{(\text{iter}-1)} + \Delta W_{ij}^{(\text{iter}-1)}, \qquad b_i^{(\text{iter})} = b_i^{(\text{iter}-1)} + \Delta b_i^{(\text{iter}-1)}$

80 Boltzmann Machine Learning So, essentially we need to iteratively do: $W_{ij}^{(\text{iter})} = W_{ij}^{(\text{iter}-1)} + \Delta W_{ij}^{(\text{iter}-1)}, \qquad b_i^{(\text{iter})} = b_i^{(\text{iter}-1)} + \Delta b_i^{(\text{iter}-1)}$ Where do we get the gradients?

81 Boltzmann Machine Learning So, essentially we need to iteratively do: $W_{ij}^{(\text{iter})} = W_{ij}^{(\text{iter}-1)} + \Delta W_{ij}^{(\text{iter}-1)}, \qquad b_i^{(\text{iter})} = b_i^{(\text{iter}-1)} + \Delta b_i^{(\text{iter}-1)}$ Where do we get the gradients? $\Delta W_{ij} \propto \langle v_i v_j \rangle_{\text{data}} - \langle v_i v_j \rangle_{\text{model}}, \qquad \Delta b_i \propto \langle v_i \rangle_{\text{data}} - \langle v_i \rangle_{\text{model}}$ The data term is easy, just look at the data; the model term requires samples from the model (Gibbs sampler run to equilibrium).

82 Boltzmann Machine Learning $\Delta W_{ij} \propto \langle v_i v_j \rangle_{\text{data}} - \langle v_i v_j \rangle_{\text{model}}$ Law of large numbers: approximate the expectations using samples: $\Delta W_{ij} = \frac{1}{D} \sum_{d=1}^{D} v_i^{(d)} v_j^{(d)} - \frac{1}{M} \sum_{m=1}^{M} \tilde{v}_i^{(m)} \tilde{v}_j^{(m)}$ The data term is easy, just look at the data; the model term requires samples from the model (Gibbs sampler run to equilibrium).
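
The sample-average gradient estimate can be sketched as follows; the configurations are toy data, and the proportionality constant (learning rate) is omitted:

```python
import numpy as np

def boltzmann_gradients(data, model):
    """Estimate Delta W_ij ~ <v_i v_j>_data - <v_i v_j>_model and
    Delta b_i ~ <v_i>_data - <v_i>_model from sample averages.
    data:  D x n array of observed configurations v^(d)
    model: M x n array of configurations sampled from the model
    """
    dW = data.T @ data / len(data) - model.T @ model / len(model)
    np.fill_diagonal(dW, 0.0)                 # no self-connections
    db = data.mean(axis=0) - model.mean(axis=0)
    return dW, db

data = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1]], float)   # observed configurations
model = np.array([[0, 1, 1], [1, 0, 0]], float)             # samples from the model
dW, db = boltzmann_gradients(data, model)
print(dW.shape, db.round(2))
```

Units that co-activate more often in the data than under the model get their connection strengthened, and vice versa.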

83 Restricted Boltzmann Machines Visible units $v_i$ and hidden units $h_j$ are binary (0/1). $p(v, h) = \dfrac{\exp(-E(v, h))}{Z}, \qquad E(v, h) = -\sum_{ij} W_{ij} v_i h_j - \sum_i a_i v_i - \sum_j b_j h_j$ $p(h_j = 1 \mid v) = \dfrac{1}{1 + \exp(-b_j - \sum_i W_{ij} v_i)}, \qquad p(v_i = 1 \mid h) = \dfrac{1}{1 + \exp(-a_i - \sum_j W_{ij} h_j)}$
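
Because an RBM has no connections within a layer, each layer can be sampled in one shot given the other, which makes Gibbs sampling fast. A minimal block-Gibbs sketch; the layer sizes and weight scale are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_step(v, W, a, b, rng):
    """One alternating Gibbs update: sample h | v, then v | h."""
    p_h = sigmoid(b + v @ W)                 # p(h_j = 1 | v), all hidden units at once
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(a + h @ W.T)               # p(v_i = 1 | h), all visible units at once
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4)) * 0.1        # 6 visible, 4 hidden units
a, b = np.zeros(6), np.zeros(4)
v = (rng.random(6) < 0.5).astype(float)
v, h = gibbs_step(v, W, a, b, rng)
print(v, h)                                  # binary configurations
```

Running this chain yields the model samples needed for the $\langle \cdot \rangle_{\text{model}}$ term of the learning rule; contrastive divergence truncates the chain after just a few steps.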

84 Restricted Boltzmann Machines Units $v_i$ are binary (0/1). Still not very realistic, because most data in the real world is continuous. Gaussian visible units address this: $p(v_i \mid h) = \mathcal{N}(a_i + \sum_j W_{ij} h_j, \sigma^2)$

85 Auto-encoders [ Hinton and Salakhutdinov, Science 06 ] We train the auto-encoder to reproduce its input vector as its output. This forces it to compress as much information as possible into the 30 numbers in the central bottleneck. These 30 numbers are then a good way to visualize data and do classification. Architecture: 28x28 patches → 1000 neurons → 500 neurons → 250 neurons → 30 → 250 neurons → 500 neurons → 1000 neurons → 28x28 patches
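
A linear version of the idea can be sketched in a few lines: a bottleneck narrower than the input forces compression. This toy model is linear and trained by plain gradient descent, unlike the deep non-linear network in the slide; all sizes and the learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8)) @ rng.standard_normal((8, 20))  # 20-D data on an 8-D subspace

# Linear autoencoder with an 8-unit bottleneck: 20 -> 8 -> 20.
W_enc = rng.standard_normal((20, 8)) * 0.1
W_dec = rng.standard_normal((8, 20)) * 0.1
lr = 1e-3

def mse():
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

err0 = mse()
for _ in range(2000):
    Z = X @ W_enc                          # codes in the bottleneck
    G = (Z @ W_dec - X) / len(X)           # gradient of the reconstruction error
    W_dec -= lr * Z.T @ G
    W_enc -= lr * X.T @ (G @ W_dec.T)
print(err0 > mse())                        # True: reconstruction improves
```

Since the data really lives on an 8-dimensional subspace, the 8-unit bottleneck can in principle reconstruct it; with a narrower bottleneck the network would have to discard information, which is exactly the compression the slide describes.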

86 Learning a Compositional Hierarchy of Object Structure [ Fidler & Leonardis, CVPR 07; Fidler, Boben & Leonardis, CVPR 08 ] Parts model The architecture Learned parts

87 Learning a Compositional Hierarchy of Object Structure [ Fidler & Leonardis, CVPR 07; Fidler, Boben & Leonardis, CVPR 08 ] Layer 2 Layer 3

88 Learning a Compositional Hierarchy of Object Structure [ Fidler & Leonardis, CVPR 07; Fidler, Boben & Leonardis, CVPR 08 ]

89 Learning a Compositional Hierarchy of Object Structure [ Fidler & Leonardis, CVPR 07; Fidler, Boben & Leonardis, CVPR 08 ]

90 Conclusions Interesting paradigm, where the algorithm tries to learn everything, yet many design choices remain: patch size (8x8 or 20x20?), learning parameters. Typically needs lots and lots of data. Higher levels mostly work on PASCAL or other simple datasets. It is hard to train multi-layer architectures! Since learning is in effect unsupervised, it's difficult to debug or understand what's going on.


More information

Object Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce

Object Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce Object Recognition Computer Vision Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce How many visual object categories are there? Biederman 1987 ANIMALS PLANTS OBJECTS

More information

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations Honglak Lee Roger Grosse Rajesh Ranganath Andrew Y. Ng Computer Science Department, Stanford University,

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

What is the Best Multi-Stage Architecture for Object Recognition?

What is the Best Multi-Stage Architecture for Object Recognition? What is the Best Multi-Stage Architecture for Object Recognition? Kevin Jarrett, Koray Kavukcuoglu, Marc Aurelio Ranzato and Yann LeCun The Courant Institute of Mathematical Sciences New York University,

More information

Learning Sparse FRAME Models for Natural Image Patterns

Learning Sparse FRAME Models for Natural Image Patterns International Journal of Computer Vision (IJCV Learning Sparse FRAME Models for Natural Image Patterns Jianwen Xie Wenze Hu Song-Chun Zhu Ying Nian Wu Received: 1 February 2014 / Accepted: 13 August 2014

More information

Category-level localization

Category-level localization Category-level localization Cordelia Schmid Recognition Classification Object present/absent in an image Often presence of a significant amount of background clutter Localization / Detection Localize object

More information

Beyond Bags of features Spatial information & Shape models

Beyond Bags of features Spatial information & Shape models Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features

More information

Learning Multiple Non-Linear Sub-Spaces using K-RBMs

Learning Multiple Non-Linear Sub-Spaces using K-RBMs 2013 IEEE Conference on Computer Vision and Pattern Recognition Learning Multiple Non-Linear Sub-Spaces using K-RBMs Siddhartha Chandra 1 Shailesh Kumar 2 C. V. Jawahar 1 1 CVIT, IIIT Hyderabad, 2 Google,

More information

Sparse coding for image classification

Sparse coding for image classification Sparse coding for image classification Columbia University Electrical Engineering: Kun Rong(kr2496@columbia.edu) Yongzhou Xiang(yx2211@columbia.edu) Yin Cui(yc2776@columbia.edu) Outline Background Introduction

More information

Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling

Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling Michael Maire 1,2 Stella X. Yu 3 Pietro Perona 2 1 TTI Chicago 2 California Institute of Technology 3 University of California

More information

Learning Hierarchical Feature Extractors For Image Recognition

Learning Hierarchical Feature Extractors For Image Recognition Learning Hierarchical Feature Extractors For Image Recognition by Y-Lan Boureau A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of

More information

Semantic Pooling for Image Categorization using Multiple Kernel Learning

Semantic Pooling for Image Categorization using Multiple Kernel Learning Semantic Pooling for Image Categorization using Multiple Kernel Learning Thibaut Durand (1,2), Nicolas Thome (1), Matthieu Cord (1), David Picard (2) (1) Sorbonne Universités, UPMC Univ Paris 06, UMR 7606,

More information

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

An Analysis of Single-Layer Networks in Unsupervised Feature Learning An Analysis of Single-Layer Networks in Unsupervised Feature Learning Adam Coates Honglak Lee Andrew Y. Ng Stanford University Computer Science Dept. 353 Serra Mall Stanford, CA 94305 University of Michigan

More information

Object Category Detection. Slides mostly from Derek Hoiem

Object Category Detection. Slides mostly from Derek Hoiem Object Category Detection Slides mostly from Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical template matching with sliding window Part-based Models

More information

Learning a Representative and Discriminative Part Model with Deep Convolutional Features for Scene Recognition

Learning a Representative and Discriminative Part Model with Deep Convolutional Features for Scene Recognition Learning a Representative and Discriminative Part Model with Deep Convolutional Features for Scene Recognition Bingyuan Liu, Jing Liu, Jingqiao Wang, Hanqing Lu Institute of Automation, Chinese Academy

More information

Computer Vision I - Filtering and Feature detection

Computer Vision I - Filtering and Feature detection Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

Content-Based Image Retrieval Using Deep Belief Networks

Content-Based Image Retrieval Using Deep Belief Networks Content-Based Image Retrieval Using Deep Belief Networks By Jason Kroge Submitted to the graduate degree program in the Department of Electrical Engineering and Computer Science of the University of Kansas

More information

Window based detectors

Window based detectors Window based detectors CS 554 Computer Vision Pinar Duygulu Bilkent University (Source: James Hays, Brown) Today Window-based generic object detection basic pipeline boosting classifiers face detection

More information

Segmentation. Bottom up Segmentation Semantic Segmentation

Segmentation. Bottom up Segmentation Semantic Segmentation Segmentation Bottom up Segmentation Semantic Segmentation Semantic Labeling of Street Scenes Ground Truth Labels 11 classes, almost all occur simultaneously, large changes in viewpoint, scale sky, road,

More information

Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning

Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning Mohammad Norouzi, Mani Ranjbar, and Greg Mori School of Computing Science Simon Fraser University Burnaby, BC

More information

Supervised Translation-Invariant Sparse Coding

Supervised Translation-Invariant Sparse Coding Supervised Translation-Invariant Sparse Coding Jianchao Yang,KaiYu, Thomas Huang Beckman Institute, University of Illinois at Urbana-Champaign NEC Laboratories America, Inc., Cupertino, California {jyang29,

More information

Generalized Lasso based Approximation of Sparse Coding for Visual Recognition

Generalized Lasso based Approximation of Sparse Coding for Visual Recognition Generalized Lasso based Approximation of Sparse Coding for Visual Recognition Nobuyuki Morioka The University of New South Wales & NICTA Sydney, Australia nmorioka@cse.unsw.edu.au Shin ichi Satoh National

More information

Deep Convolutional Neural Networks. Nov. 20th, 2015 Bruce Draper

Deep Convolutional Neural Networks. Nov. 20th, 2015 Bruce Draper Deep Convolutional Neural Networks Nov. 20th, 2015 Bruce Draper Background: Fully-connected single layer neural networks Feed-forward classification Trained through back-propagation Example Computer Vision

More information

An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation

An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio Université de Montréal 13/06/2007

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Recap Image Classification with Bags of Local Features

Recap Image Classification with Bags of Local Features Recap Image Classification with Bags of Local Features Bag of Feature models were the state of the art for image classification for a decade BoF may still be the state of the art for instance retrieval

More information

Aggregating Descriptors with Local Gaussian Metrics

Aggregating Descriptors with Local Gaussian Metrics Aggregating Descriptors with Local Gaussian Metrics Hideki Nakayama Grad. School of Information Science and Technology The University of Tokyo Tokyo, JAPAN nakayama@ci.i.u-tokyo.ac.jp Abstract Recently,

More information

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES Pin-Syuan Huang, Jing-Yi Tsai, Yu-Fang Wang, and Chun-Yi Tsai Department of Computer Science and Information Engineering, National Taitung University,

More information

Locality-constrained and Spatially Regularized Coding for Scene Categorization

Locality-constrained and Spatially Regularized Coding for Scene Categorization Locality-constrained and Spatially Regularized Coding for Scene Categorization Aymen Shabou Hervé Le Borgne CEA, LIST, Vision & Content Engineering Laboratory Gif-sur-Yvettes, France aymen.shabou@cea.fr

More information

Deep (1) Matthieu Cord LIP6 / UPMC Paris 6

Deep (1) Matthieu Cord LIP6 / UPMC Paris 6 Deep (1) Matthieu Cord LIP6 / UPMC Paris 6 Syllabus 1. Whole traditional (old) visual recognition pipeline 2. Introduction to Neural Nets 3. Deep Nets for image classification To do : Voir la leçon inaugurale

More information

Object Recognition with Deformable Models

Object Recognition with Deformable Models Object Recognition with Deformable Models Pedro F. Felzenszwalb Department of Computer Science University of Chicago Joint work with: Dan Huttenlocher, Joshua Schwartz, David McAllester, Deva Ramanan.

More information

A New Algorithm for Training Sparse Autoencoders

A New Algorithm for Training Sparse Autoencoders A New Algorithm for Training Sparse Autoencoders Ali Shahin Shamsabadi, Massoud Babaie-Zadeh, Seyyede Zohreh Seyyedsalehi, Hamid R. Rabiee, Christian Jutten Sharif University of Technology, University

More information

Rotation Invariance Neural Network

Rotation Invariance Neural Network Rotation Invariance Neural Network Shiyuan Li Abstract Rotation invariance and translate invariance have great values in image recognition. In this paper, we bring a new architecture in convolutional neural

More information

Learning Representations for Visual Object Class Recognition

Learning Representations for Visual Object Class Recognition Learning Representations for Visual Object Class Recognition Marcin Marszałek Cordelia Schmid Hedi Harzallah Joost van de Weijer LEAR, INRIA Grenoble, Rhône-Alpes, France October 15th, 2007 Bag-of-Features

More information

Unsupervised Feature Learning for RGB-D Based Object Recognition

Unsupervised Feature Learning for RGB-D Based Object Recognition Unsupervised Feature Learning for RGB-D Based Object Recognition Liefeng Bo 1, Xiaofeng Ren 2, and Dieter Fox 1 1 University of Washington, Seattle, USA {lfb,fox}@cs.washington.edu 2 ISTC-Pervasive Computing

More information

Bag-of-features. Cordelia Schmid

Bag-of-features. Cordelia Schmid Bag-of-features for category classification Cordelia Schmid Visual search Particular objects and scenes, large databases Category recognition Image classification: assigning a class label to the image

More information

Facial Expression Classification with Random Filters Feature Extraction

Facial Expression Classification with Random Filters Feature Extraction Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle

More information

Blind Image Deblurring Using Dark Channel Prior

Blind Image Deblurring Using Dark Channel Prior Blind Image Deblurring Using Dark Channel Prior Jinshan Pan 1,2,3, Deqing Sun 2,4, Hanspeter Pfister 2, and Ming-Hsuan Yang 3 1 Dalian University of Technology 2 Harvard University 3 UC Merced 4 NVIDIA

More information

Bilinear Models for Fine-Grained Visual Recognition

Bilinear Models for Fine-Grained Visual Recognition Bilinear Models for Fine-Grained Visual Recognition Subhransu Maji College of Information and Computer Sciences University of Massachusetts, Amherst Fine-grained visual recognition Example: distinguish

More information

Unsupervised Learning of Feature Hierarchies

Unsupervised Learning of Feature Hierarchies Unsupervised Learning of Feature Hierarchies by Marc Aurelio Ranzato A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Computer Science

More information

Patch Descriptors. CSE 455 Linda Shapiro

Patch Descriptors. CSE 455 Linda Shapiro Patch Descriptors CSE 455 Linda Shapiro How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar

More information

Human Vision Based Object Recognition Sye-Min Christina Chan

Human Vision Based Object Recognition Sye-Min Christina Chan Human Vision Based Object Recognition Sye-Min Christina Chan Abstract Serre, Wolf, and Poggio introduced an object recognition algorithm that simulates image processing in visual cortex and claimed to

More information

Learning to Align from Scratch

Learning to Align from Scratch Learning to Align from Scratch Gary B. Huang 1 Marwan A. Mattar 1 Honglak Lee 2 Erik Learned-Miller 1 1 University of Massachusetts, Amherst, MA {gbhuang,mmattar,elm}@cs.umass.edu 2 University of Michigan,

More information

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong TABLE I CLASSIFICATION ACCURACY OF DIFFERENT PRE-TRAINED MODELS ON THE TEST DATA

More information

Deep Learning for Vision: Tricks of the Trade

Deep Learning for Vision: Tricks of the Trade Deep Learning for Vision: Tricks of the Trade Marc'Aurelio Ranzato Facebook, AI Group www.cs.toronto.edu/~ranzato BAVM Friday, 4 October 2013 Ideal Features Ideal Feature Extractor - window, right - chair,

More information

Learning Hierarchical Feature Extractors For Image Recognition

Learning Hierarchical Feature Extractors For Image Recognition Learning Hierarchical Feature Extractors For Image Recognition by Y-Lan Boureau A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of

More information

Learning of Visualization of Object Recognition Features and Image Reconstruction

Learning of Visualization of Object Recognition Features and Image Reconstruction Learning of Visualization of Object Recognition Features and Image Reconstruction Qiao Tan qtan@stanford.edu Yuanlin Wen yuanlinw@stanford.edu Chenyue Meng chenyue@stanford.edu Abstract We introduce algorithms

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

Single Image Interpolation via Adaptive Non-Local Sparsity-Based Modeling

Single Image Interpolation via Adaptive Non-Local Sparsity-Based Modeling Single Image Interpolation via Adaptive Non-Local Sparsity-Based Modeling Yaniv Romano The Electrical Engineering Department Matan Protter The Computer Science Department Michael Elad The Computer Science

More information

On Deep Generative Models with Applications to Recognition

On Deep Generative Models with Applications to Recognition On Deep Generative Models with Applications to Recognition Marc Aurelio Ranzato Joshua Susskind Department of Computer Science University of Toronto ranzato,vmnih,hinton@cs.toronto.edu Volodymyr Mnih Geoffrey

More information

Detection III: Analyzing and Debugging Detection Methods

Detection III: Analyzing and Debugging Detection Methods CS 1699: Intro to Computer Vision Detection III: Analyzing and Debugging Detection Methods Prof. Adriana Kovashka University of Pittsburgh November 17, 2015 Today Review: Deformable part models How can

More information