Semantic Pooling for Image Categorization using Multiple Kernel Learning

Similar documents
SEMANTIC POOLING FOR IMAGE CATEGORIZATION USING MULTIPLE KERNEL LEARNING

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

Category-level localization

Image classification using object detectors

JKernelMachines: A Simple Framework for Kernel Machines

LEARNING OBJECT SEGMENTATION USING A MULTI NETWORK SEGMENT CLASSIFICATION APPROACH

Find that! Visual Object Detection Primer

JKernelMachines: A simple framework for Kernel Machines

Learning Representations for Visual Object Class Recognition

Deep condolence to Professor Mark Everingham

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Hierarchical Image-Region Labeling via Structured Learning

An Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b

A Comparison of l 1 Norm and l 2 Norm Multiple Kernel SVMs in Image and Video Classification

Deformable Part Models

Bag-of-features. Cordelia Schmid

Object Recognition II

Object Detection Based on Deep Learning

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds

Detection III: Analyzing and Debugging Detection Methods

Fine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task

MULTIMODAL SEMI-SUPERVISED IMAGE CLASSIFICATION BY COMBINING TAG REFINEMENT, GRAPH-BASED LEARNING AND SUPPORT VECTOR REGRESSION

Category-level Localization

Learning Object Representations for Visual Object Class Recognition

Unified, real-time object detection

Aggregating Descriptors with Local Gaussian Metrics

Ranking Figure-Ground Hypotheses for Object Segmentation

CPSC340. State-of-the-art Neural Networks. Nando de Freitas November, 2012 University of British Columbia

A Discriminatively Trained, Multiscale, Deformable Part Model

Sparse coding for image classification

Human detection using histogram of oriented gradients. Srikumar Ramalingam School of Computing University of Utah

Development in Object Detection. Junyuan Lin May 4th

Real-Time Object Segmentation Using a Bag of Features Approach

Classification of Urban Scenes from Geo-referenced Images in Urban Street-View Context

Object Recognition with Deformable Models

PASCAL VOC Classification: Local Features vs. Deep Features. Shuicheng YAN, NUS


Supplementary Material: Pixelwise Instance Segmentation with a Dynamically Instantiated Network

Apprenticeship Learning: Transfer of Knowledge via Dataset Augmentation

CS 231A Computer Vision (Fall 2011) Problem Set 4

Machine Learning and Content-Based Multimedia Retrieval

Attributes. Computer Vision. James Hays. Many slides from Derek Hoiem

Multi-Component Models for Object Detection

Descriptors for CV. Introduc)on:

Kernels for Visual Words Histograms

Object Detection with Discriminatively Trained Part Based Models

Weakly Supervised Object Recognition with Convolutional Neural Networks

Diagnosing Error in Object Detectors

Segmenting Objects in Weakly Labeled Videos

BossaNova at ImageCLEF 2012 Flickr Photo Annotation Task

Object Detection by 3D Aspectlets and Occlusion Reasoning

Sparse Coding and Dictionary Learning for Image Analysis

ImageCLEF 2011

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018

MEDICAL IMAGE SEGMENTATION WITH DIGITS. Hyungon Ryu, Jack Han Solution Architect NVIDIA Corporation

Part based models for recognition. Kristen Grauman

Computer Vision: Summary and Discussion

Object recognition (part 2)

Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection

Histograms of Sparse Codes for Object Detection

Modeling 3D viewpoint for part-based object recognition of rigid objects

Deep (1) Matthieu Cord LIP6 / UPMC Paris 6

24 hours of Photo Sharing. installation by Erik Kessels

Linear combinations of simple classifiers for the PASCAL challenge

c 2011 by Pedro Moises Crisostomo Romero. All rights reserved.

Immediate, scalable object category detection

Part-Based Models for Object Class Recognition Part 3

arxiv: v2 [cs.cv] 22 Sep 2014

Multiple Kernel Learning for Object Classification

2282 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012

Structured Models in. Dan Huttenlocher. June 2010

G-CNN: an Iterative Grid Based Object Detector

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds

HIERARCHICAL JOINT-GUIDED NETWORKS FOR SEMANTIC IMAGE SEGMENTATION

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao

Rich feature hierarchies for accurate object detection and semantic segmentation

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell

arxiv: v1 [cs.cv] 15 Aug 2018

Learning to Localize Objects with Structured Output Regression

Modern Object Detection. Most slides from Ali Farhadi

CS 231A Computer Vision (Fall 2012) Problem Set 4

Multiple-Choice Questionnaire Group C

Detection and Localization with Multi-scale Models

Object Category Detection. Slides mostly from Derek Hoiem

Layered Object Detection for Multi-Class Segmentation

Mixture Component Identification and Learning for Visual Recognition

Learning Compact Visual Attributes for Large-scale Image Classification

Future directions in computer vision. Larry Davis Computer Vision Laboratory University of Maryland College Park MD USA

Stealing Objects With Computer Vision

Loose Shape Model for Discriminative Learning of Object Categories

CLASSIFICATION Experiments

arxiv: v1 [cs.cv] 2 May 2017

Deformable Part Models with Individual Part Scaling

Self-Paced Learning for Semisupervised Image Classification

Optimizing Object Detection:

Automatic Discovery of Meaningful Object Parts with Latent CRFs

Semantic Texton Forests for Image Categorization and Segmentation

Generalized RBF feature maps for Efficient Detection

Video annotation based on adaptive annular spatial partition scheme

Transcription:

Semantic Pooling for Image Categorization using Multiple Kernel Learning Thibaut Durand (1,2), Nicolas Thome (1), Matthieu Cord (1), David Picard (2) (1) Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6 (2) ETIS/ENSEA, University of Cergy-Pontoise, CNRS, UMR 8051 ICIP 2014 ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 1

Outline 1 Context 2 Model 3 Experiments ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 2

Outline 1 Context 2 Model 3 Experiments ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 3

Supervised image classification Goal Predict the label by using the data of the training set TRAIN images Labels Signatures Learning classifier Signature Classification TEST image Predicted label Figure: Standard pipeline ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 4

Bag of Words (BoW) model Drawback Spatial information is lost ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 5

Integrated geometrical information (1) Spatial Pyramid Good results on scene classification SP is not adapted to objects [ECCV 2012: Russakovsky, Lin, Yu, Fei-Fei. Object-centric spatial pooling for image classification] SP does not encode any semantic information ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 6

Integrated geometrical information (2) Spatial Coordinate Coding (SCC) Integrate the spatial coordinates of the descriptors into the codebook Drawback: lack of invariance with respect to the layout [ICIP 2011: Koniusz, Mikolajczyk. Spatial coordinate coding to reduce histogram representations, dominant angle and colour pyramid match] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 7

Integrated geometrical information (2) Spatial Coordinate Coding (SCC) Integrate the spatial coordinates of the descriptors into the codebook Drawback: lack of invariance with respect to the layout [ICIP 2011: Koniusz, Mikolajczyk. Spatial coordinate coding to reduce histogram representations, dominant angle and colour pyramid match] Object detectors Use the scores of a set of detectors to compute the signature Invariant signatures to the position of the object [ICIP 2013: Durand, Thome, Cord, Avila. Image classification using object detectors] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 7

Contributions New image categorization method using semantic pooling regions Semantic pooling region detection Class-wise selection (MKL) Semantic detection Original image Region selection Representation (VLAD, VLAT,...) Classifier Boat ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 8

Outline 1 Context 2 Model Object detection Image representation Region selection and classification 3 Experiments ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 9

Semantic Pooling with MKL aeroplane K aeroplane MKL K person original image person object detection compute signature kernel Figure: SemanticMKL pipeline 1 Object detection 2 Image representation 3 Region selection and classification ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 10

1 - Object detection aeroplane K aeroplane MKL K person original image person object detection region signatures kernels Figure: SemanticMKL pipeline ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 11

1 - Object detection: Latent SVM object detector Sliding window approach Works as a classifier: predict if an object is present in a certain position and scale in an image [PAMI 2010 : Felzenszwalb, Girshick, McAllester, Ramanan. Object detection with discriminatively trained part based models] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 12

1 - Object detection: Latent SVM object detector Spatial Pyramid Semantic pooling regions Selected semantic pooling regions (details part 3) Boat Car Horse Motorbike Person Pottedplant Sofa Figure: Examples of pooling regions ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 13

1 - Object detection: Latent SVM object detector Spatial Pyramid Semantic pooling regions Selected semantic pooling regions (details part 3) Boat Car Horse Motorbike Person Pottedplant Sofa Figure: Examples of pooling regions ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 14

1 - Object detection: Latent SVM object detector Spatial Pyramid Semantic pooling regions Selected semantic pooling regions (details part 3) Boat Car Horse Motorbike Person Pottedplant Sofa Figure: Examples of pooling regions ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 15

2 - Image representation aeroplane K aeroplane MKL K person original image person object detection region signatures kernels Figure: SemanticMKL pipeline ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 16

Context Model Experiments 2 - Image representation: VLAT [ICIP 2011] Extension of the VLAD approach Vector image representation based on the aggregation of tensor products of local descriptors Model in c cluster of descriptors space Model of descriptors space Bag of descriptors Local descriptor extraction from a dense grid Model in c cluster of i image Figure: VLAT pipeline [ICIP 2011: Picard, Gosselin. Improving Image Similarity With Vectors of Locally Aggregated Tensors] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 17

3 - Region selection and classification aeroplane K aeroplane MKL K person original image person object detection region signatures kernels Figure: SemanticMKL pipeline ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 18

3 - Kernel selection Sofa Sofa Sofa Figure: Semantic pooling region for object sofa Many signatures represent noise Aggregation of background local descriptors Selection of the relevant regions ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 19

3 - Kernel selection: l 1 -Multiple Kernel Learning (MKL) Similarity between two regions: R pooling region φ R function computing a signature (VLAT) for region R Definition of an explicit kernel function k R (, ) measuring the similarity between two images i and j: k R (i, j) = φ R (B Ri ), φ R (B Rj ) (1) k(, )=, Figure: Kernel for R = motorbike ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 20

3 - Kernel selection: l 1 -Multiple Kernel Learning (MKL) Similarity between two images: Linear combination of the kernel corresponding to the associated pooling regions: k(i, j) = β R k R (i, j) (2) R β R the weights associated with each pooling region R k(, )=βplane +...+βmotorbike, +βperson,, + + ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 21

3 - Kernel selection: l 1 -Multiple Kernel Learning (MKL) Learn the weights associated with each kernel using MKL SimpleMKL algorithm l 1 norm constraint enforces sparsity kernel selection Learning jointly the classifier and the kernel combination Optimization problem min max α i 1 α i α i y i y i β R k R (i, j) (3) β α 2 i i,i R s.t. R, β R 0, β R = 1, i, 0 α i y i C (4) R [JMLR 2008: Rakotomamonjy, Bach, Canu, Grandvalet. SimpleMKL] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 22

Outline 1 Context 2 Model 3 Experiments ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 23

Dataset - Pascal VOC 2007-20 classes aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 24

Setup HOG descriptors sampled every 3 pixels at 4 scales Visual codebook: 64 visual words Compressed VLAT: final dimension 8192 (code available at www.vlat.fr) Spatial pooling: 1 1, 2 2, 3 1 Semantic pooling: detectors trained on the trainval set of Pascal VOC 2007 (20 detectors = 20 classes) l 1 -MKL using JKernelMachines (code available on github) [JMLR 2013: Picard, Thome, Cord, JKernelMachines: A simple framework for kernel machines] ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 25

Results Without kernel selection VLAT pvlat svlat map (%) 57.9 59.0 58.4 Table: Results VOC 2007 mean Average Precision (map) VLAT : VLAT without spatial pyramid pvlat : VLAT with spatial pyramid 1 1, 2 2, 3 1 (concatenation) svlat : VLAT with semantic pooling (concatenation) Straightforward combination of the signatures does not work Many signatures represent noise ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 26

Results Without selection With selection Method VLAT pvlat svlat pmkl smkl spmkl map (%) 57.9 59.0 58.4 59.7 63.2 64.0 Table: Resultats VOC 2007 mean Average Precision (map) VLAT : VLAT without spatial pyramid pvlat : VLAT with spatial pyramid 1 1, 2 2, 3 1 svlat : VLAT with semantic pooling pmkl: MKL with spatial pooling 1 1, 2 2, 3 1 smkl: MKL with semantic pooling spmkl: MKL with spatial and semantic pooling Filter out the objects which are uncorrelated with the considered category ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 27

Results Figure: Learned semanticmkl weights (row category, column region, first column whole image) ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 28

Results Correlation between classes whole bicycle motorbike person image Figure: Learned semanticmkl weights for bicycle category ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 29

Conclusion New image categorization system based on a semantic pooling regions Take into account the layout of the images Selection of the relevant detectors with respect to a specific category ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 30

Thank you for your attention! Questions? Thibaut Durand (1,2) Nicolas Thome (1) Matthieu Cord (1) David Picard (2) thibaut.durand@lip6.fr nicolas.thome@lip6.fr matthieu.cord@lip6.fr picard@ensea.fr (1) Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6 (2) ETIS/ENSEA, University of Cergy-Pontoise, CNRS, UMR 8051 Code available on demand ICIP 2014 Semantic Pooling for Image Categorization using Multiple Kernel Learning 31