Incremental Learning of Object Detectors Using a Visual Shape Alphabet

Similar documents
Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection

Incremental learning of object detectors using a visual shape alphabet

Fusing shape and appearance information for object category detection

A Boundary-Fragment-Model for Object Detection

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Beyond Bags of features Spatial information & Shape models

Part based models for recognition. Kristen Grauman

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Efficient Kernels for Identifying Unbounded-Order Spatial Features

Object Detection Using A Shape Codebook

Learning Spatial Context: Using Stuff to Find Things

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

Part-based and local feature models for generic object recognition

Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections

Scalable Multi-class Object Detection

Object Detection. Sanja Fidler CSC420: Intro to Image Understanding 1/ 1

Efficiently Combining Contour and Texture Cues for Object Recognition

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Deformable Part Models

Part-based models. Lecture 10

Loose Shape Model for Discriminative Learning of Object Categories

Linear combinations of simple classifiers for the PASCAL challenge

Local Configuration of SIFT-like Features by a Shape Context

Real-time Accurate Object Detection using Multiple Resolutions

Learning Class-specific Edges for Object Detection and Segmentation

Detection III: Analyzing and Debugging Detection Methods

Fitting: The Hough transform

Patch-based Object Recognition. Basic Idea

Lecture 16: Object recognition: Part-based generative models

Fitting: The Hough transform

Part-Based Models for Object Class Recognition Part 2

Part-Based Models for Object Class Recognition Part 2

Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction

Classification of Protein Crystallization Imagery

Beyond Local Appearance: Category Recognition from Pairwise Interactions of Simple Features

Basic Problem Addressed. The Approach I: Training. Main Idea. The Approach II: Testing. Why a set of vocabularies?

Category-level Localization

Object Detection by 3D Aspectlets and Occlusion Reasoning

Hierarchical SegmentBoost

Semantic Pooling for Image Categorization using Multiple Kernel Learning

A Cascade of Feed-Forward Classifiers for Fast Pedestrian Detection

Fitting: The Hough transform

Object Category Detection. Slides mostly from Derek Hoiem

Model Fitting: The Hough transform II

Effective Classifiers for Detecting Objects

Beyond Bags of Features

Modern Object Detection. Most slides from Ali Farhadi

Methods for Representing and Recognizing 3D objects

Strangeness Based Feature Selection for Part Based Recognition

Object Class Recognition Using Multiple Layer Boosting with Heterogeneous Features

Recognition of Animal Skin Texture Attributes in the Wild. Amey Dharwadker (aap2174) Kai Zhang (kz2213)

Local Features based Object Categories and Object Instances Recognition

Object Category Detection: Sliding Windows

Instance-level recognition part 2

Patch Descriptors. CSE 455 Linda Shapiro

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models

Window based detectors

Shape-Based Object Localization for Descriptive Classification

Evaluation of GIST descriptors for web scale image search

Recap Image Classification with Bags of Local Features

Hough Transform and RANSAC

Virtual Training for Multi-View Object Class Recognition

EECS 442 Computer vision. Fitting methods

Category-level localization

A novel template matching method for human detection

Keypoint-based Recognition and Object Search

Articulated Pose Estimation with Flexible Mixtures-of-Parts

ObjectDetectionusingaMax-MarginHoughTransform

Discriminative Learning of Contour Fragments for Object Detection

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

Object Recognition Using Wavelet Based Salient Points

Visuelle Perzeption für Mensch- Maschine Schnittstellen

Lecture 5: Object Detection

Combining Regions and Patches for Object Class Localization

Patch Descriptors. EE/CSE 576 Linda Shapiro

Multi-Scale Object Detection by Clustering Lines

CS6670: Computer Vision

UNSUPERVISED OBJECT MATCHING AND CATEGORIZATION VIA AGGLOMERATIVE CORRESPONDENCE CLUSTERING

Object recognition (part 2)

Instance-level recognition II.

Eye Detection by Haar wavelets and cascaded Support Vector Machine

Online Object Tracking with Proposal Selection. Yang Hua Karteek Alahari Cordelia Schmid Inria Grenoble Rhône-Alpes, France

Visuelle Perzeption für Mensch- Maschine Schnittstellen

Bag of Words Models. CS4670 / 5670: Computer Vision Noah Snavely. Bag-of-words models 11/26/2013

Large-scale visual recognition The bag-of-words representation

Integrated Feature Selection and Higher-order Spatial Feature Extraction for Object Categorization

Bootstrapping Boosted Random Ferns for Discriminative and Efficient Object Classification

Lecture 20: Object recognition

Lecture 12 Recognition

Learning and Recognizing Visual Object Categories Without First Detecting Features

Contexts and 3D Scenes

Automatic Learning and Extraction of Multi-Local Features

Multi-instance Object Segmentation with Occlusion Handling

Real-Time Human Detection using Relational Depth Similarity Features

Improving Recognition through Object Sub-categorization

EECS 442 Computer vision. Object Recognition

Detecting Multiple Symmetries with Extended SIFT

Find that! Visual Object Detection Primer

Survey of The Problem of Object Detection In Real Images

Transcription:

Incremental Learning of Object Detectors Using a Visual Shape Alphabet A. Opelt, A. Pinz & A. Zisserman CVPR 06 Presented by Medha Bhargava* * Several slides adapted from authors presentation, CVPR 06 1

OUTLINE Motivation, Goals & Overview of the approach Learning the model Stage 1: the visual alphabet of shape Stage 2: jointly/incrementally learned detectors Detection Invariances (scale, rotation, viewpoint) Experiments and Results Summary 2

MOTIVATION Class: Bicycle Class: Person 3

MOTIVATION Classification + Localization + rough Segmentation Proposed approach uses Alphabet of Shape + Geometry 4

GOALS Object detection Localization and crude segmentation Learning new models from previously trained detectors Incremental learning Sharing of model features...underlying theme Sublinear learning complexity NEW CONCEPTS Boundary fragment based shape alphabet Incremental joint-adaboost algorithm 5

BOUNDARY-FRAGMENT FRAGMENT-MODEL MODEL 1/2 Learning the BFM Training set Object delineated by bounding box 20 images/class Validation set Labeled as +ve/-ve image Object centroid marked 50 images/class -- 25 +ve, 25 ve A candidate boundary fragment MUST match edge chains in +ve set have good localization of the centroid in +ve set 6

BOUNDARY-FRAGMENT FRAGMENT-MODEL MODEL 2/2 Scoring a Boundary Fragment ( ) ( γ ) = ( γ ) ( γ ) C c c i match i loc i cmatch γ i : Ratio of cumulative Chamfer matching costs of fragment to edge chains in validation images ( ) c ( γ ) = L i= 1 match i L + i= 1 ( γ i i ) distance, V / L ( γ i i ) + + distance, V / L c loc γ i : pixel distance between true centroid and predicted centroid, averaged over +ve validation set 7

OVERVIEW OF THE APPROACH 1/2 The Boundary-Fragment-Model Basic Model first proposed in Opelt, Pinz and Zisserman ECCV 2006 Geometric model related to Leibe, Leonardis and Schiele (Workshop at ECCV 2004) Similar model proposed by Shotton et al. (ICCV 2005) More categories Two possibilities: Learning JOINTLY or INCREMENTALLY 8

OVERVIEW OF THE APPROACH 2/2 9

STAGE 1: THE VISUAL ALPHABET OF SHAPE 1/3 10

For each class Ci For i=1:n trials STAGE 1: THE VISUAL ALPHABET OF SHAPE 1/3 1. Grow candidate fragment in training images around random starting point i 2. Evaluate the fragment at each step on the validation set of the category calculate costs 3. If the fragments costs are above a certain threshold discard this fragment, otherwise go on with step 4. Validation set for the category: Cow COSTS 11

For each class Ci For i=1:n trials 1. 2. 3. STAGE 1: THE VISUAL ALPHABET OF SHAPE 2/3 4. Evaluate the boundary fragment on the validation sets of the other categories. 5. Add this fragment with costs on all categories and the geometric information to the alphabet Horses Faces Update centroid vectors COSTS 12

STAGE 1: THE VISUAL ALPHABET OF SHAPE 3/3 Clustering shape Visual Shape Alphabet 13

INCREMENTAL LEARNING Enlarging the alphabet codebook 1. Add more boundary fragments 2. Allow a single fragment to vote for additional object centroids Sharing to build 1. If fragments from different categories match, update centroid info 2. Evaluation of fragment on ve validation set 3. Granting additional voting privileges 14

EXAMPLE OF SHARING Over Classes Over Aspects Benefit: One class/aspect can build on what has been learnt from another 15

STAGE 2: WEAK DETECTOR CANDIDATES Combinations of 2 boundary fragments as pool for learning Calculated for ALL combinations on ALL validation sets 16

Visual Shape Alphabet STAGE 2: JOINTLY LEARNED DETECTORS Combinations of 2 alphabet entries form POOL of CANDIDATES JOINTBOOST Based on Torralba et al. CVPR 2004 h(horse,carfront) Each candidate that is valid (boosting) for at least one category is added to Collection of weak detectors 17

STAGE 2: INCREMENTALLY LEARNED DETECTORS Knowledge: Collection of weak detectors e.g. CarsSide, Horses, Bicycles h(horse,carfront) h(horse) h(horse) h(horse,bicycle) 1. Update existing knowledge (share) h(horse,carfront,cow) 2. Add new weak detectors (discriminative) h(cow) 18

DETECTION FOR THE MULTICLASS CASE Collection of votes in Hough voting space Mode above threshold Detection Class 3 19

INVARIANCES Translation Mode search in the Hough voting space In-plane Rotation Hough voting with oriented model Scale invariance 3D-Balloon-Meanshift-Mode-Est. Viewpoint 20

EXPERIMENTS 21

MULTICLASS DATASET Collection of 17 categories, From Caltech, Graz02, Magee, ImageGoogle (avalilable at: http://emt.tugraz.at/~pinz/data) Different numbers of training images per category (10-100) Different aspects and similar categories 22

RESULTS 1/6 Similarities at the alphabet level 23

Incremental vs. Joint-Boosting RESULTS 2/6 24

RESULTS 3/6 Sharing of weak detectors 25

RESULTS 4/6 Examples of detection results 26

RESULTS 5/6 Examples of detection results 27

RESULTS 6/6 Detecion results: Independent learning, Joint learning, one-class, multi-class See paper for details! Motorbikes: Shotton et al. 2005 7.6% Ours: 4.4 % (indep.), 3.9 % (joint) Bicycle (Rear): Ours: 25.0 % (indep.), 20.8 % (joint) Cups: Ours: 18.8 % (indep.), 10.0 % (joint) 28

SUMMARY Shape and geometry for categorization and detection Shared over categories (and aspects) Required number of weak detectors grows sublinearly with the number of categories Alphabet and the detector can be updated incrementally Joint learning gives better results with the same amount of training data 29

THANK YOU! 30