Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection

Similar documents
Incremental Learning of Object Detectors Using a Visual Shape Alphabet

Fusing shape and appearance information for object category detection

A Boundary-Fragment-Model for Object Detection

Detection of a Single Hand Shape in the Foreground of Still Images

Incremental learning of object detectors using a visual shape alphabet

Types of Edges. Why Edge Detection? Types of Edges. Edge Detection. Gradient. Edge Detection

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Object Detection Using A Shape Codebook

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Object Detection. Sanja Fidler CSC420: Intro to Image Understanding 1/ 1

Learning Spatial Context: Using Stuff to Find Things

Deformable Part Models

Patch-based Object Recognition. Basic Idea

Computer Vision at Cambridge: Reconstruction,Registration and Recognition

Object Category Detection: Sliding Windows

Part-based and local feature models for generic object recognition

Generic Object Class Detection using Feature Maps

Part based models for recognition. Kristen Grauman

CS 231A Computer Vision (Winter 2014) Problem Set 3

Motion Detection. Final project by. Neta Sokolovsky

FACE DETECTION BY HAAR CASCADE CLASSIFIER WITH SIMPLE AND COMPLEX BACKGROUNDS IMAGES USING OPENCV IMPLEMENTATION

Face Detection and Alignment. Prof. Xin Yang HUST

Designing Applications that See Lecture 7: Object Recognition

Window based detectors

Category vs. instance recognition

Effective Classifiers for Detecting Objects

Solving Word Jumbles

Review for the Final

Efficient Kernels for Identifying Unbounded-Order Spatial Features

Lecture 16: Object recognition: Part-based generative models

Object Detection Design challenges

Linear combinations of simple classifiers for the PASCAL challenge

Estimating the Vanishing Point of a Sidewalk

GENERIC object recognition is a difficult computer vision problem. Although various solutions exist

CS229 Final Project One Click: Object Removal

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models

GENERIC object recognition is a difficult computer vision problem. Although various solutions exist

Edge and corner detection

Viola Jones Face Detection. Shahid Nabi Hiader Raiz Muhammad Murtaz

Unconstrained License Plate Detection Using the Hausdorff Distance

Automatic Initialization of the TLD Object Tracker: Milestone Update

Active learning for visual object recognition

Detection III: Analyzing and Debugging Detection Methods

Face and Nose Detection in Digital Images using Local Binary Patterns

CS534: Introduction to Computer Vision Edges and Contours. Ahmed Elgammal Dept. of Computer Science Rutgers University

Part-based models. Lecture 10

Object Category Detection: Sliding Windows

Eye Detection by Haar wavelets and cascaded Support Vector Machine

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Object Class Recognition Using Multiple Layer Boosting with Heterogeneous Features

Other Linear Filters CS 211A

Object and Class Recognition I:

Human detection using histogram of oriented gradients. Srikumar Ramalingam School of Computing University of Utah

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

A Discriminative Voting Scheme for Object Detection using Hough Forests

Recap Image Classification with Bags of Local Features

Learning to Detect Faces. A Large-Scale Application of Machine Learning

Automatic Learning and Extraction of Multi-Local Features

Model Fitting: The Hough transform II

RANSAC and some HOUGH transform

Edge Detection. CS664 Computer Vision. 3. Edges. Several Causes of Edges. Detecting Edges. Finite Differences. The Gradient

Detecting and Reading Text in Natural Scenes

Recognition of a Predefined Landmark Using Optical Flow Sensor/Camera

Announcements. Edges. Last Lecture. Gradients: Numerical Derivatives f(x) Edge Detection, Lines. Intro Computer Vision. CSE 152 Lecture 10

VOTING METHODS FOR GENERIC OBJECT DETECTION

Real-time Accurate Object Detection using Multiple Resolutions

A System for Real-time Detection and Tracking of Vehicles from a Single Car-mounted Camera

EE795: Computer Vision and Intelligent Systems

Discriminative Learning of Contour Fragments for Object Detection

EDGE EXTRACTION ALGORITHM BASED ON LINEAR PERCEPTION ENHANCEMENT

Discriminative classifiers for image recognition

REAL-TIME, LONG-TERM HAND TRACKING WITH UNSUPERVISED INITIALIZATION

What is an edge? Paint. Depth discontinuity. Material change. Texture boundary

Finding 2D Shapes and the Hough Transform

Postprint.

Generic Object Detection Using Improved Gentleboost Classifier

A Survey of Methods to Extract Buildings from High-Resolution Satellite Images Ryan Friese CS510

Previously. Window-based models for generic object detection 4/11/2011

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

Face detection, validation and tracking. Océane Esposito, Grazina Laurinaviciute, Alexandre Majetniak

Human Detection and Action Recognition. in Video Sequences

A robust method for automatic player detection in sport videos

Computer Vision. Exercise Session 10 Image Categorization

People detection in complex scene using a cascade of Boosted classifiers based on Haar-like-features

Fitting: The Hough transform

Local Configuration of SIFT-like Features by a Shape Context

Multi-Class Segmentation with Relative Location Prior

Ensemble Methods, Decision Trees

EE795: Computer Vision and Intelligent Systems

Beyond Bags of features Spatial information & Shape models

Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade


Face detection and recognition. Many slides adapted from K. Grauman and D. Lowe

Classification of Protein Crystallization Imagery

Fitting: The Hough transform

Classifier Case Study: Viola-Jones Face Detector

CPSC 425: Computer Vision

Object recognition (part 1)

Object detection as supervised classification

CS334: Digital Imaging and Multimedia Edges and Contours. Ahmed Elgammal Dept. of Computer Science Rutgers University

Transcription:

Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection Andreas Opelt, Axel Pinz and Andrew Zisserman 09-June-2009 Irshad Ali (Department of CS, AIT) 09-June-2009 1 / 20

Object class recognition Object class recognition is a key issue in computer vision. People use Shape and/or Appearance to categorize objects. In this paper they combine both shape and appearance. The alphabet is the basis for a codebook representation of object categories. The main focus of the paper is on representation and use of shape and geometry rather than appearance. Irshad Ali (Department of CS, AIT) 09-June-2009 2 / 20

Object class recognition Irshad Ali (Department of CS, AIT) 09-June-2009 3 / 20

Boundary-Fragment-Model(BFM) A BFM is restricted to a codebook of boundary fragments and does not represent appearance at all. The boundary represents the shape of many object classes quite naturally without requiring the appearance (e.g. texture) to be learnt and thus we can learn models using less training data to achieve good generalization. Irshad Ali (Department of CS, AIT) 09-June-2009 4 / 20

System Overview Irshad Ali (Department of CS, AIT) 09-June-2009 5 / 20

System Overview (a) (b) Figure: (a) Two alphabet entries (one region, one Boundary-Fragment). (b) Two weak detectors (one region-based, one Boundary-Fragment based). Irshad Ali (Department of CS, AIT) 09-June-2009 6 / 20

Training Data To train the model following data are required: A training image set with the object delineated by a bounding box. A validation image set with counter examples (the object is not present in these images), and further examples with the object s centroid (but the bounding box is not necessary). Irshad Ali (Department of CS, AIT) 09-June-2009 7 / 20

Learning Learning is performed in two stages. Alphabet entries are added to a codebook. An alphabet entry can either be a Boundary-Fragment (BF-a piece of linked edges), or a patch (salient region and its descriptor). Each entry also casts at least one centroid vote, which is represented as a vector. Weak detectors are formed as pairs of two alphabet entries, and Boosting is used to select a strong detector. A strong detector consists of many weak detectors. This process selects the weak detectors which perform best on positive validation images and rejects the negative images (including a good centroid estimate). Irshad Ali (Department of CS, AIT) 09-June-2009 8 / 20

Implementation Details Linked edges are obtained for each image in the training and in the validation set using a Canny edge detector. Training images provide the candidate boundary fragments γ i by selecting random starting points on the edge map of each image. Then at each such point they grow a boundary fragment along the contour. Growing is performed from a certain fragment starting length L start in steps of L step pixels until a maximum length L stop is reached. Irshad Ali (Department of CS, AIT) 09-June-2009 9 / 20

Weak detector The combination of boundary fragments to form a weak detector h i. It fires on an image if the k boundary fragments (γ a and γ b ) match image edge chains, the fragments agree in their centroid estimates (within an uncertainty of 2r). In the case of positive images, the centroid estimate agrees with the true object centroid (O n ) within a distance of d c Irshad Ali (Department of CS, AIT) 09-June-2009 10 / 20

Matching weak detectors The top row shows a weak detector with k = 2, that fires on two positive validation image because of highly compact center votes close enough to the true object center (black circle). In the last column a negative validation image is shown. There the same weak detector does not fire (votings do not concur). Bottom row: the same as the top with k = 3. Irshad Ali (Department of CS, AIT) 09-June-2009 11 / 20

Learning a Strong Detector From a weak detector consisting of k boundary fragments and a threshold th hi they learn this threshold and form a strong detector H out of T weak detectors h i using AdaBoost. First they calculate the distances D(h i, I j ) of all combinations of boundary fragments (using k elements for one combination) on all (positive and negative) images of validation set I 1,..., I v. Then in each iteration 1,..., T they search for the weak detector that obtains the best detection result on the current image weighting. Irshad Ali (Department of CS, AIT) 09-June-2009 12 / 20

Detection and Segmentation First the edges are detected. The boundary fragments of the weak detectors are matched to this edge image. In order to detect (one or more) instances of the object (instead of classifying the whole image) each weak detector h i votes with a weight w hi in a Hough voting space. Votes are then accumulated as follows: For all candidate points x n found by the strong detector in the test image I T they sum up the (probabilistic) voting of the weak detectors h i in a 2D Hough voting space. Irshad Ali (Department of CS, AIT) 09-June-2009 13 / 20

Detection and Segmentation Irshad Ali (Department of CS, AIT) 09-June-2009 14 / 20

The BFM for Multiple Categories Building the alphabet of shape for many categories is based on the process for the one-class BFM. They also search over other categories to see if a boundary fragment can be shared. The boundary fragment matches on many positive validation images of another category and gives a roughly correct prediction of the object centroid. In this case they just update the alphabet entry with the new costs for this category and sharing is possible. The boundary fragment matches well on many positive validation images, but the prediction of the object centroid is not correct, though often the predictions for each match are consistent with each other. In this case they add a new centroid vector to the alphabet entry. The third obvious case is where the boundary fragment matches arbitrarily in validation images of a category in which case high costs emerge and sharing is not possible Irshad Ali (Department of CS, AIT) 09-June-2009 15 / 20

Results Irshad Ali (Department of CS, AIT) 09-June-2009 16 / 20

Results Irshad Ali (Department of CS, AIT) 09-June-2009 17 / 20

Results Irshad Ali (Department of CS, AIT) 09-June-2009 18 / 20

Results The first ten weak detectors learnt in the UM for the categories: Cars-side (UIUC), Cars-rear, Airplanes, Motorbikes and Faces (Caltech). Irshad Ali (Department of CS, AIT) 09-June-2009 19 / 20

Conclusion and Discussion Less False positives. Less Training data. Processing Time?? Scaling and Rotation. Irshad Ali (Department of CS, AIT) 09-June-2009 20 / 20