Generic Object Detection Using Improved Gentleboost Classifier

Similar documents
Estimating Human Pose in Images. Navraj Singh December 11, 2009

Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation

Object detection using non-redundant local Binary Patterns

Beyond Bags of Features

Shared features for multiclass object detection

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models

Selection of Scale-Invariant Parts for Object Class Recognition

Bayes Risk. Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Discriminative vs Generative Models. Loss functions in classifiers

Generic Face Alignment Using an Improved Active Shape Model

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

Classifiers for Recognition Reading: Chapter 22 (skip 22.3)

Combining PGMs and Discriminative Models for Upper Body Pose Detection

Shared Features for Multiclass Object Detection

Object recognition (part 1)

Object Detection Design challenges

Part based models for recognition. Kristen Grauman

Object Category Detection: Sliding Windows

Discriminative classifiers for image recognition

Deformable Part Models

Window based detectors

Supervised learning. y = f(x) function

People detection in complex scene using a cascade of Boosted classifiers based on Haar-like-features

Out-of-Plane Rotated Object Detection using Patch Feature based Classifier

Supervised Sementation: Pixel Classification

Sharing features: efficient boosting procedures for multiclass object detection

Hand Posture Recognition Using Adaboost with SIFT for Human Robot Interaction

Part-based and local feature models for generic object recognition

Linear combinations of simple classifiers for the PASCAL challenge

Lecture 20: Object recognition

Active learning for visual object recognition

Effective Classifiers for Detecting Objects

Detection of Multiple, Partially Occluded Humans in a Single Image by Bayesian Combination of Edgelet Part Detectors

Fast Learning for Statistical Face Detection

CS 223B Computer Vision Problem Set 3

Skin and Face Detection

CS 231A Computer Vision (Fall 2012) Problem Set 3

Human Detection. A state-of-the-art survey. Mohammad Dorgham. University of Hamburg

Stacked Denoising Autoencoders for Face Pose Normalization

Using the Forest to See the Trees: Context-based Object Recognition

Object Category Detection: Sliding Windows

Introduction to object recognition. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others

Sharing visual features for multiclass and multiview object detection

TA Section: Problem Set 4

Recap Image Classification with Bags of Local Features


Verification: is that a lamp? What do we mean by recognition? Recognition. Recognition

Beyond Bags of features Spatial information & Shape models

Object detection as supervised classification

Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach

ECG782: Multidimensional Digital Signal Processing

Categorization by Learning and Combining Object Parts

Tracking. Hao Guan( 管皓 ) School of Computer Science Fudan University

Fast Human Detection Using a Cascade of Histograms of Oriented Gradients

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601

Tri-modal Human Body Segmentation

Selection of Scale-Invariant Parts for Object Class Recognition

Detection III: Analyzing and Debugging Detection Methods

High Level Computer Vision. Sliding Window Detection: Viola-Jones-Detector & Histogram of Oriented Gradients (HOG)

Boosting Simple Model Selection Cross Validation Regularization. October 3 rd, 2007 Carlos Guestrin [Schapire, 1989]

Hide-and-Seek: Forcing a network to be Meticulous for Weakly-supervised Object and Action Localization

Generic Object-Face detection

What do we mean by recognition?

Using Maximum Entropy for Automatic Image Annotation

Spatio-temporal Feature Classifier

A New Strategy of Pedestrian Detection Based on Pseudo- Wavelet Transform and SVM

Face detection and recognition. Detection Recognition Sally

Component-based Face Recognition with 3D Morphable Models

Lecture 16: Object recognition: Part-based generative models

Classification Algorithms in Data Mining

Oriented Filters for Object Recognition: an empirical study

Detecting Pedestrians by Learning Shapelet Features

Supervised learning. y = f(x) function

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK

Summarization of Egocentric Moving Videos for Generating Walking Route Guidance

A robust method for automatic player detection in sport videos

Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade

Component-based Face Recognition with 3D Morphable Models

Towards Practical Evaluation of Pedestrian Detectors

Ensemble Methods, Decision Trees

Detecting Pedestrians Using Patterns of Motion and Appearance

CAP 6412 Advanced Computer Vision

Object recognition. Methods for classification and image representation

Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection

A Keypoint Descriptor Inspired by Retinal Computation

A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods

EE795: Computer Vision and Intelligent Systems

Learning and Recognizing Visual Object Categories Without First Detecting Features

Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection

Detecting Pedestrians Using Patterns of Motion and Appearance

CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation

Detecting Object Instances Without Discriminative Features

Previously. Window-based models for generic object detection 4/11/2011

Category vs. instance recognition

Fusing shape and appearance information for object category detection

2 Cascade detection and tracking

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE

Transcription:

Available online at www.sciencedirect.com Physics Procedia 25 (2012 ) 1528 1535 2012 International Conference on Solid State Devices and Materials Science Generic Object Detection Using Improved Gentleboost Classifier Li Guo a,b,yu Liao a,daisheng Luo b,honghua Liao a a Information Engineering Department,Hubei Institute for Nationalities,Enshi, China b School of Electronics and Information Engineering Sichuan University Chengdu, China Abstract Object detection in natural scene and image is playing an important role in computer vision. The traditional object detection method is more about local feature extraction and supervised learning. Using this method, the detection rate of the image with complex scenes is low. But the reality scene is complex and the real-time detection system need to handle a large number of images. In order to remedy the deficiency of the traditional test methods in the object detection, we propose a new approach which uses patches of object and its relative position with the object s center as the feature and a new improved gentleboost classifier, which enables it to work with better detection result. In this method, we use the linear regression stump as the weak classifier in learning algorithm, weights to the prediction model from the positive and negative classification, but not to weight it only from the positive aspect. At the same time, we compare our proposed algorithm with the detection method proposed by Andreas et al. in Ref 9] and also compares results with different number of weak learners. Experimental results show that our algorithm is simple to implement and the positive detection rate is high. 2011 2012 Published by Elsevier Ltd. B.V. Selection and/or peer-review under responsibility of of [name Garry organizer] Lee Open access under CC BY-NC-ND license. Keywords object detection; features extraction; classifier; gentleboost algorithm 1. Introduction A long-standing goal of machine vision has been to build a system that is able to recognize many different kinds of objects in cluttered scenes. Systems can learn based on given piles of images containing objects of certain categories, and piles of counterexamples, not containing these objects. It is a challenging task owing to their variable appearance and the occlusion in natural scene. The first need is a robust feature set that allows the object shape to be discriminated cleanly, even in cluttered backgrounds under difficult illumination. The most common approach is to slide a window across the image, and to classify each such window as containing the target or background. This approach has been successfully 1875-3892 2012 Published by Elsevier B.V. Selection and/or peer-review under responsibility of Garry Lee Open access under CC BY-NC-ND license. doi:10.1016/j.phpro.2012.03.272

Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 1529 used to detect objects such as faces [1, 2] and pedestrians [3, 4]. Another popular approach is to extract local interest points from the image, and then to classify each of the regions around these points [5]. The key insights of previous work, therefore, appear to be that using different features described various object information of an image, respectively. This method is to construct classifier in a large number of training samples using machine learning, then to find the same pattern of object density. However, it is not efficient to object detection. Because the surrounding environment is complexity, and there are many noises to effect the image and lead to poor quality, low resolution, and occluded in the object. These cases made the object detection and recognition is very difficult. Meanwhile, a weakness shared by all of the above approaches is that they can fail when local image information is insufficient, such as the target is very small or highly occluded. Due to the disadvantages of previous work, we propose a new method on object detection in natural scenes. The main contribution of this paper is twofold: First, we show a new feature extraction method that the patch and it s position provides excellent performance relative to other existing feature sets. Second, we explore an addition of gentleboost classifier to improve the accuracy in object detection. We believe that our algorithm is better than other recent methods on object detection in complexity natural scene. Fig. 1 illustrates typical examples and related each object to the labeled examples in images. Figure.1 Illustration of object detection This paper is organized as follow. We briefly discuss previous work on object detection in section 1, give a diagram of object detection model in section 2, describe our proposed algorithm in section 3 and give the evaluation of our experiment in section 4. The main conclusions are summarized in section 5. 2.OBJECT detection model In this paper, we give a new proposed method in object detection in natural scenes. This method use patches of object in detection and use improved gentleboost algorithm as classifier to train the detector. Fig. 2 is the diagram of object detection model in our method. It composed of object learning phase and object detection phase. In learning phase, we select some images including a certain of object randomly, extract the features of this object and use improved gentleboost algorithm to train the detector. In detection phase, we use trained detector to detect a new input test image and output the detection result.

1530 Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 Figure.2 Diagram of object detection model 3.Proposed algorithm The standard approach to object detection is to classify each image patch as foreground (containing the object) or background. There are two main steps to be made: what kind feature to extract from an image, and what kind of classifier to apply to this feature vector. Our proposed algorithm uses object patch and its relative position with the object s center as the feature and use improved gentleboost algorithm train the classifier which is used to classify the object. This algorithm is described in the following section. 3.1.Image preprocessing and feature extraction A challenge task in computer vision is to detect common object in natural and complicated images. The precondition of this task is has large numbers of challengeable image datasets. In these images, the objects must be labeled, these annotations must provide the class information of objects in image, and the location, shape, posture of these objects, and so on. This database can be used to train and test in learning and detection. In this paper, we use examples in LabelMe database (opened by MIT) an online database of images annotation. We select 776 images from LabelMe database, 200 for training and the rest for testing [6]. First of all, we convolve each image with a bank of filters. After filtering the images, we then extract the patches randomly from the training images to create the dictionary of feature. The patch size is from 9 to 25 pixels which separated from 2 pixels variable. The size and location of these patches is chosen randomly, but is constrained to lie inside the annotated bounding box. We record the location from which the patch was extracted by creating a spatial mask centered on the object, and placing a blurred delta function at the relative offset of the patch. We repeated this process for multiple filters (number is 4) and patches, creating a large dictionary of features. Thus the i th dictionary entry consists of a filter f i, a patch P i, and a Gaussian mask g i. We can create a feature vector for every pixel in the image as follows:

Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 1531 h i ( I, x, y )=[(I*f i ) P i ]*g i (1) Where * represent convolution, represents normalized cross-correlation and h i is the i th component of the feature vector at pixel (x, y). The intuition behind this is as follows: the normalized cross-correlation detects places where patch i occurs, and these vote for the center of the object using the i masks. Note that the positive images used to create the dictionary of features are not used for anything else. The patch extraction of object and dictionary of filtered patches created from the target object is illustrated in Fig. 3. (a) Patches and center of the target object (b)the dictionary of filtered patches Figure.3 The dictionary of filtered patches created from the target obect 3.2.Improved Gentleboost classifier Popular classifier for object detection includes SVM, neural network, naïve Bayes classifiers, Boosting classifier, etc. We use gentleboost classifier, since it has been shown to work well for object detection [7]. It is the iterative combination of classifiers to build a strong classifier, which easy to implement, fast to train and to apply. And it performs feature selection, thus resulting in a fairly small and interpretable classifier. The output of the boosted classifier is a score bi for each patch, that approximates bi log P(Ci = 1 Ii)/P(Ci = 0 Ii), where Ii are the features extracted from image patch Ii, and Ci is the label (foreground VS background) of patch i. In order to combine different information sources, we need to convert the output of the discriminative classifier into a probability. A standard way to do this is by taking a sigmoid transform: si = (wt [1 bi]) = (w1 + w2bi), where the weights w are fit by maximum likelihood on the validation set, so that si P(Ci = 1 Ii). This gives us a per-patch probability; we will denote this vector of local scores computed from image I as L = L (I). Finally we convert this into a probability distribution over possible object locations by normalizing: P(X = i L) = si /N( N=1/ sj), so that P(X = i L) = 1. The gentleboost classifier is illustrated in Fig. 4.

1532 Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 The complexity of the feature vector can be controlled more easily by using boosting techniques over weak classifiers based on single features (e. g., linear regression stumps) similar in [8]. In this case, the number of features used is bounded by the number of iterations in the boosting process. However, since boosting tends to overfit in some conditions of small datasets, there is a bit of a dilemma here. So, we proposed a new improved gentleboost algorithm would avoid this case. This new algorithm is an addition on gentleboost: (1) we add corrupted copies of our training dataset to the original one and the algorithm will not be easy to overfit, (2)we weights to the prediction model from the positive and negative classification, but not to weight it only from the positive aspect. The step of our improved gentleboost algorithm is specified in Table. 1. At each boosting round, a regression function is fitted (by weighted least-squared error) to each feature in the training set. We use linear regression for our experiments, fitting parameters a, b and th so that our regression functions are of the form f(x) =(a+b)(x> th )+b(x<th). The regression function with the least weighted squared error is added to the total classifier H (x) and its associated feature ( k min ) is used for feature knockout. Figure.4 The illustration of gentleboost classifier TABLE I. OUR IMPROVED GENTLEBOOST ALGORITHM Input: (x i, y i ),, (x m, y m ) where x y 1. Initialize weights w=1, not w=1/m, which make computation simple 2. For iteration number t=1, 2, 3,, T; (a) For each feature k, fit a regression function f t (k) (x)=(a+b)[x k th]+b[[x k th] by weighted least squares on y i to x i with weights w i =1, i=1 m+t-1;a,b is the weighingt of positive and negative recognition (b) Let k min be the index of the feature with the minimal associated weighted least square error; (c) For each iteration, select f t to minimize the error of Taylor value; update the classifier H(x)=H(x)+f t (kmin) ; (d) Create a new example x m+t : select two random indices, x a =x m+t ; x b = x m+t (kmin) ; y a =y m+t ; (e) Select new example weight w m+t to that of its source, update the weights and normalization. 3. Output the final stronger classifier H(x). 4.Experiments In order to validate our method, we evaluate our algorithm on the subset of the MIT LabelMe dataset of objects and scenes, which contains about 2000 images of indoor and outdoor scenes, in which about 30 different kinds of objects have been manually annotated. The dataset is dynamic, free to use, and open to public contribution.

Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 1533 We take car detection (side view) as a test case in the LabelMe dataset, because it has enough training data. The result is about 200 train car images(12 200) and test images size are about 256 256 pixels in size. A good example of car datasets is shown in Fig. 5. Our experiments are carried out as follows: We compute a bank of features for each labeled image, and then sample the resulting filter at various locations: once near the center of the object (to generate a positive training example), and at about 20 random locations outside the object s bounding box (to generate negative training examples). We repeat this for each training image. These feature vectors and labels are then passed to the classifier. We perform several rounds of boosting (this number was chosen by monitoring performance on the validation set) to train each classifier (using about 200 images). Once the classifier is trained, we can apply it to a novel image, and find the location of the strongest response. From experimental results shown in Fig.6, We can see the input image labeled with ground truth and detector output. It indicates that our detector on the car (side view) in various scenes is accurate. This shows that our proposed algorithm provides a high quality baseline and gives essentially perfect results on the MIT car test set. Figure.5 Examples of the car dataset in LabelMe dataset We use the evaluation criterion is precision-recall curve (RPC) rather than the more common ROC metric, since the latter is designed for binary classification tasks, not detection tasks. Precision can be seen as a measure of exactness or fidelity, whereas recall is a measure of completeness. Our algorithm performance figures quoted in Fig. 7 shows precision-recall curve on three different numbers of weak learners. The red line shows the number of weak learners is 8. The green line shows the number is 30 and the third blue line shows the number is 120. It is shown that the detection result of larger weak learner outperforms the smaller one. On the other way, we summarize performance of comparable result of our proposed algorithm and another method proposed by Andreas et al. in [9] on the same object detection dataset. Fig. 8 illustrates the result of this experiment As it can be observed in precision-recall curve, the green line shows the detection of our algorithm and the red line shows the result of another method. It is clear that the object detection performance of our algorithm is superior to this earlier method.

1534 Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 Figure.6 Illustration of input image and detector output Figure.7 Precision-Recall curve on our algorithm in defferent number of weak learners Figure.8 Comparable result of our algorithm and other method in Ref. [9]

Li Guo et al. / Physics Procedia 25 ( 2012 ) 1528 1535 1535 5.Conclusion In this paper, we have presented a new object detection algorithm. Through experimental results, it proved that proposed method for object detection is efficient and accuracy, robust feature encoding for object detection and good performance of classifier for generic object classes detection. Much still needs to be done to further improve this work. We expect that classification accuracy would increase further if we were able to add part-based detector for handling partial occlusions; multiple scale detector based on improved gentleboost to learn and real-time detection. Acknowledgment Th work is supported by Science Research Program of Education Bureau of Hubei Province under Grant No. D20101903 and the innovation group program of Hubei university for nationalities under Grant No. MY2008T002. References [1] C. Papageorgious and T. Poggio. A trainable system for object detection. Intl. J. Computer Vision, 38(1):15-33,2000 [2] Henry Schneiderman and Takeo Kanade. A statistical model for 3D object detection. In CVPR, 2000 [3] P.Vida, M.Jones and D.Snow. Detecting pedestrians using patterns of motion and appearance. In IEEE Conf. on Computer Vision and Pattern Recognition, 2003 [4] K.Mikolajczyk, C. Schmid, and A. Zisserman. Human detection based on a probabilistic assembly of robust part detectors. In Proceedings of the 8 th European Conference on Computer Vision, Prague, Czech Republec, May 2004. [5] R. Fergus, P. Perona, and A. Zisserman. A sparse object category model for efficient learning and exhaustive recognition. In CVPR, 2005 [6] B. C.Russell, A. Torralba, K. P.Murphy, W. T.Freeman. LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision, pages 157-173, volume 77, Numbers 1-3, May,2008 [7] P. Viola and M. Jones. Robust real-time object detection. Intl. J. Computer Vision, 57(2): 137-154, 2004 [8] Lior Wolf, Ian Martin. Robust boosting for learning from few examples. Proc. CVPR, pages 359-364, 2005 [9] Andreas opelt and Axel pinz. Object localization with Boosting and weak supervision for Generic object Recognition. Image Analysis, pages 862-871, 2005.