Class-Specific Weighted Dominant Orientation Templates for Object Detection


Hui-Jin Lee and Ki-Sang Hong
Image Information Processing Lab., POSTECH E.E., San 31 Hyojadong, Pohang, South Korea

Abstract. We present a class-specific weighted Dominant Orientation Template (DOT) for class-specific object detection that exploits the speed of DOT, although the original DOT was intended for instance-specific object detection. We use an automatic selection algorithm to select representative DOTs from training images of an object class, and three types of 2D Haar wavelets to construct weight templates of the class. To generate class-specific weighted DOTs, we use a modified similarity measure that combines the representative DOTs with the weight templates. In experiments, the proposed method achieved detection accuracy better than or at least comparable to that of existing methods while being very fast in both training and testing.

1 Introduction

Object detection is an important yet challenging vision task with many applications, including image retrieval and video surveillance. Template matching is simple and applicable to different types of objects, and is therefore an attractive approach to object detection. Given templates of an object, the task is to locate in an input image a target object whose properties are similar to those of the templates. A template can be represented in various forms, such as a raw image, a binary image, a set of feature points, or a learned pattern. In [1, 2], templates are represented as raw intensity or color images and capture the appearance of objects. However, appearance templates are too specific to capture the general properties of an object class. Therefore, binary templates [3-5] that represent the contours of objects are often used to capture information about their shape.
The Chamfer distance [6] and Hausdorff distance [7, 8] are commonly used to measure the similarity between a binary template and the input image contour; both are computed from the Distance Transform (DT) image [9]. This approach generally produces many false positives when the background is cluttered, so edge magnitudes [10] or orientations [11-13] are included as additional information in the DT image; combining magnitude and orientation can further reduce the number of false positives [14]. However, all DT-based methods share the weakness that contour points must be extracted, and this step can fail in images with illumination changes and noise.
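The chamfer-matching idea above can be sketched in a few lines: compute a distance transform of the edge map, then average the DT values under the template's contour points. This is an illustrative toy (a brute-force DT, not the fast two-pass algorithm of [9], and not the authors' code):

```python
import math

def distance_transform(edge_map):
    # Brute-force Euclidean DT: for every pixel, the distance to the
    # nearest edge pixel. Fine for a tiny demo; real systems use the
    # fast two-pass algorithm.
    h, w = len(edge_map), len(edge_map[0])
    edge_pts = [(y, x) for y in range(h) for x in range(w) if edge_map[y][x]]
    return [[min(math.hypot(y - ey, x - ex) for ey, ex in edge_pts)
             for x in range(w)] for y in range(h)]

def chamfer_score(template_pts, edge_map):
    # Mean distance from the template contour points to the nearest
    # image edge; lower means a better match.
    dt = distance_transform(edge_map)
    return sum(dt[y][x] for y, x in template_pts) / len(template_pts)

# Toy example: a vertical edge at column 4 of an 8x8 image.
edges = [[x == 4 for x in range(8)] for _ in range(8)]
on_edge = [(r, 4) for r in range(8)]    # contour lying on the edge
shifted = [(r, 6) for r in range(8)]    # contour shifted 2 px away
print(chamfer_score(on_edge, edges))    # 0.0
print(chamfer_score(shifted, edges))    # 2.0
```

A cluttered background adds spurious edge pixels everywhere, which pulls the DT values down and lets unrelated contours score well; this is exactly the false-positive problem the magnitude and orientation cues are meant to fix.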

Fig. 1. (a) Target object; (b) result of template matching using DOT (white rectangle: detected object in input image).

To overcome these limitations, image gradients are used as the features for constructing templates. The features can be represented [15, 16] as rectified responses from an extended set of oriented Haar-like wavelets. PCA-SIFT features, which are variants of SIFT [17], are obtained by using PCA to project gradient images onto a basis [18]. Lastly, binary edge-presence voting can be used to assign features to log-polar spaced bins, irrespective of edge orientation [19]. Templates based on these features tend to be sparse, so they may not suit objects that have little texture. Thus, the Histogram of Oriented Gradients (HOG) [20] was proposed as a dense image feature. HOG describes the local distributions of image gradients computed on a regular grid; it gives reliable results but tends to be slow due to its computational complexity. Thus, all of the existing methods have shortcomings. Hence, a binary template representation called the Dominant Orientation Template (DOT) was introduced [21]. It relies on local dominant orientations instead of local histograms, is invariant to small translations, and is at least as discriminative as HOG while being much faster. However, DOT is somewhat instance-specific, especially when objects are textured. Given the DOT of a cup bearing a flower image (Fig. 1a), DOT detects only that instance, not other cups of the same class (Fig. 1b). Hence, it is more suitable for object recognition than for object detection. In this paper, we propose a DOT-based template for class-specific object detection that exploits the speed of DOT, although the original DOT was intended for instance-specific detection. To achieve this class-specificity, we form two kinds of templates. One consists of the representative DOTs of an object class, selected automatically from the DOTs of all training images. The other consists of weight templates of the object class, obtained from the responses of vertical, horizontal, and diagonal 2D Haar wavelets [22]. To combine the representative DOTs and the weight templates, we present a modified similarity measure defined as the product of the two. The remainder of this paper is organized as follows: Section 2 briefly reviews the concepts of DOT and 2D Haar wavelets and describes the proposed

method. Section 3 shows experimental results on several classes, and Section 4 concludes the paper.

2 The Proposed Method

This section describes Dominant Orientation Templates (DOTs) and 2D Haar wavelets, and how they are used to generate class-specific weighted DOTs.

2.1 DOT

Hinterstoisser et al. [21] retain the orientations of strong gradients based on their magnitudes, and consider only the orientations of the gradients, i.e., two vectors 180° apart are regarded as having the same orientation. Let I be an input image and O a reference image; each is decomposed into small square regions R over a regular grid. For each R, the dominant orientations are taken as features. An operation DO(O, R) returns the set of orientations with strong gradients in region R of O (Fig. 2a, left); a related operation do(I, c + R) returns only the orientation of the strongest gradient in the region at location c + R in I (Fig. 2a, right), where c is the template's center location. The obtained orientations are discretized to integer values (Fig. 2b). If all gradient magnitudes in a region are below a threshold τ, the region is judged to be uniform; the symbol ⊥ is used to indicate such regions. Hence, DO(.) returns either a set of discretized orientations in [0, n-1] of the k strongest gradients, or {⊥}. Similarly, do(.) returns either the single discretized orientation in [0, n-1] of the strongest gradient, or {⊥}. The similarity score ε(I, O, c) between the image I and the template O centered at location c in I is formalized as

  ε(I, O, c) = Σ_{R in O} δ(do(I, c + R) ∈ DO(O, R))   (1)

where the binary function δ(P) = 1 if P is true, and 0 otherwise. The score ε can be computed efficiently using a binary representation of DO(O, R) and do(I, c + R). By setting the number n of discretized orientations to 7, DO(.) and do(.) can each be represented as one byte, i.e., an 8-bit integer. Each of the first 7 bits corresponds to an orientation: bit i (0 ≤ i ≤ 6) is set to 1 if discretized orientation i is present and to 0 otherwise; if the region is uniform, the 8th bit (bit 7) is set to 1. The term δ(.) in Eq. 1 can then be evaluated very quickly by

  δ(do(I, c + R) ∈ DO(O, R)) = 1  iff  do(I, c + R) ∧ DO(O, R) ≠ 0   (2)

where ∧ is the bitwise AND operation.
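The byte encoding and bitwise-AND test of Eqs. 1-2 can be sketched as follows (a minimal illustration, not the authors' code; the bin count n = 7 and the uniform bit follow the text):

```python
UNIFORM = 1 << 7     # bit 7 marks a region with no strong gradient

def discretize(angle_deg):
    # Map a gradient direction (taken mod 180 deg, since opposite
    # vectors share an orientation) to one of n = 7 bins.
    return int(angle_deg % 180) * 7 // 180

def template_byte(orientations):
    # DO(O, R): OR together the bits of the k strongest orientations;
    # an empty set means a uniform region.
    b = 0
    for o in orientations:
        b |= 1 << o
    return b if b else UNIFORM

def similarity(template, image):
    # Eqs. (1)-(2): a region counts as matched when the single strongest
    # image orientation appears among the template's dominant ones,
    # which reduces to a nonzero bitwise AND.
    return sum(1 for t, i in zip(template, image) if t & i)

tmpl = [template_byte([0, 3]), template_byte([5]), template_byte([])]
img  = [1 << 3,                1 << 5,             1 << 2]
print(similarity(tmpl, img))   # 2 (third region: uniform vs. bin 2)
```

The per-region test is one AND and one comparison, which is why DOT matching is so much cheaper than comparing local histograms.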

Fig. 2. (a) Dominant orientations of input image I and reference image O; (b) similarity score between two images using the bitwise AND operator.

2.2 2D Haar Wavelets

The Haar wavelet is a sequence of rescaled square-shaped functions which together form a wavelet family, or basis. The Haar mother wavelet function ψ(t) is

  ψ(t) = 1 for 0 ≤ t < 1/2;  -1 for 1/2 ≤ t < 1;  0 otherwise.   (3)

Its scaling function φ(t) is

  φ(t) = 1 for 0 ≤ t < 1;  0 otherwise.   (4)

The natural extension of wavelets to 2D signals is obtained by taking tensor products of the 1D functions ψ(t) and φ(t). In [22], three types of 2D wavelets were obtained: one that encodes a difference in intensity along vertical borders, one that encodes a difference in intensity along horizontal borders, and one that responds strongly to diagonal boundaries. The representation built from these wavelets therefore identifies local, oriented intensity-difference features at multiple resolutions and is efficiently computable.

2.3 Class-Specific Weighted DOTs

This section presents a method to obtain class-specific weighted DOTs. The original DOT is instance-specific, because it carries information specific to the one object to be detected. The goal of class-specific weighted DOTs is to capture the information common to all objects in a given class. This goal is achieved by combining representative DOTs and weight templates. The representative DOTs are selected from the DOTs of all training images. The weight templates are computed using 2D Haar wavelets (Section 2.2) to capture the information common to the object class. A modified similarity measure, defined as the product of the representative DOTs and the weight templates, yields the class-specific weighted DOTs.

Algorithm 1 Bottom-up clustering method
Input: DOTs obtained from m training images, T_i, i = 1,..,m
Output: Sets A_j and cluster templates C_j, j = 1,..,N, for N clusters
1. Initialize y_i = 0 for all i = 1,..,m, where y_i ∈ {0, 1} records whether T_i already belongs to a cluster.
2. For j = 1,..,N:
   a. Create an empty set A_j for the j-th cluster.
   b. Randomly select a T_i with y_i = 0, put it in A_j, and set y_i = 1.
   c. For k = 1,..,d (where d is a predefined number):
      (a) Obtain C_j as the bitwise OR of all DOTs in A_j.
      (b) Among the T_i with y_i = 0, select the T_i with the smallest Hamming distance d_h to C_j, put it in A_j, and set y_i = 1.
3. Return A_j and C_j for j = 1,..,N.

Representative DOTs. Given the training images of an object class as reference images, we generate a DOT for each training image (Section 2.1). Template matching with all DOTs obtained from the training images would require a long run time and can even decrease accuracy because of redundancy among the DOTs. Hence, representative DOTs of the object class are selected; the overall framework is shown in Fig. 3. To remove redundancy among DOTs, we use a bottom-up clustering method [21] (Algorithm 1) to gather similar DOTs into clusters.
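Algorithm 1 can be sketched in a few lines. Templates are flat lists of per-region bytes here, and `d` is the predefined number of absorptions per cluster; this is a simplified illustration, not the authors' implementation:

```python
import random

def hamming(a, b):
    # Hamming distance between two templates stored as byte lists.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def bitwise_or_all(tmpls):
    # Cluster template C_j: bitwise OR of all member DOTs.
    out = [0] * len(tmpls[0])
    for t in tmpls:
        out = [a | b for a, b in zip(out, t)]
    return out

def cluster_dots(templates, n_clusters, d, seed=0):
    # Greedy bottom-up clustering: seed each cluster with a random
    # unassigned DOT, then absorb the nearest unassigned DOT (Hamming
    # distance to the OR-ed cluster template) d times.
    rng = random.Random(seed)
    free = list(range(len(templates)))
    clusters, cluster_tmpls = [], []
    for _ in range(n_clusters):
        if not free:
            break
        members = [free.pop(rng.randrange(len(free)))]
        for _ in range(d):
            if not free:
                break
            c = bitwise_or_all([templates[m] for m in members])
            best = min(free, key=lambda i: hamming(templates[i], c))
            free.remove(best)
            members.append(best)
        clusters.append(members)
        cluster_tmpls.append(bitwise_or_all([templates[m] for m in members]))
    return clusters, cluster_tmpls
```

On four toy single-region templates `[[1], [1], [2], [2]]` with `n_clusters=2, d=1`, the two identical pairs end up in separate clusters with cluster templates `[1]` and `[2]`, whichever seed is drawn first.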
Because a cluster template C_j accumulates the properties of all DOTs in cluster A_j, we assume that within each cluster A_j, the template most similar to C_j has the highest potential as the representative DOT. The representative DOT S_j of cluster A_j is selected as

  S_j = argmax_{T_{A_j} ∈ A_j} Σ_{R in C_j} δ(DO(T_{A_j}, R) ≈ DO(C_j, R))   (5)

where T_{A_j} is a template in cluster A_j. The term δ(.) in Eq. 5 is defined by

  δ(DO(T_{A_j}, R) ≈ DO(C_j, R)) = 1  iff  d_h(DO(T_{A_j}, R), DO(C_j, R)) ≤ t   (6)

where d_h is the Hamming distance and t is a threshold that decides whether region R of the template coincides with that of the cluster template.

Fig. 3. Framework of representative DOT selection.

Weight Templates. The Haar wavelets encode differences in average intensity between local regions along different orientations, so their responses can express the visual features of an object class. More specifically, a strong response from a particular wavelet indicates the presence of an edge at that location, whereas a weak response indicates a uniform area. Therefore, these wavelets can capture visually reliable features of class-specific shape. We use the three types of 2D Haar wavelets (Section 2.2) with size 4×4. Given the gray-scale training images of an object class, wavelet responses for each training image are computed at every grid point with a grid spacing of 4×4; nearest-neighbor interpolation of these grid values is then used to assign values to all other pixels. For each orientation, the responses are averaged over all training images, and each average is normalized to a proportion of the maximum average value, resulting in three weight templates: vertical v(x, y), horizontal h(x, y), and diagonal d(x, y). Training images of cars and pedestrians (Fig. 4b) were used to obtain weight templates (Fig. 4c) that identify the significant regions of an object class for each orientation and capture the general properties of the class.

Class-Specific Weighted DOTs. To generate class-specific weighted DOTs, the representative DOTs are combined with the weight templates of the object class. To do this, we modify the similarity score (Eq. 1). The modified similarity score ϕ(I, S_j, c) between I and a representative DOT S_j centered at c in I is formalized as

  ϕ(I, S_j, c) = Σ_{R_k in S_j} w(R_k) δ(do(I, c + R_k) ∈ DO(S_j, R_k))   (7)

Fig. 4. (a) The three types (vertical, horizontal, and diagonal) of 2D Haar wavelets; (b, c) training images (top) and weight templates (bottom) of cars and pedestrians.

Fig. 5. (a) 8-bit representation of each orientation bin; (b) orientation bin corresponding to each weight template of pedestrians.

where w(R_k) is the weight value of region R_k. If the weight templates are not considered, w(R_k) is set to 1; in this case ϕ(.) equals the similarity score of the original DOT (Eq. 1). If the weight templates are considered, w(R_k) is the value of the weight template corresponding to do(I, c + R_k):

  w(R_k) = v(R_k) if do(I, c + R_k) ∈ {2^0, 2^6};
           h(R_k) if do(I, c + R_k) ∈ {2^3};
           d(R_k) if do(I, c + R_k) ∈ {2^1, 2^2, 2^4, 2^5};
           u      if do(I, c + R_k) ∈ {2^7}   (8)

where u is a constant weight value for uniform regions. Here do(I, c + R_k) is represented as an 8-bit integer (Fig. 5).
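The weight lookup of Eq. 8 and the weighted score of Eq. 7 amount to the following sketch (an illustrative reading of the equations; the per-region weight lists v, h, d and the uniform weight u are assumed given, and exactly one bit of each image byte is set):

```python
UNIFORM = 1 << 7

def region_weight(do_bits, k, v, h, d, u):
    # Eq. (8): pick the weight template that matches the orientation bin
    # carried by do(I, c + R_k).
    if do_bits & ((1 << 0) | (1 << 6)):   # near-vertical bins
        return v[k]
    if do_bits & (1 << 3):                # near-horizontal bin
        return h[k]
    if do_bits & UNIFORM:                 # uniform region
        return u
    return d[k]                           # remaining, diagonal bins

def weighted_similarity(template, image, v, h, d, u):
    # Eq. (7): the original 0/1 region match (Eq. 2) is scaled by the
    # class weight of the matched region.
    score = 0.0
    for k, (t, i) in enumerate(zip(template, image)):
        if t & i:
            score += region_weight(i, k, v, h, d, u)
    return score

v = [2.0, 2.0, 2.0]; h = [1.0, 1.0, 1.0]; dg = [0.5, 0.5, 0.5]
tmpl = [1 << 0, 1 << 3, 1 << 2]
img  = [1 << 0, 1 << 3, 1 << 2]          # every region matches
print(weighted_similarity(tmpl, img, v, h, dg, u=0.1))   # 2.0+1.0+0.5 = 3.5
```

Matches in regions that the class weight templates mark as significant thus contribute more to the score than matches in the background or in texture that varies between instances.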

Fig. 6. (a) Input image and detection results of pedestrians without and with mask images (green rectangle: ground truth; blue rectangle: correct result; red rectangle: false positive); (b) mask images over vertical, horizontal, and diagonal weight templates.

For template matching, we scan I with the representative DOTs of the specific class. The best-matching template T is obtained from

  T = argmax_{S_j} ϕ(I, S_j, c)   (9)

Once T is obtained, the similarity score at location c is stored; local peaks of these scores indicate regions that contain a target object. To reduce the number of false positives, we apply thresholding to derive binary mask images m_a from the weight templates:

  m_a(R_k) = 1 if w(R_k) ≥ τ;  0 otherwise   (10)

where the threshold τ is set relative to τ_max, the maximum value of the weight template. Most false positives appear similar to the target object; for instance, because the body shape of a pedestrian is approximately a vertical rectangle, it can easily be confused with a background object of similar shape. To eliminate this type of false positive, we use the mask images to verify the detection results. The number n_cnt of sub-regions matched within m_a is defined by

  n_cnt = Σ_{R_k in S_j} m_a(R_k) δ(do(I, c + R_k) ∈ DO(S_j, R_k))   (11)

Candidate regions in which n_cnt falls below a threshold are discarded even if their similarity score ϕ(I, T, c) is high. Using m_a increases the ability to distinguish a target object from the background (Fig. 6): this approach captures the general properties of an object class and ignores unnecessary information inside objects.
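The mask construction and verification of Eqs. 10-11 reduce to the following sketch (templates and masks are flat per-region lists here; the threshold handling is an assumption consistent with the text):

```python
def make_mask(weights, tau):
    # Eq. (10): binary mask, 1 where the class weight clears the threshold.
    return [1 if w >= tau else 0 for w in weights]

def n_cnt(template, image, mask):
    # Eq. (11): number of matched regions that also lie inside the mask.
    return sum(m for t, i, m in zip(template, image, mask) if t & i)

def verify(template, image, mask, min_cnt):
    # Discard a candidate whose masked match count is too low, even if
    # its weighted similarity score is high.
    return n_cnt(template, image, mask) >= min_cnt

mask = make_mask([0.9, 0.2, 0.8], tau=0.5)   # -> [1, 0, 1]
print(n_cnt([1, 2, 4], [1, 2, 4], mask))     # 2: middle match is unmasked
print(verify([1, 2, 4], [1, 2, 4], mask, min_cnt=2))   # True
```

A background object that happens to match many low-weight regions can score well under Eq. 7 yet still fail this count, which is how the mask removes the "vertical rectangle" false positives described above.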

3 Experimental Results

The proposed method was tested on images of cars (UIUC single-scale car dataset [1] and CalTech car-rear dataset [24]) and pedestrians (USC-A pedestrian test set [23] and INRIA person dataset [20]). For all datasets, the size of the decomposed regions R was 4×4 pixels. In each region, the number k of extracted dominant orientations was 7 in training images and 1 in test images. Detection performance was measured at the recall-precision Equal Error Rate (1-EER). Detections were accepted as correct if the intersection of the detected bounding box and the ground-truth bounding box exceeded 50% of the union of the two boxes. For comparison, 1-EER was compared with that of the original DOT [21], to demonstrate the suitability of the proposed method for class-specific object detection, and with different template representations (contour-based [14] and gradient-based [20]). Experiments were run on a quad-core CPU operating at 2.6 GHz.

3.1 Car Detection

UIUC single-scale car. This dataset contains side views of cars. The training set consists of 550 car images and 500 non-car images of a fixed size ( pixels). The test set consists of 170 images containing 200 cars, including partially occluded cars and cluttered backgrounds under challenging illumination. In the training step, we used only the 550 car images to construct the weight templates and DOTs, among which 20 representative DOTs were selected. In the testing step, the test images were scanned with a step of 4 pixels. Combining weight templates with the original DOTs gave a better result (Fig. 7a) than both the original DOT and the contour-based approach [14], which uses a combination of edge magnitudes and orientations for matching; using mask images further increased the 1-EER (Table 1). For this dataset, training took about 4 s, and the time required to test an image ( pixels) was 59 ms.

CalTech car rear.
This dataset contains images of rear views of cars. For training, the CalTech cars-markus dataset, which contains 12 car images taken in parking lots, was used. From this dataset we constructed weight templates and DOTs, among which 18 representative DOTs were selected automatically. For testing, the CalTech cars-brad dataset was used; it contains 526 car images. Each image contains cars on a road, and scale variation among the images is significant. Because the test dataset is not annotated, we defined the ground-truth bounding boxes manually. Therefore, the performance of the proposed method (Fig. 7b) can only be compared directly with that of the original DOT, not with those of other methods. Combining weight templates with the original DOTs significantly improved the true-positive detection rate from 54.56% (original DOT) to 82.89%; using mask images further increased this rate to 92.02%. In this experiment, training took about 3 s, and in scale space ( pixels) the testing time was 400 ms.
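The 50% acceptance criterion used throughout these experiments is the standard intersection-over-union test; a minimal sketch, with boxes as (x1, y1, x2, y2) tuples (an assumed convention):

```python
def iou(box_a, box_b):
    # Intersection-over-union of two axis-aligned boxes; a detection is
    # accepted when this ratio with the ground-truth box exceeds 0.5.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))           # 1.0
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 3)) # 0.333
```

Dividing by the union rather than either single area penalizes both oversized and undersized detections, so a box must agree with the ground truth in both position and extent to pass the 0.5 bar.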

Fig. 7. Detection results on the (a) UIUC car (side) and (b) CalTech car (rear) datasets (green rectangle: ground truth; blue rectangle: detection result).

3.2 Pedestrian Detection

USC-A pedestrian. This dataset includes 205 front- or rear-view images of 313 pedestrians in upright standing poses. The pedestrians are not occluded, but the backgrounds are cluttered. Because this dataset consists of a test set only, the MIT pedestrian dataset of 924 training images [15] was used for training, from which 12 representative DOTs were selected. The test images were scanned using windows at scales from 0.6 to 1.2. The detection results were compared with the ground truth given in [23] using the criteria proposed in [2]. Combining weight templates with the original DOTs produced better results (Fig. 8) than using the original DOTs alone; using mask images considerably improved detection performance and gave a better result than that of [14] (Table 1). For this dataset, training took about 15 s. Because various object scales were considered during matching, the

testing time to detect all objects in an image was around 350 ms.

Fig. 8. Detection results on the USC-A pedestrian dataset (green rectangle: ground truth; blue rectangle: detection result of the proposed method).

Table 1. 1-EER of the original DOT and the proposed method on the UIUC single-scale car and USC-A pedestrian databases.

  Method                                     UIUC    USC-A
  [14]                                       83.0%   80.0%
  Original DOT [21]                          74.0%   60.0%
  Proposed, weight templates (without mask)  92.0%   73.8%
  Proposed, weight templates (with mask)     93.5%   82.4%

INRIA person. We also tested the proposed method on the INRIA person dataset, a more challenging pedestrian dataset. It provides both training and testing sets containing positive samples ( pixels) and negative images that contain no humans. In the training step, we used all 2416 positive training images and selected 18 representative DOTs. We used the 1126 positive testing images and negative windows obtained by shifting the detection window by 8 pixels over the negative testing images, all of which

are available in the dataset.

Fig. 9. Sample images from the INRIA person dataset.

The positive testing images consist of pedestrian images of a fixed size cropped from a varied set of personal photos (Fig. 9). The subjects are always upright, but the images include some partial occlusions and a wide range of variation in pose, appearance, clothing, and background. To measure performance, we used the miss rate at 10^-4 False Positives Per Window (FPPW) as a reference point, because it is the operating point reported by the methods we compare against. The proposed method was significantly better than the other feature-based methods and comparable to HOG [20] (Table 2). The proposed method is faster than HOG in both training and testing: training takes several seconds for the proposed method versus several hours for HOG, and processing 4000 windows took around 430 ms for the proposed method versus almost 1 s for HOG.

Table 2. Miss rate at 10^-4 FPPW of the proposed method and existing methods on the INRIA person database.

  Method               Miss rate at 10^-4 FPPW
  HOG [20]             11.0%
  Wavelet [15, 16]     27.0%
  PCA-SIFT [18]        >> 20.0%
  Shape Context [19]   >> 20.0%
  Proposed method      17.0%

4 Conclusion

We have introduced the class-specific weighted Dominant Orientation Template (DOT), which constructs class-specific templates by combining representative DOTs with weight templates. The representative DOTs of an object class are selected by an automatic selection algorithm, and the weight templates are obtained using vertical, horizontal, and diagonal 2D Haar wavelets. A modified similarity measure, defined as the product of the representative DOTs and the weight templates, yields the class-specific weighted DOTs. We compared the performance of

the proposed method with those of existing methods for detecting cars and pedestrians. The results show that the proposed method was more accurate than existing template-matching methods and comparably accurate to HOG. Using the concept of DOT enables fast object detection in an input image, and generating the representative DOTs and weight templates is also fast because the process is relatively simple.

Acknowledgement. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. ).

References

1. Agarwal, S., Roth, D.: Learning a Sparse Representation for Object Detection. ECCV (2002)
2. Leibe, B., Seemann, E., Schiele, B.: Pedestrian Detection in Crowded Scenes. Proc. IEEE Conference on Computer Vision and Pattern Recognition (2005)
3. Gavrila, D., Philomin, V.: Real-Time Object Detection for Smart Vehicles. ICCV (1999)
4. Gavrila, D.M.: A Bayesian, Exemplar-Based Approach to Hierarchical Shape Matching. IEEE Trans. Pattern Anal. Mach. Intell. (2007)
5. Thanh, N.D., Ogunbona, P., Li, W.: Human Detection Based on Weighted Template Matching. Proc. IEEE International Conference on Multimedia and Expo (2009)
6. Barrow, H.G., Tenenbaum, J.M., Bolles, R.C., Wolf, H.C.: Parametric Correspondence and Chamfer Matching: Two New Techniques for Image Matching. Proc. 5th International Joint Conference on Artificial Intelligence (1977)
7. Huttenlocher, D.P., Klanderman, G.A., Rucklidge, W.J.: Comparing Images Using the Hausdorff Distance. IEEE Transactions on Pattern Analysis and Machine Intelligence (1993)
8. Rucklidge, W.J.: Locating Objects Using the Hausdorff Distance. Proc. Fifth International Conference on Computer Vision (1995)
9. Borgefors, G.: Hierarchical Chamfer Matching: A Parametric Edge Matching Algorithm. IEEE Trans. Pattern Anal. Mach. Intell. (1988)
10. Rosin, P.L., West, G.A.W.: Salience Distance Transforms. Graph. Models Image Process. (1995)
11. Gavrila, D.M.: Multi-Feature Hierarchical Template Matching Using Distance Transforms. Proc. 14th International Conference on Pattern Recognition (1998)
12. Olson, C.F., Huttenlocher, D.P.: Automatic Target Recognition by Matching Oriented Edge Pixels. IEEE Transactions on Image Processing (1997)
13. Jain, A.K., Zhong, Y., Lakshmanan, S.: Object Matching Using Deformable Templates. IEEE Trans. Pattern Anal. Mach. Intell. (1996)
14. Thanh, N.D., Li, W., Ogunbona, P.: An Improved Template Matching Method for Object Detection. Proc. 9th Asian Conference on Computer Vision (2010)
15. Mohan, A., Papageorgiou, C., Poggio, T.: Example-Based Object Detection in Images by Components. IEEE Trans. Pattern Anal. Mach. Intell. (2001)
16. Viola, P., Jones, M.J., Snow, D.: Detecting Pedestrians Using Patterns of Motion and Appearance. Proc. Ninth IEEE International Conference on Computer Vision (2003)
17. Lowe, D.G.: Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vision (2004)
18. Ke, Y., Sukthankar, R.: PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. CVPR (2004)
19. Belongie, S., Malik, J., Puzicha, J.: Matching Shapes. ICCV (2001)
20. Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. Proc. IEEE Conference on Computer Vision and Pattern Recognition (2005)
21. Hinterstoisser, S., Lepetit, V., Ilic, S., Fua, P., Navab, N.: Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects. Proc. IEEE Conference on Computer Vision and Pattern Recognition (2010)
22. Papageorgiou, C., Poggio, T.: A Trainable System for Object Detection. Int. J. Comput. Vision (2000)
23. Wu, B., Nevatia, R.: Detection of Multiple, Partially Occluded Humans in a Single Image by Bayesian Combination of Edgelet Part Detectors. ICCV (2005)
24. Leibe, B., Leonardis, A., Schiele, B.: Robust Object Detection with Interleaved Categorization and Segmentation. Int. J. Comput. Vision (2008)


FAST HUMAN DETECTION USING TEMPLATE MATCHING FOR GRADIENT IMAGES AND ASC DESCRIPTORS BASED ON SUBTRACTION STEREO

FAST HUMAN DETECTION USING TEMPLATE MATCHING FOR GRADIENT IMAGES AND ASC DESCRIPTORS BASED ON SUBTRACTION STEREO FAST HUMAN DETECTION USING TEMPLATE MATCHING FOR GRADIENT IMAGES AND ASC DESCRIPTORS BASED ON SUBTRACTION STEREO Makoto Arie, Masatoshi Shibata, Kenji Terabayashi, Alessandro Moro and Kazunori Umeda Course

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

SURF. Lecture6: SURF and HOG. Integral Image. Feature Evaluation with Integral Image

SURF. Lecture6: SURF and HOG. Integral Image. Feature Evaluation with Integral Image SURF CSED441:Introduction to Computer Vision (2015S) Lecture6: SURF and HOG Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Speed Up Robust Features (SURF) Simplified version of SIFT Faster computation but

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

Relational HOG Feature with Wild-Card for Object Detection

Relational HOG Feature with Wild-Card for Object Detection Relational HOG Feature with Wild-Card for Object Detection Yuji Yamauchi 1, Chika Matsushima 1, Takayoshi Yamashita 2, Hironobu Fujiyoshi 1 1 Chubu University, Japan, 2 OMRON Corporation, Japan {yuu, matsu}@vision.cs.chubu.ac.jp,

More information

A Cascade of Feed-Forward Classifiers for Fast Pedestrian Detection

A Cascade of Feed-Forward Classifiers for Fast Pedestrian Detection A Cascade of eed-orward Classifiers for ast Pedestrian Detection Yu-ing Chen,2 and Chu-Song Chen,3 Institute of Information Science, Academia Sinica, aipei, aiwan 2 Dept. of Computer Science and Information

More information

HOG-based Pedestriant Detector Training

HOG-based Pedestriant Detector Training HOG-based Pedestriant Detector Training evs embedded Vision Systems Srl c/o Computer Science Park, Strada Le Grazie, 15 Verona- Italy http: // www. embeddedvisionsystems. it Abstract This paper describes

More information

https://en.wikipedia.org/wiki/the_dress Recap: Viola-Jones sliding window detector Fast detection through two mechanisms Quickly eliminate unlikely windows Use features that are fast to compute Viola

More information

Pedestrian Detection in Infrared Images based on Local Shape Features

Pedestrian Detection in Infrared Images based on Local Shape Features Pedestrian Detection in Infrared Images based on Local Shape Features Li Zhang, Bo Wu and Ram Nevatia University of Southern California Institute for Robotics and Intelligent Systems Los Angeles, CA 90089-0273

More information

Real-Time Human Detection using Relational Depth Similarity Features

Real-Time Human Detection using Relational Depth Similarity Features Real-Time Human Detection using Relational Depth Similarity Features Sho Ikemura, Hironobu Fujiyoshi Dept. of Computer Science, Chubu University. Matsumoto 1200, Kasugai, Aichi, 487-8501 Japan. si@vision.cs.chubu.ac.jp,

More information

Object Category Detection. Slides mostly from Derek Hoiem

Object Category Detection. Slides mostly from Derek Hoiem Object Category Detection Slides mostly from Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical template matching with sliding window Part-based Models

More information

Epithelial rosette detection in microscopic images

Epithelial rosette detection in microscopic images Epithelial rosette detection in microscopic images Kun Liu,3, Sandra Ernst 2,3, Virginie Lecaudey 2,3 and Olaf Ronneberger,3 Department of Computer Science 2 Department of Developmental Biology 3 BIOSS

More information

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Mobile Human Detection Systems based on Sliding Windows Approach-A Review Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg

More information

Category vs. instance recognition

Category vs. instance recognition Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building

More information

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds 9 1th International Conference on Document Analysis and Recognition Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds Weihan Sun, Koichi Kise Graduate School

More information

Articulated Pose Estimation with Flexible Mixtures-of-Parts

Articulated Pose Estimation with Flexible Mixtures-of-Parts Articulated Pose Estimation with Flexible Mixtures-of-Parts PRESENTATION: JESSE DAVIS CS 3710 VISUAL RECOGNITION Outline Modeling Special Cases Inferences Learning Experiments Problem and Relevance Problem:

More information

High-Level Fusion of Depth and Intensity for Pedestrian Classification

High-Level Fusion of Depth and Intensity for Pedestrian Classification High-Level Fusion of Depth and Intensity for Pedestrian Classification Marcus Rohrbach 1,3, Markus Enzweiler 2 and Dariu M. Gavrila 1,4 1 Environment Perception, Group Research, Daimler AG, Ulm, Germany

More information

MULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION

MULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION MULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION Panca Mudjirahardjo, Rahmadwati, Nanang Sulistiyanto and R. Arief Setyawan Department of Electrical Engineering, Faculty of

More information

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES Pin-Syuan Huang, Jing-Yi Tsai, Yu-Fang Wang, and Chun-Yi Tsai Department of Computer Science and Information Engineering, National Taitung University,

More information

Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation

Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation M. Blauth, E. Kraft, F. Hirschenberger, M. Böhm Fraunhofer Institute for Industrial Mathematics, Fraunhofer-Platz 1,

More information

Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects

Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects Stefan Hinterstoisser 1, Vincent Lepetit 2, Slobodan Ilic 1, Pascal Fua 2, Nassir Navab 1 1 Department of Computer Science,

More information

Research on Robust Local Feature Extraction Method for Human Detection

Research on Robust Local Feature Extraction Method for Human Detection Waseda University Doctoral Dissertation Research on Robust Local Feature Extraction Method for Human Detection TANG, Shaopeng Graduate School of Information, Production and Systems Waseda University Feb.

More information

Fast Human Detection Algorithm Based on Subtraction Stereo for Generic Environment

Fast Human Detection Algorithm Based on Subtraction Stereo for Generic Environment Fast Human Detection Algorithm Based on Subtraction Stereo for Generic Environment Alessandro Moro, Makoto Arie, Kenji Terabayashi and Kazunori Umeda University of Trieste, Italy / CREST, JST Chuo University,

More information

Window based detectors

Window based detectors Window based detectors CS 554 Computer Vision Pinar Duygulu Bilkent University (Source: James Hays, Brown) Today Window-based generic object detection basic pipeline boosting classifiers face detection

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Recognizing hand-drawn images using shape context

Recognizing hand-drawn images using shape context Recognizing hand-drawn images using shape context Gyozo Gidofalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract The objective

More information

Hand Posture Recognition Using Adaboost with SIFT for Human Robot Interaction

Hand Posture Recognition Using Adaboost with SIFT for Human Robot Interaction Hand Posture Recognition Using Adaboost with SIFT for Human Robot Interaction Chieh-Chih Wang and Ko-Chih Wang Department of Computer Science and Information Engineering Graduate Institute of Networking

More information

Implementation of Human detection system using DM3730

Implementation of Human detection system using DM3730 Implementation of Human detection system using DM3730 Amaraneni Srilaxmi 1, Shaik Khaddar Sharif 2 1 VNR Vignana Jyothi Institute of Engineering & Technology, Bachupally, Hyderabad, India 2 VNR Vignana

More information

Fitting: The Hough transform

Fitting: The Hough transform Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data

More information

Histogram of Oriented Gradients (HOG) for Object Detection

Histogram of Oriented Gradients (HOG) for Object Detection Histogram of Oriented Gradients (HOG) for Object Detection Navneet DALAL Joint work with Bill TRIGGS and Cordelia SCHMID Goal & Challenges Goal: Detect and localise people in images and videos n Wide variety

More information

Selection of Scale-Invariant Parts for Object Class Recognition

Selection of Scale-Invariant Parts for Object Class Recognition Selection of Scale-Invariant Parts for Object Class Recognition Gy. Dorkó and C. Schmid INRIA Rhône-Alpes, GRAVIR-CNRS 655, av. de l Europe, 3833 Montbonnot, France fdorko,schmidg@inrialpes.fr Abstract

More information

Previously. Window-based models for generic object detection 4/11/2011

Previously. Window-based models for generic object detection 4/11/2011 Previously for generic object detection Monday, April 11 UT-Austin Instance recognition Local features: detection and description Local feature matching, scalable indexing Spatial verification Intro to

More information

Human detection solution for a retail store environment

Human detection solution for a retail store environment FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO Human detection solution for a retail store environment Vítor Araújo PREPARATION OF THE MSC DISSERTATION Mestrado Integrado em Engenharia Eletrotécnica

More information

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking Feature descriptors Alain Pagani Prof. Didier Stricker Computer Vision: Object and People Tracking 1 Overview Previous lectures: Feature extraction Today: Gradiant/edge Points (Kanade-Tomasi + Harris)

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

Human Motion Detection and Tracking for Video Surveillance

Human Motion Detection and Tracking for Video Surveillance Human Motion Detection and Tracking for Video Surveillance Prithviraj Banerjee and Somnath Sengupta Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur,

More information

Multi-Object Tracking Based on Tracking-Learning-Detection Framework

Multi-Object Tracking Based on Tracking-Learning-Detection Framework Multi-Object Tracking Based on Tracking-Learning-Detection Framework Songlin Piao, Karsten Berns Robotics Research Lab University of Kaiserslautern Abstract. This paper shows the framework of robust long-term

More information

Object recognition (part 1)

Object recognition (part 1) Recognition Object recognition (part 1) CSE P 576 Larry Zitnick (larryz@microsoft.com) The Margaret Thatcher Illusion, by Peter Thompson Readings Szeliski Chapter 14 Recognition What do we mean by object

More information

Discriminative classifiers for image recognition

Discriminative classifiers for image recognition Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study

More information

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14 Announcements Computer Vision I CSE 152 Lecture 14 Homework 3 is due May 18, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images Given

More information

Multiple-Person Tracking by Detection

Multiple-Person Tracking by Detection http://excel.fit.vutbr.cz Multiple-Person Tracking by Detection Jakub Vojvoda* Abstract Detection and tracking of multiple person is challenging problem mainly due to complexity of scene and large intra-class

More information

Fitting: The Hough transform

Fitting: The Hough transform Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data

More information

Object Category Detection: Sliding Windows

Object Category Detection: Sliding Windows 03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio

More information

Towards Practical Evaluation of Pedestrian Detectors

Towards Practical Evaluation of Pedestrian Detectors MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Towards Practical Evaluation of Pedestrian Detectors Mohamed Hussein, Fatih Porikli, Larry Davis TR2008-088 April 2009 Abstract Despite recent

More information

Image Matching Using Distance Transform

Image Matching Using Distance Transform Abdul Ghafoor 1, Rao Naveed Iqbal 2, and Shoab Khan 2 1 Department of Electrical Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology, Rawalpindi,

More information

Vehicle Detection Method using Haar-like Feature on Real Time System

Vehicle Detection Method using Haar-like Feature on Real Time System Vehicle Detection Method using Haar-like Feature on Real Time System Sungji Han, Youngjoon Han and Hernsoo Hahn Abstract This paper presents a robust vehicle detection approach using Haar-like feature.

More information

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi hrazvi@stanford.edu 1 Introduction: We present a method for discovering visual hierarchy in a set of images. Automatically grouping

More information

A New Strategy of Pedestrian Detection Based on Pseudo- Wavelet Transform and SVM

A New Strategy of Pedestrian Detection Based on Pseudo- Wavelet Transform and SVM A New Strategy of Pedestrian Detection Based on Pseudo- Wavelet Transform and SVM M.Ranjbarikoohi, M.Menhaj and M.Sarikhani Abstract: Pedestrian detection has great importance in automotive vision systems

More information

Fast and Stable Human Detection Using Multiple Classifiers Based on Subtraction Stereo with HOG Features

Fast and Stable Human Detection Using Multiple Classifiers Based on Subtraction Stereo with HOG Features 2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China Fast and Stable Human Detection Using Multiple Classifiers Based on

More information

Category-level localization

Category-level localization Category-level localization Cordelia Schmid Recognition Classification Object present/absent in an image Often presence of a significant amount of background clutter Localization / Detection Localize object

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

Local Image Features

Local Image Features Local Image Features Computer Vision CS 143, Brown Read Szeliski 4.1 James Hays Acknowledgment: Many slides from Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial This section: correspondence and alignment

More information

Histogram of Oriented Phase (HOP): A New Descriptor Based on Phase Congruency

Histogram of Oriented Phase (HOP): A New Descriptor Based on Phase Congruency University of Dayton ecommons Electrical and Computer Engineering Faculty Publications Department of Electrical and Computer Engineering 5-2016 Histogram of Oriented Phase (HOP): A New Descriptor Based

More information

Human Object Classification in Daubechies Complex Wavelet Domain

Human Object Classification in Daubechies Complex Wavelet Domain Human Object Classification in Daubechies Complex Wavelet Domain Manish Khare 1, Rajneesh Kumar Srivastava 1, Ashish Khare 1(&), Nguyen Thanh Binh 2, and Tran Anh Dien 2 1 Image Processing and Computer

More information

Lecture 12 Recognition. Davide Scaramuzza

Lecture 12 Recognition. Davide Scaramuzza Lecture 12 Recognition Davide Scaramuzza Oral exam dates UZH January 19-20 ETH 30.01 to 9.02 2017 (schedule handled by ETH) Exam location Davide Scaramuzza s office: Andreasstrasse 15, 2.10, 8050 Zurich

More information

TA Section 7 Problem Set 3. SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999)

TA Section 7 Problem Set 3. SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999) TA Section 7 Problem Set 3 SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999) Sam Corbett-Davies TA Section 7 02-13-2014 Distinctive Image Features from Scale-Invariant

More information

Fitting: The Hough transform

Fitting: The Hough transform Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data

More information

Detection of a Single Hand Shape in the Foreground of Still Images

Detection of a Single Hand Shape in the Foreground of Still Images CS229 Project Final Report Detection of a Single Hand Shape in the Foreground of Still Images Toan Tran (dtoan@stanford.edu) 1. Introduction This paper is about an image detection system that can detect

More information

Detecting Pedestrians Using Patterns of Motion and Appearance

Detecting Pedestrians Using Patterns of Motion and Appearance Detecting Pedestrians Using Patterns of Motion and Appearance Paul Viola Michael J. Jones Daniel Snow Microsoft Research Mitsubishi Electric Research Labs Mitsubishi Electric Research Labs viola@microsoft.com

More information

Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization

Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization Jung H. Oh, Gyuho Eoh, and Beom H. Lee Electrical and Computer Engineering, Seoul National University,

More information

Specular 3D Object Tracking by View Generative Learning

Specular 3D Object Tracking by View Generative Learning Specular 3D Object Tracking by View Generative Learning Yukiko Shinozuka, Francois de Sorbier and Hideo Saito Keio University 3-14-1 Hiyoshi, Kohoku-ku 223-8522 Yokohama, Japan shinozuka@hvrl.ics.keio.ac.jp

More information

Fast Image Matching Using Multi-level Texture Descriptor

Fast Image Matching Using Multi-level Texture Descriptor Fast Image Matching Using Multi-level Texture Descriptor Hui-Fuang Ng *, Chih-Yang Lin #, and Tatenda Muindisi * Department of Computer Science, Universiti Tunku Abdul Rahman, Malaysia. E-mail: nghf@utar.edu.my

More information

Computer Science Faculty, Bandar Lampung University, Bandar Lampung, Indonesia

Computer Science Faculty, Bandar Lampung University, Bandar Lampung, Indonesia Application Object Detection Using Histogram of Oriented Gradient For Artificial Intelegence System Module of Nao Robot (Control System Laboratory (LSKK) Bandung Institute of Technology) A K Saputra 1.,

More information

High Level Computer Vision. Sliding Window Detection: Viola-Jones-Detector & Histogram of Oriented Gradients (HOG)

High Level Computer Vision. Sliding Window Detection: Viola-Jones-Detector & Histogram of Oriented Gradients (HOG) High Level Computer Vision Sliding Window Detection: Viola-Jones-Detector & Histogram of Oriented Gradients (HOG) Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de http://www.d2.mpi-inf.mpg.de/cv

More information

Visual Detection and Species Classification of Orchid Flowers

Visual Detection and Species Classification of Orchid Flowers 14-22 MVA2015 IAPR International Conference on Machine Vision Applications, May 18-22, 2015, Tokyo, JAPAN Visual Detection and Species Classification of Orchid Flowers Steven Puttemans & Toon Goedemé KU

More information

Ensemble of Bayesian Filters for Loop Closure Detection

Ensemble of Bayesian Filters for Loop Closure Detection Ensemble of Bayesian Filters for Loop Closure Detection Mohammad Omar Salameh, Azizi Abdullah, Shahnorbanun Sahran Pattern Recognition Research Group Center for Artificial Intelligence Faculty of Information

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

2D Image Processing Feature Descriptors

2D Image Processing Feature Descriptors 2D Image Processing Feature Descriptors Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Overview

More information

Haar Wavelets and Edge Orientation Histograms for On Board Pedestrian Detection

Haar Wavelets and Edge Orientation Histograms for On Board Pedestrian Detection Haar Wavelets and Edge Orientation Histograms for On Board Pedestrian Detection David Gerónimo, Antonio López, Daniel Ponsa, and Angel D. Sappa Computer Vision Center, Universitat Autònoma de Barcelona

More information

Part-based models. Lecture 10

Part-based models. Lecture 10 Part-based models Lecture 10 Overview Representation Location Appearance Generative interpretation Learning Distance transforms Other approaches using parts Felzenszwalb, Girshick, McAllester, Ramanan

More information

Feature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1

Feature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1 Feature Detection Raul Queiroz Feitosa 3/30/2017 Feature Detection 1 Objetive This chapter discusses the correspondence problem and presents approaches to solve it. 3/30/2017 Feature Detection 2 Outline

More information

A New Feature Local Binary Patterns (FLBP) Method

A New Feature Local Binary Patterns (FLBP) Method A New Feature Local Binary Patterns (FLBP) Method Jiayu Gu and Chengjun Liu The Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA Abstract - This paper presents

More information

Designing Applications that See Lecture 7: Object Recognition

Designing Applications that See Lecture 7: Object Recognition stanford hci group / cs377s Designing Applications that See Lecture 7: Object Recognition Dan Maynes-Aminzade 29 January 2008 Designing Applications that See http://cs377s.stanford.edu Reminders Pick up

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

Pedestrian Detection with Occlusion Handling

Pedestrian Detection with Occlusion Handling Pedestrian Detection with Occlusion Handling Yawar Rehman 1, Irfan Riaz 2, Fan Xue 3, Jingchun Piao 4, Jameel Ahmed Khan 5 and Hyunchul Shin 6 Department of Electronics and Communication Engineering, Hanyang

More information

Fast Calculation of Histogram of Oriented Gradient Feature by Removing Redundancy in Overlapping Block *

Fast Calculation of Histogram of Oriented Gradient Feature by Removing Redundancy in Overlapping Block * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING XX, XXX-XXX (2013) Fast Calculation of Histogram of Oriented Gradient Feature by Removing Redundancy in Overlapping Block * SOOJIN KIM AND KYEONGSOON CHO

More information

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

Fuzzy based Multiple Dictionary Bag of Words for Image Classification Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image

More information

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE Hongyu Liang, Jinchen Wu, and Kaiqi Huang National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science

More information

Fast Car Detection Using Image Strip Features

Fast Car Detection Using Image Strip Features Fast Car Detection Using Image Strip Features Wei Zheng 1, 2, 3, Luhong Liang 1,2 1 Key Lab of Intelligent Information Processing, Chinese Academy of Sciences (CAS), Beijing, 100190, China 2 Institute

More information

Subject-Oriented Image Classification based on Face Detection and Recognition

Subject-Oriented Image Classification based on Face Detection and Recognition 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach

Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach Vandit Gajjar gajjar.vandit.381@ldce.ac.in Ayesha Gurnani gurnani.ayesha.52@ldce.ac.in Yash Khandhediya khandhediya.yash.364@ldce.ac.in

More information