High Performance Object Detection by Collaborative Learning of Joint Ranking of Granules Features


Chang Huang and Ram Nevatia
University of Southern California, Institute for Robotics and Intelligent Systems
Los Angeles, CA 90089, USA
{huangcha

Abstract

Object detection remains an important but challenging task in computer vision. We present a method that combines high accuracy with high efficiency. We adopt simplified forms of APCF features [3], which we term Joint Ranking of Granules (JRoG) features; each feature outputs a discrete value formed by uniting the binary ranking results of pairwise granules in the image. We propose a novel collaborative learning method for JRoG features, which consists of a Simulated Annealing (SA) module and an incremental feature selection module. The two complementary modules collaborate to efficiently search the formidably large JRoG feature space for discriminative features, which are fed into a boosted cascade for object detection. To cope with occlusions in crowded environments, we employ the strategy of part-based detection, as in [19], but propose a new dynamic search method to improve the Bayesian combination of the part detection results. Experiments on several challenging data sets show that our approach achieves not only considerable improvement in detection accuracy but also major improvements in computational efficiency; on a Xeon 3 GHz computer, with only a single thread, it can process a million scanning windows per second, sufficing for many practical real-time detection tasks.

1. Introduction

Object detection is a fundamental task in computer vision. Although considerable progress has been achieved in recent years, detection of objects in real-life images remains a challenging task. We focus on pedestrian detection examples in this paper, though our methods should apply to other objects as well.
To illustrate the difficulty of the task, consider the images shown in Fig. 1; the one on the left is taken from the i-LIDS subway data set [1], which includes considerable inter-occlusion between pedestrians, and the one on the right from the Zurich mobile pedestrian data set (ETHZ) [4], in which the camera is on a continuously moving platform, people are crowded, illumination changes significantly and the background is rather cluttered.

Figure 1. Sample images of two challenging data sets (left: i-LIDS Subway Set; right: Zurich Mobile Set).

There is a vast literature on techniques of object detection, and specifically pedestrian detection. Learning-based methods have come to be dominant; key issues here are the features and the learning algorithms that are used. Features can be global or local. Global features, such as edge templates [6] and shape models [5], can be highly discriminative but are sensitive to changes in overall shape due to occlusions and articulations. Local features, such as wavelet descriptors [12, 11], SIFT-like features [10] and Histograms of Oriented Gradients (HOG) [2], are more flexible, but a set of them needs to be selected and combined in some way, typically via a learning algorithm. There have also been efforts to combine a variety of complementary features, such as Wu and Nevatia's Heterogeneous Local Features [21], Schwartz et al.'s edge-based features augmented by texture and color [14] and Wang et al.'s [18] HOG + Local Binary Pattern (LBP) approach. Leibe et al. [9] combine both local and global cues via a probabilistic top-down segmentation.

One thread among the successful approaches has been to build on the pioneering work of Viola and Jones for face detection [16]. Enhancements include the development of new features, such as motion-enhanced Haar-like features [17], Edgelet features [19] and covariance matrix descriptors [15]. The classifier structure has also been enhanced beyond the original cascade, for example by the trees in [20].
We present a new method that also follows this paradigm but uses a very different kind of feature set which, in turn, requires very different learning techniques. The resulting detectors exhibit considerable improvements in accuracy while also reducing the computational costs substantially.

In a recent workshop paper, Duan et al. [3] introduced a novel class of features called Associated Pairing Comparison Features (APCF), which are built on earlier granule features that were demonstrated for face detection [7]. APCF features comprise simple comparisons of simple image properties, such as color or gradient, in small regions (called granules) of the detection window. As an APCF feature is defined by a sequence of unrestricted granules, the feature space can be very large, hence normal AdaBoost learning techniques, which require exhaustive enumeration of features, are inapplicable; instead, Duan et al. use a heuristic algorithm for feature selection. Our work builds on the concept of APCF but considers a simpler, special case of it. We term these features Joint Ranking of Granules, as our comparison thresholds are set to zero (the name will become more obvious when we introduce further details of the features). Based on Duan et al.'s heuristic algorithm, we propose a collaborative learning algorithm enhanced by a simulated annealing (SA) step in combination with a Real AdaBoost algorithm, and demonstrate the advantage of the new method.

Duan et al. [3] show impressive results for the task of pedestrian detection on some standard data sets. However, their method does not explicitly account for inter-object occlusions and so may fail in more crowded environments. We incorporate a part-based approach in our work: we learn detectors for selected parts of the human body, and the final detection of pedestrians is based on a joint inference over part combinations. This step of our method builds on earlier work of Wu and Nevatia [19]; however, we design a new dynamic search approach instead of the static search used in previous work.
While our method builds on the earlier work described above, we demonstrate that it achieves considerable improvements in accuracy compared to state-of-the-art detectors, which are already very good, and that this gain does not come at the cost of increased computation; in fact, the detector is significantly more efficient than the others. We provide a detailed analysis of performance later in the paper.

The rest of this paper is organized as follows: Section 2 outlines our approach; Section 3 introduces the JRoG feature; Section 4 elaborates the collaborative learning algorithm for JRoG features; Section 5 describes our part-based detection method; Section 6 presents experimental results on several challenging testing sets; Section 7 concludes the paper.

2. Outline of Our Approach

In this paper, Joint Ranking of Granules (JRoG) features are adopted as the descriptors; they are in fact APCF features [3] simplified by excluding gradient granules and setting all comparison thresholds to zero. As illustrated in Fig. 2, a JRoG feature unites the binary ranking results of several granule pairs, which are selected among thousands of grey-level granules in the granular space [7]. Such unrestricted combination endows JRoG features with remarkable flexibility but also makes the conventional exhaustive search method inapplicable due to the tremendous size of the entire feature set.

Figure 2. Computation of a 3-bit JRoG feature. An image patch is converted into the grey-level granular space, in which three pairs of granules are ranked respectively. A ranking result of 1 indicates that the granule (the solid square) is brighter than its opponent (the hollow square), and 0 otherwise. Finally, the three bits of ranking results constitute the output of this JRoG feature.
Inspired by the heuristic algorithm used in [3] for APCF features, we propose a collaborative learning approach to alleviate the difficulty of learning such combinatorial features; it comprises an SA module and an incremental feature selection module. The former samples the vast feature space in a probabilistic way, while the latter progressively filters out ineffective features in an enumerated set and finally selects an optimal one in a deterministic way. The two complementary modules collaborate to efficiently search the formidably large JRoG feature space and select discriminative JRoG features for the training of the domain-partitioning weak classifiers proposed by Schapire and Singer [13]. Moreover, we improve Wu and Nevatia's work [19] on Bayesian combination of part detection results by means of a more effective partition of body parts in crowded environments and a dynamic search method for optimal combination results.

3. JRoG Features

JRoG features are a type of combinatorial feature whose outputs are discrete index numbers. Their elementary features come from the granular space of grey-level image instances. The granular space [7], denoted by $G$, is a computationally efficient multi-resolution space extended from the image instance space $X$, whose bases are granules that observe an image instance at different locations and scales. Denoting the intensity of pixel $(u, v)$ in an instance $x \in X$ by $x(u, v)$, a granule is defined by a triplet $(u, v, s)$ as

$$g_{u,v,s}(x) = \frac{1}{2^s \times 2^s} \sum_{i=0}^{2^s-1} \sum_{j=0}^{2^s-1} x(u+i, v+j), \tag{1}$$

which is the intensity average of the pixels in a square whose top-left point is $(u, v)$ and whose width is $2^s$. Fig. 3 illustrates four granules of different scales on a 16 × 16 image instance. In this paper, we choose granules of scale from 0 to 3 to constitute the granular space. Notice that the granular space of a $w \times h$ image instance has $(w - (2^s - 1)) \times (h - (2^s - 1))$ granules of scale $s$.

Figure 3. Four granules of scale from 0 to 3 on a 16 × 16 image.

By ranking two arbitrary granules, an efficient bi-partition of the instance space $X$ can be obtained as

$$r\big(g_i(x), g_j(x)\big) = \begin{cases} 1, & \text{if } g_i(x) > g_j(x), \\ 0, & \text{otherwise,} \end{cases} \tag{2}$$

and $k$ such bi-partitions jointly define a $k$-bit JRoG feature that provides a $2^k$-partition of $X$:

$$J(x : \mathbf{g}) = [b_0 b_1 \cdots b_{k-1}] \in \{0, 1, \ldots, 2^k - 1\}, \quad \mathbf{g} = (g_0, g_1, \ldots, g_{2k-1}), \quad b_i = r\big(g_{2i}(x), g_{2i+1}(x)\big), \tag{3}$$

in which $\mathbf{g}$ is a $2k$-dimensional subspace of $G$ that defines the JRoG feature by its $2k$ basis granules, and the $i$-th bi-partition $b_i$ is given by the ranking result between two consecutive granules in $\mathbf{g}$. In this way, the instance space $X$ is divided into $2^k$ disjoint blocks $\{X_0, \ldots, X_{2^k-1}\}$, and an instance $x$ falls into block $X_i$ if and only if $J(x) = i$.

Essentially, JRoG features are special decision trees in which all nodes at the same level share one bi-partition. They are derived from APCF features [3] by setting the comparison threshold to zero and excluding gradient granules. This simplification helps JRoG features achieve even higher computational efficiency. In the following sections, the JRoG feature $J(x : \mathbf{g})$ is sometimes simply termed $\mathbf{g}$, since it is defined by this granular subspace.

4. Collaborative Learning of JRoG Features for Real AdaBoost

Let $S = \{(x_i, y_i, w_i) : x_i \in X, y_i = \pm 1, w_i \in \mathbb{R}\}$ be the training sample set, where $x_i$ is an instance, $y_i$ is its label and $w_i$ is the sample weight. Schapire and Singer [13] point out that in the Real AdaBoost algorithm, given a domain partition function such as the JRoG feature $J(x : \mathbf{g})$, the corresponding optimal weak classifier $h(x)$ receives a normalization factor of sample weights given by

$$Z(S, h(x)) = 2 \sum_j \sqrt{W_j^{+} W_j^{-}}, \tag{4}$$

where $W_j^{b}$ is the weight sum of all samples labeled as $b$ and falling into the $j$-th block. In other words, the learning of domain-partitioning weak classifiers for Real AdaBoost is now reduced to selecting optimal JRoG features that minimize this normalization factor (hereafter, we abbreviate $Z(S, h(x))$ to $Z(S, \mathbf{g})$, as $h(x)$ is determined by the JRoG feature $\mathbf{g}$ once $S$ is given).

However, this is a nontrivial problem due to the tremendous size of the JRoG feature set. Take a typical instance image for example: it has 4725 granules from scale 0 to 3. As a typical 6-bit JRoG feature used in this paper consists of 12 granules (Equ. 3), the number of different 6-bit JRoG features is combinatorially enormous. Therefore, the conventional exhaustive search method used for Haar-like features or HOG features is inapplicable to the selection of discriminative JRoG features. To alleviate this problem, we propose a novel collaborative learning method comprising an incremental feature selection module and an SA module.

Before describing this method, we give two important distance definitions for JRoG features. Let $\mathbf{g}_p = (g_{p0}, \ldots, g_{pm})$ and $\mathbf{g}_q = (g_{q0}, \ldots, g_{qn})$ be two JRoG features. The first distance is the number of different granules in the same bit between them:

$$D_1(\mathbf{g}_p, \mathbf{g}_q) = \begin{cases} \sum_{i=0}^{n} \mathbf{1}[g_{pi} \neq g_{qi}], & \text{if } m = n, \\ +\infty, & \text{otherwise,} \end{cases} \tag{5}$$

where $\mathbf{1}[\cdot]$ outputs 1 if the inner condition is true and 0 otherwise. This distance is infinite if the two subspaces have different dimensions.
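The definitions in Equ. 1 to Equ. 4 can be made concrete with a short sketch. The following Python code is only an illustration under stated assumptions, not the authors' implementation: the integral-image shortcut, the function names, the convention that `u` indexes rows, and the choice of $b_0$ as the most significant bit are all assumptions made here. The 24 × 58 window used in the usage note is likewise an assumption; it is one window size that is consistent with the 4725 granules quoted above.

```python
import numpy as np


def integral_image(x):
    """Summed-area table with a leading row and column of zeros."""
    ii = np.zeros((x.shape[0] + 1, x.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(x, axis=0), axis=1)
    return ii


def granule(ii, u, v, s):
    """Equ. 1: mean intensity of the 2^s x 2^s square with top-left pixel (u, v)."""
    w = 1 << s
    total = ii[u + w, v + w] - ii[u, v + w] - ii[u + w, v] + ii[u, v]
    return total / (w * w)


def jrog(ii, granules):
    """Equ. 2-3: pack k pairwise rankings into one index (b_0 assumed to be the top bit)."""
    index = 0
    for i in range(0, len(granules), 2):
        bit = int(granule(ii, *granules[i]) > granule(ii, *granules[i + 1]))
        index = (index << 1) | bit
    return index


def z_value(indices, labels, weights, k):
    """Equ. 4: Z = 2 * sum_j sqrt(W_j^+ * W_j^-) over the 2^k blocks."""
    pos, neg = labels > 0, labels < 0
    w_pos = np.bincount(indices[pos], weights=weights[pos], minlength=1 << k)
    w_neg = np.bincount(indices[neg], weights=weights[neg], minlength=1 << k)
    return 2.0 * float(np.sqrt(w_pos * w_neg).sum())


def count_granules(w, h, scales=(0, 1, 2, 3)):
    """Number of granules of the given scales in a w x h window."""
    return sum((w - (1 << s) + 1) * (h - (1 << s) + 1) for s in scales)
```

With this sketch, `count_granules(24, 58)` evaluates to 4725, and a feature is evaluated with 2k granule look-ups and k comparisons once the integral image is available.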
The second distance is the largest granule-to-granule Euclidean distance:

$$D_2(\mathbf{g}_p, \mathbf{g}_q) = \max_i \big\{ d(g_{pi}, g_{qi}) \big\}, \tag{6}$$

in which

$$d(g_{pi}, g_{qi}) = \sqrt{(\bar{u}_{pi} - \bar{u}_{qi})^2 + (\bar{v}_{pi} - \bar{v}_{qi})^2 + (e_{pi} - e_{qi})^2}, \tag{7}$$

where $e = 2^{s-1}$ is the half side length of granule $g_{u,v,s}$ (Equ. 1), and $\bar{u} = u + e$ and $\bar{v} = v + e$ are its center coordinates. Based on these two distances, the neighbors of a JRoG feature $\mathbf{g}$ can be formulated as

$$B_{\theta_1,\theta_2}(\mathbf{g}) = \big\{ \mathbf{g}' : D_1(\mathbf{g}', \mathbf{g}) \le \theta_1 \wedge D_2(\mathbf{g}', \mathbf{g}) \le \theta_2 \big\}, \tag{8}$$

where $\theta_1$ and $\theta_2$ are thresholds for the two distances. These neighbors are used as candidates in the search for discriminative features, since a good feature is likely to have a better neighbor nearby.
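As a minimal sketch of the two distances and the neighborhood test of Equ. 5 to Equ. 8, the following Python code represents a granule as a `(u, v, s)` tuple and a JRoG feature as a list of such tuples. This representation and the function names are assumptions made for illustration only.

```python
import math


def granule_distance(a, b):
    """Equ. 7: distance over center coordinates and half side lengths."""
    (ua, va, sa), (ub, vb, sb) = a, b
    ea, eb = 2 ** (sa - 1), 2 ** (sb - 1)  # half side length e = 2^(s-1), 0.5 for s = 0
    return math.sqrt((ua + ea - ub - eb) ** 2
                     + (va + ea - vb - eb) ** 2
                     + (ea - eb) ** 2)


def d1(gp, gq):
    """Equ. 5: number of positions holding different granules, infinite if sizes differ."""
    if len(gp) != len(gq):
        return math.inf
    return sum(a != b for a, b in zip(gp, gq))


def d2(gp, gq):
    """Equ. 6: largest granule-to-granule distance over aligned positions."""
    return max(granule_distance(a, b) for a, b in zip(gp, gq))


def is_neighbor(g_new, g, theta1, theta2):
    """Equ. 8: membership test for the neighborhood B_{theta1, theta2}(g)."""
    return d1(g_new, g) <= theta1 and d2(g_new, g) <= theta2
```

For example, shifting one scale-1 granule by three pixels gives $D_1 = 1$ and $D_2 = 3$, so the shifted feature is a neighbor for thresholds (1, 8) but not for (1, 2).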

Given: Training sample set $S = \{(x_i, y_i, w_i)\}_{i=1}^{N}$ and JRoG feature set $G = \{\mathbf{g}_i\}_{i=1}^{M}$;
Init: Initial feature set $G_0 = G$, sample subsets $\{S_0, \ldots, S_K\}$, where $S_K = S$ and $|S_i| = \frac{1}{2}|S_{i+1}|$;
For $r = 0, 1, \ldots, K-1$:
    $\beta = \frac{1}{|G_r|} \sum_{\mathbf{g} \in G_r} Z(S_r, \mathbf{g})$;
    $G_{r+1} = \{ \mathbf{g} : \mathbf{g} \in G_r \text{ and } Z(S_r, \mathbf{g}) < \beta \}$;
Output: JRoG feature $\mathbf{g}^* = \arg\min_{\mathbf{g} \in G_K} Z(S_K, \mathbf{g})$.

Figure 4. Incremental feature selection method.

4.1. Incremental Feature Selection Module

This module adopts the incremental feature selection method proposed by Huang et al. [7] to quickly select an optimal JRoG feature from a large feature set. As formalized in Fig. 4, it starts with a small sample subset $S_0$ and the entire JRoG feature set $G_0$, and computes the normalization factor (Equ. 4) of every candidate feature in $G_0$ with respect to the training samples in $S_0$. The mean $Z$ value of all candidate features is chosen as the threshold $\beta$ to filter out inferior candidates in the current feature set, so that the remaining ones constitute a shrunken feature subset. This process repeats until all training samples are employed for evaluation. Since features usually retain similar discriminability in every training sample subset, discriminative ones are very likely to be preserved in the reduced feature subset. If the numbers of training samples and candidate features are $N$ and $M$ respectively, and the time spent in computing $Z(S, \mathbf{g})$ is $O(N)$, then the conventional exhaustive search method takes $O(MN)$ to select an optimal JRoG feature from $G$, while the time required by this incremental feature selection method is remarkably reduced to $O(M \ln N)$.

4.2. SA Module

Simulated Annealing [8] is a generic probabilistic meta-heuristic for global optimization. Fig. 5 describes our way of applying this method to search for discriminative JRoG features. The candidate set of the new state, $G'$, is composed of the neighbors of the current state $\mathbf{g}$. The best feature $\mathbf{g}^*$ is maintained throughout the whole search process and finally output as the optimal state.
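The filtering loop of Fig. 4 can be sketched in a few lines of Python. This is a simplified illustration rather than the implementation used in the paper: the nested subsets are taken to be prefixes of the sample list (which assumes the list is already shuffled), the smallest subset size `n0` is a hypothetical parameter, `z` is any callable returning $Z(S_r, \mathbf{g})$, and the guard against an empty survivor set is an addition not present in Fig. 4.

```python
from statistics import mean


def incremental_select(features, samples, z, n0=64):
    """Keep features whose Z is below the mean while the sample subset doubles."""
    # Subset sizes |S_0| < ... < |S_K| = |S|, each half of the next one.
    sizes = [len(samples)]
    while sizes[0] // 2 >= n0:
        sizes.insert(0, sizes[0] // 2)

    candidates = list(features)
    for size in sizes[:-1]:                      # rounds r = 0 .. K-1
        subset = samples[:size]                  # assumption: prefix as S_r
        scores = [z(subset, g) for g in candidates]
        beta = mean(scores)                      # threshold = mean Z value
        kept = [g for g, sc in zip(candidates, scores) if sc < beta]
        if not kept:                             # all candidates tie (guard added here)
            break
        candidates = kept
    # Final decision on the full sample set S_K.
    return min(candidates, key=lambda g: z(samples, g))
```

Because roughly half of the candidates survive each round while the subset size doubles, each round costs about the same, which is the intuition behind the $O(M \ln N)$ estimate.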
In practice, we set $\theta_1 = 1$ and $\theta_2 = 8$, so that $G'$ is generated by replacing one granule of $\mathbf{g}$ with a nearby granule at a distance of no more than 8. Besides, for simplicity and efficiency, we heuristically set $N = 1000 \dim(\mathbf{g}_0)$ and $\gamma = 0.01^{1/N}$, so that each granule can be changed 1000 times on average and the SA process ends at temperature $0.01\,T_0$ (here $N$ denotes the number of SA iterations, i.e. the maximum iteration number $R$ of Fig. 5). The choice of the starting temperature $T_0$ is critical in the SA process: if $T_0$ is too high, the search becomes a random walk and hardly converges; if $T_0$ is too low, the search is likely to be trapped in a local minimum at the very beginning. Moreover, in the AdaBoost algorithm, a series of weak classifiers is learned sequentially with respect to varying training sample weights, so it is difficult to find a universal starting temperature that is appropriate for every round. The Jump/Keep ratio $\eta$, defined as the ratio between the number of iterations in which the feature is changed and the number in which it is kept, reflects the fitness of the temperature cooling schedule. Based on this observation, an adaptive temperature tuning method is introduced in the coming section, which aims for a relatively stable Jump/Keep ratio of the SA in every round of classifier learning.

Given: Training sample set $S$, initial JRoG feature $\mathbf{g}_0$, starting temperature $T_0$, temperature decreasing step $\gamma$, maximum iteration number $R$, two thresholds for neighbors, $\theta_1$ and $\theta_2$;
Init: $\mathbf{g} = \mathbf{g}^* = \mathbf{g}_0$, $E = E^* = Z(S, \mathbf{g}_0)$, $M = 0$;
For $r = 0, 1, \ldots, R-1$:
    $T = T_0 \gamma^r$;
    Randomly select $\mathbf{g}' \in B_{\theta_1,\theta_2}(\mathbf{g})$;
    $E' = Z(S, \mathbf{g}')$, and $P_{acc} = \exp\left(\frac{E - E'}{T}\right)$;
    Generate a uniform random number $\lambda \in [0, 1]$;
    If $\lambda \le P_{acc}$, then $\mathbf{g} = \mathbf{g}'$, $E = E'$, $M{+}{+}$;
    If $E^* > E$, then $\mathbf{g}^* = \mathbf{g}$, $E^* = E$;
Output: The best JRoG feature $\mathbf{g}^*$ and the Jump/Keep ratio $\eta = \frac{M}{N - M}$.

Figure 5. Simulated Annealing for Searching JRoG Features.

4.3. Collaborative Learning of JRoG Features

As formalized in Fig. 6, the collaborative learning method constructs a $k$-bit JRoG feature in $k$ iterations.
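The SA loop of Fig. 5 can be sketched generically as below. The sketch assumes an arbitrary `energy` callable standing in for $Z(S, \mathbf{g})$ and a `neighbor` callable standing in for a random draw from $B_{\theta_1,\theta_2}(\mathbf{g})$; both names, and the explicit clamp of the acceptance probability to 1 for improving moves (which only avoids evaluating a large exponential), are assumptions for illustration.

```python
import math
import random


def anneal(g0, energy, neighbor, t0, n_iter, seed=0):
    """Simulated annealing returning the best state, its energy and the Jump/Keep ratio."""
    rng = random.Random(seed)
    gamma = 0.01 ** (1.0 / n_iter)        # temperature reaches 0.01 * t0 after n_iter steps
    g, e = g0, energy(g0)
    best_g, best_e = g, e
    jumps = 0                             # M in Fig. 5
    for r in range(n_iter):
        t = t0 * gamma ** r
        cand = neighbor(g, rng)
        e_cand = energy(cand)
        # Metropolis rule: improvements are always accepted.
        p_acc = 1.0 if e_cand <= e else math.exp((e - e_cand) / t)
        if rng.random() <= p_acc:
            g, e = cand, e_cand
            jumps += 1
            if e < best_e:                # keep the best state seen so far
                best_g, best_e = g, e
    keeps = n_iter - jumps
    eta = jumps / keeps if keeps else math.inf
    return best_g, best_e, eta
```

A toy run such as `anneal(0, lambda g: (g - 7) ** 2 / 100.0, lambda g, rng: g + rng.choice([-1, 1]), t0=0.05, n_iter=2000)` walks over the integers and returns the minimizer 7 together with the observed Jump/Keep ratio.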
In each iteration, it grows the current feature $\mathbf{g}$ by adding a pair of granules to it, takes the grown feature as the initial seed of SA, and then searches the neighbors of the updated feature for refinement. The incremental feature selection method is employed both in the growth of the feature and in the search among its neighbors. In our experiments, we define the granule pair candidate set by $C = \{ \mathbf{g} : \dim(\mathbf{g}) = 2, d(g_0, g_1) \le 4 \}$, so that the granules of any candidate pair are close enough to each other. Similarly, the neighbors to be searched are restricted to within 2 and 4 in terms of the first and second distances respectively.

Given: Training sample set $S$, granule pair candidate set $C = \{ \mathbf{g} : \dim(\mathbf{g}) = 2 \}$, starting temperature $T_0$, and target Jump/Keep ratio $\eta^*$;
Init: Set the initial feature empty, $\mathbf{g}_0 = \emptyset$, $\dim(\mathbf{g}_0) = 0$;
For $r = 1, \ldots, k$:
    (Grow): Call the incremental feature selection module to select an optimal JRoG feature $\mathbf{g}_r$ from $C_r = \{ \mathbf{g} : \mathbf{g} = (\mathbf{g}_{r-1}, \mathbf{g}'), \mathbf{g}' \in C \}$;
    (Simulated Annealing): Call the SA module with $T_0$ to update $\mathbf{g}_r$;
    (Search Neighbors): Call the incremental feature selection module to select an optimal neighbor from $B_{2,4}(\mathbf{g}_r)$ to refine $\mathbf{g}_r$;
Temperature Tuning: $T_0' = f_T(T_0, \eta, \eta^*)$, where $\eta$ is the Jump/Keep ratio in the SA process.
Output: the learned JRoG feature $\mathbf{g}_k$ and the suggested starting temperature $T_0'$.

Figure 6. Collaborative Learning of a k-bit JRoG Feature.

Generally speaking, increasing or decreasing the starting temperature $T_0$ will raise or lower the Jump/Keep ratio $\eta$ in the SA process. This causal relationship enables a negative feedback from $\eta$ back to $T_0$. Denote the preferred target Jump/Keep ratio by $\eta^*$. Once a JRoG feature is learned and the corresponding $\eta$ is computed, a compensation function is defined as

$$T_0' = f_T(T_0, \eta, \eta^*) = \frac{\eta^*}{\eta} T_0, \tag{9}$$

which adjusts the starting temperature for the coming round so that the resulting Jump/Keep ratio approaches the target. This provides a more controllable parameter, $\eta^*$.

To sum up, on the one hand, SA is capable of escaping from local minima but hardly converges within limited time; on the other hand, incremental feature selection significantly reduces the time required to find a discriminative feature in a large enumerated feature set, but is still insufficient in the enormous JRoG feature space. The collaborative learning method integrates the two complementary methods to alleviate the difficulty of searching the combinatorial feature space. An adaptive tuning method adjusts the starting temperature for the successive round of feature learning to stabilize the SA process. As illustrated in Fig. 7, the collaborative learning method serves the Real AdaBoost algorithm [13] by providing a series of discriminative JRoG features. If the SA module is removed, collaborative learning becomes similar to the heuristic learning algorithm used by Duan et al. [3]; we term this simplified version solo learning (SL). The experiment in Section 6.1 shows that collaborative learning consistently improves upon solo learning, which justifies the use of the SA module.

Figure 7. Flow Chart of Collaborative Learning for the AdaBoost algorithm. The two yellow blocks are incremental feature selection modules, the green one is the SA module, and the red one is the adaptive temperature tuning. This procedure repeats until the boosted strong classifier achieves the required accuracy.

5. Dynamic Search for Bayesian Combination of Part Detection Results

To address partial occlusion problems in crowded scenes, Wu and Nevatia [19] partition the full human body into three parts (head-shoulder, torso and leg, shown in the upper half of Fig. 8), train three independent detectors for them respectively, and fuse the part detection results with full-body detections by means of a Bayesian combination method. Let $Z$ be the detection responses and $S$ be the state of multiple humans; the joint likelihood is formulated as

$$p(Z \mid S) = \prod_{\alpha} p(Z_\alpha \mid S_\alpha), \quad \alpha \in \{F, H, T, L\}, \tag{10}$$

where $\alpha$ is the index for full-body (F), head-shoulder (H), torso (T) and leg (L), and $Z_\alpha$ and $S_\alpha$ are the detection responses and states for part $\alpha$. Based on this MAP formulation, Wu and Nevatia adopt a naive greedy method to seek the optimal state $S$ that best explains the observation $Z$. They initialize the state $S$ with all available hypotheses from the full-body and head-shoulder detection responses, and test these hypotheses one by one in descending Y-axis order. In each test, a human hypothesis is removed if the joint likelihood increases by taking it out of $S$.
A key problem of this method is that each hypothesis is tested only once and the testing order is predetermined, so that each error in $S$ has only one opportunity to be rectified. We follow Wu and Nevatia's Bayesian approach of combining part detection responses but choose a different decomposition of the human body and utilize a dynamic search method to obtain the optimal state. The new decomposition, defined in the lower half of Fig. 8, divides the human body into four parts: upper body, lower body, left body and right body. Compared to the three-part decomposition, dividing the full body into these four parts is more suitable for humans on the periphery of a crowd, whose left or right halves are often occluded.

Figure 8. Part Definition of Wu and Nevatia's approach [19] and Ours.

In contrast to Wu and Nevatia's method, which starts from the full hypothesis set, our dynamic search method initializes the multiple-human state $S$ to be empty and increases the joint likelihood by iteratively adding or removing hypotheses (Fig. 9). In each round, either the best hypothesis in a candidate set generated from all part detection responses is added to the current state $S$, or an existing hypothesis is removed if that achieves higher likelihood. Such a dynamic search process evaluates each hypothesis multiple times, which improves the robustness against part detection errors.

Given: Part detection response set $Z$;
Init: $S \leftarrow \emptyset$, and generate candidate set $C$ from $Z$;
Loop:
    $s_a = \arg\max_{s \in C} p(Z \mid S \cup \{s\})$;
    $L_a = p(Z \mid S \cup \{s_a\}) - p(Z \mid S)$;
    $s_r = \arg\max_{s \in S} p(Z \mid S \setminus \{s\})$;
    $L_r = p(Z \mid S \setminus \{s_r\}) - p(Z \mid S)$;
    IF $L_a \ge L_r$ and $L_a > 0$, $S \cup \{s_a\} \rightarrow S$;
    ELSE IF $L_r > 0$, $S \setminus \{s_r\} \rightarrow S$;
    ELSE quit Loop;
Output: the optimal multiple-human state $S$ for $Z$.

Figure 9. Dynamic Search for the Optimal Multiple-Human State.

6. Experiments

In our experiments, pedestrian training samples are normalized to a fixed size, and all granules are computed on grey images. Testing images are scanned at 6 scales to detect pedestrians of different sizes. The rest of this section is made up of five parts: the first analyzes the convergence of boosting a strong classifier by collaborative learning with different parameter settings; the second compares our approach with previous work on the popular INRIA data set [2]; the third evaluates our method on the challenging ETHZ data set [4]; the fourth presents the improvement in Bayesian part combination [19] brought by the dynamic search method; and the last discusses the computational complexity of our approach.

6.1. Collaborative Learning for Boosting a Strong Classifier

In this section, we design a cross-validation experiment to choose a proper target Jump/Keep ratio for collaborative learning. The sample set includes 20,000 positive samples and 20,000 negative ones collected from the internet, of which 70% are selected for training and the rest for testing. The initial starting temperature and the bit number of JRoG features are fixed at 0.03 and 6.
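Returning to the dynamic search of Section 5, the add-or-remove loop of Fig. 9 can be sketched as follows. The `likelihood` callable stands in for $p(Z \mid S)$ and takes a frozenset of hypotheses; this interface, the function name, and the restriction of the add step to hypotheses not already in the state are assumptions for illustration, not details given in the paper.

```python
def dynamic_search(candidates, likelihood):
    """Greedy add/remove search for the state that maximizes likelihood(state)."""
    state = frozenset()
    while True:
        current = likelihood(state)

        # Best hypothesis to add and the gain L_a it brings.
        pool = [s for s in candidates if s not in state]
        best_add = max(pool, key=lambda s: likelihood(state | {s}), default=None)
        gain_add = (likelihood(state | {best_add}) - current
                    if best_add is not None else float("-inf"))

        # Best hypothesis to remove and the gain L_r it brings.
        best_rem = max(state, key=lambda s: likelihood(state - {s}), default=None)
        gain_rem = (likelihood(state - {best_rem}) - current
                    if best_rem is not None else float("-inf"))

        if gain_add >= gain_rem and gain_add > 0:
            state = state | {best_add}
        elif gain_rem > 0:
            state = state - {best_rem}
        else:
            return state  # neither move increases the likelihood
```

Every accepted move strictly increases the likelihood and the number of states is finite, so the loop terminates. Unlike a single pass in a fixed order, a hypothesis rejected early can still be added later, and one added early can still be removed.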
The target Jump/Keep ratio is set to 1.0, 0.5 and 0.25 respectively, denoted by CL 1.0, CL 0.5 and CL 0.25. To validate the effectiveness of adaptive temperature tuning, we remove the corresponding module (the red one in Fig. 7) from the collaborative learning and keep using the same starting temperature. This setting is denoted simply by CL. The SA module (the green one in Fig. 7) can also be removed, and this degenerate version is termed solo learning (SL). With each setting, a strong classifier is trained by Real AdaBoost served by collaborative learning (Fig. 7). Two scores are calculated to evaluate the performance of the classifiers: Equal Error Rate (EER) and False Positive Rate (FPR) when the false negative rate is 0.0 (the latter score is even more important for the cascade detector due to its bias in favor of classifiers with a low false negative rate). The experiment is repeated 10 times; Table 1 lists the two scores of each setting and highlights the best one after 10, 20, 50 and 100 weak classifiers are learned.

Table 1. Convergence of boosted classifiers with different settings (rows: EER and FPR for CL 1.0, CL 0.5, CL 0.25, CL and SL; columns: number of weak classifiers, 10, 20, 50 and 100).

In this comparison, CL 1.0 performs best when 10 weak classifiers are learned; CL 0.5 takes this position afterward. On the one hand, the collaborative learning is not very sensitive to the target Jump/Keep ratio, since CL 1.0, CL 0.5 and CL 0.25 have relatively close EER and FPR scores; all of them outperform SL, which justifies the use of SA in the collaborative learning. On the other hand, the ranking of CL varies as the number of weak classifiers increases, indicating that removing the negative feedback constructed by the adaptive temperature tuning may decrease the stability of the SA module.
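The two scores used in this comparison can be computed from classifier outputs with a simple threshold sweep. The sketch below is one straightforward way to do so and is not taken from the paper: the function name, the convention that a sample is accepted when its score is at least the threshold, and the `fnr_target` parameter (the false negative rate at which FPR is read off) are assumptions.

```python
import numpy as np


def eer_and_fpr(scores, labels, fnr_target):
    """Return (EER, lowest FPR among thresholds whose FNR <= fnr_target)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos, neg = scores[labels > 0], scores[labels <= 0]

    thresholds = np.unique(scores)               # every distinct score is a candidate
    fnr = np.array([(pos < t).mean() for t in thresholds])
    fpr = np.array([(neg >= t).mean() for t in thresholds])

    i = int(np.argmin(np.abs(fnr - fpr)))        # point where the two error rates meet
    eer = (fnr[i] + fpr[i]) / 2.0

    ok = fnr <= fnr_target                       # lowest threshold always has FNR = 0
    return float(eer), float(fpr[ok].min())
```

For instance, with positive scores 0.9, 0.8, 0.7, 0.4 and negative scores 0.6, 0.3, 0.2, 0.1, the sketch gives an EER of 0.25 and an FPR of 0.25 when no positive may be missed.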
Consequently, we choose 0.5 as the target Jump/Keep ratio for the next set of experiments.

6.2. INRIA Data Set

The INRIA data set [2] has become a standard to compare results on; it contains 2,478 positive samples and ,28 negative images for training, and ,28 positive samples and 453 negative images for testing. We generate positive training samples by slightly rotating and scaling the original ones, and train a 16-layer cascade detector by collecting false alarms in negative training images. We empirically set the bit number of JRoG features to 6 for the first three layers, 5 for the next 6 layers and 4 for the rest. The learned detector contains 2533 weak classifiers and 2772 granules. Fig. 10 shows ROC curves of state-of-the-art methods and ours on the INRIA testing set, in which our method is among the best, especially in the region where False Positive Per Window (FPPW) is between $10^{-3}$ and $10^{-5}$. Notably, owing to the collaborative learning method, our approach is still comparable to that of Duan et al., although their APCF features additionally utilize gradient information. Some detection results on un-cropped INRIA testing images are shown in Fig. 11.

Figure 10. ROC curves of different methods on multiple data sets. Panels: INRIA (miss rate vs. False Positive Per Window; Dalal et al. [2], Tuzel et al. [15], Wu and Nevatia [22], Schwartz et al. [14], Duan et al. [3], ours); ETH SEQ 1, SEQ 2 and SEQ 3 (detection rate vs. False Positive Per Image (FPPI); Ess et al. [4], Wu and Nevatia [23], Schwartz et al. [14], ours); iLIDS (detection rate vs. FPPI; Wu's Full Body [20], Wu's Static Combination [20], Our Full Body, Our Static Combination, Our Dynamic Combination).

6.3. ETHZ Data Set

The ETHZ data set [4] includes four videos (one for training and three for testing) captured on a moving platform in very cluttered environments. To cope with this challenging data set, we collected about 23,000 negative images and labeled more than 20,000 pedestrians from the internet, and trained another 16-layer cascade detector on these training data, which has the same number of weak classifiers and granules as the one for the INRIA test. Only images from the left camera are used for testing.
The first sequence contains 999 frames with 5,193 humans; the second contains 450 frames with 2,359 humans; the third contains 354 frames with 1,828 humans. These sequences are processed frame by frame, without use of any temporal information. To compare with Ess et al.'s method [4], which utilizes scene knowledge, we use the simple ground plane estimation method of Wu et al. [22] to facilitate detection. Schwartz et al.'s method [14] is also included in this experiment. Following the same evaluation metric used in [4], we obtain the ROC curves of our method shown in Fig. 10. Our method outperforms the others in all three videos: compared to the second-best method, it increases the detection rate by 9%, 6% and 6% respectively when False Positive Per Frame (FPPF) is 2. Fig. 11 gives some detection results of our method on this challenging data set.

6.4. iLIDS Data Set

The iLIDS data set [1] features a busy subway station where people are frequently occluded by each other. We selected 257 frames from this data set and annotated 33 pedestrians in them. Four part detectors, shown in Fig. 8, are learned from this training set. The sample size is set separately for the upper and lower body parts and for the left and right body parts. We compare our method with Wu and Nevatia's Boosted Edgelet approach [19]. Here, the combination method proposed in [19], which makes sequential tests, is denoted by "static combination"; the dynamic search method of this paper is denoted by "dynamic combination". For fair comparison, we implemented the static combination method and applied it to our part detection results. As shown in Fig. 10, static combination is superior to the full-body detection results. Fig. 11 shows some differences between the combination results of the two methods: dynamic combination successfully removes a false alarm (the red arrow) and recovers a missed detection (the yellow arrow).
Besides, the collaborative learning of JRoG features significantly improves the detection accuracy compared to [19], halving the missed detections at the same FPPF in both the full body detection and the combined results.

Computational Complexity

Given the granular space, computing a k-bit JRoG feature requires only 2k memory accesses and k subtractions. On a Xeon 3 GHz computer, the detector learned for ETHZ testing can process about one million scanning windows per second on a single processor. With the simple ground plane estimation, this detector takes only 70 ms to scan an ETHZ test image at 6 scales from .0 to 0.25; this includes the time spent computing the granular space. This performance may be adequate for many real-time processing systems and can be scaled up by using multiple processors or GPUs. Training the 6-layer cascade takes about two days on the same computer.

7. Conclusion

We described a novel collaborative learning method for JRoG features and a dynamic search method for Bayesian combination of part detection results. This approach achieves considerable improvements in both detection accuracy and computational efficiency on challenging real-life pedestrian detection problems. The collaborative learning is a general learning method, which can adapt to other combinatorial features and be used in the detection of other objects such as faces and cars.

Figure. Detection results of our method on INRIA, ETHZ and ilids data sets (panels: INRIA, ETHZ SEQ, ETHZ SEQ 2, ETHZ SEQ 3, ilids). The first row of the ilids block is the static combination results, and the second and the third rows are the dynamic combination results. The red arrow points at a false alarm given by the static combination, and the yellow arrow points at a detection given by the dynamic combination which is missed by the static combination.

References

[1] d.html.
[2] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. CVPR,
[3] G. Duan, C. Huang, H. Ai, and S. Lao. Boosting associated pairing comparison features for pedestrian detection. Ninth IEEE International Workshop on Visual Surveillance,
[4] A. Ess, B. Leibe, and L. V. Gool. Depth and appearance for mobile scene analysis. ICCV,
[5] P. Felzenszwalb. Learning models for object recognition. CVPR, 200.
[6] D. Gavrila. Pedestrian detection from a moving vehicle. ECCV,
[7] C. Huang, H. Ai, Y. Li, and S. Lao. Learning sparse features in granular space for multi-view face detection. Proc. Seventh Intl Conf. Automatic Face and Gesture Recognition,
[8] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 985.
[9] B. Leibe, E. Seemann, and B. Schiele. Pedestrian detection in crowded scenes. CVPR,
[10] C. Mikolajczyk, C. Schmid, and A. Zisserman. Human detection based on a probabilistic assembly of robust part detectors. ECCV,
[11] A. Mohan, C. Papageorgiou, and T. Poggio. Example-based object detection in images by components. PAMI, 200.
[12] C. Papageorgiou, T. Evgeniou, and T. Poggio. A trainable pedestrian detection system. In Proceeding of Intelligent Vehicles, 998.
[13] R. E. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 999.
[14] W. R. Schwartz, A. Kembhavi, D. Harwood, and L. S. Davis. Human detection using partial least squares analysis. ICCV,
[15] O. Tuzel, F. Porikli, and P. Meer. Human detection via classification on riemannian manifolds. CVPR,
[16] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR, 200.
[17] P. Viola, M. Jones, and D. Snow. Detecting pedestrians using patterns of motion and appearance. ICCV,
[18] X. Wang, T. X. Han, and S. Yan. An hog-lbp human detector with partial occlusion handling. ICCV,
[19] B. Wu and R. Nevatia. Detection of multiple, partially occluded humans in a single image by bayesian combination of edgelet part detectors. ICCV,
[20] B. Wu and R. Nevatia. Cluster boosted tree classifier for multi-view, multi-pose object detection. ICCV,
[21] B. Wu and R. Nevatia. Optimizing discrimination-efficiency tradeoff in integrating heterogeneous local features for object detection. CVPR,
[22] B. Wu, R. Nevatia, and Y. Li. Segmentation of multiple partially occluded objects by grouping merging assigning part detection responses. CVPR,
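
Note on the cost of a JRoG feature. The complexity claim made in the Computational Complexity discussion (2k memory accesses and k subtractions for a k-bit feature) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `jrog_feature`, the `(scale, row, col)` addressing of granules, the nested-list layout of the granular space, and the strict "greater than" ranking rule are all assumptions made for illustration only.

```python
from typing import Sequence, Tuple

# Hypothetical addressing scheme: a granule is looked up in a precomputed
# granular space by (scale, row, col). This layout is an assumption.
Granule = Tuple[int, int, int]


def jrog_feature(
    granular_space: Sequence[Sequence[Sequence[float]]],
    pairs: Sequence[Tuple[Granule, Granule]],
) -> int:
    """Pack k pairwise ranking bits into one integer in [0, 2**k)."""
    value = 0
    for (s1, r1, c1), (s2, r2, c2) in pairs:
        a = granular_space[s1][r1][c1]  # memory access 1 of 2 for this pair
        b = granular_space[s2][r2][c2]  # memory access 2 of 2 for this pair
        # One subtraction per pair; the sign gives the binary ranking result.
        # Treating ties as 0 is an assumption of this sketch.
        bit = 1 if (a - b) > 0 else 0
        value = (value << 1) | bit
    return value


# Toy granular space with two scales (values are made up for the example).
space = [
    [[10.0, 20.0], [30.0, 40.0]],  # scale 0: 2 x 2 granules
    [[25.0]],                      # scale 1: 1 x 1 granule
]
pairs = [
    ((0, 0, 0), (0, 0, 1)),  # 10 vs 20
    ((0, 1, 1), (1, 0, 0)),  # 40 vs 25
    ((1, 0, 0), (0, 0, 1)),  # 25 vs 20
]
feature_value = jrog_feature(space, pairs)
```

Because the result is a small discrete value, it could index a lookup table inside a boosted weak classifier; that usage is likewise an assumption of this note rather than something stated above.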


More information

Beyond Bags of features Spatial information & Shape models

Beyond Bags of features Spatial information & Shape models Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features

More information

Category-level localization

Category-level localization Category-level localization Cordelia Schmid Recognition Classification Object present/absent in an image Often presence of a significant amount of background clutter Localization / Detection Localize object

More information

Research on Robust Local Feature Extraction Method for Human Detection

Research on Robust Local Feature Extraction Method for Human Detection Waseda University Doctoral Dissertation Research on Robust Local Feature Extraction Method for Human Detection TANG, Shaopeng Graduate School of Information, Production and Systems Waseda University Feb.

More information

An Object Detection System using Image Reconstruction with PCA

An Object Detection System using Image Reconstruction with PCA An Object Detection System using Image Reconstruction with PCA Luis Malagón-Borja and Olac Fuentes Instituto Nacional de Astrofísica Óptica y Electrónica, Puebla, 72840 Mexico jmb@ccc.inaoep.mx, fuentes@inaoep.mx

More information

Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance

Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance Rogerio Feris, Sharath Pankanti IBM T. J. Watson Research Center {rsferis,sharat}@us.ibm.com Behjat Siddiquie SRI International

More information

Ensemble Tracking. Abstract. 1 Introduction. 2 Background

Ensemble Tracking. Abstract. 1 Introduction. 2 Background Ensemble Tracking Shai Avidan Mitsubishi Electric Research Labs 201 Broadway Cambridge, MA 02139 avidan@merl.com Abstract We consider tracking as a binary classification problem, where an ensemble of weak

More information

Sergiu Nedevschi Computer Science Department Technical University of Cluj-Napoca

Sergiu Nedevschi Computer Science Department Technical University of Cluj-Napoca A comparative study of pedestrian detection methods using classical Haar and HoG features versus bag of words model computed from Haar and HoG features Raluca Brehar Computer Science Department Technical

More information

Cat Head Detection - How to Effectively Exploit Shape and Texture Features

Cat Head Detection - How to Effectively Exploit Shape and Texture Features Cat Head Detection - How to Effectively Exploit Shape and Texture Features Weiwei Zhang 1,JianSun 1, and Xiaoou Tang 2 1 Microsoft Research Asia, Beijing, China {weiweiz,jiansun}@microsoft.com 2 Dept.

More information

Human Detection Based on a Probabilistic Assembly of Robust Part Detectors

Human Detection Based on a Probabilistic Assembly of Robust Part Detectors Human Detection Based on a Probabilistic Assembly of Robust Part Detectors K. Mikolajczyk 1, C. Schmid 2, and A. Zisserman 1 1 Dept. of Engineering Science Oxford, OX1 3PJ, United Kingdom {km,az}@robots.ox.ac.uk

More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

Contextual Combination of Appearance and Motion for Intersection Videos with Vehicles and Pedestrians

Contextual Combination of Appearance and Motion for Intersection Videos with Vehicles and Pedestrians Contextual Combination of Appearance and Motion for Intersection Videos with Vehicles and Pedestrians Mohammad Shokrolah Shirazi and Brendan Morris University of Nevada, Las Vegas shirazi@unlv.nevada.edu,

More information

Linear combinations of simple classifiers for the PASCAL challenge

Linear combinations of simple classifiers for the PASCAL challenge Linear combinations of simple classifiers for the PASCAL challenge Nik A. Melchior and David Lee 16 721 Advanced Perception The Robotics Institute Carnegie Mellon University Email: melchior@cmu.edu, dlee1@andrew.cmu.edu

More information

A Two-Stage Template Approach to Person Detection in Thermal Imagery

A Two-Stage Template Approach to Person Detection in Thermal Imagery A Two-Stage Template Approach to Person Detection in Thermal Imagery James W. Davis Mark A. Keck Ohio State University Columbus OH 432 USA {jwdavis,keck}@cse.ohio-state.edu Abstract We present a two-stage

More information

REAL TIME TRACKING OF MOVING PEDESTRIAN IN SURVEILLANCE VIDEO

REAL TIME TRACKING OF MOVING PEDESTRIAN IN SURVEILLANCE VIDEO REAL TIME TRACKING OF MOVING PEDESTRIAN IN SURVEILLANCE VIDEO Mr D. Manikkannan¹, A.Aruna² Assistant Professor¹, PG Scholar² Department of Information Technology¹, Adhiparasakthi Engineering College²,

More information