Coupling-and-Decoupling: A Hierarchical Model for Occlusion-Free Car Detection
Bo Li 1,2,3, Tianfu Wu 2,3, Wenze Hu 3,4 and Mingtao Pei 1

1 Beijing Lab of Intelligent Information, School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China
2 BUPT-Seesoft Joint Lab of Visual Computing and Image Communication, Beijing University of Posts and Telecommunications (BUPT), Beijing, P.R. China
3 Lotus Hill Research Institute, EZhou, P.R. China
4 Department of Statistics, University of California, Los Angeles
{boli.lhi, tfwu.lhi, wzhu.lhi}@gmail.com, peimt@bit.edu.cn

Abstract. Handling occlusions in object detection is a long-standing problem. This paper addresses the problem of X-to-X-occlusion-free object detection (e.g. car-to-car occlusions in our experiments) by utilizing an intuitive coupling-and-decoupling strategy. In the coupling stage, we model the pair of occluding Xs (e.g. car pairs) directly to account for their statistically strong co-occurrence (i.e. coupling). Then, we learn a hierarchical And-Or directed acyclic graph (AOG) model under the latent structural SVM (LSSVM) framework. The learned AOG consists of, from top to bottom, (i) a root Or-node representing different compositions of occluding X pairs, (ii) a set of And-nodes, each of which represents a specific composition of occluding X pairs, (iii) another set of And-nodes representing single Xs decomposed from occluding X pairs, and (iv) a set of terminal-nodes which represent the appearance templates for the X pairs, the single Xs and the latent parts of the single Xs, respectively. The part appearance templates can also be shared among different single Xs. In detection, a dynamic programming (DP) algorithm is used, and as a natural consequence we decouple the two single Xs from the X-to-X occluding pairs. In experiments, we test our method on roadside cars that we collected from a real traffic video surveillance environment.
We compare our model with the state-of-the-art deformable part-based model (DPM) and obtain better detection performance.

1 Introduction

In the literature of object detection, handling occlusions is very challenging and remains a long-standing problem. The two main reasons are: (i) The gap between training and testing. When training an object detector, unoccluded object instances are often collected and used on purpose. In testing, however, occlusions are inevitable in real scenarios. As a result, detection performance drops significantly as occlusions become severe. (ii) The lack of common occlusion models. Generally and statistically speaking, it is very difficult to capture and predict occlusions because they can be treated as being uniformly distributed in the most unconstrained settings. To some extent, that explains, in turn, why the gap between training and testing exists. To address the occlusion problem, hierarchical modeling (e.g. deformable part-based models [5]), among other approaches, has been widely used and shows performance improvement. A 2-layer model is often adopted for modeling single objects, which can tackle small occlusions implicitly.

Fig. 1. Some examples of roadside cars. There are different types of car-to-car occlusions which challenge the state-of-the-art detectors trained for single cars.

In this paper, we distinguish between two types of occlusions, X-to-X and X-to-Y occlusions, where X and Y represent different object categories (e.g. X represents car and Y person), and then present a coupling-and-decoupling method for X-to-X occlusion-free object detection without modeling occlusions explicitly. As the running example, we use roadside cars, which are often parked along the curb, leading to X-to-X occlusions. Occlusion-free roadside car detection can facilitate many important applications in computer vision and intelligent transportation, such as parking violation capturing, license plate detection and parking management. Figure 1 shows some examples of car-to-car occlusions in a real traffic video surveillance environment. In the sequel, we concretely use car instead of X to present the formulation (but notice that the proposed method is not limited to cars). Our method consists of two stages as follows.

(i) The coupling stage in modeling and learning. Instead of training a single object detector, we learn hierarchical And-Or directed acyclic graph (AOG) models for the car-to-car occluding pairs directly to account for the statistically strong coupling.
The learned AOG consists of, from top to bottom, (i) a root Or-node representing different compositions of occluding car-to-car pairs, (ii) a set of And-nodes, each of which represents a specific composition of occluding car pairs, (iii) another set of And-nodes representing single cars decomposed from occluding car pairs, and (iv) a set of terminal-nodes which represent the appearance templates for the car pairs, the single cars and the latent parts of the single cars, respectively. The part appearance templates can also be shared among different single cars. We adopt the Histogram of Oriented Gradients (HOG) [2] as the appearance feature, as done in DPM [5]. Figure 3 shows the learned AOG for car-to-car pairs (where for clarity only a portion is drawn). We formulate the learning of the AOG under the latent structural SVM (LSSVM) framework [13, 14, 16]. In the training dataset, bounding boxes of car pairs and the corresponding two single cars are annotated, and the parts of single cars are treated as latent variables.

(ii) The decoupling stage in detection. Our AOG model is directed and acyclic, so we can utilize the DP algorithm in inference. For detected car pairs, the back-traced bounding boxes for the two single cars are obtained, i.e., decoupled from the car pair. Since the locations and sizes of the bounding boxes of the single cars are annotated when jointly training the AOG model, the back-traced ones are the optimal solutions for the two single cars.

[Figure 2 panels. Top-left: "statistics of overlapping cars, subset: test" (proportion vs. overlap ratio). Top-right: "detection rate of DPM model and proposed model, subset: test" (detection rate vs. overlap ratio; legend: our hierarchical car detector, DPM car detector). Bottom: cropped car pairs labeled by occlusion ratio.]

Fig. 2. Top-left: The population ratios in the testing set of roadside cars used in this paper. Bottom: Some examples of cropped car-to-car occluding pairs. The occlusion ratio is measured for the back car in the car pairs. Top-right: The plots of detection rates vs. occlusion ratios, where the blue dashed curve is for the state-of-the-art DPM [5] and the red curve is for the proposed method. See text for details.

To illustrate the necessity and the advantage of the proposed method, the left plot in Fig. 2 shows the population ratios of car-to-car pairs with different degrees of occlusion in the testing dataset, which we collected from a real traffic video surveillance environment. Some cropped image examples are shown at the bottom. The right plot shows the detection rates against the occlusion ratio for the proposed method (the red curve) and the state-of-the-art DPM [5] (the blue dashed curve). We can observe that: (i) The population ratio of car pairs with occlusions equal to or greater than 0.2 is greater than 0.5 (i.e. occlusions become a statistically major factor). (ii) At the same time, the detection performance of DPM drops significantly when occlusions go beyond 0.2, while our method obtains much better performance. (iii) The detection performance of our method goes up significantly when occlusions are greater than This is because, with those severe occlusions,
even if DPM could recall the two single cars, their bounding boxes overlap by more than the threshold normally used (e.g. 0.7), and then the one with the lower score will be excluded by non-maximum suppression (NMS) (see the DPM detection results in Fig. 5). Our method can, however, detect those cars correctly by decoupling them from the detected car pairs. More results and the final performance comparison are shown in Fig. 5 and Fig. 4 respectively.

In the literature of computer vision, car detection for traffic monitoring systems is addressed mainly in single unoccluded situations, such as car type classification [12, 8], multiple-view car detection [9, 7], or shadow removal from suspicious car regions in images [10]. Choi et al. [1] proposed a method to detect and track multiple cars simultaneously, but they did not address the occlusion problem.

Fig. 3. Our AOG model. First layer: illustration of car pair And-nodes and their corresponding appearance features. Second layer: illustration of single car And-nodes and their corresponding appearance features. Third layer: illustration of car part Terminal-nodes and their corresponding appearance and deformation features. Parts are shared. For clarity, we show the parts of only two single cars.

2 The Model

2.1 The AOG

In this section, we specify the hierarchical AOG model used in this paper, which is a directed acyclic graph and thus facilitates the DP algorithm in detection. The learning of the AOG is given in Sec. 4. Following the framework in [17], our AOG embeds the occluding car pair detection grammars, which are embodied by defining three types of nodes:

(i) The root Or-node $O$ represents compositional alternatives of the occluding car pairs (e.g., car pairs from different viewpoints or with different degrees of occlusion). The Or-node $O$ has a branching variable, denoted by $\omega(O)$, indicating which child And-node is selected, and $\omega(O)$ will be inferred on-the-fly in detection.

(ii) A set of And-nodes $V_{And}$. There are two types of And-nodes: car pairs and single cars. Each car pair And-node represents the decomposition of a specific type of occluding car pair into two single cars (e.g. a frontal-view car pair with the back car being occluded by roughly 30%), and each single car And-node represents the decomposition of a single car into a small number of parts.

(iii) A set of Terminal-nodes $V_T$. First of all, the And-nodes defined above can themselves terminate directly, creating terminal-nodes, when the resolution is low (relative to their own decomposed parts). Secondly, each part is represented by a terminal-node linking to image data. In the model, each terminal-node $t \in V_T$ has its own location, denoted by $l_t$, which will also be inferred on-the-fly in detection. The location for placing an And-node is the same as that of the terminal-node directly terminated from it.

In the AOG, terminal-nodes link the object detection grammars to image data by evaluating the appearance features, And-nodes take into account the geometric deformations between their child nodes, and Or-nodes select the best solution (i.e. the one with maximal score) among their child nodes. So, the scoring function of the AOG consists of two terms: appearance (i.e. data term) and deformation (i.e. relation term).
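The roles of the three node types described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the toy scores and the helper `best_branch` are hypothetical, placements are assumed fixed, and the deformation term is left out so that only the max (Or-node) and sum (And-node) semantics are shown.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TerminalNode:
    name: str
    appearance_score: float  # stand-in for a template response at a fixed placement


@dataclass
class AndNode:
    name: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class OrNode:
    name: str
    children: List[AndNode] = field(default_factory=list)


Node = Union[TerminalNode, AndNode, OrNode]


def score(node: Node) -> float:
    """Terminal: appearance score; And-node: sum of children; Or-node: best child."""
    if isinstance(node, TerminalNode):
        return node.appearance_score
    if isinstance(node, AndNode):
        return sum(score(child) for child in node.children)
    return max(score(child) for child in node.children)


def best_branch(or_node: OrNode) -> AndNode:
    """The selected child And-node, i.e. the value of the branching variable."""
    return max(or_node.children, key=score)


# Toy AOG with two car pair compositions (all numbers are made up).
frontal_pair = AndNode("frontal_pair", [
    TerminalNode("frontal_pair_template", 1.0),
    AndNode("front_car", [TerminalNode("front_car_template", 2.0),
                          TerminalNode("front_car_part", 0.5)]),
    AndNode("back_car", [TerminalNode("back_car_template", 1.5)]),
])
side_pair = AndNode("side_pair", [
    TerminalNode("side_pair_template", 3.0),
    AndNode("left_car", [TerminalNode("left_car_template", 0.5)]),
    AndNode("right_car", [TerminalNode("right_car_template", 0.25)]),
])
root = OrNode("car_pair", [frontal_pair, side_pair])

root_score = score(root)
selected = best_branch(root).name
```

In this toy graph the frontal composition is selected because its summed score is higher; in the actual model every placement is also searched over image locations and penalized by deformation.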
Formally, an AOG is specified by a 5-tuple,

\[ G = (O, V_{And}, V_T, \Theta^{app}, \Theta^{def}) \tag{1} \]

where $\Theta^{app}$ are the parameters for the appearance scoring function when placing terminal-nodes in images, and $\Theta^{def}$ are the parameters for the deformation cost of a placed terminal-node with respect to its anchor location. They are learned jointly by LSSVM.

Part-sharing in the AOG. Among the child single car And-nodes decomposed from car pair And-nodes, some are often of the same type (such as side-view or frontal-view cars) but with different occlusions. So, they can share part appearance, but might have different deformation models. Sharing parts supplies more data for training the part appearance parameters and also reduces run-time in detection.

2.2 The scoring function of an AOG

Let $\Lambda$ be the image lattice and $I_\Lambda$ an image defined on $\Lambda$. In detection, we need to search over scales to detect objects of different sizes. In practice, a feature pyramid of $I_\Lambda$ is generated, denoted by $H$ (e.g. the HOG feature pyramid used in the DPM [5] and our method). When placing an AOG in $I_\Lambda$ at a location $u \in \Lambda$, we have:

(i) The scoring function for evaluating an Or-node $O$ at $u$ is defined by,

\[ \mathrm{Score}(O, u) = \max_{A \in ch(O)} \mathrm{Score}(A, u) \tag{2} \]

where $ch(O) \subset V_{And}$ is the set of child And-nodes of the Or-node $O$. We can assign the branching variable $\omega(O) = \arg\max_{A \in ch(O)} \mathrm{Score}(A, u)$.

(ii) The scoring function for computing an And-node $A$ with respect to a placed Or-node $O$ at $u$ is defined by,

\[ \mathrm{Score}(A, u \mid O, u) = \langle \theta^{app}_{t_A}, \Phi^{app}(H, A, u) \rangle + \sum_{c \in ch(A)} \mathrm{Score}(c \mid A, u) \tag{3} \]

where the first term is the appearance score for the terminal-node $t_A$ terminated from And-node $A$ directly, $\theta^{app}_{t_A} \in \Theta^{app}$ are the corresponding appearance parameters, $\Phi^{app}(H, A, u)$ are the features extracted from the feature pyramid, and $ch(A) \subset V_{And} \cup V_T$ is the set of child nodes of $A$.

(iii) The scoring function for computing an And-node $A_1$ with respect to a placed And-node $A$ at $u$ is defined by,

\[ \mathrm{Score}(A_1 \mid A, u) = \max_{v \in \Lambda} \Big( \langle \theta^{app}_{t_{A_1}}, \Phi^{app}(H, A_1, v) \rangle - \langle \theta^{def}_{A_1 \mid A}, \Phi^{def}_{A_1 \mid A}(v, u) \rangle + \sum_{t \in ch(A_1)} \mathrm{Score}(t \mid A_1, v) \Big) \tag{4} \]

where $\theta^{def} \in \Theta^{def}$ is the corresponding deformation parameter for a node (such as $A_1$) with respect to another node (such as $A$), and $\Phi^{def}(v, u)$ is the deformation feature, for which we adopt the same quadratic function as used in DPM [5]: $\Phi^{def}(v, u) = [dx^2, dx, dy^2, dy]$, where $(dx, dy)$ is the displacement between $v$ and $u$. The best placed location of node $A_1$ is retrieved by taking $\arg\max_{v \in \Lambda}$ in Eqn. 4.

(iv) For computing a part terminal-node $t$ with respect to a placed parent And-node $A$ at $u$, the scoring function is defined by,

\[ \mathrm{Score}(t \mid A, u) = \max_{v \in \Lambda} \Big( \langle \theta^{app}_{t}, \Phi^{app}(H, t, v) \rangle - \langle \theta^{def}_{t \mid A}, \Phi^{def}_{t \mid A}(v, u) \rangle \Big) \tag{5} \]

where in practice, we often place node $t$ at twice the spatial resolution relative to node $A$ to capture more detailed information.
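To make the part score of Eqn. 5 concrete, the following is a minimal brute-force sketch. It assumes the appearance responses have already been computed into a small 2D map, treats the deformation term as a cost that is subtracted, and ignores the resolution doubling; the function names and the toy numbers are hypothetical.

```python
def deformation_feature(v, u):
    """Quadratic deformation feature [dx^2, dx, dy^2, dy] for placement v and anchor u."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    return [dx * dx, dx, dy * dy, dy]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def part_score(appearance_map, theta_def, anchor):
    """Maximize appearance(v) - <theta_def, phi_def(v, anchor)> over all placements v.

    appearance_map[y][x] holds the precomputed appearance response at v = (x, y).
    Returns the best score and the placement that attains it.
    """
    best_score, best_v = float("-inf"), None
    for y, row in enumerate(appearance_map):
        for x, appearance in enumerate(row):
            s = appearance - dot(theta_def, deformation_feature((x, y), anchor))
            if s > best_score:
                best_score, best_v = s, (x, y)
    return best_score, best_v


# Toy 3x3 response map with a strong response one cell to the right of the anchor.
responses = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 4.0],
    [0.0, 0.0, 0.0],
]
loose = part_score(responses, [1.0, 0.0, 1.0, 0.0], anchor=(1, 1))
stiff = part_score(responses, [5.0, 0.0, 5.0, 0.0], anchor=(1, 1))
```

With the small deformation weights the part moves to the stronger response at (2, 1); with the large weights it stays at the anchor. The exhaustive double loop is only for illustration: the paper computes these maxima with the generalized distance transform [6].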
3 The DP algorithm for Detection

In detection, we first find all the locations in the image pyramid where the scores of the placed AOGs are higher than the estimated threshold $\tau$. For example, at the original resolution, we have $\{u : \mathrm{Score}(O, u) > \tau,\ u \in \Lambda\}$. Then, we utilize NMS to get the final detection results.

Since the AOG is directed and acyclic, the AOG scoring function is evaluated in two phases by utilizing the DP algorithm: (i) one bottom-up phase to compute all the appearance scoring maps for terminal-nodes, as well as their transformed maps for different parent nodes, which are computed by using the efficient generalized distance transform [6], and then (ii) one top-down phase to retrieve the configurations (i.e., locations of the car pair, single cars and parts) for all the locations whose scores are greater than the threshold $\tau$, followed by a post-processing NMS step in practice. Due to limited space, we omit the details of the DP algorithm here and refer the reader to [5].

By the top-down back-tracing, we obtain the decoupled single cars from the detected occluding car pairs. Notice that we may have two inferred locations for a single car which is shared by two adjacent car pairs, if the single car appears in the middle of a line of multiple occluding cars. In that case, we take as the final detection result for the single car the location decoupled from the detected car pair with the higher score.

4 Learning AOG by Latent Structural SVM

In this section, we formulate the learning of the AOG under the latent structural SVM (LSSVM) framework [13, 14, 16], which has been widely used in the literature of object detection and machine learning.

Training Data. We collect roadside cars from a real traffic video surveillance environment. We annotate the bounding boxes for both the occluding car pairs and the corresponding two single cars. When labeling occluded single cars, we annotate their whole bounding boxes. Notice that some cars may be used twice in two adjacent car pairs when they appear in the middle of a line of multiple occluding cars. Those duplicated cars can be treated as bootstrapped examples in learning the appearance parameters for single cars and parts.
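The thresholding-then-NMS step of Sec. 3 can be illustrated with a generic greedy NMS. This is a sketch under assumptions, not the procedure of the paper or of the DPM release: overlap is measured here as intersection over union, boxes are `(x1, y1, x2, y2)` tuples, and the function names and numbers are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, overlap_threshold):
    """Greedy NMS: visit detections by decreasing score and drop any box that
    overlaps an already kept box by more than overlap_threshold."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= overlap_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


# Two heavily overlapping single-car boxes, as produced by a severe occlusion.
detections = [(0.9, (0, 0, 10, 10)), (0.8, (2, 0, 12, 10))]
strict = nms(detections, overlap_threshold=0.5)
lenient = nms(detections, overlap_threshold=0.7)
```

The two toy boxes overlap with an IoU of about 0.67, so the lower-scoring car is suppressed at a threshold of 0.5 and survives at 0.7. This is the failure mode that motivates detecting the car pair first and decoupling the single cars by back-tracing.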
4.1 Latent variables in the AOG

Given the training data specified above, for the AOG defined in Sec. 2.1, we have the latent variables as follows.

The branches of the root Or-node, i.e., the mixture components of occluding car pairs. Based on the labeled bounding boxes, we initialize them using k-means clustering (k = 3 clusters in our experiment) on the concatenated features: the aspect ratios of the three annotated bounding boxes and the displacements between the centers of the two single cars relative to that of the car pair (normalized by the size of the car pair bounding box). The aspect ratios of the single cars roughly indicate viewpoints, the displacements give a clue about the configuration of the car pair, and the aspect ratio of the car pair reflects the degree of occlusion. In training, we also incorporate left-right flipped examples as done in [4]. So, we have 6 car pair models in total. We train the initial AOG (consisting of the root Or-node, the six car pair And-nodes, the twelve single car And-nodes, and the corresponding terminal-nodes for the And-nodes) under the LSSVM framework by treating the locations and sizes of car pairs and single cars as hidden variables anchored at the annotated bounding boxes. At each step of re-labeling the positive examples (i.e. assigning latent variables) in learning, we force the assignment of car pair terminal-nodes to overlap more than 0.7 with the ground-truth, and more than 0.8 for single car terminal-nodes.

The part configuration for single cars and part-sharing. After the initial AOG is trained, we initialize the part configurations for the single cars based on the learned single car templates, similar to the greedy pursuit method used in DPM [5]. We use 8 rectangular parts of equal size for each single car. For the part sharing, we use a method similar to that of [11], resulting in 30 part terminal-nodes in total.

4.2 Learning by LSSVM

Denote the set of positive training images by $D^+ = \{(I_1, y_1, z_1), \ldots, (I_n, y_n, z_n)\}$, where $y_i = 1$ and $z_i = (\omega_i, B_i, P_i)$ consists of (i) the Or-node branching variable $\omega_i$ (i.e. the mixture component index); (ii) the three labeled bounding boxes $B_i$ for the car pair and the two single cars respectively; and (iii) the bounding boxes $P_i$ for the parts of the single cars. The $z_i$ are treated as latent variables during learning, with different initializations: $\omega_i$ is initialized by the k-means clustering stated above, $B_i$ by the annotated bounding boxes, and $P_i$ by the greedy pursuit and part-sharing strategy stated above. Let $D^- = \{(I_{n+1}, y_{n+1}), \ldots, (I_N, y_N)\}$ be a set of negative training images (i.e. images in which no cars appear), where $y_i = -1$. We first train the initial AOG using $z_i = (\omega_i, B_i)$, and then initialize $P_i$ and learn the full AOG using $z_i = (\omega_i, B_i, P_i)$. Both are done under the LSSVM framework.
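The initialization of the mixture components in Sec. 4.1 can be sketched as follows. The feature layout (three aspect ratios followed by two normalized center displacements), the per-axis normalization by the pair box width and height, and the deterministic k-means initialization are all assumptions made for illustration; the function names are hypothetical.

```python
def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def aspect_ratio(box):
    x1, y1, x2, y2 = box
    return (x2 - x1) / (y2 - y1)


def pair_feature(pair_box, car_a, car_b):
    """Aspect ratios of the three boxes, then the displacement of each single-car
    center from the pair center, normalized by the pair box width and height."""
    width = pair_box[2] - pair_box[0]
    height = pair_box[3] - pair_box[1]
    cx, cy = center(pair_box)
    feature = [aspect_ratio(pair_box), aspect_ratio(car_a), aspect_ratio(car_b)]
    for car in (car_a, car_b):
        ax, ay = center(car)
        feature += [(ax - cx) / width, (ay - cy) / height]
    return feature


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


# One side-by-side pair: two square cars inside a 20x10 pair box.
example = pair_feature((0, 0, 20, 10), (0, 0, 10, 10), (10, 0, 20, 10))
labels, _ = kmeans([[0.0], [10.0], [0.1], [10.1]], k=2)
```

In the paper's setting, `pair_feature` would be computed for every annotated car pair and clustered with k = 3; each cluster then initializes one branch of the root Or-node, and left-right flipping doubles the count to six.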
Given $z$, the scoring function is a linear function,

\[ \mathrm{Score}(I, y, z; \Theta) = \langle \Theta, \Phi(I, y, z) \rangle \tag{6} \]

where $\Theta = (\Theta^{app}, \Theta^{def})$ and $\Phi(I, y, z) = (\Phi^{app}(I, y, z), \Phi^{def}(y, z))$ as specified in Eqn. 3, Eqn. 4 and Eqn. 5. Under the LSSVM framework, we learn $\Theta$ by minimizing the following surrogate loss function [14, 16],

\[ \min_{\Theta} \ \frac{1}{2}\|\Theta\|_2^2 + \frac{C}{N} \sum_{i=1}^{N} \Big[ \max_{y,z} \big( \mathrm{Score}(I_i, y, z; \Theta) + \Delta(y_i, y, z) \big) - \max_{z} \mathrm{Score}(I_i, y_i, z; \Theta) \Big] \tag{7} \]

where the loss function $\Delta(y_i, y, z) = 1$ if $y_i \neq y$ and $0$ otherwise, and $C$ is the trade-off parameter balancing the first regularization term and the surrogate loss term. The objective function is non-convex, and the concave-convex procedure (CCCP) [15, 14, 16] is used to find a local optimum. Firstly, Eqn. 7 can be re-written as,

\[ \min_{\Theta} \ \underbrace{\frac{1}{2}\|\Theta\|_2^2 + \frac{C}{N} \sum_{i=1}^{N} \max_{y,z} \big( \mathrm{Score}(I_i, y, z; \Theta) + \Delta(y_i, y, z) \big)}_{f(\Theta),\ \text{convex function}} + \underbrace{\Big( -\frac{C}{N} \sum_{i=1}^{N} \max_{z} \mathrm{Score}(I_i, y_i, z; \Theta) \Big)}_{g(\Theta),\ \text{concave function}} \tag{8} \]

Then, at step $t$, based on the current solution $\Theta_t$, the CCCP solves the problem with the following two steps.

(i) Upper-bounding $g(\Theta)$ (possible with a hyperplane since $g$ is concave), i.e., finding a hyperplane $p_t$ such that

\[ g(\Theta) \le g(\Theta_t) + (\Theta - \Theta_t) \cdot p_t . \]

To do that, we first get the best latent variable assignment for each example by solving $z_i^* = \arg\max_{z_i} \mathrm{Score}(I_i, y_i, z_i; \Theta_t)$ using the DP algorithm. Then, $p_t$ is constructed by

\[ p_t = -\frac{C}{N} \sum_{i=1}^{N} \Phi(I_i, y_i, z_i^*) . \]

(ii) Updating the solution $\Theta_{t+1} = \arg\min_{\Theta} \big( f(\Theta) + \Theta \cdot p_t \big)$. This step leads to a standard structural SVM, which can be solved by an off-the-shelf solver such as the cutting plane method. We refer the reader to [13, 16] for details.

Figure 3 shows a portion of our learned AOG model. The first layer corresponds to car pairs, the second layer to single cars, and the third layer to car parts. Beside each node in the AOG, we visualize the learned appearance and deformation templates.

5 Experiments

To evaluate our proposed method, we collected 482 car images from street view scenes and annotated the bounding boxes for both car pairs and single cars. In detail, we obtained 1380 car pairs, 2760 occluded single cars and 702 unoccluded single cars. We randomly select 200 images for training, and use the rest for testing. For the negative set, we use the training negative images from the PASCAL VOC 2007 database [3]. We also follow the VOC protocol for reporting results [3]. A putative bounding box is considered correct if the intersection of its bounding box with the ground-truth bounding box is greater than 50% of their
union. Multiple detections for the same ground truth are penalized. We compute Precision-Recall (PR) curves and score the average precision (AP) across our test set. In experiments, we compared our AOG with two baseline DPMs: Baseline 1, a DPM trained by using occluded single cars in the training set; and Baseline 2, a DPM trained by using all the single cars in the training set. Fig. 4 shows the PR curves of the three methods, where the proposed method outperforms the two baseline DPMs significantly (by 9.5% and 12.3% respectively).

[Figure 4 plot: "class: car, subset: test"; precision vs. recall; legend: Baseline 1, Baseline 2, Hierarchical AOG Model.]

Fig. 4. Precision-Recall curves for our model and baseline methods.

Figure 5 shows detection results of both the DPM car model (Baseline 1 is used since it is better than Baseline 2 according to the PR curves) and our AOG model. Figure 6 shows some examples of layered detection results of our AOG model. On the top, we show the detection results of the car pair model in our AOG. On the bottom, detection results of single cars are shown by using the full AOG model. Here, we can see that our model can lead to fast coarse-to-fine detection, which we will further investigate in our on-going work.

Fig. 5. Comparison of the DPM car model and our hierarchical AOG model. The first and third rows show the detection results (blue bounding boxes) of the DPM car detector; the second and fourth rows show the detection results (red bounding boxes) of the proposed AOG model. Best viewed in color.

Fig. 6. Layered detections of our AOG model. Top: detection results of the car pair module by coupling in the first layer. Bottom: detection results of the single car module by decoupling in the second layer.

6 Conclusion

In this paper, we proposed a hierarchical And-Or directed acyclic graph (AOG) model to address the problem of X-to-X-occlusion-free object detection. The model is a grammar model. It consists of (i) a root Or-node representing a mixture of different types of occluding X pairs, (ii) a set of And-nodes representing different types of occluding X pairs, (iii) another set of And-nodes representing different types of occluding single Xs decomposed from X pairs, and (iv) a set of terminal-nodes representing the appearance templates for the X pairs, the single Xs and the latent parts of the single Xs. The part appearance templates can also be shared among different And-nodes of single Xs. The model is learned by the latent structural SVM (LSSVM), and a DP algorithm is used for inference. Although we only use cars as the running example in this paper, the model is general and can potentially be used for other objects.

Acknowledgement. We thank the three anonymous reviewers for their helpful comments. This work is supported by China 973 Program under Grant No. 2012CB316300, Natural Science Foundation of China under Grant No

References

1. Choi, J.Y., Sung, K.S., Yang, Y.K.: Multiple Vehicles Detection and Tracking based on Scale-Invariant Feature Transform. In: ITSC (2007)
2. Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: CVPR (2005)
3. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. (2007)
4. Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Discriminatively Trained Deformable Part Models, Release 4. ( pff/latentrelease4/) (2010)
5. Felzenszwalb, P.F., Girshick, R.B., McAllester, D.A., Ramanan, D.: Object Detection with Discriminatively Trained Part-Based Models. TPAMI 32 (2010)
6. Felzenszwalb, P.F., Huttenlocher, D.P.: Distance Transforms of Sampled Functions. Technical report, Cornell University CIS (2004)
7. Gupte, S., Masoud, O., Martin, R.F.K., Papanikolopoulos, N.P.: Detection and Classification of Vehicles. TITS 3 (2002)
8. Lai, A.H.S., Fung, G.S.K., Yung, N.H.C.: Vehicle Type Classification from Visual-based Dimension Estimation. In: ITSC (2001)
9. Leotta, M.J., Mundy, J.L.: Vehicle Surveillance with a Generic, Adaptive, 3D Vehicle Model. TPAMI 33 (2011)
10. Liu, X., Dai, B., He, H.: Real-Time On-Road Vehicle Detection Combining Specific Shadow Segmentation and SVM Classification. In: ICDMA (2011)
11. Ott, P., Everingham, M.: Shared Parts for Deformable Part-based Models. In: CVPR (2011)
12. Petrovic, V.S., Cootes, T.F.: Analysis of Features for Rigid Structure Vehicle Type Recognition. In: BMVC (2004)
13. Tsochantaridis, I., Joachims, T., Hofmann, T., Altun, Y.: Large Margin Methods for Structured and Interdependent Output Variables. JMLR 6 (2005)
14. Yu, C.N.J., Joachims, T.: Learning Structural SVMs with Latent Variables. In: ICML (2009)
15. Yuille, A.L., Rangarajan, A.: The Concave-Convex Procedure (CCCP). In: NIPS (2001)
16. Zhu, L., Chen, Y., Yuille, A.L., Freeman, W.T.: Latent Hierarchical Structural Learning for Object Detection. In: CVPR (2010)
17. Zhu, S.C., Mumford, D.: A Stochastic Grammar of Images. FTCGV 2 (2006)
More informationOnline Object Tracking, Learning and Parsing with And-Or Graphs
ARXIV VERSION 1 Online Object Tracking, Learning and Parsing with And-Or Graphs Tianfu Wu, Yang Lu and Song-Chun Zhu arxiv:159.67v3 [cs.cv] 11 May 16 Abstract This paper presents a method, called AOGTracker,
More informationPEDESTRIAN DETECTION IN CROWDED SCENES VIA SCALE AND OCCLUSION ANALYSIS
PEDESTRIAN DETECTION IN CROWDED SCENES VIA SCALE AND OCCLUSION ANALYSIS Lu Wang Lisheng Xu Ming-Hsuan Yang Northeastern University, China University of California at Merced, USA ABSTRACT Despite significant
More informationModern Object Detection. Most slides from Ali Farhadi
Modern Object Detection Most slides from Ali Farhadi Comparison of Classifiers assuming x in {0 1} Learning Objective Training Inference Naïve Bayes maximize j i logp + logp ( x y ; θ ) ( y ; θ ) i ij
More informationA Discriminatively Trained, Multiscale, Deformable Part Model
A Discriminatively Trained, Multiscale, Deformable Part Model by Pedro Felzenszwalb, David McAllester, and Deva Ramanan CS381V Visual Recognition - Paper Presentation Slide credit: Duan Tran Slide credit:
More informationPart-based models. Lecture 10
Part-based models Lecture 10 Overview Representation Location Appearance Generative interpretation Learning Distance transforms Other approaches using parts Felzenszwalb, Girshick, McAllester, Ramanan
More informationOcclusion Patterns for Object Class Detection
Occlusion Patterns for Object Class Detection Bojan Pepik1 Michael Stark1,2 Peter Gehler3 Bernt Schiele1 Max Planck Institute for Informatics, 2Stanford University, 3Max Planck Institute for Intelligent
More informationDPM Score Regressor for Detecting Occluded Humans from Depth Images
DPM Score Regressor for Detecting Occluded Humans from Depth Images Tsuyoshi Usami, Hiroshi Fukui, Yuji Yamauchi, Takayoshi Yamashita and Hironobu Fujiyoshi Email: usami915@vision.cs.chubu.ac.jp Email:
More informationDetection III: Analyzing and Debugging Detection Methods
CS 1699: Intro to Computer Vision Detection III: Analyzing and Debugging Detection Methods Prof. Adriana Kovashka University of Pittsburgh November 17, 2015 Today Review: Deformable part models How can
More informationRecognizing Human Actions from Still Images with Latent Poses
Recognizing Human Actions from Still Images with Latent Poses Weilong Yang, Yang Wang, and Greg Mori School of Computing Science Simon Fraser University Burnaby, BC, Canada wya16@sfu.ca, ywang12@cs.sfu.ca,
More informationBig and Tall: Large Margin Learning with High Order Losses
Big and Tall: Large Margin Learning with High Order Losses Daniel Tarlow University of Toronto dtarlow@cs.toronto.edu Richard Zemel University of Toronto zemel@cs.toronto.edu Abstract Graphical models
More informationThe Pennsylvania State University. The Graduate School. College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS
The Pennsylvania State University The Graduate School College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS A Thesis in Computer Science and Engineering by Anindita Bandyopadhyay
More informationSelective Search for Object Recognition
Selective Search for Object Recognition Uijlings et al. Schuyler Smith Overview Introduction Object Recognition Selective Search Similarity Metrics Results Object Recognition Kitten Goal: Problem: Where
More informationDefinition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos
Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Sung Chun Lee, Chang Huang, and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu,
More informationColorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.
Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 People Detection Some material for these slides comes from www.cs.cornell.edu/courses/cs4670/2012fa/lectures/lec32_object_recognition.ppt
More informationDeformable Part Models
Deformable Part Models References: Felzenszwalb, Girshick, McAllester and Ramanan, Object Detec@on with Discrimina@vely Trained Part Based Models, PAMI 2010 Code available at hkp://www.cs.berkeley.edu/~rbg/latent/
More informationLarge-Scale Traffic Sign Recognition based on Local Features and Color Segmentation
Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation M. Blauth, E. Kraft, F. Hirschenberger, M. Böhm Fraunhofer Institute for Industrial Mathematics, Fraunhofer-Platz 1,
More informationA Hierarchical Compositional System for Rapid Object Detection
A Hierarchical Compositional System for Rapid Object Detection Long Zhu and Alan Yuille Department of Statistics University of California at Los Angeles Los Angeles, CA 90095 {lzhu,yuille}@stat.ucla.edu
More informationPedestrian Detection and Tracking in Images and Videos
Pedestrian Detection and Tracking in Images and Videos Azar Fazel Stanford University azarf@stanford.edu Viet Vo Stanford University vtvo@stanford.edu Abstract The increase in population density and accessibility
More informationFind that! Visual Object Detection Primer
Find that! Visual Object Detection Primer SkTech/MIT Innovation Workshop August 16, 2012 Dr. Tomasz Malisiewicz tomasz@csail.mit.edu Find that! Your Goals...imagine one such system that drives information
More informationIs 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014
Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014 Problem Definition Viewpoint estimation: Given an image, predicting viewpoint for object of
More informationSpatial Localization and Detection. Lecture 8-1
Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday
More informationLearning From Weakly Supervised Data by The Expectation Loss SVM (e-svm) algorithm
Learning From Weakly Supervised Data by The Expectation Loss SVM (e-svm) algorithm Jun Zhu Department of Statistics University of California, Los Angeles jzh@ucla.edu Junhua Mao Department of Statistics
More informationThe Caltech-UCSD Birds Dataset
The Caltech-UCSD Birds-200-2011 Dataset Catherine Wah 1, Steve Branson 1, Peter Welinder 2, Pietro Perona 2, Serge Belongie 1 1 University of California, San Diego 2 California Institute of Technology
More informationLecture 15: Detecting Objects by Parts
Lecture 15: Detecting Objects by Parts David R. Morales, Austin O. Narcomey, Minh-An Quinn, Guilherme Reis, Omar Solis Department of Computer Science Stanford University Stanford, CA 94305 {mrlsdvd, aon2,
More informationPart 5: Structured Support Vector Machines
Part 5: Structured Support Vector Machines Sebastian Nowozin and Christoph H. Lampert Providence, 21st June 2012 1 / 34 Problem (Loss-Minimizing Parameter Learning) Let d(x, y) be the (unknown) true data
More informationObject Detection on Self-Driving Cars in China. Lingyun Li
Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not
More informationPart based models for recognition. Kristen Grauman
Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily
More informationLearning to Localize Objects with Structured Output Regression
Learning to Localize Objects with Structured Output Regression Matthew Blaschko and Christopher Lampert ECCV 2008 Best Student Paper Award Presentation by Jaeyong Sung and Yiting Xie 1 Object Localization
More informationSegmenting Objects in Weakly Labeled Videos
Segmenting Objects in Weakly Labeled Videos Mrigank Rochan, Shafin Rahman, Neil D.B. Bruce, Yang Wang Department of Computer Science University of Manitoba Winnipeg, Canada {mrochan, shafin12, bruce, ywang}@cs.umanitoba.ca
More informationVideo understanding using part based object detection models
Video understanding using part based object detection models Vignesh Ramanathan Stanford University Stanford, CA-94305 vigneshr@stanford.edu Kevin Tang (mentor) Stanford University Stanford, CA-94305 kdtang@cs.stanford.edu
More informationDetecting Objects using Deformation Dictionaries
Detecting Objects using Deformation Dictionaries Bharath Hariharan UC Berkeley bharath2@eecs.berkeley.edu C. Lawrence Zitnick Microsoft Research larryz@microsoft.com Piotr Dollár Microsoft Research pdollar@microsoft.com
More information3D Semantic Parsing of Large-Scale Indoor Spaces Supplementary Material
3D Semantic Parsing of Large-Scale Indoor Spaces Supplementar Material Iro Armeni 1 Ozan Sener 1,2 Amir R. Zamir 1 Helen Jiang 1 Ioannis Brilakis 3 Martin Fischer 1 Silvio Savarese 1 1 Stanford Universit
More informationUsing a deformation field model for localizing faces and facial points under weak supervision
Using a deformation field model for localizing faces and facial points under weak supervision Marco Pedersoli Tinne Tuytelaars Luc Van Gool KU Leuven, ESAT/PSI - iminds ETH Zürich, CVL/D-ITET firstname.lastname@esat.kuleuven.be
More informationA Segmentation-aware Object Detection Model with Occlusion Handling
A Segmentation-aware Object Detection Model with Occlusion Handling Tianshi Gao 1 Benjamin Packer 2 Daphne Koller 2 1 Department of Electrical Engineering, Stanford University 2 Department of Computer
More informationA novel template matching method for human detection
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 A novel template matching method for human detection Duc Thanh Nguyen
More informationDetecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)
Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of
More informationObject Recognition II
Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based
More informationCategory vs. instance recognition
Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building
More informationLearning and Recognizing Visual Object Categories Without First Detecting Features
Learning and Recognizing Visual Object Categories Without First Detecting Features Daniel Huttenlocher 2007 Joint work with D. Crandall and P. Felzenszwalb Object Category Recognition Generic classes rather
More informationAn Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b
5th International Conference on Advanced Materials and Computer Science (ICAMCS 2016) An Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b 1
More informationLearning to Localize Objects with Structured Output Regression
Learning to Localize Objects with Structured Output Regression Matthew B. Blaschko and Christoph H. Lampert Max Planck Institute for Biological Cybernetics 72076 Tübingen, Germany {blaschko,chl}@tuebingen.mpg.de
More informationDeformable Part Models Revisited: A Performance Evaluation for Object Category Pose Estimation
Deformable Part Models Revisited: A Performance Evaluation for Object Category Pose Estimation Roberto J. Lo pez-sastre Tinne Tuytelaars2 Silvio Savarese3 GRAM, Dept. Signal Theory and Communications,
More informationA Discriminatively Trained, Multiscale, Deformable Part Model
A Discriminatively Trained, Multiscale, Deformable Part Model Pedro Felzenszwalb University of Chicago pff@cs.uchicago.edu David McAllester Toyota Technological Institute at Chicago mcallester@tti-c.org
More informationObject Detection by 3D Aspectlets and Occlusion Reasoning
Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan yuxiang@umich.edu Silvio Savarese Stanford University ssilvio@stanford.edu Abstract We propose a novel framework
More informationMultiple-Person Tracking by Detection
http://excel.fit.vutbr.cz Multiple-Person Tracking by Detection Jakub Vojvoda* Abstract Detection and tracking of multiple person is challenging problem mainly due to complexity of scene and large intra-class
More informationhttps://en.wikipedia.org/wiki/the_dress Recap: Viola-Jones sliding window detector Fast detection through two mechanisms Quickly eliminate unlikely windows Use features that are fast to compute Viola
More informationDetector of Facial Landmarks Learned by the Structured Output SVM
Detector of Facial Landmarks Learned by the Structured Output SVM Michal Uřičář, Vojtěch Franc and Václav Hlaváč Department of Cybernetics, Faculty Electrical Engineering Czech Technical University in
More informationSeparating Objects and Clutter in Indoor Scenes
Separating Objects and Clutter in Indoor Scenes Salman H. Khan School of Computer Science & Software Engineering, The University of Western Australia Co-authors: Xuming He, Mohammed Bennamoun, Ferdous
More informationDetecting Actions, Poses, and Objects with Relational Phraselets
Detecting Actions, Poses, and Objects with Relational Phraselets Chaitanya Desai and Deva Ramanan University of California at Irvine, Irvine CA, USA {desaic,dramanan}@ics.uci.edu Abstract. We present a
More informationObject detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation
Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationSelf-Paced Learning for Semisupervised Image Classification
Self-Paced Learning for Semisupervised Image Classification Kevin Miller Stanford University Palo Alto, CA kjmiller@stanford.edu Abstract In this project, we apply three variants of self-paced learning
More informationDPM Configurations for Human Interaction Detection
DPM Configurations for Human Interaction Detection Coert van Gemeren Ronald Poppe Remco C. Veltkamp Interaction Technology Group, Department of Information and Computing Sciences, Utrecht University, The
More informationParameter Sensitive Detectors
Boston University OpenBU Computer Science http://open.bu.edu CAS: Computer Science: Technical Reports 2007 Parameter Sensitive Detectors Yuan, Quan Boston University Computer Science Department https://hdl.handle.net/244/680
More informationCAP 6412 Advanced Computer Vision
CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha
More informationObject Category Detection: Sliding Windows
03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio
More informationHiRF: Hierarchical Random Field for Collective Activity Recognition in Videos
HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos Mohamed R. Amer, Peng Lei, Sinisa Todorovic Oregon State University School of Electrical Engineering and Computer Science {amerm,
More information18 October, 2013 MVA ENS Cachan. Lecture 6: Introduction to graphical models Iasonas Kokkinos
Machine Learning for Computer Vision 1 18 October, 2013 MVA ENS Cachan Lecture 6: Introduction to graphical models Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Center for Visual Computing Ecole Centrale Paris
More informationPart-Based Models for Object Class Recognition Part 2
High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de https://www.mpi-inf.mpg.de/hlcv Class of Object
More informationPart-Based Models for Object Class Recognition Part 2
High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de https://www.mpi-inf.mpg.de/hlcv Class of Object
More informationSegmentation. Bottom up Segmentation Semantic Segmentation
Segmentation Bottom up Segmentation Semantic Segmentation Semantic Labeling of Street Scenes Ground Truth Labels 11 classes, almost all occur simultaneously, large changes in viewpoint, scale sky, road,
More informationContent-Based Image Recovery
Content-Based Image Recovery Hong-Yu Zhou and Jianxin Wu National Key Laboratory for Novel Software Technology Nanjing University, China zhouhy@lamda.nju.edu.cn wujx2001@nju.edu.cn Abstract. We propose
More informationVideo annotation based on adaptive annular spatial partition scheme
Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory
More informationOcclusion Patterns for Object Class Detection
23 IEEE Conference on Computer Vision and Pattern Recognition Occlusion Patterns for Object Class Detection Bojan Pepik Michael Stark,2 Peter Gehler 3 Bernt Schiele Max Planck Institute for Informatics,
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification
More informationHOG-based Pedestriant Detector Training
HOG-based Pedestriant Detector Training evs embedded Vision Systems Srl c/o Computer Science Park, Strada Le Grazie, 15 Verona- Italy http: // www. embeddedvisionsystems. it Abstract This paper describes
More informationEvery Picture Tells a Story: Generating Sentences from Images
Every Picture Tells a Story: Generating Sentences from Images Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, David Forsyth University of Illinois
More informationNested Pictorial Structures
Nested Pictorial Structures Steve Gu and Ying Zheng and Carlo Tomasi Department of Computer Science, Duke University, NC, USA 27705 {steve,yuanqi,tomasi}@cs.duke.edu Abstract. We propose a theoretical
More informationTranslation Symmetry Detection: A Repetitive Pattern Analysis Approach
2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops Translation Symmetry Detection: A Repetitive Pattern Analysis Approach Yunliang Cai and George Baciu GAMA Lab, Department of Computing
More informationTraining Deformable Object Models for Human Detection based on Alignment and Clustering
Training Deformable Object Models for Human Detection based on Alignment and Clustering Benjamin Drayer and Thomas Brox Department of Computer Science, Centre of Biological Signalling Studies (BIOSS),
More informationPEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE
PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE Hongyu Liang, Jinchen Wu, and Kaiqi Huang National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science
More informationObject Category Detection. Slides mostly from Derek Hoiem
Object Category Detection Slides mostly from Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical template matching with sliding window Part-based Models
More informationGradient of the lower bound
Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level
More informationUsing k-poselets for detecting people and localizing their keypoints
Using k-poselets for detecting people and localizing their keypoints Georgia Gkioxari, Bharath Hariharan, Ross Girshick and itendra Malik University of California, Berkeley - Berkeley, CA 94720 {gkioxari,bharath2,rbg,malik}@berkeley.edu
More informationGraphical Models for Computer Vision
Graphical Models for Computer Vision Pedro F Felzenszwalb Brown University Joint work with Dan Huttenlocher, Joshua Schwartz, Ross Girshick, David McAllester, Deva Ramanan, Allie Shapiro, John Oberlin
More information