A Study of Vehicle Detector Generalization on U.S. Highway


2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC)
Windsor Oceanico Hotel, Rio de Janeiro, Brazil, November 1-4, 2016

Rakesh N. Rajaram, Eshed Ohn-Bar, and Mohan M. Trivedi
Laboratory for Intelligent and Safe Automobiles
University of California, San Diego
{rnattoji, eohnbar, mtrivedi}@ucsd.edu

Abstract

Vehicle detection is an essential task in an intelligent vehicle. Although it is a well-studied vision problem, it is unclear how well vehicle detectors generalize to new settings. This paper studies the generalization capability of vehicle detectors on a U.S. highway dataset. Two types of models are employed in the experimental analysis: a subcategory aggregate channel features model and a region-based convolutional neural network model. The experiments demonstrate the limited generalization capability of pre-trained models when evaluated on a dataset captured in new settings. This observation motivates technical modifications to improve generalization to the new dataset. By exploring novel training techniques, we significantly improve detection performance by up to %, demonstrating the importance of studying cross-dataset generalization.

I. INTRODUCTION

A robust object detector must handle appearance variations that arise for many reasons, including different scenes (urban roads vs. highways), camera viewing angle, occlusion by other objects, truncation as objects move in and out of the camera field of view, and variations in the object itself (e.g. SUV vs. sedan). Studying how such appearance variations affect performance amounts to studying the generalization limits of the object detector. Our study is motivated by the need to train object detectors that generalize well to new vehicles and settings. Several related research studies propose object detectors with improved generalization capabilities.
Notably, the Deformable Parts Model (DPM) [2] models object-part relationships to achieve increased flexibility in detection. The DPM also trains multiple aspect-ratio models to better handle variations in the aspect ratio of objects. The SubCat [3] vehicle detector extends this idea further by training models for varying vehicle types (sub-categories), defined by aspect ratio, orientation, or occlusion, or by clustering visual descriptors. Current state-of-the-art object detectors such as R-CNN [4], [5] achieve generalization by modeling hierarchies of increasingly higher-level representations and by employing large amounts of data. Although object detection performance has improved significantly in recent years, analysis of how the aforementioned models perform when trained in one scene setting and tested in another is lacking.

Vision-based vehicle detection has been a widely studied research topic over the past decade. Literature until 2013 has been carefully surveyed by Sivaraman et al. in [6], [7]. The recent success of vehicle detectors on the KITTI [1] dataset could be attributed to robust handling of appearance variation. Regionlets [8] handles variation in the location of object parts with a flexible feature extraction scheme within an object proposal region. A variant of DPM, OC-DPM [9], proposes a DPM with occlusion-specific model components. 3DVP [10] clusters samples into voxel patterns by occlusion and truncation, and consequently trains a cluster-specific vehicle detector.

Fig. 1. Difference in scene composition between highway and urban drives: (a) a typical drive on a highway; (b) a typical drive on an urban road. The urban images are from the KITTI [1] dataset and the highway images are from a video dataset collected by us. Vehicle ground truth annotation boxes are shown in green.
Current state-of-the-art region-based deep convolutional neural network (DCNN) models [4] train multilayer architectures on large amounts of data to implicitly handle object variations in classification and detection. We note that, where on-road vehicle detection is concerned, most existing approaches and datasets involve training and testing within similar scenes and geographical locations. Furthermore, although our study focuses on vehicles, similar issues are expected for other types of road occupants [11], [12].

Fig. 2. (a) Training pipeline: training data, clustering, model learning. (b) Testing pipeline: test image, pixel lookup features. This work studies different clustering techniques derived from aspect ratio and scale, similar to [3], for learning a highway detector. The monolithic detector uses AdaBoost with color and gradient-based pixel lookup features to learn models that provide fast detection at test time.

In this work, we perform a study of object detector generalization when training is done on the KITTI [1] dataset, collected in urban settings in Europe (Karlsruhe, Germany), and testing is done on a U.S. (San Diego) highway dataset collected by us. Intelligent vehicles and autonomous driving require object detection models that generalize over such variations in settings and geographical locations. We study how parameter choices made during training affect generalization to the new dataset. Furthermore, since generalization capability is influenced by both the training procedure and the dataset used, our experiments allow us to study the impact of dataset bias. Specifically, the motivations for this paper are as follows.

1) Initial experiments with detectors pre-trained on the KITTI dataset generated sub-optimal results on our new highway dataset.
2) Most previous work in highway vehicle detection is restricted to a narrow rear view of the vehicles. Our study is reported on data collected with a wider-view camera and with annotations of the entire view of each vehicle (as opposed to just the rear part).
3) Most previous work in highway vehicle detection reports evaluation metrics using the 50% PASCAL overlap criterion. We include analysis of detector performance at different overlap thresholds. Performance at a higher overlap threshold implies better localization.

These motivating points guide our study of the generalization of vehicle detectors to highway settings. Fig. 1 highlights some key differences between urban road and highway data collected at the two geographically different locations. The contributions of the paper are as follows.

1) We evaluate two state-of-the-art vehicle detectors on a new highway dataset.
2) We explore the impact of different clustering options on the detector's performance and show improvement of more than % over pre-trained models. This demonstrates the importance of studying cross-dataset, cross-settings generalization.
3) The fine-grained analysis of missed detections provides insight into the future scope of research.

For the comparative analysis, we employ the R-CNN [4] and SubCat [3] approaches. The analysis of the training procedure is mostly done on SubCat due to its fast training and testing time, with the aim of highlighting some of the improvements that can be helpful when adapting an object detector from one setting to another.

II. SUBCAT DETECTOR FOR ANALYZING GENERALIZATION

The key components of the SubCat [3] detector are shown in Fig. 2. The crux of this method is to cluster objects into different categories based on features which can be visual (such as colorspace or gradient magnitude), geometric (aspect ratio, height, 3D orientation, etc.), or semantic (occluded, truncated, etc.). Then, for each of these clusters, a model is generated by training a cluster-specific detector (we chose the ACF [13], [14] detector for our generalization studies). At test time, detection boxes from all the cluster-specific detectors are joined to produce the final vehicle detection boxes.

Formally, let $O = \{o^i_j\}$ be the set of all vehicles, with $i$ as the image index and $j$ indexing each vehicle in image $i$. In our clustering process, each cluster provides a subset $C_k \subseteq O$ with $1 \le k \le N$, where $N$ is the total number of clusters. The cluster set satisfies the following constraints: (1) $C_x \cap C_y = \emptyset$ for $x \ne y$, and (2) $O = \bigcup_{x=1}^{N} C_x$.

Fig. 3. Histograms (count) of vehicle aspect ratio (W/H) and of object height in pixels. The distribution of vehicle aspect ratio and height plays a significant role in deciding the number of clusters and their properties. Our highway dataset has diverse aspect ratios and heights, motivating the learning of cluster-specific models suited to such challenges. Ultimately, the goal is to study elements which impact generalization when training on KITTI and testing on the highway dataset.

III. SAN DIEGO HIGHWAY DATASET

The dataset used in this paper was captured using the testbed with a front-facing PointGrey color camera at 28x396, 5fps. The annotated frames used in the analysis correspond to a drive on the morning of December 24, 2015 (PST) on a San Diego interstate highway. It was a sunny morning, leading to some bright reflections from the surfaces of surrounding vehicles. We choose 22 semi-contiguous frames and annotate 2536 cars that are bigger than 3x3 and at least 5% visible. Since the clustering process depends heavily on the vehicle aspect ratio (the ratio of bounding box width to height) and on the height distribution, Fig. 3 helps in deciding optimal clustering strategies.

IV. EXPERIMENTAL ANALYSIS

The LISA-T highway dataset is split equally, with the first 5 images going into the training set and the remaining images going into the validation set. Data augmentation in the form of horizontally flipped images is added to the respective sets. This results in 22 images in each set, with 2856 vehicles in the training set and 226 vehicles in the testing set. This experimental setup allows contrasting the impact of training on KITTI with training on highway settings. Let $d$ be any detection bounding box and $o$ be any ground truth bounding box in the same image. Then the PASCAL overlap ($\eta$), to which the overlap threshold is applied, is calculated as $\eta = \frac{|d \cap o|}{|d \cup o|}$, where $|\cdot|$ denotes area.
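The overlap criterion above can be illustrated with a minimal Python sketch. The corner-format box layout `(x1, y1, x2, y2)`, the continuous (non pixel-inclusive) area convention, the `>=` comparison, and the names `pascal_overlap` and `is_true_positive` are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def pascal_overlap(d: Box, o: Box) -> float:
    """Return |d ∩ o| / |d ∪ o| for a detection d and a ground truth box o."""
    # Width and height of the intersection rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    inter_w = max(0.0, min(d[2], o[2]) - max(d[0], o[0]))
    inter_h = max(0.0, min(d[3], o[3]) - max(d[1], o[1]))
    inter = inter_w * inter_h

    area_d = (d[2] - d[0]) * (d[3] - d[1])
    area_o = (o[2] - o[0]) * (o[3] - o[1])
    union = area_d + area_o - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(d: Box, o: Box, eta: float = 0.7) -> bool:
    """Treat d as matching o when the overlap reaches the threshold eta."""
    return pascal_overlap(d, o) >= eta


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 100.0, 50.0)
    detection = (20.0, 0.0, 120.0, 50.0)  # same size, shifted 20 px right
    print(round(pascal_overlap(detection, ground_truth), 4))
    print(is_true_positive(detection, ground_truth, eta=0.5))
    print(is_true_positive(detection, ground_truth, eta=0.7))
```

In this hypothetical example the shifted detection overlaps the ground truth by two thirds, so it counts as a correct detection at η = 0.5 but as a miss at η = 0.7. This is why a stricter threshold rewards tighter localization.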
Unless specified, we report the area under the precision-recall curve (AUC) at η = 0.7 (70%).

Baseline: We run 3 different experiments with the objective of providing baseline performance on the highway dataset test set. The resulting precision-recall (PR) curves are shown in Fig. 4.

1) Pre-trained SubCat [3] models trained on the entire KITTI [1] object training set. This is an ensemble of 75 models, trained for 25 different orientation clusters at 3 different scales. This detector achieved 75.46% AUC on the KITTI benchmark [15].
2) Out-of-the-box Fast-RCNN [4] using the VGG16 [16] model fine-tuned on VOC07 [17]. About 2 object proposals per image are generated using EdgeBoxes [18]. Detections are generated at the default 5 scale multi-resolution settings.
3) New SubCat models trained on the split proposed in [] with tree depth-2 and up to negative samples. All other parameters are the same as for the pre-trained models.

Fig. 4. Precision-recall curve for baseline methods on the LISA highway test set. Legend: Retrained, AUC=72.3%; Fast-RCNN, AUC=6.9%; Pretrained, AUC=6.99%.

Fig. 5. Distribution of cluster average aspect ratio generated by k-means on the LISA-T highway training set. Axes: cluster center (mean aspect ratio) vs. number of clusters.

Next, we consider several strategies for improving the performance of the detector trained on KITTI for U.S. settings. The aim is to gain insight into the type of factors impacting an object detector when applied to new settings.

Strategy 1: We train ACF [13] models, each with a different aspect ratio as obtained by k-means clustering of the samples in the highway dataset. The model height ($h_k$) is fixed to 3 pixels. The model width is set as $w_k = h_k \cdot \mu_k$, where $\mu_k$ is the mean aspect ratio of cluster $k$. We train up to 248 depth-2 decision trees with all positive samples from $C_k$, up to negative samples, and 4 rounds of hard negative mining using AdaBoost. Object locations from other clusters are ignored during hard negative mining. Cluster centers are shown in Fig. 5. The resulting performance curves are shown in Fig. 6.

Fig. 6. Performance curves for different number of clusters (N) under strategy 1. Legend: N=, AUC=7.2%; N=2, AUC=69.4%; N=4, AUC=56.59%; N=8, AUC=6.73%; N=, AUC=66.63%.

Strategy 2: Strategy 1 is modified such that, during the training of each model, we allow mining hard negatives from positive samples in other clusters. The resulting performance curves are shown in Fig. 7.

Fig. 7. Performance curves for different number of clusters (N) under strategy 2. Legend: N=, AUC=7.2%; N=2, AUC=75.67%; N=4, AUC=47.6%; N=8, AUC=8.35%; N=, AUC=2.8%.

Strategy 3: Strategy 1 is modified such that all positive samples from $O$ are used to train each model. Effectively, we are training multiple models on the same data, but at different aspect ratios. We note that this is not the same as the original procedure described in Section II, which partitions the training set into disjoint training clusters. Strategy 3 can leverage additional data within each cluster, and the resulting curves are shown in Fig. 8.

Fig. 8. Performance curves for different number of clusters (N) under strategy 3. Legend: N=, AUC=7.2%; N=2, AUC=74.74%; N=4, AUC=76.98%; N=8, AUC=8.63%; N=, AUC=8.28%.

Fig. 9. AUC as a function of overlap threshold (η) for selected experiments. Legend: Retrained; Fast-RCNN; N=, Strategy 1; N=2, Strategy 2; N=8, Strategy 3.

V. DISCUSSION

Comparing the performance curves for the three strategies of adapting a detector to the new settings with the baselines, we see that the pre-trained model performs sub-optimally compared to models trained specifically on highway data. SubCat [3] models trained only on data from urban areas of Germany fail to generalize to our highway dataset. On the other hand, Fast-RCNN [4] in particular, and deep convolutional neural networks in general, are trained on vast amounts of data across multiple object classes and are hence expected to generalize well across appearance and scene variation. This is found to be true for highway vehicle detection; the reason for the lower AUC is explained in the following subsection.

A. Overlap Threshold

The interplay of AUC and overlap threshold is insightful for understanding the localization performance of any detector. In Fig. 9, we take the best performing method from each strategy and plot AUC as a function of overlap threshold η. While all of the models follow a similar trend, Fast-RCNN shows a steep decline once η reaches about 0.65. This is a well-known issue with DCNN-based approaches: they are very good at classification tasks but may fail at localizing the objects. In fact, Fast-RCNN performs slightly better than all other models, with 88.54% AUC at η = 0.5. This is a useful insight for the intelligent vehicles domain, which may not have been clearly visible in general object detection (e.g. the PASCAL [17] dataset).

Fig. 10. The distribution of missed vehicles with respect to object position in the image. Each dot is the center location of a missed vehicle. Red corresponds to η = 0.5, with additional misses in green when η increases to 0.7. This result is generated using the detector trained with strategy 3 (N = 8).

Fig. 11. Example detection results generated using the detector trained with strategy 3 (N = 8). Green boxes are ground truth. Red boxes are detections, with the confidence score printed above the box.

B. Missed Detections

The location and size of missed detections are useful in understanding the limitations of the detector. We limit this analysis to the best performing detector (i.e. strategy 3 with N = 8). In Fig. 10, each missed vehicle's center location is plotted as a dot on the image plane. Red dots correspond to η = 0.5 and green dots are added when η increases to 0.7. Fig. 10 helps us draw two important conclusions. (1) Most of the truncated vehicles are detected but poorly localized. (2) Small vehicles are almost never detected. This can be seen quantitatively in Fig. 12, where we plot the fraction of missed detections vs. bounding box height. Selected detection boxes along with ground truth are visualized in Fig. 11.

VI. CONCLUDING REMARKS

In this work, we studied the impact of model parameters and dataset training procedure on vehicle detection in highway settings.
The study shows that a vehicle detector trained on a highway dataset performs significantly better than one trained on a geographically different dataset. In particular, improved performance could be obtained even with a fraction of the modeling complexity of an off-the-shelf, pre-trained detector (for SubCat, we use 8 components as opposed to the 75 components in [3]). This shows how new evaluation settings could render elements of the model redundant. Furthermore, the detector performance (AUC) evaluated at different overlap thresholds suggests that highway vehicle detection is still an open challenge, especially when localization accuracy is important.

Fig. 12. Fraction of missed vehicles as a function of vehicle height in pixels. Blue bars correspond to η = 0.7 and red bars correspond to η = 0.5. This plot is generated using the detector trained with strategy 3 (N = 8).

VII. ACKNOWLEDGMENTS

The authors would like to thank our sponsors and associated industry partners for their support. We also thank our colleagues at the Laboratory for Intelligent and Safe Automobiles (LISA), University of California, San Diego, for encouragement and assistance.

REFERENCES

[1] A. Geiger, P. Lenz, and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite," in Conference on Computer Vision and Pattern Recognition, 2012.
[2] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, "Object detection with discriminatively trained part based models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, 2010.
[3] E. Ohn-Bar and M. M. Trivedi, "Learning to detect vehicles by clustering appearance patterns," IEEE Transactions on Intelligent Transportation Systems, 2015.
[4] R. Girshick, "Fast R-CNN," in International Conference on Computer Vision (ICCV), 2015.
[5] R. N. Rajaram, E. Ohn-Bar, and M. M. Trivedi, "RefineNet: Iterative refinement for accurate object localization," in IEEE Intelligent Transportation Systems Conference, 2016.
[6] S. Sivaraman and M. M. Trivedi, "Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis," IEEE Transactions on Intelligent Transportation Systems, 2013.
[7] S. Sivaraman and M. M. Trivedi, "A general active learning framework for on-road vehicle recognition and tracking," IEEE Transactions on Intelligent Transportation Systems, 2010.
[8] X. Wang, M. Yang, S. Zhu, and Y. Lin, "Regionlets for generic object detection," in Computer Vision (ICCV), 2013 IEEE International Conference on, pp. 17-24, IEEE, 2013.
[9] B. Pepikj, M. Stark, P. Gehler, and B. Schiele, "Occlusion patterns for object class detection," in Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, IEEE, 2013.
[10] Y. Xiang, W. Choi, Y. Lin, and S. Savarese, "Data-driven 3D voxel patterns for object category recognition," in IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[11] R. N. Rajaram, E. Ohn-Bar, and M. M. Trivedi, "An exploration of why and when pedestrian detection fails," IEEE Conference on Intelligent Transportation Systems, September 2015.
[12] R. N. Rajaram, E. Ohn-Bar, and M. M. Trivedi, "Looking at pedestrians at different scales: A multiresolution approach and evaluations," IEEE Transactions on Intelligent Transportation Systems, 2016.
[13] P. Dollár, R. Appel, S. Belongie, and P. Perona, "Fast feature pyramids for object detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014.
[14] P. Dollár, "Piotr's Computer Vision Matlab Toolbox (PMT)." pdollar/toolbox/doc/index.html.
[15] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," The International Journal of Robotics Research, 2013.
[16] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
[17] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The PASCAL visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, June 2010.
[18] C. L. Zitnick and P. Dollár, "Edge boxes: Locating object proposals from edges," in Computer Vision - ECCV 2014, Springer, 2014.

Can appearance patterns improve pedestrian detection?

Can appearance patterns improve pedestrian detection? IEEE Intelligent Vehicles Symposium, June 25 (to appear) Can appearance patterns improve pedestrian detection? Eshed Ohn-Bar and Mohan M. Trivedi Abstract This paper studies the usefulness of appearance

More information

Multi-Perspective Vehicle Detection and Tracking: Challenges, Dataset, and Metrics

Multi-Perspective Vehicle Detection and Tracking: Challenges, Dataset, and Metrics IEEE 19th International Conference on Intelligent Transportation Systems (ITSC) Windsor Oceanico Hotel, Rio de Janeiro, Brazil, November 1-, Multi-Perspective Vehicle Detection and Tracking: Challenges,

More information

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Johnson Hsieh (johnsonhsieh@gmail.com), Alexander Chia (alexchia@stanford.edu) Abstract -- Object occlusion presents a major

More information

Regionlet Object Detector with Hand-crafted and CNN Feature

Regionlet Object Detector with Hand-crafted and CNN Feature Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet

More information

Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing

Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Introduction In this supplementary material, Section 2 details the 3D annotation for CAD models and real

More information

Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material

Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Chi Li, M. Zeeshan Zia 2, Quoc-Huy Tran 2, Xiang Yu 2, Gregory D. Hager, and Manmohan Chandraker 2 Johns

More information

Joint Object Detection and Viewpoint Estimation using CNN features

Joint Object Detection and Viewpoint Estimation using CNN features Joint Object Detection and Viewpoint Estimation using CNN features Carlos Guindel, David Martín and José M. Armingol cguindel@ing.uc3m.es Intelligent Systems Laboratory Universidad Carlos III de Madrid

More information

3D Object Representations for Recognition. Yu Xiang Computational Vision and Geometry Lab Stanford University

3D Object Representations for Recognition. Yu Xiang Computational Vision and Geometry Lab Stanford University 3D Object Representations for Recognition Yu Xiang Computational Vision and Geometry Lab Stanford University 1 2D Object Recognition Ren et al. NIPS15 Ordonez et al. ICCV13 Image classification/tagging/annotation

More information

Detection and Localization with Multi-scale Models

Detection and Localization with Multi-scale Models Detection and Localization with Multi-scale Models Eshed Ohn-Bar and Mohan M. Trivedi Computer Vision and Robotics Research Laboratory University of California San Diego {eohnbar, mtrivedi}@ucsd.edu Abstract

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization

More information

Object Detection by 3D Aspectlets and Occlusion Reasoning

Object Detection by 3D Aspectlets and Occlusion Reasoning Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan Silvio Savarese Stanford University In the 4th International IEEE Workshop on 3D Representation and Recognition

More information


REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological

More information

Supervised learning and evaluation of KITTI s cars detector with DPM

Supervised learning and evaluation of KITTI s cars detector with DPM 24 IEEE Intelligent Vehicles Symposium (IV) June 8-, 24. Dearborn, Michigan, USA Supervised learning and evaluation of KITTI s cars detector with DPM J. Javier Yebes, Luis M. Bergasa, Roberto Arroyo and

More information

Object detection with CNNs

Object detection with CNNs Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals

More information

Final Report: Smart Trash Net: Waste Localization and Classification

Final Report: Smart Trash Net: Waste Localization and Classification Final Report: Smart Trash Net: Waste Localization and Classification Oluwasanya Awe oawe@stanford.edu Robel Mengistu robel@stanford.edu December 15, 2017 Vikram Sreedhar vsreed@stanford.edu Abstract Given

More information

Unified, real-time object detection

Unified, real-time object detection Unified, real-time object detection Final Project Report, Group 02, 8 Nov 2016 Akshat Agarwal (13068), Siddharth Tanwar (13699) CS698N: Recent Advances in Computer Vision, Jul Nov 2016 Instructor: Gaurav

More information

Part Localization by Exploiting Deep Convolutional Networks

Part Localization by Exploiting Deep Convolutional Networks Part Localization by Exploiting Deep Convolutional Networks Marcel Simon, Erik Rodner, and Joachim Denzler Computer Vision Group, Friedrich Schiller University of Jena, Germany www.inf-cv.uni-jena.de Abstract.

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

arxiv: v2 [cs.cv] 14 May 2018

arxiv: v2 [cs.cv] 14 May 2018 ContextVP: Fully Context-Aware Video Prediction Wonmin Byeon 1234, Qin Wang 1, Rupesh Kumar Srivastava 3, and Petros Koumoutsakos 1 arxiv:1710.08518v2 [cs.cv] 14 May 2018 Abstract Video prediction models

More information

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection Yu Xiang 1, Wongun Choi 2, Yuanqing Lin 3, and Silvio Savarese 4 1 University of Washington, 2 NEC Laboratories America,

More information

Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features

Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features Ankit Verma, Ramya Hebbalaguppe, Lovekesh Vig, Swagat Kumar, and Ehtesham Hassan TCS Innovation Labs, New Delhi

More information



More information

Object Detection on Self-Driving Cars in China. Lingyun Li

Object Detection on Self-Driving Cars in China. Lingyun Li Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not

More information

Spatial Localization and Detection. Lecture 8-1

Spatial Localization and Detection. Lecture 8-1 Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

More information

DPM Score Regressor for Detecting Occluded Humans from Depth Images

DPM Score Regressor for Detecting Occluded Humans from Depth Images DPM Score Regressor for Detecting Occluded Humans from Depth Images Tsuyoshi Usami, Hiroshi Fukui, Yuji Yamauchi, Takayoshi Yamashita and Hironobu Fujiyoshi Email: usami915@vision.cs.chubu.ac.jp Email:

More information

Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014

Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014 Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014 Problem Definition Viewpoint estimation: Given an image, predicting viewpoint for object of

More information

[Supplementary Material] Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors

[Supplementary Material] Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors [Supplementary Material] Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors Junhyug Noh Soochan Lee Beomsu Kim Gunhee Kim Department of Computer Science and Engineering
