Multi-Object Tracking Based on Tracking-Learning-Detection Framework
Songlin Piao, Karsten Berns
Robotics Research Lab, University of Kaiserslautern

Abstract. This paper presents a framework for robust, long-term, real-time tracking of multiple objects against a dynamic background. The tracked objects may be of the same type or of entirely different types. For each tracked object, a dedicated classifier is trained on-line using positive and negative constraints inside a local region. An external Detector module is integrated into the framework to handle objects that disappear. The framework selects the best result from several independent components and estimates the error at the same time. A Kalman filter and a particle filter are used inside the Filtering component to predict the possible positions of the object in the next frame. Finally, test results from a range of situations show that the proposed algorithm is general and extensible.

1 Introduction

Object detection and tracking are becoming increasingly important. One of the main applications is Advanced Driver Assistance Systems for driving safety [6], and the same applies to robotics. A robot needs to detect and track the objects around it, both to perceive its environment and to remain safe: it must not harm the humans around it. The proposed algorithm was originally designed to detect and track humans around a low-speed robot, as shown in Fig. 1. The detection and tracking module is located inside the perception layer (see footnote 1). The framework itself is general and can be used to detect and track any kind of object.

Detection normally requires a classifier that has been trained off-line for the specific object. Two greedy search strategies are in common use. The first fixes the size of the search window and changes the size of the whole image. The second fixes the size of the whole image and changes the size of the search window.
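The two scanning strategies can be illustrated with a minimal Python sketch. The function names, the scale step of 1.25 and the stride of 8 pixels are assumptions made for illustration; they are not taken from the paper or from any of the cited detectors.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in original-image pixels


def scan_by_resizing_image(img_w: int, img_h: int, win_w: int, win_h: int,
                           scale_step: float = 1.25, stride: int = 8) -> Iterator[Box]:
    """Strategy 1: the window size is fixed and the image shrinks level by level."""
    scale = 1.0
    while img_w / scale >= win_w and img_h / scale >= win_h:
        level_w, level_h = int(img_w / scale), int(img_h / scale)
        for y in range(0, level_h - win_h + 1, stride):
            for x in range(0, level_w - win_w + 1, stride):
                # Map the window from the shrunken level back to the original image.
                yield (int(x * scale), int(y * scale),
                       int(win_w * scale), int(win_h * scale))
        scale *= scale_step


def scan_by_resizing_window(img_w: int, img_h: int, win_w: int, win_h: int,
                            scale_step: float = 1.25, stride: int = 8) -> Iterator[Box]:
    """Strategy 2: the image size is fixed and the window grows level by level."""
    scale = 1.0
    while win_w * scale <= img_w and win_h * scale <= img_h:
        w, h = int(win_w * scale), int(win_h * scale)
        step = max(1, int(stride * scale))  # larger windows move in larger steps
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield (x, y, w, h)
        scale *= scale_step
```

Both generators enumerate candidate boxes in original-image coordinates, so the same classifier interface could consume either one. In the first strategy the descriptor is always computed on a window of the same pixel size; in the second the features have to be rescaled instead.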
Dalal and Triggs used histograms of oriented gradients to detect humans [5]. They trained a human detection classifier with a linear SVM and used the first strategy to scan the whole image. Viola and Jones instead used Haar-like features and the second strategy to scan for faces in the image [18]; their classifier was trained with AdaBoost. Their key trick is the integral image, which yields the sum of the pixels inside a given region in constant time. The performance of a detection algorithm depends on the features used to build the descriptor, on the learning algorithm, and on the computational cost. Wu et al. proposed a new visual shape descriptor called CENTRIST, which is similar to LBP [17], for scene categorization, and showed real-time performance in human detection [20].

Footnote 1: It is the safety component of the whole system.

Fig. 1. Designed System
Fig. 2. Framework Overview

Tracking is used to locate the same object across successive frames for as long as possible. Multi-object tracking in the real world is difficult because of occlusions, changing backgrounds, noise and so on. Tracking-by-detection methods have recently become more and more popular. Breitenstein et al. proposed an on-line tracking-by-detection algorithm based on a particle filter [3]. They used the detector confidence when reasoning about the next frame, and used the on-line boosting described in [7] to learn an object classifier at runtime. Babenko et al. proposed on-line multiple instance learning (MIL) for robust object tracking [1]. A MIL classifier is updated with one positive bag consisting of several image patches, rather than with several individual positive patches, which addresses the drift problem of traditional tracking-by-detection algorithms. Saffari et al. proposed on-line random forests [16] and on-line multi-class LPBoost [15] to handle multi-class classification, which on-line boosting [7] does not cover because it considers only binary classification.

1.1 Related work

The Tracking-Learning-Detection framework was first proposed by Kalal et al. in [11]. They explicitly decomposed the long-term tracking task into tracking, learning and detection parts. For tracking, they used forward-backward errors to detect tracking failures automatically [10]; the main concept is based on the Lucas-Kanade feature tracker [2]. For learning, they proposed the P-N learning framework, which consists of a P-expert and an N-expert. The P-expert analyzes examples classified as negative, estimates the false negatives and adds them to the training set with a positive label. The N-expert analyzes examples classified as positive, estimates the false positives and adds them to the training set with a negative label. They modeled this learning process as an iterative procedure and analyzed its error convergence conditions using the well-founded theory of dynamical systems [14]. For detection, they used a cascaded classifier for speed. The fern-like features described in [13] were used for classification, and a nearest neighbor classifier was chosen as the final stage. Our work was mainly inspired by this Tracking-Learning-Detection framework, which we refer to as TLD in the following sections.

1.2 Proposed framework overview

This paper extends the original TLD framework. An overview of the proposed framework is shown in Fig. 2. Each process contains a single instance of TrackingController, which holds all the information about the trackers. In each frame, TrackingController updates the positions of all trackers by calling each tracker's update function in turn. Each tracker updates its own state using information from the various modules and the filtering strategy. At the same time it updates its appearance model based on P-N learning theory, as described in [9]. Finally, TrackingController calls a post-processing function that analyzes and corrects errors based on the current status computed by each tracker. For example, two trackers may end up tracking the same target when several targets cross each other.

The external detector is trained using the method described in [21], the median flow tracker is implemented as described in [10], the recognizer is similar to the detector described in [11], and mean-shift is implemented as described in [4]. The filtering framework currently implements the classical Kalman filter [19] and the Condensation particle filter [8]. The recognizer uses the on-line adapted appearance model of the object to evaluate the confidence of each candidate. Our main contribution is the extension of the original TLD framework to a multi-target version while the framework itself remains general and extensible.
General means that each module in the framework could be replaced by any other state-of-the-art algorithm, and extensible means that additional modules could easily be added to the framework.

The remainder of this paper is organized as follows. Section 2 discusses the proposed framework step by step. Subsection 2.1 briefly introduces P-N learning; subsection 2.2 introduces the basic detection technique integrated in the framework; subsection 2.3 introduces the local search strategy; subsection 2.4 introduces the fusion strategy. Section 3 describes the experimental results, and Section 4 gives the conclusion and future work.

2 Proposed Framework

This section describes the proposed framework in more detail, discussing the related concepts step by step. The main structure of each tracker is shown in Fig. 3. In each frame, four kinds of candidate regions, estimated by the Mean-shift, Detector, Recognition and Median Flow modules respectively, are passed to the Fusion module. The Fusion module uses the known appearance model to judge which candidate best matches the current appearance of the tracked object. The Learning module then updates the object's appearance model, again based on P-N learning theory.

Fig. 3. Structure of Tracker

To speed up processing, we search inside a local region instead of the whole image. We ran local-region search alone on all the videos mentioned in [11], and the results were better than we expected. One example is shown in Fig. 4(a). The large green box around each tracked object in Fig. 4(b) is this local region. In the case of the particle filter, we add Gaussian noise to the transition function, as shown in Fig. 4(c). The potential problems of local search, and how to handle them, are discussed in the following subsections.

Fig. 4. Local Region Searching: (a) Example, (b) Case of Kalman Filter, (c) Case of Particle Filter

2.1 Introduction to P-N learning

P-N learning is a learning strategy that uses positive and negative constraints. Many researchers have in fact used this kind of method in earlier work without a strict proof. For example, the authors of [3] and [12] used on-line learning to update the corresponding classifiers, and the way they sampled positive and negative training data conforms to the P-N learning constraints. They update the classifier in every frame, which can cause unnecessary computational cost. According to P-N learning theory, however, the classifier does not need to be updated in every frame, only when certain conditions are satisfied. These conditions are discussed in detail in the fusion subsection. This mechanism can save a large amount of computation time. Kalal et al. give a detailed proof of P-N learning theory in [9].

2.2 Introduction to Detector

As shown in Fig. 2, both the TrackingController and Tracker classes need the external Detector class to scan a specified region. The faster the detection runs without loss of accuracy, the better. A detector normally uses a greedy search strategy, as mentioned previously. During scanning, the descriptor of each patch is computed and passed to the predefined classifier, which judges whether the patch contains the specific object. Wu et al. proposed real-time human detection using the CENTRIST descriptor, which is very similar to LBP, in [20]. This method achieved 20 fps on VGA-resolution images using only an embedded 1.2 GHz CPU. It is almost 80 times faster than the HOG-based descriptor [5] while achieving similar accuracy.

As can be seen in Fig. 5, the patch (here a Sobel edge image) is divided into a grid of cells, and the green box scans the whole patch to build the descriptor. In the case of Fig. 5, each patch is divided into 3 by 4 cells, so there are 2 × 3 = 6 possible positions of the green box. For each green box the length of the LBP descriptor is 256, so the total length of the final descriptor is 256 × 6 = 1536. For each pixel inside the green box, the index into the LBP descriptor is calculated as shown in Fig. 5(b). The index can be written as the binary number (C1 C2 C3 C4 C5 C6 C7 C8)_2, where C_i is set to 1 if the corresponding neighbor pixel value is higher than the current pixel value and to 0 otherwise. We trained a cascaded detector using a linear SVM and a histogram intersection kernel SVM, as in Fig. 6.
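The descriptor construction can be made concrete with a small pure-Python sketch. It assumes that the green box spans 2 × 2 cells (which gives the 2 × 3 = 6 positions for a 3 by 4 grid), that neighbors are read in row-major order, and that border pixels without a full neighborhood are skipped. None of these details is stated in the text, and the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def lbp_code(img: Image, y: int, x: int) -> int:
    """Return the 8-bit index (C1 C2 ... C8)_2 for the pixel at row y, column x."""
    center = img[y][x]
    # Neighbor order C1..C8 is assumed to be row-major around the center.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    code = 0
    for dy, dx in offsets:
        bit = 1 if img[y + dy][x + dx] > center else 0  # 1 if neighbor is higher
        code = (code << 1) | bit
    return code


def block_histogram(img: Image, top: int, left: int, height: int, width: int) -> List[int]:
    """256-bin histogram of LBP codes inside one box, skipping image-border pixels."""
    hist = [0] * 256
    for y in range(max(top, 1), min(top + height, len(img) - 1)):
        for x in range(max(left, 1), min(left + width, len(img[0]) - 1)):
            hist[lbp_code(img, y, x)] += 1
    return hist


def patch_descriptor(img: Image, grid_rows: int, grid_cols: int) -> List[int]:
    """Concatenate the histograms of every 2x2-cell box position in the patch."""
    cell_h = len(img) // grid_rows
    cell_w = len(img[0]) // grid_cols
    descriptor: List[int] = []
    for r in range(grid_rows - 1):
        for c in range(grid_cols - 1):
            descriptor.extend(
                block_histogram(img, r * cell_h, c * cell_w, 2 * cell_h, 2 * cell_w))
    return descriptor
```

For a patch divided into 3 by 4 cells, `patch_descriptor` returns 6 × 256 = 1536 values, matching the length derived in the text. A real detector would compute this on the Sobel edge image and feed the result to the SVM cascade.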
Fig. 5. CENTRIST Descriptor, panels (a) and (b)
Fig. 6. Cascade In Detection

2.3 Local Region Searching

The original TLD framework searches for the object in the whole image because it tracks only one object. Here we search in a local region instead, which has two main advantages. First, it dramatically reduces the processing time for each frame. Second, it makes multi-object tracking possible. Four results go into the Fusion module, as shown in Fig. 3. All of them except the result from the Median Flow module come from a search in the local region. Median Flow is not restricted to the local region in order to overcome the drift problem, which occurs more often under local search than under search in the whole image.

First, the Mean-shift module uses two kinds of information. One is the back-projected image computed from the initial color histogram of the object. The other is the detection confidence image, which comes from the Detector module. The concept of a detection confidence map was proposed in [3], whose authors used the continuous confidence of pedestrian detectors and on-line trained, instance-specific classifiers as a graded observation model in addition to the final high-confidence detection results. In our case, the detector confidence density d_c(p) corresponds to the raw SVM output ρ before non-maximum suppression, which is further scaled to [0,1] using f = 1 - exp(-ρ). Fig. 7 illustrates this concept. When only the classical back-projected image is used, the Mean-shift module may drift to a wrong position because the background color is similar to the target. When the density image is combined with it, the result becomes much more robust.

Fig. 7. Mean-shift Example

The starting position of the Mean-shift module depends on the previous tracking result. If the Fusion module selected the Mean-shift result as the final result, the last result is considered stable, and we start the Mean-shift algorithm from the last tracked position. This is similar to the continuous Mean-shift algorithm. If the last selected tracking result did not come from the Mean-shift module, or if there was no confident tracking result at all, the Mean-shift algorithm starts from the center of the local search area. This center is predicted by the Filtering component.

Second, there are two search strategies for the Detector module. The first is carried out by TrackingController, not by Tracker. TrackingController detects objects either in the whole frame or only in the boundary area around the blue box in Fig. 8. It then associates each detected object with a tracker using a method similar to the one described in [3]. After this step each tracker either has an associated detection result or does not. As shown in Fig. 8, the tracker gets an associated detection result in the first image but not in the second, because half of the human lies in the non-searching area. In that case the tracker uses the Detector module to search its local area for a human. In this Detector module the scanning window size is fixed.
Instead, the size of the whole image is changed at each level.
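The step in which TrackingController associates detections with trackers can be sketched as a greedy one-to-one assignment. The matching score used in [3] is not reproduced here; plain box overlap (intersection over union) is swapped in as a stand-in, and the `min_overlap` threshold of 0.3 and the function names are assumptions.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def associate(trackers: List[Box], detections: List[Box],
              min_overlap: float = 0.3) -> Dict[int, int]:
    """Greedily pair each tracker index with at most one detection index."""
    scored = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(trackers)
         for di, d in enumerate(detections)),
        reverse=True)
    assignment: Dict[int, int] = {}
    used_detections = set()
    for score, ti, di in scored:
        if score < min_overlap:
            break  # all remaining pairs overlap even less
        if ti in assignment or di in used_detections:
            continue  # tracker or detection is already paired
        assignment[ti] = di
        used_detections.add(di)
    return assignment
```

Trackers that are missing from the returned dictionary have no associated detection. Under the scheme described above, those are the trackers that would fall back to running the Detector module on their own local region.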
Fig. 8. Different Detection Strategy

Third, the Recognition module uses the cascaded detector proposed in [11] to search for the exact tracked object inside the local region. The classifier has three stages: (i) patch variance, (ii) ensemble classifier and (iii) nearest neighbor. Each stage either passes the patch to the next stage or rejects it. For more detail, please refer to [11]. Searching only the local region instead of the whole image reduces the computation time significantly.

2.4 Introduction to Fusion

As shown in Fig. 3, four results enter this module: two from the tracker side and two from the detection side. First, we compare the two results from the tracker side. One is from the Mean-shift module, denoted T_t^meanshift, and the other is from the Median Flow module, denoted T_t^medianflow. The confidence of each is calculated by the on-line trained classifier, and the result with the higher confidence is selected. We then compare this confidence with two predefined thresholds, θ_plus and θ_minus, on which the learning procedure depends. Learning can be triggered in two situations: either the current confidence is larger than θ_plus, or the current confidence is larger than θ_minus and the previous learning state was also true. We set θ_plus to 0.65 and θ_minus to 0.55 in our system. If, however, the Detector side yields exactly one rectangle with a confidence higher than that of the Tracking side, the response of the Detector module is assigned to the final result, which causes re-initialization. The details are given in Algorithm 1.

In the original TLD paper the authors updated the classifier only when the tracking result was valid. Here the result from the Detector side is also used for learning when its confidence is larger than a threshold, because the search area is already a local region around the object. In other words, a high-confidence detection inside the local region can be treated as a clue for the P-expert. Note that different fusion strategies lead to different performance. The strategy shown here is only the one used in the example system, and it can be adapted to any specific requirements.

Learning begins once the current state is judged to be confident enough. As stated in [9], the P-expert collects all patches that highly overlap with the final state and labels them as positive, while the N-expert collects all patches that do not overlap with the final state and labels them as negative. A bounding box B is highly overlapped if the overlap ratio is over 60%, and not overlapped if the overlap ratio is less than 20%.

3 Experiments

We tested the proposed framework on all the datasets described in [11] and [1]. The framework is designed for multiple targets, so for compatibility it should also work in the single-target case. All of these test datasets are for single-target tracking, and our framework tracked the target successfully in all of them. In addition to these datasets, we carried out further experiments, divided into two parts: one without a detector and one with a specific detector. The specific detectors used in the second part are a face detector and a human detector. The human detector was tested outdoors, where the image is not a normal image (the omnidirectional view is transformed to a panoramic view, so the noise is larger than in a normal view), the background changes quickly and the sunlight changes abruptly.

Fig. 9 shows tracking of a face and a cup, which are two different types of object. We initialized each object's position with the mouse. The tracker does not lose the target even under partial occlusion. Fig. 10 shows tracking of two eyes and one mouth.

Fig. 9. First Test Without Detector

In the second part, both the face detector and the human detector are trained using the method described in [20]. The result is shown in Fig. 11. Fig. 12 and Fig. 8 show a more challenging situation. In this case the original TLD algorithm does not work at all, but our system works well. This is because we added two components to the original algorithm: the external Detector module and the Mean-shift module. Here we set the maximum number of trackers to 1 in order to show why the proposed framework is useful. As mentioned before, we search only inside the green box, which is the local area. There are four kinds of small rectangles inside the large green box: the red one is the result from the Detector module, the yellow one is from the Median Flow module, the black one is from the Mean-shift module and the blue one is from the Recognition module. As can be seen in Fig. 8, there are only yellow and black rectangles inside the green area, which means that only the results from the Median Flow and Mean-shift modules are currently available. We analyze these two results using the on-line trained object model and judge whether they are reliable. In the bottom image of Fig. 12 there is no reliable tracker result, only a reliable detection. This kind of procedure is carried out inside the Fusion module. If only detection were applied in this situation, most targets would be missed because of the sunlight and the fast movement of the target.

To show the generality of the proposed framework, we replaced the Kalman filter with the particle filter and applied the framework to the ETH benchmark dataset. Many other benchmark datasets are available, but we are not interested in static backgrounds or indoor environments. The test video is 999 frames long with a resolution of 640 by 480. The maximum number of trackers inside TrackingController in Fig. 2 is set to 6, and we used 100 particles for each tracker. Fig. 13 shows the result for the 132nd, 284th, 432nd and 544th frames of the test sequence. Several tracking errors can be seen, but thanks to our error detection mechanism these wrong trackers are re-initialized within several frames.

Thanks to the fast runtime of the Detector module and the Kalman filter, our algorithm runs very fast. Our face detector takes 18 ms per frame on a 640 by 480 image, compared with 132 ms for OpenCV, using i CPU. The average processing time for one tracked target in our algorithm is 64 ms per frame, including all procedures.

Algorithm 1 Fusion strategy
Require: T_t^meanshift, T_t^medianflow, D_t^detector, D_t^recognizer
Ensure: B_t ← ∅, valid(B_t) ← false
  C_detect ← 0, C_tracker ← 0
  tracked ← false
  reinitialization ← false
  confidencedetections ← 0
  for R_t in T_t^meanshift ∪ T_t^medianflow do
    if C(R_t) ≥ C_tracker then
      B_t^tracker ← R_t
      C_tracker ← C(R_t)
    end if
  end for
  if C_tracker ≥ θ_plus then
    B_t ← B_t^tracker
    tracked ← true, valid(B_t) ← true
  else if valid(B_{t-1}) & (C_tracker > θ_minus) then
    B_t ← B_t^tracker
    tracked ← true, valid(B_t) ← true
  end if
  if tracked then
    for R_t in D_t^detector ∪ D_t^recognizer do
      if Overlapping(R_t, B_t^tracker) < 0.5 & C(R_t) > C_tracker then
        confidencedetections++
      end if
    end for
    if confidencedetections == 1 then
      reinitialization ← true
    end if
  else
    for R_t in D_t^detector ∪ D_t^recognizer do
      if C(R_t) ≥ C_detect then
        B_t^detector ← R_t
        C_detect ← C(R_t)
      end if
    end for
    if C_detect ≥ θ_detector then
      B_t ← B_t^detector
      valid(B_t) ← true
    end if
  end if
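Algorithm 1 translates fairly directly into code. The following Python sketch is one possible reading of it, not the authors' implementation. The `confidence` callable stands in for the on-line trained classifier C(·), overlap is computed as intersection over union, and the default `theta_detector` of 0.65 is an assumption because the paper gives values only for θ_plus and θ_minus.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class FusionResult:
    box: Optional[Box]   # B_t, or None when no candidate is accepted
    valid: bool          # valid(B_t)
    reinitialize: bool   # reinitialization flag from Algorithm 1


def overlap(a: Box, b: Box) -> float:
    """Intersection over union, used here as the Overlapping(.,.) measure."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def fuse(tracker_boxes: List[Box], detection_boxes: List[Box],
         confidence: Callable[[Box], float], prev_valid: bool,
         theta_plus: float = 0.65, theta_minus: float = 0.55,
         theta_detector: float = 0.65) -> FusionResult:
    # Pick the most confident candidate from Mean-shift and Median Flow.
    best_tracker, c_tracker = None, 0.0
    for r in tracker_boxes:
        c = confidence(r)
        if c >= c_tracker:
            best_tracker, c_tracker = r, c

    # Hysteresis: accept above theta_plus, or above theta_minus if the
    # previous frame was already valid.
    tracked = best_tracker is not None and (
        c_tracker >= theta_plus or (prev_valid and c_tracker > theta_minus))

    if tracked:
        # Count detections that disagree with the tracker but score higher.
        stronger = [r for r in detection_boxes
                    if overlap(r, best_tracker) < 0.5 and confidence(r) > c_tracker]
        return FusionResult(best_tracker, True, len(stronger) == 1)

    # Tracker side failed: fall back to the best Detector/Recognizer result.
    best_detection, c_detect = None, 0.0
    for r in detection_boxes:
        c = confidence(r)
        if c >= c_detect:
            best_detection, c_detect = r, c
    if best_detection is not None and c_detect >= theta_detector:
        return FusionResult(best_detection, True, False)
    return FusionResult(None, False, False)
```

A caller would pass the Mean-shift and Median Flow boxes as `tracker_boxes` and the Detector and Recognition boxes as `detection_boxes`, then trigger P-N learning only when the returned `valid` flag is true. The two thresholds form a hysteresis band, so a track that briefly dips below 0.65 keeps learning as long as it stays above 0.55.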
Fig. 10. Second Test Without Detector
Fig. 11. Test With Face Detector
Fig. 12. Test With Human Detector
Fig. 13. Tracking result on ETH data set

4 Conclusion and future work

This paper introduced a novel object tracking framework and analyzed its feasibility in terms of generality and scalability. The proposed framework calculates the tracking result of each component separately and then selects the best one in the Fusion module. At the same time, the framework learns the object's model at runtime. With the proposed framework, the original TLD framework can be extended automatically to a multi-target version; this is done by adding the external Detector and Mean-shift components. Test results show that the proposed framework is robust to a certain extent, even in outdoor environments. For very fast-moving objects and deformable objects, however, the algorithm is not robust enough, especially outdoors. This problem remains the main task for future work.

References

1. B. Babenko, M.-H. Yang, and S. Belongie. Robust object tracking with online multiple instance learning.
2. J.-Y. Bouguet. Pyramidal implementation of the Lucas Kanade feature tracker: description of the algorithm.
3. M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool. Online multiperson tracking-by-detection from a single, uncalibrated camera. IEEE Trans. Pattern Anal. Mach. Intell., 33(9), Sept.
4. D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell., 24(5), May.
5. N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In C. Schmid, S. Soatto, and C. Tomasi, editors, International Conference on Computer Vision & Pattern Recognition, volume 2, INRIA Rhône-Alpes, ZIRST-655, av. de l'Europe, Montbonnot-38334, June.
6. M. Enzweiler and D. Gavrila. Monocular pedestrian detection: Survey and experiments. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 31(12), Dec.
7. H. Grabner and H. Bischof. On-line boosting and vision. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 1, June.
8. M. Isard and A. Blake. Condensation - conditional density propagation for visual tracking. International Journal of Computer Vision, 29:5-28.
9. Z. Kalal, J. Matas, and K. Mikolajczyk. P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints. Conference on Computer Vision and Pattern Recognition.
10. Z. Kalal, K. Mikolajczyk, and J. Matas. Forward-backward error: Automatic detection of tracking failures. In Pattern Recognition (ICPR), International Conference on, Aug.
11. Z. Kalal, K. Mikolajczyk, and J. Matas. Tracking-learning-detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 34(7), July.
12. C.-H. Kuo and R. Nevatia. How does person identity recognition help multi-person tracking? In CVPR.
13. M. Oezuysal, P. Fua, and V. Lepetit. Fast keypoint recognition in ten lines of code. In Proc. IEEE Conference on Computing Vision and Pattern Recognition.
14. K. Ogata. Modern Control Engineering. Tsinghua University Press.
15. A. Saffari, M. Godec, T. Pock, C. Leistner, and H. Bischof. Online multi-class LPBoost. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, June.
16. A. Saffari, C. Leistner, J. Santner, M. Godec, and H. Bischof. On-line random forests. In Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on, Oct.
17. T. Ojala, M. Pietikäinen, and D. Harwood. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In Proceedings of the 12th IAPR International Conference on Pattern Recognition (ICPR 1994), volume 1.
18. P. Viola and M. J. Jones. Robust real-time face detection. Int. J. Comput. Vision, 57(2), May.
19. G. Welch and G. Bishop. An introduction to the Kalman filter. Technical report, Chapel Hill, NC, USA.
20. J. Wu, C. Geyer, and J. M. Rehg. Real-time human detection using contour cues. In ICRA. IEEE.
21. J. Wu and J. M. Rehg. Centrist: A visual descriptor for scene categorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33.
03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio
More informationEfficient Detector Adaptation for Object Detection in a Video
2013 IEEE Conference on Computer Vision and Pattern Recognition Efficient Detector Adaptation for Object Detection in a Video Pramod Sharma and Ram Nevatia Institute for Robotics and Intelligent Systems,
More informationPEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE
PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE Hongyu Liang, Jinchen Wu, and Kaiqi Huang National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science
More informationCategory vs. instance recognition
Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building
More informationWindow based detectors
Window based detectors CS 554 Computer Vision Pinar Duygulu Bilkent University (Source: James Hays, Brown) Today Window-based generic object detection basic pipeline boosting classifiers face detection
More informationTraffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers
Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane
More information2 Cascade detection and tracking
3rd International Conference on Multimedia Technology(ICMT 213) A fast on-line boosting tracking algorithm based on cascade filter of multi-features HU Song, SUN Shui-Fa* 1, MA Xian-Bing, QIN Yin-Shi,
More informationHuman-Robot Interaction
Human-Robot Interaction Elective in Artificial Intelligence Lecture 6 Visual Perception Luca Iocchi DIAG, Sapienza University of Rome, Italy With contributions from D. D. Bloisi and A. Youssef Visual Perception
More informationPixel-Pair Features Selection for Vehicle Tracking
2013 Second IAPR Asian Conference on Pattern Recognition Pixel-Pair Features Selection for Vehicle Tracking Zhibin Zhang, Xuezhen Li, Takio Kurita Graduate School of Engineering Hiroshima University Higashihiroshima,
More informationA Novel Target Algorithm based on TLD Combining with SLBP
Available online at www.ijpe-online.com Vol. 13, No. 4, July 2017, pp. 458-468 DOI: 10.23940/ijpe.17.04.p13.458468 A Novel Target Algorithm based on TLD Combining with SLBP Jitao Zhang a, Aili Wang a,
More informationFace Detection and Alignment. Prof. Xin Yang HUST
Face Detection and Alignment Prof. Xin Yang HUST Many slides adapted from P. Viola Face detection Face detection Basic idea: slide a window across image and evaluate a face model at every location Challenges
More informationEfficient Visual Object Tracking with Online Nearest Neighbor Classifier
Efficient Visual Object Tracking with Online Nearest Neighbor Classifier Steve Gu and Ying Zheng and Carlo Tomasi Department of Computer Science, Duke University Abstract. A tracking-by-detection framework
More informationFast Human Detection Using a Cascade of Histograms of Oriented Gradients
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Fast Human Detection Using a Cascade of Histograms of Oriented Gradients Qiang Zhu, Shai Avidan, Mei-Chen Yeh, Kwang-Ting Cheng TR26-68 June
More informationDeformable Part Models
CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones
More informationFace and Nose Detection in Digital Images using Local Binary Patterns
Face and Nose Detection in Digital Images using Local Binary Patterns Stanko Kružić Post-graduate student University of Split, Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture
More informationMulti-camera Pedestrian Tracking using Group Structure
Multi-camera Pedestrian Tracking using Group Structure Zhixing Jin Center for Research in Intelligent Systems University of California, Riverside 900 University Ave, Riverside, CA 92507 jinz@cs.ucr.edu
More informationProgress Report of Final Year Project
Progress Report of Final Year Project Project Title: Design and implement a face-tracking engine for video William O Grady 08339937 Electronic and Computer Engineering, College of Engineering and Informatics,
More informationHUMAN POSTURE DETECTION WITH THE HELP OF LINEAR SVM AND HOG FEATURE ON GPU
International Journal of Computer Engineering and Applications, Volume IX, Issue VII, July 2015 HUMAN POSTURE DETECTION WITH THE HELP OF LINEAR SVM AND HOG FEATURE ON GPU Vaibhav P. Janbandhu 1, Sanjay
More informationMULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION
MULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION Panca Mudjirahardjo, Rahmadwati, Nanang Sulistiyanto and R. Arief Setyawan Department of Electrical Engineering, Faculty of
More informationHuman Detection and Tracking for Video Surveillance: A Cognitive Science Approach
Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach Vandit Gajjar gajjar.vandit.381@ldce.ac.in Ayesha Gurnani gurnani.ayesha.52@ldce.ac.in Yash Khandhediya khandhediya.yash.364@ldce.ac.in
More informationSingle Object Tracking with TLD, Convolutional Networks, and AdaBoost
Single Object Tracking with TLD, Convolutional Networks, and AdaBoost Albert Haque Fahim Dalvi Computer Science Department, Stanford University {ahaque,fdalvi}@cs.stanford.edu Abstract We evaluate the
More informationSelection of Scale-Invariant Parts for Object Class Recognition
Selection of Scale-Invariant Parts for Object Class Recognition Gy. Dorkó and C. Schmid INRIA Rhône-Alpes, GRAVIR-CNRS 655, av. de l Europe, 3833 Montbonnot, France fdorko,schmidg@inrialpes.fr Abstract
More informationMulti-Person Tracking-by-Detection based on Calibrated Multi-Camera Systems
Multi-Person Tracking-by-Detection based on Calibrated Multi-Camera Systems Xiaoyan Jiang, Erik Rodner, and Joachim Denzler Computer Vision Group Jena Friedrich Schiller University of Jena {xiaoyan.jiang,erik.rodner,joachim.denzler}@uni-jena.de
More informationHistograms of Oriented Gradients for Human Detection p. 1/1
Histograms of Oriented Gradients for Human Detection p. 1/1 Histograms of Oriented Gradients for Human Detection Navneet Dalal and Bill Triggs INRIA Rhône-Alpes Grenoble, France Funding: acemedia, LAVA,
More informationThe most cited papers in Computer Vision
COMPUTER VISION, PUBLICATION The most cited papers in Computer Vision In Computer Vision, Paper Talk on February 10, 2012 at 11:10 pm by gooly (Li Yang Ku) Although it s not always the case that a paper
More informationParticle Filtering. CS6240 Multimedia Analysis. Leow Wee Kheng. Department of Computer Science School of Computing National University of Singapore
Particle Filtering CS6240 Multimedia Analysis Leow Wee Kheng Department of Computer Science School of Computing National University of Singapore (CS6240) Particle Filtering 1 / 28 Introduction Introduction
More informationAutomatic Parameter Adaptation for Multi-Object Tracking
Automatic Parameter Adaptation for Multi-Object Tracking Duc Phu CHAU, Monique THONNAT, and François BREMOND {Duc-Phu.Chau, Monique.Thonnat, Francois.Bremond}@inria.fr STARS team, INRIA Sophia Antipolis,
More informationA Cascade of Feed-Forward Classifiers for Fast Pedestrian Detection
A Cascade of eed-orward Classifiers for ast Pedestrian Detection Yu-ing Chen,2 and Chu-Song Chen,3 Institute of Information Science, Academia Sinica, aipei, aiwan 2 Dept. of Computer Science and Information
More informationHuman Object Classification in Daubechies Complex Wavelet Domain
Human Object Classification in Daubechies Complex Wavelet Domain Manish Khare 1, Rajneesh Kumar Srivastava 1, Ashish Khare 1(&), Nguyen Thanh Binh 2, and Tran Anh Dien 2 1 Image Processing and Computer
More informationREAL-TIME, LONG-TERM HAND TRACKING WITH UNSUPERVISED INITIALIZATION
REAL-TIME, LONG-TERM HAND TRACKING WITH UNSUPERVISED INITIALIZATION Vincent Spruyt 1,2, Alessandro Ledda 1 and Wilfried Philips 2 1 Dept. of Applied Engineering: Electronics-ICT, Artesis University College
More informationReliable Human Detection and Tracking in Top-View Depth Images
2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops Reliable Human Detection and Tracking in Top-View Depth Images Michael Rauter Austrian Institute of Technology Donau-City-Strasse
More informationRecap Image Classification with Bags of Local Features
Recap Image Classification with Bags of Local Features Bag of Feature models were the state of the art for image classification for a decade BoF may still be the state of the art for instance retrieval
More informationEnsemble of Bayesian Filters for Loop Closure Detection
Ensemble of Bayesian Filters for Loop Closure Detection Mohammad Omar Salameh, Azizi Abdullah, Shahnorbanun Sahran Pattern Recognition Research Group Center for Artificial Intelligence Faculty of Information
More informationOnline Object Tracking with Proposal Selection. Yang Hua Karteek Alahari Cordelia Schmid Inria Grenoble Rhône-Alpes, France
Online Object Tracking with Proposal Selection Yang Hua Karteek Alahari Cordelia Schmid Inria Grenoble Rhône-Alpes, France Outline 2 Background Our approach Experimental results Summary Background 3 Tracking-by-detection
More informationFeature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking
Feature descriptors Alain Pagani Prof. Didier Stricker Computer Vision: Object and People Tracking 1 Overview Previous lectures: Feature extraction Today: Gradiant/edge Points (Kanade-Tomasi + Harris)
More informationPostprint.
http://www.diva-portal.org Postprint This is the accepted version of a paper presented at 14th International Conference of the Biometrics Special Interest Group, BIOSIG, Darmstadt, Germany, 9-11 September,
More informationTracking-Learning-Detection
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 6, NO., JANUARY 200 Tracking-Learning-Detection Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas, Abstract Long-term tracking is the
More informationFast Bounding Box Estimation based Face Detection
Fast Bounding Box Estimation based Face Detection Bala Subburaman Venkatesh 1,2 and Sébastien Marcel 1 1 Idiap Research Institute, 1920, Martigny, Switzerland 2 École Polytechnique Fédérale de Lausanne
More informationFuzzy based Multiple Dictionary Bag of Words for Image Classification
Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image
More informationImproving Object Tracking by Adapting Detectors
Improving Object Tracking by Adapting Detectors Lu Zhang Laurens van der Maaten Vision Lab, Delft University of Technology, The Netherlands Email: {lu.zhang, l.j.p.vandermaaten}@tudelft.nl Abstract The
More informationDistance-Based Descriptors and Their Application in the Task of Object Detection
Distance-Based Descriptors and Their Application in the Task of Object Detection Radovan Fusek (B) and Eduard Sojka Department of Computer Science, Technical University of Ostrava, FEECS, 17. Listopadu
More informationFace Tracking in Video
Face Tracking in Video Hamidreza Khazaei and Pegah Tootoonchi Afshar Stanford University 350 Serra Mall Stanford, CA 94305, USA I. INTRODUCTION Object tracking is a hot area of research, and has many practical
More informationHuman detections using Beagle board-xm
Human detections using Beagle board-xm CHANDAN KUMAR 1 V. AJAY KUMAR 2 R. MURALI 3 1 (M. TECH STUDENT, EMBEDDED SYSTEMS, DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING, VIJAYA KRISHNA INSTITUTE
More informationPerson Detection in Images using HoG + Gentleboost. Rahul Rajan June 1st July 15th CMU Q Robotics Lab
Person Detection in Images using HoG + Gentleboost Rahul Rajan June 1st July 15th CMU Q Robotics Lab 1 Introduction One of the goals of computer vision Object class detection car, animal, humans Human
More informationDetecting Pedestrians by Learning Shapelet Features
Detecting Pedestrians by Learning Shapelet Features Payam Sabzmeydani and Greg Mori School of Computing Science Simon Fraser University Burnaby, BC, Canada {psabzmey,mori}@cs.sfu.ca Abstract In this paper,
More informationContextual Combination of Appearance and Motion for Intersection Videos with Vehicles and Pedestrians
Contextual Combination of Appearance and Motion for Intersection Videos with Vehicles and Pedestrians Mohammad Shokrolah Shirazi and Brendan Morris University of Nevada, Las Vegas shirazi@unlv.nevada.edu,
More informationA Novel Extreme Point Selection Algorithm in SIFT
A Novel Extreme Point Selection Algorithm in SIFT Ding Zuchun School of Electronic and Communication, South China University of Technolog Guangzhou, China zucding@gmail.com Abstract. This paper proposes
More informationProject Report for EE7700
Project Report for EE7700 Name: Jing Chen, Shaoming Chen Student ID: 89-507-3494, 89-295-9668 Face Tracking 1. Objective of the study Given a video, this semester project aims at implementing algorithms
More informationGeneric Face Alignment Using an Improved Active Shape Model
Generic Face Alignment Using an Improved Active Shape Model Liting Wang, Xiaoqing Ding, Chi Fang Electronic Engineering Department, Tsinghua University, Beijing, China {wanglt, dxq, fangchi} @ocrserv.ee.tsinghua.edu.cn
More informationIMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES
IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES Pin-Syuan Huang, Jing-Yi Tsai, Yu-Fang Wang, and Chun-Yi Tsai Department of Computer Science and Information Engineering, National Taitung University,
More informationRelational HOG Feature with Wild-Card for Object Detection
Relational HOG Feature with Wild-Card for Object Detection Yuji Yamauchi 1, Chika Matsushima 1, Takayoshi Yamashita 2, Hironobu Fujiyoshi 1 1 Chubu University, Japan, 2 OMRON Corporation, Japan {yuu, matsu}@vision.cs.chubu.ac.jp,
More informationLecture 12 Recognition. Davide Scaramuzza
Lecture 12 Recognition Davide Scaramuzza Oral exam dates UZH January 19-20 ETH 30.01 to 9.02 2017 (schedule handled by ETH) Exam location Davide Scaramuzza s office: Andreasstrasse 15, 2.10, 8050 Zurich
More informationDefinition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos
Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Sung Chun Lee, Chang Huang, and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu,
More informationAdaptive Cell-Size HoG Based. Object Tracking with Particle Filter
Contemporary Engineering Sciences, Vol. 9, 2016, no. 11, 539-545 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2016.6439 Adaptive Cell-Size HoG Based Object Tracking with Particle Filter
More informationPreviously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011
Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition
More informationFace detection and recognition. Detection Recognition Sally
Face detection and recognition Detection Recognition Sally Face detection & recognition Viola & Jones detector Available in open CV Face recognition Eigenfaces for face recognition Metric learning identification
More informationReal Time Person Detection and Tracking by Mobile Robots using RGB-D Images
Real Time Person Detection and Tracking by Mobile Robots using RGB-D Images Duc My Vo, Lixing Jiang and Andreas Zell Abstract Detecting and tracking humans are key problems for human-robot interaction.
More informationPairwise Threshold for Gaussian Mixture Classification and its Application on Human Tracking Enhancement
Pairwise Threshold for Gaussian Mixture Classification and its Application on Human Tracking Enhancement Daegeon Kim Sung Chun Lee Institute for Robotics and Intelligent Systems University of Southern
More informationLinear combinations of simple classifiers for the PASCAL challenge
Linear combinations of simple classifiers for the PASCAL challenge Nik A. Melchior and David Lee 16 721 Advanced Perception The Robotics Institute Carnegie Mellon University Email: melchior@cmu.edu, dlee1@andrew.cmu.edu
More informationLecture 12 Recognition
Institute of Informatics Institute of Neuroinformatics Lecture 12 Recognition Davide Scaramuzza 1 Lab exercise today replaced by Deep Learning Tutorial Room ETH HG E 1.1 from 13:15 to 15:00 Optional lab
More informationAdaptive Visual Face Tracking for an Autonomous Robot
Adaptive Visual Face Tracking for an Autonomous Robot Herke van Hoof ab Tijn van der Zant b Marco Wiering b a TU Darmstadt, FB Informatik, Hochschulstraße 10, D-64289 Darmstadt b University of Groningen,
More informationHuman Motion Detection and Tracking for Video Surveillance
Human Motion Detection and Tracking for Video Surveillance Prithviraj Banerjee and Somnath Sengupta Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur,
More informationImage Classification based on Saliency Driven Nonlinear Diffusion and Multi-scale Information Fusion Ms. Swapna R. Kharche 1, Prof.B.K.
Image Classification based on Saliency Driven Nonlinear Diffusion and Multi-scale Information Fusion Ms. Swapna R. Kharche 1, Prof.B.K.Chaudhari 2 1M.E. student, Department of Computer Engg, VBKCOE, Malkapur
More informationFeature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1
Feature Detection Raul Queiroz Feitosa 3/30/2017 Feature Detection 1 Objetive This chapter discusses the correspondence problem and presents approaches to solve it. 3/30/2017 Feature Detection 2 Outline
More informationSpecular 3D Object Tracking by View Generative Learning
Specular 3D Object Tracking by View Generative Learning Yukiko Shinozuka, Francois de Sorbier and Hideo Saito Keio University 3-14-1 Hiyoshi, Kohoku-ku 223-8522 Yokohama, Japan shinozuka@hvrl.ics.keio.ac.jp
More informationFace detection in a video sequence - a temporal approach
Face detection in a video sequence - a temporal approach K. Mikolajczyk R. Choudhury C. Schmid INRIA Rhône-Alpes GRAVIR-CNRS, 655 av. de l Europe, 38330 Montbonnot, France {Krystian.Mikolajczyk,Ragini.Choudhury,Cordelia.Schmid}@inrialpes.fr
More informationBeyond Bags of features Spatial information & Shape models
Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features
More informationA Comparison of SIFT, PCA-SIFT and SURF
A Comparison of SIFT, PCA-SIFT and SURF Luo Juan Computer Graphics Lab, Chonbuk National University, Jeonju 561-756, South Korea qiuhehappy@hotmail.com Oubong Gwun Computer Graphics Lab, Chonbuk National
More informationVideo Processing for Judicial Applications
Video Processing for Judicial Applications Konstantinos Avgerinakis, Alexia Briassouli, Ioannis Kompatsiaris Informatics and Telematics Institute, Centre for Research and Technology, Hellas Thessaloniki,
More informationBeyond Bags of Features
: for Recognizing Natural Scene Categories Matching and Modeling Seminar Instructed by Prof. Haim J. Wolfson School of Computer Science Tel Aviv University December 9 th, 2015
More information