Robust visual tracking based on Informative random fern


5th International Conference on Computer Sciences and Automation Engineering (ICCSAE 2015)

Hao Dong 1,a, Rui Wang 1,b
1 School of Instrumentation Science and Opto-electronics Engineering, Beihang University, Beijing 100191, China
a qanxn_dh@163.com, b wangr@buaa.edu.cn

Keywords: Visual tracking, IRF-TLD, Gaussian projection, Real time

Abstract. In this paper, we propose a novel visual tracking algorithm named Informative Random Fern Tracking-Learning-Detection (IRF-TLD). Instead of the binary comparison used in the standard random fern of TLD, we use a real-valued feature and Gaussian random projection to obtain high accuracy and a low memory requirement. Experimental results on challenging sequences demonstrate the superior performance of IRF-TLD compared with several state-of-the-art tracking algorithms.

1. Introduction

Visual tracking is one of the most important problems in computer vision. It is the basis for many applications such as surveillance, human-computer interaction and action recognition. Many methods have been proposed for visual tracking over the past few decades. Generally speaking, most trackers can be divided into two categories: generative models and discriminative models. Generative models [1] are typically formulated as a search for the image region most similar to the target, with minimal reconstruction error. Because they consider only the appearance of the object, generative models often fail in cluttered backgrounds. For discriminative models [2], tracking is treated as a binary classification task that finds the decision boundary between the target and the background. Compared with generative models, discriminative models are usually more resistant to cluttered backgrounds, since they explicitly sample image patches from the background as negative examples to train the classifier. Kalal et al. [3] proposed a novel approach called Tracking-Learning-Detection (TLD), in which tracking and detection are independent processes that exchange information via learning.
The random fern [4] classifier is an important component of the cascade detector in TLD and shows excellent performance. However, it has some potential problems. First, the comparison of each pixel pair produces only two outputs, 0 or 1, which loses much information. In addition, the random fern classifier in TLD requires enormous memory, which grows exponentially with the number of pixel pairs in a fern. To address these issues, we extend TLD with an informative random fern, which produces a real-valued feature for each fern based on subtraction and Gaussian projection.

The rest of this paper is organized as follows. Section 2 introduces the random fern. The proposed method, IRF-TLD, is presented in Section 3. Section 4 shows the experimental results, followed by the conclusion in Section 5.

© 2016. The authors - Published by Atlantis Press

2. Preliminaries

In a random fern, simple intensity comparisons between pixel pairs are chosen as the binary features. Let $f_i$, $i = 1, \ldots, N$, denote the binary features extracted from the image patch to be classified. The class $\hat{c}$ assigned to this patch is given by

$$\hat{c} = \arg\max_{c \in C} \; p(c \mid f_1, f_2, \ldots, f_N) \tag{1}$$

Here, $C$ is the set of all classes. Using Bayes' formula, the posterior can be written as

$$p(c \mid f_1, f_2, \ldots, f_N) = \frac{p(f_1, f_2, \ldots, f_N \mid c)\, p(c)}{p(f_1, f_2, \ldots, f_N)} \tag{2}$$

We take the denominator as a constant and assume that the prior $p(c)$ is uniform; then (1) is equivalent to

$$\hat{c} = \arg\max_{c \in C} \; p(f_1, f_2, \ldots, f_N \mid c) \tag{3}$$

Ozuysal et al. [4] proposed to divide the features into several groups and assumed that different groups are independent of each other. Formally,

$$p(f_1, f_2, \ldots, f_N \mid c) = \prod_{i=1}^{N/S} p(F_i \mid c) \tag{4}$$

where $S$ is the number of pixel pairs in each fern, $F_i$ is a group of features, named a fern, and $T = N/S$ is the total number of ferns. In practice, $S$ cannot be too small, so the memory occupation is enormous.

3. IRF-TLD algorithm

3.1 IRF-TLD framework

Our whole IRF-TLD tracking approach is summarized in Fig. 1. It inherits the framework of TLD, which decomposes the long-term tracking task into tracking, detection and learning. The target is followed by a tracker from frame to frame, and its motion is estimated using the Lucas-Kanade tracker extended with failure detection. The task of learning is to initialize the cascade detector in the first frame and update it at run time using the P-N experts. In the original TLD, the cascade detector, which is responsible for selecting the most probable target candidate in each frame, consists of three stages:

(i) patch variance: this stage rejects patches whose gray-value variance is smaller than 50 percent of the variance of the target patch;
(ii) random fern: this stage performs a number of pixel comparisons on a patch, resulting in a binary code that indexes an array of posteriors;
(iii) nearest neighbor: this last stage classifies each candidate patch as target or background by appearance, using the Normalized Correlation Coefficient.

In our IRF-TLD, the tracker and the learning method of TLD are adopted, while the cascade detector is improved. Instead of the binary comparison in the original random fern of TLD, we introduce the informative random fern classifier to improve the robustness of the detector. Our method uses the more informative real value obtained from the subtraction.
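The standard random fern stage described above can be sketched as follows. This is a minimal illustration, not the TLD implementation: the names `fern_code` and `BinaryFern` are hypothetical, a patch is assumed to be a nested list of gray values, and the posterior is assumed to be the fraction of positive samples observed for a code. It shows why the table of each fern has 2^S entries per class.

```python
def fern_code(patch, pairs):
    """Concatenate S binary intensity comparisons into one integer in [0, 2**S)."""
    code = 0
    for (x1, y1), (x2, y2) in pairs:
        bit = 1 if patch[y1][x1] > patch[y2][x2] else 0
        code = (code << 1) | bit
    return code


class BinaryFern:
    """One fern: positive and negative sample counts for each of the 2**S codes."""

    def __init__(self, pairs):
        self.pairs = pairs
        size = 2 ** len(pairs)  # table size grows exponentially with S
        self.pos = [0] * size
        self.neg = [0] * size

    def update(self, patch, is_target):
        """Count one labeled training patch under its binary code."""
        code = fern_code(patch, self.pairs)
        if is_target:
            self.pos[code] += 1
        else:
            self.neg[code] += 1

    def posterior(self, patch):
        """Assumed posterior: share of positives among samples with this code."""
        code = fern_code(patch, self.pairs)
        total = self.pos[code] + self.neg[code]
        return self.pos[code] / total if total else 0.0
```

With S pixel pairs, each fern keeps two tables of 2^S counters, which is the exponential memory growth noted in the text.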
Moreover, a random projection is used to map the value of each fern, derived from the feature values, to a parametric distribution, specifically a Gaussian distribution, in which the classification is done. In the following, the proposed IRF classifier is described in detail in three steps: feature formation, classification with probability, and online update.

Fig. 1 Framework of the IRF-TLD tracking algorithm

3.2 Feature Formation

We adopt the real-valued feature from [5], i.e., the real-valued feature $f_{i,j}$ described in Eq. (5) is extracted from pixel pair $j$ of fern $i$:

$$f_{i,j} = I(d_1(i,j)) - I(d_2(i,j)) \tag{5}$$

where $I(d)$ represents the intensity of an image patch $I$ at position $d$, and $d_1(i,j)$ and $d_2(i,j)$ denote the coordinates of the randomly generated pixel pair $j$ of fern $i$. Obviously, the real-valued feature preserves more information about the intensity difference between two pixels, because $f_{i,j} \in \mathbb{R}$ instead of $f_{i,j} \in \{0, 1\}$.

Since the feature $f_{i,j}$ is a real value, it is necessary to encode all real values in each fern into a single real value to simplify the subsequent classification. A theoretical basis for this idea is the Johnson-Lindenstrauss (JL) lemma [6]: with high probability, the distances between points in a high-dimensional space are preserved if they are projected onto a randomly selected low-dimensional subspace. Moreover, the literature [6] also showed that for k-sparse data (e.g., image and audio signals), a random matrix satisfying the JL lemma, such as a Gaussian matrix, also satisfies the restricted isometry property in compressive sensing. Therefore, in this paper we use a Gaussian matrix to project the feature values of the different pixel pairs efficiently into a single real value. Formally,

$$F_i = \sum_{j=1}^{S} r_j f_{i,j} \tag{6}$$

where $r_j \sim N(0, 1)$ is a real value generated randomly according to a Gaussian distribution.

Comparing the proposed IRF with the standard random fern method, the following analysis shows that IRF requires a constant and much lower amount of memory. Assume that the number of classes is $\gamma = 2$ (foreground and background) and that the real-valued feature is stored in single precision, which occupies 4 bytes. Then the memory requirement is $MEM_{Our} = \gamma \times 4$. In the standard random fern method used in TLD, a specific binary code is stored as an integer, which occupies 4 bytes, so the memory requirement is $MEM_{TLD} = \gamma \times 2^{S} \times 4$. It is clear that the standard random fern method in TLD needs $2^{S}$ times more memory than the proposed IRF method.

3.3 Classification with probability

In IRF, the output $F_i$ of each fern is a single real value obtained by Gaussian random projection. For simplicity, we model the probability $p(F_i \mid c)$ as a Gaussian distribution with parameters $(\mu_i^c, \sigma_i^c)$ for fern $i$ of class $c$.
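The feature formation of Eqs. (5) and (6), the Gaussian model of a fern value, and the memory count of Section 3.2 can be sketched as follows. This is a hedged illustration rather than the authors' C++ code: all function names are hypothetical, a patch is assumed to be a nested list of gray values, and `bytes_per_fern` reads the memory formulas as a per-fern count, which is an assumption.

```python
import math
import random


def make_weights(num_pairs, rng):
    """Draw one projection weight r_j ~ N(0, 1) per pixel pair of a fern."""
    return [rng.gauss(0.0, 1.0) for _ in range(num_pairs)]


def fern_value(patch, pairs, weights):
    """Eqs. (5)-(6): F_i = sum_j r_j * (I(d1(i,j)) - I(d2(i,j)))."""
    return sum(r * (patch[y1][x1] - patch[y2][x2])
               for r, ((x1, y1), (x2, y2)) in zip(weights, pairs))


def gaussian_log_pdf(value, mu, sigma):
    """Log density of N(mu, sigma^2), used here as the model of p(F_i | c)."""
    var = sigma * sigma
    return -0.5 * math.log(2.0 * math.pi * var) - (value - mu) ** 2 / (2.0 * var)


def bytes_per_fern(num_pairs, num_classes=2, bytes_per_entry=4):
    """Memory count following the text: (IRF, standard fern), assumed per fern."""
    irf = num_classes * bytes_per_entry
    standard = num_classes * (2 ** num_pairs) * bytes_per_entry
    return irf, standard
```

For S = 4, `bytes_per_fern(4)` gives 8 bytes against 128 bytes, i.e. the factor of 2^S stated above. The weights would be drawn once per fern, for example with `make_weights(4, random.Random(0))`, and then kept fixed.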
The discriminative function is then

$$H(F) = \log \frac{\prod_{i=1}^{T} p(F_i \mid c = 1)\, p(c = 1)}{\prod_{i=1}^{T} p(F_i \mid c = 0)\, p(c = 0)} = \sum_{i=1}^{T} \log p(F_i \mid c = 1) - \sum_{i=1}^{T} \log p(F_i \mid c = 0) \tag{7}$$

where we assume a uniform prior $p(c = 1) = p(c = 0)$, $c \in \{0, 1\}$ is a binary variable that represents the sample label, and $F = \{F_1, F_2, \ldots, F_T\}$ is the set containing the values of all ferns for an image patch. The IRF classifies the patch as the target if the corresponding value $H(F)$ is larger than zero.

3.4 Online Update

To integrate our IRF feature, in which the value of each fern is modeled as a Gaussian distribution with parameters $(\mu_i^c, \sigma_i^c)$, into the target model, we simplify the update of the classifier to a parameter update:

$$\mu_i^c \leftarrow \lambda \mu_i^c + (1 - \lambda)\, \mu_i^{c,new}$$
$$\sigma_i^c \leftarrow \sqrt{\lambda (\sigma_i^c)^2 + (1 - \lambda)(\sigma_i^{c,new})^2 + \lambda (1 - \lambda)(\mu_i^c - \mu_i^{c,new})^2} \tag{8}$$

where $\lambda$ is the learning rate, and $\mu_i^{c,new} = E[F_i \mid c]$ and $(\sigma_i^{c,new})^2 = E[(F_i \mid c)^2] - (E[F_i \mid c])^2$ are estimated from the training samples generated by the P-N experts at the current frame.

4. Experiments

Our tracker is implemented in C++ and runs at 25 frames per second on an Intel Dual-Core 3.30GHz CPU with 4G RAM. Three state-of-the-art algorithms, TLD [3], OAB [7] and CT [2], are compared with IRF-TLD on 6 fully annotated video sequences to validate its performance. All of these algorithms are evaluated with the one-pass evaluation (OPE) [8]. The sequences, the corresponding ground truth files and the compared code library are available on the website http://visual-tracking.net. In all the experiments, the total number of ferns is set to $T = 50$, the number of pixel pairs in a fern is $S = 4$, and the learning rate $\lambda$ is 0.85.

4.1 Evaluation Metric

We use precision plots and success plots [8] to evaluate the robustness of the tracking algorithms quantitatively. The precision plot shows the percentage of frames whose estimated center location is within a given threshold distance of the ground truth. To compare the performance of different algorithms, the score at a threshold of 20 pixels is used as the representative precision score. The success plot is based on the overlap ratio $OS = \frac{\mathrm{Area}(b_t \cap b_a)}{\mathrm{Area}(b_t \cup b_a)}$, where $b_t$ is the tracked target box and $b_a$ denotes the ground truth box. The success plot shows the ratio of frames with $OS > t_0$ for all thresholds $t_0 \in [0, 1]$. The area under the curve (AUC) of each success plot serves as the first measure to rank the tracking algorithms in the following.

4.2 Result and Analysis

The overall performance of the 4 tracking algorithms, based on success plots and precision plots, is illustrated in Fig. 2. According to the experimental results, our algorithm performs very well in terms of both overlap and center location error. In the success plot, it achieves an AUC score of 0.58 and ranks 1st, outperforming TLD by 6.8%. Meanwhile, the overall precision of our IRF-TLD, at 79.3%, is also the highest among all algorithms, beating TLD by .%.

Fig. 2 The overall performance of the 4 tracking algorithms on all video sequences

To further analyze the performance of IRF-TLD, the AUC scores and precision scores for each sequence are shown in Table 1. Some sampled results are illustrated in Fig. 3.
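The evaluation metrics of Section 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark toolkit: boxes are assumed to be `(x, y, width, height)` tuples, all function names are hypothetical, and the AUC is approximated by averaging the success rate over evenly spaced thresholds.

```python
import math


def overlap_score(box_t, box_a):
    """OS = Area(b_t intersect b_a) / Area(b_t union b_a) for (x, y, w, h) boxes."""
    xt, yt, wt, ht = box_t
    xa, ya, wa, ha = box_a
    inter_w = max(0.0, min(xt + wt, xa + wa) - max(xt, xa))
    inter_h = max(0.0, min(yt + ht, ya + ha) - max(yt, ya))
    inter = inter_w * inter_h
    union = wt * ht + wa * ha - inter
    return inter / union if union > 0 else 0.0


def center_error(box_t, box_a):
    """Euclidean distance between the centers of the two boxes."""
    cx_t, cy_t = box_t[0] + box_t[2] / 2.0, box_t[1] + box_t[3] / 2.0
    cx_a, cy_a = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    return math.hypot(cx_t - cx_a, cy_t - cy_a)


def precision_score(tracked, truth, threshold=20.0):
    """Fraction of frames whose center error is within the threshold (pixels)."""
    hits = sum(center_error(t, a) <= threshold for t, a in zip(tracked, truth))
    return hits / len(truth)


def success_rate(tracked, truth, t0):
    """Fraction of frames with overlap score larger than t0."""
    hits = sum(overlap_score(t, a) > t0 for t, a in zip(tracked, truth))
    return hits / len(truth)


def success_auc(tracked, truth, num_thresholds=21):
    """Approximate AUC: mean success rate over thresholds spread across [0, 1]."""
    thresholds = [k / (num_thresholds - 1) for k in range(num_thresholds)]
    return sum(success_rate(tracked, truth, t0) for t0 in thresholds) / num_thresholds
```

For example, a tracked box `(5, 0, 10, 10)` against a ground truth box `(0, 0, 10, 10)` has an overlap score of 1/3 and a center error of 5 pixels, so it counts as a hit at the 20-pixel precision threshold but not at an overlap threshold of 0.5.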
From Table 1, we can observe that IRF-TLD performs best on 4 out of 6 sequences (the italic fonts indicate the best performance). Note that IRF-TLD achieves favorable results even though these videos contain many challenging factors. For instance, the sequences faceocc2, carscale and sylvester have the attributes of scale variation and in-plane or out-of-plane rotation, and faceocc2 and carscale also have the occlusion attribute, which makes them far more challenging. Nevertheless, IRF-TLD performs consistently well from beginning to end.

Table 1 The AUC/Precision scores on each sequence

Sequence    Ours          TLD           OAB           CT
Faceocc2    0.69/0.792    0.6/0.856     0.593/0.708   0.602/0.68
Sylvester   0.676/0.946   0.666/0.949   0.557/0.74    0.659/0.90
Carscale    0.575/0.78    0.452/0.853   0.398/0.663   0.433/0.78
Skating     0.366/0.495   0.90/0.38     0.394/0.688   0.086/0.090
Doll        0.56/0.98     0.566/0.983   0.533/0.874   0.455/0.684
Deer        0.690/0.95    0.590/0.732   0.640/0.958   0.039/0.042

IRF-TLD inherits the original tracking framework of TLD; its superiority over TLD mainly lies in the cascade detector, whose performance is further improved by the IRF classifier. In addition, IRF-TLD produces a real value for each fern based on subtraction and Gaussian projection, which is more informative than the binary feature used in TLD. Hence, preserving the diversity of the real-valued features enables IRF-TLD to perform well in the presence of drastic appearance changes.

Fig. 3 Screenshots of some sampled tracking results on (a) faceocc2, (b) sylvester, (c) doll, (d) skating, (e) carscale and (f) deer. Legend: Ours, TLD, OAB, CT.

5. Conclusion

In this paper, we have proposed a novel tracking method based on TLD and the informative random fern. The proposed method has the advantages of high accuracy and a low memory requirement, and is therefore well suited to embedded systems. Experimental results show that it performs better than several other methods on most video sequences.

Acknowledgement

This work was partially supported by National Natural Science Foundation of China (6097408).

References

[1] Ross, D.A., Lim, J., Lin, R.S., Yang, M.H. Incremental Learning for Robust Visual Tracking[J]. International Journal of Computer Vision, 2008, 77(1-3): 125-141.
[2] Zhang, K., Zhang, L., Yang, M.-H. Real-time compressive tracking[C]. European Conference on Computer Vision. Italy, 2012, pp. 864-877.
[3] Kalal, Z., Mikolajczyk, K., Matas, J. Tracking-learning-detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422.
[4] Ozuysal, M., Fua, P., Lepetit, V. Fast keypoint recognition in ten lines of code[C]. IEEE Conference on Computer Vision and Pattern Recognition. USA, 2007, pp. 1-8.
[5] Zhang, J., Liu, K., Cheng, F., Li, Y. Visual tracking with randomly projected ferns[J]. Signal Processing: Image Communication, 2014, 29(9): 987-997.
[6] Achlioptas, D. Database-friendly random projections: Johnson-Lindenstrauss with binary coins[J]. Journal of Computer and System Sciences, 2003, 66(4): 671-687.
[7] Grabner, H., Grabner, M., Bischof, H. Real-Time Tracking via On-line Boosting[C]. British Machine Vision Conference. UK, 2006, pp. 47-56.
[8] Wu, Y., Lim, J., Yang, M.-H. Online object tracking: A benchmark[C]. IEEE Conference on Computer Vision and Pattern Recognition. USA, 2013, pp. 2411-2418.