Robust Mean Shift Tracking with Corrected Background-Weighted Histogram


Jifeng Ning, Lei Zhang, David Zhang and Chengke Wu

Abstract: The background-weighted histogram (BWH) algorithm proposed in [2] attempts to reduce the interference of the background in target localization in mean shift tracking. However, in this paper we prove that the weights assigned to pixels in the target candidate region by BWH are proportional to those without background information, i.e. BWH does not introduce any new information, because the mean shift iteration formula is invariant to the scale transformation of weights. We then propose a corrected BWH (CBWH) formula by transforming only the target model but not the target candidate model. The CBWH scheme can effectively reduce the background's interference in target localization. The experimental results show that CBWH can lead to faster convergence and more accurate localization than the usual target representation in mean shift tracking. Even if the target is not well initialized, the proposed algorithm can still robustly track the object, which is hard to achieve by the conventional target representation.

Keywords: Object Tracking, Mean Shift, Background Information, Target Initialization

Corresponding author: Lei Zhang is with the Biometrics Research Center, Dept. of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China. Email: cslzhang@comp.polyu.edu.hk. This work is supported by the Hong Kong Polytechnic University Internal Research Grant (A-SA08) and the National Science Foundation Council of China under Grants 6053060 and 6077500.

Jifeng Ning is with the Biometrics Research Center, Dept. of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China, and the State Key Laboratory of Integrated Service Networks, Xidian University, Xi'an, China. Email: jf_ning@sina.com.

David Zhang is with the Biometrics Research Center, Dept. of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China. Email: csdzhang@comp.polyu.edu.hk.

Chengke Wu is with the State Key Laboratory of Integrated Service Networks, Xidian University, Xi'an, China.
Email: ckwu@xidian.edu.cn.

1. Introduction

Object tracking is an important task in computer vision. Many algorithms [11] have been proposed to solve the various problems arising from noise, clutter and occlusion in the appearance model of the target to be tracked. Among the various object tracking methods, the mean shift tracking algorithm [1, 2, 4] is a popular one due to its simplicity and efficiency. Mean shift is a nonparametric density estimator which iteratively computes the nearest mode of a sample distribution [5]. After it was introduced to the field of computer vision [6], mean shift has been adopted to solve various problems, such as image filtering, segmentation [3, 13, 15, 18-19] and object tracking [1, 2, 8-10, 12, 14, 16, 17].

In the mean shift tracking algorithm, the color histogram is used to represent the target because of its robustness to scaling, rotation and partial occlusion [1, 2, 7]. However, the mean shift algorithm is prone to local minima when some of the target features are also present in the background. Therefore, in [2], Comaniciu et al. further proposed the background-weighted histogram (BWH) to decrease background interference in target representation. The strategy of BWH is to derive a simple representation of the background features and use it to select the salient components from the target model and target candidate model. More specifically, BWH attempts to decrease the probability of prominent background features in the target model and candidate model and thus reduce the background's interference in target localization. Such an idea is reasonable and intuitive, and several works have followed it [20-22]. In [20], the object is partitioned into a number of fragments and then the target model of each fragment is enhanced by using BWH. Different from the original BWH transformation, the weights of background features are derived from the differences between the fragment and background colors. In [21], the target is represented by combining BWH and adaptive kernel density estimation, which extends the search range of the mean shift algorithm. In addition, Allen et al. [22] proposed a parallel implementation of the mean shift algorithm with adaptive scale and BWH, and demonstrated the efficiency of their technique on a SIMD computer.

All the above BWH-based methods aim to decrease the distraction of the background in target localization so as to enhance mean shift tracking. Unfortunately, none of them notices that the BWH transformation formula proposed in [2] is actually incorrect, which will be proved in this paper. In this paper we demonstrate that the BWH algorithm simultaneously decreases the probability of prominent background features in the target model and the target candidate model. Thus BWH is equivalent to a scale transformation of the weights obtained by the usual target representation method in the target candidate region. Meanwhile, the mean shift iteration formula is invariant to the scale transformation of weights. Therefore, mean shift tracking with BWH in [2, 20-22] is exactly the same as mean shift tracking with the usual target representation.

Based on the mean shift iteration formula, the key to effectively exploiting the background information is to decrease the weights of prominent background features. To this end, we propose to transform only the target model but not the target candidate model. A new formula for computing the pixel weights in the target candidate region is then derived. The proposed corrected background-weighted histogram (CBWH) can truly achieve what the original BWH method wants: reduce the interference of the background in target localization. An important advantage of the proposed CBWH method is that it can work robustly even if the target model contains much background information. Thus it greatly reduces the sensitivity of mean shift tracking to target initialization. In the experiments, we can see that even when the initial target is not well selected, the proposed CBWH algorithm can still correctly track the object, which is hard to achieve by the usual target representation.

The rest of the paper is organized as follows. Section 2 briefly introduces the mean shift algorithm and the BWH method. Section 3 proves that the BWH method is equivalent to the

conventional mean shift tracking method, and then the CBWH algorithm is presented. Section 4 presents experiments to test the proposed CBWH method. Section 5 concludes the paper.

2. Mean Shift Tracking and Background-Weighted Histogram

2.1 Target Representation

In object tracking, a target is usually defined as a rectangular or ellipsoidal region in the frame, and a color histogram is used to represent it. Denote by \{x_i^*\}_{i=1..n} the normalized pixel positions in the target region, which has n pixels. The probability of a feature u, which is actually one of the m color histogram bins, in the target model \hat{q} = \{\hat{q}_u\}_{u=1..m} is computed as [1, 2]

    \hat{q}_u = C \sum_{i=1}^{n} k(\|x_i^*\|^2)\, \delta[b(x_i^*) - u]    (1)

where \hat{q} is the target model, \hat{q}_u is the probability of the u-th element of \hat{q}, \delta is the Kronecker delta function, b(x_i^*) associates the pixel x_i^* with its histogram bin, k(x) is an isotropic kernel profile, and the constant C is C = 1 / \sum_{i=1}^{n} k(\|x_i^*\|^2).

Similarly, the probability of the feature u = 1, 2, ..., m in the target candidate model computed from the target candidate region centered at position y is given by

    \hat{p}_u(y) = C_h \sum_{i=1}^{n_h} k\!\left(\left\|\tfrac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u]    (2)

where \hat{p}(y) = \{\hat{p}_u(y)\}_{u=1..m} is the target candidate model, \hat{p}_u(y) is the probability of the u-th element of \hat{p}(y), \{x_i\}_{i=1..n_h} are the pixels in the target candidate region centered at y, h is the bandwidth, and C_h is the normalization constant C_h = 1 / \sum_{i=1}^{n_h} k(\|(y - x_i)/h\|^2).
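Both Eq. (1) and Eq. (2) are kernel-weighted color histograms; they differ only in how the pixel coordinates are normalized. The computation can be sketched as follows (a minimal NumPy sketch: the function names, the patch-centred coordinate normalization, and the 16-bins-per-channel RGB quantization taken from Section 4 are illustrative assumptions, not the authors' code):

```python
import numpy as np

def epanechnikov_profile(x):
    # Profile of the Epanechnikov kernel: k(x) = 1 - x for 0 <= x <= 1, else 0.
    return np.where(x <= 1.0, 1.0 - x, 0.0)

def color_histogram_model(patch, n_bins=16):
    """Kernel-weighted RGB histogram q_hat of Eq. (1) for a uint8 H x W x 3 patch.

    Pixel coordinates are normalized so the patch center is the origin and the
    kernel support covers the patch; the histogram has n_bins**3 entries.
    """
    h, w, _ = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # normalized squared distance ||x_i*||^2 from the patch center
    r2 = (((ys - (h - 1) / 2) / (h / 2)) ** 2 +
          ((xs - (w - 1) / 2) / (w / 2)) ** 2)
    k = epanechnikov_profile(r2).ravel()
    # b(x_i*): map each RGB pixel to one of n_bins**3 histogram bins
    q = (patch.reshape(-1, 3) // (256 // n_bins)).astype(np.int64)
    bins = (q[:, 0] * n_bins + q[:, 1]) * n_bins + q[:, 2]
    hist = np.bincount(bins, weights=k, minlength=n_bins ** 3)
    return hist / hist.sum()  # the normalization constant C
```

The target candidate model \hat{p}(y) of Eq. (2) can be obtained by applying the same function to the candidate patch centred at y.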

2.2 Mean Shift Tracking Algorithm

A key issue in the mean shift tracking algorithm is the computation of the offset from the current location y_0 to a new location y_1 according to the mean shift iteration equation

    y_1 = \frac{\sum_{i=1}^{n_h} x_i w_i\, g(\|(y_0 - x_i)/h\|^2)}{\sum_{i=1}^{n_h} w_i\, g(\|(y_0 - x_i)/h\|^2)}    (3)

    w_i = \sum_{u=1}^{m} \sqrt{\frac{\hat{q}_u}{\hat{p}_u(y_0)}}\, \delta[b(x_i) - u]    (4)

where g(x) is the shadow of the kernel profile k(x): g(x) = -k'(x). For convenience of expression, we denote g_i = g(\|(y_0 - x_i)/h\|^2). Thus Eq. (3) can be rewritten as

    y_1 = \frac{\sum_{i=1}^{n_h} x_i w_i g_i}{\sum_{i=1}^{n_h} w_i g_i}    (5)

With Eq. (5), the mean shift tracking algorithm can find the region most similar to the target object in the new frame.

2.3 Background-Weighted Histogram (BWH)

In target tracking, background information is often included in the detected target region. If the correlation between target and background is high, the localization accuracy of the object will be decreased. To reduce the interference of salient background features in target localization, a representation model of background features was proposed by Comaniciu et al. [2] to select discriminative features from the target region and the target candidate region.

In [2], the background is represented as \{\hat{o}_u\}_{u=1..m} (with \sum_{u=1}^{m} \hat{o}_u = 1) and it is calculated from the area surrounding the target. The background region is three times the size of the target, as suggested in [2]. Denote by \hat{o}^* the minimal non-zero value in \{\hat{o}_u\}_{u=1..m}. The coefficients

    v_u = \min\!\left(\frac{\hat{o}^*}{\hat{o}_u},\, 1\right), \quad u = 1, ..., m    (6)

are used to define a transformation between the representations of the target model and the target candidate model. The transformation reduces the weights of those features with low v_u, i.e. the salient features in the background. The new target model is

    \hat{q}_u' = C' v_u \sum_{i=1}^{n} k(\|x_i^*\|^2)\, \delta[b(x_i^*) - u]    (7)

with the normalization constant C' = 1 / \sum_{i=1}^{n} k(\|x_i^*\|^2) \sum_{u=1}^{m} v_u \delta[b(x_i^*) - u]. The new target candidate model is

    \hat{p}_u'(y) = C_h' v_u \sum_{i=1}^{n_h} k\!\left(\left\|\tfrac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u]    (8)

where C_h' = 1 / \sum_{i=1}^{n_h} k(\|(y - x_i)/h\|^2) \sum_{u=1}^{m} v_u \delta[b(x_i) - u].

The above BWH transformation aims to reduce the effects of prominent background features in the target candidate region on target localization. In the next section, however, we will prove that BWH cannot achieve this goal because it is equivalent to the usual target representation under the mean shift tracking framework.

3. The Corrected Background-Weighted Histogram Scheme

3.1 The Equivalence of the BWH Representation to the Usual Representation

By the mean shift iteration formula (5), in the target candidate region the weights of points (referring to Eq. (4)) determine the convergence of the tracking algorithm. Only when the

weights of prominent features in the background are decreased can the relevance of background information for target localization be reduced. Let us analyze the weight changes caused by the BWH transformation. Denote by w_i' the weight of point x_i computed by BWH in the target candidate region. It can be derived from Eq. (4) that

    w_i' = \sum_{u=1}^{m} \sqrt{\frac{\hat{q}_u'}{\hat{p}_u'(y)}}\, \delta[b(x_i) - u]    (9)

Let u_i be the bin index in the feature space which corresponds to point x_i in the candidate region. We have \delta[b(x_i) - u_i] = 1, so Eq. (9) can be simplified as

    w_i' = \sqrt{\frac{\hat{q}_{u_i}'}{\hat{p}_{u_i}'(y)}}    (10)

Substituting Eqs. (7) and (8) into Eq. (10), there is

    w_i' = \sqrt{\frac{C' v_{u_i} \sum_{j=1}^{n} k(\|x_j^*\|^2)\, \delta[b(x_j^*) - u_i]}{C_h' v_{u_i} \sum_{j=1}^{n_h} k(\|(y - x_j)/h\|^2)\, \delta[b(x_j) - u_i]}}

By removing the common factor v_{u_i} from the numerator and denominator and substituting the normalization factors C and C_h into the above equation, we have

    w_i' = \sqrt{\frac{C' C_h}{C C_h'}} \sqrt{\frac{\hat{q}_{u_i}}{\hat{p}_{u_i}(y)}} = \sqrt{\frac{C' C_h}{C C_h'}}\, w_i    (11)

where w_i, calculated by Eq. (4), is the weight of the point in the target candidate region using the usual representation of the target model and target candidate model. Eq. (11) shows that w_i' is proportional to w_i. Moreover, by combining the mean shift iteration Eq. (5), we have

    y_1 = \frac{\sum_{i=1}^{n_h} x_i w_i' g_i}{\sum_{i=1}^{n_h} w_i' g_i} = \frac{\sqrt{C' C_h / (C C_h')} \sum_{i=1}^{n_h} x_i w_i g_i}{\sqrt{C' C_h / (C C_h')} \sum_{i=1}^{n_h} w_i g_i} = \frac{\sum_{i=1}^{n_h} x_i w_i g_i}{\sum_{i=1}^{n_h} w_i g_i}    (12)

Eq. (12) shows that the mean shift iteration formula is invariant to the scale transformation of weights. Therefore, BWH actually does not enhance mean shift tracking by transforming the representations of the target model and target candidate model. Its result is exactly the same as that without using BWH.

3.2 The Corrected Background-Weighted Histogram (CBWH) Algorithm

Although the idea of BWH is good, we saw in Section 3.1 that the BWH algorithm does not improve target localization. To truly achieve what BWH is meant to achieve, here we propose a new transformation method, namely the corrected BWH (CBWH) algorithm. In CBWH, Eq. (6) is employed to transform only the target model but not the target candidate model. That is to say, we reduce the prominent background features only in the target model, not in the target candidate model. We define a new weight formula

    w_i'' = \sqrt{\frac{\hat{q}_{u_i}'}{\hat{p}_{u_i}(y)}}    (13)

Note that the denominator in the above equation is different from that in Eq. (10). Similar to the derivation in Section 3.1, we can easily obtain

    w_i'' = \sqrt{\frac{C'}{C}}\, \sqrt{v_{u_i}}\, w_i    (14)

Since \sqrt{C'/C} is a constant scaling factor, it has no influence on the mean shift tracking process. We can omit it and simplify Eq. (14) as

    w_i'' = \sqrt{v_{u_i}}\, w_i    (15)

Eq. (15) clearly reflects the relationship between the weight calculated by using the usual target representation (i.e. w_i) and the weight calculated by exploiting the background

information (i.e. w_i''). If the color of point x_i is prominent in the background region, the corresponding value of v_{u_i} is small. Hence by Eq. (15) this point's weight is decreased and its relevance for target localization is reduced. This then speeds up mean shift's convergence towards the salient features of the target. Note that if we do not use the background information, v_{u_i} will be 1 and w_i'' will degrade to w_i, the usual target representation.

Fig. 1 plots the non-zero weights of the features in the first iteration of frame 2 of the benchmark ping-pang ball sequence (please refer to Section 4). The weights w_i, w_i' and w_i'' are calculated respectively by using the three target representation methods, i.e. the original representation, BWH and CBWH. Fig. 1 clearly shows that w_i' is proportional to w_i with a constant ratio (w_i'/w_i = 0.599). Therefore, the representation of the target model and target candidate model using BWH is the same as the usual representation without using background features, because the mean shift iteration is invariant to scale transformations. Meanwhile, w_i'' is different from w_i and w_i': some w_i'', e.g. of bins 7 and 4, are enhanced while others are weakened. In summary, BWH does not introduce any new information into mean shift tracking, while CBWH truly exploits the background features and can introduce new information for tracking.

3.3 Background Model Updating in CBWH

In BWH and the proposed CBWH, a background color model \{\hat{o}_u\}_{u=1..m} is employed and initialized at the beginning of tracking. However, during tracking the background will often change due to variations of illumination, viewpoint, occlusion, scene content, etc. If the original background color model is still used without updating, the tracking accuracy may be reduced because the current background may be very different from the previous background model. Therefore, it is necessary to dynamically update the background model for

a robust CBWH tracking performance. Here we propose a simple background model updating method. First, the background features \{\hat{o}_u'\}_{u=1..m} and \{v_u'\}_{u=1..m} in the current frame are calculated. Then the Bhattacharyya similarity between \{\hat{o}_u'\}_{u=1..m} and the old background model \{\hat{o}_u\}_{u=1..m} is computed by

    \rho = \sum_{u=1}^{m} \sqrt{\hat{o}_u \hat{o}_u'}    (16)

If \rho is smaller than a threshold, this implies that there are considerable changes in the background, and we then update \{\hat{o}_u\}_{u=1..m} by \{\hat{o}_u'\}_{u=1..m} and \{v_u\}_{u=1..m} by \{v_u'\}_{u=1..m}. The transformed target model \{\hat{q}_u'\}_{u=1..m} is then recomputed by Eq. (7) using \{v_u\}_{u=1..m}. Otherwise, we do not update the background model.

The proposed CBWH-based mean shift tracking algorithm can be summarized as follows.

1) Calculate the target model \hat{q} by Eq. (1) and the background-weighted histogram \{\hat{o}_u\}_{u=1..m}, and then compute \{v_u\}_{u=1..m} by Eq. (6) and the transformed target model \hat{q}' by Eq. (7). Initialize the position y_0 of the target candidate region in the previous frame.

2) Let k = 0.

3) Calculate the target candidate model \hat{p}(y_0) using Eq. (2) in the current frame.

4) Calculate the weights \{w_i''\}_{i=1..n_h} according to Eq. (13).

5) Calculate the new position y_1 of the target candidate region using Eq. (5).

6) Let d = \|y_1 - y_0\|, y_0 = y_1, k = k + 1. Set the error threshold \varepsilon_1 (default value: 0.1), the maximum iteration number N, and the background model update threshold \varepsilon_2 (default value: 0.5). If d < \varepsilon_1 or k >= N: calculate \{\hat{o}_u'\}_{u=1..m} and \{v_u'\}_{u=1..m} based on the tracking result of the current frame; if \rho computed by Eq. (16) is smaller than \varepsilon_2, then \{\hat{o}_u\}_{u=1..m} = \{\hat{o}_u'\}_{u=1..m} and \{v_u\}_{u=1..m} = \{v_u'\}_{u=1..m}, and \{\hat{q}_u'\}_{u=1..m} is updated by Eq. (7); stop the iteration and go to step 1) for the next frame. Otherwise, go to step 3).

4. Experimental Results and Discussions

Several representative video sequences are used to evaluate the proposed method in comparison with the original BWH-based mean shift tracking, which is actually equivalent to mean shift tracking with the usual target representation. The two algorithms were implemented under the programming environment of MATLAB 7.0. In all the experiments, the RGB color model was used as the feature space and it was quantized into 16x16x16 bins. Any eligible kernel function k(x), such as the commonly used Epanechnikov kernel or Gaussian kernel, can be used. Our experiments have shown that the two kernels lead to almost the same tracking results. Here we selected the Epanechnikov kernel as recommended in [2], so that g(x) = -k'(x) = 1.

To better illustrate the proposed method, in the experiments on the first three sequences we did not update the background feature model in CBWH because there are no obvious background changes, while for the last sequence we updated adaptively the background

feature model because there are many background changes such as scene content, illumination and viewpoint variations. Table 1 and Table 2 list respectively the average numbers of iterations and the target localization accuracies of the two methods on the four video sequences. The MATLAB code and all the experimental results of this paper can be found at http://www.comp.polyu.edu.hk/~cslzhang/cbwh.htm.

The first experiment is on the benchmark ping-pang ball sequence, which was used in [2] to evaluate BWH. This sequence has 52 frames of spatial resolution 352x240. The target is the ball, which moves quickly. Referring to Figure 2, in frame 1 we initialized the target model with a region of size 7x3 (inner blue rectangle), which includes many background elements. The background model was then initialized as a region of size 53x6 (external red rectangle excluding the target region), approximately three times the target area. The tracking results in Figure 2 and the statistics in Table 2 show that the proposed CBWH model (mean error: .94; standard deviation: .44) achieves more accurate localization than the original BWH model (mean error: .0; standard deviation: 0.64), because the former truly exploits the background information in target localization. Figure 3 illustrates the numbers of iterations of the two methods. The average number of iterations is 3.04 for CBWH and 8.4 for BWH, so the CBWH method requires less computation: the salient features of the target model are enhanced while the background features are suppressed in CBWH, so that the mean shift algorithm can locate the target more accurately.

The second video is a soccer sequence. In this sequence, the color of the sport shirt (green) of the target player is very similar to that of the lawn, and thus some target features are also present in the background. Experimental results in Figure 4 show that BWH loses the object very quickly, while the proposed CBWH successfully tracks the player over the whole sequence.

The third experiment is on the benchmark sequence of the table tennis player.
(To calculate the target localization accuracy, we manually labeled the target in each frame as ground-truth.) The target to be tracked is the head of the player. We use this sequence to test the robustness of the proposed CBWH to inaccurate target initialization. Referring to Figure 5, in the first frame the initial target region (inner blue rectangle) was deliberately set so that it occupies only a small part of the player's head but much background. The initial target model is severely inaccurate and contains much background information. Figure 6 compares the Bhattacharyya similarities between the tracking result and its surrounding background region for BWH and CBWH. We see that the Bhattacharyya similarity of CBWH is smaller than that of BWH, which implies that CBWH can better separate the target from the background. Regarding the target localization accuracy, the proposed CBWH-based method has a mean error of 3.89 and a standard deviation of 4.56, which are much better than those of the BWH-based method, whose mean error and standard deviation are 5.4 and 5.70 respectively. Because CBWH reduces the impact of features shared by the target and background and enhances the prominent features in the target model, it significantly decreases the relevance of the background for target localization. The experiment in Figure 5 suggests that the proposed CBWH method is a good candidate for many real tracking systems, where the initial targets are often detected with about 60% background information inside them. In Figure 7, we show the tracking results on this sequence for another inaccurate initialization. The same conclusion can be drawn.

The last experiment is on a face sequence with obvious changes of background content, illumination and viewpoint. Usually, the background features \{\hat{o}_u\}_{u=1..m} are defined by the first frame. However, due to the evolution of the video scene, the background features will change, and thus \{\hat{o}_u\}_{u=1..m} should be dynamically updated for better performance. Figure 8 shows the tracking results respectively by BWH, CBWH without background update and CBWH with background update.
Obviously, CBWH with background update locates the target much more accurately than the other two methods, while BWH performs the worst.
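To make the CBWH procedure of Section 3.3 concrete, a minimal NumPy sketch is given below. It is an illustration under stated assumptions, not the authors' MATLAB implementation: the function names are invented, `cbwh_transform` applies Eqs. (6)-(7) to a histogram target model, and `mean_shift_step` performs one iteration of Eq. (5) with the CBWH weights of Eq. (13). With the Epanechnikov kernel, g(x) = 1, so the step reduces to a weighted centroid of the candidate-region pixel positions.

```python
import numpy as np

def cbwh_transform(q_hat, o_hat):
    """Eqs. (6)-(7): v_u = min(o_min / o_u, 1), with o_min the smallest non-zero
    background bin; features absent from the background keep v_u = 1."""
    v = np.ones_like(q_hat)
    nz = o_hat > 0
    if nz.any():
        v[nz] = np.minimum(o_hat[nz].min() / o_hat[nz], 1.0)
    q_prime = v * q_hat
    return q_prime / q_prime.sum(), v  # renormalized transformed target model

def mean_shift_step(positions, bin_idx, q_prime, p_hat):
    """One iteration of Eq. (5) with CBWH weights w''_i = sqrt(q'_{u_i} / p_{u_i})
    (Eq. (13)); with the Epanechnikov kernel g = 1, so the new centre is a
    weighted mean of the candidate-region pixel positions."""
    w = np.sqrt(q_prime[bin_idx] / np.maximum(p_hat[bin_idx], 1e-12))
    return (positions * w[:, None]).sum(axis=0) / w.sum()
```

Repeating `mean_shift_step` until the displacement falls below the error threshold reproduces steps 3)-6) of the algorithm; the similarity \rho of Eq. (16) can then be checked to decide whether to refresh the background model.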

The complexity of CBWH is basically the same as that of the original mean shift tracking, except for transforming the target model with the background-weighted histogram. Because the proposed CBWH focuses on tracking the salient features which are different from the background, its average number of iterations is much less than that of the original BWH. Meanwhile, Table 2 also shows that the proposed CBWH locates the target more reliably and more accurately than BWH: it achieves a much smaller mean error and standard deviation.

5. Conclusions

In this paper, we proved that the background-weighted histogram (BWH) representation in [2] is equivalent to the usual target representation, so that no new information can be introduced to improve the mean shift tracking performance. We then proposed a corrected BWH (CBWH) method to reduce the relevance of background information and improve target localization. The proposed CBWH algorithm only transforms the histogram of the target model, decreasing the probability of target model features that are prominent in the background. The CBWH truly achieves what the BWH wants. The experimental results validated that CBWH can not only reduce the number of mean shift iterations but also improve the tracking accuracy. One of its important advantages is that it reduces the sensitivity of mean shift tracking to the target initialization, so that CBWH can robustly track the target even if it is not well initialized.

References

[1] Comaniciu D., Ramesh V., and Meer P.: Real-Time Tracking of Non-Rigid Objects Using Mean Shift. Proc. IEEE Conf. Computer Vision and Pattern Recognition, Hilton Head, SC, USA, June 2000, pp. 142-149.

[2] Comaniciu D., Ramesh V., and Meer P.: Kernel-Based Object Tracking, IEEE Trans. Pattern Anal. Machine Intell., 2003, 25, (5), pp. 564-577.

[3] Comaniciu D., and Meer P.: Mean Shift: a Robust Approach toward Feature Space Analysis, IEEE Trans. Pattern Anal. Machine Intell., 2002, 24, (5), pp. 603-619.

[4] Bradski G.: Computer Vision Face Tracking for Use in a Perceptual User Interface, Intel Technology Journal, 1998, (Q2).

[5] Fukunaga K. and Hostetler L. D.: The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition, IEEE Trans. on Information Theory, 1975, 21, (1), pp. 32-40.

[6] Cheng Y.: Mean Shift, Mode Seeking, and Clustering, IEEE Trans. Pattern Anal. Machine Intell., 1995, 17, (8), pp. 790-799.

[7] Nummiaro K., Koller-Meier E. and Van Gool L.: An Adaptive Color-Based Particle Filter, Image and Vision Computing, 2003, 21, (1), pp. 99-110.

[8] Collins R.: Mean-Shift Blob Tracking through Scale Space. Proc. IEEE Conf. Computer Vision and Pattern Recognition, Wisconsin, USA, June 2003, pp. 234-240.

[9] Zivkovic Z., and Kröse B.: An EM-like Algorithm for Color-Histogram-Based Object Tracking. Proc. IEEE Conf. Computer Vision and Pattern Recognition, Washington, DC, USA, July 2004, Volume I, pp. 798-803.

[10] Yang C., Duraiswami R., and Davis L.: Efficient Mean-Shift Tracking via a New Similarity Measure. Proc. IEEE Conf. Computer Vision and Pattern Recognition, San Diego, CA, June 2005, Volume I, pp. 176-183.

[11] Yilmaz A., Javed O., and Shah M.: Object Tracking: a Survey, ACM Computing Surveys, 2006, 38, (4), Article 13.

[12] Yilmaz A.: Object Tracking by Asymmetric Kernel Mean Shift with Automatic Scale and Orientation Selection. Proc. IEEE Conf. Computer Vision and Pattern Recognition, Minnesota, USA, June 2007, Volume I, pp. 1-6.

[13] Wang J., Thiesson B., Xu Y. and Cohen M. F.: Image and Video Segmentation by Anisotropic Kernel Mean Shift. Proc. European Conf. on Computer Vision, Prague, Czech Republic, May 2004, vol. 3022, pp. 238-249.

[14] Hu J., Juan C., and Wang J.: A spatial-color mean-shift object tracking algorithm with scale and orientation estimation, Pattern Recognition Letters, 2008, 29, (16), pp. 2165-2173.

[15] Paris S., and Durand F.: A Topological Approach to Hierarchical Segmentation using Mean Shift. Proc. IEEE Conf. Computer Vision and Pattern Recognition, Minnesota, USA, June 2007, pp. 1-8.

[16] Collins R. T., Liu Y., and Leordeanu M.: Online Selection of Discriminative Tracking Features, IEEE Trans. Pattern Anal. Machine Intell., 2005, 27, (10), pp. 1631-1643.

[17] Tu J., Tao H., and Huang T.: Online updating appearance generative mixture model for mean-shift tracking, Machine Vision and Applications, 2009, 20, (3), pp. 163-173.

[18] Luo Q., and Khoshgoftaar T. M.: Efficient Image Segmentation by Mean Shift Clustering and MDL-Guided Region Merging. Proc. IEEE International Conference on Tools with Artificial Intelligence, Florida, USA, November 2004, pp. 337-343.

[19] Park J., Lee G., and Park S.: Color image segmentation using adaptive mean shift and statistical model-based methods, Computers & Mathematics with Applications, 2009, 57, (6), pp. 970-980.

[20] Jeyakar J., Babu R., and Ramakrishnan K. R.: Robust object tracking with background-weighted local kernels, Computer Vision and Image Understanding, 2009, 113, (3), pp. 296-309.

[21] Li L., and Feng Z.: An efficient object tracking method based on adaptive nonparametric approach, Opto-Electronics Review, 2005, 13, (4), pp. 325-330.

[22] Allen J., Xu R., and Jin J.: Mean Shift Object Tracking for a SIMD Computer. Proc. International Conference on Information Technology and Applications, Sydney, Australia, July 2005, Volume I, pp. 692-697.

List of Tables and Figures

Table 1. The average number of iterations by the two methods on the four sequences.
Table 2. The target localization accuracies (mean error and standard deviation).
Fig. 1: Weights of the features in the first mean shift iteration of frame (the ping-pang ball sequence) using the original representation, BWH and CBWH.
Fig. 2: Mean shift tracking results on the ping-pang ball sequence. Frames, 0, 5 and 5 are displayed.
Fig. 3: Number of iterations on the ping-pang ball sequence.
Fig. 4: Mean shift tracking results on the soccer sequence. Frames, 5, 75 and 5 are displayed.
Fig. 5: Mean shift tracking results on the table tennis player sequence with inaccurate initialization. Frames, 0, 30, and 58 are displayed.
Fig. 6: Bhattacharyya coefficients between the tracking result and its surrounding background region for the BWH and CBWH methods on the table tennis player sequence.
Fig. 7: Mean shift tracking results on the table tennis player sequence with another inaccurate initialization. Frames, 0, 30, and 58 are displayed.
Fig. 8: Mean shift tracking results of the face sequence with the proposed CBWH target representation methods. Frames 00, 5, 30 and 448 are displayed.

Table. The average number of teratons by the two methods on the four sequences. Methods Png-pang ball Table tenns Soccer sequence sequence player sequence Face sequence BWH 8.4 3.57 4.5 4.6 CBWH 3.74 3. 3.46 3.9 8

Table. The target localzaton accuraces (mean error and standard devaton). BWH CBWH Sequence Standard Standard Mean error Mean error devaton devaton Png-pang ball.0 0.64.94.44 Soccer 5. 56.0 4.6 7.65 Table tenns player 5.4 5.70 3.89 4.56 Face 7.83 0.04 3.65 5.93 9

[Figure: per-bin feature weights (y-axis: weights; x-axis: the member of feature space) for the original representation, the BWH-based representation and the CBWH-based representation]

Fig. 1: Weights of the features in the first mean shift iteration of frame (the ping-pang ball sequence) using the original representation, BWH and CBWH.
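Fig. 1 contrasts the per-bin weights produced by the three target representations. Both BWH and CBWH derive the weights from the normalized background histogram {o_u}: each bin gets v_u = min(o*/o_u, 1), where o* is the smallest non-zero background bin; CBWH differs from BWH in applying these weights only to the target model. A minimal sketch under those assumptions (function names are illustrative, not from the paper):

```python
import numpy as np

def background_weights(o_hat):
    """Per-bin weights v_u = min(o*/o_u, 1), where o* is the smallest
    non-zero entry of the normalized background histogram {o_u}."""
    o_hat = np.asarray(o_hat, dtype=float)
    nonzero = o_hat[o_hat > 0]
    if nonzero.size == 0:               # no background information: keep all bins
        return np.ones_like(o_hat)
    o_star = nonzero.min()
    with np.errstate(divide="ignore"):  # bins with o_u == 0 get weight 1
        return np.where(o_hat > 0, np.minimum(o_star / o_hat, 1.0), 1.0)

def reweight_model(q, v):
    """Apply the weights to a normalized target model histogram {q_u}
    and renormalize (in CBWH this is done for the target model only)."""
    q_prime = v * np.asarray(q, dtype=float)
    return q_prime / q_prime.sum()
```

For example, a background histogram [0.5, 0.25, 0.25] yields weights [0.5, 1, 1], which suppress the first (background-dominated) bin of the target model before tracking starts.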

[Figure: (a) the BWH-based mean shift tracking; (b) the proposed CBWH-based mean shift tracking]

Fig. 2: Mean shift tracking results on the ping-pang ball sequence. Frames, 0, 5 and 5 are displayed.

[Figure: number of iterations per frame (y-axis: number of iterations; x-axis: frames) for the BWH-based and CBWH-based target representations]

Fig. 3: Number of iterations on the ping-pang ball sequence.

[Figure: (a) the BWH-based mean shift tracking; (b) the proposed CBWH-based mean shift tracking]

Fig. 4: Mean shift tracking results on the soccer sequence. Frames, 5, 75 and 5 are displayed.

[Figure: (a) the BWH-based mean shift tracking; (b) the proposed CBWH-based mean shift tracking]

Fig. 5: Mean shift tracking results on the table tennis player sequence with inaccurate initialization. Frames, 0, 30, and 58 are displayed.

[Figure: Bhattacharyya similarity between p_u and o_u per frame (y-axis: Bhattacharyya similarity; x-axis: frame index) for the BWH and CBWH methods]

Fig. 6: Bhattacharyya coefficients between the tracking result and its surrounding background region for the BWH and CBWH methods on the table tennis player sequence.
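The similarity measure plotted in Fig. 6 is the Bhattacharyya coefficient between two normalized histograms, rho(p, q) = sum_u sqrt(p_u * q_u); here a lower value means the tracked window contains less background. For reference, a minimal sketch (the function name is illustrative):

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient rho(p, q) = sum_u sqrt(p_u * q_u)
    between two normalized histograms: 1 for identical distributions,
    0 for distributions with disjoint support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)))
```

For example, identical histograms give rho = 1, while histograms concentrated in different bins give rho = 0.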

[Figure: (a) the BWH-based mean shift tracking; (b) the proposed CBWH-based mean shift tracking]

Fig. 7: Mean shift tracking results on the table tennis player sequence with another inaccurate initialization. Frames, 0, 30, and 58 are displayed.

[Figure: (a) the BWH-based mean shift tracking; (b) the proposed CBWH-based mean shift tracking without background update; (c) the proposed CBWH-based mean shift tracking with background update]

Fig. 8: Mean shift tracking results of the face sequence with the proposed CBWH target representation methods. Frames 00, 5, 30 and 448 are displayed.