Segmentation and Tracking of Multiple Humans in Crowded Environments
Tao Zhao, Member, IEEE, Ram Nevatia, Fellow, IEEE, and Bo Wu, Student Member, IEEE

T. Zhao is with Intuitive Surgical Inc, 950 Kifer Road, Sunnyvale, CA. Email: taozhao@alumni.usc.edu. R. Nevatia and B. Wu are with the Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, CA. Email: {nevatia|bowu}@usc.edu.

Abstract: Segmentation and tracking of multiple humans in crowded situations is made difficult by inter-object occlusion. We propose a model-based approach to interpret the image observations by multiple, partially occluded human hypotheses in a Bayesian framework. We define a joint image likelihood for multiple humans based on the appearance of the humans, the visibility of the body obtained by occlusion reasoning, and foreground/background separation. The optimal solution is obtained by using an efficient sampling method, data-driven Markov chain Monte Carlo (DDMCMC), which uses image observations for proposal probabilities. Knowledge of various aspects, including human shape, camera model, and image cues, is integrated in one theoretically sound framework. We present experimental results and quantitative evaluation, demonstrating that the resulting approach is effective for very challenging data.

Index Terms: Multiple Human Segmentation, Multiple Human Tracking, Markov chain Monte Carlo

I. INTRODUCTION AND MOTIVATION

Segmentation and tracking of humans in video sequences is important for a number of applications, such as visual surveillance and human-computer interaction. This has been a topic of considerable research in the recent past, and robust methods exist for tracking isolated humans or a small number of humans with only transient occlusion. However, tracking in a more crowded situation, where several people are present and exhibit persistent occlusion, remains challenging. The goal of this work is to develop a method to detect and track humans in the presence of persistent and temporarily heavy occlusion. We do not require that humans be isolated, i.e. un-occluded, when they first enter the scene. However, in order to see a person, we require that at least the head-shoulder region be visible. We assume a stationary camera so that motion can be detected by comparison with a background model. We do not require the foreground detection to be perfect, e.g. the foreground blobs may be fragmented, but we assume that there are no significant false alarms due to shadows, reflections, or other reasons. We also assume that the camera model is known and that people walk on a known ground plane.

Fig. 1(a) shows a sample frame of a crowded environment and Fig. 1(b) shows the motion blobs detected by comparison with the learned background. It is apparent that segmenting humans from such blobs is not straightforward. One blob may include multiple objects, while one object may split into multiple blobs. Blob tracking over extended periods, e.g. [20], may resolve some of these ambiguities, but such approaches are likely to fail when occlusion is persistent. Some approaches have been developed to handle occlusion, e.g. [9], but require the objects to be initialized before occlusion happens. This is usually infeasible for crowded scenes. We believe that use of a shape model is necessary to achieve individual human segmentation and tracking in crowded scenes.

Fig. 1. A sample frame, the corresponding motion blobs, and our segmentation and tracking result for a crowded situation. (a) Sample frame; (b) Motion blobs; (c) Our result.

In earlier related work [54], Zhao and Nevatia model the human body as a 3D ellipsoid, and human hypotheses are proposed based on head top detection from foreground boundary peaks. This method works reasonably well in the presence of partial occlusions if the number of people in the field of view is small. As the complexity of the scene grows, head tops cannot be obtained by simple foreground boundary analysis, and more complex shape models are needed to fit more accurately with the observed shapes. Also, joint reasoning about the collection of objects is needed, rather than the simpler one-by-one verification method in [54].
The consequence of this joint consideration is that the optimal solution has to be computed in the joint parameter space of all the objects. To track the objects in multiple frames, temporal coherence is another desired property besides accuracy of the spatial segmentation. We adapt a data-driven Markov chain Monte Carlo approach to explore this complex solution space. To improve the computational efficiency, we use direct image features from bottom-up image analysis as importance proposal probabilities to guide the moves of the Markov chain. The main features of this work include 1) a 3-dimensional part-based human body model, which enables segmentation and tracking of humans in 3D and inference of inter-object occlusion naturally; 2) a Bayesian framework which integrates segmentation and tracking based on a joint likelihood for the appearance of multiple objects; 3) design of efficient Markov chain dynamics, directed by proposal probabilities based on image cues; and 4) the incorporation of a color-based background model in a mean shift tracking step. Our method is able to successfully detect and track humans in scenes of the complexity shown in Fig. 1 with high detection and low false alarm rates; the tracking result for the frame in Fig. 1(a) is shown in Fig. 1(c) (the result includes integration of multiple frames during tracking). In the results section, we give graphical and quantitative results on a number of sequences. Parts of our system have been partially described in [53] and [55]; this paper provides a unified presentation of the methodology, additional results and discussions. This approach has been built on by other researchers, e.g. [41]. The same framework has also been successfully applied to vehicle segmentation and tracking in challenging cases [43].

The rest of the paper is organized as follows: Section II gives a brief review of the related work; Section III presents an overview of our method; Section IV describes the probabilistic modeling of the problem; Section V describes our MCMC-based solution; Section VI shows experimental results and evaluation; conclusions and discussions are given in the last section.

Digital Object Identifier /TPAMI /$ IEEE

II. RELATED WORK

We summarize related work in this section; some of it is referred to in more detail in the following sections. Due to the size of the literature in this field, it is not possible for us to provide a comprehensive survey, but we attempt to include the major trends.

The observations for human hypotheses may come from multiple cues. Many previous approaches [20], [9], [54], [37], [44], [15], [18], [40], [24], [3], [45] use motion blobs detected by comparing pixel colors in a frame to learned models of the stationary background. When the scene is not highly crowded, most parts of the humans in the scene are detected in the foreground motion blob; multiple humans may be merged into a single blob, but they can be separated by rather simple processing. For example, Haritaoglu et al.
[15] use vertical projection of the blob to help segment a big blob into multiple humans. Siebel and Maybank [40] and Zhao and Nevatia [54] detect head candidates by analyzing the foreground boundaries. Since different humans have small overlapping foreground regions, they could be segmented in a greedy way. However, the utility of these methods in crowded environments such as in Fig. 1 is likely to be limited. Some methods, e.g. [50], [31], [7], [13], detect appearance- or shape-based patterns of humans directly. [50] and [31] learn human detectors from local shape features; [7] and [13] build contour templates for pedestrians. These learning-based methods need a large number of training samples and may be sensitive to imaging viewpoint variations, as they learn 2D patterns. Besides motion and shape, face and skin color are also useful cues for human detection, but environments where these cues could be utilized are limited, usually indoor scenes where illumination is controlled and the objects are imaged with high resolution, e.g. [42], [12].

Without a specific model of objects, tracking methods are limited to blob tracking, e.g. [3]. The main advantage of model-based tracking is that it can solve the blob merge and split problems by enforcing a global shape constraint. The shape models could be either parametric, e.g. an ellipsoid as in [54], or non-parametric, e.g. the edge template as in [13]; either in 2D, e.g. [46], or in 3D, e.g. [54]. Parametric models are usually generative and of high dimensionality, while non-parametric models are usually learned from real samples. 2D models make the matching of hypotheses and image observations straightforward, while 3D models are more natural for occlusion reasoning. The choice of the model complexity depends on both the application and the video resolution. For human tracking from a mid-distance camera, we do not need to capture the detailed body articulation; a rough body model, such as the generic cylinder in [19], the ellipsoid in [54], or the multiple rectangles in [46], suffices.
When the body pose of humans is desired and the video resolution is high enough, more complex models could be used, such as the articulated models in [54] and [34].

Tracking of multiple objects requires matching of hypotheses with the observations both spatially and temporally. When objects are highly inter-occluded, their image observations are far from being independent, hence a joint likelihood for multiple objects is necessary [46], [27], [19], [35], [30], [51]. Smith et al. [41] use a pairwise Markov Random Field (MRF) to model the interaction between humans and define the joint likelihood. Rittscher et al. [36] include a hidden variable, which indicates a global mapping from the observed features to human hypotheses, in the state vector.

As the solution space is of high dimension, searching for the best interpretation by brute force is not feasible. Particle-filter-based methods, e.g. [19], [46], [30], [51], [27], become unsuitable when the dimensionality of the search space is high, as the number of samples needed usually grows exponentially with the dimension. [41], [21] use some variations of the MCMC algorithm to sample the solution space, while [45], [36] use an EM-style method. For efficiency, the candidate solutions could be generated from some image cues rather than purely at random, e.g. [36] proposes hypotheses from local silhouette features.

Information from multiple cameras with overlapping views can reduce the ambiguity of a single camera. Such methods usually assume that the object can be detected successfully from at least one viewpoint (e.g. [11]) or that many cameras are available for 3-dimensional reconstruction (e.g. [28]). The difficulty of segmenting multiple humans which overlap in images from a stereo camera is alleviated by analyzing in the 3-dimensional space where they are separable [52]. In a multi-camera context, an object can be tracked even when it is fully occluded from some of the views; however, many real environments do not permit use of multiple cameras with overlapping views. In this paper, we consider situations where video from only one camera is available.
However, our approach can utilize multiple cameras with little modification.

MCMC-based methods are receiving increasing popularity for computer vision problems due to their flexibility in optimizing an arbitrary energy function, as opposed to energy functions of a specific type as in graph cut [2] or belief propagation [49].
MCMC has been used for various applications including segmenting multiple cells [38], image parsing [48], multi-object tracking [21], estimating articulated structures [23], etc. Data-driven MCMC was proposed by [48] to utilize bottom-up image cues to speed up the sampling process.

We want to point out the differences between our approach and another independently developed work [21], which also used MCMC for multi-object tracking. [21] assumes that the objects do not overlap by applying a penalty term for overlap, while our approach explicitly uses a likelihood of appearance under occlusion. Our approach focuses on the domain of tracking humans, which is the most important subject for visual surveillance. We consider the 3-dimensional perspective effect in a typical camera setting, while the ant tracking problem described in [21] is almost a 2-dimensional problem. We utilize acquired appearance models, so each object has a different appearance, whereas the ants in [21] are assumed to have the same appearance. We developed a full set of effective bottom-up cues for human segmentation and hypothesis generation.

III. OVERVIEW

Our approach to segmenting and tracking multiple humans emphasizes the use of shape models. An overview diagram is given in Fig. 2. Based on a background model, the foreground blobs are extracted as the basic observation. By using the camera model and the assumption that objects move on a known ground plane, multiple 3D human hypotheses are projected onto the image plane and matched with the foreground blobs. Since the hypotheses are in 3D, occlusion reasoning is straightforward. In one frame, we segment the foreground blobs into multiple humans and associate the segmented humans with the existing trajectories. Then the tracks are used to propose human hypotheses in the next frame. The segmentation and tracking are integrated in a unified framework and inter-operate along time.

Fig. 2. Overview diagram of our approach.
We formulate the problem of segmentation and tracking as one of Bayesian inference to find the best interpretation given the image observations, the prior models, and the estimates from previous frame analysis (i.e. the maximum a posteriori, MAP, estimation). The state to be estimated at each frame includes the number of objects, their correspondences to the objects in the previous frame (if any), their parameters (e.g. positions), and the uncertainty of the parameters.

We define a color-based joint likelihood model which considers all the objects and the background together, and encodes both the constraint that the object should be different from the background and the constraint that the object should be similar to its correspondence. Using this likelihood model gracefully integrates segmentation and tracking, and avoids a separate, sometimes ad hoc, initialization step. Given multiple human hypotheses, inter-object occlusion reasoning is done before calculating the joint image likelihood. The occluded parts of a human should not have corresponding image observations.

The solution space contains subspaces of varying dimensions, each corresponding to a different number of objects. The state vector consists of both discrete and continuous variables. This disqualifies many optimization techniques. Therefore we use a highly general reversible jump/diffusion MCMC-based method to compute the MAP estimate. We design dynamics for the multi-object tracking problem. We also use various direct image features to make the Markov chain more efficient. Direct image features alone do not guarantee optimality because they are usually computed locally or using partial cues. Using them as proposal probabilities of the Markov chain results in an integrated top-down/bottom-up approach which has both the computational efficiency of image features and the optimality of a Bayesian formulation. A mean shift technique [5] is used as efficient diffusion for the Markov chain. The data-driven dynamics and the in-depth exploration of the solution space make the approach less sensitive to dimensionality compared to particle filters.
Our experiments show that the described approach works robustly in very challenging situations with affordable computation; some results are shown in Section VI.

IV. PROBABILISTIC MODELING

Let $\theta^{(t)}$ represent the state of the objects in the scene at time $t$; it consists of the number of objects in the scene, their 3D positions, and other parameters describing their size, shape and pose. Our goal is to estimate the state at time $t$, $\theta^{(t)}$, given the image observations $I^{(1)}, \ldots, I^{(t)}$, abbreviated as $I^{(1,\ldots,t)}$. We formulate the tracking problem as computing the maximum a posteriori (MAP) estimate $\theta^{(t)*}$:

$$\theta^{(t)*} = \arg\max_{\theta^{(t)} \in \Theta} P\left(\theta^{(t)} \mid I^{(1,\ldots,t)}\right) = \arg\max_{\theta^{(t)} \in \Theta} \left\{ P\left(I^{(t)} \mid \theta^{(t)}\right) P\left(\theta^{(t)} \mid I^{(1,\ldots,t-1)}\right) \right\} \tag{1}$$

where $\Theta$ is the solution space. Denote by $m$ the state vector of one individual object. A state containing $n$ objects can be written as $\theta = \{(k_1, m_1), \ldots, (k_n, m_n)\} \in \Theta_n$, where $k_i$ is the unique identity of the $i$-th object whose parameters are $m_i$, and $\Theta_n$ is the solution space of exactly $n$ objects. The entire solution space is $\Theta = \bigcup_{n=0}^{N_{max}} \Theta_n$, where $N_{max}$ is an upper bound on the number of objects. In practice, we compute an approximation of $P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})$ (details are given later in Section IV-D).

A. 3D Human Shape Model

The parameters of an individual human, $m$, are defined based on a 3D human shape model. The human body is highly articulated; however, in our case, the human motion is mostly limited to standing or walking, and we do not attempt to
capture the detailed shape and articulation parameters of the human body. Thus we use a number of low-dimensional models to capture the gross shape of human bodies.

Fig. 3. A number of 3D human models to capture the gross shape of human bodies.

Ellipsoids fit human body parts well and have the property that their projection is an ellipse with a convenient form [16]. Therefore we model human shape by a composition of multiple ellipsoids corresponding to the head, the torso and the legs, with a fixed spatial relationship. A few such models at characteristic poses are sufficient to capture the gross shape variations of most humans in the scene for mid-resolution images. We use the multi-ellipsoid model to control the model complexity while maintaining a reasonable level of fidelity. We have used three such models (1 for legs close to each other and 2 for legs well split) in our previous work on multi-human segmentation [53]. However, in this work we use only a single model with three ellipsoids, which we found sufficient for tracking.

The model is controlled by two parameters called size and thickness. The size parameter is the 3D height of the model; it also controls the overall scaling of the object in the three directions. The thickness parameter captures extra scaling in the horizontal directions. Besides size and thickness, the parameters also include the image position of the head (footnote 1), the 3D orientation of the body, and the 2D inclination of the body. The orientations of the models are quantized into a few levels for computational efficiency. The origin of the rotation is chosen so that 0° corresponds to a human facing the camera. We use 0° and 90° to represent frontal/back and side views in this work. The 3D models assume that humans are perfectly upright, but there are chances that they incline their body slightly. We use one parameter to capture the inclination in 2D (as opposed to two parameters in 3D). Therefore, the parameters of the $i$-th human are $m_i = \{o_i, x_i, y_i, h_i, f_i, i_i\}$, which are orientation, position, size, thickness, and inclination respectively. We also write $(x_i, y_i)$ as $u_i$.
With a given camera model and a known ground plane, the 3D shape models automatically incorporate the perspective effect of camera projection (change in object image size and shape due to the change in object position and/or camera viewpoint). Compared to 2D shape models (e.g. [13]) or pre-learnt 2D appearance models (e.g. [50]), the 3D models are more easily applicable to a novel viewpoint.

Footnote 1: The image head location is an equivalent parameterization of the world location on the ground plane $(x_w, y_w)$ given the human height. The two are related by $[x, y, 1]^T \sim [p_1, p_2, p_3 h + p_4][x_w, y_w, 1]^T$, where $p_i$ is the $i$-th column of the camera projection matrix and $h$ is the height of the human. For clarity of presentation, we chose the ground plane to be $z = 0$.

B. Object Appearance Model

Besides the shape model, we also use a color histogram of the object, $\tilde{p} = \{\tilde{p}_1, \ldots, \tilde{p}_m\}$ ($m$ is the number of bins of the color histogram), defined within the object shape, as a representation of its appearance, which helps establish correspondence in tracking. We use a color histogram because it is insensitive to the non-rigidity of human motion. Furthermore, there exist efficient algorithms, e.g. the mean shift technique [5], to optimize a histogram-based objective function.

When calculating the color histogram, a kernel function $K_E(\cdot)$ with Epanechnikov profile [5] is applied to weight pixel locations so that the center has a higher weight than the boundary. Such a representation has been used in [6]. Our implementation uses a single RGB histogram with 512 bins (8 for each dimension), of all the samples within the three elliptic regions of our object model.

C. Background Appearance Model

The background appearance model is a modified version of a Gaussian distribution. Denote by $(\bar{r}_j, \bar{g}_j, \bar{b}_j)$ and $\Sigma_j = \mathrm{diag}\{\sigma_{r_j}^2, \sigma_{g_j}^2, \sigma_{b_j}^2\}$ the mean and the covariance of the color at pixel $j$. The probability of pixel $j$ being from the background is

$$P_b(I_j) = P_b(r_j, g_j, b_j) \propto \max\left\{ \exp\left[ -\left(\frac{r_j - \bar{r}_j}{\sigma_{r_j}}\right)^2 - \left(\frac{g_j - \bar{g}_j}{\sigma_{g_j}}\right)^2 - \left(\frac{b_j - \bar{b}_j}{\sigma_{b_j}}\right)^2 \right], \epsilon \right\} \tag{2}$$

where $\epsilon$ is a small constant.
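As an illustration of Eq. 2, the per-pixel background probability can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' implementation: the function name `background_probability`, the array layout (height x width x 3), and the default value of `eps` are assumptions made for the example, and the result is left unnormalized, as Eq. 2 is stated only up to proportionality.

```python
import numpy as np


def background_probability(frame, mean, sigma, eps=1e-6):
    """Unnormalized background probability per pixel, following Eq. 2.

    frame, mean, sigma: arrays of shape (H, W, 3) holding the observed color,
    the per-pixel background mean, and the per-pixel standard deviation.
    eps: the small constant that floors the Gaussian term (value assumed here).
    """
    # Normalized color difference in each of the three channels.
    diff = (np.asarray(frame, dtype=np.float64) - mean) / sigma
    # Gaussian term: exp of minus the sum of squared normalized differences.
    gaussian = np.exp(-np.sum(diff ** 2, axis=-1))
    # The floor acts as the uniform component that absorbs outliers.
    return np.maximum(gaussian, eps)
```

A pixel that matches the background mean exactly receives the value 1, while a pixel far from the mean is clamped to `eps` instead of decaying toward zero, which keeps the logarithm used later in the non-object likelihood bounded.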
The background model in Eq. 2 is a composition of a Gaussian distribution and a uniform distribution. The uniform distribution captures the outliers which are not modeled by the Gaussian distribution, to make the model more robust. The Gaussian parameters (mean and covariance) are updated continuously by the video stream, using only the non-moving regions. A more sophisticated background model (e.g. mixture of Gaussians [44] or non-parametric [10]) could be used to account for more variations, but this is not the focus of this work; we assume that comparison with the background model yields adequate foreground blobs.

D. The Prior Distribution

The prior distribution $P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})$ is decomposed into two parts given by

$$P\left(\theta^{(t)} \mid I^{(1,\ldots,t-1)}\right) \propto P\left(\theta^{(t)}\right) P\left(\theta^{(t)} \mid I^{(1,\ldots,t-1)}\right) \tag{3}$$

$P(\theta^{(t)})$ is independent of time, and is defined by $\prod_{i=1}^{n} P(|S_i|) P(m_i)$, where $S_i$ is the projected image of the $i$-th object and $|S_i|$ is its area. The prior of the image area $P(|S_i|)$ is modeled as being proportional to $\exp(-\lambda_1 |S_i|)\,[1 - \exp(-\lambda_2 |S_i|)]$ (footnote 2). The first term here penalizes large total object size to avoid situations where two hypotheses overlap a large portion of an image blob, while the second term penalizes objects with small image sizes, as they are more likely to be due to image noise. Although the prior on 2D image size could be converted to the 3D space, defining this prior in 2D is more natural, because these properties model the reliability of image evidence independent of the camera models.

Footnote 2: We have used a prior on the number of objects in [53] to constrain over-segmentation. However, we found that the prior on the area is more effective, due to the large variation of the image sizes of the objects (due to the camera perspective effect) and therefore their different contribution to the likelihood.

The priors on the human body parameters are considered independent. Thus we have $P(m_i) = P(o_i) P(x_i, y_i) P(h_i) P(f_i) P(i_i)$. We set $P(o_i = \text{frontal}) = P(o_i = \text{profile}) = 1/2$. $P(x_i, y_i)$ is a uniform distribution in the image region where a human head is plausible. $P(h_i)$ is a Gaussian distribution $N(\mu_h, \sigma_h^2)$ truncated in the range $[h_{min}, h_{max}]$, and $P(f_i)$ is a Gaussian distribution $N(\mu_f, \sigma_f^2)$ truncated in the range $[f_{min}, f_{max}]$. $P(i_i)$ is a Gaussian distribution $N(\mu_i, \sigma_i^2)$. In our experiments, we use $\mu_h = 1.7$ m, $\sigma_h = 0.2$ m, $h_{min} = 1.5$ m, $h_{max} = 1.9$ m; $\mu_f = 1$, $\sigma_f = 0.2$, $f_{min} = 0.8$, $f_{max} = 1.2$; $\mu_i = 0$, $\sigma_i = 3$. These parameters correspond to common adult body sizes.

We approximate the second term on the right side of Eq. 3, $P(\theta^{(t)} \mid I^{(1,\ldots,t-1)})$, by $P(\theta^{(t)} \mid \theta^{(t-1)})$, assuming $\theta^{(t-1)}$ encodes the necessary information from the past observations. For convenience of expression, we rearrange $\theta^{(t)}$ and $\theta^{(t-1)}$ as $\theta^{(t)} = \{(k_i^{(t)}, m_i^{(t)})\}_{i=1}^{N}$ and $\theta^{(t-1)} = \{(k_i^{(t-1)}, m_i^{(t-1)})\}_{i=1}^{N}$, where $N$ is the overall number of objects present in the two frames, so that one of $\{k_i^{(t)} = k_i^{(t-1)},\ m_i^{(t)} = \phi,\ m_i^{(t-1)} = \phi\}$ is true for each $i$. $k_i^{(t)} = k_i^{(t-1)}$ means object $k_i^{(t)}$ is a tracked object; $m_i^{(t)} = \phi$ means object $k_i^{(t-1)}$ is a dead object (i.e. its trajectory is terminated); and $m_i^{(t-1)} = \phi$ means object $k_i^{(t)}$ is a new object. With the rearranged state vector, we have

$$P\left(\theta^{(t)} \mid \theta^{(t-1)}\right) = \prod_{i=1}^{N} P\left(m_i^{(t)} \mid m_i^{(t-1)}\right).$$
The temporal prior of each object follows the definition

$$P\left(m_i^{(t)} \mid m_i^{(t-1)}\right) \propto \begin{cases} P_{assoc}\left(m_i^{(t)} \mid m_i^{(t-1)}\right), & k_i^{(t)} = k_i^{(t-1)} \\ P_{new}\left(m_i^{(t)}\right), & m_i^{(t-1)} = \phi \\ P_{dead}\left(m_i^{(t-1)}\right), & m_i^{(t)} = \phi \end{cases} \tag{4}$$

We assume that the position and the inclination of an object follow constant velocity models with Gaussian noise, and that the height and thickness follow a Gaussian distribution (for simplicity of presentation, we omit the velocity terms in the state). We use Kalman filters for temporal estimation; $P_{assoc}$ is therefore a Gaussian distribution. $P_{new}(m_i^{(t)}) = P_{new}(\tilde{u}_i^{(t)})$ and $P_{dead}(m_i^{(t-1)}) = P_{dead}(\tilde{u}_i^{(t-1)})$ are the likelihoods of the initialization of a new track at position $\tilde{u}_i^{(t)}$ and the termination of an existing track at position $\tilde{u}_i^{(t-1)}$ respectively. They are set empirically according to the distance of the object to the entrances/exits (the boundaries of the image and other areas where people move in/out). $P_{new}(u) \sim N(\mu(u), \Sigma_e)$, where $\mu(u)$ is the location of the closest entrance point to $u$ and $\Sigma_e$ is its associated covariance matrix, which is set manually or through a learning phase. $P_{dead}(\cdot)$ follows a similar definition.

E. Joint Image Likelihood for Multiple Objects and Background

The image likelihood $P(I \mid \theta)$ reflects the probability that we observe image $I$ (or some features extracted from $I$) given state $\theta$. Here we develop a likelihood model based on the color information of background and objects. Given a state vector $\theta$, we partition the image into different regions corresponding to different objects and the background. Denote by $\tilde{S}_i$ the visible part of the $i$-th object defined by $m_i$. The visible part of an object is determined by the depth order of all the objects, which can be inferred from their 3D positions and the camera model. The entire object region is $S = \bigcup_{i=1}^{n} S_i = \bigcup_{i=1}^{n} \tilde{S}_i$, and the $\tilde{S}_i$ are disjoint regions. We use $\bar{S}$ to denote the supplementary region of $S$, i.e. the non-object region. The relationship of the regions is illustrated in Fig. 4.

Fig. 4. First pane: the relationship of visible object regions and the non-object region. Rest panes: the color likelihood model. In $\tilde{S}_i$, the likelihood favors both the difference of an object hypothesis from the background and its similarity with its corresponding object in a previous frame. In $\bar{S}$, the likelihood penalizes the difference from the background model. Note that the elliptic models are used for illustration.

In the case of multiple objects which can possibly overlap in the image, the likelihood of the image given the state cannot be simply decomposed into the likelihoods of the individual objects. Instead, a joint likelihood of the whole image given all objects and the background model needs to be considered. The joint likelihood $P(I \mid \theta)$ consists of two terms corresponding to the object region and the non-object region:

$$P(I \mid \theta) = P\left(I^{S} \mid \theta\right) P\left(I^{\bar{S}} \mid \theta\right) \tag{5}$$

After obtaining $\tilde{S}_i$ by occlusion reasoning, the object region likelihood can be calculated by

$$P\left(I^{S} \mid \theta\right) = \prod_{i=1}^{n} P\left(I^{\tilde{S}_i} \mid m_i\right) \propto \exp\left\{ \lambda_S \sum_{i=1}^{n} |\tilde{S}_i| \left( \underbrace{-\lambda_b B(p_i, d_i)}_{(1)} + \underbrace{\lambda_f B(p_i, \tilde{p}_i)}_{(2)} \right) \right\} \tag{6}$$

where $d_i$ is the color histogram of the background image within the visibility mask of object $i$ and $\tilde{p}_i$ is the color histogram of the object, both weighted by the kernel function $K_E(\cdot)$; $p_i$ denotes the histogram observed in the current image within the same mask. $B(p, d) = \sum_{j=1}^{m} \sqrt{p_j d_j}$ is the Bhattacharyya coefficient, which reflects the similarity of two histograms.

This likelihood favors both the difference of an object hypothesis from the background and its similarity with its corresponding object in a previous frame (Fig. 4). This enables simultaneous segmentation and tracking in the same objective function. We call the two terms background exclusion and
object attraction, respectively. The background exclusion concept was also proposed by [33]. $\lambda_b$ and $\lambda_f$ weight the relative contribution of the two terms (we constrain $\lambda_b + \lambda_f = 1$). The object attraction term is the same as the likelihood function used in [6]. For an object without a correspondence, i.e. a new object, only the background exclusion part is used.

The non-object likelihood is calculated by

$$P\left(I^{\bar{S}} \mid \theta\right) = \prod_{j \in \bar{S}} \left(P_b(I_j)\right)^{\lambda_{\bar{S}}} = \exp\left( \lambda_{\bar{S}} \sum_{j \in \bar{S}} e_j \right) \tag{7}$$

where $e_j = \log(P_b(I_j))$ is the log-probability of pixel $j$ belonging to the background model, with $P_b$ as defined in Equation 2. $\lambda_S$ in Equation 6 and $\lambda_{\bar{S}}$ in Equation 7 weight the balance of the foreground and the background, considering the different probabilistic models being used.

The posterior probability is obtained by combining the prior, Equation 3, and the likelihood, Equation 5.

V. COMPUTING MAP BY EFFICIENT MCMC

Computing the MAP is an optimization problem. Due to the joint consideration of an unknown number of objects, the solution space contains subspaces of varying dimensions. It also includes both discrete and continuous variables. These properties make the optimization challenging.

We use a Markov chain Monte Carlo method with jump/diffusion dynamics to sample the posterior probability. Jumps cause the Markov chain to move between subspaces with different dimensions and traverse the discrete variables; diffusions make the Markov chain sample continuous variables. In the process of sampling, the best solution is recorded, and the uncertainty associated with the solution is also obtained. Fig. 5 gives a block diagram of the computation process.

The MCMC-based algorithm is an iterative process, starting from an initial state. In each iteration, a candidate is proposed from the state in the previous iteration, assisted by image features. The candidate is accepted probabilistically according to the Metropolis-Hastings rule [17]. The state corresponding to the maximum posterior value is recorded and becomes the solution.

Suppose we want to design a Markov chain with stationary distribution $P(\theta) = P(\theta^{(t)} \mid I^{(t)}, \theta^{(t-1)})$.
At the $g$-th iteration, we sample a candidate state $\theta'$ according to $\theta_{g-1}$ from a proposal distribution $q(\theta_g \mid \theta_{g-1})$. The candidate state $\theta'$ is accepted with the probability

$$p = \min\left\{ 1, \frac{P(\theta')\, q(\theta_{g-1} \mid \theta')}{P(\theta_{g-1})\, q(\theta' \mid \theta_{g-1})} \right\}$$

(footnote 3). If the candidate state $\theta'$ is accepted, $\theta_g = \theta'$; otherwise, $\theta_g = \theta_{g-1}$. It can be proven that the Markov chain constructed in this way has its stationary distribution equal to $P(\cdot)$, independent of the choice of the proposal probability $q(\cdot)$ and the initial state $\theta_0$ [47]. However, the choice of the proposal probability $q(\cdot)$ can affect the efficiency of the MCMC significantly. Random proposal probabilities will lead to a very slow mixing rate. Using more informed proposal probabilities, e.g. as in data-driven MCMC [48], will make the Markov chain traverse the solution space more efficiently. Therefore the proposal distribution is written as $q(\theta_g \mid \theta_{g-1}, I)$. If the proposal probability is informative enough so that each sample can be thought of as a hypothesis, then the MCMC approach becomes a stochastic version of the hypothesize-and-test approach.

In general, the original version of MCMC has a dimension matching problem for a solution space with varying dimensionality. A variation of MCMC, called trans-dimensional MCMC [14], was proposed to solve this problem. However, with some appropriate assumptions and simplifications, trans-dimensional MCMC can be reduced to the standard MCMC. We address this issue later in this section.

A. Markov Chain Dynamics

We design the following reversible dynamics for the Markov chain to traverse the solution space. The dynamics correspond to the proposal distribution with a mixture density $q(\theta' \mid \theta_{g-1}, I) = \sum_{a \in A} p_a\, q_a(\theta' \mid \theta_{g-1}, I)$, where $A = \{\text{add}, \text{remove}, \text{establish}, \text{break}, \text{exchange}, \text{diff}\}$ is the set of all dynamics. The mixing probabilities $p_a$ are the chances of selecting the different dynamics, and $\sum_{a \in A} p_a = 1$.

We assume that we have the sample in the $(g-1)$-th iteration, $\theta^{(t)}_{g-1} = \{(k_1, m_1), \ldots, (k_n, m_n)\}$, and now propose a candidate $\theta'$ for the $g$-th iteration ($t$ is omitted where there is no ambiguity).
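The acceptance rule and the mixture of dynamics described above can be sketched generically as follows. This is a hedged toy example, not the tracker itself: the names `mh_step`, `run_chain`, `random_walk` and `log_standard_normal` are hypothetical, the state is a single float instead of a set of human hypotheses, and each move is assumed to return the forward and reverse log proposal densities itself (with mixing weights that are the same for a move and its reverse).

```python
import math
import random


def mh_step(state, log_posterior, moves, weights, rng):
    """One Metropolis-Hastings iteration with a mixture of proposal moves.

    Each move is a callable returning the tuple
    (candidate, log q(candidate | state), log q(state | candidate)).
    """
    # Select one dynamic according to the mixing probabilities p_a.
    move = rng.choices(moves, weights=weights, k=1)[0]
    candidate, log_q_forward, log_q_reverse = move(state, rng)
    # log of P(candidate) q(state | candidate) / (P(state) q(candidate | state)).
    log_ratio = (log_posterior(candidate) - log_posterior(state)
                 + log_q_reverse - log_q_forward)
    accept_prob = math.exp(min(0.0, log_ratio))  # min{1, ratio}
    return candidate if rng.random() < accept_prob else state


def run_chain(initial, log_posterior, moves, weights, iterations, seed=0):
    """Run the chain and record the state with the highest posterior (MAP)."""
    rng = random.Random(seed)
    state = best = initial
    for _ in range(iterations):
        state = mh_step(state, log_posterior, moves, weights, rng)
        if log_posterior(state) > log_posterior(best):
            best = state
    return best


def random_walk(state, rng, step=0.5):
    """Symmetric diffusion move: forward and reverse densities cancel."""
    return state + rng.gauss(0.0, step), 0.0, 0.0


def log_standard_normal(x):
    """Toy log posterior (up to a constant) with its mode at 0."""
    return -0.5 * x * x
```

Calling `run_chain(5.0, log_standard_normal, [random_walk], [1.0], 2000)` starts far from the mode and returns a state close to 0. In the setting of the paper, the list of moves would hold one callable per dynamic (add, remove, establish, break, exchange, diff) and the log posterior would combine the prior and the joint image likelihood.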
The dynamics are as follows.

Object hypothesis addition: Sample the parameters of a new human hypothesis $(k_{n+1}, m_{n+1})$ and add it to $\theta_{g-1}$. $q_{add}(\theta_{g-1} \cup \{(k_{n+1}, m_{n+1})\} \mid \theta_{g-1}, I)$ is defined in a data-driven way whose details will be given later.

Object hypothesis removal: Randomly select an existing human hypothesis $r \in [1, n]$ with a uniform distribution and remove it. $q_{remove}(\theta_{g-1} \setminus \{(k_r, m_r)\} \mid \theta_{g-1}) = 1/n$. If $k_r$ has a correspondence in $\theta^{(t-1)}$, then that object becomes dead.

Establish correspondence: Randomly select a new object $r$ in $\theta^{(t)}_{g-1}$ and a dead object $r'$ in $\theta^{(t-1)}$, and establish their temporal correspondence. $q_{establish}(\theta' \mid \theta_{g-1}) \propto \|u_r - u_{r'}\|^{-2}$ for all the qualified pairs.

Break correspondence: Randomly select an object $r$ where $k_r \in \theta^{(t-1)}$ with a uniform distribution and change $k_r$ to a new object (the same object in $\theta^{(t-1)}$ becomes dead). $q_{break}(\theta' \mid \theta_{g-1}) = 1/n'$, where $n'$ is the number of objects in $\theta^{(t)}_{g-1}$ that have correspondences in the previous frame.

Exchange identity: Exchange the IDs of two close-by objects. Randomly select two objects $r_1, r_2 \in [1, n]$ and exchange their IDs. $q_{exchange}(r_1, r_2) \propto \|u_{r_1} - u_{r_2}\|^{-2}$. Identity exchange can also be replaced by the composition of breaking and establishing correspondences. It is used to ease the traversal, since breaking and establishing correspondences may lead to a big decrease in the probability and are less likely to be accepted.

Parameter update: Update the continuous parameters of an object. Randomly select an existing human hypothesis $r \in [1, n]$ with a uniform distribution, and update its continuous parameters: $q_{diff}(\theta' \mid \theta_{g-1}) = (1/n)\, q_d(m_r \to m'_r)$.

Among the above, addition and removal are a pair of reverse moves, as are establishing and breaking correspondences; exchanging identity and parameter updating are their own reverse moves.

Fig. 5. The block diagram of the MCMC tracking algorithm.

Footnote 3: Based on our experiments, we find that approximating the ratio in the second term with just the posterior probability ratio, $P(\theta')/P(\theta_{g-1})$, gives almost the same results as the complete computation; hence we use this approximation in our implementation.

B. Informed Proposal Probability

In theory, the proposal probability $q(\cdot)$ does not affect the stationary distribution. However, different $q(\cdot)$ lead to different performance. The number of samples needed to get a good solution strongly depends on the proposal probabilities. In this application, the proposal probability of adding a new object and the update of the object parameters are the two most important ones. We use the following informed proposal probabilities to make the Markov chain more intelligent and thus have a higher acceptance rate.

Object addition: We add human hypotheses from three cues: foreground boundaries, intensity edges, and foreground residue (foreground with the existing objects carved out). In [54] a method to detect the heads which are on the boundary of the foreground is described. The basic idea is to find the local vertical peaks of the boundary. The peaks are further verified by checking whether there are enough foreground pixels below them according to a human height range and the camera model.
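The boundary-peak idea can be illustrated with a simplified sketch. It is not the detector of [54]: the function name `head_candidates_from_boundary` and its parameters are hypothetical, a fixed expected height in pixels stands in for the camera model and the human height range, and "enough foreground pixels below" is approximated by a fill ratio in a narrow window under the peak.

```python
import numpy as np


def head_candidates_from_boundary(mask, human_height_px, min_fill=0.6, half_width=1):
    """Find local vertical peaks of the upper foreground boundary.

    mask: boolean array (rows, cols), True for foreground pixels.
    human_height_px: assumed image height of a person (stands in for the
        camera model, which would give this per image location).
    Returns a list of (x, y) head-top candidates.
    """
    rows, cols = mask.shape
    has_fg = mask.any(axis=0)
    # Row index of the first foreground pixel in each column; `rows` if none.
    top = np.where(has_fg, mask.argmax(axis=0), rows)
    candidates = []
    for x in range(cols):
        if not has_fg[x]:
            continue
        left = top[x - 1] if x > 0 else rows
        right = top[x + 1] if x < cols - 1 else rows
        # A peak is at least as high as both neighbours and strictly higher
        # than one of them (smaller row index means higher in the image).
        is_peak = top[x] <= left and top[x] <= right and (top[x] < left or top[x] < right)
        if not is_peak:
            continue
        # Verify the peak: enough foreground below it within the expected height.
        y0 = int(top[x])
        y1 = min(rows, y0 + human_height_px)
        x0, x1 = max(0, x - half_width), min(cols, x + half_width + 1)
        if mask[y0:y1, x0:x1].mean() >= min_fill:
            candidates.append((x, y0))
    return candidates
```

In this sketch a flat-topped blob produces a candidate at each end of the plateau, and a thin spike of noise is rejected by the fill check; a real implementation would smooth the boundary and use the calibrated height range at each location.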
This boundary-based detector has a high detection rate and is also effective when the human is small and image edges are not reliable; however, it cannot detect the heads in the interior of the foreground blobs. Fig. 6(a) shows an example of head detection from foreground boundaries.

Fig. 6. Head detection. (a) Head detection from foreground blob boundaries; (b) Distance transformation on Canny edge detection result; (c) The Ω-shape head-shoulder model (black: head-shoulder shape, white: normals); and (d) Head detection from intensity edges.

The second head detection method is based on an Ω-shape head-shoulder model (this term was first introduced in [53]). This detector matches the Ω-shape edge template with the image intensity edges to find the head candidates. First, a Canny edge detector is applied to the foreground region of the input image. A distance transformation [1] is computed on the edge map. Fig. 6(b) shows the exponential edge map, where $E(x, y) = \exp(-\lambda D(x, y))$ ($D(x, y)$ is the distance to the closest edge point and $\lambda$ is a factor to control the response field depending on the object scale in the image; we use $\lambda = 0.25$). In addition, the coordinates of the closest edge pixel are recorded as $C(x, y)$. The unit image gradient vector $O(x, y)$ is computed only at edge pixels. The Ω-shape model, see Fig. 6(c), is derived by projecting a generic 3D human model to the image and taking the contour of the whole head and the upper quarter of the torso as the shoulder. The normals of the contour points are also computed. The size of the human model is determined by the camera calibration assuming an average human height. Denote by $\{u_1, \ldots, u_k\}$ and $\{v_1, \ldots, v_k\}$ the positions and the unit normals of the model points, respectively, when the head top is at $(x, y)$. The model is matched with the image as

$$S(x, y) = \frac{1}{k} \sum_{i=1}^{k} e^{-\lambda D(u_i)} \left( v_i \cdot O(C(u_i)) \right).$$

A head candidate map is constructed by evaluating $S(x, y)$ on every pixel in the dilated foreground region. After smoothing it, we find all the peaks above a threshold chosen to give a very high detection rate, which may also result in a high false alarm rate.
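As a minimal sketch of the matching score $S(x, y)$, the function below evaluates it for one head-top position. It assumes the edge pixels and their unit gradients are already available as a dictionary, and it finds the closest edge pixel $C(u_i)$ by brute force instead of a precomputed distance transformation; `omega_score` and its argument names are hypothetical.

```python
import math


def omega_score(head_top, model_offsets, model_normals, edges, lam=0.25):
    """Score S(x, y) of an edge template placed with its head top at head_top.

    model_offsets: (dx, dy) of each model point relative to the head top.
    model_normals: unit normal (nx, ny) of each model point.
    edges: dict mapping edge pixel (x, y) to its unit gradient (gx, gy).
    lam: the factor lambda of the exponential edge map (0.25 in the text).
    """
    hx, hy = head_top
    total = 0.0
    for (dx, dy), (nx, ny) in zip(model_offsets, model_normals):
        ux, uy = hx + dx, hy + dy
        # C(u): closest edge pixel; D(u): distance to it (brute force here).
        cx, cy = min(edges, key=lambda e: (e[0] - ux) ** 2 + (e[1] - uy) ** 2)
        dist = math.hypot(cx - ux, cy - uy)
        gx, gy = edges[(cx, cy)]
        # Exponential edge response weighted by normal/gradient agreement.
        total += math.exp(-lam * dist) * (nx * gx + ny * gy)
    return total / len(model_offsets)
```

A template lying exactly on edges whose gradients agree with its normals scores 1; the score decays exponentially as the template moves away from the edges.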
An example of head detection from intensity edges is shown in Fig. 6(d). The false alarms tend to happen in areas of rich texture where there are abundant edges of various orientations.

Finally, after some human objects obtained from the first two methods are hypothesized and removed from the foreground, the foreground residue map $R = F - S$ is computed. A morphological open operation with a vertically elongated structural element is applied to remove thin bridges and small/thin residues. From each connected component $c$, human candidates can be generated assuming 1) the centroid of $c$ is aligned with the center of the human body; 2) the top center point of $c$ is aligned with the human head; or 3) the bottom center point of $c$ is aligned with the human feet.

The proposal probability for addition combines these three head detection methods: $q_a(k, m) = \sum_{i=1}^{3} \lambda_{a_i} q_{a_i}(k, m)$, where $\lambda_{a_i}$, $i = 1, 2, 3$, are the mixing probabilities of the three methods and we use $\lambda_{a_i} = 1/3$. $q_{a_i}(\cdot)$ samples $m$ first and then $k$: $q_{a_i}(k, m) = q_{a_i}(m)\, q_{a_i}(k \mid m)$, and $q_{a_i}(m) = q_o(o)\, q_{a_i}(u)\, q_h(h)\, q_f(f)\, q_i(i)$. $q_{a_i}(u)$ answers the question of where to add a new human hypothesis. In practice, $q_o(o)$, $q_h(h)$, $q_f(f)$, and $q_i(i)$ use their respective prior distributions, and $q_{a_i}(u)$ is a mixture of Gaussians based on the bottom-up detection results. For example, denote by $HC_1 = \{(x_i, y_i)\}_{i=1}^{N_1}$ the head candidates obtained by the first method; then

$$q_{a_1}(u) = q_{a_1}(x, y) \propto \sum_{i=1}^{N_1} N\left((x, y);\, (x_i, y_i),\, \mathrm{diag}\{\sigma_x^2, \sigma_y^2\}\right).$$

The definitions of $q_{a_2}(u)$ and $q_{a_3}(u)$ are similar. After $u$ is sampled, $q(k \mid m) = q(k \mid u)$ samples $k$ from $\{k^{(t-1)}_{d_1}, \ldots, k^{(t-1)}_{d_{n_d}}, \mathrm{new}\}$ according to $P(u \mid u^{(t-1)}_{d_i})$ (see Equation 4), $i = 1, \ldots, n_d$, and $P_{new}(u)$, where $n_d$ is the number of dead objects.

The addition and removal actions change the dimension of the state vector. When calculating the acceptance probability, we need to compute the ratio of probabilities from spaces with different dimensions. Smith et al. [41] use an explicit strategy of trans-dimensional MCMC [14] to deal with the dimension-matching problem. We do not need an explicit strategy to match the dimension. Since the trans-dimensional actions only add or remove one object at one iteration, leaving the other objects unchanged, the Jacobian in [14] is unit, as in [41]. So our formulation is just a special case of the more general theory.

Parameter update: We use two ways to update the model parameters: $q_{diff}(m'_r \mid m_r) = \lambda_{d_1} q_{d_1}(m'_r \mid m_r) + \lambda_{d_2} q_{d_2}(m'_r \mid m_r)$, $\lambda_{d_i} = 1/2$. $q_{d_1}(\cdot)$ uses stochastic gradient descent to update the object parameters: $q_{d_1}(m'_r \mid m_r) \propto N(m_r - k\, \frac{dE}{dm}, w)$, where $E = -\log P(\theta^{(t)} \mid I^{(t)}, \theta^{(t-1)})$ is the energy function, $k$ is a scalar to control the step size, and $w$ is random noise to avoid local maxima. A mean shift vector computed in the visible region provides an approximation of the gradient of the object likelihood w.r.t. the position: $q_{d_2}(m'_r \mid m_r) \propto N(m^{ms}_r, w)$, where $m^{ms}_r$ is the new location computed from the mean shift procedure (details are given in a separate Appendix). We assume that the change of the posterior probability by other components and due to occlusion can be absorbed in the noise term. The mean shift has an adaptive step size and has better convergence behavior than numerically computed gradients. The rest of the parameters follow their numerically computed gradients. Compared to the original color-based mean shift tracking, the background exclusion term in Equation 6 can utilize a known background model, which is available for a stationary camera.
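The two-component update proposal $q_{diff}$ can be sketched as follows. This is only an illustration under assumptions: the energy gradient and the mean shift target are passed in as hypothetical callables (`grad_energy`, `mean_shift_target`), the noise is an isotropic Gaussian with standard deviation `noise_std`, and both components are chosen with probability 1/2 as in the text.

```python
import random


def propose_update(m, grad_energy, mean_shift_target, step=0.1, noise_std=1.0, rng=random):
    """Draw m' from 0.5 * N(m - step * dE/dm, noise) + 0.5 * N(m_ms, noise).

    m: current continuous parameters (list of floats).
    grad_energy(m): hypothetical callable returning dE/dm as a list.
    mean_shift_target(m): hypothetical callable returning the mean shift location.
    """
    if rng.random() < 0.5:
        # Stochastic gradient descent step on the energy E.
        center = [mi - step * gi for mi, gi in zip(m, grad_energy(m))]
    else:
        # Jump to the location suggested by the mean shift procedure.
        center = list(mean_shift_target(m))
    # Additive Gaussian noise around the chosen center.
    return [rng.gauss(c, noise_std) for c in center]
```

With `noise_std=0.0` the function returns one of the two centers exactly, which makes the mixture structure easy to inspect.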
As we observe in our experiments, tracking with both the object attraction term and the background exclusion term is more robust to changes in object appearance, e.g. when going into shadow, than using the object attraction term alone.

Theoretically, the Markov chain designed should be irreducible and reversible; however, the use of the above data-driven proposal probabilities makes the approach not conform to the theory exactly. First, irreducibility requires that the Markov chain be able to reach any possible point in the solution space. In practice, however, the proposal probabilities of some points are very small, close to zero. For example, the proposal probability of adding a hypothesis at a position where no head candidate is detected nearby is extremely low. With a finite number of iterations, a state including such a hypothesis will in practice never be sampled. Although this breaks the completeness of the Markov chain, we argue that skipping the parts of the solution space where no sign of objects is observed brings no harm to the quality of the final solution and makes the search more efficient. Second, the use of mean shift, which is a non-parametric method, makes the chain irreversible. Mean shift can be seen as an approximation of the gradient, while stochastic gradient descent is essentially a Gibbs sampler [39], which is a special case of the Metropolis-Hastings sampler with acceptance ratio always equal to one [25]. However, mean shift is much faster than a random walk for estimating the parameters of the object. We choose to use these techniques at the loss of some theoretical beauty, because experimentally they make our method much more efficient and the results are good.

C. Incremental Computation

As the MCMC process may need hundreds or more samples to approximate the distribution, we need an efficient method to compute the likelihood for each proposed state. In one iteration of the algorithm, at most two objects may change.
Such a change affects the likelihood only locally; the new likelihood can therefore be computed more efficiently by updating it incrementally within the neighborhood of the changed objects (the area associated with the changed objects and those overlapping with them). Take the addition action as an example. When a new human hypothesis is added to the state vector, for the likelihood of the non-object region $P(I_{\bar{S}} \mid \theta)$ we only need to remove those background pixels taken by the new hypothesis. For the likelihood of the object region $P(I_S \mid \theta)$, as the new hypothesis may overlap with some existing hypotheses, we need to recompute the visibility of the object regions connected to the new hypothesis and then update the likelihood of these neighboring objects. The incremental computations of the likelihood for the other actions are similar. Although a joint state and a joint likelihood are used, the computation of each iteration is greatly reduced through the incremental computation. This is in contrast to the particle filter, where the evaluation of each particle (joint state) needs the computation of the full joint likelihood.

The appearance models of the tracked objects are updated after processing each frame to adapt to changes in object appearance. We update the object color histogram using an IIR filter $\bar{p}^{(t)} = \lambda_p p^{(t)} + (1 - \lambda_p)\, \bar{p}^{(t-1)}$. We choose to update the appearance conservatively: we use a small $\lambda_p = 0.01$ and stop updating if the object is occluded by more than 25% or its position covariance is too big.

VI. EXPERIMENTAL RESULTS

We have tested the system on many types of data and show only some representative results. We first show results on an outdoor scene video and then on a standard evaluation dataset of indoor scene videos. Video results are submitted as supplementary material. Among all the parameters of our approach, many are natural, meaning that they correspond to measurable physical quantities (e.g. 3D human height); therefore setting their values is straightforward. We use the same set of parameters for all the sequences.
This suggests that our approach is not sensitive to the choice of parameter values. We list here the values of the parameters which are not mentioned in the previous sections. For the size prior (in Sec. IV-D), $\lambda_1 = 0.04$ and $\lambda_2 =$ [value missing]. For the likelihood, $\lambda_f = 0.5$ and $\lambda_b = 0.5$ in Equation 6, $\lambda_S = 25$ in Eqn. 6 and $\lambda_{\bar{S}} = 0.005$ in Eqn. 7. For the mixing probabilities of different types of dynamics, we use $P_{add} = 0.1$, $P_{remove} = 0.1$, $P_{establish} = 0.1$, $P_{break} = 0.1$, $P_{exchange} = 0.1$ and $P_{diff} = 0.5$. We also apply a hard constraint of 25 pixels on the minimum image height of a human.

We also want to comment here on the choice of parameters related to the peakedness of a distribution in sampling algorithms. The image likelihood is usually a combination of a number of components (sites, e.g. pixels). Inevitable simplifications (e.g. the independence assumption) in probabilistic modeling may result in excessive peakedness of the distribution. This affects the performance of sampling algorithms such as MCMC and the particle filter by concentrating the samples in one location (i.e. the highest peak) of the state space, which makes them degenerate into greedy algorithms. Eliminating the dependences between different components can be extremely difficult or infeasible. From an engineering point of view, one should set the values of the parameters (e.g. $\lambda_S$ and $\lambda_{\bar{S}}$, while keeping their ratio constant) so that the likelihood ratios of different hypotheses are reasonable, the Markov chains can traverse efficiently, and particle filters can maintain multiple hypotheses. In a similar fashion, simulated annealing has been used in the sampling process to reduce the effect of the peakedness and force convergence [48], [8]; however, with a varying temperature the samples no longer come from a single posterior distribution.

A. Evaluation on an Outdoor Scene

We show results on an outdoor video sequence, which we call the Campus Plaza sequence; it contains 900 frames. This sequence is captured from a camera above a building gate with a 40° camera tilt angle. The frame size is [dimensions missing] pixels, and the sampling rate is 30 FPS. In this sequence, 33 humans pass by the scene, with 23 going out of the field of view and 10 going inside a building. The inter-human occlusions in this sequence are large. There are overall 20 occlusion events, 9 of which are heavy occlusions (over 50% of the object is occluded). For MCMC sampling, we use 500 iterations per frame.
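The mixing probabilities of the move types and the fixed budget of 500 iterations per frame fit into a generic Metropolis-Hastings loop. The sketch below is a simplified illustration, not the authors' implementation: `propose` and `log_posterior` are hypothetical callables supplied by the caller, and the loop simply keeps the best state visited as the MAP estimate.

```python
import math
import random

# Mixing probabilities of the move types, as listed in the text.
MOVE_PROBS = {
    "add": 0.1, "remove": 0.1, "establish": 0.1,
    "break": 0.1, "exchange": 0.1, "update": 0.5,
}


def sample_move(rng):
    """Pick a move type according to the mixing probabilities."""
    moves = list(MOVE_PROBS)
    return rng.choices(moves, weights=[MOVE_PROBS[m] for m in moves], k=1)[0]


def run_chain(state, propose, log_posterior, iterations=500, rng=None):
    """Metropolis-Hastings loop that returns the best state seen.

    propose(state, move, rng) is a hypothetical callable returning
    (new_state, log_q_forward, log_q_reverse), or None if the move
    does not apply to the current state.
    """
    rng = rng or random.Random()
    current, current_lp = state, log_posterior(state)
    best, best_lp = current, current_lp
    for _ in range(iterations):
        proposal = propose(current, sample_move(rng), rng)
        if proposal is None:
            continue
        new_state, log_q_fwd, log_q_rev = proposal
        new_lp = log_posterior(new_state)
        # Log acceptance ratio: posterior ratio times reverse/forward proposal ratio.
        log_alpha = (new_lp - current_lp) + (log_q_rev - log_q_fwd)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            current, current_lp = new_state, new_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best
```

Scaling `log_posterior` by a constant changes how peaked the target is, which is one way to see the effect of peakedness discussed above on the acceptance rate.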
We show in Fig. 7 some sample frames from the result on the Campus Plaza sequence. The identities of the objects are shown by their ID numbers displayed on the head. We evaluate the results by trajectory-based errors. Trajectories whose lengths are less than 10 frames are discarded. Among the 33 human objects, the trajectories of 3 objects are broken once (ID 28 to ID 35, ID 31 to ID 32, ID 30 to ID 41, all between frame 387 and frame 447, as marked with arrows in Fig. 7); the rest of the trajectories are correct. Usually the trajectories are initialized once the humans are fully in the scene; some start when the objects are only partially inside. Only the initializations of three objects (objects 31, 50, 52) are noticeably delayed (by 50, 55 and 60 frames respectively after they are fully in the scene). Partial occlusion and/or lack of contrast with the background are the causes of the delays.

To justify our approach for integrated segmentation and tracking, we compare the tracking result with the result of frame-by-frame segmentation as in [53], using frame-based evaluation metrics. With tracking, the detection rate and the false alarm rate are 98.13% and 0.27% respectively. With segmentation alone, the detection rate and the false alarm rate on the same sequence are 92.82% and 0.18%. With tracking, not only are the temporal correspondences obtained, but the detection rate also increases from 92.82% to 98.13% while the false alarm rate is kept low.

B. Evaluation on Indoor Scene Sequences

Fig. 8. Tracking evaluation criteria.

Next, we describe the results of our method on an indoor video set, the CAVIAR video corpus [56] (see footnote 4). We test our system on the 26 shopping center corridor view sequences, overall 36,292 frames, captured by a camera looking down towards a corridor. The frame size is [dimensions missing] pixels, and the sampling rate is 25 FPS. Some 2D-3D point correspondences are given from which the camera can be calibrated. However, we compute the camera parameters by an interactive method [26]. The inter-object occlusion in this set is also intensive.
There are overall 96 occlusion events in this set; 68 of the 96 are heavy occlusions, and 19 of the 96 are almost full occlusions (more than 90% of the object is occluded). Many interactions between humans, such as talking and hand shaking, make this set very difficult for tracking. For MCMC sampling, we again use 500 iterations per frame.

Footnote 4: In the provided ground truth, there are 232 trajectories overall. However, 5 of these are mostly out of sight, e.g. only one arm or the head top is visible; we set these as "don't care".

Fig. 7. Selected frames of the tracking results from Campus Plaza. The numbers on the heads show identities. (Please note that the two people who are sitting on the two sides are in the background model and are therefore not detected.)

For such a big data set, it is infeasible to enumerate the errors as we did for the Campus Plaza sequence. Instead we define five statistical criteria: 1) the number of mostly tracked trajectories; 2) the number of mostly lost trajectories; 3) the number of trajectory fragments; 4) the number of false trajectories (a result trajectory corresponding to no object); and 5) the frequency of identity switches (identity exchanging between a pair of result trajectories). Fig. 8 illustrates their definitions. These five categories are by no means a complete classification; however, they cover most of the typical errors observed on this set. Other performance measures have been proposed in recent evaluations, such as the Multiple Object Tracking Precision and Accuracy in the CLEAR 2006 evaluation [57]. We do not use these measures because they are less intuitive, as they try to integrate multiple factors into one scalar-valued measure.

Table I gives the performance of our method. We developed evaluation software to count the number of mostly tracked trajectories, mostly lost trajectories, false alarms and fragments automatically. Denote a ground-truth trajectory by $\{G^{(i)}, \ldots, G^{(i+n)}\}$, where $G^{(t)}$ is the object state at the $t$-th frame; denote a hypothesized trajectory by $\{H^{(j)}, \ldots, H^{(j+m)}\}$. The overlap ratio of the ground-truth object and the hypothesized object at the $t$-th frame is defined by

$$\mathrm{Overlap}(G^{(t)}, H^{(t)}) = \frac{|\mathrm{Reg}(G^{(t)}) \cap \mathrm{Reg}(H^{(t)})|}{|\mathrm{Reg}(G^{(t)}) \cup \mathrm{Reg}(H^{(t)})|} \quad (8)$$

where $\mathrm{Reg}(\cdot)$ is the image region of the object and $|\cdot|$ is its area. If $\mathrm{Overlap}(G^{(t)}, H^{(t)}) > 0.5$, we say $\{G^{(t)}, H^{(t)}\}$ is a potential match. The overlap ratio of the ground-truth trajectory and the hypothesized trajectory is defined by

$$\mathrm{Overlap}(G^{(i:i+n)}, H^{(j:j+m)}) = \frac{\sum_{t=\max(i,j)}^{\min(i+n,\, j+m)} \delta\left(\mathrm{Overlap}(G^{(t)}, H^{(t)}) > 0.5\right)}{\max(i+n,\, j+m) - \min(i, j) + 1} \quad (9)$$

where $\delta(\cdot)$ is an indicator function. Given that one sequence has $N_G$ ground-truth trajectories $\{G_k\}_{k=1}^{N_G}$ and $N_H$ hypothesized trajectories $\{H_k\}_{k=1}^{N_H}$, we compute the overlap ratios for all ground-truth/hypothesis pairs $\{G_k, H_l\}$; the pairs whose overlap ratios are larger than 0.8 are considered to be potential matches. Then the Hungarian matching algorithm [22] is used to find the best matches, which are considered to be mostly tracked. To count the mostly lost trajectories, we define a recall ratio by replacing the denominator of Eq. 9 with $n + 1$. If for $G_k$ there is no $H_l$ such that the recall ratio between them is larger than 0.2, we consider $G_k$ to be mostly lost. To count the false alarms and fragments, we define a precision ratio by replacing the denominator of Eq. 9 with $m + 1$. If for $H_l$ there is no $G_k$ such that the precision ratio between them is larger than 0.2, we consider $H_l$ a false alarm; if there is a $G_k$ such that the precision ratio between them is larger than 0.8 but the overlap ratio is smaller than 0.8, we consider $H_l$ to be a fragment of $G_k$. We first count the mostly tracked trajectories and remove the matched parts of the ground-truth tracks. Second, we count the trajectory fragments with a greedy, iterative algorithm. At each round, the fragment with the highest overlap ratio is found, and then the matched part of the ground-truth track is removed; this procedure is repeated until there are no more valid fragments. Lastly, we count the mostly lost trajectories and the false alarms.
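The ratios of Eqs. 8 and 9 can be sketched in a few lines. The sketch assumes, for illustration only, that object regions are axis-aligned boxes `(x1, y1, x2, y2)` and that each trajectory is a dictionary from consecutive frame indices to boxes; the function names are hypothetical.

```python
def box_overlap(a, b):
    """Eq. 8 for axis-aligned boxes (x1, y1, x2, y2): intersection over union."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def trajectory_ratios(gt, hyp, frame_threshold=0.5):
    """Overlap (Eq. 9), recall and precision ratios of two trajectories.

    gt, hyp: dicts mapping consecutive frame indices to boxes.
    """
    # Numerator of Eq. 9: frames in the common interval that are potential matches.
    matched = sum(
        1 for t in gt.keys() & hyp.keys()
        if box_overlap(gt[t], hyp[t]) > frame_threshold
    )
    # Denominator of Eq. 9: length of the union of the two frame intervals.
    span = max(max(gt), max(hyp)) - min(min(gt), min(hyp)) + 1
    return {
        "overlap": matched / span,
        "recall": matched / len(gt),      # denominator n + 1
        "precision": matched / len(hyp),  # denominator m + 1
    }
```

For example, a hypothesis that covers exactly the second half of a 10-frame ground-truth track and then continues for 5 more frames has recall 0.5, precision 0.5 and overlap 1/3, so it is neither mostly tracked nor a fragment under the thresholds above.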
This counting procedure cannot classify all ground-truth and hypothesized tracks; the unlabeled ones are mainly due to identity switches. We count the frequency of identity switches visually. Some sample frames and results are shown in Fig. 9. Most of the missed detections are due to humans wearing clothing with a color very similar to that of the background, so that some part of the object is misclassified as background; see frame 1413 of Fig. 9(b) for an example. The fragmentation of trajectories and the ID switches are mainly due to full occlusions; see frame 496 of Fig. 9(a) and frame 316 of Fig. 9(b) for examples. Our method can deal with partial occlusion well. For full occlusion, classifying an object as going into an occluded state and associating it when it reappears could potentially improve the performance. The false alarms are mainly due to shadows, reflections and sudden brightness changes which are misclassified as foreground; see frame 563 of Fig. 9(a). A more sophisticated background model and shadow model (e.g. [32]) could be used to improve the result. In general, our method performs reasonably well on the CAVIAR set, though not as well as on the Campus Plaza sequence, mainly due to the above mentioned difficulties. The running speed of the system is about 2 FPS with a 2.8GHz Pentium IV CPU. The implementation is in C++ without any special optimization.

TABLE I
RESULTS OF PERFORMANCE EVALUATIONS ON CAVIAR SET (277 TRAJECTORIES). MT: MOSTLY TRACKED, ML: MOSTLY LOST, FGMT: FRAGMENT, FA: FALSE ALARM, IDS: IDENTITY SWITCH.

             MT      ML      Fgmt    FA      IDS
Number
Percentage   62.1%   5.3%

VII. CONCLUSION AND FUTURE WORK

We have presented a principled approach to simultaneously detect and track humans in a crowded scene acquired from a single stationary camera.
We take a model-based approach and formulate the problem as a Bayesian MAP estimation problem to compute the best interpretation of the image observations collectively by the 3D human shape model, the acquired human appearance model, the background appearance model, the camera model, the assumption that humans move on a known ground plane, and the object priors. The image is modeled as a composition of an unknown number of possibly overlapping objects and a background. The inference is performed by an MCMC-based approach to explore the joint solution space. Data-driven proposal probabilities are used to direct the Markov chain dynamics. Experiments and evaluations on challenging real-life data show promising results. The success of our approach mainly lies in the integration of the top-down Bayesian formulation, which follows the image formation process, and the bottom-up features that are directly extracted from images. The integration has the benefit of both the computational efficiency of image features and the optimality of a Bayesian formulation.

This work could be improved/extended in several ways. 1) Extension to track multiple classes of objects (e.g. humans and cars) by adding model switching in the MCMC dynamics. 2) Tracking, operating in a 2-frame interval, has a very local view; therefore ambiguities inevitably exist, especially in the case of tracking fully occluded objects. Analysis at the level of trajectories may resolve the local ambiguities (e.g. [29]). The analysis may take into account the prior knowledge on the valid object trajectories, including their starting and ending points.

APPENDIX I
SINGLE OBJECT TRACKING WITH BACKGROUND KNOWLEDGE USING MEANSHIFT

Denote by $\bar{p}$, $p(\mathbf{u})$, and $b(\mathbf{u})$ the color histogram of the object learnt online, the color histogram of the object at location $\mathbf{u}$, and the color histogram of the background at the corresponding region, respectively. Let $\{x_i\}_{i=1,\ldots,n}$ be the pixel locations in the region with the object center at $\mathbf{u}$. A kernel with profile $k(\cdot)$ is used to assign smaller weights to the pixels farther away from the center.
An $m$-bin color histogram $p(\mathbf{u}) = \{p_j(\mathbf{u})\}_{j=1,\ldots,m}$ is constructed as

$$p_j(\mathbf{u}) = \sum_{i=1}^{n} k(\|x_i\|^2)\, \delta[b_f(x_i) - j],$$

where the function $b_f(\cdot)$ maps the pixel location to the corresponding histogram bin, and $\delta$ is the delta function. $\bar{p}$ and $b$ are constructed similarly. We would like to optimize

$$L(\mathbf{u}) = -\lambda_b \underbrace{B(p(\mathbf{u}), b(\mathbf{u}))}_{L_1(\mathbf{u})} + \lambda_f \underbrace{B(p(\mathbf{u}), \bar{p})}_{L_2(\mathbf{u})} \quad (10)$$

where $B(\cdot)$ is the Bhattacharyya coefficient. By applying Taylor expansion at $p(\mathbf{u}_0)$ and $b(\mathbf{u}_0)$ ($\mathbf{u}_0$ is a predicted position of the object), we have

$$L_1(\mathbf{u}) = B(p(\mathbf{u}), b(\mathbf{u})) = \sum_{u=1}^{m} \sqrt{p_u(\mathbf{u})\, b_u(\mathbf{u})}$$
$$\approx B(\mathbf{u}_0) + B'_p(\mathbf{u}_0)\,(p(\mathbf{u}) - p(\mathbf{u}_0)) + B'_b(\mathbf{u}_0)\,(b(\mathbf{u}) - b(\mathbf{u}_0))$$
$$= c_1 + \frac{1}{2} \sum_{u=1}^{m} \sqrt{\frac{b_u(\mathbf{u}_0)}{p_u(\mathbf{u}_0)}}\, p_u(\mathbf{u}) + \frac{1}{2} \sum_{u=1}^{m} \sqrt{\frac{p_u(\mathbf{u}_0)}{b_u(\mathbf{u}_0)}}\, b_u(\mathbf{u})$$
$$= c_1 + \frac{1}{2} \sum_{i=1}^{n} w_i^b\, k\left(\left\|\frac{\mathbf{u} - x_i}{h}\right\|^2\right) \quad (11)$$

where

$$w_i^b = \sum_{u=1}^{m} \left( \sqrt{\frac{b_u(\mathbf{u}_0)}{p_u(\mathbf{u}_0)}}\, \delta[b_f(x_i) - u] + \sqrt{\frac{p_u(\mathbf{u}_0)}{b_u(\mathbf{u}_0)}}\, \delta[b_b(x_i) - u] \right).$$

Similarly, as in [6],

$$L_2(\mathbf{u}) = B(p(\mathbf{u}), \bar{p}) \approx c_2 + \frac{1}{2} \sum_{u=1}^{m} \sqrt{\frac{\bar{p}_u}{p_u(\mathbf{u}_0)}}\, p_u(\mathbf{u}) = c_2 + \frac{1}{2} \sum_{i=1}^{n} w_i^f\, k\left(\left\|\frac{\mathbf{u} - x_i}{h}\right\|^2\right) \quad (12)$$

where $w_i^f = \sum_{u=1}^{m} \sqrt{\frac{\bar{p}_u}{p_u(\mathbf{u}_0)}}\, \delta[b_f(x_i) - u]$. Therefore

$$L(\mathbf{u}) \approx \lambda_f c_2 - \lambda_b c_1 + \frac{1}{2} \sum_{i=1}^{n} \underbrace{\left(\lambda_f w_i^f - \lambda_b w_i^b\right)}_{w_i}\, k\left(\left\|\frac{\mathbf{u} - x_i}{h}\right\|^2\right) \quad (13)$$

The last term of $L(\mathbf{u})$ is the density estimate computed with kernel profile $k(\cdot)$ at $\mathbf{u}$. The mean shift algorithm with negative weights [4] applies. By using the Epanechnikov profile [6], $L(\mathbf{u})$ will be increased with the new location moved to

$$\mathbf{u} \leftarrow \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i} \quad (14)$$

Fig. 9. Selected frames of the tracking results from the CAVIAR set. (a) Sequence ThreePastShop2cor; (b) Sequence TwoEnterShop2cor.

ACKNOWLEDGMENT

This research was funded, in part, by the U.S. Government VACE program.

REFERENCES

[1] G. Borgefors. Distance transformations in digital images. Computer Vision, Graphics, and Image Processing, 34(3).
[2] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Analysis and Machine Intelligence, 23(11).
[3] I. Cohen and G. Medioni. Detecting and tracking moving objects for video surveillance. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[4] R. T. Collins. Mean-shift blob tracking through scale space. In Proc. Conf. Computer Vision and Pattern Recognition, II.
[5] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(5).
[6] D. Comaniciu and P. Meer. Kernel-based object tracking. IEEE Trans. Pattern Analysis and Machine Intelligence, 25(5).
[7] L. Davis, V. Philomin, and R. Duraiswami. Tracking humans from a moving platform. In Proc. Int'l Conf. Pattern Recognition, IV.
[8] J. Deutscher, A. Blake, and I. Reid. Articulated body motion capture by annealed particle filtering. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[9] A. Elgammal and L. Davis. Probabilistic framework for segmenting people under occlusion. In Proc. Int'l Conf. Computer Vision, II.
[10] A. Elgammal, R. Duraiswami, D. Harwood, and L. Davis. Background and foreground modeling using non-parametric kernel density estimation for visual surveillance. Proc. IEEE, 90(7).
[11] F. Fleuret, R. Lengagne, and P. Fua. Fixed point probability field for complex occlusion handling. In Proc. Int'l Conf. Computer Vision, I.
[12] D. G-Perez, J.-M. Odobez, S. Ba, K. Smith, and G. Lathoud. Tracking people in meetings with particles. In Proc. Int'l Workshop on Image Analysis for Multimedia Interactive Service.
[13] D. Gavrila and V. Philomin. Real-time object detection for smart vehicles. In Proc. Int'l Conf. Computer Vision, I:87-93.
[14] P. Green. Trans-dimensional Markov chain Monte Carlo. Oxford University Press.
[15] S. Haritaoglu, D. Harwood, and L. Davis. W4: Real-time surveillance of people and their activities. IEEE Trans. Pattern Analysis and Machine Intelligence, 22(8).
[16] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press.
[17] W. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97-109.
[18] S. Hongeng and R. Nevatia. Multi-agent event recognition. In Proc. Int'l Conf. Computer Vision, II:84-91.
[19] M. Isard and J. MacCormick. BraMBLe: A Bayesian multiple-blob tracker. In Proc. Int'l Conf. Computer Vision, II:34-41.
[20] J. Kang, I. Cohen, and G. Medioni. Continuous tracking within and across camera streams. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, I.
[21] Z. Khan, T. Balch, and F. Dellaert. MCMC-based particle filtering for tracking a variable number of interacting targets. IEEE Trans. Pattern Analysis and Machine Intelligence, 27(11).
[22] H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, II:83-87.
[23] M.-W. Lee and I. Cohen. A model-based approach for estimating human 3D poses in static images. IEEE Trans. Pattern Analysis and Machine Intelligence, 28(6).
[24] A. Lipton, H. Fujiyoshi, and R. Patil. Moving target classification and tracking from real-time video. In Proc. DARPA Image Understanding Workshop.
[25] J. Liu. Metropolized Gibbs sampler. In Monte Carlo Strategies in Scientific Computing. Springer-Verlag NY Inc.
[26] F. Lv, T. Zhao, and R. Nevatia. Self-calibration of a camera from video of a walking human. IEEE Trans. Pattern Analysis and Machine Intelligence, 28(9).
[27] J. MacCormick and A. Blake. A probabilistic exclusion principle for tracking multiple objects. In Proc. Int'l Conf. Computer Vision, I.
[28] A. Mittal and L. Davis. M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene using region-based stereo. In Proc. European Conf. Computer Vision, II:18-33.
[29] P. Nillius, J. Sullivan, and S. Carlsson. Multi-target tracking - linking identities using Bayesian network inference. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[30] K. Okuma, A. Taleghani, N. de Freitas, J. Little, and D. Lowe. A boosted particle filter: Multitarget detection and tracking. In Proc. European Conf. Computer Vision, I:28-39.
[31] C. Papageorgiou, T. Evgeniou, and T. Poggio. A trainable pedestrian detection system. In Proc. of Intelligent Vehicles.
[32] A. Prati, I. Mikic, M. Trivedi, and R. Cucchiara. Detecting moving shadows: Algorithms and evaluation. IEEE Trans. Pattern Analysis and Machine Intelligence, 25(7).
[33] P. Pérez, C. Hue, J. Vermaak, and M. Gangnet. Color-based probabilistic tracking. In Proc. European Conf. Computer Vision, I.
[34] D. Ramanan, D. Forsyth, and A. Zisserman. Strike a pose: Tracking people by finding stylized poses. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, I.
[35] C. Rasmussen and G. D. Hager. Probabilistic data association methods for tracking complex visual objects. IEEE Trans. Pattern Analysis and Machine Intelligence, 23(6).
[36] J. Rittscher, P. Tu, and N. Krahnstoever. Simultaneous estimation of segmentation and shape. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[37] R. Rosales and S. Sclaroff. 3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[38] H. Rue and M. A. Hurn. Bayesian object identification. Biometrika, 86(3).
[39] S. Geman and C.-R. Hwang. Diffusion for global optimization. SIAM J. on Control and Optimization, 24(5).
[40] N. Siebel and S. Maybank. Fusion of multiple tracking algorithm for robust people tracking. In Proc. European Conf. Computer Vision, IV.
[41] K. Smith, D. Gatica-Perez, and J.-M. Odobez. Using particles to track varying numbers of interacting people. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, I.
[42] X. Song and R. Nevatia. Combined face-body tracking in indoor environment. In Proc. Int'l Conf. Pattern Recognition, IV.
[43] X. Song and R. Nevatia. A model-based vehicle segmentation method for tracking. In Proc. Int'l Conf. Computer Vision, II.
[44] C. Stauffer and E. Grimson. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Analysis and Machine Intelligence, 22(8).
[45] C. Tao, H. Sawhney, and R. Kumar. Object tracking with Bayesian estimation of dynamic layer representations. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(1):75-89.
[46] H. Tao, H. Sawhney, and R. Kumar. A sampling algorithm for tracking multiple objects. In Proc. Workshop of Vision Algorithms.
[47] L. Tierney. Markov chain concepts related to sampling algorithms. In Markov Chain Monte Carlo in Practice, pp. 59-74.
[48] Z. W. Tu and S. C. Zhu. Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(5).
[49] Y. Weiss. Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1):1-41.
[50] B. Wu and R. Nevatia. Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In Proc. Int'l Conf. Computer Vision, I:90-97.
[51] T. Yu and Y. Wu. Collaborative tracking of multiple targets. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, I.
[52] T. Zhao, M. Aggarwal, R. Kumar, and H. Sawhney. Real-time wide area multi-camera stereo tracking. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, I.
[53] T. Zhao and R. Nevatia. Bayesian human segmentation in crowded situations. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[54] T. Zhao and R. Nevatia. Tracking multiple humans in complex situations. IEEE Trans. Pattern Analysis and Machine Intelligence, 26(9).
[55] T. Zhao and R. Nevatia. Tracking multiple humans in crowded environment. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, II.
[56] The CAVIAR data set. rbf/caviar/
[57] CLEAR06 Evaluation Campaign and Workshop. uka.de/clear06/

Tao Zhao received the BEng degree from the Department of Computer Science and Technology, Tsinghua University, China. He received the MSc and the PhD degrees from the Department of Computer Science at the University of Southern California in 2001 and 2003, respectively. He was with Sarnoff Corporation, Princeton, New Jersey, from 2003. He is currently with Intuitive Surgical Incorporated, Sunnyvale, California, working on computer vision applications for medicine and surgery. His research interests include computer vision, machine learning, and pattern recognition. His experience has been in visual surveillance, human motion analysis, aerial image analysis and medical image analysis. He is a member of the IEEE and the IEEE Computer Society.

Ram Nevatia received his Ph.D. degree from Stanford University with specialty in the area of computer vision. He has been with the University of Southern California since 1975, where he is currently a Professor of Computer Science and Electrical Engineering. He is also Director of the Institute for Robotics and Intelligent Systems. He has been principal investigator of major Government funded computer vision research programs for over 25 years. Dr. Nevatia has made important contributions to several areas of computer vision including the topics of shape description, object recognition, stereo analysis, aerial image analysis, tracking of humans and event recognition. Dr. Nevatia is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and of the American Association for Artificial Intelligence (AAAI). He is an associate editor for the Pattern Recognition and the Computer Vision and Image Understanding journals. Dr. Nevatia is author of two books, several book chapters, and over 100 refereed technical papers.

Bo Wu received the B.Eng and M.Eng degrees from the Department of Computer Science and Technology, Tsinghua University, Beijing, China, in 2002 and 2004 respectively. He is currently a PhD candidate at the Computer Science Department, University of Southern California, Los Angeles. His research interests include computer vision, machine learning, and pattern recognition. He is a student member of the IEEE Computer Society.