Multi-Scale Object Candidates for Generic Object Tracking in Street Scenes
Aljoša Ošep, Alexander Hermans, Francis Engelmann, Dirk Klostermann, Markus Mathias and Bastian Leibe

Abstract

Most vision based systems for object tracking in urban environments focus on a limited number of important object categories such as cars or pedestrians, for which powerful detectors are available. However, practical driving scenarios contain many additional objects of interest, for which suitable detectors either do not yet exist or would be cumbersome to obtain. In this paper we propose a more general tracking approach which does not follow the often used tracking-by-detection principle. Instead, we investigate how far we can get by tracking unknown, generic objects in challenging street scenes. As such, we do not restrict ourselves to only tracking the most common categories, but are able to handle a large variety of static and moving objects. We evaluate our approach on the KITTI dataset and show competitive results for the annotated classes, even though we are not restricted to them.

I. INTRODUCTION

Outdoor visual scene understanding is a key component for autonomous mobile systems. Specifically, detection and tracking of other traffic participants are essential steps towards safe navigation and path planning through populated urban areas. Recent results on standard benchmarks [1] show that some object categories, such as cars or pedestrians, can already be tracked rather reliably by state-of-the-art tracking-by-detection approaches [2], [3], [4], [5]. In practical driving scenarios, however, there are numerous other objects that could pose potential safety hazards and it quickly becomes infeasible to train specific detectors for all possible classes. In this paper, we therefore investigate the problem of generic object tracking in street scenes. Rather than starting from the output of a class-specific detector, we try to extract a set of object candidate regions purely from low-level cues and to track them over time. This approach has the advantage that it is not a priori restricted in the types of objects that can be tracked.
However, the tracking task becomes much more challenging, since it requires solving a complex figure-ground segmentation problem in every frame to decide which scene regions contain valid objects and at what spatial extent those objects should be represented. In order to address this segmentation problem, we make use of scene information from stereo depth to generate Generic Object Proposals (GOPs) in 3D and keep only those proposals that can consistently be tracked over a sequence of frames. In our tracking step, we link these object proposals into trajectories and integrate the individual 3D measurements into a 3D shape model for each tracked object. We jointly reason about valid object proposals and their corresponding trajectories via a model selection based multi-object tracking procedure.

[All authors are with Visual Computing Institute, RWTH Aachen University. {lastname}@vision.rwth-aachen.de]

[Fig. 1. We propose an approach to track generic objects in street scenes that goes beyond the capabilities of pre-trained object detectors. Our approach can handle a wide variety of static and moving objects of different sizes and robustly track them. Blue areas indicate potential object regions.]

For such an approach to work, the generation of good object proposals is a key requirement. This is a very challenging problem, since the unknown objects may originate from vastly different scales (see Fig. 1), measurements from nearby objects tend to merge, and objects close to scene structures are difficult to segment due to often noisy stereo data. To reach acceptable recall values, state-of-the-art appearance-based object proposal generation approaches [6] typically need several thousand object proposals per frame, two orders of magnitude more than what would be tractable to use in a tracking framework. We propose a novel robust multi-scale object proposal extraction procedure that uses a two-stage segmentation approach. First, a coarse supervised segmentation removes non-object regions corresponding to known background categories such as road, building, or vegetation.
Next, we perform a fine unsupervised multi-scale segmentation to extract scale-stable object proposals from the remaining scene regions. As many of these proposals may overlap and the correct object scale often cannot be determined on a single-frame basis, we perform multi-hypothesis tracking at the level of object proposals.

In summary, our main contributions are:
1) We present a novel, scalable approach that successfully tracks a large variety of generic objects in challenging street scenes.
2) As a key component of this approach, we propose a robust multi-scale 3D object proposal extraction procedure based on a two-stage segmentation and scale-stable clustering.
3) We demonstrate the validity of our approach quantitatively and
qualitatively on the KITTI dataset [1]. We show that our approach can compete with state-of-the-art detector-based methods in close and medium camera distance.

Object definition. In the remainder of this paper we refer to an object as an entity that appears in urban street scenes, sticks out of the ground plane, has a well-defined closed boundary in space [7], and is surrounded by a certain band of free-space. In addition, objects need to appear consistently in a sequence of frames, either moving or not, and maintain a roughly consistent appearance. We also assume a size range for objects of interest between 0.5 m and 5 m. This definition includes other traffic participants, as well as static/parked vehicles and items of street furniture. We explicitly exclude only items that are better explained by stuff categories such as vegetation or building facade.

II. RELATED WORK

Many approaches have been proposed for object tracking in street scenarios [8], [2], [9], [3], [5]. Most of those follow a tracking-by-detection strategy by first applying detectors trained for specific categories on each frame and then linking the detections into trajectories. The KITTI tracking benchmark [1] gives a good overview of such tracking methods. Zhang et al. [5] pose tracking as a maximum-a-posteriori data association problem with non-overlap constraints. Pirsiavash et al. [3] consider tracking as a spatio-temporal grouping problem and propose a greedy global optimization approach. Milan et al. [2] use a continuous energy minimization approach that takes into account physical constraints and track persistence. While these approaches obtain impressive results, they have the drawback that they assume that all interesting object categories are known beforehand and that detectors can be trained for each category.

Recently, the problem of tracking generic objects has received more attention. For automotive scenarios several approaches address this problem using highly precise LIDAR data as input [10], [8], [11], [12]. Petrovskaya et al. [10] use a model-based approach to detect car-sized objects in laser point clouds. Held et al.
[12] utilize 3D shape and color information to obtain precise velocity estimates of generic objects in LIDAR data. In contrast, we use depth information obtained from a stereo camera pair, which is far less accurate and requires a more robust processing pipeline. Other approaches try to find and track generic objects based on motion segmentation. For example, Bewley et al. [13] use a self-supervised framework to detect dynamic object clusters extracted from a monocular camera stream. In contrast to those approaches, we are also interested in tracking static instances of interesting objects. To the best of our knowledge, only few approaches deal with generic object tracking from stereo depth. Nguyen et al. [14] also target generic objects of several sizes, but they only track moving objects, with the purpose of generating improved occupancy grids of the scene for a driver assistance system. While our pipeline is similar to the approaches of [9], [15], they only track pedestrian sized objects, whereas we aim to also track larger objects such as cars and vans.

[Fig. 2. High-level overview of our pipeline. Input: stereo pair sequence, followed by disparity and ground-plane estimation and visual odometry. Our method: semantic segmentation (Sec. 4), generic object proposal generation (Sec. 5), generic object tracking (Sec. 6).]

A key part of our pipeline is the generation of good Generic Object Proposals (GOPs). Several previous methods have been proposed for this step, often based on LIDAR data. Wang et al. [16] use a minimal-spanning-tree clustering approach to extract 3D object proposals from LIDAR data and then classify them into background, bicyclist, car, or pedestrian. Ioannou et al. [17] propose a Difference-of-Normals operator to extract scale-stable object proposal regions from LIDAR data. We compare against this approach in Sec. VII. Bansal et al. [18] propose a semantic structure labeling approach based on stereo data in order to create proposal regions for a pedestrian detector. The resulting regions are too coarse and would not generalize well to all interesting objects.
There is also a large set of approaches that try to find good object proposals in the form of bounding boxes from color images [7], [6], [19]. While state-of-the-art methods such as EdgeBoxes [6] obtain a very high recall, their precision is too low to be applicable for our approach.

Our approach builds upon a semantic segmentation to reject scene parts that can be well explained by background categories. Several other approaches have already demonstrated semantic segmentation on different subsets of KITTI [1]. Xu et al. [20] fuse information from several sensors to classify superpixels, while we only rely on stereo data. Ladický et al. [21] jointly infer disparity maps and dense semantic segmentations based on a monocular image using a combination of depth and semantic classifiers. Ros et al. [22] pre-compute a high-quality semantic map of the static parts of a scene in order to later on label the environment based on the current location within that map. Objects that appeared in the scene can then automatically be labeled. While this is fast and gives good results, our approach also generalizes to unknown scenes.

III. METHOD OVERVIEW

Fig. 2 gives an overview of our approach. Given a sequence of stereo image pairs, we compute disparity maps using ELAS [23]. From a disparity map we generate a point cloud and fit a ground plane using RANSAC. To narrow
down the 3D search space for potential objects, we perform supervised coarse semantic segmentation on the point cloud (Sec. IV). Based on the idea of things and stuff [24], we remove all points that belong to stuff regions such as road, sky, or building points. This gives us a coarse idea where potential objects could be located. On the remaining point cloud we perform a multi-scale search for generic object proposals (Sec. V). Each proposal is defined by a set of 3D points. The result is an over-complete set of possible 3D objects. In the last step we perform object tracking. First, we transform the 3D object proposals to the common coordinate frame using the visual odometry of [25] and link them across frames. Next, we identify the best set of objects and their corresponding tracks (Sec. VI). We perform this selection jointly in a model selection based multi-hypothesis tracking framework, which searches for the subset of object trajectory hypotheses that together best explains the observed data.

IV. SEMANTIC SEGMENTATION

We use a supervised semantic segmentation approach to classify parts of the scene that do not resemble objects and can therefore be removed for further processing. In contrast to the classical semantic segmentation tasks, we are interested in correctly recognizing the known background categories while generalizing to potentially unseen object categories. To achieve that, we specifically use features that capture the background categories well. We treat cars and pedestrians as one single object class, such that after semantic segmentation we know that something is an object, but not of what kind. We follow the design of a typical segmentation pipeline: starting with an over-segmentation, features are extracted for each segment and are then used to classify the segments into semantic categories. A Conditional Random Field (CRF) is then applied to enforce spatial coherence. As our further approach operates in 3D, we use the VCCS algorithm [26] to partition the point cloud into segments. For each segment we compute several features which can be grouped into four categories:

Appearance.
We compute L*a*b* histograms over the points within a segment. We use three separate histograms for L*, a*, and b*, each containing 10 bins (30 dimensions). Furthermore, we compute the mean and covariance of the L*a*b* gradients within the segment (3+6 dimensions as the covariance is symmetric). Finally, we add histograms of textons, similar to those used in [27]. We textonize the whole image and create a histogram of textons within the segment (50 dimensions), giving a total of 89 dimensions. Only the appearance features are based on the color image, while all further features are based on the 3D point cloud.

Density. These features are largely inspired by Bansal et al. [18]. Based on the orientation of an estimated ground plane, we slice the 3D space into 3 height bands and project the points of each band onto a density map. This gives us densities for 3 height regions. The density maps are then discretized using 3 resolutions. By projecting a segment's centroid onto each density map we are able to select a cell in each layer and resolution (3 × 3 = 9 grid cells). We also consider the 4-neighborhood of the selected cells in each layer (4 × 9 = 36 grid cells), resulting in a total of 45 dimensions. Furthermore, we count the 3D points within a segment, which represents the density of the segment itself, summing up to a total of 46 feature dimensions.

Geometry. Based on the covariance matrix of the 3D points within one segment, we compute several spectral and directional features [28]. From the eigenvalues we compute the point-ness, linear-ness, surface-ness and curvature of the segment. From the eigenvectors we determine the segment normal and the cosines between both the normal and tangent vectors and the ground plane normal. Finally, we compute a tight bounding box of the segment along the principal axes. This results in a total of 12 feature dimensions.

Location. This feature represents a location prior with 3 dimensions. It consists of: the height of the segment centroid, the depth of the segment centroid and the horizontal angle between the camera's optical axis and the vector from the camera center to the segment centroid.
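As an illustration of the location prior, the following minimal sketch computes the three values for one segment centroid. The function name, the camera frame (x to the right, y downwards, z along the optical axis, no camera pitch) and the ground-plane parameterization n · p + d = 0 with the normal pointing away from the ground are assumptions made for this example; the text does not specify these conventions.

```python
import math


def location_features(centroid, plane_normal, plane_offset):
    """Return (height, depth, horizontal_angle) for a segment centroid.

    Assumed conventions (hypothetical, for illustration only):
    camera frame with x right, y down, z forward (optical axis);
    ground plane n . p + d = 0 with n pointing up, away from the ground.
    """
    x, y, z = centroid
    nx, ny, nz = plane_normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    # Signed distance of the centroid to the ground plane.
    height = (nx * x + ny * y + nz * z + plane_offset) / norm
    # Depth is measured along the optical axis.
    depth = z
    # Angle between the optical axis and the centroid ray in the x-z plane.
    angle = math.atan2(x, z)
    return height, depth, angle


# Example: camera mounted 1.65 m above a flat ground plane (y points down).
h, d, a = location_features((2.0, 0.65, 2.0), (0.0, -1.0, 0.0), 1.65)
print(h, d, math.degrees(a))
```

With a pitched camera, the horizontal angle would have to be measured after projecting the centroid ray onto the estimated ground plane rather than onto the x-z plane.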
Thus, our resulting feature vector consists of a total of 150 dimensions (89 + 46 + 12 + 3). We then train a Random Forest classifier [29] with single-attribute tests, yielding class posteriors for every segment. A fully connected CRF [30], defined over the segment centers in 3D, further improves the results. We use a close-range smoothing kernel defined only over the 3D centroid locations and a larger-range appearance kernel defined over the 3D centroid and the average L*a*b* color of a segment. From this semantic segmentation we only consider the segments labeled as object for our further steps.

V. GENERIC OBJECT PROPOSAL GENERATION

The multi-scale object proposal generation method produces a ranked set of object proposals (GOPs) from the remaining object regions within the point cloud. In addition to correct object proposals (targets for tracking), this set may still contain under- and over-segmentations (e.g., car parts, groups of pedestrians, pedestrians merged with other objects). These overlapping and competing proposals are a major difference to previous single-scale approaches [9] and make the data association task more challenging. In order to support efficient data association and tracking, the object proposal generation procedure should achieve a high recall with a very small set of object proposals. Current appearance-based object proposal methods are able to achieve good recall, but at the cost of very large proposal sets (for an overview see [19]).

The multi-scale search for object proposals is necessary for several reasons. Firstly, sizes of potentially interesting objects fundamentally differ (e.g., pedestrians and vans). Secondly, the observed objects might be just partially visible. Noisy stereo point clouds typically contain severe depth artifacts and outliers. This makes our problem even harder and requires a robust approach, which we describe in detail in the next subsections. In a nutshell, we first project the 3D point cloud to the ground plane and compute a density
map of the 3D points. Then we perform a multi-scale search for object proposals as follows. We iteratively smooth the density map and identify blobs (clusters) around modes in the density map using Quick-Shift [31] at each scale. Our final proposals are clusters that persist in the scale space of the density map.

[Fig. 3. Semantic segmentation allows us to only consider points labeled as object. Since object sizes are unknown, we consider different scales of the ground-plane density map. Panels: camera image, point cloud, ground-plane scale-space $D_{\sigma_1}, D_{\sigma_2}, \ldots, D_{\sigma_K}$.]

Scale-Space Representation of the Density Map. First, we discretize the ground plane of the point cloud into a regular grid and compute the point-density map $D$ by projecting the 3D point cloud to the ground plane. Each grid cell stores the scalar value representing the density of points falling into the cell. In addition, cells store a list of associated 3D points. We create a ground-plane scale-space representation of the density map $D_{\sigma_k}$, $k = 1 \ldots K$, by convolving $D$ with a Gaussian kernel of width $\sigma_k$ whose size increases in each iteration $k$ (see Fig. 3).

Multi-Scale Clustering. In the next step, we apply Quick-Shift clustering [31] to obtain the modes of the scale-filtered density $D_{\sigma_k}$: $C_k = \{\mathrm{cluster}_k(m)\}$, $m = 1 \ldots \#\mathrm{clusters}$. A cluster $\mathrm{cluster}_k(m) = [\{\mathrm{cell}_c\}_m, BB_m^{2D}, s_m]$ is defined by the set of cells $\{\mathrm{cell}_c\}_m$, $c = 1 \ldots \#\mathrm{cells}$, that converged to its mode. $BB_m^{2D}$ represents a 2D bounding box that is computed by projecting the corresponding 3D points to the image plane (see Fig. 3) and $s_m$ is a scale-stability count.

Identification of Scale-Stable Clusters. In order to obtain a compact set of GOPs we identify the clusters that persist over scales. This step is motivated by results from scale-space filtering [32], namely that the most scale-stable proposals also tend to be the most salient ones. We identify scale-stable clusters by iterating through the cluster sets $C_k$, $k = 1 \ldots K$, and searching for similar clusters between sets $C_k$ and $C_{k+1}$. If two clusters $\mathrm{cluster}_k(j)$, $\mathrm{cluster}_{k+1}(l)$ are very similar according to our scale-stability criterion, then we merge them.
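The ground-plane density map and its scale-space can be sketched as follows. This is a minimal illustration using only the Python standard library; the cell size, grid ranges, the list of σ values, the truncation of the kernel at 3σ and the zero padding at the border are all assumptions of this sketch, and the Quick-Shift mode seeking itself is not included.

```python
import math


def density_map(points, cell_size, x_range, z_range):
    """Project 3D points (x, y, z) onto a ground-plane grid and count them.

    Returns (grid, cell_points): grid[r][c] is the point count of a cell,
    cell_points[(r, c)] lists the 3D points that fell into that cell.
    """
    cols = int(math.ceil((x_range[1] - x_range[0]) / cell_size))
    rows = int(math.ceil((z_range[1] - z_range[0]) / cell_size))
    grid = [[0.0] * cols for _ in range(rows)]
    cell_points = {}
    for p in points:
        c = int((p[0] - x_range[0]) // cell_size)
        r = int((p[2] - z_range[0]) // cell_size)
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] += 1.0
            cell_points.setdefault((r, c), []).append(p)
    return grid, cell_points


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at 3 sigma (sigma in cells)."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(grid, kernel):
    """Convolve every row with the kernel (zero padding at the border)."""
    radius = len(kernel) // 2
    out = []
    for row in grid:
        n = len(row)
        out.append([
            sum(kernel[k + radius] * row[i + k]
                for k in range(-radius, radius + 1) if 0 <= i + k < n)
            for i in range(n)
        ])
    return out


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def scale_space(grid, sigmas):
    """Return one smoothed density map per sigma (separable 2D Gaussian)."""
    maps = []
    for sigma in sigmas:
        kernel = gaussian_kernel(sigma)
        horizontal = smooth_rows(grid, kernel)
        maps.append(transpose(smooth_rows(transpose(horizontal), kernel)))
    return maps


# Toy point cloud: a small blob of three points and one isolated point.
points = [(0.1, -1.0, 5.1), (0.2, -0.5, 5.2), (0.15, -0.8, 5.05),
          (3.0, -1.0, 9.0)]
grid, cell_points = density_map(points, 0.5, (-5.0, 5.0), (0.0, 10.0))
for smoothed in scale_space(grid, [0.5, 1.0, 2.0]):
    print(max(max(row) for row in smoothed))
```

Because the kernel is normalized, the peak of the blob flattens as σ grows while its location stays fixed, which is the property the scale-stability test relies on.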
This is done by removing $\mathrm{cluster}_k(j)$ from $C_k$, merging it with $\mathrm{cluster}_{k+1}(l)$ and incrementing the scale-stability count $s_m$ of $\mathrm{cluster}_{k+1}(l)$ by 1. In our scenario, two clusters should be declared as similar when they (roughly) correspond to the same object. This motivates the following scale-stability criterion: two clusters $\mathrm{cluster}_k(j)$ and $\mathrm{cluster}_{k+1}(l)$ are similar when their bounding boxes $BB_j^{2D}$ and $BB_l^{2D}$ have a very high overlap. To be specific, we compute the Jaccard index $J(\cdot,\cdot)$ of the two bounding boxes and declare them as the same cluster if $J(BB_j^{2D}, BB_l^{2D}) > 0.9$. Finally we obtain a set of GOPs $\{\Omega_i^t\}$ for frame $t$, where each GOP is defined as:

$$\Omega_i^t = \left[\, p_i^t,\; C_i^{t,3D},\; h_i^t,\; S_i^t,\; r_i^t \,\right], \tag{1}$$

where $p_i^t$ is the 3D position of the $i$-th GOP, projected onto the ground plane. $C_i^{t,3D}$ is a $3 \times 3$ covariance matrix representing the uncertainty in the 3D position $p_i^t$, computed as [33]

$$C_i^{t,3D} = \left( F_{c_L}^{\top} C_{2D}^{-1} F_{c_L} + F_{c_R}^{\top} C_{2D}^{-1} F_{c_R} \right)^{-1}, \tag{2}$$

where $F_{c_L}, F_{c_R}$ are Jacobians of the projection matrices of both cameras and $C_{2D}$ is the covariance of pixel measurements. $h_i^t$ denotes a color histogram, computed by dividing the bounding box of the GOP into $4 \times 4$ cells and stacking their RGB color histograms. $S_i^t \subset \mathbb{R}^3$ denotes the set of 3D point measurements of the GOP (in the camera space) and the scalar $r_i^t \in [0,1]$ is the object stability score, computed as $r_i^t = s_i^t / K$, where $s_i^t$ is the scale-stability of the proposal.

VI. TRACKING

Starting with the previously introduced, possibly overlapping GOPs $\{\Omega_i^{t_0:t}\}$, we now want to find a set of most likely objects and their trajectories $\{H_n\}$. Our basic assumption is that correct GOPs have a higher chance of producing stable trajectories with consistent appearance than GOPs caused by noise and incorrect segmentations. We approach this problem by performing tracking and object selection jointly in a multi-hypothesis tracking framework. Unlike classic tracking approaches, we are not only looking for physically exclusive inlier detections (i.e. is the track continued by detection A or detection B?), but we also have an inlier hypothesis ambiguity on physically overlapping object proposals (see Fig.
4). We tackle this challenging multi-hypothesis tracking problem on the object proposal level by maintaining a list of physically overlapping object-trajectory hypotheses that compete for the (potentially overlapping) GOPs. At each time step, our algorithm selects the subset of hypotheses that best explains the observations. We formulate tracking as a model selection procedure and extend our previous work [34], [35], where trajectories with consistent motion and appearance are preferred. Additionally, our method takes temporal consistency of the 3D shape of the tracked object into account. Our method is also capable of keeping track of currently not selected track hypotheses. As a concrete example, this means that we may track a group of pedestrians as a single object over a sequence of frames (recall that we do not have pedestrian-specific knowledge, so a group of pedestrians is a valid object), but we also keep hypotheses for the individual pedestrians. If at some point their motion starts diverging, the observed data can better be explained by individual pedestrian hypotheses. Tracking is performed on the estimated ground plane and the camera pose computed for each frame using the Visual Odometry method of [25]. In order to obtain stable 3D shape representations of the tracked objects, we integrate the noisy 3D measurements of the GOPs over time. In the following, we will introduce the quadratic pseudo-boolean optimization (QPBO) tracking method by Leibe et al. [34] and our extension of the approach, which enables us to perform
tracking without using a detector and to track regions that likely correspond to valid objects.

[Fig. 4. Tracking-by-detection associates detections and rejects the incorrect tracks (top). We associate GOPs and penalize incorrect associations (e.g. car parts) but associate both individual pedestrians and pedestrian groups (bottom). Panel labels: responses from detectors, trajectory hypotheses (top); generic object proposals, object-trajectory hypotheses (bottom).]

A. QPBO Tracking

The idea of [34] is to use a detector to generate an over-complete (possibly physically implausible) set of trajectory hypotheses. Then a (physically plausible) set of hypotheses is selected by solving a quadratic pseudo-boolean optimization problem (QPBO):

$$\arg\max_{m} \; m^{T} Q\, m, \quad m_n \in \{0,1\}, \tag{3}$$

where $m$ is a binary indicator vector that indicates whether the model (hypothesis) was selected or not. The diagonal terms of the matrix $Q$ represent the hypothesis likelihoods (cost benefits for a specific hypothesis) reduced by a constant penalty $\varepsilon_1$ that enforces sparse solutions:

$$q_{nn} = -\varepsilon_1 + \sum_{D_i^t \in H_n^{t_0:t_k}} \Big( (1-\varepsilon_2) + \varepsilon_2\, S\big(D_i^t \mid H_n^{t_0:t_k}\big) \Big). \tag{4}$$

Here, $D_i^t$ represent the supporting detections of the hypothesis $H_n^{t_0:t_k}$ and $S(\cdot)$ is the likelihood of the detection belonging to the hypothesis. With off-diagonal entries we model interactions between hypotheses:

$$q_{mn} = -0.5 \Big[ \underbrace{\varepsilon_3\, O\big(H_n^{t_0:t_k}, H_m^{t_0:t_k}\big)}_{\text{Physical overlap penalty}} + \underbrace{\sum_{D_i^t \in H_n^{t_0:t_k} \cap H_m^{t_0:t_k}} \Big( (1-\varepsilon_2) + \varepsilon_2\, S\big(D_i^t \mid H^{*}\big) \Big)}_{\text{Avoiding double-counting of inlier contributions}} \Big], \tag{5}$$

where $H^{*} \in \{H_m, H_n\}$ is the weaker hypothesis. $O(\cdot,\cdot)$ measures the physical overlap of the hypotheses and the second term corrects for double-counting detections that are consistent with both hypotheses. Model parameter $\varepsilon_2$ is the minimal score of the inlier detections and $\varepsilon_3$ weighs the penalization of the physical overlap. In our formulation, we use the GOPs $\Omega_i^t$ instead of the detections $D_i^t$ and introduce a shape model of the unknown object to the tracking process. Physical overlap between the competing hypotheses is computed as a Bhattacharyya coefficient of the two 2D occupancy histograms of their shape representations.
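To make the selection objective of Eq. (3) concrete, the following sketch evaluates $m^T Q m$ for every binary vector and returns the best one. Exhaustive enumeration is used here only because it is easy to verify on a toy problem; it is not the solver used in [34] and does not scale beyond a handful of hypotheses. The function names and all numbers in the matrix are made up for illustration.

```python
from itertools import product


def selection_score(Q, m):
    """Evaluate m^T Q m for a binary indicator vector m."""
    n = len(m)
    return sum(m[a] * Q[a][b] * m[b] for a in range(n) for b in range(n))


def select_hypotheses(Q):
    """Exhaustively search for the binary vector m maximizing m^T Q m.

    Only feasible for a few hypotheses; shown purely to illustrate the
    objective, not as a practical solver.
    """
    best_m, best_score = None, float("-inf")
    for m in product((0, 1), repeat=len(Q)):
        score = selection_score(Q, m)
        if score > best_score:
            best_m, best_score = m, score
    return best_m, best_score


# Toy example with three hypotheses (all values are hypothetical).
# Diagonal: merit of each hypothesis, already reduced by the sparsity penalty.
# Off-diagonal: interaction cost, split evenly between q_mn and q_nm,
# which is why the off-diagonal terms carry a factor of 0.5.
Q = [
    [2.0, -1.5, 0.0],   # hypothesis 0 overlaps strongly with hypothesis 1
    [-1.5, 1.8, 0.0],
    [0.0, 0.0, 0.7],    # hypothesis 2 is independent of the others
]
best_m, best_score = select_hypotheses(Q)
print(best_m, best_score)
```

In this toy case the two overlapping hypotheses are never selected together, because their joint interaction cost exceeds the merit of the weaker one, while the independent hypothesis is kept.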
The occupancy histograms are computed by sampling 3D points from the hypotheses' shape representations and projecting them to the ground plane.

B. Object-Trajectory Hypothesis Generation

The basic unit of our tracker is the object-trajectory hypothesis $H_n^{t_0:t_k}$ that spans the frames $t_0 \ldots t_k$:

$$H_n^{t_0:t_k} = \left[\, I_n^{t_0:t_k},\; M_n^{t_0:t_k},\; A_n^{t_0:t_k},\; S_n^{t_0:t_k} \,\right], \tag{6}$$

where $I_n$ represents the inlier GOP set of the $n$-th hypothesis, $M_n$ is the motion model, $A_n$ the appearance model and $S_n$ is the 3D shape model. Note that an object-trajectory hypothesis does not only hypothesize the trajectory but also the object's shape. This is a fundamental difference compared to the original QPBO tracking.

Hypothesis Generation. Following the QPBO tracking approach, the first step is to generate an over-complete set of hypotheses. In each frame, we extend the old hypothesis set using the new GOP set by running a forward Kalman filter. We start new hypotheses from the new GOPs that were not used for extending old hypotheses by running the Kalman filter backwards. At each Kalman filter step we perform nearest neighbor data association within the validation volume of $C_i^{t,3D}$, selecting inlier GOPs of past frames by evaluating the GOP association probability.

Data Association. We compute the association probability of GOP $\Omega_i^t$ as (we omit the indices $t_0:t_k$ to reduce clutter):

$$p(\Omega_i^t \mid H_n) = p(\Omega_i^t \mid A_n) \cdot p(\Omega_i^t \mid M_n) \cdot p(\Omega_i^t \mid S_n). \tag{7}$$

As appearance model we compute the Bhattacharyya distance between the trajectory RGB color histogram $A_n$ and the GOP color histogram $h_i^t$:

$$p(\Omega_i^t \mid A_n) = 1 - \sum_{r,g,b} \sqrt{h_i^t(r,g,b)\, A_n(r,g,b)}. \tag{8}$$

For the motion model we assume a constant-velocity Kalman filter with the following state vector:

$$\mathbf{x}_k = \left[\, x_k,\; y_k,\; \dot{x}_k,\; \dot{y}_k \,\right]^T, \tag{9}$$

where $[x_k, y_k]^T$ represents the 2D position on the ground plane and $[\dot{x}_k, \dot{y}_k]^T$ the velocity. Given the predicted state $\mathbf{x}_k$ and GOP $\Omega_i^t$, we get the motion model probability as:

$$p(\Omega_i^t \mid M_n) = e^{-\frac{1}{2} \left(p_i^t - [x_k, 0, y_k]^T\right)^T C^{-1} \left(p_i^t - [x_k, 0, y_k]^T\right)}, \tag{10}$$

where $C = C_i^{t,3D} + C_{sys}$ and $C_{sys}$ is the system uncertainty of the Kalman filter.
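The motion term of Eq. (10) can be illustrated with a small sketch. For brevity it works directly in the 2D ground plane with a 2 × 2 covariance, whereas Eq. (10) is written for a 3D position and a 3 × 3 covariance; this simplification, the function names and the example values are assumptions of the sketch, and the full Kalman filter update is omitted.

```python
import math


def predict_constant_velocity(state, dt):
    """Propagate the state [x, y, vx, vy] by dt under constant velocity."""
    x, y, vx, vy = state
    return [x + vx * dt, y + vy * dt, vx, vy]


def motion_probability(gop_xy, predicted_state, cov):
    """Unnormalized Gaussian motion score exp(-0.5 * d^T C^-1 d).

    gop_xy: ground-plane position of the proposal.
    cov: 2x2 matrix standing in for C = C_3D + C_sys, restricted to the
         ground plane (a simplification made for this example).
    """
    dx = gop_xy[0] - predicted_state[0]
    dy = gop_xy[1] - predicted_state[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must have a positive determinant")
    # Apply the inverse of the 2x2 covariance to the residual (dx, dy).
    ix = (d * dx - b * dy) / det
    iy = (-c * dx + a * dy) / det
    mahalanobis_sq = dx * ix + dy * iy
    return math.exp(-0.5 * mahalanobis_sq)


# Hypothetical track moving at (1.0, -2.0) m/s, predicted 0.5 s ahead.
predicted = predict_constant_velocity([0.0, 10.0, 1.0, -2.0], 0.5)
cov = [[0.25, 0.0], [0.0, 0.25]]
print(motion_probability((0.5, 9.0), predicted, cov))   # proposal at the prediction
print(motion_probability((1.0, 9.0), predicted, cov))   # proposal 0.5 m off
```

A proposal exactly at the predicted position receives a score of 1, and the score decays with the squared Mahalanobis distance, so a larger combined uncertainty $C$ makes the association more tolerant.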
The shape model is evaluated by:

$$p(\Omega_i^t \mid S_n) = e^{-\alpha\, d_J^{BB^{2D}} - \beta\, d_J^{BB^{3D}}}, \tag{11}$$

where $d_J^{BB^{2D}}$ is the Jaccard distance (defined as $1 - J(\cdot,\cdot)$) between the 2D bounding boxes of the (integrated) hypothesis shape representation and the GOP. These bounding boxes are computed by projecting the associated 3D points to the camera image plane. $d_J^{BB^{3D}}$ is the Jaccard
distance between their 3D bounding boxes and $\alpha, \beta$ are the weighting factors for both terms. Finally, the fit of the GOP $\Omega_i^t$ to the hypothesis $H_n^{t_0:t_k}$ is evaluated as:

$$S\big(\Omega_i^t \mid H_n^{t_0:t_k}\big) = e^{-(t_k - t)/\tau} \cdot p\big(\Omega_i^t \mid H_n^{t_0:t_k}\big) \cdot p\big(\Omega_i^t\big). \tag{12}$$

The term $p(\Omega_i^t) = e^{-\gamma (1 - r_i^t)}$ is the GOP prior computed from the GOP stability score $r_i^t$. The final score of the hypothesis $S(H_n^{t_0:t_k})$ is a summation over its inlier GOP scores, weighted by temporal decay. The parameter $\tau$ regulates the extent of temporal decay and $\gamma$ regulates the influence of the GOP prior.

[Table I. Jaccard score and class accuracy for our 7 classes. Columns: Object, Road, Building, Tree/Bush, Sign/Pole, Dirt/Grass, Sky, Average. Rows: Jaccard, Acc. The numeric values are not available in this transcription.]

C. Shape Model Measurement Integration

Our tracker relies on raw 3D depth estimates for the computation of GOP associations and selection costs. Because individual stereo-based 3D measurements are very imprecise, we integrate the 3D measurements of the inlier GOPs $I_n^{t_0:t_k}$ over time to create a stable 3D representation of the hypotheses. We continuously build hypothesis shape representations $S_n^{t_0:t_k}$ by integrating the GOP measurements in a voxel grid and computing occupancy probabilities of the voxel grid cells. We perform integration in a two-step procedure: first, we reconstruct the point cloud representation of the integrated model; second, we register the model points with the associated inlier GOP points $S_i^t$ and update the shape model $S_n^{t_0:t_k}$ with the new measurements.

Model Initialization. We initialize the model by centering a fixed-size regular voxel grid at the center of mass of the first inlier GOP of the hypothesis $H_n^{t_0:t_k}$ and initialize each voxel grid cell $c_j \in S_n^{t_0:t_k}$ with $p(c_j)^{t_0}$, the probability that a measured point falls into the cell (normalized count of the points falling into the cell).

Model Update. To update the shape model $S_n^{t_0:t_k}$ with new GOP measurements $S_i^t$, we center the voxel grid representation of the integrated model at the last position (world coordinates) of the hypothesis $H_n^{t_0:t_k}$ and reconstruct points with the highest occupancy probability along the camera ray.
We align the shape model $S_n^{t_0:t_k}$ to the new measurement $S_i^t$ using a weighted Iterative Closest Point (ICP) algorithm. For efficient updates, we consider the cells $c_j$ independent and use a Binary Bayes Filter to update the occupancy probabilities of each cell [36]. The state transition model applies an exponential decay towards the uniform distribution.

VII. EXPERIMENTAL EVALUATION

In this section we conduct a series of experiments to first evaluate the individual stages of our approach and then assess overall performance. As a test bed we use the well known KITTI dataset [1]. All experiments are performed on the KITTI tracking training set. As we perform general object tracking and do not single out specific classes, the standard evaluation pipeline on the KITTI test set is not suitable for our approach. All methods evaluated in the remainder of the paper do not use the training set as input. This enables us to use it as a valid test bed.

A. Semantic Segmentation

To show the validity of our segmentation algorithm itself, we compare our approach to three recent baselines [21], [22], [20] which each provide ground truth annotations for a different set of images and semantic categories within the KITTI [1] dataset. Only an approximate comparison can be provided, as the approaches use different depth maps and thus label slightly different parts of the image. Ladický et al. [21] even estimate a dense semantic map without depth information, whereas our method provides semantic labels only for image pixels with a corresponding depth estimate. However, even with this rough comparison, Table II shows that our semantic segmentation obtains competitive results.

Segmentation Dataset. For our complete pipeline, we trained our semantic segmentation classifier on a total of 203 annotated images extracted across the KITTI odometry dataset (we will publicly release this data upon publication). In our annotations we labeled the following classes: building, car, curb, grass/dirt, person, pole, road, sky, sidewalk, sign, surface marking, tree/bush and wall. For our approach we group person and car into a single object class.
For the remaining pipeline, the semantic segmentation is used as an initial step to filter out regions which are unlikely to belong to an object. Therefore, its main goal is to be able to distinguish between object and non-object regions, rather than separating (non-)object classes. While our annotated dataset contains a total of 13 object categories, we merge them into object and non-object classes for evaluation. We qualitatively and quantitatively found that better results can be obtained by using more than only two classes for training. We believe that this is the result of reducing the intra-class variance. In practice, curb, sidewalk and surface marking were merged into the road class. We also joined wall with building, and pole with sign. Table I shows both the class accuracy and Jaccard scores for these classes.

B. Object Proposal Generation

In Fig. 5 we compare our generic object proposal generation method with two relevant baselines. Difference-of-Normals (DoN) [17] demonstrated excellent results on KITTI 3D laser data [1]; EdgeBoxes [6] is a state-of-the-art appearance-based object proposal generation method (as shown in [19]). The code of both methods is publicly available. We use default parameters for EdgeBoxes [6] and their pre-trained edge detection model. For DoN we used the specified parameters from [17]. Fig. 5 (left) shows that our method requires 2 orders of magnitude fewer proposals than EdgeBoxes [6] to cover roughly 70% of the relevant targets (annotated in KITTI).
[Table II. Class-accuracy comparison to other approaches. We train our approach on the different semantic annotations. Our results are averaged over 5 runs and gray cells represent classes not represented in a dataset. Columns: Building, Car, Fence, Grass, Obstacle, Pole, Road, Sidewalk, Sign, Sky, Tree, Global, Average. Rows: Ladický [21] vs. our approach; Ros [22] vs. our approach; Xu [20] vs. our approach. The numeric values are not available in this transcription.]

[Fig. 5. GOP Recall, plotted as recall over the number of object proposals per frame. Left: Comparison of our proposal generation method and two baselines, Difference-of-Normals [17] and EdgeBoxes [6]. Right: Recall per occlusion level (fully visible, partially occluded, mostly occluded, total).]

With 30 object proposals per frame, DoN has a similar saturation point as our method, but achieves only 40% recall. Fig. 5 (right) shows the recall of our method under varying amounts of occlusion. As can be seen, our approach achieves good recall for the mostly visible objects. For partially occluded objects, our method reports 2D bounding boxes only spanning the visible area, while the KITTI annotations cover the whole object (even if it is not actually visible). As our method is not aware of object categories, no class-specific size heuristics can be applied. EdgeBoxes [6] does not require depth data, but needs too many proposals to be applicable to our problem. We observed that DoN [17] produces very relevant and compact proposals, but only in the close camera range.

C. Tracking

In this section we demonstrate competitive performance on car and pedestrian categories compared to other state-of-the-art detection-based approaches on the KITTI tracking dataset [1]. We will show that our proposed tracks include the categories annotated in KITTI. Evaluation of tracking performance of our approach is non-trivial as we do not have category knowledge for the tracked objects.
This means that we do not know whether a trajectory represents, e.g., a car or a pedestrian; it is just a generic object. In particular, the category-specific precision metrics become meaningless, as the confidence in a tracked object does not rely on its category. We compare to two state-of-the-art tracking-by-detection methods [2], [5], for which we obtained tracking results from the authors. Fig. 7 (left) shows a frame-level recall evaluation for cars and pedestrians as a function of the distance from the cameras. In the short camera range (25m) we outperform the other methods in terms of recall, while they achieve a higher recall in the limit.

[Fig. 7. Left (Tracking Recall Comparison): tracking recall compared to two baselines [5], [2] on the pedestrian and car categories, plotted against distance from camera (m). Curves: Car - Our method, Car - MCF, Car - CEM, Ped. - CEM, Ped. and groups - Our method, Ped. - Our method, Cyclist - Our method. Right (Precision vs. Recall - All Categories): precision vs. recall of our method for all categories in KITTI, using our voxel-grid based and the GCT based integration [9].]

In the case of pedestrian tracking, the state-of-the-art method [2] outperforms our method by about 13 percentage points. We observed that this performance difference originates from the fact that we are simply not able to distinguish between individual pedestrians at the tracking level. Already at the proposal level, proposals for pedestrians walking close together are ranked higher, as the free space surrounding the groups surpasses the free space around individuals. Again, this is due to the fact that our tracker has no category-related knowledge. In order to validate this effect, we also plot the performance when changing the annotations such that annotated pedestrians walking very close together are merged into a single hypothesis (see Fig. 7, left). To further show generalization to novel classes, we also report recall for the cyclist class in Fig. 7 (left). In Fig. 7 (right) we show a full precision-recall curve for all annotated objects in KITTI, based on the assumption that those annotations can be used as a proxy for all valid objects (in reality, not all objects are annotated).
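A category-agnostic precision-recall curve of this kind can be traced by sweeping a threshold over the confidence of the tracked hypotheses. The sketch below is a hedged illustration rather than the evaluation protocol used for Fig. 7: it assumes each hypothesis has already been matched against the annotations and carries a hypothetical `(confidence, is_true_positive)` pair, and the function name `precision_recall_curve` is likewise an assumption.

```python
from typing import List, Sequence, Tuple


def precision_recall_curve(
    scored: Sequence[Tuple[float, bool]],
    num_gt: int,
) -> List[Tuple[float, float]]:
    """Return (precision, recall) pairs, one per confidence cut-off.

    `scored` holds (confidence, is_true_positive) for every hypothesis;
    `num_gt` is the number of annotated objects used as a proxy for all
    valid objects. Matching hypotheses to annotations is assumed to have
    happened beforehand and is not shown here.
    """
    # Sort by confidence, highest first, so each prefix is one operating point.
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    curve: List[Tuple[float, float]] = []
    tp = 0
    fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt if num_gt else 0.0
        curve.append((precision, recall))
    return curve
```

Because unannotated but valid objects count as false positives under this proxy, the precision obtained this way is best read as a lower bound.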
Our approach can track about 50% of all annotated objects in a distance range of up to 30m. Experimentally, the voxel-grid-based integration method turned out to be more robust for tracking than the GCT approach [9]. This experiment also demonstrates the importance of robust shape integration. Qualitative results are shown in Fig. 6.

VIII. CONCLUSIONS

In this paper, we investigated how far we can get with a generic object tracking approach. In particular, we proposed a novel tracking pipeline with the key feature of tracking multiple objects simultaneously without explicitly learning a classifier for each category. This is an important step towards better scene understanding, where it is impossible to learn class-specific knowledge for everything interesting.
We do not aim to replace detector-based tracking methods, but believe that an optimal tracking approach should combine the strengths of both paradigms, which we plan to address in future work. Towards our goal of general object tracking, we proposed a competitive semantic segmentation algorithm, a novel multi-scale object proposal generation stage that reaches high recall with few proposals, and a 3D tracker that achieves competitive results for close-range objects.

[Fig. 6. Qualitative results on the KITTI tracking training set. Left: Semantic segmentation results. The label colors are shown in the color map at the bottom (Building, Grass/Dirt, Object, Road, Sky, Sign/Pole, Tree/Bush). Middle: Generic Object Proposals. Right: Tracking Results. The static objects are visualized with the gray bounding boxes.]

Acknowledgments: This work was funded by ERC Starting Grant project CV-SUPER (ERC-2012-StG). We would like to thank Dennis Mitzel for helpful discussions.

REFERENCES

[1] A. Geiger, P. Lenz, and R. Urtasun, Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite, in CVPR.
[2] A. Milan, S. Roth, and K. Schindler, Continuous Energy Minimization for Multitarget Tracking, PAMI, vol. 36, no. 1.
[3] H. Pirsiavash, D. Ramanan, and C. C. Fowlkes, Globally-optimal Greedy Algorithms for Tracking a Variable Number of Objects, in CVPR.
[4] J. H. Yoon, M.-H. Yang, J. Lim, and K.-J. Yoon, Bayesian Multi-object Tracking Using Motion Context from Multiple Objects, in WACV.
[5] L. Zhang, L. Yuan, and R. Nevatia, Global Data Association for Multi-Object Tracking Using Network Flows, in CVPR.
[6] C. L. Zitnick and P. Dollár, Edge Boxes: Locating Object Proposals from Edges, in ECCV.
[7] B. Alexe, T. Deselaers, and V. Ferrari, Measuring the Objectness of Image Windows, PAMI, vol. 34, no. 11.
[8] R. Kaestner, J. Maye, Y. Pilat, and R. Siegwart, Generative Object Detection and Tracking in 3D Range Data, in ICRA.
[9] D. Mitzel and B. Leibe, Taking Mobile Multi-Object Tracking to the Next Level: People, Unknown Objects, and Carried Items, in ECCV.
[10] A. Petrovskaya and S. Thrun, Model Based Vehicle Detection and Tracking for Autonomous Urban Driving, Autonomous Robots, vol. 26.
[11] A. Teichman and S. Thrun, Tracking-based semi-supervised learning, IJRR, vol. 31, no. 7.
[12] D. Held, J. Levinson, S. Thrun, and S. Savarese, Combining 3D Shape, Color, and Motion for Robust Anytime Tracking, in RSS.
[13] A. Bewley, V. Guizilini, F. Ramos, and B. Upcroft, Online Self-Supervised Multi-Instance Segmentation of Dynamic Objects, in ICRA.
[14] T.-N. Nguyen, B. Michaelis, A. Al-Hamadi, M. Tornow, and M. Meinecke, Stereo-Camera-Based Urban Environment Perception Using Occupancy Grid and Object Tracking, TITS, vol. 13, no. 1.
[15] D. Beymer and K. Kur, Real-time tracking of multiple people using continuous detection, in IEEE Frame Rate Workshop.
[16] D. Z. Wang, I. Posner, and P. Newman, What Could Move? Finding Cars, Pedestrians and Bicyclists in 3D Laser Data, in ICRA.
[17] Y. Ioannou, B. Taati, R. Harrap, and M. A. Greenspan, Difference of Normals as a Multi-Scale Operator in Unorganized Point Clouds, in 3DIMPVT.
[18] M. Bansal, B. Matei, H. Sawhney, S.-H. Jung, and J. Eledath, Pedestrian Detection with Depth-guided Structure Labeling, in ICCV Workshops.
[19] J. Hosang, R. Benenson, and B. Schiele, How good are detection proposals, really? in BMVC.
[20] P. Xu, F. Davoine, J.-B. Bordes, H. Zhao, and T. Denoeux, Information Fusion on Oversegmented Images: An Application for Urban Scene Understanding, in MVA.
[21] L. Ladicky, J. Shi, and M. Pollefeys, Pulling Things out of Perspective, in CVPR.
[22] G. Ros, A. Bakhtiary, S. Ramos, D. Vazquez, M. Granados, and A. M. Lopez, Vision-based Offline-Online Perception Paradigm for Autonomous Driving, in WACV.
[23] A. Geiger, M. Roser, and R. Urtasun, Efficient Large-Scale Stereo Matching, in ACCV.
[24] G. Heitz and D. Koller, Learning Spatial Context: Using Stuff to Find Things, in ECCV.
[25] A. Geiger, J. Ziegler, and C. Stiller, StereoScan: Dense 3d Reconstruction in Real-time, in Intel. Vehicles Symp. 11.
[26] J. Papon, A. Abramov, M. Schoeler, and F. Wörgötter, Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds, in CVPR.
[27] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi, TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, IJCV, vol. 81, no. 1, pp. 2-23.
[28] D. Munoz, N. Vandapel, and M. Hebert, Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields, in ICRA.
[29] L. Breiman, Random Forests, Machine Learning, vol. 45, no. 1, pp. 5-32.
[30] P. Krähenbühl and V. Koltun, Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, in NIPS.
[31] A. Vedaldi and S. Soatto, Quick Shift and Kernel Methods for Mode Seeking, in ECCV.
[32] A. P. Witkin, Scale-Space Filtering: A New Approach To Multi-Scale Description, in ICASSP.
[33] R. Hartley and A. Zisserman, Multiple view geometry in computer vision. Cambridge University Press.
[34] B. Leibe, K. Schindler, N. Cornelis, and L. V. Gool, Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles, PAMI, vol. 30, no. 10.
[35] D. Mitzel, E. Horbert, A. Ess, and B. Leibe, Multi-person Tracking with Sparse Detection and Continuous Segmentation, in ECCV.
[36] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics (Intelligent Robotics and Autonomous Agents). The MIT Press, 2005.
Inernaional Journal of Compuaional Inelligence Research ISSN 0973-1873 Volume 13, Number 5 (2017), pp. 851-864 Research India Publicaions hp://www.ripublicaion.com A Novel Approach for Monocular 3D Objec
More informationIn Proceedings of CVPR '96. Structure and Motion of Curved 3D Objects from. using these methods [12].
In Proceedings of CVPR '96 Srucure and Moion of Curved 3D Objecs from Monocular Silhouees B Vijayakumar David J Kriegman Dep of Elecrical Engineering Yale Universiy New Haven, CT 652-8267 Jean Ponce Compuer
More informationStereoscopic Neural Style Transfer
Sereoscopic Neural Syle Transfer Dongdong Chen 1 Lu Yuan 2, Jing Liao 2, Nenghai Yu 1, Gang Hua 2 1 Universiy of Science and Technology of China 2 Microsof Research cd722522@mail.usc.edu.cn, {luyuan,jliao}@microsof.com,
More informationSENSING using 3D technologies, structured light cameras
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 39, NO. 10, OCTOBER 2017 2045 Real-Time Enhancemen of Dynamic Deph Videos wih Non-Rigid Deformaions Kassem Al Ismaeil, Suden Member,
More informationGauss-Jordan Algorithm
Gauss-Jordan Algorihm The Gauss-Jordan algorihm is a sep by sep procedure for solving a sysem of linear equaions which may conain any number of variables and any number of equaions. The algorihm is carried
More informationTrack-based and object-based occlusion for people tracking refinement in indoor surveillance
Trac-based and objec-based occlusion for people racing refinemen in indoor surveillance R. Cucchiara, C. Grana, G. Tardini Diparimeno di Ingegneria Informaica - Universiy of Modena and Reggio Emilia Via
More informationTime Expression Recognition Using a Constituent-based Tagging Scheme
Track: Web Conen Analysis, Semanics and Knowledge Time Expression Recogniion Using a Consiuen-based Tagging Scheme Xiaoshi Zhong and Erik Cambria School of Compuer Science and Engineering Nanyang Technological
More informationA Face Detection Method Based on Skin Color Model
A Face Deecion Mehod Based on Skin Color Model Dazhi Zhang Boying Wu Jiebao Sun Qinglei Liao Deparmen of Mahemaics Harbin Insiue of Technology Harbin China 150000 Zhang_dz@163.com mahwby@hi.edu.cn sunjiebao@om.com
More informationTrack and Cut: simultaneous tracking and segmentation of multiple objects with graph cuts
INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE Track and Cu: simulaneous racking and segmenaion of muliple objecs wih graph cus Aurélie Bugeau Parick Pérez N 6337 Ocober 2007 Thèmes COM
More informationSTRING DESCRIPTIONS OF DATA FOR DISPLAY*
SLAC-PUB-383 January 1968 STRING DESCRIPTIONS OF DATA FOR DISPLAY* J. E. George and W. F. Miller Compuer Science Deparmen and Sanford Linear Acceleraor Cener Sanford Universiy Sanford, California Absrac
More informationHierarchical Recurrent Filtering for Fully Convolutional DenseNets
Hierarchical Recurren Filering for Fully Convoluional DenseNes Jo rg Wagner1,2, Volker Fischer1, Michael Herman1 and Sven Behnke2 1- Bosch Cener for Arificial Inelligence - 71272 Renningen - Germany 2-
More information