Simultaneously Fitting and Segmenting Multiple-Structure Data with Outliers


Simultaneously Fitting and Segmenting Multiple-Structure Data with Outliers

Hanzi Wang, a, b, c, Senior Member, IEEE, Tat-Jun Chin, b, Member, IEEE, and David Suter, b, Senior Member, IEEE

Abstract: We propose a robust fitting framework, called Adaptive Kernel-Scale Weighted Hypotheses (AKSWH), to segment multiple-structure data even in the presence of a large number of outliers. Our framework contains a novel scale estimator called the Iterative Kth Ordered Scale Estimator (IKOSE). IKOSE can accurately estimate the scale of inliers for heavily corrupted multiple-structure data and is of interest by itself since it can be used in other robust estimators. In addition to IKOSE, our framework includes several original elements based on the weighting, clustering, and fusing of hypotheses. AKSWH can provide accurate estimates of the number of model instances, and the parameters and the scale of each model instance, simultaneously. We demonstrate good performance in practical applications such as line fitting, circle fitting, range image segmentation, homography estimation and two-view-based motion segmentation, using both synthetic data and real images.

Index Terms: Robust Statistics, Model Fitting, Scale Estimation, Kernel Density Estimation, Multiple Structure Segmentation.

1 INTRODUCTION

We consider the setting where one is given some data and within that data there are several structures of interest: where one knows the general form of a parametric model that these structures fit. We are motivated by applications in computer vision, such as line and circle finding in images, homography/fundamental matrix estimation, optical flow calculation, range image segmentation, and motion estimation and segmentation. We adopt the paradigm that these models (line, circle, homography, fundamental matrix, etc.) can be found by robust fitting. Robust because the data contain outliers or pseudo-outliers (the latter refers to the fact that data belonging to one structure are usually outliers to any other structure) [2, 3].
Traditional robust approaches such as M-estimators [4], LMedS [5] and LTS [6] cannot tolerate more than 50% of outliers. This is a severe limitation when these approaches are applied to multiple-structure data, where outliers (and pseudo-outliers) to any structure are a large fraction of the data. A number of robust approaches claim to tolerate more than 50% of outliers: HT [7], RHT [8], RANSAC [9], MSAC [10], MUSE [11], MINPRAN [12], ALKS [13], RESC [14], pbM [15], HBM [16], ASSC [2], ASKC [17], etc. These cannot be considered complete solutions to the above problem because they either require a sequential fitting-and-removing procedure (i.e., sequentially estimate the parameters of one model instance, dichotomize the inliers belonging to that model instance from the outliers, and remove the inliers from the data) or, in the case of HT/RHT, some form of further analysis to segment multiple modes in Hough space (i.e., model parameter space). Moreover, these methods (typified by RANSAC and related estimators) generally require one or more parameters related to the scale of inliers (or, equivalently, the error tolerance) to be either specified by the user or estimated by a separate procedure. Further, we note that the sequential fitting-and-removing procedure has limitations: (1) if the parameters or the scale of a model instance are incorrect, the inliers of the remaining model instances may be wrongly removed, and the inliers to the current model instance may be incorrectly left in. This will lead to inaccurate estimation of the remaining model instances in the following steps. (2) The sequential fitting-and-removing procedure is not computationally efficient because it requires generating a large number of hypotheses in each step.

a. School of Information Science and Technology, Xiamen University, Fujian, 361005, China. E-mail: hanzi.wang@ieee.org
b. School of Computer Science, The University of Adelaide, North Terrace, South Australia 5005, Australia. E-mail: {tatjun-chin, david.suter}@adelaide.edu.au
c. Fujian Key Laboratory of the Brain-like Intelligent Systems (Xiamen University), Fujian, China, 361005.
Alternatives have been proposed, but these too have limitations. MultiRANSAC [18] requires the user to specify both the number of model instances and the inlier scales. GPCA [19] is restricted to fitting linear subspaces, and requires that the user specifies the number of populations. Moreover, GPCA becomes problematic for data involving outliers and a large number of structures. MS (Mean Shift) [20, 21] and its variants [22, 23] attempt to detect/estimate model instances by finding peaks in parameter space; however, this becomes very difficult when the inlier noise is high and/or the percentage of outliers is high. Moreover, it is not trivial to find a globally optimal bin size for HT/RHT, or an optimal bandwidth for the MS-based approaches, that both achieves accurate results and correctly localizes multiple significant peaks in parameter space. RHA [24] is predicated on an interesting property of residuals (to hypothesized fits) when the model instances are parallel to each other; however, any guarantee of performance when the models are not parallel relies on (limited) empirical support. Also, RHA requires the user to specify both the histogram bin size (a limitation shared by HT/RHT and others) and a threshold for selecting the significant local peaks corresponding to the model instance number. A Jaccard-distance linkage (J-linkage) based agglomerative clustering technique [25] is more effective than RHA when model instances are not parallel. However, it dichotomizes inliers/outliers by a user-specified inlier scale, which is not known in practice. Like RANSAC, the performance of J-linkage greatly depends on the input inlier scale. In the kernel fitting (KF) approach [26], gross outliers are identified first and removed, and then the multiple structures are estimated in parallel. Unlike J-linkage and RANSAC, KF does not require the user to specify the inlier scale. However, KF uses a step size h in computing the ordered residual kernel. KF also requires the user to specify the relative weighting ratio between fitting error and model complexity in implementing the model selection procedure.
We note that although the MS-based approaches [20, 21, 23], RHA [24], J-linkage [25] and KF [26] claim to be able to detect the number of model instances, they all require some user-specified thresholds whose values are crucial in determining the number of model instances (e.g., [23] suggests for the MS-based approaches that the significant peaks are

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, MANUSCRIPT ID: TPAMI-00-07-0509

selected if the first N modes clearly dominate the (N+1)th mode), but the authors do not discuss how to judge whether the Nth mode dominates the (N+1)th mode, nor do they specify a threshold value for the judgment. RHA [24] uses a threshold to eliminate spurious peaks in the residual histogram, while J-linkage determines the number of model instances by selecting significant bins of the hypothesis histogram whose value is larger than a user-specified threshold. The value of the step size h and the weighting ratio used in KF have a significant influence on determining the number of model instances [26]. Detecting the number of model instances is still one of the most difficult problems in model fitting [27, 28].

Our approach is based on sampling parameter space (Sec. 3.1); since each point in this space is a potential model, we also call this hypothesis space. This starting point is similar to HT/RHT. Fig. 1 illustrates the basic steps, starting from weighting the hypothesis samples in Fig. 1(c). A key element in efficiency is that we prune this space (Sec. 3.3), using a principled information-theoretic criterion, after weighting each hypothesis so as to emphasise likely candidates (Sec. 3.2). Since multiple surviving (hence highly likely to be approximations to a true structure) samples in hypothesis space may actually come from the same structure, we cluster the surviving hypotheses (Sec. 3.4) using an approach inspired by J-linkage (but with very important differences; see the discussion therein) that, whilst providing an advance towards good clusters, does have a tendency to over-segment the clusters (Fig. 1(d)). Thus we refine these clusters (Sec. 3.5) using, again, a principled information-theoretic criterion (Fig. 1(e)). Like most algorithms, our approach is crucially dependent on the quality of the output of the first stage: in our case, the weighting of hypotheses before pruning. Our weighting is based on the notion that a good hypothesis is one that explains well the data around it.
In essence this notion, like in many methods, is based upon an analysis of the smaller residuals (if the model is a true one, then the data giving rise to these residuals would likely belong to that model), which essentially characterises an effective inlier scale. Here we do two things: we improve the scale estimator KOSE [13] (the new scale estimator is called IKOSE; Sec. 2.2), and we also improve on the scoring function (Sec. 3.2) used in [17, 29] (firstly with a better estimate of scale and secondly in how we use that scale). Our main contribution is that we develop a framework (AKSWH) that simultaneously fits all models to multiple-structure data and segments the data. In the framework, the relevant quantities (the parameters, the scales and the number of model instances) are determined automatically. In so doing, we have original contributions in various building blocks: (1) we propose a robust scale estimator called the Iterative Kth Ordered Scale Estimator (IKOSE), which can accurately estimate the scale of inliers for heavily corrupted data with multiple structures. IKOSE is of interest by itself since it can be a component in any robust approach that uses a data-driven scale estimate. (2) We propose an effective hypothesis weighting function that contributes to rapid elimination of poor hypotheses. Our combination of this with a (fully automatic) entropy thresholding approach leads to high computational efficiency. (3) From that point, our algorithm uses the relatively common divide-into-small-clusters (i.e., over-segmented clusters) and then merge procedure, but with a couple of novel features, such as effectively using a Jaccard-distance based clustering algorithm refined by the use of mutual information theory to fuse the over-segmented clusters (e.g., when the inlier noise of a structure is large, there may be more than one cluster corresponding to the structure). Overall, as a result of these

Figure 1.
Overview of our approach: (a) and (b) image pair with SIFT features; (c) weighted hypotheses (plotted using the first two coefficients of the homography matrix); (d) clusters (each colour represents a different cluster) after removing weak hypotheses and clustering; (e) the two surviving hypotheses (red) and the colour code of hypotheses judged to belong to each structure; (f) feature point data segmented according to structure.

innovations, AKSWH does not require the user to specify the number of structures nor the inlier scale of each model instance. AKSWH can deal with multiple-structure data heavily corrupted with outliers. Our experimental results show that the proposed IKOSE and AKSWH outperform competing approaches in both scale estimation and model fitting. We are not claiming to have solved the impossible: a completely automated system that will determine exactly how many structures there are in a scene, and recover all these structures with high precision and reliability. Such is impossible because it depends upon what is considered an important structure (a wall is a single plane at one resolution but a collection of closely aligned planes at finer resolution). Rather, what we are claiming is that when one has a number of roughly comparably sized structures in the scene, our approach will (and to an extent better than other competitors) determine not only the number of those structures but reliably fit and segment them. The remainder of the paper is constructed as follows: in Sec. 2, we review existing scale estimation techniques, and then propose a novel scale estimator and evaluate its performance. In Sec. 3, we develop the main components of the proposed approach. We put all the developed components together to obtain a complete framework in Sec. 4. In Sec. 5 we provide experimental results (with both synthetic and real data), and we give a concluding discussion in Sec. 6.

2 SCALE ESTIMATION

Scale estimation plays an important role in robust model fitting and segmentation.
The accuracy of the scale estimate (or error tolerance) greatly affects the performance of many robust estimators such as RANSAC, HT, RHT, multiRANSAC, M-estimators, MSAC, ALKS, ASSC, and J-linkage. The importance of scale estimation derives from: (1) it can be used to dichotomize inliers and outliers (e.g., in [9, 13, 15, 18, 25]); (2) it can be used to select the best hypothesis (say, in

[3, 30]); (3) it can be used to determine various data-dependent parameters such as the bandwidth value (e.g., for [15, 30]) or the bin size (e.g., for [7, 8]). Thus a better scale estimator is a crucial component in solving the multi-structure fitting and segmentation problems. In this section, we review several popular robust scale estimators, then propose a new scale estimator (IKOSE), followed by an experimental evaluation.

2.1 Review of Scale Estimators

Given a set of data points in d dimensions X := {x_i | x_i := (x_i1, ..., x_id)}, where i = 1, ..., n and x_i ∈ R^d, and the parameter estimate of a model θ := (θ_1, ..., θ_p)' (∈ R^p), the residual r_i corresponding to the ith data sample is written as:

    r_i := F(x_i, θ)    (1)

where F(·) is a function computing the residual of a data point x_i to the parameter estimate θ of a model. Denote by r~_i the sorted absolute residuals such that |r~_1| ≤ ... ≤ |r~_n|. Given the scale of the inlier noise s, the inliers can be dichotomized from the outliers using the following equation:

    |r~_i| / s < E    (2)

where E is a threshold (98% of the inliers of a Gaussian distribution are included when E is set to 2.5). The MEDian (MED) and the Median Absolute Deviation (MAD) are two popular scale estimators. These scale estimators can be written as:

    s_MED := 1.4826 (1 + 5/(n - p)) med_i |r_i|    (3)

    s_MAD := 1.4826 med_i { |r_i - med_j r_j| }    (4)

A generalization of the MED and MAD is the Kth Ordered Scale Estimator (KOSE) [13]:

    s_K := r~_K / Θ^{-1}[(1 + κ)/2]    (5)

where Θ^{-1}[·] denotes the inverse of the normal cumulative distribution function, and κ (κ := K/n) is the fraction of the whole data points having absolute residuals less than or equal to the Kth ordered absolute residual. MED, MAD and KOSE assume that the data do not include pseudo-outliers and that the residuals of the inliers are Gaussian distributed. When the data contain multiple structures, the scale estimates obtained by these estimators are biased and may even break down.
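The classical estimators of Eqs. (3)-(5) can be sketched in a few lines of Python. This is a minimal illustration under the stated Gaussian-inlier assumption; the function names and the use of `statistics.NormalDist` for the inverse normal CDF are our own choices, not from the paper.

```python
from statistics import NormalDist

def med_scale(residuals, p):
    # Eq. (3): MED scale estimate with the finite-sample correction factor
    r = sorted(abs(x) for x in residuals)
    n = len(r)
    return 1.4826 * (1 + 5.0 / (n - p)) * r[n // 2]

def mad_scale(residuals):
    # Eq. (4): median absolute deviation from the median
    n = len(residuals)
    m = sorted(residuals)[n // 2]
    dev = sorted(abs(x - m) for x in residuals)
    return 1.4826 * dev[n // 2]

def kose_scale(residuals, K):
    # Eq. (5): s_K = r~_K / Theta^{-1}((1 + kappa)/2), with kappa = K/n
    r = sorted(abs(x) for x in residuals)
    kappa = K / len(r)
    return r[K - 1] / NormalDist().inv_cdf((1 + kappa) / 2)
```

With K = n/2, KOSE reduces (up to the correction factor) to the MED estimator, which is the sense in which it generalizes MED.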
[13] proposed an Adaptive Least Kth order Squares (ALKS) algorithm to optimize the K value of KOSE by minimizing the variance of the normalized errors:

    K^ := argmin_K σ_K² / s^_K² = argmin_K [Θ^{-1}((1 + κ)/2)]² σ_K² / r~_K²    (6)

where σ_K² is the variance of the K smallest absolute residuals and s^_K is the scale estimated by Eq. (5). Alternatively, the Modified Selective Statistical Estimator (MSSE) [3] finds the K value which satisfies:

    σ_{K+1}² / σ_K² > 1 + (E² - 1)/(K + 1 - p)    (7)

Another approach [2] uses a Two-Step Scale Estimator (TSSE), which applies a mean shift procedure and a mean shift valley procedure to dichotomize inliers/outliers, and then employs the MED scale estimator to estimate the scale. ALKS, MSSE and TSSE all claim that they can tolerate more than 50% outliers. However, ALKS may not work accurately when the outlier percentage in the data is small. ALKS and MSSE cannot handle data involving extreme outliers. TSSE is computationally slow. To improve the accuracy of scale estimation, the authors of [3] employ an expectation maximization (EM) procedure to estimate the ratio of inliers and outliers, from which the scale of the inlier noise can be derived. However, there are several limits in that approach: (1) the authors assume that outliers are randomly distributed within a certain range. This assumption is less appropriate for multiple-structure data. (2) They initialize the inlier ratio value to be 50% of the data points. However, when the initial ratio is far from the true case, the EM algorithm may converge to a local peak, which leads to breakdown of the approach.

2.2 The Iterative Kth Ordered Scale Estimator

As mentioned in Sec. 2.1, when data contain multiple structures, MED, MAD and KOSE can either break down or be badly biased. The reasons include: (1) breakdown is caused when the median (for MED/MAD), or the Kth ordered absolute residual (for KOSE), belongs to the outliers. That is, when the median or the Kth ordered point is an outlier, it is impossible for MED/MAD or KOSE to derive the correct inlier scale from that point. (2) MED, MAD and KOSE are biased for multiple-structure data due to the bias in estimating κ in Eq.
(5) when n (the number of whole data points) is used instead of the number of data points belonging to the structure of interest. We propose to iteratively optimize the κ estimate (the resulting estimator is abbreviated to IKOSE). Let r~_K^j be the Kth sorted absolute residual given the parameters of the jth structure (θ^j), and let n~^j be the inlier number of the jth structure. IKOSE for the jth structure can be written as follows:

    s_K^j := r~_K^j / Θ^{-1}[(1 + κ^j)/2]    (8)

    κ^j := K / n~^j    (9)

The difference between Eq. (8) and Eq. (5) lies in the way of estimating κ. In (8), only the data points belonging to the jth structure are used, while in (5) the whole data points are used. When the data contain only one population without any outliers, (8) becomes (5). When the data involve multiple structures and/or random outliers, (8) should be used instead of (5). One crucial question is how to estimate n~^j. This is a chicken-and-egg problem, because when we know n~^j, we can dichotomize inliers and outliers and estimate the inlier scale belonging to the jth structure; on the other hand, if we know the true inlier scale, we can estimate the value of n~^j. To solve this problem, we propose to use an iterative procedure to estimate n~^j, described in Fig. 2.

Theorem 1. The scale estimates {s_{K,t}^j}_{t=1,2,...} in IKOSE converge, and {s_{K,t}^j}_{t=1,2,...} is monotonically decreasing with t.

Proof. According to (8) and (9),

    s_{K,1}^j = r~_K^j / Θ^{-1}[(1 + K/n~_1^j)/2]  and  s_{K,2}^j = r~_K^j / Θ^{-1}[(1 + K/n~_2^j)/2]

where n~_1^j = n and n~_2^j is the number of points satisfying r~_i^j / s_{K,1}^j < E. As the K value and the Kth ordered absolute residual r~_K^j are fixed in the iteration procedure, and Θ^{-1}[(1 + κ)/2] is monotonically increasing with κ, we obtain s_{K,2}^j ≤ s_{K,1}^j because n~_2^j ≤ n~_1^j = n. Similarly, we have s_{K,t}^j ≤ s_{K,t-1}^j. When the estimated scale s_{K,t}^j reaches the true inlier scale, the number of inliers classified by Eq. (2) will not change, and IKOSE outputs s_{K,t}^j. IKOSE is simple but efficient and highly robust to outliers, as demonstrated in Sec. 2.4. We assume the multiple structures (or model instances) are from the same class but with different parameters.

Algorithm 1: s_K^j = IKOSE(R^j, K)
1: Input: a set of residuals R^j := {r_i^j}_{i=1,...,n} and the K value.
2: Compute the Kth ordered absolute residual r~_K^j, κ_{t=1} (= K/n), and s_{K,t=1}^j by (8).
3: While s_{K,t}^j ≠ s_{K,t-1}^j and κ_t < 1
4:   Estimate the scale s_{K,t}^j at step t by (8).
5:   Calculate the number of inliers n~_t^j satisfying (2).
6:   Compute κ_t by (9).
7: End while
8: Output: the scale estimate s_K^j (= s_{K,t}^j).

Figure 2. The IKOSE algorithm.

results than the other approaches when the outlier percentage is less than 50%, but better results than MED/MAD for data involving more than 50% outliers, and better results than KOSE when the outlier percentage is more than 60%. MSSE and TSSE achieve better results than EM but less accurate results than IKOSE. IKOSE works well even when the data involve 90% outliers. Of course, when the data involve more than 90% outliers, IKOSE begins to break down. The reason is that, in that case, the Kth ordered point is an outlier, from which IKOSE cannot correctly recover the inlier scale.

2.3 The Choice of the K Value

One issue in IKOSE is how to choose K. When the data contain a high percentage of outliers, one should set K to a small value to guarantee that the Kth ordered point belongs to the inliers (to avoid breakdown); yet when the data include a high percentage of inliers, one should set the K value as large as possible so that more inliers are included to achieve better statistical efficiency. It is more important to avoid breakdown. For all the experiments in this paper, the inlier/outlier percentage of the data is not given to IKOSE. To be safe (using a small K value), we fix the K value to be 10% of the whole data points, and we do not change the K value for any individual experiment.

2.4 Performance of IKOSE

We compare the performance of IKOSE with that of seven other robust scale estimators (MED, MAD, KOSE, ALKS, MSSE, EM, and TSSE).
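The IKOSE iteration of Fig. 2 (Eqs. (8) and (9), with the inlier count updated through the dichotomization test of Eq. (2)) can be sketched in Python as follows; the numerical stopping tolerance and iteration cap are our own additions, and `statistics.NormalDist` supplies the inverse normal CDF.

```python
from statistics import NormalDist

def ikose(residuals, K, E=2.5, max_iter=100):
    """Iterative Kth Ordered Scale Estimator: a sketch of Fig. 2."""
    r = sorted(abs(x) for x in residuals)
    n = len(r)
    rK = r[K - 1]          # Kth ordered absolute residual (fixed in the loop)
    n_tilde = n            # start with all n points as presumed inliers
    s = None
    for _ in range(max_iter):
        kappa = K / n_tilde                                   # Eq. (9)
        if kappa >= 1.0:                # guard kappa_t < 1, as in Fig. 2
            break
        s_new = rK / NormalDist().inv_cdf((1 + kappa) / 2)    # Eq. (8)
        if s is not None and abs(s_new - s) < 1e-12:
            break                       # converged (Theorem 1)
        s = s_new
        n_tilde = sum(1 for x in r if x / s < E)  # inlier count via Eq. (2)
    return s
```

Because the inlier count can only shrink once outliers fall outside the E·s band, the successive estimates decrease monotonically toward the inlier scale, mirroring Theorem 1.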
In this section we assume the true parameters of a model are given, so that we can concentrate on the scale estimation alone (we evaluate the performance of IKOSE without using any prior knowledge about the true model parameters in Sec. 5). In the following experiment, we generate a two-crossing-line dataset and a two-parallel-plane dataset, each of which includes a total of 1000 data points, distributed in the range [0, 100]. The standard variance of the inlier noise is 1.0. For the two-crossing-line dataset, the number of data points belonging to the first line (in red color) is gradually decreased from 900 to 100, while the number of data points belonging to the second line (in blue color) is fixed at 100. Thus the percentage of outliers to the first line is increased from 10% to 90%. For the two-parallel-plane dataset, we keep the number of data points belonging to the two planes the same. We gradually decrease the number of data points belonging to the planes, while we increase the number of random outliers from 0 to 800. Thus the percentage of outliers corresponding to the first plane is increased from 50% to 90%. We measure the scale estimation error using the following equation:

    Λ(s^, s_T) = max(s^/s_T, s_T/s^) - 1    (10)

where s^ is the estimated scale and s_T is the true scale. We repeat the experiments 50 times and show both the average results (shown in Fig. 3(c, d)) and the maximum error among the results (in Fig. 3(e, f)). Table 1 shows the mean, the standard variance and the maximum scale estimation errors of the results. The K value is fixed to 100. As shown in Fig. 3 and Table 1, IKOSE achieves the best performance among the eight competing scale estimators. When the outlier percentage is more than 50%, MED and MAD begin to break down. ALKS obtains less accurate

Figure 3. (a) and (b): two snapshots respectively showing the two-line data and the two-plane data with 90% outliers. (c) and (d): the error plots of the scale estimation vs. the outlier percentages. (e) and (f): the maximum errors in scale estimation obtained by the eight competing estimators.
In (d) and (f) we do not show the results of the Median and the MAD scale estimators, which cannot deal with more than 50% outliers and totally break down.

TABLE 1
QUANTITATIVE EVALUATION OF THE DIFFERENT SCALE ESTIMATORS ON THE TWO-LINE DATASET AND THE TWO-PLANE DATASET

            Two-line dataset              Two-plane dataset
            Mean    Std.Var.  Max.Err.    Mean    Std.Var.  Max.Err.
Median      .       .9        39.8        35.7    .3        40.8
MAD         .0      .8        39.9        7.3     .33       36.7
KOSE        .70     .88       .5          3.0     .04       .0
ALKS        .7      0.7       4.          .40     .0        3.
MSSE        0.3     .30       38.         0.38    .3        3.3
EM          0.33    0.5       6.6         .57     .70       87.4
TSSE        0.3     0.5       .9          0.5     0.7       5.75
IKOSE       0.      0.05      0.88        0.      0.06      0.88

3 THE AKSWH APPROACH

In this section, based on IKOSE (Sec. 2), we propose a novel robust approach which can simultaneously estimate not only the parameters and the scales of the model instances in the data, but also the number of model instances in the data. In contrast to the experiment in Sec. 2.4, the parameters of the model instances are not given as prior information in the following experiments, but are estimated from the data.

3.1 The Methodology

In our framework we first sample a number of p-subsets (a p-subset is the minimum number of data points required to calculate the model parameters; e.g., p is equal to 2 for line fitting and 3 for circle fitting). The number of samples should be sufficiently large such that, with high probability, at least one all-inlier p-subset is selected for each structure in the data. Both random sampling techniques (such as [9]) and adaptive sampling techniques (e.g., [33, 34]) can be employed to sample p-subsets. Having generated a set of p-subsets, we compute the model hypotheses using the p-subsets. We assign each hypothesis a weight. Let θ_i = (θ_i1, θ_i2, ..., θ_ip) be the ith hypothesis and P_i := (θ_i, w_i) = (θ_i1, θ_i2, ..., θ_ip, w_i)' be a weighted hypothesis corresponding to θ_i; we represent the model parameter estimates with a set of weighted hypotheses as follows:

    P := {P_i}_{i=1,2,...} = {(θ_i, w_i)}_{i=1,2,...}    (11)

where w_i is the weight of the hypothesis P_i and w_i > 0 (Sec. 3.2). The assumption is that every structure of interest has at least one (locally strong) hypothesis associated with it, and that these (amongst other clean hypotheses) will survive the thresholding step (Sec. 3.3). The surviving hypotheses are clustered and single representatives of each cluster are selected (Secs. 3.4 and 3.5). The net effect is intended so as to choose one hypothesis for each cluster. Let P^j := {P_i^j} := {(θ_i^j, w_i^j)}_{i=1,2,...} be the weighted hypotheses associated with the jth structure, and let P^{j*} be the estimated hypothesis for the jth structure with the parameter estimate θ^{j*} and weight w^{j*}; AKSWH can be written as:

    P* = ∪_j P^{j*},  P^{j*} := argmax_{{P_i^j}} w_i^j    (12)

3.2 Weighting Hypotheses

In the above, it is important to effectively weight each hypothesis, as the weight is the key measure of the goodness of a hypothesis. Ideally, when a p-subset only contains inliers belonging to one structure, the weight of the corresponding hypothesis should be as high as possible. Otherwise, the weight of that hypothesis should be as close to zero as possible.
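As an illustration of the p-subset sampling and hypothesis-generation step of Sec. 3.1, the following Python sketch draws random p-subsets for line fitting (p = 2) and turns each into a hypothesis; the normalized line form ax + by + c = 0 and all function names are our own, and a practical implementation would use the guided sampling of [33, 34] rather than pure random sampling.

```python
import random

def line_from_p_subset(pts):
    """Line ax + by + c = 0 (a^2 + b^2 = 1) through a p-subset of p = 2 points."""
    (x1, y1), (x2, y2) = pts
    a, b = y2 - y1, x1 - x2                 # normal vector of the line
    norm = (a * a + b * b) ** 0.5
    return (a / norm, b / norm, (x2 * y1 - x1 * y2) / norm)

def sample_hypotheses(points, num_hyp):
    """Generate line hypotheses from random p-subsets (p = 2 for lines)."""
    hyps = []
    while len(hyps) < num_hyp:
        p_subset = random.sample(points, 2)
        if p_subset[0] != p_subset[1]:      # degenerate p-subset: skip
            hyps.append(line_from_p_subset(p_subset))
    return hyps

def residuals(points, theta):
    """Signed point-to-line distances, i.e. r_i = F(x_i, theta) of Eq. (1)."""
    a, b, c = theta
    return [a * x + b * y + c for (x, y) in points]
```

Because the line coefficients are normalized, the residual of each point is its Euclidean distance to the hypothesized line, which is what the scale estimation of Sec. 2 operates on.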
We base our measure on non-parametric kernel density estimation techniques [35, 36]. Given a set of residuals {r_i(θ_j)}_{i=1,...,n} (r_i(θ_j) ∈ R) which are derived from the jth hypothesis, the variable bandwidth kernel density estimate at r is written as follows:

    f(r) := (1/n) Σ_{i=1}^n (1/h(θ_j)) N((r - r_i(θ_j)) / h(θ_j))    (13)

where N(·) and h(θ_j) are the kernel and the bandwidth. The popular Epanechnikov kernel N_E(r) [35] is used in our approach and it is defined as:

    N_E(r) := (3/4)(1 - ||r||²) if ||r|| ≤ 1, and 0 if ||r|| > 1,
    with kN_E(r) := 1 - ||r||² if ||r|| ≤ 1, and 0 if ||r|| > 1    (14)

where kN_E(r) is the profile of the kernel N_E(r). The bandwidth can be estimated using [35]:

    h(θ_j) = [ 243 ∫ N(r)² dr / (35 n ∫ r² N(r) dr) ]^{1/5} s(θ_j)    (15)

where s(θ_j) is the scale estimate obtained by IKOSE (Sec. 2.2). The value of s(θ_j) and the bandwidth h(θ_j) are calculated for each p-subset. Like [17, 29], we score a hypothesis using the density estimate at the origin (O), but we refine this to further suppress hypotheses that produce large scale estimates through:

    w_j := f(O)/s(θ_j) = Σ_{i=1}^n N(r_i(θ_j)/h(θ_j)) / (n s(θ_j) h(θ_j))    (16)

Consider for the moment the case where the inlier scale is fixed (to some common value); then the bandwidth h is also a constant value for all valid hypotheses. Thus, Eq. (16) can be rewritten as:

    w_j ∝ (1/n) Σ_{i=1}^n N(r_i(θ_j)/h)    (17)

Note this is just a summation of values, one for each residual small enough to lie within the support of the kernel functions overlapping the origin. So it is like a weighted count of the number of residuals in a certain fixed width (set by the data-independent width of the kernel function). So it is like a generalization of the RANSAC criterion, sharing the fact that it would score each hypothesis only by a globally defined tolerance level and the number of residuals lying in that tolerance. Of course, this RANSAC score is implicitly related to the scale of noise that is consistent with that hypothesis: the larger the scale, the fewer samples one would expect in the tolerance band. In contrast, our criterion is deliberately more sharply dependent on a data-driven scale. In effect, the band itself is being refined in a data-dependent way. We will evaluate the two weighting functions (Eqs. 16 and 17) in Secs. 4 and 5.
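The weighting of Eq. (16) can be sketched as follows, assuming the residuals and the IKOSE scale estimate for one hypothesis are already in hand. The closed-form constants in `bandwidth` come from plugging the Epanechnikov kernel into Eq. (15) (∫N_E² dr = 3/5 and ∫r² N_E(r) dr = 1/5); the function names are ours.

```python
def epanechnikov(r):
    # Eq. (14): N_E(r) = 0.75 * (1 - r^2) for |r| <= 1, else 0
    return 0.75 * (1 - r * r) if abs(r) <= 1 else 0.0

def bandwidth(n, s):
    # Eq. (15) specialized to the Epanechnikov kernel:
    # integral of N_E^2 is 3/5, integral of r^2 N_E(r) is 1/5
    return (243.0 * (3.0 / 5) / (35.0 * n * (1.0 / 5))) ** 0.2 * s

def hypothesis_weight(res, s):
    # Eq. (16): kernel density at the origin, penalized by the scale estimate
    n = len(res)
    h = bandwidth(n, s)
    density = sum(epanechnikov(r / h) for r in res) / (n * h)
    return density / s
```

A hypothesis with tightly concentrated residuals and a small IKOSE scale receives a much larger weight than one whose residuals are spread out, which is exactly the data-driven sharpening described above.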
3.3 Selecting Significant Hypotheses

After assigning each hypothesis a weight, the next question is how to select significant hypotheses (those with high weight scores) while ignoring weak hypotheses (corresponding to bad p-subsets). One solution is to set a threshold manually; however, this is not desirable because it is not data-driven. Data-driven thresholding approaches have been well surveyed in [37]. We use an information-theoretic approach similar to [38], which treats thresholding as a process of removing redundancies in the data while retaining as much information content as possible. As we show in the following, the method is effective and computationally efficient.

Given a set of hypotheses $P = \{P_i\}_{i=1,2,\ldots}$ and the corresponding weights $W := \{w_i\}$ calculated by Eq. (6) or Eq. (7), we define:

$$\Psi_i := \max(W) - w_i \quad (8)$$

where $\Psi_i \ge 0$. That is, $\Psi_i$ is a quantity proportional to the gap between the weight of the $i$-th hypothesis and the maximum weight. By regarding the weights as estimates of the density at the locations $\{P_i\}_{i=1,2,\ldots}$ in the parameter space, the prior probability of the component $\Psi_i$ is written as:

$$p(\Psi_i) := \Psi_i \Big/ \sum_{j} \Psi_j \quad (9)$$

The uncertainty of $\Psi_i$ is then defined as:

$$H(\Psi_i) := -p(\Psi_i)\log p(\Psi_i) \quad (10)$$

The entropy of the set $\{\Psi_i\}$ can then be calculated as the sum of the uncertainties of the hypothesis set:

$$H := \sum_{i} H(\Psi_i) \quad (11)$$

The quantity $H$ essentially quantifies the information content available in the set of hypotheses $P = \{P_i\}_{i=1,2,\ldots}$. Significant hypotheses $P^*$ are then selected from the hypotheses $P$ by keeping the hypotheses satisfying the following condition:

$$P^* = \{P_i \mid H + \log p(\Psi_i) < 0\} \quad (12)$$

In other words, we keep the hypotheses which contribute a

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, MANUSCRIPT ID: TPAMI-00-07-0509

significant proportion of the information content, while rejecting uninformative hypotheses. With Eq. (12), weak hypotheses are rejected while effective hypotheses are identified (see Fig. 1(d) for an example).

3.4 Clustering Hypotheses

The surviving significant hypotheses are assumed to form clusters corresponding to the underlying modes of the model probability density distributions in parameter space. We cluster the significant hypotheses based on the Jaccard distance (referred to as the J-distance), which measures to what extent the identified inliers are shared between two hypotheses. Hypotheses from the same structure share a high percentage of the identified inliers: hypotheses with low J-distance scores are clustered together, while hypotheses with high J-distance scores remain independent of each other.

Given a set of residuals $\{r_i(\theta)\}_{i=1,\ldots,n}$ derived from $\theta$, we can dichotomize inliers and outliers with an inlier scale estimate $\hat{s}$ obtained by IKOSE or specified by the user. The consensus set of the residuals $\{r_i(\theta)\}_{i=1,\ldots,n}$ is formulated as:

$$C(\theta) := \{L(r_i(\theta))\}_{i=1,\ldots,n}, \quad\text{where } L(r_i) = \begin{cases} 1 & |r_i| \le E\hat{s} \\ 0 & \text{otherwise} \end{cases} \quad (13)$$

The J-distance between two consensus sets $C(\theta_i)$ and $C(\theta_k)$ is given by:

$$J\big(C(\theta_i), C(\theta_k)\big) := 1 - \frac{|C(\theta_i) \cap C(\theta_k)|}{|C(\theta_i) \cup C(\theta_k)|} \quad (14)$$

When $C(\theta_i)$ and $C(\theta_k)$ are identical, their J-distance is zero; when $C(\theta_i)$ and $C(\theta_k)$ are totally different, their J-distance is 1.0.

We note that [5] also employs the J-distance as a clustering criterion. However, our use of the J-distance is significantly different: (1) a consensus set in [5] is a set of classifications of the parameter hypotheses with respect to one data point, so the J-distance must be calculated for all possible pairs of the data points, i.e., n(n-1)/2 pairs. The consensus set in our approach is the inlier/outlier binary classification of all data points with respect to one parameter hypothesis. With an effective weighting function, we can select significant hypotheses by pruning weak hypotheses and calculate the J-distances only for the significant hypotheses, by which the computational efficiency can be greatly improved.
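The selection rule of Eqs. (8)-(12) in Sec. 3.3 can be sketched as follows (our illustration, not the released code; note that the maximum-weight hypothesis has $\Psi = 0$, so its $-\log p(\Psi)$ is unbounded and it is always kept):

```python
import numpy as np

def select_significant(weights):
    """Entropy-based thresholding of hypothesis weights, Eqs. (8)-(12).
    Returns a boolean mask over the hypotheses: True = significant."""
    w = np.asarray(weights, dtype=float)
    psi = w.max() - w                      # gap to the best weight, Eq. (8)
    total = psi.sum()
    if total == 0.0:                       # all weights equal: keep everything
        return np.ones_like(w, dtype=bool)
    p = psi / total                        # component prior, Eq. (9)
    with np.errstate(divide="ignore"):
        logp = np.log(p)                   # -inf where p == 0 (the best one)
    H = -np.sum(np.where(p > 0.0, p * logp, 0.0))   # total entropy, Eq. (11)
    return H + logp < 0.0                  # keep informative hypotheses, Eq. (12)
```

For weights `[10, 9.5, 9, 1, 1, 1, 1]`, the three strong hypotheses are kept and the four weak ones are rejected, with no user-tuned cut-off.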
In our case, we only compute the J-distance for $M(M-1)/2$ pairs of significant hypotheses (where $M$ is the number of significant hypotheses), and $M$ can be much smaller than the number of data points $n$. Thus, our approach is much faster than the J-linkage approach (see TABLE 2 and TABLES 4-6 in Sec. 5). (2) [5] clusters the pairs of data points, and selects as the inliers the data points belonging to one cluster when the number of those data points is larger than a user-specified threshold; the parameters of the model instance are then calculated using all the classified inliers. In contrast, we use a more straightforward way that clusters pairs of hypotheses in parameter space, and we do not use any threshold to determine the number of clusters.

After we calculate the J-distances between all possible pairs of the selected significant weighted hypotheses $P^*$, we apply the entropy thresholding approach (described in Sec. 3.3) to the J-distance values to obtain an adaptive cut-off value. This cut-off value is data-driven rather than user-specified. The pairs of hypotheses whose J-distance values are less than the cut-off value are clustered together through the traditional linkage algorithm [39], while the pairs of hypotheses whose J-distance values are larger than the cut-off value are left independent. However, one structure may correspond to more than one cluster, especially when the inlier noise of the structure is large. To solve this problem, we need to effectively fuse the clusters belonging to the same structure (equivalently, this means elimination of duplicate representations of modes).

3.5 Fusing Clusters with Mutual Information Theory

Algorithm 2: $P^\Delta = \mathrm{Fusion}(X, \tilde{P})$
1: Input the data points $X$ and a set of hypotheses $\tilde{P} = \{\tilde{P}_i\}_{i=1,2,\ldots}$
2: Sort the hypotheses according to their weights to obtain $\tilde{P} = \{\tilde{P}_{\lambda(1)}, \tilde{P}_{\lambda(2)}, \ldots\}$ satisfying $w_{\lambda(1)} \le w_{\lambda(2)} \le \ldots$
3: Compute the probabilities of $X$ belonging to two hypotheses $\tilde{P}_{\lambda(i)}$ and $\tilde{P}_{\lambda(j)}$ chosen from $\tilde{P}$ by (18)
4: Compute the mutual information $M(\tilde{P}_{\lambda(i)}, \tilde{P}_{\lambda(j)})$ by (16)
5: Repeat step 3 and step 4 until the mutual information of all possible pairs of hypotheses in $\tilde{P}$ is calculated
6: For $i = 1$ to $\mathrm{Size}(\tilde{P}) - 1$
7:   If any $M(\tilde{P}_{\lambda(i)}, \tilde{P}_{\lambda(q)}) > 0$ for $\mathrm{Size}(\tilde{P}) \ge q > i$ then
8:     Change the cluster label of $\tilde{P}_{\lambda(i)}$ to that of $\tilde{P}_{\lambda(q)}$
9:     Remove $\tilde{P}_{\lambda(i)}$ from $\tilde{P}$
10:  End if
11: End for
12: Output the remaining hypotheses in $\tilde{P}$ ($= P^\Delta$)

Figure 4. The Fusion algorithm.

To solve the above problem, we propose to use Mutual Information Theory (MIT) [40]. This is a frequently used technique in various contexts: [41] employs MIT to select an optimal set of components in mixture models for accurately estimating an underlying probability density function, and [42] extends the approach of [41] to speaker identification. In this paper, we propose to apply MIT to multiple-structure model fitting.

Each cluster obtained by the approach described in Sec. 3.4 is represented by its most significant hypothesis, i.e., the one with the highest weight score among all the hypotheses belonging to that cluster. Let $\tilde{P} = \{\tilde{P}_i\}_{i=1,2,\ldots}$ be the most significant hypotheses of the clusters. Each hypothesis in $\tilde{P}$ corresponds to one model candidate. When two hypotheses in $\tilde{P}$ correspond to one structure (i.e., the corresponding p-subsets are sampled from the same structure), the information shared by the two hypotheses is large. On the other hand, when two hypotheses are from different structures, the mutual information between them is small.

Let $p(\theta_i)$ denote the probability of $\theta_i$. As in Eq. (10), the uncertainty of the hypothesis $\tilde{P}_i$ is defined as:

$$H(\tilde{P}_i) \equiv H(\theta_i) := -p(\theta_i)\log p(\theta_i) \quad (15)$$

An estimate of the mutual information between two hypotheses $\tilde{P}_i$ and $\tilde{P}_j$ in $\tilde{P}$ can be written as:

$$M(\tilde{P}_i, \tilde{P}_j) \equiv M(\theta_i, \theta_j) := p(\theta_i, \theta_j)\log\frac{p(\theta_i, \theta_j)}{p(\theta_i)\,p(\theta_j)} \quad (16)$$

$$\text{where}\quad p(\theta_i, \theta_j) = \frac{\sum_{l=1}^{n} p(x_l \mid \theta_i)\,p(x_l \mid \theta_j)}{\sum_{l=1}^{n} p(x_l \mid \theta_i)\;\sum_{l=1}^{n} p(x_l \mid \theta_j)} \quad (17)$$

Here $p(x_l \mid \theta_i)$ is the conditional probability of the datum $x_l$ belonging to the model $\theta_i$.
Assuming a Gaussian inlier noise model, $p(x_l \mid \theta_i)$ can be written as:

$$p(x_l \mid \theta_i) \propto \frac{1}{s_i}\exp\!\left(-\frac{F(x_l, \theta_i)^2}{2 s_i^2}\right) \quad (18)$$

where $s_i$ is the inlier scale corresponding to the hypothesis $\theta_i$ and $F(x_l, \theta_i)$ is the residual of $x_l$ with respect to $\theta_i$. Therefore, Eq. (16) analyses the mutual information between two

hypotheses by measuring the degree of similarity between the inlier distributions centered respectively on the two hypotheses. When the mutual information between $\tilde{P}_i$ and $\tilde{P}_j$ (i.e., $M(\tilde{P}_i, \tilde{P}_j)$) is larger than zero, $\tilde{P}_i$ and $\tilde{P}_j$ are statistically dependent; otherwise, $\tilde{P}_i$ and $\tilde{P}_j$ are statistically independent.

[41] employs MIT to prune mixture components in Gaussian mixture models. In contrast, we apply MIT in multiple-structure model fitting: we are pruning/fusing related descriptions of general data (e.g., identifying similar lines, circles, planes, homographies or fundamental matrices), similar in the sense that they explain the same portion of the data.

The fusion algorithm is summarized in Fig. 4. The function $\mathrm{Size}(\cdot)$ returns the number of hypotheses. The weight scores of the hypotheses indicate the goodness of the model candidates. We first order the hypotheses according to their weights, from the weakest to the most significant. When we find a pair of hypotheses which are statistically dependent on each other, we fuse the hypothesis with the lower weight score into the hypothesis with the higher weight score.

4 THE COMPLETE ALGORITHM OF AKSWH

Algorithm 3: $[P^\Delta, CS] = \mathrm{AKSWH}(X, K)$
Input: the data points $X$ and the value $K$
1: Estimate the parameters of a model hypothesis $\theta$ using a p-subset chosen from $X$
2: Derive the residuals $R(\theta) = \{r_i(\theta)\}_{i=1,\ldots,n}$ of the data points
3: Estimate the scale $s(\theta) = \mathrm{IKOSE}(R, K)$ by Alg. 1
4: Compute the weight $w$ of the hypothesis by (6)
5: Repeat step 1 to step 4 many times to obtain a set of weighted hypotheses $P = \{P_i\}_{i=1,2,\ldots} = \{(\theta_i, w_i)\}_{i=1,2,\ldots}$
6: Select significant hypotheses $P^*$ ($\subset P$) from $P$ by the approach introduced in Sec. 3.3
7: Run the clustering procedure introduced in Sec. 3.4 on $P^*$ to obtain clusters, and select the most significant hypothesis in each cluster according to the weights of the hypotheses, yielding $\tilde{P} = \{\tilde{P}_i\}_{i=1,2,\ldots}$
8: Fuse the hypotheses $\tilde{P}$ to obtain $P^\Delta = \{P^\Delta_i\}_{i=1,2,\ldots}$ by Alg. 2
9: Derive the inlier/outlier dichotomy using $P^\Delta$
10: Output the hypotheses $P^\Delta$ and the inlier/outlier dichotomies $CS$ ($= \{C(P^\Delta_i)\}_{i=1,2,\ldots}$)
corresponding to each of the hypotheses in $P^\Delta$.

Figure 5. The complete AKSWH algorithm.

With all the ingredients developed in the previous sections, we now propose the complete AKSWH algorithm, summarized in Fig. 5. After the estimation, one can use all the identified inliers to refine each estimated hypothesis. Because some robust approaches such as RANSAC, MSAC, HT, RHT, MS and J-linkage require the user to specify the inlier scale or a scale-related bin size or bandwidth, and because it is also interesting to evaluate the effectiveness of the weighting functions in (6) and (7), we provide an alternative version of AKSWH with a user-input inlier scale (we refer to our approach with a fixed inlier scale as AKSWH1, and to AKSWH with the adaptive scale estimation approach as AKSWH2). For AKSWH1, we input the inlier scale instead of the value $K$, and we do not run step 3; instead we simply pass the user-supplied tolerance as the scale estimate for all hypotheses and use it to determine the (single, global) bandwidth. All the remaining steps are the same as in AKSWH2.

Figure 6. Some results obtained by AKSWH1. (a) Over-segmented clusters after the clustering step using a fixed inlier scale. (b) The final segmentation result obtained by AKSWH1.

Fig. 1 and Fig. 6 respectively illustrate some results of AKSWH2 and AKSWH1 (the results are similar; essentially only the computational time taken is different) in planar homography detection using Merton College II (referred to as MC2). We employ the direct linear transform (DLT) algorithm (Alg. 4.2, p. 109, [43]) to compute the homography hypotheses from a set (5000 in size) of randomly chosen p-subsets (here a p-subset is 4 pairs of point correspondences in two images). For each estimated homography hypothesis, we use the symmetric transfer error model (p. 94, [43]) to compute the residuals. For AKSWH2, there are 384 significant hypotheses selected from the 5000 hypotheses, and 9 clusters are obtained at the clustering step.
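The clustering step of Sec. 3.4 (Eqs. 13-14 plus the linkage algorithm) can be sketched as follows (our illustration, not the released code). The threshold multiplier `E` and the use of SciPy's single-linkage routine are assumptions; in the paper the cut-off comes from applying the entropy thresholding of Sec. 3.3 to the J-distance values, whereas here it is simply passed in.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

E = 2.5  # hypothetical inlier threshold multiplier for Eq. (13)

def consensus_set(residuals, scale):
    """Binary inlier labels of one hypothesis, Eq. (13)."""
    return np.abs(np.asarray(residuals, dtype=float)) <= E * scale

def j_distance(c_i, c_k):
    """Jaccard distance between two consensus sets, Eq. (14):
    0 when identical, 1.0 when disjoint."""
    union = np.logical_or(c_i, c_k).sum()
    if union == 0:
        return 1.0
    return 1.0 - np.logical_and(c_i, c_k).sum() / union

def cluster_hypotheses(consensus_sets, cutoff):
    """Single-linkage clustering of hypotheses: pairs whose J-distance is
    below the cut-off merge, the rest stay independent. Returns one
    integer cluster label per hypothesis."""
    m = len(consensus_sets)
    dist = np.zeros((m, m))
    for i in range(m):
        for k in range(i + 1, m):
            dist[i, k] = dist[k, i] = j_distance(consensus_sets[i],
                                                 consensus_sets[k])
    Z = linkage(squareform(dist, checks=False), method="single")
    return fcluster(Z, t=cutoff, criterion="distance")
```

Only M(M-1)/2 J-distances are computed for the M significant hypotheses, which is what makes this step cheap compared with J-linkage's n(n-1)/2 data-point pairs.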
In comparison, the numbers of selected significant hypotheses and of clusters before the fusion step obtained by AKSWH1 are respectively 806 and 45. Although the segmentation results obtained by AKSWH1 are the same as those obtained by AKSWH2, AKSWH2 is more effective in selecting significant hypotheses due to its more effective weighting function.

For the computational time used by AKSWH2, around 74% of the whole time is used to generate the 5000 hypothesis candidates from the randomly sampled p-subsets and compute the weights of the hypotheses (step 1 to step 5 in Fig. 5), and about 25% of the whole time is used to cluster the significant hypotheses (step 7 in Fig. 5). The significant hypothesis selection (step 6) and the fusion of the clusters (step 8) take less than 1% of the whole time. In contrast, AKSWH1 uses about 26% of the computational time in generating hypothesis candidates and computing the weights of the hypotheses, and 73% of the whole time in clustering the significant hypotheses (because more hypotheses are selected for clustering when the less effective weighting function is used). Thus we can see that effectively selecting significant hypotheses (step 6 in Fig. 5) significantly affects the computational speed of AKSWH1/2. We give more comparisons between AKSWH1/2 in Sec. 5.

Figure 7. Segmentation results for the Merton College III (referred to as MC3) image pair obtained by AKSWH2. (a) and (b) The input image pair with SIFT feature points and corresponding matches superimposed, respectively. (c) The segmentation results. (d) to (f) The inlier/outlier dichotomy obtained by IKOSE (corresponding to the left, middle and right plane, respectively).
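The fusion test of Sec. 3.5 used in step 8 can be sketched as follows (our illustration). The joint-probability normalization in `mutual_information` is our assumption where Eq. (17) is ambiguous in this copy; what the Fusion algorithm of Fig. 4 relies on is only the sign of the score.

```python
import numpy as np

def inlier_prob(residuals, scale):
    """Gaussian inlier likelihood p(x_l | theta) of Eq. (18), up to a constant."""
    r = np.asarray(residuals, dtype=float)
    return np.exp(-r**2 / (2.0 * scale**2)) / scale

def mutual_information(res_i, s_i, res_j, s_j):
    """MI-style dependence score between two hypotheses in the spirit of
    Eqs. (16)-(17): positive when the two inlier distributions largely
    coincide (fuse them), non-positive when they are disjoint (keep both)."""
    pi = inlier_prob(res_i, s_i)
    pj = inlier_prob(res_j, s_j)
    n = len(pi)
    # Joint probability taken as the normalized correlation of the two
    # inlier likelihoods over all data points (assumed normalization).
    joint = np.sum(pi * pj) / np.sqrt(np.sum(pi**2) * np.sum(pj**2))
    p_i, p_j = pi.sum() / n, pj.sum() / n   # marginal probabilities
    return joint * np.log(joint / (p_i * p_j))
```

In the Fusion algorithm this score is evaluated for every pair of cluster representatives, and a representative scoring positive against a higher-weighted one is merged into it.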

(a) Data (b) RHT (c) MS (d) RHA (e) J-linkage (f) KF (g) AKSWH2
Figure 8. Examples for line fitting and segmentation. The 1st to 4th rows respectively fit three, four, five and six lines; the corresponding outlier percentages are respectively 85%, 85%, 87% and 90%, and the inlier scale is 1.5. (a) The original data, with a total of 1000 points in each dataset, distributed in the range [0, 100]. (b) to (g) The results obtained by RHT, MS, RHA, J-linkage, KF and AKSWH2, respectively. We do not show the results of AKSWH1 and ASC, which are similar to those of AKSWH2, due to the space limit.

Fig. 7 shows the homography-based segmentation results obtained by AKSWH2. AKSWH2 automatically selects 68 significant hypotheses from 10000 hypotheses and correctly detects the three planar homographies. Figs. 7(d) to (f) illustrate the probability density distributions of the correspondences, ordered by absolute residual value, for each selected model instance, and the adaptive inlier/outlier dichotomy estimated by IKOSE (in AKSWH2). We can see that IKOSE has correctly detected the inlier/outlier dichotomy boundary.

5 EXPERIMENTS

We compare AKSWH with RHT [8], ASC [7, 9], MS [10, 11], RHA [4], J-linkage [5] and KF [6]. The choice of the competitors was informed by the suggestions of the reviewers and by the following reasons: (1) we choose J-linkage (which uses the Jaccard distance as a criterion to cluster data points) and ASC (which is also developed based on kernel density estimation) because they are the approaches most related to the proposed approach. (2) We choose RHT and MS because both approaches work in parameter space. (3) RHA and KF estimate the parameters of model instances simultaneously and claim the ability to estimate the number of structures; thus we also choose RHA and KF. As indicated earlier, we test two versions of our approach: AKSWH1 with a fixed inlier scale, and AKSWH2 with an adaptive inlier scale estimate.
AKSWH1 is not intended as a proposed method; we use it only to evaluate the relative effect of the more sophisticated weighting scheme introduced in Sec. 3.2. Nonetheless, we find that it performs remarkably well, albeit generally at a slower speed and with slightly lower accuracy. In running RHT, MS, J-linkage and AKSWH1, we manually specify the inlier scale for these approaches. We specify the number of model instances for ASC, and the minimum inlier number (which has influence on determining the number of model instances) for RHT. MS, RHA, J-linkage and KF need some user-specified thresholds which have significant influence on the estimate of the number of model instances (see Sec. 1 for more detail). Thus, we adjust the threshold values so that RHT, MS, RHA, J-linkage and KF output the correct number of model instances. Neither the number of model instances nor the true scale is required for AKSWH2. The value of K in AKSWH1/2 is fixed to 10% of the whole data points in all the following experiments, and we do not adjust any other parameter or threshold for AKSWH1/2 (note: we specify the inlier scale for AKSWH1 for comparison purposes only). Although there are other sampling techniques (e.g., [33, 34]), we employ random sampling for all the approaches because: (1) it is easy to implement and does not require extra thresholds (e.g., the threshold in the exponent function in [5], or the matching scores in [33, 34]); and (2) it is the most widely used approach to generate hypothesis candidates.

To measure the fitting errors, we first find the correspondence relationship between the estimated model instances and the true model instances (for real data/images, we manually obtain the true parameters of the model instances and the true inliers to each model instance). Then, we fit the true inliers to the corresponding model instances estimated by the approaches, and fit the inliers classified by the approaches to the true parameters of the corresponding model instances.
For RHT, MS, RHA, J-linkage and AKSWH1, the inliers belonging to a model instance are classified using the given ground-truth inlier scale, while for ASC, KF and AKSWH2 the inlier scales are adaptively estimated. There are 5000 p-subsets generated for each dataset used in Sec. 5.1 and Secs. 5.2.1 to 5.2.3, 10000 p-subsets for each dataset used in Sec. 5.2.4, and 20000 for each dataset used in Sec. 5.2.5.

5.1 Simulations with Synthetic Data

We evaluate the eight approaches on line fitting using four sets of synthetic data (see Fig. 8). The residuals are calculated by the standard regression definition. We compare the fitting error results and the computational speed (TABLE 2). From Fig. 8 and TABLE 2, we can see that for the three-line data, AKSWH1/2 succeed in fitting and segmenting the three lines

For MS, we use the C++ code from http://coewww.rutgers.edu/riul. For J-linkage, we use the Matlab code from http://www.toldo.info.

Legend (a) (b) (c)
Figure 9. The average results obtained by the eight approaches. (a) to (c) respectively show the influence of the inlier scale, the outlier percentage, and the relative cardinality ratio of outliers to inliers.

(a) (b) RHT (c) ASC (d) MS (e) RHA (f) J-linkage (g) KF (h) AKSWH2
Figure 10. Examples for line fitting with real images. The 1st ("tennis court") and 2nd ("tracks") rows respectively fit six and seven lines. (a) The original images; (b) to (h) are the results obtained by RHT, ASC, MS, RHA, J-linkage, KF and AKSWH2, respectively. We do not show the results obtained by AKSWH1, which are very similar to those of AKSWH2, due to the space limit.

automatically. With some user-adjusted thresholds, RHT, ASC, J-linkage and KF also correctly fit all three lines: RHT and ASC achieve slightly higher fitting errors than AKSWH1/2, while KF achieves the lowest fitting error among the eight methods because the gross outliers are effectively removed in its first step. However, AKSWH2 is much faster than KF and J-linkage (about 40 times faster than KF and 5.5 times faster than J-linkage). AKSWH2 is faster than AKSWH1 (about 3.4 times faster) because AKSWH2 uses a more effective weighting function, resulting in fewer selected significant hypotheses (376) than AKSWH1. RHT is faster than AKSWH2 (but has a larger fitting error) because RHT uses several additional user-specified thresholds to stop the random sampling procedure. MS and RHA respectively fit one/two lines but fail to fit the other lines.

TABLE 2
THE FITTING ERRORS IN PARAMETER ESTIMATION (AND THE CPU TIME IN SECONDS) FOR THE 3-LINE, 4-LINE, 5-LINE AND 6-LINE DATA.
(M1: RHT; M2: ASC; M3: MS; M4: RHA; M5: J-linkage; M6: KF; M7: AKSWH1; M8: AKSWH2.)
THE APPROACHES WERE RUN ON A LAPTOP WITH AN INTEL i7 2.66 GHZ CPU ON THE WINDOWS 7 PLATFORM.

For the four-line data, AKSWH1/2, ASC and KF correctly fit the four lines; however, AKSWH2 is significantly faster than KF and ASC. In contrast, RHT and J-linkage correctly fit three lines but wrongly fit one. MS and RHA correctly fit two/one lines but fail to fit two/three, respectively. For the five-line data, only AKSWH1/2 and ASC correctly fit all five lines, while ASC achieves relatively higher fitting errors than AKSWH1/2. J-linkage and KF correctly fit four of the five lines, while RHT and RHA fit three lines but wrongly fit two. MS only fits two lines correctly. In fitting the six lines, AKSWH1/2, ASC and KF correctly fit the six lines. We note that KF costs much more CPU time in fitting the six-line data than the three-line data (around 4 times slower). This is because KF obtains more clusters for the six-line data (30 clusters) than for the three-line data (10 clusters) at the spectral clustering step, and most of the CPU time is spent on merging clusters in the model selection step. ASC also uses more CPU time in fitting the six-line data than the three-line data. In contrast, AKSWH2 uses comparable CPU time for both types of data, and is about 100 times faster than KF and more than 3 times faster than ASC.

TABLE 3
THE FITTING ERRORS IN PARAMETER ESTIMATION.
             Inlier scale        Outlier percentage   Cardinality ratio
             Std.Var.  Max.Err.  Std.Var.  Max.Err.   Std.Var.  Max.Err.
RHT          4.75      36.       0.36      3.         .90       6.7
ASC          0.06      3.0       0.0       6.3        .30       8.5
MS           5.03      46.3      7.70      49.6       3.85      3.8
RHA          9.3       333       7.0       334        4.3       43
J-linkage    5.60      4.0       .06       45.6       5.95      .6
KF           0.9       8.4       0.        6.4        .07       0.4
AKSWH1       0.04      .88       0.09      6.0        0.0       0.86
AKSWH2       0.06      .75       0.09      6.09       0.03      0.86

We also evaluate the performance of the eight approaches under different inlier scales, outlier percentages, and cardinality ratios of the inliers belonging to each model instance. For the following experiments, we use three-line data similar to that used in Fig. 9. Changes in inlier scales: we generate the three-line data with an 80% outlier percentage; the inlier scale of each line is slowly increased from 0.1 to 3.0.
Changes in outlier percentages: we draw the breakdown plot to test the influence of the outlier percentage on the performance of the

(a) (b) RHT (c) ASC (d) MS (e) RHA (f) J-linkage (g) KF (h) AKSWH2
Figure 11. Examples for circle fitting. The 1st ("cups") and 2nd ("coins") rows respectively fit four and six circles. (a) The original images; (b) to (h) the results obtained by RHT, ASC, MS, RHA, J-linkage, KF and AKSWH2, respectively.

(a) (b) RHT (c) ASC (d) MS (e) RHA (f) J-linkage (g) KF (h) AKSWH2
Figure 12. Examples for range image segmentation. The 1st row ("five planes", 084 data points) and 2nd row ("block", 069 data points) fit five planes. (a) The original images; (b) to (h) the segmentation results obtained by RHT, ASC, MS, RHA, J-linkage, KF and AKSWH2, respectively.

eight approaches. We gradually decrease the inlier number of each line equally while we increase the number of randomly distributed outliers. The inlier scale of each line is set to 1.0. Changes in cardinality ratios between the inliers of each line: the inlier numbers of the three lines are equal to each other at the beginning, and the inlier scales of all lines are 1.0. We gradually reduce the inlier number of one line while we increase the number of data points belonging to the other lines; thus, the relative cardinality ratio between the lines increases gradually.

We repeat each experiment twenty times and report both the average results (in Fig. 9) and the averaged standard variances and worst results of the fitting errors (in TABLE 3). As shown in Fig. 9 and TABLE 3, AKSWH1/2 are robust to the influence of the inlier scale, the outlier percentage and the inlier cardinality ratio, and both achieve the most accurate results among the competing approaches. In contrast, RHA does not achieve good results in any of the three experiments. MS does not work well in the experiments of Figs. 9(b) and (c), but it works reasonably well in Fig. 9(a) when the inlier scale is less than 0.3. Among the four approaches RHT, ASC, J-linkage and KF: in Fig.
9(a), RHT and J-linkage respectively break down when the inlier scale is larger than 1.4 and 0.8; in contrast, ASC and KF achieve relatively good performance when ASC is given the number of model instances and the step-size in KF is adjusted manually. Fig. 9(b) shows that J-linkage and RHT begin to break down when the outlier percentage is larger than 84% and 89% respectively, and both achieve better results than MS and RHA but worse results than ASC and KF. Fig. 9(c) shows that ASC and KF work well when the relative cardinality ratio of inliers is small, but they begin to break down as the inlier cardinality ratio grows large. J-linkage and RHT achieve worse results than ASC and KF. It is worth noting that when we fix the step-size h to 50 (as specified in [6]), KF wrongly estimates the number of model instances in about 10% of the experiments in Fig. 9(a), 1.3% in Fig. 9(b) and 5.3% in Fig. 9(c); when the inlier cardinality ratio of the model instances in Fig. 9(c) grows larger still, KF wrongly estimates the number of model instances in almost all the experiments. In contrast, AKSWH2 with a fixed K value correctly finds the number of model instances in all the experiments in Fig. 9.

5.2 Experiments with Real Images

5.2.1 Line Fitting

We test the performance of the eight approaches using real images for line fitting (shown in Fig. 10). For the "tennis court" image, which includes six lines, there are 460 edge points detected by the Canny operator [44]. As KF consumes a large amount of memory in calculating the kernel matrix and in spectral clustering, we randomly resample 1000 data points for "tennis court" and test KF on the resampled data. As shown in Fig. 10 and TABLE 4, AKSWH1/2 and J-linkage correctly fit all six lines; moreover, AKSWH1/2 are much faster than J-linkage and KF. Among the other competing approaches, MS fits two lines but fails on four; RHT, ASC and KF succeed in fitting five of the six lines, while RHA correctly fits only four lines.
We note that ASC fits two estimated lines to the same real line, and so does KF (as indicated by the green arrows in Fig. 10). This does not occur with AKSWH1/2, because MIT is used to effectively fuse over-clustered modes belonging to the same model instance. For the "tracks" image, which includes seven lines (with the edge points again detected by the Canny operator), only KF and AKSWH1/2 successfully fit all seven lines in this challenging case. AKSWH2 (4.3 seconds) is about 40 times faster than KF and

To make it easy for readers to test and compare their approaches with the competing approaches in this paper, we have put the data used in the paper (for line fitting, circle fitting, homography-based segmentation and fundamental-matrix-based segmentation) and the corresponding ground truth (for homography-based segmentation and fundamental-matrix-based segmentation) online: http://cs.adelaide.edu.au/~dsuter/code_and_data/

5.8 times faster than AKSWH1 (see TABLE 4). In contrast, both ASC and J-linkage correctly fit four lines but fail on three. The numbers of lines successfully fitted by RHT, MS and RHA are respectively one, two and three.

5.2.2 Circle Fitting

In Fig. 11, we fit the rims of the four cups and of the six coins. For the "cups" image, there are 634 Canny edge points detected; the inliers corresponding to each of the four cups are about 7% to 10% of the data points. As shown in Fig. 11, AKSWH1/2, RHT, ASC, J-linkage and KF achieve accurate fitting results for all four cups. However, AKSWH2 is much faster than ASC, J-linkage and KF (respectively, about 5, 6 and 5 times faster; see TABLE 4) and requires fewer user-specified thresholds. MS correctly fits the circle rims of three cups but fails on one; RHA fits only one circle rim and fails on the other three. For the "coins" image, there are 68 edge points. As shown in Fig. 11 and TABLE 4, AKSWH1/2, RHT, ASC and KF correctly fit the six rims, while AKSWH2 is 30 times faster than KF and 6.5 times faster than ASC. Although RHT is about 3 times faster than AKSWH2, it requires many user-specified thresholds. For J-linkage, there are two estimated model instances corresponding to one coin (indicated by the green arrow); the problem is due to the clustering criterion used in J-linkage. MS correctly fits five rims of the coins but wrongly fits one; RHA successfully fits only one rim.

TABLE 4
THE CPU TIME USED BY THE APPROACHES (IN SECONDS).
          M1     M2    M3    M4    M5    M6      M7    M8
court     0.74   8.3   7.98  5.    73    4600*   .     0.8
tracks    0.85   9.9   6.57  .5    3.    67      4.5   4.3
cups      .9     4.9   .     0.8   0.    5.4     9.86  3.7
coins     .5     6.7   3.9   .5    8.8   7       6.59  4.0
5planes   3.40   64    0.    9.6   95*   593*    .9    5.
blocks    3.     64    9.7   7.0   30*   3445*   .3    9.8
(M1: RHT; M2: ASC; M3: MS; M4: RHA; M5: J-linkage; M6: KF; M7: AKSWH1; M8: AKSWH2. * means the approach uses the re-sampled data points.)

5.2.3 Plane Fitting

Range data usually contain multiple structures, but they do not have many random outliers.
However, the inliers of one structure are pseudo-outliers to the other structures; thus the task of range data segmentation requires an approach to be robust. To make the experimentation feasible, we use 1000 resampled data points for KF and 5000 resampled data points for J-linkage (these approaches are challenged by large numbers of data points). We model plane segmentation as a standard planar regression problem for calculating the residuals. As shown in Fig. 12, RHA achieves the worst results. J-linkage and KF correctly fit all the planar surfaces for the "five planes" data but wrongly fit one plane for the "block" range data. The other five approaches correctly fit all five planes in both cases, but the results of RHT and ASC (due to under-estimated inlier scales) are less accurate than those of MS and AKSWH1/2. MS succeeds in both cases; one crucial reason is that the inlier scales of the planes in the two range images are very small (about 6x10^-4 for "five planes" and 8x10^-3 for "block"). This is consistent with the previous analysis (Fig. 9(a)) that MS works better when the inlier scale is smaller. AKSWH1/2 achieve better results than the other approaches, and AKSWH1/2 can estimate the number of planes automatically. Regarding computational efficiency, AKSWH1/2 are much faster than J-linkage (by more than one order of magnitude) and KF (by more than two orders of magnitude) in dealing with a large number of data points, and considerably faster than ASC in dealing with data involving multiple structures (see TABLE 4). The computational times of the eight approaches for the line/circle/plane fitting experiments described in Secs. 5.2.1 to 5.2.3 are summarized in TABLE 4.

The "coins" image is taken from http://www.murderousmaths.co.uk.

5.2.4 Homography Based Segmentation

We use the approach mentioned in Sec. 4 to obtain homography hypotheses and compute the residuals of the SIFT point correspondences; see Fig. 13.
To quantitatively compare the results of AKSWH1/2 with those of the competing approaches, we manually select a set of inliers for each model instance and use the chosen inliers to estimate the model parameters, which are used as the ground truth (GT) model parameters. All the true inliers belonging to a model instance can then be derived from the GT parameters (for the next section we also perform a similar manual ground truth extraction). The comparisons are shown in TABLE 5. For each dataset, we also report the total number (TN) of data points, the minimum inlier number (MinN) and the maximum inlier number (MaxN) corresponding to the model instances, from which the outlier percentage and the relative inlier cardinality ratio between the model instances can be calculated.

From Fig. 13 and TABLE 5 we can see that MS achieves the worst results and fails on all five datasets. RHA succeeds in fitting one dataset with less accuracy but fails on four. RHT and KF succeed on four datasets but fail on one. Only ASC (given the number of model instances), J-linkage (given the inlier scale and a threshold for the number of model instances) and AKSWH1/2 correctly fit and segment all the datasets. Regarding computational efficiency, we note that when the data contain fewer data points and fewer model instances, e.g., for MC2, AKSWH2 takes similar computational time to J-linkage, while AKSWH2 is about 80% faster than ASC and more than one order of magnitude faster than KF. In contrast, when the data involve more data points and more model instances, such as the 5B data, AKSWH2 is much faster (6.9, 6.5 and 375 times faster, respectively) than ASC, J-linkage and KF. As for the scale estimation errors obtained by AKSWH2, ASC and KF, we do not consider MH because KF breaks down for this case. The mean scale errors for the other four datasets obtained by AKSWH2, ASC and KF are respectively 0.8, 0.9 and 3.80: AKSWH2 achieves the highest accuracy in estimating the inlier scales.

5.2.5 Fundamental Matrix Based Segmentation

In Fig.
14, we give examples showing the ability of AKSWH2 in two-view multiple-body motion segmentation using the fundamental matrix model. We employ the seven-point algorithm [10] to obtain the fundamental matrix hypotheses and compute the residuals of the SIFT point correspondences using the Sampson distance [10]. From TABLE 6 we can see that MS and RHA fail to fit both datasets. RHT and J-linkage achieve better results than MS and RHA for the BC data, but both approaches break down for the BCD data. Only ASC, KF and AKSWH1/2 correctly fit both datasets, while AKSWH2 achieves more accurate results than ASC and KF. The mean scale errors obtained by ASC, KF and AKSWH2 are respectively 1.5, 0.74 and 0.87, from which we can see that KF achieves slightly more accurate results than

AKSWH in scale estimation, but AKSWH is about one order of magnitude faster than KF (9.7 and 8.4 times faster for BC and BCD, respectively).

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, MANUSCRIPT ID: TPAMI-00-07-0509
H. WANG ET AL.: SIMULTANEOUSLY FITTING AND SEGMENTING MULTIPLE-STRUCTURE DATA WITH OUTLIERS

Figure 3. More examples showing the ability of the eight approaches in estimating homographies and segmenting multiple-structure data, for the image pairs Model House (referred to as MH), four books (referred to as 4B) and five books (referred to as 5B). (a1), (b1) and (c1) show the left images with the ground truth segmentation superimposed; each model instance is in one color, and the yellow dots are the outliers. (a2), (b2) and (c2) show the right images with the disparities of the corresponding points superimposed. (d)-(j) show the segmentation results obtained by RHT, ASKC, MS, RHA, J-linkage, KF and AKSWH, respectively.

TABLE 5. THE FITTING ERRORS OBTAINED BY THE EIGHT APPROACHES AND THE CPU TIME USED (IN SECONDS). Each row (MC2, MC3, MH, 4B, 5B) reports MinN, MaxN and TN, and for each of M1-M8 the fitting error with the CPU time in parentheses. (M1: RHT; M2: ASKC; M3: MS; M4: RHA; M5: J-linkage; M6: KF; M7: AKSWH1; M8: AKSWH2. Images MC2, MC3 and MH are taken from http://www.robots.ox.ac.uk/~vgg/data.) [Table body not reproduced.]

TABLE 6. THE FITTING ERRORS OBTAINED BY THE EIGHT APPROACHES AND THE CPU TIME USED (IN SECONDS). Each row (BC, BCD) reports MinN, MaxN and TN, and for each of M1-M8 the fitting error with the CPU time in parentheses. (M1: RHT; M2: ASKC; M3: MS; M4: RHA; M5: J-linkage; M6: KF; M7: AKSWH1; M8: AKSWH2.) [Table body not reproduced.]

Figure 4. The segmentation results obtained by AKSWH for the image pairs Box-Car (referred to as BC) and Box-Car-Dinosaur (referred to as BCD). (a) shows the left image with the ground truth segmentation superimposed. (b) shows the right image with the disparities of the corresponding points superimposed. (c) shows the results obtained by AKSWH.

6 CONCLUSION

We have tackled three important problems in robust model fitting: estimating the number of model instances, estimating the parameters, and estimating the inlier scale of each model instance. At the heart of our solution are a novel scale estimator, IKOSE, and an effective robust fitting framework. The overall framework, AKSWH, uses the accurate scale estimates of IKOSE to weight each sampled hypothesis. This allows us to gain efficiency through early elimination of weak hypotheses corresponding to bad p-subsets (those that may involve outliers). We demonstrate those efficiency gains by conducting experiments with two variants of our approach (AKSWH1/2) and comparing their run-times with competing approaches. AKSWH2 represents a version of our approach that is similar to putting RANSAC (with a known, correct error tolerance) into our framework, in place of our enhanced method of scale estimation and hypothesis scoring. The efficiency gains are easily seen, as AKSWH2 is almost always faster than AKSWH1 (and most of the competitors). Both AKSWH1/2 generally outperform the competitors in terms of robustness and accuracy; this attests to the effectiveness of the rest of our framework. Most remarkably, our approach is essentially free of any user choice of parameter save one, related to the inlier percentage of the smallest structure likely to be resolved by our approach. When we do not know the inlier percentage, we fix the value in AKSWH to a small number (10% in our case) to avoid breakdown. Actually, this is not such an unusual assumption, as the inlier percentage has usually either been assumed known, or fixed to some low value, and then used to determine the number of required p-subsets in the many robust approaches that employ the popular random sampling scheme. We acknowledge that in some cases (probably most) the user does not know the outlier percentage in any reasonably accurate or precise form. This issue is related to the question of how big the recovered populations/structures should be, and it is compounded by other aspects related to structural scale (e.g., how close two planes are in fitting, for example, planar parts of a wall with small recesses, etc.). As stated in Sec. 1, we do not claim to have proposed an automatic system which solves all the problems of recovering structure from image data: this is impossible, because how many structures there are in a scene is usually a question without a unique answer. However, the number of structures can be determined when we assume that all the structures of interest are relatively large and have roughly comparable size (we ignore structures with very few supporting data points). In our approach, such an assumption is implicit in the use of a conservative K (in this paper, we fix the value to 10% of the data points for all the experiments). This can be thought of as (roughly) saying that we are not interested in finer-scale structure that makes up less than 10% of the data points. Our experiments show that the proposed method, whilst making no claims of optimality, is effective. Of course, our approach is not without drawbacks and possible improvements. For example, guided sampling approaches may improve the efficiency. Likewise, though we have demonstrated superiority over the comparator methods as described in their published forms, that does not preclude improvements to those approaches.
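As noted above, an assumed minimum inlier percentage directly determines the number of required p-subsets under the popular random sampling scheme. The following sketch illustrates that standard bound; the function name and the 0.99 confidence level are our own illustrative choices, not values from the paper.

```python
import math

def required_psubsets(inlier_frac, p, confidence=0.99):
    """Number of random p-subsets needed so that, with probability
    `confidence`, at least one subset is all-inlier for a structure
    containing a fraction `inlier_frac` of the data points
    (the standard random-sampling bound, cf. Fischler & Bolles [9])."""
    prob_clean = inlier_frac ** p            # one p-subset is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prob_clean))

# With a conservative 10% minimum inlier fraction, a homography
# hypothesis (p = 4) needs tens of thousands of subsets, whereas a
# structure with 50% inliers needs under a hundred.
print(required_psubsets(0.10, 4))
print(required_psubsets(0.50, 4))
```

The contrast between the two calls shows why the assumed percentage is the one parameter that matters: lowering it from 50% to 10% inflates the sample budget by orders of magnitude.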
Finally, it must also be acknowledged that one is not always strictly comparing apples with apples and oranges with oranges. For example, many techniques in this area were developed under the assumption that there is a dominant single structure, and thus were never explicitly designed to recover multiple structures, though others have tried to extend such methods to do so via fit-and-remove iterations, etc. When one has only a single dominant structure, or perhaps knows the scale of the noise, other methods still suffice; even so, our method still generally performs well, albeit more slowly. Compared to the more recent methods of ASKC [17, 29], J-Linkage [25] and KF [26], the advantage of our method can be understood as follows: ASKC is not developed for multi-structure fitting, and thus may suffer from the problems of sequential fit-and-remove. J-Linkage requires a user-given global inlier threshold (which may be difficult to determine), and thus assumes similar noise scales across all structures. Finally, as our experiments show, KF does not tolerate very different inlier population sizes well. In contrast, for the reasons already described above (i.e., an accurate scale estimator, and parallel fitting via hypothesis weighting and fusion), AKSWH is significantly more tolerant to the issues faced by the other methods.

ACKNOWLEDGEMENTS

The authors would like to thank the reviewers for their valuable comments, which greatly helped to improve the paper. Parts of our code are developed based on the Structure and Motion Toolkit provided by Prof. P. Torr. We also thank Dr. R. Toldo and Prof. A. Fusiello for sharing the code of J-linkage, and Dr. R. Subbarao and Prof. P. Meer for sharing the code of MS. We thank Prof. A. Bab-Hadiashar for providing some range image data. This work was supported by the Australian Research Council (ARC) under project DP087880, by the National Natural Science Foundation of China under project 67079, and by the Xiamen Science & Technology Planning Project Fund (350Z06005) of China. Most of this work was completed when the first author was at the University of Adelaide.
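For completeness, the Sampson distance used to score correspondences against fundamental-matrix hypotheses in Sec. 5.2.5 is straightforward to vectorize. The sketch below is our own illustration of the standard first-order geometric error (cf. Hartley and Zisserman [43]); the function name and the test matrices are not from the paper's code.

```python
import numpy as np

def sampson_distances(F, x1, x2):
    """Sampson (first-order geometric) error of point correspondences
    x1[i] <-> x2[i] under a fundamental matrix F.
    x1, x2: (N, 2) arrays of matched image coordinates."""
    n = x1.shape[0]
    h1 = np.column_stack([x1, np.ones(n)])   # homogeneous coordinates
    h2 = np.column_stack([x2, np.ones(n)])
    Fx1 = h1 @ F.T        # row i is F @ h1[i]   (epipolar line in image 2)
    Ftx2 = h2 @ F         # row i is F.T @ h2[i] (epipolar line in image 1)
    algebraic = np.sum(h2 * Fx1, axis=1)     # h2[i]^T F h1[i]
    denom = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return algebraic**2 / denom

# Sanity check with a pure x-translation camera motion, F = [t]_x:
# the Sampson distance then reduces to (y - y')^2 / 2.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
x1 = np.array([[0.0, 0.0], [1.0, 2.0]])
x2 = np.array([[5.0, 0.0], [3.0, 2.0]])   # same y: perfect correspondences
print(sampson_distances(F, x1, x2))       # -> [0. 0.]
```

With real data, x1 and x2 would be the SIFT correspondences and F a hypothesis generated by the seven-point algorithm; the resulting residuals are what the scale estimator operates on.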
REFERENCES

1. Lowe, D.G., Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004. 60(2): 91-110.
2. Wang, H., D. Suter, Robust Adaptive-Scale Parametric Model Estimation for Computer Vision. IEEE Trans. Pattern Analysis and Machine Intelligence, 2004. 26(11): 1459-1474.
3. Stewart, C.V., Bias in Robust Estimation Caused by Discontinuities and Multiple Structures. IEEE Trans. Pattern Analysis and Machine Intelligence, 1997. 19(8): 818-833.
4. Huber, P.J., Robust Statistics. New York: Wiley, 1981.
5. Rousseeuw, P.J., Least Median of Squares Regression. Journal of the American Statistical Association, 1984. 79: 871-880.
6. Rousseeuw, P.J., A. Leroy, Robust Regression and Outlier Detection. John Wiley & Sons, New York, 1987.
7. Hough, P.V.C., Methods and means for recognizing complex patterns, U.S. Patent 3,069,654. 1962.
8. Xu, L., E. Oja, P. Kultanen, A New Curve Detection Method: Randomized Hough Transform (RHT). Pattern Recognition Letters, 1990. 11(5): 331-338.
9. Fischler, M.A., R.C. Bolles, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM, 1981. 24(6): 381-395.
10. Torr, P., D. Murray, The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix. International Journal of Computer Vision, 1997. 24(3): 271-300.
11. Miller, J.V., C.V. Stewart, MUSE: Robust Surface Fitting Using Unbiased Scale Estimates. IEEE Computer Vision and Pattern Recognition, 1996: 300-306.
12. Stewart, C.V., MINPRAN: A New Robust Estimator for Computer Vision. IEEE Trans. Pattern Analysis and Machine Intelligence, 1995. 17(10): 925-938.
13. Lee, K.-M., P. Meer, R.-H. Park, Robust Adaptive Segmentation of Range Images. IEEE Trans. Pattern Analysis and Machine Intelligence, 1998. 20(2): 200-205.
14. Yu, X., T.D. Bui, A. Krzyzak, Robust Estimation for Range Image Segmentation and Reconstruction. IEEE Trans. Pattern Analysis and Machine Intelligence, 1994. 16(5): 530-538.
15. Chen, H., P. Meer, Robust Regression with Projection Based M-estimators. IEEE International Conference on Computer Vision, 2003: 878-885.
16.
Hoseinnezhad, R., A. Bab-Hadiashar, A Novel High Breakdown M-estimator for Visual Data Segmentation. IEEE International Conference on Computer Vision, 2007: 1-6.
17. Wang, H., D. Mirota, M. Ishii, et al., Robust Motion Estimation and Structure Recovery from Endoscopic Image Sequences With an Adaptive Scale Kernel Consensus Estimator. IEEE Computer Vision and Pattern Recognition, 2008.
18. Zuliani, M., C.S. Kenney, B.S. Manjunath, The multiRANSAC Algorithm and Its Application to Detect Planar Homographies. IEEE International Conference on Image Processing, 2005: 153-156.
19. Vidal, R., Y. Ma, S. Sastry, Generalized Principal Component Analysis (GPCA). IEEE Trans. Pattern Analysis and Machine Intelligence, 2005. 27(12): 1945-1959.
20. Fukunaga, K., L.D. Hostetler, The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition. IEEE Trans. Information Theory, 1975. IT-21: 32-40.
21. Comaniciu, D., P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis. IEEE Trans. Pattern Analysis and Machine Intelligence, 2002. 24(5): 603-619.
22. Tuzel, O., R. Subbarao, P. Meer, Simultaneous Multiple 3D Motion Estimation via Mode Finding on Lie Groups. IEEE International Conference on Computer Vision, 2005: 18-25.
23. Subbarao, R., P. Meer, Nonlinear Mean Shift over Riemannian Manifolds. International Journal of Computer Vision, 2009. 84(1): 1-20.
24. Zhang, W., J. Kosecka, Nonparametric Estimation of Multiple Structures with Outliers. LNCS (Workshop on Dynamical Vision), 2006. 4358: 60-74.
25. Toldo, R., A. Fusiello, Robust Multiple Structures Estimation with J-Linkage. European Conference on Computer Vision, 2008: 537-547.
26. Chin, T.-J., H. Wang, D. Suter, Robust Fitting of Multiple Structures: The Statistical Learning Approach. IEEE International Conference on Computer Vision, 2009: 413-420.
27. Weiss, Y., E.H. Adelson, A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of

Models. IEEE Computer Vision and Pattern Recognition, 1996: 321-326.
28. Kanatani, K., C. Matsunaga, Estimating the Number of Independent Motions for Multibody Motion Segmentation. Asian Conference on Computer Vision, 2002: 7-12.
29. Wang, H., D. Mirota, G. Hager, A Generalized Kernel Consensus Based Robust Estimator. IEEE Trans. Pattern Analysis and Machine Intelligence, 2010. 32(1): 178-184.
30. Wang, H., D. Suter, Robust Fitting by Adaptive-Scale Residual Consensus. European Conference on Computer Vision, 2004: 107-118.
31. Bab-Hadiashar, A., D. Suter, Robust segmentation of visual data using ranked unbiased scale estimate. Robotica: International Journal of Information, Education and Research in Robotics and Artificial Intelligence, 1999. 17: 649-660.
32. Torr, P., A. Zisserman, MLESAC: A New Robust Estimator With Application to Estimating Image Geometry. Computer Vision and Image Understanding, 2000. 78(1): 138-156.
33. Tordoff, B., D.W. Murray, Guided Sampling and Consensus for Motion Estimation. European Conference on Computer Vision, 2002: 82-96.
34. Chum, O., J. Matas, Matching with PROSAC: Progressive Sample Consensus. IEEE Computer Vision and Pattern Recognition, 2005: 220-226.
35. Wand, M.P., M. Jones, Kernel Smoothing. Chapman & Hall, 1995.
36. Silverman, B.W., Density Estimation for Statistics and Data Analysis. London: Chapman and Hall, 1986.
37. Sezgin, M., B. Sankur, Survey Over Image Thresholding Techniques and Quantitative Performance Evaluation. Journal of Electronic Imaging, 2004. 13(1): 146-165.
38. Ferraz, L., R. Felip, B. Martínez, et al., A Density-Based Data Reduction Algorithm for Robust Estimators. Lecture Notes in Computer Science, 2007: 355-362.
39. Duda, R.O., P.E. Hart, Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.
40. Shannon, C.E., The Mathematical Theory of Communication. Bell System Technical Journal, 1948. 27: 379-423.
41. Yang, Z.R., M. Zwolinski, Mutual Information Theory for Adaptive Mixture Models.
IEEE Trans. Pattern Analysis and Machine Intelligence, 2001. 23(4): 396-403.
42. Lee, Y., K.Y. Lee, J. Lee, The Estimating Optimal Number of Gaussian Mixtures Based on Incremental k-means for Speaker Identification. International Journal of Information Technology, 2006. 12(7): 13-21.
43. Hartley, R., A. Zisserman, Multiple View Geometry in Computer Vision. 2nd ed. Cambridge University Press, 2004.
44. Canny, J.F., A Computational Approach to Edge Detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 1986. 8: 679-698.

Tat-Jun Chin received a B.Eng. in Mechatronics Engineering from Universiti Teknologi Malaysia (UTM) in 2003, and subsequently, in 2007, a Ph.D. in Computer Systems Engineering from Monash University, Victoria, Australia. He was a Research Fellow at the Institute for Infocomm Research in Singapore from 2007 to 2008. Since 2008 he has been a Lecturer at The University of Adelaide, Australia. His research interests include robust estimation and statistical learning methods in computer vision.

David Suter received a B.Sc. degree in applied mathematics and physics (The Flinders University of South Australia, 1977), a Grad. Dip. Comp. (Royal Melbourne Institute of Technology, 1984), and a Ph.D. in computer science (La Trobe University, 1991). He was a Lecturer at La Trobe from 1988 to 1991, and a Senior Lecturer (1992), Associate Professor (2001), and Professor (2006-2008) at Monash University, Melbourne, Australia. Since 2008 he has been a professor in the School of Computer Science, The University of Adelaide, where he is Head of School. He served on the Australian Research Council (ARC) College of Experts from 2008 to 2010. He is on the editorial board of the International Journal of Computer Vision, and has previously served on the editorial boards of Machine Vision and Applications and the International Journal of Image and Graphics. He was General co-chair of the Asian Conference on Computer Vision (Melbourne, 2002) and is currently co-chair of the IEEE International Conference on Image Processing (ICIP 2013).
Hanzi Wang is currently a Distinguished Professor and Minjiang Scholar at Xiamen University, China, and an Adjunct Professor at the University of Adelaide, Australia. He was a Senior Research Fellow (2008-2010) at the University of Adelaide, Australia; an Assistant Research Scientist (2007-2008) and a Postdoctoral Fellow (2006-2007) at the Johns Hopkins University; and a Research Fellow at Monash University, Australia (2004-2006). He received the Ph.D. degree in Computer Vision from Monash University, Australia, where he was awarded the Douglas Lampard Electrical Engineering Research Prize and Medal for the best Ph.D. thesis in the Department. His research interests are concentrated on computer vision and pattern recognition, including visual tracking, robust statistics, model fitting, object detection, video segmentation, and related fields. He has published around 50 papers in major international journals and conferences, including the IEEE Transactions on Pattern Analysis and Machine Intelligence, the International Journal of Computer Vision, ICCV, CVPR, ECCV, NIPS, and MICCAI. He is an Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT) and was a Guest Editor of Pattern Recognition Letters (September 2009). He is a Senior Member of the IEEE. He has served as a reviewer for more than 30 journals and conferences.