Robust Statistics for Computer Vision: Model Fitting, Image Segmentation and Visual Motion Analysis


Robust Statistics for Computer Vision: Model Fitting, Image Segmentation and Visual Motion Analysis

by

Hanzi Wang
B.S. (Sichuan University) 1996
M.S. (Sichuan University) 1999

A thesis submitted for the degree of Doctor of Philosophy
in the Department of Electrical and Computer Systems Engineering
Monash University
Clayton, Victoria 3800, Australia

February 2004

Robust Statistics for Computer Vision: Model Fitting, Image Segmentation and Visual Motion Analysis
Copyright 2004 by Hanzi Wang
All Rights Reserved

Contents

List of Figures
List of Tables
Summary
Declaration
Preface
Acknowledgements
Dedication

1. Introduction
   Background and Motivation
   Thesis Outline

2. Model-Based Robust Methods: A Review
   The Least Squares (LS) Method
   Outliers and Breakdown Point
   Traditional Robust Estimators from Statistics
      M-Estimators and GM-Estimators
      The Repeated Median (RM) Estimator
      The Least Median of Squares (LMedS) Estimator
      The Least Trimmed Squares (LTS) Estimator
   Robust Estimators Developed within the Computer Vision Community
      Breakdown Point in Computer Vision
      Hough Transform (HT) Estimator
      Random Sample Consensus (RANSAC) Estimator
      Minimize the Probability of Randomness (MINPRAN) Estimator
      Minimum Unbiased Scale Estimator (MUSE) and Adaptive Least k-th Order Squares (ALKS) Estimator
      Residual Consensus (RESC) Estimator
   Conclusion

3. Using Symmetry in Robust Model Fitting
   Introduction
   Dilemma of the LMedS and the LTS in the Presence of Clustered Outliers
   The Symmetry Distance
      Definition of Symmetry
      The Symmetry Distance
   The Least Trimmed Symmetry Distance (LTSD) Method
   Experimental Results
      Circle Fitting
      Ellipse Fitting
      Experiments with Real Images
      Experiments for the Data with Uniform Outliers
   Conclusion

4. MDPE: A Novel and Highly Robust Estimator
   Introduction
   Nonparametric Density Gradient Estimation and Mean Shift Method
   Maximum Density Power Estimator (MDPE)
      The Density Power (DP)
      The MDPE Algorithm
   Experiments and Analysis
      Experiment
         Line Fitting
         Circle Fitting
         Time Complexity
      Experiment
      Experiment
      Experiment
      The Influence of the Window Radius and the Percentage of Outliers on MDPE
      The Influence of the Choice of Error Tolerance on RANSAC
      The Relationship between the Noise Level of Signal and the Choice of Window Radius for MDPE
      Experiments on Real Images
   Conclusion

5. A Novel Model-Based Algorithm for Range Image Segmentation
   Introduction
   A Review of Several State-of-the-Art Methods for Range Image Segmentation
      The USF Range Segmentation Algorithm
      The WSU Range Segmentation Algorithm
      The UB Range Segmentation Algorithm
      The UE Range Segmentation Algorithm
   Towards a Model-Based Range Image Segmentation Method
   A Quick Version of the MDPE: QMDPE
      QMDPE
      The Breakdown Plot of QMDPE
      The Time Complexity of QMDPE
      The Influence of Window Radius on the Results of QMDPE
   Applying QMDPE to Range Image Segmentation
      From Estimator to Segmenter
      A New and Efficient Model-Based Algorithm for Range Image Segmentation
   Experiments in Range Image Segmentation
   Conclusion

6. Variable-Bandwidth QMDPE for Robust Optical Flow Calculation
   Introduction
   Optical Flow Computation
   From QMDPE to vbQMDPE
      Bandwidth Choice
      The Algorithm of the Variable Bandwidth QMDPE
      Performance of vbQMDPE
   vbQMDPE and Optical Flow Calculation
      Variable-Bandwidth-QMDPE Optical Flow Computation
      Quantitative Error Measures for Optical Flow
      Experimental Results on Optical Flow Calculation
   Conclusion

7. A Highly Robust Scale Estimator for Heavily Contaminated Data
   Introduction
   Robust Scale Estimators
      The Median and Median Absolute Deviation (MAD) Scale Estimator
      Adaptive Least K-th Squares (ALKS) Scale Estimator
      Residual Consensus (RESC) Scale Estimator
      Modified Selective Statistical Estimator (MSSE)
   A Novel Robust Scale Estimator: TSSE
      Mean Shift Valley Algorithm
      Two-Step Scale Estimator (TSSE)
   Experiments on Robust Scale Estimation
      Normal Distribution
      Two-mode Distribution
      Two-mode Distribution with Random Outliers
   Breakdown Plot
      A Roof Signal
      A Step Signal
      Breakdown Plot for Robust Scale Estimator
      Performance of TSSE
   Conclusions

8. Robust Adaptive-Scale Parametric Model Estimation for Computer Vision
   Introduction
   Adaptive Scale Sample Consensus (ASSC) Estimator
      Algorithm
   Experiments with Data Containing Multiple Structures
      2D Examples
      3D Examples
      The Breakdown Plot of the Four Methods
      Influence of the Noise Level of Inliers on the Results of Robust Fitting
      Influence of the Relative Height of Discontinuous Signals
   ASSC for Range Image Segmentation
      The Algorithm of ASSC-Based Range Image Segmentation
      Experiments on Range Image Segmentation
   ASSC for Fundamental Matrix Estimation
      Background of Fundamental Matrix Estimation
      The Experiments on Fundamental Matrix Estimation
   A Modified ASSC (ASRC)
      Adaptive-Scale Residual Consensus (ASRC)
      Experiments
   Conclusion

9. Mean Shift for Image Segmentation by Pixel Intensity or Pixel Color
   Introduction
   False-Peak-Avoiding Mean Shift for Image Segmentation
      The Relationship between the Gray-Level Histogram of Image and the Mean Shift Method
      The False Peak Analysis
      An Unsupervised Peak-Valley Sliding Algorithm for Image Segmentation
      Experimental Results
   Color Image Segmentation Using Global Information and Local Homogeneity
      HSV Color Space
      Considering the Cyclic Property of the Hue Component in the Mean Shift Algorithm
      The Proposed Segmentation Method for Color Images
         Local Homogeneity
         Color Image Segmentation Method
      Experiments on Color Image Segmentation
   Conclusion

10. Conclusion and Future Work

Bibliography

List of Figures

Chapter 1. Introduction
Figure 1.1: The OLS estimator may break down
Figure 1.2: Examples of multiple structures

Chapter 2. Model-Based Robust Methods: A Review
Figure 2.1: The types of outliers

Chapter 3. Using Symmetry in Robust Model Fitting
Figure 3.1: An example where LMedS and LTS fail to fit a circle
Figure 3.2: LMedS searches for the best fit with the least median of residuals
Figure 3.3: Breakdown plot of LMedS and LTS
Figure 3.4: Four kinds of symmetries
Figure 3.5: Example of using the symmetry of the circle in the LTSD method
Figure 3.6: Comparative result of the LTSD, LMedS, and LTS in circle fitting
Figure 3.7: Comparison of the results of the LTSD, LTS and LMedS in ellipse fitting
Figure 3.8: Fitting a mouse pad by the LTSD, LTS and LMedS methods
Figure 3.9: Fitting the ellipse in a cup by the LTSD, LTS and LMedS methods
Figure 3.10: An ellipse with 40% randomly distributed outliers

Chapter 4. MDPE: A Novel and Highly Robust Estimator
Figure 4.1: One example of the mean shift estimator
Figure 4.2: Comparing the performance of six methods
Figure 4.3: One example of fitting circles by the six methods
Figure 4.4: Experiment fitting a line with clustered outliers
Figure 4.5: Breakdown plot for the six methods
Figure 4.6: The influence of window radius and percentage of outliers on the results of the MDPE
Figure 4.7: The influence of the choice of error bound on the results of RANSAC
Figure 4.8: The relationship between the noise level of signal and the choice of window radius in MDPE
Figure 4.9: Fitting a line by the six methods
Figure 4.10: Fitting a circle edge by the six methods

Chapter 5. A Novel Model-Based Algorithm for Range Image Segmentation
Figure 5.1: The simplified 3D recognition systems
Figure 5.2: Breakdown plot for the QMDPE method
Figure 5.3: The influence of window radius on the results of the QMDPE
Figure 5.4: The structure of the proposed range image segmentation algorithm
Figure 5.5: A comparison of using normal information or not using normal information
Figure 5.6: Segmentation of ABW range image (test.28)
Figure 5.7: Segmentation of ABW range image (test.27)
Figure 5.8: Segmentation of ABW range image (test.13)
Figure 5.9: Comparison of the segmentation results for ABW range image (test.1)
Figure 5.10: Comparison of the segmentation results for ABW range image (train 6)

Chapter 6. Variable-Bandwidth QMDPE for Robust Optical Flow Calculation
Figure 6.1: Comparing the performance of vbQMDPE, LS, LMedS, and LTS
Figure 6.2: One example of multiple motions
Figure 6.3: The snapshot of the three image sequences

Chapter 7. A Highly Robust Scale Estimator for Heavily Contaminated Data
Figure 7.1: An example of the application of the mean shift valley method
Figure 7.2: Breakdown plot of six methods in estimating the scale of a roof signal
Figure 7.3: Breakdown plot of six methods in estimating the scale of a step signal
Figure 7.4: Breakdown plot of different robust k scale estimators

Chapter 8. Robust Adaptive-Scale Parametric Model Estimation for Computer Vision
Figure 8.1: Comparing the performance of four methods
Figure 8.2: First experiment for 3D multiple-structure data
Figure 8.3: Second experiment for 3D multiple-structure data
Figure 8.4: Breakdown plot of the four methods
Figure 8.5: The influence of the noise level of inliers on the results
Figure 8.6: The influence of the relative height of discontinuous signals on the results of the four methods
Figure 8.7: Segmentation of ABW range images
Figure 8.8: Comparison of the segmentation results for ABW range image (test 3)
Figure 8.9: Comparison of the segmentation results for ABW range image (test 13)
Figure 8.10: A comparison of correctly identified percentage of inliers
Figure 8.11: Example of using ASSC to estimate the fundamental matrix
Figure 8.12: Comparing the performance of five methods
Figure 8.13: 3D example by the five methods

Chapter 9. Mean Shift for Image Segmentation by Pixel Intensity or Pixel Color
Figure 9.1: False peak noise
Figure 9.2: The segmentation results of the proposed method
Figure 9.3: The application of the proposed method on medical images
Figure 9.4: HSV color space
Figure 9.5: Example of using the proposed method to segment color image
Figure 9.6: Segmenting the Jelly beans color image
Figure 9.7: Segmenting the Splash color image

List of Tables

Table 3.1: Comparison of the estimated parameters by LTSD, LTS, and LMedS methods in ellipse fitting under 40% clustered outliers
Table 3.2: Comparison of the estimated parameters by the LTSD, LTS, and LMedS methods in ellipse fitting with 40% randomly distributed outliers
Table 4.1: The comparison of time complexity for the five methods (all time in seconds)
Table 5.1: The time complexity of QMDPE (in seconds)
Table 6.1: Comparative results on diverging tree
Table 6.2: Comparative results on Yosemite (cloud region excluded)
Table 6.3: Comparative results on Otte image sequences
Table 7.1: Applying the mean shift valley method to decompose data
Table 8.1: Result of the estimates of the parameters (A, B, C; σ) provided by each of the robust estimators applied to the data in Figure
Table 8.2: Result of the estimates of the parameters (A, B, C; σ) provided by each of the robust estimators applied to the data in Figure
Table 8.3: An experimental comparison for data with 60% outliers
Table 8.4: Experimental results on two frames of the Corridor sequence
Table 9.1: False peaks prediction

Summary

Robust statistical methods (such as LMedS and LTS) were first introduced in computer vision to improve the performance of feature extraction algorithms. One attractive feature of traditional robust statistical methods is that they can tolerate up to half of the data points not obeying the assumed model (i.e., they can be robust to up to 50% contamination). However, they can break down at unexpectedly lower percentages when the outliers are clustered; moreover, they cannot tolerate more than 50% outliers. This is because these methods measure only one single statistic: for example, the least median of residuals (for LMedS) or the least sum of trimmed squared residuals (for LTS), omitting other characteristics of the data. We realised that there are two possible ways to improve the robustness of these methods: (i) to take advantage of special information in the data (e.g., symmetry); (ii) to take advantage of information in the residuals (i.e., the probability density function (pdf) of the residuals). In these respects, the thesis makes the following contributions:

To leverage possible symmetry in the data, we adapt the concept of Symmetry Distance to formulate an improved regression method, called the Least Trimmed Symmetry Distance (LTSD).

To exploit the structure in the pdf of residuals, we develop a family of very robust estimators: the Maximum Density Power Estimator (MDPE), Quick-MDPE (QMDPE), and variable-bandwidth QMDPE (vbQMDPE), by applying nonparametric density estimation and density gradient estimation techniques to parametric estimation. In these methods, the objective functions consider the density distribution of data points in residual space and the size of the residual corresponding to the local maximum of the density distribution. An important tool in our methods is the mean shift method.
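Since the mean shift method recurs throughout the thesis, a minimal one-dimensional sketch may help fix ideas. This is our own toy illustration, not code from the thesis: the flat kernel and the synthetic two-cluster data are assumptions made purely for demonstration. The window center is repeatedly moved to the mean of the points it covers, which climbs toward a mode of the sample density.

```python
import numpy as np

def mean_shift_1d(data, start, h, tol=1e-6, max_iter=100):
    """Flat-kernel mean shift: repeatedly replace the window center by
    the mean of the points within radius h, climbing the density
    gradient toward a local mode."""
    center = float(start)
    for _ in range(max_iter):
        window = data[np.abs(data - center) <= h]
        if window.size == 0:          # window fell off the data support
            break
        shifted = window.mean()
        if abs(shifted - center) < tol:
            break
        center = shifted
    return center

# Synthetic data: a large cluster near 0 and a smaller one near 5.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 0.1, 500), rng.normal(5.0, 0.1, 200)])

print(mean_shift_1d(data, start=0.3, h=0.5))   # converges near 0.0
print(mean_shift_1d(data, start=4.5, h=0.5))   # converges near 5.0
```

Starting the search from different points recovers different modes, which is what allows mode-seeking in residual space rather than settling for a global average.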

The pdf of the residuals is important for scale estimation (more specifically, for estimating the shape/spread). By considering the distribution of the residuals, and by employing the mean shift method and our proposed mean shift valley method, we develop the Two-Step Scale Estimator (TSSE). Furthermore, based on TSSE, we propose a family of novel robust estimators: Adaptive Scale Sample Consensus (ASSC) and Adaptive Scale Residual Consensus (ASRC), which consider both the residuals of inliers and the scale of inliers in their objective functions.

More specifically, the first contribution of this thesis is that we demonstrate the fragility of LMedS and LTS and analyse the reasons for the fragility of these methods when a large percentage of clustered outliers exists in the data. We introduce the concept of Symmetry Distance to model fitting and formulate an improved regression method, the LTSD estimator. Experimental results are presented to show that the LTSD performs better than LMedS and LTS under a large percentage of clustered outliers and a large standard variance of inliers.

The traditional robust methods generally assume that the data of interest (the inliers) occupy a majority of the whole data set. In image analysis, however, the data are often complex and several instances of a model are simultaneously present, each accounting for a relatively small percentage of the data points. Dealing with data including multiple structures and a high percentage of outliers (>50%) remains a challenging task. In this thesis, we assume that the inliers occupy a relative majority of the data, by which it becomes possible for a robust estimator to tolerate more than 50% outliers. A significant contribution of this thesis is that we present a series of novel and highly robust estimators, MDPE, QMDPE and vbQMDPE, which can tolerate more than 80% outliers and are very robust to data with multiple structures, by applying the mean shift algorithm in the space of the pdf of residuals.
When data include multiple structures, two major steps should be taken in the process of robust model fitting: (i) robustly estimate the parameters of a model, and (ii) differentiate inliers from outliers. Experiments in this thesis show that correctly estimating the parameters of a model is not, by itself, enough; to differentiate inliers from outliers, both the estimated parameters of a model and the corresponding scale estimate should be correct.
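As a concrete (toy) version of that second step, the following sketch labels inliers by thresholding residuals at a small multiple of the scale estimate. This is our own illustration, not the thesis's algorithm; the helper name and the 2.5-sigma cutoff are illustrative conventions, not prescriptions of the thesis.

```python
import numpy as np

def classify_inliers(x, y, params, scale, k=2.5):
    """Label as inliers the points whose absolute residual to the line
    y = a*x + b is within k times the estimated noise scale.

    With correct parameters but a wrong scale, the labelling fails:
    too small a scale rejects genuine inliers, while too large a
    scale admits outliers."""
    a, b = params
    residuals = y - (a * x + b)
    return np.abs(residuals) <= k * scale

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 100.0])
y = 2.0 * x + 1.0
y[-1] = -500.0                                  # one gross outlier
mask = classify_inliers(x, y, params=(2.0, 1.0), scale=0.1)
print(mask)   # first five points labelled inliers, the outlier rejected
```

Note that the correct parameters (2.0, 1.0) alone do not decide the labelling; the scale argument does the rest, which is exactly why a wrong scale estimate undermines an otherwise correct fit.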

Having a correct scale of inliers is crucial to the robust behaviour of an estimator. The success of many robust estimators depends on having a correct initial scale estimate or the correct setting of a particular parameter that is related to scale (e.g., RANSAC, the Hough Transform, M-estimators, etc.). Although many papers propose highly robust estimators, robust scale estimation has been relatively neglected in the computer vision community. One major contribution of this thesis is that we investigate the behaviour of several state-of-the-art robust scale estimators for data with multiple structures, and propose a novel robust scale estimator: TSSE. TSSE is very robust to outliers and can resist heavily contaminated data with multiple structures. TSSE is a very general method and can be used to give an initial scale estimate for robust estimators such as M-estimators. TSSE can also be used to provide an auxiliary estimate of scale (after the parameters of a model to fit have been found) as a component of almost any robust fitting method, such as the Hough Transform, MDPE, etc.

Another important contribution of this thesis is that we propose, based on TSSE and RANSAC, another novel and highly robust estimator: ASSC (and a variant of ASSC: ASRC). The ASSC estimator is an important improvement over RANSAC because no prior knowledge concerning the scale of inliers is necessary (the scale estimation is data driven). ASSC can tolerate more than 80% outliers and multiple structures. ASSC is also an improvement over MDPE and its family (QMDPE/vbQMDPE): MDPE and its family only estimate the parameters of a model, whereas ASSC produces both the parameters of a model and the corresponding scale as its results.

We used the mean shift algorithm extensively in the robust methods described above. We also directly apply the mean shift method to image segmentation based on image intensity or image color. One property of the mean shift is that it is sensitive to local peaks (including false peaks).
We found in our experiments that many false peaks can appear if the feature space (such as the intensity/color space or the residual space) is quantized. The occurrence of false peaks may have a negative influence on the performance of methods employing the mean shift. In this thesis, we establish a quantitative relationship between the appearance of false peaks and the value of the bandwidth h. We provide a complete unsupervised peak-valley sliding algorithm for gray-level image segmentation. The general mean shift algorithm considers only the global

information (features) of the image, while neglecting the local homogeneity information. We modify the mean shift algorithm so that both local homogeneity and global information are considered.

In order to validate our proposed methods, we have successfully applied them to a considerable number of important and fundamental computer vision tasks, including:

- Model fitting (geometric primitive fitting): (a) line fitting; (b) circle fitting; (c) ellipse fitting; (d) plane fitting, etc.;
- Range image segmentation;
- Robust optical flow calculation;
- Fundamental matrix estimation;
- Grey image segmentation and color image segmentation.

Declaration

February 26, 2004

I declare that:

1. This thesis contains no material that has been accepted for the award of any other degree or diploma in any university or institute.

2. To the best of my knowledge, this thesis contains no material that has previously been published or written by another person, except where due reference is made in the text of the thesis.

Signed: Hanzi Wang

Preface

During my study at Monash University (from June 2001 to February 2004), a number of papers, which contain material used in this thesis, have been published, accepted, or are currently under review/preparation.

Papers that have been accepted or published:

1. H. Wang and D. Suter, Robust Adaptive-Scale Parametric Model Estimation for Computer Vision, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2004.

2. H. Wang and D. Suter, Robust Fitting by Adaptive-Scale Residual Consensus, in 8th European Conference on Computer Vision (ECCV04), Prague, May 11-14, 2004.

3. H. Wang and D. Suter, MDPE: A Very Robust Estimator for Model Fitting and Range Image Segmentation, International Journal of Computer Vision (IJCV), 59(2), 2004.

4. H. Wang and D. Suter, Using Symmetry in Robust Model Fitting, Pattern Recognition Letters, 24(16), 2003.

5. H. Wang and D. Suter, False-Peaks-Avoiding Mean Shift Method for Unsupervised Peak-Valley Sliding Image Segmentation, in 7th International Conference on Digital Image Computing: Techniques and Applications (DICTA'03), Sydney, Dec. 2003.

6. H. Wang and D. Suter, Color Image Segmentation Using Global Information and Local Homogeneity, in 7th International Conference on Digital Image Computing: Techniques and Applications (DICTA'03), Sydney, pages 89-98, Dec. 2003.

7. H. Wang and D. Suter, Variable Bandwidth QMDPE and Its Application in Robust Optic Flow Estimation, in 9th IEEE International Conference on Computer Vision (ICCV03), Nice, France, Oct. 2003.

8. H. Wang and D. Suter, A Model-Based Range Image Segmentation Algorithm Using a Novel Robust Estimator, in 3rd International Workshop on Statistical and Computational Theories of Vision SCTV03 (in conjunction with ICCV03), Nice, France, Oct. 2003.

9. D. Suter and H. Wang, Robust Fitting Using Mean Shift: Applications in Computer Vision, in M. Hubert, G. Pison, A. Struyf, and S. Van Aelst, editors, Theory and Applications of Recent Robust Methods, Statistics for Industry and Technology, Birkhauser, Basel, 2004.

10. D. Suter, P. Chen, and H. Wang, Extracting Motion from Images: Robust Optic Flow and Structure from Motion, in Proceedings Australia-Japan Advanced Workshop on Computer Vision, Adelaide, Australia, pages 64-69, 9-11 Sept. 2003.

11. H. Wang and D. Suter, A Novel Robust Method for Large Numbers of Gross Errors, in 7th Int. Conf. on Automation, Robotics and Computer Vision (ICARCV02), Singapore, December 3-6, 2002.

12. H. Wang and D. Suter, LTSD: A Highly Efficient Symmetry-based Robust Estimator, in 7th Int. Conf. on Automation, Robotics and Computer Vision (ICARCV02), Singapore, December 3-6, 2002.

Technical Reports:

1. H. Wang and D. Suter, ASSC - A New Robust Estimator for Data with Multiple Structures, Technical Report (MECSE), Monash University, Sept.

2. H. Wang and D. Suter, MDPE: A Very Robust Estimator for Model Fitting and Range Image Segmentation, Technical Report (MECSE), Monash University, Mar.

3. H. Wang and D. Suter, Robust Scale Estimation from True Parameters of Model, Technical Report (MECSE), Monash University, Mar.

4. H. Wang and D. Suter, False-Peaks-Avoiding Mean Shift Method for Unsupervised Peak-Valley Sliding Image Segmentation, Technical Report (MECSE), Monash University, Mar.

5. H. Wang, A. Bab-Hadiashar, S. Boukir, and D. Suter, Outliers Rejection based on Repeated Medians, Technical Report (MECSE), Monash University, Dec.

Papers in Preparation:

1. H. Wang and D. Suter, A Novel Robust Estimator for Accurate Optical Flow Calculation, in preparation for Image and Vision Computing.

Acknowledgements

There are many people without whom this thesis could not have been finished. I would first like to thank my supervisor, A. Prof. David Suter, for his kind help, advice, support, and encouragement over the years. It is he who did a thorough proofreading and spent countless hours improving the clarity and the presentation of this thesis as well as my academic papers. I learned many moral standards and beliefs from David, in matters both academic and non-academic.

I would like to thank my associate supervisor, Prof. Raymond Jarvis, who provided me with a clear picture of the research background, the state of the art in my research area, and potential new approaches and new directions. His guidance at that stage was so important that it helped me to concentrate on my project quickly and determine my path to achieve my goals.

I would like to thank Dr. Alireza Bab-Hadiashar and Dr. Samia Boukir for their valuable discussions and suggestions on the robust statistics part. I would like to thank Prof. Jaesik Min for his kind assistance. I thank Prof. Xiaoyi Jiang and A. Prof. Patrick J. Flynn for their code and results for the range image segmentation part. I thank A. Prof. Michael Black for his valuable suggestions on the optical flow calculation part. I thank Prof. Andrew Zisserman, Dr. Hongdong Li, Kristy Sim, and Haifeng Chen for their kind help with the fundamental matrix estimation part.

Many thanks to my colleagues from the Digital Perception Laboratory: Dr. Pei Chen and Mr. Daniel Tung, who discussed theoretical problems with me and provided technical help. I also thank Dr. Paul Richardson, Dr. Fang (Fiona) Chen, Dr. Prithviraj Tissainayagam, Mr. Mohamed Gobara and Mr. James Cheong. I have benefited a lot from their kind help and support.

I am very grateful to the many anonymous reviewers of my journal, conference and workshop papers. Their valuable comments and suggestions helped me revise and improve each of my papers and make the ideas in each paper clearer and more understandable.

I would like to thank the many researchers and people that I met at conferences, workshops, and seminars, who motivated and stimulated me to pursue a higher level in my study. Thank you for walking with me in my life. Without your inspiration, my study would have stayed at its original level.

Especially, I would like to express my deepest thanks and appreciation to my mother, father, and little sister. They give me constant encouragement, selfless support, and kind solicitude. Here, I would like to share all my achievements with them.

The research in this thesis was supported by the Australian Research Council (ARC), under the grant A

Dedication

To my mother, my father and my little sister.

Chapter 1

Introduction

1.1 Background and Motivation

The study of computer vision is strongly interdisciplinary. It is new, rapidly growing and complex, since it brings together several disciplines including Computer Science, Artificial Intelligence, Physics, Graphics, Psychology, Physiology, etc. The purpose of computer vision is to develop theories and algorithms to automatically extract and analyse useful information from an observed image, image set, or image sequence. One major task of computer vision and image analysis involves the extraction of meaningful information from images or image sequences using concepts akin to regression and model fitting. The range of applications is wide; it includes robot vision, automated surveillance (civil and military) and inspection, biomedical image analysis, video coding, motion segmentation, human-machine interfaces, visualization, historical film restoration, etc.

Parametric models play a vital role in many activities in computer vision research. When engaged in parametric fitting in a computer vision context, it is important to recognise that

data obtained from the image or image sequences may be inaccurate. It is almost unavoidable that data are contaminated (due to faulty feature extraction, sensor noise, segmentation errors, etc.) and it is also likely that the data will include multiple structures. Thus, it has been widely acknowledged that all algorithms in computer vision should be robust for accurate estimation (Haralick 1986). This rules out a simple-minded application of the least squares (LS) method. Fitting a model to noisy data (with a large number of outliers and multiple structures) is still a major and challenging task within the computer vision community.

Robust regression methods are a class of techniques that can tolerate gross errors (outliers) and have a high breakdown point. Robust statistical methods were first introduced in computer vision to improve the performance of feature extraction algorithms. These methods can tolerate (in various degrees) the presence of data points that do not obey the assumed model. Such points are called outliers.

Figure 1.1: The OLS estimator may break down when even one outlier exists in the data.

The definition of robustness in this context is often focused on the notion of the breakdown point. The breakdown point of an estimator may be roughly defined as the smallest percentage of outlier contamination that can cause the estimator to produce arbitrarily large values ((Rousseeuw and Leroy 1987), p. 9). The breakdown point is one important quality of

an estimator when we evaluate how robust an estimator is to outliers: the more robust an estimator is, the higher its breakdown point is. The breakdown point, as defined in statistics, is a worst-case measure. A zero breakdown point only means that there exists (at least) one potential configuration for which the estimator will fail. The LS estimator has a breakdown point of 0%, because one single extreme outlier is sufficient to force the LS estimator to produce arbitrarily large values (see Figure 1.1).

Figure 1.2: Examples where many instances of a model can be simultaneously present in one image: (a) there are many planar surfaces (i.e., instances of a planar model) in the range image; (b) there are many cups (the rim of a cup can be roughly treated as a circle model) in the color image.

Two frequently used robust techniques are the least median of squares (LMedS) (Rousseeuw 1984) and the M-estimators (Huber 1981). One attractive feature of LMedS and M-estimators is that they can tolerate up to half of the data points being arbitrarily bad. In computer vision and image analysis, however, the data are often complex and several instances of a model are simultaneously present, each accounting for a relatively small percentage of the data points (see Figure 1.2). We call this case data with multiple structures. Thus it will rarely happen that a given population achieves the critical size of 50% of the total population and, therefore, techniques that have been touted for their high breakdown point (e.g., LMedS and other traditional robust methods from statistics) are no

longer reliable candidates, being limited to a 50% breakdown point. Only robust methods designed with this special nature of the visual data in mind can achieve satisfactory results. To design an efficient robust method for computer vision tasks, several characteristics that are distinct from those (mostly) addressed by the statistical community must be taken into account:

Pseudo-outliers. In a given image, there are usually several populations of data (i.e., multiple structures). Some parts correspond to one object in a scene and other parts will correspond to other, rather unrelated, objects. When attempting to fit a model to these data, one must consider the population belonging to the relevant object as inliers and the other populations as outliers; the term pseudo-outlier has been coined for this (Stewart 1995). In computer vision tasks, it rarely happens that a given population achieves the critical size of 50% of the total population and, therefore, techniques that have been touted for their high breakdown point (e.g., the Least Median of Squares) are no longer reliable candidates from this point of view.

Large data sizes. Modern digital cameras exist with around 4 million pixels per image. Image sequences, typically at up to 50 frames per second, contain many images. Thus, computer vision researchers typically work with data sets in the tens of thousands of elements, at least, and data sets in the 10^6 to 10^9 range are not uncommon.

Unknown sizes of populations and unknown location. Computer vision requires fully automated analysis in, generally, rather unstructured environments. Thus, the sizes and locations of the populations involved will fluctuate greatly. Moreover, there is no human in the loop to select regions of the image dominated by a single population, or to adjust various thresholds. In contrast, statistical problems studied in most other areas usually have a single dominant population plus some percentage of outliers (typically mis-recordings, not the pseudo-outliers mentioned above).
Typically a human expert is there to assess the results (and, if necessary, crop the data, adjust thresholds, try another technique, etc.).

Emphasis on fast calculation. Most tasks in computer vision must be performed on the fly. Offline analysis that takes seconds, let alone minutes or hours, is usually a luxury afforded by relatively few applications.

These rather peculiar circumstances have led computer vision researchers to develop their own techniques that perform in a robust fashion (perhaps "empirically robust" should be said, as few have formally proved robust properties, though many trace their heritage to techniques that do have such proved properties). These include ALKS (Lee, Meer et al. 1998), RESC (Yu, Bui et al. 1994), and MUSE (Miller and Stewart 1996). However, it has to be admitted that a complete solution, addressing all of the above problems, is far from being achieved. Indeed, none of the techniques, with present hardware limitations, are really real-time when applied to the most demanding tasks. None have been proved to reliably tolerate high percentages of outliers and, indeed, we have found in our experiments that RESC and ALKS, although clearly better than the Least Median of Squares in this respect, are not always reliable. As we stated in the summary of this thesis, one can improve upon these approaches by using extra information such as symmetry in the data or the residual distribution.

This thesis addresses various problems in computer vision, specifically robust model fitting, range image segmentation, image motion estimation, fundamental matrix calculation, and grey/color image segmentation. The major contributions of this thesis come in the following forms: (a) a new symmetry-based robust method; (b) several novel highly robust methods with experimentally demonstrated advantages; (c) a novel highly robust scale estimation technique; (d) several practical techniques applying the proposed robust methods to solve real computer vision problems including range image segmentation, optical flow calculation and fundamental matrix estimation; and (e) a couple of algorithms for grey/color image segmentation.
A more subtle contribution of this thesis is that, in looking at applying the mean shift to histogram-based image segmentation, we noticed a quantization effect that produces false peaks. We develop a theory to predict/avoid false peaks, and this theory is applicable in all situations where one quantizes a feature space (e.g., the residual space) before applying the mean shift. The methods/techniques developed in this thesis can be beneficial to both the statistics and the computer vision communities.

1.2 Thesis Outline

A wide range of topics is covered in this thesis (model fitting; range image segmentation; optical flow calculation; fundamental matrix estimation; grey/color image segmentation). Thus, previous related work is reviewed or introduced where necessary. In Chapter 2, several state-of-the-art robust techniques are reviewed. These include both techniques developed in the statistics field (such as M-estimators, the Repeated Median, LMedS, and LTS) and those developed in the computer vision community (such as the Hough Transform, RANSAC, MINPRAN, MUSE, ALKS, and RESC). Chapter 3 addresses the fragility of traditionally employed robust methods (LMedS and LTS) when the data involve clustered outliers, and analyses the reasons for this fragility. Furthermore, the symmetry information in the data is exploited and the concept of Symmetry Distance is introduced to model fitting. An improved regression method, the LTSD, is proposed. Chapter 4 takes advantage of structure information in the probability density function of the residuals in order to achieve higher robustness. By employing nonparametric density estimation and density gradient techniques, and by considering the distribution of probability density in the residual space, a novel and highly robust estimator, MDPE, is proposed. Extensive experimental comparisons have been carried out to show the advantages of MDPE over five frequently used robust methods (LMedS, the Hough Transform, RANSAC, ALKS, and RESC). Chapter 5 begins by reviewing several state-of-the-art range image segmentation algorithms. Then a novel model-based range image segmentation algorithm, derived from Quick-MDPE, is proposed. Segmentation is a complicated task and requires more than a simple application of a robust estimator. Our proposed algorithm tackles many subtle issues and thereby provides a framework for those who want to apply their robust estimators to the task of range image segmentation. In Chapter 6, we introduce the problem of optical flow calculation.
Then, a modified QMDPE employing the variable bandwidth technique (vbQMDPE) is applied to compute the optical flow. Because vbQMDPE has higher robustness to outliers than LMedS and LTS, the experiments on both synthetic and real image sequences show very promising results.

Having a correct scale of the inliers is important to the robust behaviour of many estimators. However, robust scale estimation has been relatively neglected. Thus, Chapter 7 investigates the behaviour of several state-of-the-art robust scale estimators for data with multiple structures and, by exploiting the information in the shape of the distribution of the residuals, proposes a novel robust scale estimator: TSSE. Chapter 8 proposes, based on TSSE, a novel robust estimator, ASSC, and its variant ASRC. Experiments on model fitting, range image segmentation and fundamental matrix estimation show that the proposed method is very robust to data with discontinuities, multiple structures and outliers. In Chapter 9, we directly apply the mean shift algorithm to grey/color image segmentation. In the process, we identify an issue that affects the mean shift method when the data are heavily quantized. We also solve a couple of practical problems: (i) we propose a quantitative relationship between the appearance of false peaks and the value of the bandwidth h, which is applicable to many methods employing the mean shift; (ii) we introduce local homogeneity into the mean shift algorithm. These result in two algorithms for grey/color image segmentation. Finally, Chapter 10 summarizes what we have done and identifies what remain challenging problems, suggesting future research work.

Chapter 2
Model-Based Robust Methods: A Review

The history of seeking a robust method that can resist the effects of gross errors, i.e. outliers, in fitting models is long. Since data contamination is usually unavoidable (due to such causes as faulty feature extraction, sensor noise and failure, segmentation errors, multiple structures, etc.), there has recently been a general recognition that all algorithms should be robust in order to obtain accurate estimation. As pointed out by (Meer, Mintz et al. 1991), a robust estimator should have the following properties:
- Good efficiency at the assumed noise distribution.
- Reliability in the presence of various types of noise.
- A high breakdown point.
- Time complexity not much greater than that of the Least Squares method.
Because linear models play a very important role in most modern robust methods, and many modern techniques are developed based on linear regression methods, this chapter commences with a review of the most frequently applied linear regression method, the LS method. Several state-of-the-art robust techniques are then reviewed.

2.1 The Least Squares (LS) Method

Linear regression analysis is an important tool in most applied sciences, including computer vision. The least squares method is one of the most famous linear regression methods and has been used in many scientific fields for a long time. The classical linear model can be described in the following form [(Rousseeuw and Leroy 1987), pp. 1]:

$y_i = x_{i1}\theta_1 + \dots + x_{ip}\theta_p + e_i \quad (i = 1, \dots, n)$   (2.1)

where the variable $y_i$ is the response variable and the variables $x_{i1}, \dots, x_{ip}$ are the explanatory variables. The error term $e_i$ is usually assumed to be normally distributed with mean zero and standard deviation $\sigma$. We have n sets of observations on y and $(x_1, \dots, x_p)$, for i = 1, ..., n:

$(Y, X) = \begin{pmatrix} y_1 & x_{11} & \cdots & x_{1p} \\ \vdots & \vdots & & \vdots \\ y_n & x_{n1} & \cdots & x_{np} \end{pmatrix}$   (2.2)

where $Y = (y_1, \dots, y_n)'$ is an n-vector; $X = (x^{(1)}, \dots, x^{(p)})$ is an n-by-p matrix; and $x^{(i)} = (x_{1i}, \dots, x_{ni})'$ is an n-vector. Equation (2.1) can be rewritten using matrix notation as follows:

$Y = X\theta + e$   (2.3)

Using regression analysis, we can obtain the regression coefficients $\hat\theta = (\hat\theta_1, \dots, \hat\theta_p)'$ from the observation data (Y, X); $\hat\theta$ is the estimate of $\theta$. Applying $\hat\theta$ to the explanatory variables $(x_1, \dots, x_p)$, we obtain:

$\hat y_i = x_{i1}\hat\theta_1 + \dots + x_{ip}\hat\theta_p$   (2.4)

where $\hat y_i$ is the estimated value of $y_i$. Usually, this estimated value is not exactly the same as the actually observed value. The difference between the estimated value $\hat y_i$ and the actually observed value $y_i$ is the residual $r_i$ for the i'th set of observed data:

$r_i = y_i - \hat y_i$   (2.5)

The ordinary least squares regression estimator can be written as follows:

$\hat\theta = \arg\min_{\hat\theta} \sum_{i=1}^{n} r_i^2$   (2.6)

Equation (2.6) is the well-known LS equation. From equation (2.6), we can see that the least squares estimator obtains the optimized $\hat\theta$ by minimizing the sum of the squared residuals. If we let:

$S(\hat\theta) = \sum_{i=1}^{n} r_i^2$   (2.7)

then we have:

$S(\hat\theta) = r'r = (Y - X\hat\theta)'(Y - X\hat\theta) = Y'Y + \hat\theta'X'X\hat\theta - 2\hat\theta'X'Y$   (2.8)

Differentiating $S(\hat\theta)$ w.r.t. $\hat\theta$ and setting the derivative to zero, we obtain:

$\frac{\partial S(\hat\theta)}{\partial \hat\theta} = 2X'X\hat\theta - 2X'Y = 0$   (2.9)

From equation (2.9), we obtain the normal equation [(Rao and Toutenburg 1999), pp. 24]:

$X'X\hat\theta = X'Y$   (2.10)

When $X'X$ is not singular, the regression coefficients $\hat\theta$ can be estimated by:

$\hat\theta = (X'X)^{-1}X'Y$   (2.11)

The LS estimator is highly efficient and achieves optimal results under Gaussian distributed noise. Although the LS method has the advantages of low computational cost and high efficiency, it is extremely sensitive to outliers (gross errors, or samples belonging to another structure and distribution).

2.2 Outliers and Breakdown Point

[Figure 2.1: The types of outliers, illustrating clustered outliers, a leverage point, pseudo-outliers, uniformly distributed outliers, inliers, and the true line.]

Outliers can be grossly defined as the data points that lie far from the majority of the data. Before we discuss the behaviour of robust estimators, it is beneficial to investigate the various types of outliers that one can encounter. Outliers frequently occur in the data of computer vision tasks. Outliers can potentially have negative effects on the accuracy of the results; even worse, outliers can seriously spoil the results of a method that is not robust to them. Although many new statistical techniques have been developed to tolerate the effect of outliers in recent years, their advantages remain only when the data involve certain types of outliers.
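As a small numerical illustration (a hypothetical example, not taken from the thesis), the closed-form LS solution of equation (2.11) can be computed directly, and a single gross error is enough to pull the estimate far from the true parameters:

```python
import numpy as np

def ls_fit(X, y):
    # Closed-form LS estimate via the normal equations, equation (2.11).
    # (numpy.linalg.lstsq would be preferred in practice for numerical stability.)
    return np.linalg.solve(X.T @ X, X.T @ y)

x = np.arange(10.0)
X = np.column_stack([x, np.ones_like(x)])  # columns: slope term, intercept term
y = 2.0 * x + 1.0                          # clean data on the line y = 2x + 1
clean = ls_fit(X, y)                       # recovers (2, 1) up to rounding
y[9] = 100.0                               # replace one point with a gross error
spoiled = ls_fit(X, y)                     # the estimate is pulled far from (2, 1)
```

This is the 0% breakdown point of the LS estimator discussed below: one corrupted observation suffices to move the estimate arbitrarily far.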

Loosely speaking, outliers can be classified into the following four types:
- Leverage points: outliers in the explanatory variables.
- Clustered outliers: outliers that are clustered.
- Randomly distributed outliers: outliers that are randomly distributed.
- Pseudo-outliers: data points from structures that are extraneous to a particular single parametric model fit; i.e., data that are inliers to one structure will be pseudo-outliers to another.
A leverage point is a point that is outlying relative to the explanatory variable x, but not relative to the response variable y. Leverage points do not always lead to negative results. When a leverage point lies close to the regression line, it is a good leverage point (as shown in Figure 2.1) and can have a good effect on the results. However, when the leverage point lies far from the regression line, it is a bad leverage point (i.e., an outlier). Clustered outliers often have seriously negative effects on the results. It has been experimentally shown that it is harder to resist the effects of clustered outliers than those of randomly distributed outliers (see Chapters 3 and 4). Most theories are proposed assuming that outliers are uniformly distributed; theories that consider clustered outliers are relatively few in number. One characteristic distinguishing pseudo-outliers from gross outliers and clustered outliers is that pseudo-outliers are coherent: pseudo-outliers have structure while gross outliers and clustered outliers do not. Pseudo-outliers often appear in data containing multiple structures. Because multiple structures frequently occur in computer vision tasks, studying the effects of pseudo-outliers (multiple structures) has been popular in the computer vision community (Yu, Bui et al. 1994; Miller and Stewart 1996; Lee, Meer et al. 1998; Bab-Hadiashar and Suter 1999). To seek an estimator with a high breakdown point is one of the most important topics in the statistics and computer vision communities.
The breakdown point of an estimator may be roughly defined as the smallest percentage of outlier contamination that can cause the estimator to produce arbitrarily large values. Let Z be any sample of n data points $(x_1, y_1), \dots, (x_n, y_n)$, i.e., $Z = \{z_1, \dots, z_n\}$ with $z_i = \{x_i, y_i\}$. For $m \le n$, the finite-sample breakdown point of a regression estimator T can be written as [(Rousseeuw and Leroy 1987), pp. 10]:

$\varepsilon_n^*(T, Z) = \min\left\{\frac{m}{n};\ \sup_{Z_m^*} \|T_n(Z_m^*)\| = \infty\right\}$   (2.12)

where $Z_m^*$ is obtained by replacing any m points of Z with arbitrary values. Because a single outlier is sufficient to force the LS estimator to produce an arbitrarily large value, the LS estimator has a breakdown point of 0%. In order to reduce the influence of outliers, many robust estimators with high breakdown points have been developed during the past three decades. In the next sections, several modern robust estimators, developed by both the statistics and computer vision communities, will be reviewed.

2.3 Traditional Robust Estimators from Statistics

Many robust estimators have been developed within the statistics community and applied to the computer vision field. Among these robust estimators, the family of M-estimators is one of the most popular classes of robust regression methods.

2.3.1 M-Estimators and GM-Estimators

The theory of M-estimators was first developed by Huber in 1964 and the following years, and was later successfully generalized as a robust regression method (Huber 1973; Huber 1981). The essence of M-estimators is to replace the squared residuals $r_i^2$ in equation (2.6) by a symmetric function $\rho$ of the residuals:

$\hat\theta = \arg\min_{\hat\theta} \sum_{i=1}^{n} \rho(r_i)$   (2.13)

where $\rho(r_i)$ is a robust loss function with a unique minimum when the residual $r_i$ is zero. The purpose of introducing the loss function $\rho(r_i)$ is to reduce the effects of outliers.
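As an illustrative sketch (not the thesis implementation), the M-estimation objective of equation (2.13) can be minimised for the simple case of 1D location estimation, here using a Huber-type loss and iteratively reweighted averaging; the tuning constant and iteration count are assumed defaults:

```python
def huber_rho(r, c=1.345):
    # A Huber-type monotone loss: quadratic for small residuals,
    # linear in the tails, so large residuals are down-weighted.
    a = abs(r)
    return 0.5 * r * r if a <= c else 0.5 * c * (2.0 * a - c)

def m_estimate_location(y, c=1.345, iters=50):
    # 1D location M-estimate minimising sum(rho(y_i - theta)) as in (2.13),
    # via iteratively reweighted averaging with weights w = psi(r)/r.
    theta = sorted(y)[len(y) // 2]  # robust starting point (a median value)
    for _ in range(iters):
        w = [1.0 if abs(v - theta) <= c else c / abs(v - theta) for v in y]
        theta = sum(wi * v for wi, v in zip(w, y)) / sum(w)
    return theta
```

A single gross outlier moves the sample mean by many units but shifts this M-estimate only slightly, which is exactly the down-weighting effect of $\rho$.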

Let the derivative of $\rho(r_i)$ be $\psi(r_i)$. Differentiating $\sum_{i=1}^{n}\rho(r_i)$ in equation (2.13), we obtain:

$\sum_{i=1}^{n} \psi(r_i/\hat\sigma)\,x_i = 0$   (2.14)

where $\hat\sigma$ is a scale estimate related to the residuals. The solution of equation (2.14) can be found using iterative minimization and various equation-solving algorithms (Li 1985). M-estimators can be classified into three types based on the influence function $\psi(r)$ (Holland and Welsch 1977; Stewart 1997):

1. Monotone M-estimators. This type of M-estimator has a nondecreasing, bounded $\psi(r)$ function [(Huber 1981), Chapter 7]. The loss function can be written as:

$\rho(r) = \begin{cases} \frac{1}{2}r^2, & |r| \le c \\ \frac{1}{2}c(2|r| - c), & c < |r| \end{cases}$   (2.15)

2. Hard redescenders. This type of M-estimator forces $\psi(r) = 0$ when $|r| > c$ (c is a threshold). That is to say, a residual loses its effect on the results when its absolute value is larger than c (Hampel, Rousseeuw et al. 1986a). The loss function $\rho(r)$ can be written as:

$\rho(r) = \begin{cases} \frac{1}{2}r^2, & |r| \le a \\ \frac{1}{2}a(2|r| - a), & a < |r| \le b \\ \frac{1}{2}a\left[(|r| - c)^2/(b - c) + (b + c - a)\right], & b < |r| \le c \\ \frac{1}{2}a(b + c - a), & c \le |r| \end{cases}$   (2.16)

3. Soft redescenders. This type of M-estimator does not have a finite rejection point c; instead, it forces $\psi(r) \to 0$ as $|r| \to \infty$. For example:

$\rho(r) = \frac{1}{2}(1 + f)\log(1 + r^2/f)$   (2.17)

Although M-estimators are robust to outliers in the response variables, they are not efficient at resisting outliers in the explanatory variables (see equation (2.1)). Therefore, generalized M-estimators (GM-estimators) were developed to reduce the effects of outliers in the explanatory variables. GM-estimators use a weight function w to resist the influence of such outliers. Mallows (Mallows 1975) presented the following GM-estimator:

$\sum_{i=1}^{n} w(x_i)\,\psi(r_i/\hat\sigma)\,x_i = 0$   (2.18)

Hill developed the following equation (Hill 1977):

$\sum_{i=1}^{n} w(x_i)\,\psi\big(r_i/(w(x_i)\hat\sigma)\big)\,x_i = 0$   (2.19)

Unfortunately, it has been proved that the breakdown point of GM-estimators is only 1/(1+p), where p is the dimension of the explanatory variables (Maronna, Bustos et al. 1979). That means that when p = 2, the highest breakdown point of GM-estimators is only about 33.3%. When p increases, the breakdown point correspondingly diminishes.

2.3.2 The Repeated Median (RM) Estimator

Before the development of the repeated median estimator, it was controversial whether it was possible to find a robust estimator with a high breakdown point of 50%. In 1982, Siegel proposed the repeated median (RM) estimator (Siegel 1982). The repeated median estimator has the attractive characteristic that it attains a 50% breakdown point. The method can be summarized as follows. For any p observations, $(x_{i_1}, y_{i_1}), \dots, (x_{i_p}, y_{i_p})$, let the solution parameter vector be denoted by $\hat\theta(i_1, \dots, i_p) = (\hat\theta_1, \dots, \hat\theta_p)'$; the jth coordinate of this vector is denoted by $\hat\theta_j(i_1, \dots, i_p)$. Then the repeated median estimator is written as:

$\hat\theta_j = \operatorname{med}_{i_1}\Big(\dots\big(\operatorname{med}_{i_{p-1}}(\operatorname{med}_{i_p} \hat\theta_j(i_1, \dots, i_p))\big)\dots\Big)$   (2.20)

The repeated median estimator is effective for problems with small p. However, its time complexity is $O(n^p \log^p n)$, which prevents the method from being useful in applications where p is even moderately large.

2.3.3 The Least Median of Squares (LMedS) Estimator

Rousseeuw proposed the least median of squares (LMedS) estimator in 1984 (Rousseeuw 1984). The LMedS method makes the following assumptions:
- The signal to be estimated should occupy the majority of the data points; that is, more than 50% of the data points should belong to the signal to be estimated (some traditional methods, such as RM and LTS, make the same assumption).
- The correct fit corresponds to the one with the least median of squared residuals. This criterion is not always true when the data include multiple structures and clustered outliers, or when the variance of the inliers is large (see Chapters 2 and 3).
The LMedS method is based on the simple idea of replacing the sum in the least sum of squares formulation by a median. LMedS finds the parameters to be estimated by minimizing the median of the squared residuals corresponding to the data points. The LMedS estimate can be written as:

$\hat\theta = \arg\min_{\hat\theta}\ \operatorname{med}_i\ r_i^2$   (2.21)

A drawback of the LMedS method is that no explicit formula exists for the solution of equation (2.21): the exact solution can only be determined by a search in the space of all possible estimates. This space is very large. One can consider all estimates determined by all possible p-tuples of data points. There are $O(n^p)$ p-tuples, and it takes $O(n\log n)$ time to find the median of the residuals of the whole data set for each p-tuple. Thus the LMedS method costs $O(n^{p+1}\log n)$; the cost increases very quickly with n and p.
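The LMedS criterion of equation (2.21) itself is simple to evaluate for a given candidate fit; a minimal sketch:

```python
from statistics import median

def lmeds_cost(residuals):
    # The LMedS criterion of equation (2.21): the median of the squared
    # residuals of a candidate fit. The estimator keeps the candidate
    # parameters whose cost is smallest over all candidates tried.
    return median(r * r for r in residuals)
```

The hard part, as noted above, is not evaluating this cost but searching the space of candidate parameters, which motivates the random sampling scheme described next.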

In practice, only an approximate LMedS, based upon random sampling, can be implemented for any problem of reasonable size; we generally refer to this approximate version when we use the term LMedS (a convention adopted by most other authors as well). In order to reduce the time complexity of the LMedS method to a feasible value, a Monte Carlo type technique (described as follows) is usually employed. A p-tuple is "clean" if it consists of p good observations without contamination by outliers. One performs m random selections of p-tuples, where m is chosen so that the probability P that at least one of the m p-tuples is clean is almost 1. Let $\varepsilon$ be the fraction of outliers contained in the whole set of points. The probability P can be expressed as follows:

$P = 1 - \left(1 - (1 - \varepsilon)^p\right)^m$   (2.22)

Thus one can determine m for given values of $\varepsilon$, p and P by:

$m = \frac{\log(1 - P)}{\log\left[1 - (1 - \varepsilon)^p\right]}$   (2.23)

For example, if 50 percent of the data are contaminated by outliers, i.e. $\varepsilon = 0.5$, and if we require P = 0.99, then for circle fitting (p = 3) we obtain m = 35, and for ellipse fitting (p = 5) we obtain m = 145. The LMedS method has excellent global robustness and a high breakdown point (i.e., 50%). Over the last two decades, LMedS has been growing in popularity. For example, Kumar and Hanson used the least median of squares to solve the pose estimation problem (Kumar and Hanson 1989); Roth and Levine employed it for range image segmentation (Roth and Levine 1990); Meer et al. applied it to image structure analysis in the piecewise polynomial field; Zhang used the least median of squares in conic fitting (Zhang 1997); and Bab-Hadiashar and Suter employed it for optic flow calculation (Bab-Hadiashar and Suter 1998). However, the relative efficiency of the LMedS method is poor when Gaussian noise is present in the data. As Rousseeuw noted (Rousseeuw and Leroy 1987), the LMedS method has a very low convergence rate: it is of order $n^{-1/3}$, which is much lower than the order-$n^{-1/2}$ convergence rate of M-estimators. To compensate for this deficiency, Rousseeuw improved the LMedS method by carrying out a weighted least squares procedure after the initial LMedS fit, with the weights chosen based on the initial fit. The preliminary scale estimate (for a detailed description, see Chapter 7) is given by:

$S^0 = 1.4826\left(1 + \frac{5}{n - p}\right)\sqrt{\operatorname{med}_i r_i^2}$   (2.24)

where $r_i$ is the residual of the i'th sample. The weight $W_i$ assigned to the i'th data point is given by:

$W_i = \begin{cases} 1, & |r_i| \le 2.5 S^0 \\ 0, & |r_i| > 2.5 S^0 \end{cases}$   (2.25)

The data points with $W_i = 0$ are likely to be outliers and are not considered in the subsequent weighted least squares estimate. The data points with $W_i = 1$ are inliers and are used for determining the final estimate, which is given by the weighted least squares minimization:

$\min_{\hat\theta} \sum_{i=1}^{n} W_i r_i^2$   (2.26)

There are different ways in which a method can be robust. The robustness we have been discussing is global robustness. However, the LMedS method may be locally unstable when fitting models to data: a small change in the data can greatly alter the output. This behaviour is not desirable in computer vision and has been noticed by Thomas (Thomas and Simon 1992). In comparison, M-estimators have better local stability.
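The Monte Carlo subset count of equation (2.23) can be computed directly; rounding m up to the next integer reproduces, e.g., m = 35 for the circle-fitting example above:

```python
import math

def num_samples(p, eps, P=0.99):
    # Equation (2.23): the number m of random p-tuples required so that,
    # with probability P, at least one sampled p-tuple is free of outliers,
    # given an outlier fraction eps. Rounded up to the next integer.
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - (1.0 - eps) ** p))
```

The same formula is reused, with the same role, by the random sampling stages of LTS, RANSAC and ALKS discussed later.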

2.3.4 The Least Trimmed Squares (LTS) Estimator

The least trimmed squares (LTS) method was introduced by Rousseeuw to improve the low efficiency of LMedS (Rousseeuw 1984; Rousseeuw and Leroy 1987). The LTS estimator can be mathematically expressed as:

$\hat\theta = \arg\min_{\hat\theta} \sum_{i=1}^{h} (r^2)_{i:n}$   (2.27)

where $(r^2)_{1:n} \le \dots \le (r^2)_{n:n}$ are the ordered squared residuals and h is the trimming constant. The LTS method uses h data points (out of n) to estimate the parameters. The coverage value h may be set from n/2 to n. The aim of the LTS estimator is to find the h-subset with the smallest least squares residuals and to use that h-subset to estimate the parameters of the model. The breakdown point of LTS is (n-h)/n; when h is set to n/2, the LTS estimator has a high breakdown value of 50%. The advantages of LTS over LMedS are:
- It is less sensitive to local effects than LMedS, i.e., it has more local stability.
- It has better statistical efficiency than LMedS: it converges like $n^{-1/2}$.
The implementation of the LTS method also uses random sampling, because the number of all possible h-subsets ($C_n^h$) grows quickly with n. There are two commonly employed ways to generate an h-subset:
1. Directly generate a random h-subset from the n data points.
2. First generate a random p-subset. If the rank of this p-subset is less than p, randomly add data points until the rank is equal to p. Next, use this subset to compute the parameters $\hat\theta_j$ (j = 1, ..., p) and the residuals $r_i$ (i = 1, ..., n). Sort the residuals so that $|r|_{\pi(1)} \le \dots \le |r|_{\pi(h)} \le \dots \le |r|_{\pi(n)}$, and set the h-subset to $H := \{\pi(1), \dots, \pi(h)\}$.
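Both generation schemes aim to minimise the trimmed objective of equation (2.27), which, for the residuals of a given candidate fit, can be sketched as:

```python
def lts_cost(residuals, h):
    # The LTS objective of equation (2.27): the sum of the h smallest
    # squared residuals; the n - h largest residuals are trimmed away.
    return sum(sorted(r * r for r in residuals)[:h])
```

With h = n this reduces to the ordinary LS objective of equation (2.6); with h = n/2 the trimming yields the 50% breakdown point noted above.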

Although the first way is simpler than the second, the h-subset yielded by the first method may contain many outliers; indeed, the chance of generating a clean h-subset by method (1) tends to zero with increasing n. In contrast, it is easier to find a clean p-subset without outliers. Therefore, method (2) can generate more good initial subsets of size h than method (1). As with LMedS, the efficiency of LTS can be improved by adopting a weighted least squares refinement as a final stage.

2.4 Robust Estimators Developed within the Computer Vision Community

Although many robust estimators have been developed in statistics during the past decades, most of them can tolerate only up to 50% outliers. In computer vision tasks, it frequently happens that outliers and pseudo-outliers occupy the absolute majority of the data. Therefore, the requirement of these robust estimators that outliers occupy less than 50% of all the data points is far from being satisfied in real computer vision tasks. A good robust estimator should be able to correctly find the fit even when outliers occupy a higher percentage of the data (more than 50%). Ideally, the estimator should also be able to resist the influence of all types of outliers (e.g., uniformly distributed outliers, clustered outliers and pseudo-outliers). Recently, many efforts have been made in the computer vision community to find robust estimators that can tolerate more than 50% outliers. Among these highly robust estimators, the frequently used ones are the Hough Transform (Hough 1962), RANSAC (Fischler and Bolles 1981), MINPRAN (Stewart 1995), MUSE (Miller and Stewart 1996), ALKS (Lee, Meer et al. 1998), and the RESC estimator (Yu, Bui et al. 1994). In the following subsections, we introduce these robust estimators, commencing with an explanation of the breakdown point as it is often interpreted by the computer vision community.

2.4.1 Breakdown Point in Computer Vision

In the statistical literature (Huber 1981; Rousseeuw and Leroy 1987), there are a number of precise definitions of robustness and of robust properties, including the aforementioned breakdown point (see Section 2.2), which is an attempt to characterize the tolerance of an estimator to large percentages of outliers. Loosely put, such estimators should still perform reliably even if up to 50% of the data do not belong to the model we seek to fit (in statistics, these outliers are usually false recordings or other wrong data). Estimators, such as the Least Median of Squares, that have a proven breakdown point of 0.5 have been much vaunted, particularly since this is generally viewed as the best achievable. It would be desirable to place all estimators on such a firm theoretical footing by, amongst other things, defining and proving their breakdown point. However, in practice it is usually not possible to do so. Moreover, one can question whether the current definitions of such notions are appropriate for the tasks at hand: in order to yield mathematical tractability, they may be too narrow or restrictive. For example, does one care if there is one single, unlikely if not impossible, configuration of data that will lead to the breakdown of an estimator, if all practical examples of data can be reliably tackled? Moreover, as appealing as it is to quote theoretical results, they may mean little in practice. Take, for example, the Least Median of Squares estimator: the estimator is too costly to implement, and so everyone implements an approximate version of it; no proofs exist (nor can they) assuring a precise breakdown point for such approximate versions. Not to mention the fact that there are data sets, having less than 50% outliers, where even the true Least Median of Squares will provably fail (for example, clustered outliers; see Section 3.2); of course, such configurations are carefully excluded by the careful phrasing of the formal proofs of robustness.
Yet clustered outliers, perhaps unlikely in mainstream statistical examples, are quite likely in computer vision tasks when we consider the notion of pseudo-outliers (Stewart 1997): data belonging to a second object, or objects, within the image. Several techniques (e.g., RANSAC, the Hough transform) have experimentally proven themselves as reliable workhorses, tolerating very high percentages of outliers, often much over 50%. We may say that these have an empirically determined very high breakdown point, meaning that they are unlikely to break down and can usually tolerate extremely high levels of outliers (much in excess of 50%). Although the breakdown point in statistics is proved to be bounded by 0.5 [(Rousseeuw and Leroy 1987), pp. 125], the proof requires that the robust estimator has a unique solution (more technically, it requires affine equivariance). When outliers (including pseudo-outliers associated with multiple structures) occupy more than 50% of the whole data, a robust method may return one of multiple valued solutions (Yu, Bui et al. 1994). As Stewart said (Stewart 1999): the nature of computer vision problems alters the performance requirements of the robust estimators in a number of ways; the optimum breakdown point of 0.5 must be surpassed in some domains. A robust estimator with a breakdown point of more than 0.5 is possible: that is, a robust estimator may have a higher than 0.5 breakdown point if we relax the single-solution requirement and permit multiple solutions to exist (Yu, Bui et al. 1994; Stewart 1995; Lee, Meer et al. 1998). This can be done through the use of RANSAC or the Hough Transform, or through adaptive techniques based on scale estimates, such as ALKS and MUSE (Stewart 1999). Though none of them has a theoretically proven breakdown point higher than 0.5, plausible arguments, supported by experiments, suggest that they do in practice. Thus, in this thesis, though we are motivated by the appealing notion of strictly provable robustness in the form of a high breakdown point, we follow a growing tradition of authors (Yu, Bui et al. 1994; Stewart 1995; Lee, Meer et al. 1998) that present estimators that have empirically demonstrated robust qualities and are supported by plausible arguments, based (as is, we might emphasize, the approximate Least Median of Squares technique used by many statisticians and other scientists alike) on the similarity of the proposed technique to estimators that do have provably high breakdown points.

2.4.2 Hough Transform (HT) Estimator

The Hough Transform was first developed to detect simple curves such as lines and circles (Hough 1962). The basic Hough Transform is a voting technique. A typical implementation of the technique counts the number of data features mapped into each cell of the quantized parameter space. The Hough Transform has attracted a great deal of attention, and many improvements have been made, such as the generalized Hough Transform, the probabilistic Hough Transform and the hierarchical Hough Transform (Illingworth and Kittler 1988; Leavers 1993). The Hough Transform has been recognized as a powerful tool in shape analysis, model fitting and motion segmentation, giving good results even in the presence of noise and occlusion. The major shortcomings of the Hough Transform are its excessive storage requirements and computational complexity: typically, the storage space and time complexity required are about $O(N^p)$, where p is the dimension of the parameter space and N is the number of bins into which each parameter axis is quantized. Another problem of the Hough Transform is its limited precision. Generally speaking, increasing the quantization number of each parameter axis leads to higher precision, but it also increases the computational cost. Finally, though the Hough Transform can be successfully applied to estimate multiple structures, one may have to solve many practical problems in the multimodal parameter space; in effect, the hard problems have been deferred to the analysis of the parameter space. Though the Hough Transform tends to demonstrate robustness in the presence of relatively high percentages of outliers, no formal proof of robustness (in terms of breakdown point) seemed to exist until very recently: we have become aware of the work of (Goldenshluger and Zeevi 2004), in which the authors formalized the statistical properties of the HT methodology and determined the breakdown point of the HT estimator.

2.4.3 Random Sample Consensus (RANSAC) Estimator

Fischler and Bolles (Fischler and Bolles 1981) provided a generate-and-test paradigm: RANdom SAmple Consensus (RANSAC). They use the minimum number of data points, a p-subset (p is the dimension of the parameter space), necessary to estimate the parameters of the model.
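As a concrete illustration of this generate-and-test paradigm, a minimal RANSAC-style loop for 2D line fitting (p = 2) might look as follows; the tolerance, trial count and consensus threshold are illustrative assumptions, not values from the thesis:

```python
import random

def ransac_line(points, n_trials=200, tol=0.1, min_inliers=None, seed=0):
    # Hypothesise line models y = a*x + b from random minimal 2-point
    # subsets, count the points within the error tolerance, and stop
    # once a consensus set is large enough.
    rng = random.Random(seed)
    if min_inliers is None:
        min_inliers = len(points) // 2
    best = (0, None)  # (consensus size, (slope, intercept))
    for _ in range(n_trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # degenerate sample: vertical line not handled here
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = sum(1 for x, y in points if abs(y - (a * x + b)) <= tol)
        if inliers > best[0]:
            best = (inliers, (a, b))
        if inliers >= min_inliers:
            break  # consensus threshold reached
    return best
```

A full implementation would refit the model to the whole consensus set by least squares, as in the algorithm described in the text.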
RANSAC, for its high robustness to outliers and its ease of implementation, has been widely employed in many computer vision tasks. The RANSAC algorithm can be described as follows:
1. Randomly choose a p-subset from the given n data points. This step is similar to that used in LMedS, LTS, ALKS, etc., in that all of these estimators employ a random sampling scheme.
2. Use this p-subset to determine the parameters of the model. Then determine the number of data points that are within some error tolerance of the model.
3. If this number is greater than a threshold, use the consensus set to determine a new model by the least squares method and output the parameters of this model as the result.
4. Otherwise, randomly choose a new p-subset and repeat steps 1 and 2. If no consensus set with the threshold number of members (or more) has been found after the predetermined number of trials, either terminate the program or compute the parameters of the model with the largest consensus set found.
From the procedure described above, we can see that the RANSAC method needs three predetermined parameters:
- The error tolerance.
- The number of subsets to try.
- The threshold, which indicates whether or not the correct model has been found.
If the predetermined parameters are correct, RANSAC is very robust to outliers and can, experimentally, tolerate more than 50% outliers. However, if some or all of the predetermined parameters (for example, the error tolerance) are wrong, the achievements of RANSAC are corrupted.

2.4.4 Minimize the Probability of Randomness (MINPRAN) Estimator

The MINPRAN estimator is one kind of robust estimator that has a higher than 50% breakdown point (Stewart 1995). It can find a model in data involving more than 50% outliers without prior knowledge about the error bounds; in contrast, the Hough Transform and RANSAC techniques need prior knowledge about the inlier bound of the correct fit. The MINPRAN estimator is similar to the LMedS/LTS estimators, but it outperforms them in the following ways:
- The MINPRAN estimator can find a correct fit that involves less than 50% of the data. The prerequisite for this is that the inliers must be close to the correct fit and the outliers randomly distributed.
- It does not hallucinate fits when there is no model in the data (the MINPRAN estimator outputs nothing, not even a false fit, in this case). This is different from most other robust estimators, such as M-estimators, LMedS, LTS, etc.
- The LMedS/LTS estimators always use 50% of the data (for LMedS, see Section 2.3.3) or h data points out of the n data points (for LTS, see Section 2.3.4), regardless of the true percentage of inliers in the whole data. MINPRAN, however, identifies and uses all inliers. Therefore, the MINPRAN estimator can yield more accurate results than the LMedS and LTS estimators.
However, MINPRAN assumes that the outliers are randomly distributed within a certain range. This makes MINPRAN less effective at extracting multiple structures and handling clustered outliers. At the same time, it occasionally finds fits that actually bridge small-magnitude discontinuities.

2.4.5 Minimum Unbiased Scale Estimator (MUSE) and Adaptive Least kth Order Squares (ALKS) Estimator

Miller and Stewart proposed the minimum unbiased scale estimator (MUSE) in 1996 (Miller and Stewart 1996). MUSE was designed to extract surfaces that contain less than 50% of the data, without prior knowledge of the percentage of inliers, and to detect small-scale discontinuities in the data. MUSE is based on the LMedS method, but it improves on LMedS and can accurately estimate fits when the data contain multiple surfaces.

MUSE randomly selects p-subsets and then estimates fits based on these p-subsets. It then calculates the unbiased estimate of the scale for each fit's k smallest residuals, where k takes all possible values satisfying 1 <= k <= N-p. The smallest scale estimate over all possible k is chosen as the representative value of the hypothesized fit. Finally, the fit from the p-subset with the smallest scale estimate is chosen as the optimum fit. The scale estimate can be written as:

s_k = r_{k:N} / E[u_{k:N}]    (2.28)

where r_{k:N} is the kth ordered absolute residual, and E[u_{k:N}] can be approximated by:

E[u_{k:N}] ≈ Φ^{-1}((1 + k/N) / 2)    (2.29)

where Φ^{-1}[·] is the inverse of the normal cumulative distribution function. Because the scale estimate is biased, a correction is applied to eliminate the bias by normalizing s_k. The unbiased minimum scale estimate can be written as:

φ = min_k s_k / E[min_{k'} v_{k'}],  with k = arg min_k s_k    (2.30)

where v_k is the kth scale estimate. Because it takes O(n^3) time to calculate E[min_k v_k], these values are pre-calculated and stored in a table, which is used when MUSE is carried out.

Inspired by MUSE, Lee and Meer proposed the adaptive least kth order squares (ALKS) estimator (Lee, Meer et al. 1998). ALKS is based on the least kth order squares (LKS) estimator [(Rousseeuw and Leroy 1987), pp. 124]. The LKS procedure is similar to that of the LMedS estimator. The difference between LKS and LMedS is that LKS uses k data points out of n data points (p < k < n), while LMedS uses half of the n data points. The breakdown point of LKS is min(k/n, 1 - k/n). Because it is impossible for a robust estimator using a one-step procedure to have a breakdown point exceeding 50%, ALKS uses a multi-step procedure (in

each step, ALKS employs LKS with a different k value) and recovers the fit to the relative majority of the data without any prior knowledge about the scale of the inliers. ALKS uses k data points out of the n data points. The robust estimate of the noise scale can be written as:

ŝ_k = d̂_k^{min} / Φ^{-1}[(1 + k/n) / 2]    (2.31)

where d̂_k^{min} is the half-width of the shortest window containing k residuals. This estimate of the noise scale is only valid when k is not very large or very small. After ŝ_k is estimated, the variance of the normalized error is computed as follows:

E_k^2 = [1 / (q_k - p)] Σ_{i=1}^{q_k} (r_{i,k} / ŝ_k)^2 = σ̂_k^2 / ŝ_k^2,  with σ̂_k^2 = [1 / (q_k - p)] Σ_{i=1}^{q_k} r_{i,k}^2    (2.32)

ALKS assumes that the optimum value of k should yield the least E_k^2. In order to estimate the correct k, a random sampling technique, similar to that used in LMedS and RANSAC, is also employed in ALKS. The ALKS method can deal with data with multiple structures, but it cannot resist the influence of extreme outliers.

The authors of MUSE and of ALKS consider robust scale estimation in their methods, and both can obtain a breakdown point greater than 50%. MUSE and ALKS can perform better than LMedS and M-estimators at small-scale discontinuities. However, MUSE needs a lookup table for the scale estimate correction, and ALKS is limited in its ability to handle extreme outliers. Another problem we found in ALKS is its instability under a small percentage of outliers.
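To make the MUSE scale step concrete, the following sketch (an illustrative Python implementation, not the authors' code; the function name is chosen here) computes the scale estimates s_k = r_{k:N} / E[u_{k:N}] of equation (2.28) using the approximation (2.29). The tabulated E[min v_k] bias correction of equation (2.30) is omitted:

```python
from statistics import NormalDist

def muse_scale_estimates(residuals, p):
    """For one hypothesized fit, return the scale estimates
    s_k = r_{k:N} / E[u_{k:N}] for k = 1..N-p, with E[u_{k:N}]
    approximated by Phi^{-1}((1 + k/N) / 2)."""
    r = sorted(abs(res) for res in residuals)  # ordered absolute residuals r_{k:N}
    n = len(r)
    phi_inv = NormalDist().inv_cdf             # inverse of the normal CDF
    s = {}
    for k in range(1, n - p + 1):
        s[k] = r[k - 1] / phi_inv((1 + k / n) / 2)
    return s
```

The representative value of the fit is then the smallest s_k over all k, e.g. `min(s.values())`, and the fit whose p-subset yields the smallest such value is kept.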

2.4.6 Residual Consensus (RESC) Estimator

Yu et al. presented a highly robust estimator for range image segmentation and reconstruction (Yu, Bui et al. 1994). Because this method considers residual consensus, it is called RESC. RESC greatly improves the robustness of an estimator to outliers. Although some estimators in the literature (such as the Hough Transform and RANSAC) may tolerate more than 50% outliers, RESC claims a high breakdown point that can tolerate more than 80% outliers. This attractive characteristic of the RESC method is achieved by using a histogram compression approach for residual consensus. The basic idea of the RESC method is that if a model is correctly fitted, the residuals of the inliers should be small and, at the same time, the histogram of the residuals should concentrate within a small range in the lower part of the histogram.

In the RESC method, a histogram compression technique plays an important role in residual consensus. The histogram compression technique works as follows:
1. Estimate the original histogram of the residuals with as many histogram columns (say, 2000) as possible. This estimated histogram of residuals will be compressed in the following steps.
2. Decide the column width of the compressed histogram. The width is chosen so that the first column of the compressed histogram contains p percent of all data points. The value of p is experimentally chosen as 12. Then the number v of consecutive columns in the original histogram can easily be determined (these v consecutive columns of the original histogram will contain p percent of the data points).
3. Every v columns in the original histogram are then compressed into one column in the compressed histogram.

From the details of histogram compression, we can see that the column width of the compressed histogram depends on the noise level, which will change for different types of data. Instead of using only residual information in its objective function, the RESC method

considers two factors in its objective function: the number of inliers and the residuals of the inliers. The RESC method defines its objective function as:

ψ = (1 / v^β) Σ_{i=1}^{m} h_i^α / i^β    (2.33)

where i indexes the columns of the compressed histogram and h_i is the number of points in the ith column. α and β are coefficients which determine the relative importance of h_i and i; they are empirically set to α = 1.3 and β = 1.0. ψ is also called the histogram power.

The procedure of the RESC method is as follows:
1. Randomly choose k sets of p-subsets from the whole set of data points. A modified genetic algorithm is used by the RESC method to improve the speed.
2. Calculate the residuals and compress the histogram of the residuals.
3. Select the p-subset among the k sets whose histogram power is the highest.
4. Determine the standard deviation of the residuals.
5. Label the points of this primitive and remove them from the whole set of data points.
6. Remove the outliers in the labelled region.
7. Repeat steps 1-6 until all signals are extracted.

The RESC method is very robust to noise. It finds the parameters from the p-subset corresponding to the maximum of the histogram power. However, a disadvantage of the RESC method is that it needs the user to tune many parameters for optimal performance.
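As an illustration of the compression-and-power computation (a sketch of our own, not the authors' implementation: the 12% rule and the fine/coarse binning follow the description above, while the function name and minor binning choices are ours):

```python
def histogram_power(residuals, p_percent=0.12, n_fine_bins=2000, alpha=1.3, beta=1.0):
    """Compress a fine histogram of |residuals| so that the first compressed
    column holds about p_percent of the points, then return the histogram
    power psi = (1 / v**beta) * sum_i h_i**alpha / i**beta  (equation 2.33)."""
    r = sorted(abs(res) for res in residuals)
    n = len(r)
    width = (r[-1] + 1e-12) / n_fine_bins            # fine-bin width
    fine = [0] * n_fine_bins
    for res in r:
        fine[min(int(res / width), n_fine_bins - 1)] += 1
    # v = number of leading fine bins whose union holds >= p_percent of the points
    count, v = 0, 0
    for h in fine:
        count += h
        v += 1
        if count >= p_percent * n:
            break
    # compress: every v consecutive fine bins become one coarse column
    coarse = [sum(fine[i:i + v]) for i in range(0, n_fine_bins, v)]
    return sum(h**alpha / (i + 1)**beta for i, h in enumerate(coarse) if h) / v**beta
```

A correct fit concentrates its inlier residuals in the first few coarse columns, so its histogram power is much larger than that of a wrong fit whose residuals are spread out.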

2.5 Conclusion

In this chapter, we have reviewed several traditional and state-of-the-art robust estimators, with their advantages and disadvantages. In the following chapters, we will, by using the extra information inside the residuals, propose novel efficient robust methods and apply them to computer vision tasks.

3. Using Symmetry in Robust Model Fitting

3.1 Introduction

The pattern recognition and computer vision communities often employ robust methods for model fitting (Fischler and Bolles 1981; Rousseeuw 1984; Rousseeuw and Leroy 1987; Zhang 1997; Danuser and Stricker 1998; Stewart 1999). It is common to employ regression analysis to undertake such tasks (Rousseeuw and Leroy 1987). In particular, high breakdown-point methods such as Least Median of Squares (LMedS) and Least Trimmed Squares (LTS) have often been used in situations where the data are contaminated with outliers. LMedS and LTS are based on the idea that the correct fit will correspond to the one with the least median of residuals (for LMedS), or the least sum of trimmed squared residuals (for LTS). The essence of the argument claiming a high breakdown point for LMedS is that if the uncontaminated data are in the majority, then the median of the squared residuals should be unaffected by the outliers, and thus the median squared residual should be a reliable measure of the quality of the fit. Likewise, since the LTS method relies only on (the sum of squares of) the h smallest residuals, for some choice of the parameter h, it is thought that this should be robust to contamination so long as at least h data points belong to the true fit.

However, though the breakdown point of these methods can be as high as 50% (they can be robust to up to 50% contamination), they can break down at unexpectedly lower percentages when the outliers are clustered. Due to the effects of clustered outliers, the correct fit may not correspond to the fit with the least median of squared residuals (for LMedS) or the least sum of trimmed squared residuals (for LTS). It is worth mentioning that this phenomenon is not limited to LMedS and LTS. It also occurs with most other robust estimators, such as random sample consensus, RANSAC (Fischler and Bolles 1981), the residual consensus estimator, RESC (Yu, Bui et al. 1994), the adaptive least kth order squares estimator, ALKS (Lee, Meer et al. 1998), etc. The mechanism of the breakdown in these robust estimators is similar to that of LMedS and LTS (see chapter 4). This illustrates a general principle: most robust methods depend upon a single statistical property (the sum of the trimmed squared residuals or the median of the squared residuals, for example), and these methods ignore many other properties that the data, or the residuals to the fit, should have.

The key to salvaging the robustness of LMedS and LTS (and some other robust estimators), even in the presence of clustered outliers, can be to look beyond the standard definition of the robust methods and incorporate other measures and statistics into the formulation. In this chapter, we restrict ourselves to one such property: symmetry. Symmetry is very common and important in our world. When we fit circles, ellipses, or any symmetric object, one of the most basic features of the model is symmetry. In our method, we introduce the concept of symmetry distance (SD) and thereby propose, by taking advantage of the symmetry information in the visual data, an improved method, called the least trimmed symmetry distance (LTSD). The symmetry we employ, in this context, is symmetry about a central point (central with respect to the shape of interest).
The LTSD method is influenced not only by the sizes of the residuals of the data points, but also by the symmetry of the data points, and has applications where one is trying to fit a symmetric model (e.g. circles and ellipses). Experimental results show that the LTSD approach gives better results than the LMedS method and the LTS method in situations where a large percentage of clustered outliers and a large standard deviation of the inliers are encountered.

The main contributions of this chapter are as follows:
1. We illustrate situations where LMedS and LTS fail to correctly fit the data in the presence of clustered outliers, and analyze the reasons that cause the breakdown of these two methods. This provides an important cautionary note when employing these two robust estimators in situations where the outliers are clustered.
2. We introduce the concept of symmetry distance (SD) into model fitting. The concept of SD in computer vision is not novel; however, it is novel in the field of model fitting. Based on Su et al.'s point symmetry distance (Su and Chou 2001), we propose a novel symmetry distance and apply it to model fitting.
3. We experimentally show that the proposed method works better than LMedS and LTS under a large percentage of clustered outliers, for both simulated and real data.

This chapter is organized as follows: in section 3.2, the factors that cause both LMedS and LTS to fail to fit a model under a large percentage of clustered outliers are explored. In section 3.3, a novel symmetry distance measure is given, and our proposed method is developed in section 3.4. Experiments demonstrating the utility of the approach (for circle fitting and ellipse fitting) are given in section 3.5. Finally, some conclusions and future work are summarized in section 3.6.

3.2 Dilemma of the LMedS and the LTS in the Presence of Clustered Outliers

The LMedS method and the LTS method are based on the idea that the correct fit is determined by a simple statistic: the least median of the squared residuals (for LMedS), or the least sum of trimmed squared residuals (for LTS); and that such a statistic is not influenced by the outliers. Consider the contaminated distribution defined as follows (Haralick 1986; Hampel, Rousseeuw et al. 1986b):

F = (1 - ε)F_0 + εH    (3.1)

where F_0 is the inlier distribution and H is the outlier distribution.

Figure 3.1: An example where LMedS and LTS (h is set to 0.5n) fail to fit a circle even though there are under 44.5% outliers; the outliers, however, are clustered.

Equation (3.1) is also called the gross error model. When the standard deviation of F_0 is small (<<1) and that of H is large, or H is uniformly distributed, the assumptions leading to the robustness of LMedS or LTS are true. However, when F_0 is scattered, i.e. the standard deviation of F_0 is big, and H is clustered with high density, the assumption is not always true. Let us investigate an example. In Figure 3.1, F_0 (bivariate normal with unit variance), comprising 100 data points, was generated by adding noise to samples of a circle with radius 10.0 and center at (0.0, 0.0). Then 80 clustered outliers were added, possessing a spherical bivariate normal distribution with unit standard deviation and mean (20.0, 6.0). As Figure 3.1 shows, both LMedS and LTS failed to fit the circle: LMedS returned a result with a radius equal to and a center located at ( , ). The results obtained by LTS were: the radius was and the center was at (1.1445, ).
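Data of the kind used in this example can be reproduced along the following lines (an illustrative sketch of sampling from the gross error model (3.1); the helper name is ours and the random draws will of course differ from run to run):

```python
import math, random

def contaminated_circle(n_inliers=100, n_outliers=80, radius=10.0,
                        inlier_std=1.0, outlier_center=(20.0, 6.0), outlier_std=1.0):
    """Sample from F = (1 - eps) F0 + eps H: noisy points on a circle (F0)
    plus a clustered Gaussian blob of outliers (H)."""
    pts = []
    for _ in range(n_inliers):
        t = random.uniform(0.0, 2.0 * math.pi)
        r = radius + random.gauss(0.0, inlier_std)      # radial Gaussian noise
        pts.append((r * math.cos(t), r * math.sin(t)))
    for _ in range(n_outliers):
        pts.append((outlier_center[0] + random.gauss(0.0, outlier_std),
                    outlier_center[1] + random.gauss(0.0, outlier_std)))
    return pts
```

With the default arguments this yields 100 circle inliers and 80 clustered outliers, i.e. about 44.5% contamination, as in Figure 3.1.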

It is important to point out that the failure is inherent, and not simply an artifact of our implementation. Let us check the median of the residuals (for LMedS) and the sum of trimmed squared residuals (for LTS), and we will understand why LMedS and LTS failed to fit the circle. The median of residuals of the perfect fit is However, the median of residuals of the final result of the LMedS method is 

Figure 3.2: LMedS searches for the best fit with the least median of residuals.

Figure 3.2 shows the time evolution of the parameters: the median of residuals (top left), the centre of the fitted circle (x coordinate and y coordinate in the top right and bottom left, respectively), and the radius of the fitted circle (bottom right) as LMedS searches for the best fit with the least median of residuals. (Note: the iterations pass by the correct fit (pointed out by arrows), proceeding to fits with an even lower median of residuals.) In fact, during the search procedure, the LMedS estimator consistently minimizes the median of the residuals, starting with initial fits that have a larger median residual than the true fit, but successively finding fits with lower median residuals, proceeding to even lower median residuals than that possessed by the true fit.

The reason that LTS failed is similar. LTS finds the fit with the smallest sum of trimmed squared residuals. The value of the least sum of trimmed squared residuals obtained is However, the same statistic for the true fit is Clearly, LTS has correctly, by

its criterion, obtained a better fit (but, in fact, the wrong one). The problem is not with the implementation but with the criterion.

Figure 3.3: Breakdown plot: (a) one case of the distribution of the data; the results of LMedS (b) and LTS (c) in circle fitting are affected by the standard deviation of the inliers and the percentage of clustered outliers. (Panels (b) and (c) plot the estimated radius against the percentage of clustered outliers, for inlier standard deviations of 0.4, 0.7, 1.0, 1.3 and 1.6.)

Now let us consider another example, showing that the results of LMedS and LTS are affected by the standard deviation of the inliers. We generated a circle with radius 10.0 and center at (0.0, 0.0). In addition, clustered outliers were added to the circle with mean (20.0, 6.0) and unit standard deviation. In total, 100 data points were generated. At first, we assigned all 100 data points to the circle, without any outliers. Then we repeatedly moved two points from the circle to the clustered outliers until 50 points were left on the circle. Thus, the percentage of outliers changed from 0 to 50%. In addition, for each percentage of clustered outliers, we varied the standard deviation of the inliers from 0.4 to 1.6 with a step size of

0.3. Figure 3.3 (a) illustrates one example of the distribution of the data, with 38% clustered outliers and an inlier standard deviation of 1.3. From Figure 3.3 (b), we can see that when the standard deviation of the inliers is no more than 1.0, LMedS can give the right results under a high percentage of outliers (more than 44%). However, when the standard deviation of the inliers is more than 1.0, LMedS does not give the right result even when the percentage of outliers is less than 40%. From Figure 3.3 (c), we can see that when the standard deviation of the inliers is 0.4, the LTS estimator can correctly give the results even under 50% clustered outliers, while when the standard deviation of the inliers is 1.6, LTS does not give the right results even when only 30 percent of the data are outliers.

From the discussion above, we now see several conditions under which LMedS and LTS fail to be robust. A crucial point is that these methods measure only one single statistic, the least median of residuals or the least sum of trimmed squared residuals, omitting other characteristics of the data. If we look at the failures, we can see that the results lost the most basic and common feature of the inliers with respect to the fitted circle: symmetry. In the next section, we will introduce the concept of symmetry distance into robust regression methods and propose an improved method, called the Least Trimmed Symmetry Distance (LTSD), by which better performance is acquired even when the data include clustered outliers.

3.3 The Symmetry Distance

Symmetry is considered a pre-attentive feature that enhances recognition and reconstruction of shapes and objects (Attneave 1995). Symmetry exists almost everywhere around us. A square, a cube, a sphere, and many geometric patterns show symmetry. Architecture usually displays symmetry. Symmetry is also an important parameter in physical and chemical processes and is an important criterion in medical diagnosis. Even we human beings show symmetry (for instance, our faces and bodies are roughly symmetrical between right and left).
One of the most basic features of the shapes of the models we often fit/impose on our data, e.g. circles and ellipses, is the symmetry of the model.

Symmetric data should suggest symmetric models, and data that are symmetrically distributed should be preferred as the inlier data (as opposed to the outliers). For decades, symmetry has been widely studied in the computer vision community. For example, considerable effort has been focused on the detection of symmetry in images, in regard to mirror symmetries (Marola 1989; Nalwa 1989) and circular symmetries (Bigun 1988; Reisfeld, Wolfson et al. 1992); Kirby et al. used the symmetric features of images for image compression (Kirby and Sirovich 1990); Zabrodsky treated symmetry as a continuous feature and applied it to finding the orientation of symmetric objects (Zabrodsky, Peleg et al. 1995); skewed symmetries in 3D structures have been extensively studied (Oh, Asada et al. 1988; Ponce 1990; Gross and Boult 1994). Symmetry has also been treated as a feature in cluster analysis (Su and Chou 2001). More detailed definitions of symmetry can be found in (Zabrodsky 1993). We demonstrate here that symmetry can also be used as a feature to enhance the performance of robust estimators when fitting models with symmetric structure.

3.3.1 Definition of Symmetry

There are many kinds of symmetry in existence in the world. Generally speaking, symmetry can be classified into the following four basic types, which are shown in Figure 3.4 (Zabrodsky 1993; Zabrodsky, Peleg et al. 1995):
1. Mirror-symmetry: an object is invariant under a reflection about a line (for 2D) or a plane (for 3D).
2. C_n-symmetry: an object is invariant under rotation of 2π/n radians about its center (for 2D) or about a line passing through its center (for 3D).
3. D_n-symmetry: an object has both mirror-symmetry and C_n-symmetry.
4. Circular-symmetry: an object has C_∞-symmetry.

Figure 3.4: Four kinds of symmetry: (a) mirror-symmetry; (b) C_4-symmetry; (c) D_4-symmetry; (d) circular symmetry.

3.3.2 The Symmetry Distance

The exact mathematical definition of symmetry (Weyl 1952; Miller 1972) is insufficient to describe and quantify the symmetry found both in the natural world and in the visual world. Su and Chou proposed a symmetry distance measure based on the concept of point symmetry (Su and Chou 2001). Given n points x_i, i = 1, ..., n, and a reference vector C (e.g. the centroid of the data), the point symmetry distance between a point x_j and C is defined as follows:

d_s(x_j, C) = min_{i=1,...,N and i≠j}  ||(x_i - C) + (x_j - C)|| / (||x_i - C|| + ||x_j - C||)    (3.2)

From equation (3.2), we can see that the point symmetry distance is non-negative by definition. In essence, the measure tries to balance data points with others symmetric about the centroid; for example, if x_i = (2C - x_j) exists in the data, then d_s(x_j, C) = 0. However, according to (3.2), one point could be used repeatedly as the balancing point with respect to the center. This does not seem to properly capture the notion of symmetry. In order to avoid one point being used as a symmetric point more than once by other points, we refine the point symmetry distance between a point x_j and C as follows:

D_s(x_j, C) = min_{i=1,...,N and i≠j and x_i ∉ R}  ||(x_i - C) + (x_j - C)|| / (||x_i - C|| + ||x_j - C||)    (3.3)

where R is the set of points that have already been used as symmetric points. Based on the concept of the point symmetry distance, we propose a non-metric Symmetry Distance (SD). Given a pattern x consisting of n points x_1, ..., x_n, and a reference vector C, the symmetry distance of the pattern x with respect to the reference vector C is:

SD_n(x, C) = (1/n) Σ_{i=1}^{n} D_s(x_i, C)    (3.4)

When the SD of a pattern is equal to 0.0, the pattern is perfectly symmetric; when the SD of a pattern is very big, the pattern has little symmetry.

3.4 The Least Trimmed Symmetry Distance (LTSD) Method

We propose a new method, which couples the LTS method with the symmetry distance measure defined in sub-section 3.3.2. That is, besides the residuals, we also choose the symmetry distance as a criterion in the model fitting. For simplicity, we call the proposed method LTSD (Least Trimmed Symmetry Distance). Mathematically, the LTSD estimate can be written as:

θ̂ = arg min_{θ,C} SD_h(x, C)    (3.5)

Only the h data points with the smallest sorted residuals are used to calculate the symmetry distance. The estimated parameters correspond to the least symmetry distance. The specific details of the proposed method are as follows:
1. Set the number of repetitions (RT) according to equation (2.23). Initialise h with [(n+p+1)/2] <= h <= n. If we want LTSD to have a high breakdown point, say 50%, we can set h = (n+p+1)/2.
2. Randomly choose a p-subset, and extend it to an h-subset H_1 by method (2) in sub-section 2.3.4.
3. Compute θ̂_1 by the LS method based on H_1. Compute the symmetry distance SD_1 based on θ̂_1 and H_1, using equation (3.4) in sub-section 3.3.2 and using the centre of the fit (circle or ellipse) as the reference vector C. Decrement RT; if RT is smaller than 0, go to step 4, otherwise go to step 2. We calculate the parameters θ̂ based on the h-subset instead of the p-subset in order to improve the statistical efficiency.
4. Finally, output the θ̂ with the lowest SD.

3.5 Experimental Results

In this section, we show several examples of using the proposed method to fit a model with symmetric structure. Circle fitting and ellipse fitting have been very popular topics in the computer vision field. One of the obvious characteristics of circles and ellipses is that they are symmetric. We first present an example of circle fitting; then we present a relatively more complicated example of ellipse fitting. The results are compared with those of the LMedS method and the LTS method.
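The symmetry distance of equations (3.3)-(3.4) can be sketched as follows (an illustrative Python version of our own; greedy matching is one way to realize the "use each balancing point only once" rule via the set R, and the function name is ours):

```python
import math

def symmetry_distance(points, center):
    """SD of a 2D point pattern about `center` (equations 3.3-3.4): each point
    is matched to its best unused balancing partner, and the per-point values
    ||(x_i - C) + (x_j - C)|| / (||x_i - C|| + ||x_j - C||) are averaged."""
    v = [(x - center[0], y - center[1]) for x, y in points]   # x_i - C
    used = set()                                              # the set R
    total = 0.0
    for j in range(len(v)):
        best, best_i = None, None
        for i in range(len(v)):
            if i == j or i in used:
                continue
            num = math.hypot(v[i][0] + v[j][0], v[i][1] + v[j][1])
            den = math.hypot(*v[i]) + math.hypot(*v[j])
            d = num / den if den > 0 else 0.0
            if best is None or d < best:
                best, best_i = d, i
        if best is not None:
            used.add(best_i)
            total += best
    return total / len(v)
```

A perfectly symmetric pattern, e.g. the four points (±1, 0) and (0, ±1) about the origin, yields SD = 0. Step 3 of the LTSD procedure evaluates this quantity on the h points with the smallest residuals, using the fitted centre as C; the double loop is what makes each SD evaluation cost about O(n^2).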

3.5.1 Circle Fitting

In Figure 3.5, about 45 percent clustered outliers were added to the original circle data. Since LMedS and LTS rely only on the residuals of the data points, their results are affected by the standard deviation of the inliers and the percentage of clustered outliers. Therefore, they failed to fit the circle under a high percentage of clustered outliers (see Figure 3.1). However, because the LTSD method considers the symmetry of the object, it finds the right model (see Figure 3.5): the true centre and radius of the circle are (0.0, 0.0) and 10.0 respectively; with the LTSD method, we obtained centre (-0.23, 0.01) and radius 

Figure 3.5: Using the symmetry of the circle, the LTSD method found approximately the right results under 44.5% clustered outliers.

Another example showing the advantages of the proposed method is given in Figure 3.6 (a) (corresponding to Figure 3.3 (a)). From Figure 3.6 (a), we can see that when the outliers are clustered, LMedS and LTS break down under a very low percentage of outliers; in this case, they both broke down under 38% outliers. In comparison with the LMedS and LTS methods, the proposed method gives the most accurate results. The proposed method is affected less by the standard deviation of the inliers and the percentage of clustered outliers. Figure 3.6 (b) shows that the radius found by the LTSD method in circle fitting (the true radius is 10.0) changed less under different inlier standard deviations and percentages of clustered outliers. In comparison with Figure 3.3 (b) and (c), the fluctuation of

the radius found by the LTSD method is smaller. Even when 50 percent clustered outliers exist in the data and the standard deviation of the inliers is 1.6, the results did not (yet) break down; however, both LMedS and LTS broke down.

Figure 3.6: (a) A comparative result of the LTSD, LMedS, and LTS with 38% clustered outliers; (b) the results of the LTSD method are affected less by the standard deviation of the inliers and the percentage of clustered outliers.

3.5.2 Ellipse Fitting

Ellipses are one of the most common and important primitive models in computer vision and pattern recognition, and often occur in geometric shapes, man-made and natural scenes. Ellipse fitting is a very important task for many industrial applications because it can reduce the data and benefit higher-level processing (Fitzgibbon, Pilu et al. 1999). Circles may be projected into ellipses under perspective projection. Thus ellipses are frequently used in computer vision for model matching (Sampson 1982; Fitzgibbon, Pilu et al. 1999; Robin 1999). In this subsection, we apply the proposed robust method, LTSD, to ellipse fitting. A general conic equation can be written as follows:

ax^2 + bxy + cy^2 + dx + ey + f = 0    (3.6)

where (a, b, c, d, e, f) are the parameters to be found from the given data. When b^2 < 4ac, the equation above corresponds to an ellipse. The ellipse can also be represented by its more intuitive geometric parameters:

(x cos θ + y sin θ - x_c cos θ - y_c sin θ)^2 / A^2 + (-x sin θ + y cos θ + x_c sin θ - y_c cos θ)^2 / B^2 = 1    (3.7)

where (x_c, y_c) is the center of the ellipse, A and B are the major and minor axes, and θ is the orientation of the ellipse. The relation between (a, b, c, d, e, f) and (x_c, y_c, A, B, θ) can be written as (Robin 1999):

x_c = (be - 2cd) / (4ac - b^2)
y_c = (bd - 2ae) / (4ac - b^2)
{A, B} = sqrt( -2 f_c / (a + c ∓ sqrt(b^2 + (a - c)^2)) ),  where f_c = f + (d x_c + e y_c)/2
θ = (1/2) tan^{-1}( b / (a - c) )    (3.8)

It is convenient to find (a, b, c, d, e, f) first from the given data and then convert to (x_c, y_c, A, B, θ). As illustrated in Figure 3.7 and Table 3.1, 200 data points were generated with 40% clustered outliers. The outliers were compacted within a region of radius 5 centered at (20.0, 5.0). The ellipse had standard deviation 0.8, major axis 10.0, minor axis 8.0, center (0.0, 0.0), and orientation θ to the horizontal direction of 0.0 degrees. The results of LTS and LMedS were seriously affected by the clustered outliers. However, the LTSD method worked well.
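The algebraic-to-geometric conversion used above can be transcribed directly (an illustrative helper of our own, not code from the thesis; the semi-axes are obtained from the two eigenvalues of the conic's quadratic part, and the orientation from the eigenvector of the smaller eigenvalue):

```python
import math

def conic_to_geometric(a, b, c, d, e, f):
    """Convert conic coefficients (ax^2 + bxy + cy^2 + dx + ey + f = 0, with
    b^2 < 4ac) to geometric ellipse parameters (x_c, y_c, A, B, theta)."""
    den = 4 * a * c - b * b
    xc = (b * e - 2 * c * d) / den
    yc = (b * d - 2 * a * e) / den
    fc = f + (d * xc + e * yc) / 2                 # conic value at the center
    root = math.sqrt(b * b + (a - c) ** 2)
    lam_small = (a + c - root) / 2                 # eigenvalue of the major axis
    A = math.sqrt(-2 * fc / (a + c - root))        # major semi-axis
    B = math.sqrt(-2 * fc / (a + c + root))        # minor semi-axis
    if b == 0:
        theta = 0.0 if a <= c else math.pi / 2
    else:
        theta = math.atan2(lam_small - a, b / 2) % math.pi  # major-axis angle
    return xc, yc, A, B, theta
```

For example, the axis-aligned ellipse 4(x-1)^2 + 9(y-2)^2 = 36, i.e. a=4, b=0, c=9, d=-8, e=-36, f=4, maps back to center (1, 2), axes 3 and 2, and θ = 0.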

Figure 3.7: Comparison of the results obtained by the LTSD, LTS and LMedS methods in ellipse fitting under 40% clustered outliers.

Table 3.1: Comparison of the parameters (x_c, y_c, major axis, minor axis, θ in degrees) estimated by the LTSD, LTS, and LMedS methods in ellipse fitting under 40% clustered outliers. (Rows: true value; the LTSD method; the LTS method; the LMedS method.)

Next, we will apply the LTSD method to real images.

3.5.3 Experiments with Real Images

The first example fits an ellipse in an image of a mouse pad, shown in Figure 3.8. The edge image was obtained by using the Canny operator with threshold . In total, 310 data points were in the edge image (Figure 3.8 (b)). The clustered outliers, due to the flower, occupy 50% of the data. Three methods (the LTSD, LTS and LMedS) were applied to detect the mouse pad edge. As shown in Figure 3.8 (c), both LTSD and LTS correctly found the edge of the mouse pad. However, LMedS fails to detect the edge of the mouse

pad. This is because, under the condition that the standard deviation of the inliers is small, the statistical efficiency of LTS is better than that of LMedS.

Figure 3.8: Fitting a mouse pad: (a) a mouse pad with some flowers; (b) the edge image obtained by using the Canny operator; (c) the results obtained by the LTSD, LTS and LMedS methods.

Figure 3.9 shows the use of the LTSD method to fit an ellipse to the rim of a cup. Figure 3.9 (a) gives a real cup image. After applying the Prewitt operator, the detected edge of the cup is shown in Figure 3.9 (b). We can see that there is a high percentage (about 45%) of clustered outliers in the edge image, external to the rim of the cup (the ellipse we shall try to fit), mainly due to the figure on the cup. However, the rim of the cup has a symmetric elliptical structure. Figure 3.9 (c) shows that the LTSD method correctly finds the ellipse at the opening of the cup, while both LTS and LMedS fail to correctly fit the ellipse.

Figure 3.9: Fitting the ellipse in a cup: (a) a real cup image; (b) the edge of the cup obtained by applying the Prewitt operator; (c) comparative results obtained by the LTSD, LTS and LMedS methods.

3.5.4 Experiments for Data with Uniform Outliers

Finally, we investigated the characteristics of the LTSD under uniform outliers. We generated 200 data points with 40% uniform outliers (see Figure 3.10). The ellipse had standard deviation 0.5, major axis 10.0, minor axis 8.0, center (0.0, 0.0), and orientation θ to the horizontal direction of 0.0 degrees. The uniform outliers were randomly distributed in a rectangle with upper-left corner (-20.0, 20.0) and lower-right corner (20.0, -20.0). We repeated the experiment 100 times, and the averaged results are shown in Table 3.2. We can see that the LTSD method also works well with uniform outliers.

Figure 3.10: An ellipse with 40% randomly distributed outliers.

Table 3.2: Comparison of the parameters (x_c, y_c, major axis, minor axis, θ in degrees) estimated by the LTSD, LTS, and LMedS methods in ellipse fitting with 40% randomly distributed outliers. (Rows: true value; the LTSD method; the LTS method; the LMedS method.)

3.6 Conclusion

The fragility of the traditionally employed robust estimators LMedS and LTS in the presence of clustered outliers has been demonstrated in this chapter (a similar story applies more widely; see chapter 4). These robust estimators can break down at a surprisingly low percentage of outliers when the outliers are clustered. Thus this chapter provides an important cautionary note to the computer vision community to employ robust estimators carefully when outliers are clustered. We also proposed a new method that incorporates the symmetry distance into model fitting. The comparative study shows that this method can achieve better performance than the least median of squares method and the least trimmed

squares method, especially when large percentages of clustered outliers exist in the data and the standard deviation of the inliers is large. The price paid for the improvement in fitting models is an increase in computational complexity due to the complicated definition of the symmetry distance: it takes about O(n^2) time to compute the symmetry distance (SD) for each p-subset. The proposed method can be applied to the fitting of other symmetric shapes and to other fields.

Unfortunately, LTSD was especially designed for spatially symmetric data distributions. For inlier distributions that are not spatially symmetric (including structures that, though they may be symmetric, have large amounts of missing or occluded data, so that the visible inliers are not symmetric), the LTSD is not a good choice (note: for this kind of data, we have developed other novel highly robust estimators that will be presented in the following chapters). However, the LTSD does provide a feasible way to greatly improve the performance of the conventional estimators LMedS and LTS, especially when the data contain inliers (with symmetry) with large variance and are contaminated by a large percentage of clustered outliers.

In the next chapter, we will take advantage of the information in the structure of the pdf of the residuals and propose a more general highly robust estimator, MDPE, which can be widely applied to many computer vision tasks and is applicable to missing data, data with occlusion, and data without symmetry. The same motivation also underlies several improved methods: QMDPE and vbQMDPE (in chapters 5 and 6).

Chapter 4

MDPE: A Novel and Highly Robust Estimator

4.1 Introduction

Many robust estimators (such as M-estimators, LMedS, LTS, etc.) have been developed in the statistics field. However, they assume that inliers occupy an absolute majority of the whole data, and thus they will break down for data involving an absolute majority of outliers. Obviously, the requirement that 50% or more of the data belong to inliers may not always be satisfied, e.g., when the data contain multiple surfaces, when data from multiple views are merged, or when more than 50% of the data points are noise. For these cases, we need to find a more robust estimator that can tolerate more than 50% outliers.

This chapter presents a novel robust estimator (MDPE). This estimator applies nonparametric density estimation and density gradient estimation techniques to parametric estimation ("model fitting"). The goals in designing MDPE are: it should be able to fit signals corresponding to less than 50% of the data points, and it should be able to fit data with multiple structures. In developing MDPE, we make the common assumption that the residuals of

the inliers are contaminated by Gaussian noise (although the precise nature of the noise distribution is not essential, depending only upon zero mean and unimodality). We also assume that the signal we seek to fit occupies a relative majority of the data; that is, there is no other population, belonging to a valid structure, that singly has a larger population. In other words, if there are multiple structures, we seek to fit the largest structure (in terms of population of data, which is often related to, but not necessarily identical to, geometric size). Of course, in a complete application of MDPE, such as the range segmentation algorithm presented in the next chapter, one can apply the estimator serially: identify the largest structured population, remove it, and then seek the largest in the remaining population, etc.

Key components of MDPE are:

Probability density estimation in conjunction with mean shift techniques (Fukunaga and Hostetler 1975). The mean shift vector always points towards the direction of the maximum increase in the probability density function (see section 4.2). Through the mean shift iterations, the local maximum density, corresponding to the mode (or the center of a region of high concentration) of the data, can be found.

MDPE optimizes an objective function that measures more than just the size of the residuals. It considers the following two factors at the same time:
- The density distribution of the data points (in residual space) estimated by the density estimation technique.
- The size of the residual corresponding to the local maximum of the probability density distribution.

If the signal is correctly fitted, the densities of inliers should be as large as possible; at the same time, the center of the high concentration of data should be as close to zero as possible in residual space. Thus, both the density distribution of data points in residual space and the size of the residual corresponding to the local maximum of the density distribution are considered as important characteristics in the objective function of MDPE.
MDPE can tolerate a large percentage of outliers and pseudo-outliers (empirically, usually more than 85%), and it can achieve better performance than other similar robust estimators.

To demonstrate the performance of MDPE, we compare it, based upon tests on both synthetic and real images, with five other popular robust estimators (from both the statistics and computer vision fields): the Hough Transform (HT), Random Sample Consensus (RANSAC), Least Median of Squares (LMedS), Residual Consensus (RESC), and Adaptive Least kth Order Squares (ALKS). Experiments show that MDPE has a higher robustness to outliers and fewer errors than the other five estimators.

The contributions of this chapter can be summarized as follows:
- We apply nonparametric density estimation and density gradient estimation techniques to parametric estimation.
- We provide a novel estimator, MDPE, which can usually tolerate more than 85% outliers although it is simple and easy to implement.
- The performance of MDPE has been compared with those of five other popular methods, including traditional ones (RANSAC, the Hough Transform, and LMedS) and recently proposed ones (RESC and ALKS).

The organization of this chapter is as follows: density gradient estimation and the mean shift method are introduced in section 4.2. Section 4.3 describes the MDPE method. Comparative experimental results of MDPE and several other robust estimators are contained in section 4.4. Finally, we conclude with a summary and a discussion of further possible work in section 4.5.

4.2 Nonparametric Density Gradient Estimation and Mean Shift Method

There are several nonparametric methods available for probability density estimation: the histogram method, the naive method, the nearest neighbor method, and kernel estimation (Silverman 1986). The kernel estimation method is one of the most popular techniques used in estimating density. Given a set of n data points {X_i}, i=1,...,n, in a d-dimensional

Euclidean space R^d, the multivariate kernel density estimator with kernel K and window radius (bandwidth) h is defined as follows ((Silverman 1986), p.76):

$$\hat{f}(x) = \frac{1}{nh^d}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right) \qquad (4.1)$$

The kernel function K(x) should satisfy some conditions ((Wand and Jones 1995), p.95). There are several different kinds of kernels. The Epanechnikov kernel ((Silverman 1986), p.76) is one optimum kernel, which yields the minimum mean integrated square error (MISE):

$$K_e(X) = \begin{cases} \dfrac{1}{2c_d}(d+2)\left(1 - X^T X\right) & \text{if } X^T X < 1 \\ 0 & \text{otherwise} \end{cases} \qquad (4.2)$$

where c_d is the volume of the unit d-dimensional sphere, e.g., c_1 = 2, c_2 = π, c_3 = 4π/3. The estimate of the density gradient can be defined as the gradient of the kernel density estimate (4.1):

$$\hat{\nabla} f(x) \equiv \nabla \hat{f}(x) = \frac{1}{nh^d}\sum_{i=1}^{n} \nabla K\left(\frac{x - X_i}{h}\right) \qquad (4.3)$$

According to (4.3), the density gradient estimate of the Epanechnikov kernel can be written as:

$$\hat{\nabla} f(x) = \frac{n_x}{n\,(h^d c_d)}\,\frac{d+2}{h^2}\left(\frac{1}{n_x}\sum_{X_i \in S_h(x)} [X_i - x]\right) \qquad (4.4)$$

where the region S_h(x) is a hypersphere of radius h, having the volume h^d c_d, centered at x, and containing n_x data points. The mean shift vector M_h(x) is defined as

$$M_h(x) = \frac{1}{n_x}\sum_{X_i \in S_h(x)} [X_i - x] = \frac{1}{n_x}\sum_{X_i \in S_h(x)} X_i \;-\; x \qquad (4.5)$$

Equation (4.4) can be rewritten as:

$$M_h(x) = \frac{h^2}{d+2}\,\frac{\hat{\nabla} f(x)}{\hat{f}(x)} \qquad (4.6)$$

Equation (4.6) first appeared in (Fukunaga and Hostetler 1975). Equation (4.5) shows that the mean shift vector is the difference between the local mean and the center of the window. Equation (4.6) shows that the mean shift vector is an estimate of the normalized density gradient. The mean shift is an unsupervised nonparametric estimator of density gradient. One characteristic of the mean shift vector is that it always points towards the direction of the maximum increase in the density.

The mean shift algorithm can be described as follows:
1. Choose the radius of the search window.
2. Initialize the location of the window.
3. Compute the mean shift vector M_h(x).
4. Translate the search window by M_h(x).
5. Repeat steps 3 and 4 until convergence.

The converged centers (or windows) correspond to modes (or centers of the regions of high concentration) of the data, represented as arbitrary-dimensional vectors. The proof of the convergence of the mean shift algorithm can be found in (Comaniciu and Meer 2002a). Since its introduction by Fukunaga and Hostetler (1975), the mean shift method has been extensively exploited and applied, for its ease and efficiency, in low-level computer vision tasks such as video tracking (Comaniciu, Ramesh et al. 2000), image filtering (Comaniciu and Meer 1999a), clustering (Cheng 1995; Comaniciu and Meer 1999b) and image segmentation (Comaniciu and Meer 1997; Comaniciu and Meer 2002a).
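The algorithm above can be sketched in one dimension, the setting in which MDPE later applies it to signed residuals. The following is a minimal illustrative sketch, not code from the thesis: with the Epanechnikov kernel, the mean shift step of equations (4.5) and (4.6) reduces to moving the window center to the sample mean of the points inside the radius-h window, and iterating until convergence. All names are our own.

```python
def mean_shift_1d(points, start, h, tol=1e-6, max_iter=100):
    """Seek a density mode: repeatedly move the window center to the
    mean of the points inside the radius-h window (equation (4.5))."""
    center = start
    for _ in range(max_iter):
        window = [x for x in points if abs(x - center) <= h]
        if not window:
            break  # empty window: nothing to average, stay put
        new_center = sum(window) / len(window)
        if abs(new_center - center) < tol:
            return new_center
        center = new_center
    return center
```

Started inside the basin of attraction of either concentration of points, the iteration converges to that concentration's mode, mirroring the two-window illustration of Figure 4.1.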

Figure 4.1: One example where the mean shift estimator found the local maxima of the probability densities.

To illustrate the mean shift method, two sets of samples from normal distributions were generated, each having 1000 data points with unit variance. One distribution had zero mean, and the other had a mean of 4.0 (see Figure 4.1). These points were distributed along the abscissa, but here we choose to plot only the corresponding probability density at those data points. We selected two initial points as the centers of the initial windows: P0 (-2.0) and P1 (2.5). The search window radius was chosen as 1.0. After applying the mean shift algorithm, the mean shift estimator automatically found the local maximum densities (the centers of the converged windows): P0' converged to the mode of the zero-mean distribution, and P1' to the mode of the distribution with mean 4.0. The centers (P0' and P1') of the converged windows correspond to the local maximum probability densities, that is, the two modes.

4.3 Maximum Density Power Estimator: MDPE

The Density Power (DP)

Random sampling techniques have been widely used in many methods, for example, RANSAC, LMedS, RESC, ALKS, etc. (see chapter 2). Each uses random sampling to choose p points, called a p-subset, to determine the parameters of a model from that p-subset (p equals 2 for a line, 3 for a circle or plane, 6 for a quadratic curve), and finally outputs the parameters determined by the p-subset with the minimum or maximum of the respective objective function. They differ in the objective functions used to rank the p-subsets.

Here we define a new objective function. When a model is correctly fitted, two criteria should be satisfied: (1) data points on or near the model (inliers) should be as many as possible; (2) the residuals of the inliers should be as small as possible. Most objective functions of existing methods consider either one of the criteria or both. RANSAC (Fischler and Bolles 1981) applies criterion (1) in its optimization process and outputs the result with the highest number of data points within an error bound. The least squares method uses criterion (2) as its objective function, but it minimizes the residuals of all data points, without the ability to differentiate the inliers from the outliers. MUSE, instead of minimizing the residuals of inliers, minimizes the scale estimate provided by the kth ordered absolute residual. RESC combines both criteria into its objective function, i.e., the histogram power. Among all these methods, RESC obtains the highest breakdown point. It seems, therefore, that it is preferable to consider both criteria in the objective function.

The new estimator we introduce here, MDPE, also considers these two criteria in its objective function. We assume the residuals of the inliers (good data points) satisfy a zero-mean, smooth and unimodal distribution, e.g., a Gaussian-like distribution.
If the model to fit is correctly estimated, the data points on or near the fitted structure should have a high probability density; at the same time, the center of the window converged upon by the mean shift procedure (corresponding to the highest local probability density) should be as close

to zero as possible in residual space. According to the above assumptions, our objective function ψ_DP considers two factors: (1) the densities f̂(X_i) of all data points within the converged window W_c, and (2) the center X_c of the converged window; ψ_DP increases with Σ_{X_i∈W_c} f̂(X_i) and decreases with |X_c|. We define the probability density power function as

$$\psi_{DP} = \sum_{X_i \in W_c} \hat{f}(X_i)\,\exp(-\alpha\,|X_c|) \qquad (4.7)$$

where X_c is the center of the converged window W_c obtained by applying the mean shift procedure, and α is a factor that adjusts the relative influence of the probability density and of the residual of the point corresponding to the center of the converged window; α is empirically set to 1.0. Experimentally, we have found the above form to behave better than various other alternatives having the same general form. If a model is correctly found, |X_c| is very small and the densities within the converged window are very high, so our objective function produces a high score. Experiments, presented in the next section, show that MDPE is a very powerful method for data with a large percentage of outliers.

The MDPE Algorithm

As Lee et al. stated (Lee, Meer et al. 1998), no one-step robust estimator can have a breakdown point exceeding 50%, but estimators adopting multiple-step procedures with an apparent breakdown point exceeding 50% are possible. MDPE adopts a multi-step procedure, which can be described as follows:

(1) Choose a search window radius h and a repetition count m. The value m can be chosen according to equation (2.23).

(2) Randomly choose one p-subset, estimate the model parameters from the p-subset, and calculate the signed residuals of all data points.
(3) Apply the mean shift steps in the residual space with initial window center zero. Notice that the mean shift is employed in a one-dimensional space: signed-residual space. The converged window center is obtained by the mean shift procedure of section 4.2.
(4) Calculate the densities (using equation (4.1)) corresponding to the positions of all data points within the converged window of radius h.
(5) Calculate the density power according to equation (4.7).
(6) Repeat steps (2) to (5) m times. Finally, output the parameters with the maximum density power; the result comes from the one p-subset corresponding to the maximum density power.

In order to improve the statistical efficiency, a weighted least squares procedure ((Rousseeuw and Leroy 1987), p.202) is carried out after the initial MDPE fit. However, a more robust scale estimator, TSSE (which was proposed at a later stage; see chapter 7), can also be employed.

Instead of estimating a fit involving an absolute majority of the data set, MDPE finds a fit having a relative majority of the data points. This makes it possible, in practice, for MDPE to obtain a high robustness that can tolerate more than 50% outliers.

4.4 Experiments and Analysis

Next, we will compare the abilities of several estimators (MDPE, RESC, ALKS, LMedS, RANSAC, and the Hough Transform) to deal with data with a large percentage of outliers. We choose RANSAC and the Hough Transform as two methods to compare with because they are very popular and have been widely applied in computer vision. Provided with the correct error tolerance (for RANSAC) or bin size (for the Hough Transform), they can tolerate

more than 50% outliers. Although LMedS has only a 0.5 breakdown point and cannot tolerate more than 50% outliers, it needs no prior knowledge of the variance of the inliers. RESC and ALKS are two relatively new methods and represent modern developments in robust estimation. We also note that RANSAC, LMedS, RESC, ALKS, and MDPE all adopt similar four-step procedures: randomly sampling; estimating the parameter candidate for each sample; evaluating the quality of each candidate; and outputting the final parameter estimate with the best quality measure.

In this section, we will investigate the characteristics of the six methods under clustered outliers and different percentages of outliers. We also investigate the time complexity of five of the methods (LMedS, RANSAC, ALKS, RESC, and MDPE). We produce the breakdown plot of the six methods, and test the influence of the choice of window radius on MDPE. Unless we specify otherwise, the window radius h for MDPE will be set at 2.0 for the experiments in this section.

Experiment 1

In this experiment, the performance of MDPE in line fitting and circle fitting will be demonstrated, and its tolerance to large percentages of outliers will be compared with five other popular methods: RANSAC, the Hough Transform, LMedS, RESC, and ALKS. The time complexity of the five methods (except for the Hough Transform) will also be evaluated and compared. We will show that some methods break down. We can (and have) checked whether such a breakdown is an artifact of implementation (e.g. of random sampling) or whether the breakdown is the result of the objective function for that method scoring a wrong fit better than the true one; see the discussion later in the Line Fitting subsection.

Line Fitting

We generated four kinds of data (step, three-step, roof, and six-line), each with a total of 500 data points. The signals were corrupted by Gaussian noise with zero mean and standard deviation σ. Among the 500 data points, α data points were randomly distributed in the range of (0, 100). The i'th structure has n_i data points.

(a) Step: x:(0-55), y=30, n1=65; x:(55-100), y=40, n2=30; α=405; σ=1.5.
(b) Three-step: x:(0-30), y=20, n1=45; x:(30-55), y=40, n2=30; x:(55-80), y=60, n3=30; x:(80-100), y=80, n4=30; α=365; σ=1.
(c) Roof: x:(0-55), y=x+30, n1=35; x:(55-100), y=140-x, n2=30; α=435; σ=1.
(d) Six-line: x:(0-25), y=3x, n1=30; x:(25-50), y=150-3x, n2=20; x:(25-50), y=3x-75, n3=20; x:(50-75), y=3x-150, n4=20; x:(50-75), y=225-3x, n5=20; x:(75-100), y=300-3x, n6=20; α=370; σ= .

Figure 4.2: Comparing the performance of the six methods: (a) fitting a step with a total of 87% outliers; (b) fitting three steps with a total of 91% outliers; (c) fitting a roof with a total of 93% outliers; (d) fitting six lines with a total of 94% outliers.
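To make the procedure of section 4.3 concrete on data like the signals above, the following is a minimal sketch of MDPE line fitting (p = 2): for each sampled pair of points, the signed residuals of all data points are computed, the mean shift is run from zero in residual space, and the density power of equation (4.7) scores the candidate. This is illustrative Python under the chapter's assumptions, not the thesis's MATLAB implementation; the weighted least squares refinement step is omitted and all names are our own.

```python
import math
import random

def _density_power(residuals, h, alpha=1.0):
    """Equation (4.7): mean shift from zero in signed-residual space,
    then the summed Epanechnikov densities (equation (4.1), d = 1)
    inside the converged window, discounted by the distance |Xc| of
    the window center from zero."""
    xc = 0.0
    for _ in range(100):                       # mean shift iterations
        window = [r for r in residuals if abs(r - xc) <= h]
        if not window:
            break
        new_xc = sum(window) / len(window)
        if abs(new_xc - xc) < 1e-6:
            break
        xc = new_xc
    n = len(residuals)
    dens = 0.0
    for r in residuals:                        # densities inside W_c
        if abs(r - xc) <= h:
            for q in residuals:                # 1-D Epanechnikov kernel
                u = (r - q) / h
                if abs(u) < 1.0:
                    dens += 0.75 * (1.0 - u * u) / (n * h)
    return dens * math.exp(-alpha * abs(xc))

def mdpe_line(points, h=1.0, m=100, rng=random):
    """Fit y = a*x + b by maximizing the density power over m
    randomly sampled p-subsets (p = 2 for a line)."""
    best, best_dp = None, -1.0
    for _ in range(m):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # step (2)
        if x1 == x2:
            continue                                 # degenerate pair
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = [y - (a * x + b) for x, y in points]
        dp = _density_power(residuals, h)            # steps (3)-(5)
        if dp > best_dp:
            best, best_dp = (a, b), dp
    return best
```

A candidate line through two inliers concentrates many residuals at zero, giving both a high density sum and a window center near zero; a line through a clustered-outlier structure concentrates fewer points, so it scores a lower density power even though its residuals are equally small, which is why the relative majority wins.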

From Figure 4.2, we can see that because LMedS has only a 0.5 breakdown point, it cannot resist more than 50% outliers; thus, LMedS failed to fit all four signals. The ALKS, RESC and MDPE approaches all have higher robustness to outliers than LMedS. However, the results show that ALKS is not applicable to signals with such large percentages of outliers, because it failed in all four cases. RESC, although having a very high robustness, fitted one model but failed on three. The Hough Transform could not correctly fit the step signals, which happen to fall near an inclined line, with large percentages of outliers. Provided with the correct error bound of the inliers, RANSAC correctly fitted three signals but failed on one. Only the MDPE method correctly fitted all four signals: MDPE did not break down even with 94% outliers. In Figure 4.2 (d), we can see that, although MDPE, the Hough Transform, and RANSAC did not break down, they found different lines in the six-line signal (each according to its own criterion).

Among these six methods, MDPE, RESC and RANSAC are similar to each other: they all randomly choose p-subsets and estimate the parameters from the p-subset corresponding to the maximum value of their objective function. Thus, their objective functions are the core that determines how much robustness to outliers these methods have. RANSAC considers only the number of data points falling within a given error bound of the inliers. RESC considers the number of data points within the mode and the residual distributions of these points. MDPE considers not only the density distribution of the mode (which is assumed to have a Gaussian-like distribution) in the residual space, but also the size of the residual corresponding to the center of the mode.

It is important to point out that the failure of RESC, ALKS, LMedS, RANSAC, and the Hough Transform on some or all of the four signals is inherent, and not simply an artifact of our implementation. Let us check the criterion of RESC, and we will understand why RESC failed to fit three of the signals.
The objective function of RESC for the correct fit is 7.0 (for the one-step signal), 5.8 (for the three-step signal) and 4.4 (for the six-line signal). However, the objective function of RESC for the estimated parameters is 7.6 for the step, 8.1 for the three steps, and 5.3 for the six-line signal. In fact, during the search procedure, the RESC estimator consistently maximizes its objective function, the histogram power: starting with initial fits that have a smaller histogram power, it successively finds fits with higher histogram power, proceeding to a histogram power even higher than that possessed by the true fit. The failures of RANSAC, LMedS and ALKS have a similar nature: for

example, the median of residuals of the true fit is 16.8, 29.2 and 97.0 for the step, three steps and six lines respectively. However, the median of residuals of the final result of the LMedS method is 16.3 (for the step), 15.5 (for the three steps) and 23.4 (for the six lines). The problem is not with the implementation but with the criterion.

Circle Fitting

The proposed MDPE is a general method that can easily be applied to fit other kinds of models, such as circles, ellipses, planes, etc. Figure 4.3 shows the ability of MDPE to fit circles under 95% outliers. Five circles were generated, each with 101 data points and noise standard deviation σ; 1500 random outliers were distributed within the range (-75, 75). Thus, each circle has 1904 outliers (404 pseudo-outliers plus 1500 random outliers). The MDPE method gave more accurate results than LMedS, RESC, and ALKS. The Hough Transform and RANSAC also correctly fitted the circles when provided with the correct bin size (for the Hough Transform) and error bound of the inliers (for RANSAC). The three methods (MDPE, the Hough Transform, and RANSAC) fitted three different circles, each according to its own criterion.

Figure 4.3: One example of fitting circles by the six methods. The data had about 95% outliers.

Time Complexity

It is interesting to compare the time complexity of the different methods. In this experiment, we compare the speed of MDPE, RESC, ALKS, LMedS, and RANSAC. We do not consider the Hough Transform, because the speed of the Hough Transform depends on the dimension of the parameter space, the range of each parameter, and the bin size; it also uses a different framework (voting in parameter space) compared with the other five methods (which use sampling techniques).

In order to make the speeds of the methods comparable, the same simple random sampling technique was used for all five methods. Although other sampling techniques exist, such as guided sampling (Tordoff and Murray 2002) and GA sampling (Roth and Levine 1991; Yu, Bui et al. 1994), and the speed of each method could be improved by adopting them, we adopted the simple random sampling technique because: (1) it has been widely used in most robust estimators (such as LMedS, LTS, RANSAC, ALKS, MUSE, MINPRAN, etc.); and (2) it is easy to perform.

Table 4.1: The comparison of time complexity for the five methods (all times in seconds). For each of the five signals (a step, 87% outliers; three steps, 91%; a roof, 93%; six lines, 94%; five circles, 95%), the table lists the number of samples m and the running times of MDPE, RESC, ALKS, LMedS, and RANSAC.

We used the signals above (a step, three steps, a roof, six lines, and five circles) to test the speed of the five methods. We repeated the experiment on each signal 10 times, and the mean time of each method for each signal was recorded. We performed them all in pure MATLAB code (programming in C with optimization would make the methods faster). From Table 4.1, we can see that LMedS and RANSAC have similar speeds, and they are faster than MDPE, RESC, and ALKS. MDPE is about 35% faster than RESC. The speed of MDPE is close to that of ALKS in line fitting, but faster than ALKS in the five-circle fitting. ALKS is also faster than RESC in line fitting, but slower than RESC in circle fitting.

We note that the time complexity of ALKS, compared with MDPE and RESC, is higher for the five-circle signal (2005 data points) than for the line signals (505 data points). This is because the ALKS procedure uses m p-subsets for each value of k (as recommended by Lee and Meer, the number of different values of k is 19). Thus, when the number of data points and sampling times is increased, the increase in the time complexity of ALKS, which mainly sorts the residuals of the data points, is higher than that of RESC in compressing the histogram, or that of MDPE in calculating the density power.

Experiment 2

In the previous experiment, we investigated the characteristics of the six methods in fitting data with multiple structures. In this experiment, we explore the abilities of the six methods to fit data with clustered outliers. We generated a line (y=x-1) corrupted by Gaussian noise with zero mean and standard deviation σ1; the line had γ data points. Among the total 500 data points, α data points were randomly distributed in the range of (0, 100.0), and β clustered outliers were added to the signals, possessing a spherical bivariate normal distribution with standard deviation σ2 and mean (80.0, 30.0).

(a) γ=100, σ1=1.0; α=200; β=200; σ2=5.0.
(b) γ=100, σ1=1.0; α=200; β=200; σ2=2.0.
(c) γ=275, σ1=1.0; α=0; β=225; σ2=1.0.
(d) γ=275, σ1=5.0; α=0; β=225; σ2=1.0.

Figure 4.4: Experiment in which the six methods fit a line with clustered outliers.

The standard deviations of both the clustered outliers and the inliers affect the results of the six methods. Figure 4.4 shows that both the standard deviation σ2 of the clustered outliers and the standard deviation σ1 of the inliers to the line decide the accuracy of the results estimated by the six methods. When σ1 is small and σ2 is large, all methods except LMedS can correctly fit the line although a large number of clustered outliers exist in the data (see Figure 4.4 (a)); LMedS failed because it cannot tolerate more than 50% outliers. When the standard deviation of the clustered outliers is small, i.e., the outliers are densely clustered within a small range, the ability of MDPE, RESC, ALKS, and RANSAC to resist the influence of the clustered outliers is greatly reduced (see Figure 4.4 (b)). As shown in Figure 4.4 (c) and Figure 4.4 (d), the standard deviation of the inliers to the line also affects

the accuracy of the results of LMedS, MDPE, RESC, ALKS, and RANSAC. When σ1 was 5.0 (Figure 4.4 (d)), all five methods failed to fit the line even with only 45% clustered outliers.

The Hough Transform, to our surprise, showed excellent performance in resisting clustered outliers: it succeeded in fitting all four signals despite the clustered outliers. We note that the Hough Transform adopts a different framework to the other five methods: it uses a voting technique in parameter space instead of residual space. It would seem that the objective functions of all the other methods fail to score the correct solutions highly (for MDPE, RESC, and RANSAC) or lowly (for LMedS and ALKS) enough when there are large numbers of very highly clustered outliers. This has been noted before with LMedS (Wang and Suter 2003a; also see chapter 3), and it is presumably one reason why the proofs of high breakdown point specifically stipulate rather generally distributed outliers.

Experiment 3

It is important to know the characteristics of the various methods when the signals are contaminated by different percentages of outliers. In this experiment, we draw the breakdown plot and compare the abilities of the six methods to resist different percentages of outliers (in order to avoid crowding, each sub-figure in Figure 4.5 includes three methods). We generated step signals (y=Ax+B) as follows: line 1: x:(0-55), A=0, B=30, n1 is decreased as the number of uniformly distributed outliers α increases; line 2: x:(55-100), A=0, B=60, n2=25; for both lines, σ=1; 500 points in total. 15 clustered outliers centred at (80, 10) with unit variance were added to the signals. At the beginning, n1=460 and α=0, so the signal had an initial 8% outliers; then in every repetition of the experiment 5 points were moved from n1 to the uniform outliers (α), ranging over (0-100), until n1=25. Thus the percentage of outliers in the data changed from 8% to 95%. The whole procedure was repeated 20 times.

As Figure 4.5 illustrates, LMedS broke down first (at about 50% outliers) among the six estimators. ALKS broke down even when the outliers comprised less than 80%; RESC began to break down when the outliers comprised more than 88% of the total data.

Figure 4.5: Breakdown plot for the six methods: (a1) and (a2) error in A vs. outlier percentage; (b1) and (b2) error in B vs. outlier percentage.

From Figure 4.5, we can also see that, provided with the correct error bound (for RANSAC) and a good bin size (for the Hough Transform), RANSAC and the Hough Transform can tolerate more than 50% outliers. RANSAC began to break down at 92% outliers; the Hough Transform began to break down when the outliers comprised more than 88% (it broke down at 89% or more outliers). However, the performance of RANSAC is largely dependent on the correct choice of the error tolerance. If the error tolerance deviates from the correct value, RANSAC completely breaks down (see the experiment in

the subsection on error tolerance below). Similarly, the good performance of the Hough Transform is largely dependent on the choice of the accumulator bin size. If the bin size is wrongly given, the Hough Transform will also break down (this phenomenon was also pointed out by Chen and Meer (Chen and Meer 2002)). In contrast, MDPE has the highest robustness among the six methods: it began to break down only at 94% outliers. Even at 94% and 95% outliers, MDPE still had, loosely speaking, about a 75% correct estimation rate over the 20 runs.

Another thing we noticed is that ALKS shows some obvious fluctuations in its results when the outliers are less than 30%, while the other five methods do not have this undesirable characteristic. This may be because the robust estimate of the noise variance is not valid for small or large values of k (k being the optimum value to be determined from the data).

Among these six methods, MDPE and RANSAC have similar accuracy, and they are more accurate than RESC, ALKS, and LMedS. The accuracy of the Hough Transform greatly depends on the accumulator bin size in each parameter space. Generally speaking, the larger the bin size, the lower the accuracy of the Hough Transform. Thus, in order to obtain higher accuracy, one needs to reduce the bin size; however, this leads to an increase in storage requirements and computational complexity. Also, the bin size can be too small (theoretically, each bin then receives fewer votes, and in the limit of a very small bin size, no bin will have more than one vote!).

Experiment 4

The problem of the choice of window radius in the mean shift, i.e., bandwidth selection, has been widely investigated during the past decades (Silverman 1986; Wand and Jones 1995; Comaniciu, Ramesh et al. 2001; Comaniciu and Meer 2002a). Comaniciu and Meer (2002a) suggested several techniques for the choice of window radius:

1. The optimal bandwidth should be the one that minimizes the AMISE.

2. The choice of the bandwidth can be taken as the center of the largest operating range over which the same results are obtained for the same data.
3. The best bandwidth maximizes a function that expresses the quality of the results.
4. The user provides top-down information to control the kernel bandwidth.

Next, we investigate the influence of the choice of window radius on the results of MDPE.

The Influence of the Window Radius and the Percentage of Outliers on MDPE

Figure 4.6: The influence of the window radius and the percentage of outliers on the results of MDPE.

Although MDPE has shown a powerful ability to tolerate large percentages of outliers (including pseudo-outliers), its success depends on the correct choice of the window radius h. If h is chosen too small, the densities of the data points in the residual space may not be correctly estimated (the density function is a noisy function with many local peaks and valleys), and some inliers may be neglected; on the other hand, if h is set too large, the window will include all the data points, both inliers and outliers, and all peaks and valleys of the density function will be smoothed out. In order to investigate the influence of the choice of window radius h and the percentage of outliers on the estimated results, we generated a step signal y=Ax+B, where A=0, B=30 for x:(0-55),

n1=100; and A=0, B=70 for x:(55-100), n2=80. The step was corrupted by Gaussian noise with unit variance. In total, 500 data points were generated. Uniformly distributed outliers in the range (0-100) were added to the signal so that the data respectively included 50%, 60%, 70% and 80% outliers (including both uniformly distributed outliers and pseudo-outliers). To investigate the effect of the window size in MDPE, the window radius h was set from 1 to 20, increasing in steps of 1. The results were repeated 20 times.

Figure 4.6 shows that the absolute errors in A and B increase with the window radius h (when h is larger than some range), because when the radius becomes larger, it is possible that more outliers are included within the converged window. The percentage of outliers influences the sensitivity of the results to the choice of window radius: when the data include a higher percentage of outliers, the results are relatively more sensitive to the choice of window radius; in contrast, when there is a lower percentage of outliers in the data, the results are relatively less sensitive to this choice.

The Influence of the Choice of Error Tolerance on RANSAC

We notice that RANSAC has an important parameter, the error tolerance (i.e., the error bound of the inliers), the correct choice of which is crucial for the method's success in model fitting. The purpose of the error tolerance in RANSAC has some similarity to that of the window radius h in MDPE: both restrict immediate consideration to the data within some range; MDPE uses the densities of the data within the converged window, while RANSAC uses the number of data points within the error tolerance. It is therefore interesting to investigate the sensitivity of the results to the choice of the error bound in RANSAC. We used the same signal as in Figure 4.6, and the results were repeated 20 times. As Figure 4.7 shows, RANSAC has little robustness to the choice of a different error bound: when the error bound deviated from the true value (which is assumed to be a priori knowledge), RANSAC totally broke down.
Moreover, the result of RANSAC s very senstve the choce of error bound, regardless of the percentages of outlers that are ncluded n the data: even when the data ncluded 50% outlers, RANSAC stll broke down when the wrong error bound was provded. Ths s dfferent to the behavour of MDPE. As 70

shown in Figure 4.6, when the data include 50% outliers, the results of MDPE remain robust over a large range of h (from 1 to 15).

Figure 4.7: The influence of the choice of error bound on the results of RANSAC.

The Relationship between the Noise Level of the Signal and the Choice of Window Radius for MDPE

Next, we will investigate the relationship between the noise level of the inliers and the choice of window radius. We use the step signal with 70% outliers from Figure 4.6, but we vary the standard deviation of the step signal from 1 to 4, in increments of 1. Figure 4.8 shows that the results are similar when the noise level of the step signal is set from 1 to 3. However, when the standard deviation of the signal is increased to 4, the tolerance range for the choice of window radius is noticeably reduced, and the fluctuation in the estimated parameters is larger at the higher noise level. In fact, we have noticed that, not surprisingly, when the noise level is too large, the accuracy of all the methods used for comparison is low. The breakdown point of these methods decreases as the noise level of the signal increases.
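The step-signal data used in these sensitivity experiments can be reproduced with a short script. This is a sketch under our own naming (the thesis gives no code); the inlier counts and ranges follow the description above.

```python
import numpy as np

def make_step_signal(noise_sigma=1.0, n_total=500, seed=0):
    """Sketch of the synthetic step data described above: y = 30 on
    x in (0, 55) with 100 inliers, y = 70 on x in (55, 100) with 80
    points (pseudo-outliers with respect to the first step), both
    corrupted by Gaussian noise, plus uniform gross outliers filling
    the remainder of the n_total points."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(0, 55, 100)
    y1 = 30 + rng.normal(0, noise_sigma, 100)
    x2 = rng.uniform(55, 100, 80)
    y2 = 70 + rng.normal(0, noise_sigma, 80)
    n_out = n_total - 180                 # uniformly distributed gross outliers
    xo = rng.uniform(0, 100, n_out)
    yo = rng.uniform(0, 100, n_out)
    return (np.concatenate([x1, x2, xo]),
            np.concatenate([y1, y2, yo]))
```

Raising `n_total` while keeping the 180 structured points fixed raises the outlier percentage, which is one way to emulate the 50% to 80% settings used above.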

Figure 4.8: The relationship between the noise level of the signal and the choice of window radius in MDPE (noise levels σ = 1 to 4).

Experiments on Real Images

Figure 4.9: Fitting a line. (a) A real pavement; (b) the edge image obtained by using the Canny operator; (c) the results of line fitting obtained by the six methods.

In this experiment, we provide two real images to show the ability of MDPE to tolerate a large percentage of outliers. The first example fits a line in the pavement shown in Figure 4.9. The edge image was obtained by using the Canny operator with threshold 0.15 and included 2213 data points (shown in Figure 4.9 (b)). There were about 85% outliers in the data (most of them pseudo-outliers, i.e. structured points belonging to the other lines). Six methods (MDPE, RESC, ALKS, LMedS, RANSAC, and the Hough Transform) were applied to fit a line in the pavement. As shown in Figure 4.9 (c), ALKS and LMedS failed to correctly fit a line in the pavement, while the other four methods correctly found a line.

Figure 4.10: Fitting a circle edge. (a) Twelve cups; (b) the edge image obtained by using the Canny operator; (c) the results of circle fitting obtained by the six methods.
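All of the sampling-based estimators compared here fit a circle by solving for the circle through a random 3-point subset. A minimal version of that p-subset fit (our own helper, not code from the thesis) can be sketched as:

```python
import numpy as np

def circle_from_3_points(p1, p2, p3):
    """Solve for the circle through three points, i.e. the p-subset
    fit used (conceptually) when sampling-based estimators search
    for a cup edge.  Returns (center_x, center_y, radius), or None
    for (near-)collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Linear system obtained by equating squared distances to the centre
    a = np.array([[2 * (x2 - x1), 2 * (y2 - y1)],
                  [2 * (x3 - x1), 2 * (y3 - y1)]], dtype=float)
    b = np.array([x2**2 - x1**2 + y2**2 - y1**2,
                  x3**2 - x1**2 + y3**2 - y1**2], dtype=float)
    if abs(np.linalg.det(a)) < 1e-12:
        return None
    cx, cy = np.linalg.solve(a, b)
    return cx, cy, float(np.hypot(x1 - cx, y1 - cy))
```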

The second example fits the circular edge of one cup out of twelve cups. Among the 1959 data points in total, the inliers corresponding to each cup were fewer than 10% of the data. This is another multiple-solution case: the fitted circle can correspond to any of the twelve cups. As shown in Figure 4.10, MDPE, RANSAC, and the Hough Transform all correctly found a cup edge (the result of RANSAC was somewhat lower in accuracy than that of MDPE), but each method found a different circle. (Note: as these are not synthetic data, we do not have the correct error bound for RANSAC or bin size for the Hough Transform; we chose these empirically so that the performance was optimized.) However, the other three methods (RESC, ALKS, and LMedS), which are closer to MDPE in spirit, all failed to fit the circular edge of a cup.

4.5 Conclusion

In this chapter, we introduced a new and highly robust estimator, MDPE. MDPE is similar to many random sampling estimators: we randomly choose several p-subsets and calculate the residuals of the fit determined by each p-subset. However, the crux of the method is that we apply the mean shift procedure to find the local maximum density of these residuals, and we then evaluate a density power measure involving this maximum density. The final estimated parameters are those determined by the p-subset with the maximum density power over all of the evaluated p-subsets. Our method, and hence our definition of maximum density power, is based on the assumption that when a model is correctly fitted, its inliers should have a high probability density in residual space, and the residual at the maximum probability density of the inliers should have a low absolute value. This captures the dual notions that the data points with low residuals should be as many as possible, and that those residuals should be as small as possible. In that sense, our method combines the essence of two popular estimators: Least Median of Squares (low residuals) and RANSAC (maximum number of inliers). However, unlike RANSAC, MDPE scores the results by the densities of the data points falling into the converged window and by the size of the residual of the point corresponding to the local maximum density. Contrast this also with the Least Median of Squares, which uses a single statistic (the median).
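The shared structure just described can be made explicit. The sketch below (our own simplification, for line fitting) is the random-sampling skeleton common to LMedS, RANSAC and MDPE; only the `score` function differs, e.g. `lambda r: -np.median(r**2)` for LMedS-style scoring, `lambda r: np.sum(np.abs(r) < t)` for RANSAC-style inlier counting, and MDPE substituting its density-power score.

```python
import numpy as np

def sampling_fit_line(x, y, score, n_samples=500, seed=0):
    """Random-sampling skeleton: fit each random p-subset (p = 2 for
    a line), compute the residuals of all data points, and keep the
    candidate whose residuals maximise the supplied score.  A sketch;
    names are ours, not the thesis's."""
    rng = np.random.default_rng(seed)
    best, best_s = None, -np.inf
    for _ in range(n_samples):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue                       # degenerate p-subset, resample
        a = (y[j] - y[i]) / (x[j] - x[i])  # line through the 2-point sample
        b = y[i] - a * x[i]
        s = score(y - (a * x + b))         # score the residual distribution
        if s > best_s:
            best_s, best = s, (a, b)
    return best
```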

The result of our innovation is a highly robust estimator. MDPE can tolerate more than 85% outliers, and has regularly been observed to function well with even more than 90% outliers. We also compared our method with several traditional methods (RANSAC, the Hough Transform and LMedS) and recently proposed methods (RESC and ALKS). From our experimental analysis, it is hard to say that any one method has a clear advantage. LMedS and RANSAC are the fastest of the six methods. However, the apparent breakdown point of LMedS is lower, and RANSAC needs a priori knowledge of the error bound. The results of RANSAC are very sensitive to the choice of the error bound, even when the percentage of outliers is low. The Hough Transform shows excellent performance when the data include clustered outliers; however, its space requirement and time complexity are high when the dimension of the parameter space is high and high accuracy is required. Among the recently proposed estimators (MDPE, RESC, and ALKS), MDPE has the highest robustness to outliers. ALKS shows less robustness and some instability when the percentage of outliers is small; however, it is completely data driven. Although RESC needs the user to adjust some parameters, it is also a highly robust estimator. So each method has its advantages and disadvantages.

When the percentage of outliers is very large, or there are many structures in the data (pseudo-outliers), one problem shared by all methods that use random sampling techniques is that the number of p-subsets to be sampled, m, becomes huge. Fortunately, several other sampling techniques, such as guided sampling (Tordoff and Murray 2002) and GA sampling (Roth and Levine 1991; Yu, Bui et al. 1994), have appeared in recent years. Investigation of sampling techniques is beyond the scope of this thesis but should be addressed in future work.

During the latter part of the construction of MDPE, the authors became aware of (Chen and Meer 2002). This work has some ideas similar to ours, in that both methods employ a kernel density estimation technique. However, their work places emphasis on the projection pursuit paradigm and on data fusion. Moreover, they use an M-estimator paradigm (see section 2 of that paper). Though there are nice theoretical links between the M-estimator versions of robust estimators and kernel density estimation, as noted in that paper, the

crucial fact remains that LMedS- and RANSAC-type methods have a higher breakdown point (especially in higher dimensions). Thus, though their work employs kernel density estimation, which is also a key to our own approach, the differences are significant:

(1) The spaces considered are different. In their method, the mode of the density estimate is considered in the projection space along the direction of the parameter vector; MDPE considers the density distribution of the mode in the residual space.

(2) The implication of the mode is different. They seek the mode that corresponds to the maximum density in the projection space, which maximizes the projection index. MDPE considers not only the density distribution of the mode, which is assumed to have a Gaussian-like distribution in the residual space, but also the size of the residual corresponding to the center of the mode.

(3) They use a variable-bandwidth technique in which the bandwidth is proportional to the MAD scale estimate. However, as Chen and Meer note, MAD may be unreliable when the distribution is multi-modal, which may cause problems with the bandwidth estimation. We use a fixed-bandwidth technique to estimate the density distribution; the relationship between the choice of the bandwidth and the results of MDPE is investigated in this chapter. Furthermore, we also employ the variable-bandwidth technique in the modified version of MDPE (see chapter 6), in TSSE (chapter 7) and in ASSC (chapter 8).

(4) In their method, the computational complexity increases greatly in higher dimensions because the search space grows with the dimension of the parameter space; a more efficient search strategy is therefore needed for higher dimensions in their method. In our method, as in RESC, ALKS, LMedS, etc., the one-dimensional residual space is analyzed rather than the multi-dimensional parameter space. The time complexity of MDPE (and of RESC, ALKS, LMedS, etc.) is related to the number of random samples, which is affected by both the dimension of the parameter space and the percentage of outliers.

(5) Because their method employs a projection pursuit technique, more supporting data points ((Chen and Meer 2002), p. 249) are needed to yield reliable results. Thus, they randomly choose the data points in one bin from the upper half of the ranking (by the number of points inside each bin), followed by region growing to reach more data points. In MDPE, we randomly choose p-subsets from the whole data each time, calculate the parameters from the p-subset, and then compute the residuals of all data points from the obtained parameters. At this point in time, it is difficult to compare the performance of the two approaches.

We do not prove in a rigorous way that our method has a high breakdown point. However, we must point out that, despite impressions that may be obtained by reading much of the literature, particularly that aimed at the practitioner, the more traditionally accepted techniques have shortcomings in similar ways. For example, though it is often cited that Least Median of Squares has a proven breakdown point of 50%, it is often overlooked that all practical implementations of Least Median of Squares are an approximate form of Least Median of Squares (and thus carry only a weaker guarantee of robustness). Indeed, the robustness of practical versions of Least Median of Squares hinges on the robustness of two components (and in two different ways): the robustness of the median residual as a measure of quality of fit, and the robustness of the random sampling procedure in finding at least one residual distribution whose median is not greatly affected by outliers. Our procedures, like many others, share the second vulnerability, as we also rely on random sampling techniques. The first vulnerability is sometimes disregarded for practical versions of Least Median of Squares, because robustness is viewed as being guaranteed by virtue of the proof of robustness for the ideal Least Median of Squares. However, two comments should be made in this respect. Firstly, that proof relies on assumptions regarding the outlier distribution, and it can easily be shown that clustered outliers will invalidate the proof. Secondly, there is an inherent gap between a proof for

an ideal procedure and what one can say about an approximation to that procedure. We believe that our method of scoring the fits better protects against the vulnerabilities that structure in the outliers exposes. We have presented empirical evidence to support that.

Chapter 5
A Novel Model-Based Algorithm for Range Image Segmentation

5.1 Introduction

Perception of surfaces in images plays a very important role in image understanding and three-dimensional object recognition. Because range images contain three-dimensional geometric information, the difficulty of recognizing three-dimensional objects in range images is greatly reduced. Range images are images that provide 3D distance information, relative to a known reference coordinate system, for the surface points of the objects in a scene. Each pixel in a range image contains 3D geometric information: the value of the pixel corresponds to a depth/range measurement (i.e., in the z direction), and the coordinate of the pixel in 3D can be written as (x, y, z), where (x, y) is the image coordinate of the pixel. Range images are now widely applied in fields such as autonomous navigation and medical diagnosis.
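The (x, y, z) reading of a range pixel described above can be made concrete. The sketch below converts a depth array into a point list; the pinhole-style scale factors `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not the actual calibration of any particular scanner, and a zero value is assumed to mark pixels with no range data.

```python
import numpy as np

def range_image_to_points(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0):
    """Turn a range image into an N-by-3 array of (x, y, z) surface
    points: each pixel (u, v) with range value z yields one 3-D
    point.  The scaling convention here is one common choice, used
    purely for illustration."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]              # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts = np.dstack([x, y, depth]).reshape(-1, 3)
    return pts[pts[:, 2] > 0]              # drop pixels with no range data
```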

Figure 5.1: A simplified 3D recognition system (raw image, segmentation, feature extraction, high-level description of features, matching against object descriptions in a library, object recognition).

Although various 3D object recognition systems have been developed in recent years, these systems share common aims [(Suk and Bhandarkar 1992), p. 10]:

Robustness of the recognition process. The systems should be able to recognize objects despite noise, occlusion, and ambiguity.
Generality of representation. The systems should be able to handle different types of objects.
Flexibility in the control strategy.
Speed and efficiency. Speed is a critical factor: it affects whether or not the system can recognize objects in real time. This factor is especially important in robot systems.

As Figure 5.1 illustrates, segmenting the range image is the first step towards extracting features and recognizing a three-dimensional object. Therefore, whether or not we

can correctly segment the range images is an important factor affecting the recognition of a three-dimensional object.

Image segmentation partitions the image into meaningful, non-overlapping homogeneous regions whose union is the entire image. If we let R be the whole image and R1, R2, ..., Rn be the segmented regions then, according to Gonzalez [(Gonzalez and Woods 1992), p. 458], a segmentation of the image can be described as follows:

1. R1 ∪ R2 ∪ ... ∪ Rn = R.
2. Ri is a connected region for all i = 1, 2, ..., n.
3. Ri ∩ Rj = Φ when i ≠ j.
4. P(Ri) = TRUE for all i = 1, 2, ..., n.
5. P(Ri ∪ Rj) = FALSE when i ≠ j.

where Φ is the null set and P(Ri) is a logical predicate over the points in the set Ri.

There has been general agreement on what results an image segmentation system should achieve [(Haralick and Shapiro 1992), p. 509]:

The segmented regions of an image should be uniform and homogeneous with respect to some characteristic, e.g., the models of the objects in the image.
Region interiors should contain as few holes as possible.
Boundaries of regions should be smooth and accurate.

It is very difficult for one segmentation method to achieve all these properties at the same time. To evaluate the performance of an image segmentation method, Hoover suggested using the following five types of metrics (Hoover, Jean-Baptiste et al. 1996):

1. Correct detection. If more than T percent of the pixels in a segmented region are correctly assigned, the region is correctly detected.
2. Over-segmentation. If more than T percent of the pixels in a segmented region should have been assigned to other regions, the detected region is called over-segmented.
3. Under-segmentation. If a segmented region contains more than one surface (which should have been assigned to different regions), the detected region is called under-segmented.
4. Missed region. If a region is not contained in any region of correct detection, over-segmentation, or under-segmentation in the segmented image, the region is missed.
5. Noise region. If a region in the segmented image does not belong to any region of correct detection, over-segmentation, or under-segmentation, it is classified as noise.

Here T is a threshold that can be specified by the user according to the accuracy requirement of the system.

Many three-dimensional image segmentation methods have been published in the literature. Generally speaking, these segmentation methods can be classified into two major classes:

1. Edge-based segmentation techniques (Ghosal and Mehrotra 1994; Wani and Batchelor 1994).
2. Region-based segmentation techniques, or clustering techniques (Hoffman and Jain 1987; Jiang and Bunke 1994; Fitzgibbon, Eggert et al. 1995).

In edge-based segmentation methods, it is important to correctly extract the discontinuities: surface discontinuities (boundaries and jumps) and orientation discontinuities (creases and roofs), which are then used to guide the subsequent segmentation process. The main difficulties that edge-based segmentation techniques face are:

The effectiveness of these methods is greatly reduced when the range images contain noise.
When the edge operator mask size is increased, the computational time increases greatly.
When the edge pixels detected by the edge operator are not continuous (especially in noisy images), it is difficult to link the discontinuous pixels.

Also, the unreliability of crease edge detectors makes edge-based methods questionable.

Region-based techniques enjoy wider popularity than edge-based techniques. The essence of region growing techniques is that they segment range images based on the similarities of the feature vectors corresponding to the pixels in the range image. Region-based techniques first estimate a feature vector at each pixel, then aggregate the pixels that have similar feature vectors and, at the same time, separate the pixels whose feature vectors are dissimilar, to form segmented regions. However, region-based methods also have some problems:

They have many parameters controlling the region growing process, most of which need to be predetermined.
The choice of the initial regions greatly affects the performance of most region-based methods. When the seeds are placed on a boundary or on a noise-corrupted part of the image, the results break down.
The region boundaries are often distorted because of the noise in the range images.

In clustering-based methods, it is difficult to adaptively estimate the actual number of clusters in the range image.

Another way of classifying a segmentation approach is by whether it uses the notion of being model-driven (top-down). Model-driven methods are appealing because it has been argued that they have similarities to the human cognitive process (Neisser 1967;

Gregory 1970). Model-based methods can directly extract the required primitives from the unprocessed raw range images. Model-based methods, in particular robust model-based approaches, have been attracting more and more attention (Roth and Levine 1990; Yu, Bui et al. 1994; Stewart 1995; Miller and Stewart 1996; Lee, Meer et al. 1998). These methods are very robust to noisy or occluded data.

Next, we review several state-of-the-art segmentation methods in section 5.2. Then we modify MDPE to produce a quicker version, called QMDPE (Quick-MDPE), and evaluate its performance in section 5.3. In section 5.4, we present a model-based method, based on QMDPE, for the segmentation of range images. The performance of the proposed method is compared with that of several other state-of-the-art range image segmentation methods in section 5.5. We conclude in section 5.6.

5.2 A Review of Several State-of-the-Art Methods for Range Image Segmentation

In this section, four methods (UB, UE, USF, WSU) are reviewed. The USF (Hoover, Jean-Baptiste et al. 1996) and UE (Fitzgibbon, Eggert et al. 1995) algorithms segment range images by iteratively growing regions from seed regions, whereas the WSU (Hoffman and Jain 1987; Flynn and Jain 1991) algorithm employs a clustering technique in its segmentation process. The UB (Jiang and Bunke 1994) algorithm is in the region growing framework, but it is based on straight-line segmentation of scan lines. The four methods are reviewed in detail next.

The USF Range Segmentation Algorithm

The USF algorithm (Hoover, Jean-Baptiste et al. 1996) can be described as follows:

1. Compute the normal of each range pixel. First, a growing operation is carried out in an n-by-n window around the pixel of interest. The distances between the pixels (to grow) and their four-connected pixels

must be less than a threshold; otherwise, the pixels are ignored. The normal is found either by an eigen method (Duda and Hart 1973; Goldgof, Huang et al. 1989) or by a method solving a set of nine plane equations.

2. Choose the seed points. The pixel with the smallest interior measure, corresponding to the residual error of the plane equation fitted to the entire n-by-n window, is chosen as a seed point.

3. Grow the region from the seed point. Four criteria must be satisfied for a pixel to join the region:
(a) The point is connected to the region grown so far.
(b) The angle between the normal of the pixel and that of the region grown so far is less than a threshold T1.
(c) The perpendicular distance between the pixel and the plane grown so far is within a threshold T2.
(d) The distance between the pixel and a four-connected neighbour already in the grown region is less than a threshold T3.

The region is grown recursively until no pixel is left to join it. Then a new region is grown from the next available seed point. If a region's final size is less than a threshold T4, the region and its pixels are ignored and are dealt with during the post-processing step.

The WSU Range Segmentation Algorithm

The WSU algorithm was first presented by Hoffman and Jain (Hoffman and Jain 1987). Flynn and Jain improved the WSU algorithm and applied it to 3D object recognition (Flynn and Jain 1991). WSU can be used not only for planar surfaces but also for quadric surfaces. The details of the WSU algorithm can be described as follows:

1. Labelling the jump edge pixels. The distances (dj, j = 1, 2, ..., 8) in z between a range pixel and its eight neighbouring pixels are measured. If dj is greater than a predetermined threshold for all eight points, the pixel is labelled as a jump edge pixel.

2. Estimating the surface normal of each range pixel. The surface normal of each range pixel is estimated using k-by-k neighbouring points (excluding jump edge pixels). A principal component technique (Flynn and Jain 1988) is employed to estimate the surface normal because this technique can accommodate data contaminated with noise.

3. Sampling on a regular grid is performed to yield a data set of fewer than 1000 points. The normal information and position information of each sampled data point are used to form six vectors. Then a clustering algorithm (Jain and Dubes 1988) is employed in the six-dimensional space (corresponding to the six vectors).

4. Assigning the range pixels to clusters. Each range pixel is assigned to the closest cluster center. A connected component algorithm is used to avoid assigning the same label to regions that are not connected.

5. An edge-based merge step is performed. This step merges regions where the average angle between the surface normals of the range pixels on one side of the edge and their neighbours on the other side is less than a predetermined threshold.

6. A principal component procedure is performed to distinguish planar regions from non-planar regions. The non-planar regions are ignored in further processing.

7. Regions that have similar parameters and are adjacent are merged.

8. Unlabelled pixels on each segment are merged into the region.

9. Steps 6, 7 and 8 are repeated until the result no longer changes.

The UB Range Segmentation Algorithm

The UB segmentation method was developed based on the observation that if a line is used to scan the image, the points that belong to a planar surface form a straight line segment. On the other hand, if the points are on the same straight line segment, they will

surely belong to the same planar surface. Unlike other region growing algorithms, which use seeds to grow the regions, the UB segmenter uses straight line segments as the growing primitives. This greatly reduces the dimension of the data to be dealt with in the growing process and makes the algorithm very fast. The UB algorithm can be described as follows (Jiang and Bunke 1994):

1. A median filter is employed as a preprocessing step to reduce the noise level of the image.
2. The image is scanned line by line and the data on each scan line are divided into straight line segments.
3. Using a link-based data structure, the neighbourhood of each line segment is found.
4. The best seed region is selected by the following process. First, a small number of neighbouring line segments (three in (Jiang and Bunke 1994)) are chosen as a candidate seed region. If any line segment is shorter than a predetermined threshold, the candidate region is discarded. The optimal seed region is the one with the smallest error among the errors computed by a least squares plane fit to each candidate.
5. The growing process is performed. In the UB algorithm, the initial guess is a set of line segments. A line segment is added to the region if the perpendicular distances between its two end points and the plane of the region are within a threshold. This process continues until no more line segments can be added to the region.
6. The seed region finding (step 4) and region growing (step 5) are iterated until no further seed region can be found.
7. A post-processing step is applied to yield clean edges between regions.

The UE Range Segmentation Algorithm

The UE algorithm (Fitzgibbon, Eggert et al. 1995) is similar to the USF algorithm; both belong to the class of region growing algorithms. The UE algorithm contains the following main steps:

1. Normal calculation. A 5-by-5 window is used to estimate the normal of each range pixel. Normal and depth discontinuities are detected using a predetermined normal threshold and depth threshold.

2. A discontinuity-preserving smoothing is performed, with multiple passes for greater smoothing.

3. H-K based segmentation for initialization. The mean (H) and Gaussian (K) curvature of each pixel are estimated. Using the combined signs of the pair (H, K), one can judge the surface type of each range pixel. Each pixel and its eight-connected pixels with similar labelling are grouped to form initial regions. Then dilation and erosion are performed to fill small unknown areas and remove small regions.

4. Region growing. A least squares surface fitting procedure is performed on the initial regions obtained above. Then each region is grown in turn. To join a region, a point needs to satisfy the following requirements:
(a) The point is eight-connected to the region grown so far.
(b) The perpendicular distance between the pixel and the plane grown so far is within a threshold.
(c) The angle between the normal of the pixel and that of the region grown so far is less than a threshold.
(d) The point is closer to the current surface than to any other possible surface it may be assigned to.
(e) The normal of the pixel is in better agreement with the current surface than with any other possible surface it may be assigned to.

After expansion, the surface is refitted using the new points. Then a contraction of the region boundary is performed.

5. Region boundary refinement. A pixel is added to a region during expansion if:
(a) The point is eight-connected to the region grown so far.
(b) The point-to-plane distance is less than a threshold.
(c) The point is on the correct side of a decision surface.

Towards a Model-Based Range Image Segmentation Method

Although edge-based and region-based methods are popular in the computer vision community, it is difficult for these methods to directly extract specified primitives. Model-driven (top-down) methods are appealing because they can directly extract the required primitives from the unprocessed raw range images. The features used in model-driven methods are primitives, so matching takes place very early in the recognition process. In model-based methods, primitive geometric features are matched instead of the similarity features used in region-based methods. Matches are then checked for local consistency using geometric constraints, e.g., distance, normals, etc. Because robust statistics have been introduced into some model-based methods, model-based segmentation methods can be very robust to noisy or occluded data. Next, we present an efficient model-based method for range image segmentation.

5.3 A Quick Version of MDPE: QMDPE

As shown in chapter 4, MDPE is highly robust and can tolerate a large percentage of outliers, including gross noise and pseudo-outliers. However, the time needed to calculate the densities f̂(Xi) of all the data points within the converged window Wc is large when the number of data points is very large. It takes O(n) time to calculate the density f̂(Xi) at one point Xi. If there are nw data points within the converged window Wc, the time complexity of computing the probability density power function ψDP is O(n·nw). In

range image processing, nw may be in the tens of thousands to hundreds of thousands. For such a large number of range data points, MDPE is not computationally efficient, and a quicker version of MDPE with a similarly high breakdown point is needed for range image segmentation. In this section, we modify our MDPE to produce that quicker version, called QMDPE.

QMDPE

MDPE measures the probability densities of all the data points within the converged mean shift window. QMDPE, by contrast, uses only the density at the center of the converged window. Like MDPE, QMDPE assumes that the inliers occupy a relative majority of the data points and have a Gaussian-like distribution. Thus, when the model to fit is correctly estimated, the center Xc of the converged window in residual space should be as close to zero as possible, and the probability density f̂(Xc) at Xc should be as high as possible. We therefore define the probability density power function, which uses the probability density of only one point, as follows:

ψDP = (f̂(Xc))^α / exp(|Xc|)        (5.1)

where α is a factor that adjusts the relative influence of the probability density against the residual of the point corresponding to the center of the converged window. It is determined empirically for best performance: we adjusted the value of α by comparing the results on both the synthetic data and the real image data used in section 4.4, and set it to 2.0 for optimal performance (we note that the empirically best value of α in equation (5.1) for QMDPE differs from that in equation (4.7) for MDPE, where α is set to 1.0). Because only the probability density at the point corresponding to the center of the converged window needs to be calculated, the time taken to compute the probability density power in QMDPE is greatly reduced when the number of data points is very large (for example, range image data).
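Equation (5.1) can be read procedurally. The sketch below is a simplified one-dimensional illustration (a flat-kernel mean shift and an Epanechnikov density estimate, with our own constants), not the thesis's exact implementation:

```python
import numpy as np

def qmdpe_density_power(residuals, h=2.0, alpha=2.0, max_iter=100):
    """Sketch of the QMDPE score of Eq. (5.1): run a 1-D mean shift
    on the residuals, started at 0, to find the converged window
    centre Xc, then return f(Xc)**alpha / exp(|Xc|), where f(Xc) is
    a kernel density estimate at the single point Xc."""
    r = np.asarray(residuals, dtype=float)
    xc = 0.0
    for _ in range(max_iter):              # flat-kernel mean shift of radius h
        win = r[np.abs(r - xc) <= h]
        if win.size == 0:
            break
        new_xc = win.mean()
        if abs(new_xc - xc) < 1e-6:
            break
        xc = new_xc
    # Epanechnikov-style density estimate evaluated only at xc
    u = (r - xc) / h
    k = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)
    f_xc = k.sum() / (r.size * h)
    return f_xc ** alpha / np.exp(abs(xc))
```

A correct fit (residuals massed near zero) thus scores higher than a wrong fit, whose residual mode is either sparse or far from zero.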

5.3.2 The Breakdown Plot of QMDPE

Now we compare the tolerance of QMDPE to outliers with that of the other estimators (LMedS, ALKS, RESC, RANSAC, the Hough Transform, and MDPE), using the same data as in the breakdown experiments of chapter 4 (the results for those estimators are shown in Figure 4.5). From Figure 5.2 (the experiments were repeated 20 times and the results averaged), we can see that QMDPE begins to break down when outliers comprise more than 92% of the data. Even beyond that point, QMDPE still behaves reasonably reliably (loosely speaking, about 70% correct). The percentage of outliers at which QMDPE begins to break down is higher than that of the LMedS (51%), ALKS (80%), RESC (89%), and Hough Transform (89%) methods; QMDPE and RANSAC have similar performance. However, RANSAC needs a priori knowledge of the error bound of the inliers, whereas QMDPE needs no such prior knowledge. Although its robustness to outliers is a little lower than that of MDPE, the QMDPE algorithm is faster than MDPE because it saves time in calculating the probability density power for each randomly sampled p-subset.

Figure 5.2: Breakdown plot for the QMDPE method: (a) error in A vs. outlier percentage; (b) error in B vs. outlier percentage.

5.3.3 The Time Complexity of QMDPE

Table 5.1: The time complexity of QMDPE (in seconds) for the test signals: a step, three steps, a roof, six lines, and five circles.

Compared with Table 4.1, Table 5.1 shows the time complexity of QMDPE. We can see that QMDPE is slower than LMedS and RANSAC, but faster than MDPE, RESC, and ALKS. QMDPE is about 20% faster than MDPE and almost 100% faster than RESC in line fitting. Of course, the time complexity of these methods may change, to some extent, with the type of signal (for example, RESC is slower than ALKS on the four-line signal but faster than ALKS on the five-circle signal; QMDPE is much faster than MDPE in our experiments with range image data). It is not practical to compare the time complexity of all methods for all types of signals; the above should give the reader a rough idea of the time complexity of each method.

The Influence of the Window Radius on the Results of QMDPE

Figure 5.3: The influence of the window radius on the results of the QMDPE.

Now we investigate the influence of the window radius on the results of QMDPE. From Figure 5.3, we can see that, although the percentage of outliers also affects the choice of window radius (the results are relatively more sensitive to the choice of window radius when the outliers are more numerous), the results of QMDPE show less

sensitivity to the choice of the window radius h than the results of MDPE (see Figure 4.6). The reason is that the window radius h plays two roles in MDPE: first, h is related to the density estimation; second, the density power in MDPE counts the densities of all points within the converged window (where h is the radius of the window). In QMDPE, however, because we use only one point to estimate the density power, h is used only for the density estimation. 5.4 Applying QMDPE to Range Image Segmentation A good estimator is generally only one component of a complete scheme for successfully tackling meaningful computer vision tasks. Segmentation requires more than a simple-minded application of an estimator, no matter how good that estimator is. Several difficulties faced when applying a statistical estimator to this task must be considered in designing a method for segmentation. 5.4.1 From Estimator to Segmenter To test the utility of QMDPE, we apply it to range image segmentation. The range images were generated using an ABW structured light scanner, and all ABW range images have 512x512 pixels. These ABW range images can be obtained from the USF database. However, segmentation is a (surprisingly) complex task and an estimator cannot simply be applied directly without considering the following factors: 1. The computational cost. QMDPE is an improved (in speed) MDPE; its computational cost is much less than that of MDPE. Even so, for a range image with a large number of data points (262,144 data points in our case), employing a hierarchical structure in our algorithm greatly improves the computational speed. 2. Handling of intersections of surfaces.

When two surfaces intersect, points around the intersection line may be assigned to either surface (see Figure 5.5). In fact, the intersection line lies on both surfaces and its data points are inliers to both. Additional information (such as the normal to the surface at each pixel) should be used to handle data near the intersection line. 3. Handling of virtual intersections. It is popular in model-based methods to directly estimate the parameters of a primitive and to classify data points as belonging to the primitive according to the estimated parameters. The data points on the surface are then masked out and not processed in later steps. However, sometimes two surfaces do not actually intersect, but the extension of one surface is intersected by the other surface. In this case, the connected component algorithm (Lumia, Shapiro et al. 1983) should be employed. 4. Removal of isolated outliers. When all surfaces have been estimated, some isolated outliers, due to the noise introduced by the range camera, may remain. At this stage, a post-processing procedure should be performed to eliminate the isolated outliers. The originators of other novel estimators (e.g. ALKS, RESC, MUSE, MINPRAN) have also applied their estimators to range image segmentation, but they have not generally tackled all of the above issues. Hence, even those interested in applying ALKS/RESC, or any other estimator, to range image segmentation may find several of the components of our complete implementation independently useful. 5.4.2 A New and Efficient Model-Based Algorithm for Range Image Segmentation Shadow pixels may occur in an ABW range image. These points cannot give range information and thus will not be processed. There are four levels in the hierarchy used in our algorithm. The bottom level of the hierarchy contains 64x64 pixels, obtained by regular sampling of the original image. The top level (i=4) of the hierarchy is the original image (512x512). Level i=2

and level i=3 of the hierarchy have 128x128 and 256x256 pixels, respectively. We begin at the bottom of the hierarchy (i=1), i.e., the 64x64 regularly sampled range image. Figure 5.4: The structure of the proposed range image segmentation algorithm (begin; set level i=1; mark all invalid points; calculate the normal of each range pixel and identify the jump edge pixels; process the i'th level of the hierarchy, incrementing i, until the top of the hierarchy is reached; eliminate the isolated outliers; end). For the unlabelled points in the top hierarchy, we use the connected component algorithm to obtain the maximum connected component (leaving out the invalid points). The data for the current hierarchical level can then be obtained by regularly sampling the maximum connected component. Connected points whose number falls below a threshold are marked as noise. The structure of the proposed algorithm is shown in Figure 5.4. The detailed steps of the proposed method are as follows: 1. Mark all invalid points. For example, shadow pixels may occur in a structured light scanner (e.g. ABW) image; these points will not be processed in the following steps. 2. Calculate the normal of each range pixel and identify the jump edge pixels.

Although the QMDPE algorithm was designed to fit data despite noise and multiple structures, it requires that the data points of the model occupy a relative majority of the whole data. This is satisfied in many range images (so the presented algorithm can deal with the whole image as the raw input). However, for some very complicated range images (those with many objects and surfaces), this requirement is not always satisfied. Using the information provided by the jump edges helps to coarsely segment the range image into small regions (each of which may include several planes). 3. Employ a hierarchical sampling technique. The proposed algorithm employs a hierarchical structure based on the fact that when an image is regularly sampled, the main details remain while some minor details may be lost. 4. Apply QMDPE to obtain the parameters of the estimated primitive. For the current level of the hierarchy, we use the whole sampled image as the data to process. We apply the QMDPE algorithm to those data, which yields the plane parameters. The inliers corresponding to the estimated plane parameters are then identified by employing an auxiliary scale estimator. (For historical reasons, we employed a revised median scale estimator: s = α(1 + 5/(n−3))·(med_i r_i²)^(1/2) + β, where α and β were experimentally set to 1.8 and 0.2 for optimum performance. However, an improved robust scale estimator, TSSE (see chapter 7), can be used to estimate the scale of the inliers.) At this stage, it is difficult to tell to which of two intersecting planes the data that are on or near the intersection line belong. Note: this case is not considered in the popular range image segmentation methods employing robust estimators such as RESC, MUSE and ALKS. We handle this case in the next step. 5. Use the normal information. When the angle between the normal of a data point that has been classified as an inlier, and the estimated plane normal, is less than a threshold value (T-angle, 40 degrees in our case), the data point is accepted for step 6. Otherwise, the data point is

rejected and classified as a left-over point for further processing. As shown in Figure 5.5, when we did not consider the normal information, the range image was over-segmented because of the intersection of two planes (pointed out by the arrow in Figure 5.5 (b) and (c)). In comparison, we obtained the right result when we considered the normal information (see Figure 5.5 (d) and (e)). 6. Use the connected component algorithm to extract the maximum connected components and label them. The remaining unlabelled inliers will be used in the next loop for further processing. 7. Select the connected component for processing in the next loop. For all unlabelled data points, we use the jump edge information and connected component analysis to extract the component with the maximum number of connected data points for the next loop. When the number of data points belonging to the maximum connected component is larger than a threshold (T-cc), we repeat steps 4-6. Otherwise, we stop at this level of the hierarchy and go to the next higher level, until the top of the hierarchy (512x512) is reached. 8. Finally, we eliminate the isolated outliers and assign them to the majority label of their eight-connected neighbours. Compared with the currently popular methods (region-based and edge-based methods), our proposed method is a model-based, top-down technique. Our method directly extracts the required primitives from the raw images, and it deals with the whole image as the raw input. Our method is very robust to noisy or occluded data due to the adoption of the robust estimator QMDPE. Because we adopt a hierarchical technique, the proposed method is computationally efficient and can deal with large range images with only a small extra computational cost.
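Steps 4 and 5 above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the revised median scale estimator as given in step 4 (with p = 3 parameters for a plane), and a conventional 2.5s inlier cutoff, which is our assumption rather than a value stated in the text.

```python
import numpy as np

def revised_median_scale(residuals, p=3, alpha=1.8, beta=0.2):
    # Step 4: revised median scale estimator
    # s = alpha * (1 + 5/(n - p)) * sqrt(med_i r_i^2) + beta,
    # with alpha = 1.8 and beta = 0.2 as experimentally set in the text.
    r = np.asarray(residuals, dtype=float)
    return alpha * (1.0 + 5.0 / (len(r) - p)) * np.sqrt(np.median(r ** 2)) + beta

def inlier_mask(residuals, scale, k=2.5):
    # Step 4: points whose residual to the estimated plane lies within
    # k scale units are taken as inliers (k = 2.5 is an assumed cutoff).
    return np.abs(np.asarray(residuals, dtype=float)) < k * scale

def accept_by_normal(point_normal, plane_normal, t_angle=40.0):
    # Step 5: accept an inlier only if the angle between its surface
    # normal and the estimated plane normal is below T-angle (40 degrees).
    a = np.asarray(point_normal, dtype=float)
    b = np.asarray(plane_normal, dtype=float)
    c = abs(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0))) < t_angle
```

Points rejected by `accept_by_normal` would be kept as left-over points, as in step 5.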

Figure 5.5: A comparison of using and not using the normal information. (a) Range image (ABW test.10 from the USF database); (b) the segmentation result without using the normal information; (c) the points near or on the intersection of two planes may be assigned to both planes when the normal information is not considered; (d, e) the result using the normal information; (f) the ground truth result. 5.5 Experiments in Range Image Segmentation In this section, we show how to use our method to segment range images. Since one main advantage of our method over the traditional methods is that it can resist the influence of noise, we add some randomly distributed noise to the range images. (Note that, as the whole image is processed at the beginning of the segmentation, a high percentage of pseudo-outliers is also present in the data.)
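The randomly distributed noise used in the following experiments can be injected along these lines. This is a minimal sketch: the thesis states only that the noise is randomly distributed, so the impulse model (replacing a fraction of pixels with values drawn uniformly over the image's depth range) and the helper name are assumptions.

```python
import numpy as np

def add_impulse_noise(range_image, fraction=0.15, rng=None):
    # Replace a given fraction of pixels (15% in the experiments below)
    # with depths drawn uniformly between the image minimum and maximum.
    rng = np.random.default_rng(rng)
    img = np.array(range_image, dtype=float)
    idx = rng.choice(img.size, size=int(fraction * img.size), replace=False)
    noisy = img.ravel().copy()
    noisy[idx] = rng.uniform(img.min(), img.max(), size=len(idx))
    return noisy.reshape(img.shape)
```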

Figure 5.6: Segmentation of an ABW range image (test.28) from the USF database. (a) Range image with 15% random noise; (b) segmentation result by the proposed method; (c) the edge image of the result by the proposed method; (d) the edge image of the ground truth result.

Figure 5.7: Segmentation of an ABW range image (test.27) from the USF database. (a) Range image with 15% random noise; (b) segmentation result by the proposed method; (c) the edge image of the result by the proposed method; (d) the edge image of the ground truth result.

Figure 5.8: Segmentation of an ABW range image (test.13) from the USF database. (a) Range image with 15% random noise; (b) segmentation result by the proposed method; (c) the edge image of the result by the proposed method; (d) the edge image of the ground truth result.

In Figure 5.6 (a), Figure 5.7 (a), and Figure 5.8 (a), we added 15% randomly distributed noise to each range image taken from the USF ABW range image database (test.28, test.27, and test.13). As shown in Figure 5.6, Figure 5.7, and Figure 5.8 (b) and (c), our method can resist the influence of a large number of noise-corrupted points. The main surfaces were recovered by our proposed method; only a slight distortion appeared on some boundaries of neighbouring surfaces. In fact, the accuracy of the range data, and the accuracy of the normal at each range point, affect this distortion. It is important to compare the results of our method with the results of other methods. We therefore also compare our results with those of three state-of-the-art range image segmenters (USF, WSU and UB; see (Hoover, Jean-Baptiste et al. 1996)). Figure 5.9: Comparison of the segmentation results for an ABW range image (test.1) from the USF database. (a) Range image; (b) the ground truth result; (c) the result by USF; (d) the result by WSU; (e) the result by UB; (f) the result by the proposed method.

Consider Figure 5.9 and Figure 5.10: (a) is the range image and (b) is the edge map of the manually produced ground truth segmentation, with which the results of all methods should be compared; (c) is the result obtained by USF. From Figure 5.9 (c) and Figure 5.10 (c), we can see that the USF results contain many noisy points. In both Figure 5.9 (d) and Figure 5.10 (d), the WSU segmenter missed surfaces; it also under-segmented a surface in Figure 5.10 (d). From Figure 5.9 (e) and Figure 5.10 (e), we can see that the boundaries at the junctions of surfaces were relatively seriously distorted. Our results are shown in Figure 5.9 (f) and Figure 5.10 (f). Compared with the other methods, the proposed method performed best. Our method directly extracts the planar primitives, and the number of parameters requiring tuning is smaller than in other traditional methods. Figure 5.10: Comparison of the segmentation results for an ABW range image (train.6) from the USF database. (a) Range image; (b) the ground truth result; (c) the result by USF; (d) the result by WSU; (e) the result by UB; (f) the result by the proposed method.

As stated before, adopting the hierarchical sampling technique in the proposed method greatly reduces its time cost. The processing time of the method is affected, to a relatively large extent, by the number of surfaces in the range image: a range image containing simple objects is processed faster than one containing complicated objects. Generally speaking, it takes about 40 seconds (on an AMD 800MHz personal computer, programmed (un-optimized) in the C language) to segment a range image with fewer surfaces, and longer for a range image containing more surfaces. This includes the time for computing the normal information at each range pixel (which takes about 12 seconds). 5.6 Conclusion In this chapter, we developed a quicker version of MDPE, called QMDPE. The advantage of QMDPE is that only the probability density corresponding to the centre of the converged mean shift window needs to be calculated; therefore the time cost of computing the probability density power is greatly reduced. Although QMDPE has a somewhat lower tolerance to outliers than MDPE, it still has a better tolerance than most available estimators (such as M-estimators, LMedS, LTS, RANSAC, ALKS, and RESC). We recommend that when the number of data points is small (say, fewer than 5000 points) and the task relies heavily on the robustness of the estimator, MDPE is an ideal choice. On the other hand, when the task involves a large number of data points (for example, range image segmentation, which often involves tens of thousands of data points or more) and speed is a relatively important consideration, it is better to choose QMDPE rather than MDPE. The second contribution of this chapter is the application of the new QMDPE to the computer vision task of segmenting range data. This part is more than a mere application of the estimator in a straightforward manner: there are a number of issues that need to be addressed when applying an estimator (any estimator) to such a problem.
The solutions we have found to these practical problems arising in the segmentation task should be of independent interest. The resulting combination of a highly robust estimator and a very

careful application of that estimator produces a very effective method for range segmentation. Experimental comparisons of the proposed approach with several other state-of-the-art methods support the claim that the proposed method is more robust to outliers and can achieve good results even when the range images are contaminated by a large number of (impulse) noisy data points. In (Roth and Levine 1990), the authors also employed a robust estimator, LMedS, to segment range images. They first found the largest connected region bounded by edge pixels; they then used LMedS to fit the geometric primitive in the chosen region. They assumed that the largest connected region contained only one geometric primitive. However, if the region includes two or more geometric primitives (as in complicated range images), and each geometric primitive accounts for less than 50% of the data in the region, the estimated primitive will be wrong, because LMedS has a breakdown point of only up to 0.5. The algorithm proposed in this chapter is a model-based method and can directly extract planar primitives from the raw images. Because QMDPE is very robust to noise, the algorithm has the advantage that it can resist the influence of a large amount of random noise in the range image. The proposed algorithm is also robust to the presence of multiple structures. Since we sequentially remove the detected surfaces one by one, the average time to segment a range image is affected by how many surfaces the range image includes. However, the computing time is not greatly affected by the size of the range image, as we use a hierarchical sampling technique.

Chapter 6 Variable-Bandwidth QMDPE for Robust Optical Flow Calculation 6.1 Introduction One major task of computer vision is to compute the optical flow from image sequences (Horn and Schunck 1981; Nagel 1987; Fleet and Jepson 1990; Barron, Fleet et al. 1994; Black 1994; Black and Jepson 1996; Lai and Vemuri 1998; Memin and Perez 1998; Ong and Spann 1999; Farneback 2000; Farneback 2001; Memin and Perez 2002). Accurate computation of optical flow is an important foundation for tasks such as motion segmentation, extracting structure from motion, etc. Traditional methods of computing optical flow are non-robust, which means that they fail to compute the optical flow correctly when the two underlying assumptions, data conservation and spatial coherence, are violated. Clearly, these assumptions will be violated near motion boundaries, and when shadows, occlusions, and/or transparent motions are present. During the last ten years, robust techniques, such as M-estimators (Black and Anandan 1993; Black and Anandan 1996), the Least Median of Squares (LMedS) estimator (Bab-Hadiashar and Suter 1998; Ong and Spann 1999), the Least Trimmed Squares (LTS) estimator (Ming and Haralick 2000), and the robust Total Least Squares (TLS) estimator

(Bab-Hadiashar and Suter 1998), etc., have been employed to extract optical flow. Because these robust estimators can tolerate the influence of bad data, i.e. outliers, they usually obtain better results. Unfortunately, these robust estimators have a breakdown point of no more than 50%. This means that when the data contain more than 50% outliers, these estimators totally break down. This may happen, for example, near a motion boundary. In this chapter, we provide, based on our previous work [see (Wang and Suter 2002a; Wang and Suter 2003b); also see chapters 4 and 5], a robust estimator: variable-bandwidth QMDPE (vbQMDPE). Some may prefer the term adaptive-bandwidth QMDPE (abQMDPE). Instead of using a fixed bandwidth as in QMDPE, vbQMDPE uses data-driven bandwidth selection. We apply the proposed vbQMDPE to the task of optical flow computation. We also correct the results of Bab-Hadiashar and Suter (Bab-Hadiashar and Suter 1998) for the Otte image sequence. vbQMDPE is very robust when the percentage of outliers is less than 80%, outperforming most other methods in optical flow computation. Of course, any method can break down under extreme data: even LMedS and LTS can break down when clustered outliers are present, despite those outliers constituting less than 50% of the whole data (e.g., see section 4.4.2). 6.2 Optical Flow Computation Let I(x, y, t) be the luminance of a pixel at position (x, y) and time t, and v = (u, v) be the optical flow. The data conservation assumption implies (Fennema and Thompson 1979): I(x, y, t) = I(x + uδt, y + vδt, t + δt) (6.1) First-order expansion yields the optical flow constraint (OFC) equation: (∂I/∂x)u + (∂I/∂y)v + ∂I/∂t = 0 (6.2) where ∂I/∂x, ∂I/∂y, and ∂I/∂t are the partial derivatives of the luminance I with respect to space and time at the point (x, y, t).

The residual at (x, y) can be written as: r(u, v) = (∂I/∂x)u + (∂I/∂y)v + ∂I/∂t (6.3) The error measure using least squares (LS) within a small local neighbourhood R can be written as: E_LS(u, v) = Σ_{(x,y)∈R} r(u, v)² (6.4) From equation (6.2), we can see that there is only one equation but two variables to estimate: the aperture problem. In order to constrain the solution, the local region R should be as large as possible. However, if R is too large, the spatial coherence assumption will be violated: the generalized aperture problem (Black and Anandan 1996). The affine motion model of image flow is sometimes used in preference to the constant flow model: u = a0 + a1·x + a2·y; v = a3 + a4·x + a5·y (6.5) Traditional (least squares) methods estimate the optical flow by minimizing the error measure in equation (6.4), assuming a flow model such as (6.5). 6.3 From QMDPE to vbQMDPE In chapters 4 and 5, we proposed a robust estimator, MDPE, and its modification QMDPE, which can both tolerate more than 50% outliers. However, these two robust estimators use a fixed-bandwidth technique and thus require the user to specify the bandwidth h for the kernel density estimation. In practical tasks, it is attractive for the bandwidth to be data-driven. Next, we provide, based on QMDPE, a variable-bandwidth QMDPE, called vbQMDPE.
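As an illustration of the least-squares formulation above, minimizing the error measure of equation (6.4) for the constant flow model reduces to an ordinary least-squares problem. A minimal sketch, assuming the derivative arrays Ix, Iy, It over the region R are given:

```python
import numpy as np

def ls_flow(Ix, Iy, It):
    # Least-squares estimate of constant flow (u, v) over a local region R:
    # minimizes sum over R of (Ix*u + Iy*v + It)^2, i.e. equation (6.4)
    # with the residual of equation (6.3).
    A = np.column_stack([np.ravel(Ix), np.ravel(Iy)])
    b = -np.ravel(It)
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

This is exactly the non-robust baseline the chapter argues against: a single far-off motion in R can pull (u, v) arbitrarily far from the dominant flow.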

6.3.1 Bandwidth Choice One crucial issue in nonparametric density estimation, and in the mean shift method, is how to choose h (Wand and Jones 1995; Comaniciu, Ramesh et al. 2001; Comaniciu and Meer 2002a). We employ a method from (Wand and Jones 1995): ĥ = [243 R(K) / (35 u₂(K)² n)]^(1/5) s (6.6) where R(K) = ∫K(ζ)²dζ, u₂(K) = ∫ζ²K(ζ)dζ, and s is the sample standard deviation. A robust median scale estimator is then given by (Rousseeuw and Leroy 1987): s = 1.4826 med_i |x_i| (6.7) ĥ provides an upper bound on the AMISE (asymptotic mean integrated squared error) optimal bandwidth ĥ_AMISE; thus we choose the bandwidth as c·ĥ, where c is a constant (0 < c < 1) used to avoid over-smoothing (we are also aware that if the bandwidth is too small, it will introduce artefacts). The median scale estimator in equation (6.7) may be biased for non-symmetrical multi-modal data and for data with more than 50% outliers. However, the influence of the bandwidth h on the final result is relatively weak, as h is only used in the pdf estimation and the mean shift method. 6.3.2 The Algorithm of the Variable-Bandwidth QMDPE The vbQMDPE procedure can be written as follows: 1. Randomly choose one p-subset, estimate the model parameters from the p-subset, and calculate the residuals of all data points. 2. Adaptively choose the bandwidth h using the method described in section 6.3.1.

3. Apply the mean shift iteration in the residual space with initial window centre zero, obtaining the centre of the converged window, Xc. 4. Calculate the probability density f̂ at the position Xc by equations (4.1) and (4.2). 5. Calculate the density power according to equation (5.1). 6. Repeat steps (1) to (5) many times. Finally, output the parameters with the maximum density power. "Variable bandwidth" means that the bandwidth h is chosen separately for each randomly selected p-subset, instead of using a fixed bandwidth as in our previous work (Wang and Suter 2002a; Wang and Suter 2003b). In order to improve the statistical efficiency, a weighted least squares procedure (Rousseeuw and Leroy 1987) can be carried out as the final step. 6.3.3 Performance of vbQMDPE Figure 6.1: Comparing the performance of vbQMDPE, LS, LMedS, and LTS with (a) 55%; (b) 80%; (c) 70%; (d) 85% outliers.
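Steps 2-4 of the procedure can be sketched as follows for the Epanechnikov kernel, for which R(K) = 0.6 and u₂(K) = 0.2 in equation (6.6). This is a minimal sketch rather than the thesis implementation; the shrink factor c = 0.8 is an illustrative choice of the constant in (0, 1) described in section 6.3.1.

```python
import numpy as np

def bandwidth(residuals, c=0.8):
    # Step 2: data-driven bandwidth, equation (6.6) with Epanechnikov
    # kernel constants R(K) = 0.6 and u2(K) = 0.2, and the robust median
    # scale s = 1.4826 * med|r_i| of equation (6.7).
    r = np.asarray(residuals, dtype=float)
    s = 1.4826 * np.median(np.abs(r))
    h_hat = (243.0 * 0.6 / (35.0 * 0.2 ** 2 * len(r))) ** 0.2 * s
    return c * h_hat

def mean_shift_mode(residuals, h, x0=0.0, tol=1e-8, max_iter=100):
    # Step 3: mean shift in 1-D residual space.  With the Epanechnikov
    # kernel, each iteration moves the window centre to the mean of the
    # residuals inside the window of radius h, starting from x0 = 0.
    r = np.asarray(residuals, dtype=float)
    x = x0
    for _ in range(max_iter):
        inside = r[np.abs(r - x) < h]
        if inside.size == 0:
            break
        x_new = inside.mean()
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def density_at(residuals, x, h):
    # Step 4: kernel density estimate of f at the converged centre Xc.
    u = (x - np.asarray(residuals, dtype=float)) / h
    return (0.75 * np.maximum(1.0 - u ** 2, 0.0)).sum() / (len(residuals) * h)
```

Running this for every sampled p-subset and keeping the candidate with the largest density power (step 5) gives the skeleton of the sampling loop in step 6.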

We demonstrate that vbQMDPE is very robust to outliers by comparing it to several other traditional methods (LS, LMedS and LTS, which are frequently employed in optical flow calculation). First, we consider a simple setting: line fitting. We generated four kinds of data (one step, two steps, two crossed lines, and four lines), each with a total of 500 data points. The signals were corrupted by Gaussian noise with zero mean and unit standard deviation. Among the 500 data points, α data points were randomly distributed in the range (0, 100). The i'th structure has n_i data points. The four signals are as follows: One step: x: (0-55), y=30, n1=225; x: (55-100), y=40, n2=225; α=50. Two steps: x: (0-30), y=20, n1=100; x: (30-55), y=40, n2=100; x: (55-80), y=60, n3=100; α=200. Two crossed lines: x: (20-70), y=x+10, n1=150; x: (35-85), y=115−x, n2=150; α=200. Four lines: x: (0-25), y=3x+10, n1=75; x: (25-55), y=130−2x, n2=20; x: (40-65), y=3x−110, n3=75; x: (65-90), y=280−3x, n4=75; α=370. From Figure 6.1, we can see that LS is non-robust, and that LMedS and LTS failed to fit all four signals. Only vbQMDPE correctly fitted all four signals, not breaking down even when the data include 85% outliers (Figure 6.1 (d)). 6.4 vbQMDPE and Optical Flow Calculation The optical flow constraint (OFC) is a linear equation in u-v space. Each pixel gives rise to one such linear constraint and, in a noise-free setting, assuming constant u and v, all the lines intersect at a common point.
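As a concrete example, the "one step" signal of section 6.3.3 can be generated as follows. A sketch: drawing the x coordinates uniformly within each stated interval, and the outliers uniformly over (0, 100) in both coordinates, are our assumptions about details the text leaves open.

```python
import numpy as np

def one_step_signal(rng=None):
    # 'One step' signal: 225 points on y = 30 for x in (0, 55), 225 points
    # on y = 40 for x in (55, 100), both with zero-mean unit-variance
    # Gaussian noise, plus alpha = 50 uniformly distributed outliers.
    rng = np.random.default_rng(rng)
    x1 = rng.uniform(0, 55, 225);   y1 = 30 + rng.normal(0, 1, 225)
    x2 = rng.uniform(55, 100, 225); y2 = 40 + rng.normal(0, 1, 225)
    xo = rng.uniform(0, 100, 50);   yo = rng.uniform(0, 100, 50)
    return np.concatenate([x1, x2, xo]), np.concatenate([y1, y2, yo])
```

The other three signals follow the same pattern with the structures and α listed above.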

Two main difficulties in optical flow estimation are (Nesi, Del Bimbo et al. 1995): the discontinuities in the local velocity, and the aperture problem. Figure 6.2: One example of multiple motions: (a) one snapshot of the image sequence; (b) the correct optical flow; (c) the OFC lines in u-v space, with the LS, LTS, LMedS and vbQMDPE estimates marked. The first difficulty is related to occlusions between image illumination discontinuities, moving objects or moving object boundaries. One solution to the second difficulty is to enlarge the local window so as to collect more constraint equations to over-determine the optical flow; this brings higher statistical efficiency. However, enlarging the window

means a greater chance of including multiple motions (forming multiple clusters of intersecting lines, e.g., Figure 6.2). Because traditional estimators (M-estimators, LMedS, LTS, etc.) have breakdown points of at most 50%, they may fail to compute the optical flow when the data include multiple motion structures (i.e. when the outliers occupy more than 50% of the data). In such cases, vbQMDPE performs well. We generated an image sequence with two moving squares using a method similar to that in (Barron, Fleet et al. 1994). Figure 6.2 (a) shows one snapshot of the image sequence. The correct optical flow is shown in Figure 6.2 (b). The small window centred at (110, 136) in Figure 6.2 (a) includes three motions, each involving less than 50% of the data points. Its OFC plot, using symbolically determined derivatives of the image intensities I, is shown in Figure 6.2 (c), from which we can see the three motions included in the small window. The optical flow of each motion, (2.0, 1.0), (−3.0, 1.5), (3.0, −1.5), is marked by a red plus sign. The proposed robust estimator gives the correct optical flow estimate (3.0, −1.5). In contrast, the LMedS method estimates the optical flow as (2.36, −0.71), while the least trimmed squares and least squares estimates are (2.71, −1.43) and (0.06, 0.84), respectively. 6.4.1 Variable-Bandwidth-QMDPE Optical Flow Computation The first step in computing the optical flow is to estimate the spatio-temporal derivatives of the image brightness. We follow Bab-Hadiashar and Suter (Bab-Hadiashar and Suter 1998), and Nagel (Nagel 1995), by convolving the image brightness with derivatives of the 3D spatio-temporal Gaussian function: G(x) = (2π)^(−3/2) |Σ|^(−1/2) exp(−x^T Σ^(−1) x / 2) (6.8) where x = (x, y, t)^T and Σ is the covariance matrix. There are methods to estimate the derivatives near the discontinuities of optical flow (e.g., (Ye and Haralick 2000)). In our simple approach, we first estimate the derivatives of I with an initial standard deviation σ0. Then, when the estimated derivatives (Ix, Iy, and It) are larger

than a threshold, we simply re-compute the derivatives with half of the standard deviation in the corresponding direction. For each NxN patch of the image and a chosen motion model (in our case, the constant motion model or the affine motion model), we solve for the flow using vbQMDPE. The measure of reliability in (Bab-Hadiashar and Suter 1998) can be employed in our method. 6.4.2 Quantitative Error Measures for Optical Flow When the ground truth optical flow of an image sequence is known, the error analysis is performed by Barron's method (Barron, Fleet et al. 1994). The angular error measure is reported in degrees: E = arccos(v_e · v_c) (6.9) where v_e = (u, v, 1)^T / (u² + v² + 1)^(1/2) and v_c is the true motion vector. Both the average and the standard deviation of the errors are reported. 6.5 Experimental Results on Optical Flow Calculation The proposed algorithm has been evaluated on both synthetic and real images. Three well-known image sequences are used (see Figure 6.3): the Diverging Tree sequence and the Yosemite sequence (both obtained from ftp://csd.uwo.ca/pub/vision), and the Otte image sequence. Table 6.1 shows the comparative results on the Diverging Tree sequence (Figure 6.3 (a)), showing that the proposed method gives the most accurate results for the affine motion model. Even for the constant motion model, vbQMDPE still yields better results than most of the other methods compared. Figure 6.3 (b) shows one snapshot of the Yosemite sequence. Because the true motion of the clouds does not really reflect the image brightness changes, we exclude the clouds in our experiments. From Table 6.2, we can see that the proposed algorithm and Farneback's

algorithms give the best overall results. The standard deviation of the error of our results is less than that of Farneback's results (Farneback 2000; Farneback 2001) for both the constant and the affine motion models. Although the average angular error of Farneback's results (Farneback 2000) is better than ours for the constant motion model, our results for the affine motion model with a larger local window outperform those results. However, the average angular error of Farneback's later version (Farneback 2001), which used an affine motion model combined with a region-growing segmentation algorithm, is better than ours. To our knowledge, it is the best result obtained so far in the field of optical flow computation for the Yosemite sequence. Figure 6.3: Snapshots of the three image sequences: (a) the Diverging Tree; (b) the Yosemite; and (c) the Otte sequence.

We also note that our results with the affine motion model are better than those with the constant motion model on both the Diverging Tree and the Yosemite sequences. This is because the motion in these two sequences is mostly diverging: the optical flow changes for each pixel within a small local window, so the affine motion model reflects the true situation better than the constant model. The Otte sequence (Figure 6.3 (c)) is a real image sequence (Otte and Nagel 1995), and it is difficult because it includes many sharp discontinuities in both motion and depth. When we recomputed the optical flow for the Otte image sequence (frame 35) with Bab-Hadiashar and Suter's code, we found that the results in (Bab-Hadiashar and Suter 1998) were wrongly reported (our corrected figures show an improved performance). From Table 6.3, we can see that our vbQMDPE outperforms all the other published benchmarks. 6.6 Conclusion We have developed a novel robust estimator, variable-bandwidth QMDPE, and we have applied it to optical flow computation. By employing nonparametric density estimation and density gradient estimation techniques in parametric model estimation, the proposed method is very robust to outliers and is a substantial improvement over traditional methods. We expect to do even better with a multi-resolution version of our approach. Our unoptimized code takes about 6 minutes on the Yosemite image sequence on a 1.2GHz AMD personal computer, using 17x17 patches around each pixel and with m set to 30. The speed can be improved with a smaller m and smaller patches, but with worse accuracy. The mean number of mean shift iterations is about 3 for each p-subset.

Table 6.1: Comparative results on the Diverging Tree sequence (average angular error in degrees, standard deviation in degrees, and density in %). The first part of the table lists the results reported by Barron et al. (1994) and Ong and Spann (1999): Horn and Schunck (original, unthresholded); Horn and Schunck (modified, unthresholded); Uras et al. (unthresholded); Nagel; Anandan; Singh (step 1, unthresholded); Singh (step 2, unthresholded); and the least-squares (block-based) method in Ong and Spann (1999). The second part lists the results obtained by the proposed algorithm: vbQMDPE2 and vbQMDPE6 (σ0=1.5, 11x11, m=30), where the numbers 2 and 6 denote the constant and affine motion models.

Table 6.2: Comparative results on the Yosemite sequence, cloud region excluded (average angular error in degrees, standard deviation in degrees, and density in %). The first part lists results reported in the recently referenced literature: Black (1994); Szeliski and Coughlan (1994); Black and Anandan (1996); Black and Jepson (1996); Ju et al. (1996); Memin and Perez (1998); Memin and Perez (2002); Lai and Vemuri (1998); Bab-Hadiashar and Suter (WTLS2 and WTLS6, 1998); Farneback2 and Farneback6 (2000); and Farneback6 (2001). The second part lists our results: vbQMDPE2 and vbQMDPE6 (σ0=2.0, m=30) with 17x17 and 25x25 patches.

Table 6.3: Comparative results on the Otte image sequence (average angular error in degrees, standard deviation in degrees, and density in %). The first part lists the results reported by Bab-Hadiashar and Suter (1998): Giachetti and Torre (1996) and Bab-Hadiashar and Suter (WLS2, WLS6, WTLS2, WTLS6, 1998). The second part lists the corrected results for WLS2, WLS6, WTLS2 and WTLS6. The third part lists the results obtained by running the proposed algorithm: vbQMDPE2 and vbQMDPE6 (σ0=2.0, m=30) with 17x17 and 25x25 patches.

Chapter 7
A Highly Robust Scale Estimator for Heavily Contaminated Data

It is not enough to (only) correctly estimate the parameters of a model in order to differentiate inliers from outliers; it is also important to robustly estimate the scale of the inliers. In this chapter, we propose a new robust scale estimation technique: the Two-Step Scale estimator (TSSE). TSSE applies nonparametric density estimation and density gradient estimation techniques to robustly estimate the scale of the inliers for heavily contaminated data. TSSE can tolerate more than 80% outliers, and comparative experiments show its advantages over five other robust scale estimators: the median, the median absolute deviation (MAD), the Modified Selective Statistical Estimator (MSSE), Residual Consensus (RESC), and Adaptive Least Kth order Squares (ALKS).

7.1 Introduction

As emphasized in chapter 2, in computer vision tasks it frequently happens that gross noise and pseudo-outliers occupy the absolute majority of the data. Most past work aimed at presenting robust estimators with a high breakdown point (Rousseeuw 1984; Yu, Bui et al. 1994; Stewart 1995; Miller and Stewart 1996; Lee, Meer et al. 1998), i.e. the estimator can

correctly find the parameters of a model from data which are heavily contaminated. However, correctly estimating the parameters of a model is not enough to differentiate inliers from outliers: having a correct scale of the inliers is crucial to the robust behaviour of an estimator. The success of some robust estimators is based on having a correct initial scale estimate, or on the correct setting of a particular parameter that is related to scale (e.g., RANSAC, the Hough Transform, M-estimators, etc.). Thus, their performance crucially depends on that user-provided scale-related knowledge. Robust scale estimation is often attempted during a post-processing stage of robust estimators (such as LMedS, LTS, etc.). Yet, although there are many papers proposing robust estimators with a high breakdown point for model fitting, robust scale estimation is relatively neglected.

In this chapter, we investigate the behaviour of several robust scale estimators that are widely used in the computer vision community and show the problems of these scale estimation techniques. We also propose a new robust scale estimator, the Two-Step Scale estimator (TSSE), based on nonparametric density estimation and density gradient estimation techniques. TSSE can tolerate more than 80% outliers and outperforms the five comparative scale estimators (the median, MAD, ALKS, RESC, and MSSE scale estimators).

This chapter is organized as follows: in section 7.2, we review previous robust scale techniques. In section 7.3, we propose a simple but efficient mean shift valley algorithm, by which local valleys can be found, and we propose the novel robust scale estimator, TSSE. TSSE is experimentally compared with five other robust scale estimators, using data with multiple structures, in section 7.4. We conclude in section 7.5.

7.2 Robust Scale Estimators

The emphasis in many past computer vision papers presenting robust estimators was on a high breakdown point (Rousseeuw 1984; Rousseeuw and Leroy 1987; Yu, Bui et al. 1994; Stewart 1995; Miller and Stewart 1996; Lee, Meer et al. 1998; Wang and Suter 2003b), i.e.
an estimator that can correctly find the parameters of a model from data which are heavily contaminated. Whether or not the inliers can be successfully differentiated from the outliers depends on two factors:

(1) whether the parameters of a model are correctly found; and
(2) whether the scale of the inliers is correctly estimated.

The second step, scale estimation, plays an important role in the overall robust behaviour of these methods. Some robust estimators, such as M-estimators, RANSAC, the Hough Transform, etc., put the onus on the "user": they simply require some user-set parameters that are linked to the scale of the inliers. Others, such as LMedS, RESC, MDPE, QMDPE, etc., use an auxiliary estimate of scale (after finding the parameters of a model) during a post-processing stage, which aims to differentiate inliers from outliers. Given a scale estimate s, the inliers are usually taken to be those data points that satisfy the following condition:

    |r_i| / s < T    (7.1)

where r_i is the residual of the i'th sample and T is a threshold. For example, if T is 2.5 (1.96), 98% (95%) of a Gaussian distribution will be identified as inliers.

7.2.1 The Median and Median Absolute Deviation (MAD) Scale Estimators

Among many robust estimators, the sample median is one of the most famous. The sample median is bounded when the data include more than 50% inliers. A robust median scale estimator is then given by (Rousseeuw and Leroy 1987):

    M = 1.4826 (1 + 5/(n − p)) √(med_i r_i²)    (7.2)

where r_i is the residual of the i'th sample, n is the number of sample points, and p is the dimension of the parameter space (e.g., 2 for a line, 3 for a circle). A variant, MAD, is also used to estimate the scale of the inliers (Rousseeuw and Croux 1993):

    MAD = 1.4826 med_i { |r_i − med_j r_j| }    (7.3)
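As a concrete illustration, the median scale estimator of eq. (7.2), the MAD estimator of eq. (7.3), and the inlier test of eq. (7.1) can be sketched as follows (a minimal Python sketch of these standard formulas; the use of NumPy and the function names are our own choices):

```python
import numpy as np

def median_scale(residuals, p):
    """Median scale estimate of eq. (7.2):
    1.4826 * (1 + 5/(n - p)) * sqrt(med_i r_i^2)."""
    r = np.asarray(residuals, dtype=float)
    n = len(r)
    return 1.4826 * (1.0 + 5.0 / (n - p)) * np.sqrt(np.median(r ** 2))

def mad_scale(residuals):
    """MAD scale estimate of eq. (7.3): 1.4826 * med_i |r_i - med_j r_j|."""
    r = np.asarray(residuals, dtype=float)
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def inliers(residuals, s, T=2.5):
    """Inlier test of eq. (7.1): points with |r_i| / s < T."""
    return np.abs(np.asarray(residuals, dtype=float)) / s < T
```

For pure Gaussian residuals (as in the experiment of section 7.4.1 with σ=3), both estimators return values close to the true σ, and the T=2.5 test flags roughly 98% of the points as inliers.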

The MAD estimator is very robust to outliers and has a 50% breakdown point. The outliers can be recognized by computing:

    |r_i − med_j r_j| / MAD < T    (7.4)

where T is a threshold. The median and MAD are often used to yield initial scale values (before estimating the parameters of a model) for many robust estimators. These two methods can also serve as auxiliary scale estimators (after finding the parameters of a model) for other robust estimators. Because the median and MAD have 50% breakdown points, they will break down when the data include more than 50% outliers. Moreover, both methods are biased in multiple-mode cases even when the data contain less than 50% outliers (see section 7.4).

7.2.2 Adaptive Least K-th order Squares (ALKS) Scale Estimator

As we outlined in section 2.4.5, the authors of ALKS (Lee, Meer et al. 1998) employ the robust k scale estimation technique, searching for a model minimizing the k-th order statistic of the squared residuals. The optimal value of k is that which corresponds to the minimum of the variance of the normalized error (see equation (2.32)). The authors assume that when k is increased so that the first outlier is included, the increase of ŝ_k is much less than that of σ̂_k. ALKS is limited in its ability to handle extreme outliers. Another problem we found (Wang and Suter 2003b) (see also chapter 4) is ALKS's lack of stability under a small percentage of outliers.
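The two-mode bias of the MAD estimator can be illustrated numerically. The sketch below (our own illustration) assumes residuals to the first structure of a step signal like that of section 7.4.2, with the second structure 30 units away:

```python
import numpy as np

def mad_scale(residuals):
    """MAD scale estimate of eq. (7.3): 1.4826 * med_i |r_i - med_j r_j|."""
    r = np.asarray(residuals, dtype=float)
    return 1.4826 * np.median(np.abs(r - np.median(r)))

rng = np.random.default_rng(0)
# residuals of a correct fit to the first structure of a two-mode signal:
# 3000 inliers (sigma = 3) plus 2000 pseudo-outliers from a second
# structure 30 units away (values chosen to mimic section 7.4.2)
inlier_res = rng.normal(0.0, 3.0, 3000)
pseudo_res = rng.normal(30.0, 3.0, 2000)
s_clean = mad_scale(inlier_res)                                # close to 3
s_mixed = mad_scale(np.concatenate([inlier_res, pseudo_res]))  # inflated
```

Even though the pseudo-outliers are fewer than 50% of the data, the second mode drags both the median and the absolute deviations upward, so `s_mixed` comes out far above the true inlier scale of 3, in line with the biased MAD value reported in section 7.4.2.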

7.2.3 Residual Consensus (RESC) Scale Estimator

In section 2.4.6, we discussed the RESC estimator (Yu, Bui et al. 1994). In that section, we concentrated on the estimation of the parameters, not on scale estimation. After finding a fit, RESC estimates the scale of the fit by directly calculating:

    σ = α ( 1/(v_c − 1) Σ_{i=1}^{v_c} h_i^c (iδ − h̄^c)² )^{1/2}    (7.5)

where h̄^c is the mean of all residuals included in the compressed histogram; α is a correction factor for the approximation introduced by rounding the residuals in a bin of the histogram to iδ (δ is the bin size of the compressed histogram); and v_c is the number of bins of the compressed histogram. However, we found that the estimated scale is still overestimated, for the reason that, instead of summing the squared differences between all individual residuals and the mean residual in the compressed histogram, equation (7.5) sums the squared differences between the rounded bin values of the compressed histogram and the mean residual in the compressed histogram. To reduce this problem, we revise equation (7.5) as follows:

    σ = ( 1/(n_c − 1) Σ_{i=1}^{n_c} (r_i − h̄^c)² )^{1/2}    (7.6)

where n_c is the number of data points in the compressed histogram.

7.2.4 Modified Selective Statistical Estimator (MSSE)

Bab-Hadiashar and Suter (1999) have used least k-th order (rather than median) methods and a heuristic way of estimating scale to perform range segmentation. After finding a fit, they try to recognize the first outlier by detecting a jump in the k-th residual, which indicates the unbiased scale estimate using the first k residuals in ascending order:

    σ̂_k² = ( Σ_{i=1}^{k} r_i² ) / (k − p)    (7.7)

where p is the dimension of the model. They assume that when k is increased, the value of the k-th residual will jump when it comes from a different distribution. Thus, the scale can be estimated by checking the validity of the following inequality:

    σ̂²_{k+1} / σ̂²_k > 1 + (T² − 1)/(k + 1 − p)    (7.8)

Because this method does not rely on the k-th order statistic (it uses only the first k data points that have been classified as inliers), it is less biased when the data include a multiple-structural distribution. However, though their method can handle large percentages of outliers and pseudo-outliers, it does not seem as successful in tolerating extreme cases.

7.3 A Novel Robust Scale Estimator: TSSE

In this section, we will introduce a mean shift valley (MSV) technique and then propose a highly robust scale estimator (TSSE), which is very robust to multiple-structural data.

7.3.1 Mean Shift Valley Algorithm

Although the mean shift method has been extensively exploited and applied in low-level computer vision tasks (Cheng 1995; Comaniciu and Meer 1997; Comaniciu and Meer 1999b; Comaniciu and Meer 2002a) for its efficiency in seeking local peaks of probability density, it is sometimes very important to find the valleys of distributions. Based upon the Gaussian kernel, a saddle-point seeking method was published in (Comaniciu, Ramesh et

al. 2002b). Here, we provide a simpler method to find local valleys of a one-dimensional density. One characteristic of the mean shift vector is that it always points towards the direction of the maximum increase in the density. Thus, the direction opposite to the mean shift vector will always point towards a local minimum of the density. In order to find valleys in density space, we define the mean shift valley vector MV_h(x) to point in the opposite direction to the peak:

    MV_h(x) = −M_h(x) = x − (1/n_x) Σ_{x_i ∈ S_h(x)} x_i    (7.9)

Replacing M_h(x) in (4.6) by MV_h(x), we obtain:

    MV_h(x) = − (h²/(d + 2)) ∇f̂(x)/f̂(x)    (7.10)

MV_h(x) always points towards the direction of the maximum decrease in the density. In practice, we find that the step size given by the above equation may lead to oscillation, so we derive a recipe for avoiding oscillations in valley seeking. Let {y_k}, k=1,2,..., be the sequence of successive locations of the mean shift valley procedure; then we take a modified step:

    y_{k+1} = y_k + p · MV_h(y_k)    (7.11)

where p is a correction factor, and 0 < p ≤ 1. If the shift step at y_k is large, it causes y_{k+1} to jump over the local valley and thus oscillate about the valley. This problem can be avoided by adjusting the correction factor p so that MV_h(y_k)ᵀ MV_h(y_{k+1}) > 0. The mean shift valley algorithm can be described as follows:

1. Choose the bandwidth h; set p = 1; and initialise the location of the window.

2. Compute the shift vector MV_h(y_k).
3. Compute y_{k+1} by equation (7.11), and compute MV_h(y_{k+1}).
4. If MV_h(y_k)ᵀ MV_h(y_{k+1}) > 0, go to step 5; otherwise, let p = p/2 and repeat steps 3 and 4 until MV_h(y_k)ᵀ MV_h(y_{k+1}) > 0.
5. Translate the search window by p · MV_h(y_k).
6. Repeat steps 3 to 5 until convergence.

Figure 7.1: An example of the application of the mean shift valley method to find local valleys. [The figure plots the probability density of three normal distributions, with the initial points V0 and V1 and the converged points V0' and V1' marked.]

To illustrate the mean shift valley method, three normal modes (mode 1 includes 600 data points, mode 2 includes 500 data points, and mode 3 includes 600 data points), with a total of 1700 data points, were generated in Figure 7.1. We selected two initial points: V0 (0.3) and V1 (7.8). The search window radius was chosen as 2.0. The mean shift valley method automatically found the local minimum densities (the converged points V0' and V1' shown in Figure 7.1). The centers (V0' and V1') of the converged windows correspond to the local minimum probability densities. If we use V0' and V1' as two density thresholds, the whole data set can be decomposed into three modes (see Table 7.1).
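Steps 1-6 above can be sketched in one dimension as follows (a rough illustration of ours using a flat kernel, so the mean shift vector is simply the window mean minus the center; the divergence guard of section 7.3.1 is the empty-window test):

```python
import numpy as np

def mean_shift_valley(data, x0, h, max_iter=500, tol=1e-7):
    """1-D mean shift valley (eqs. 7.9 and 7.11): step opposite to the
    mean shift vector, halving the correction factor p whenever two
    consecutive shift vectors disagree in sign (oscillation guard)."""
    data = np.asarray(data, dtype=float)

    def mv(x):
        # opposite of the flat-kernel mean shift vector at x
        win = data[np.abs(data - x) <= h]
        if win.size == 0:
            return None          # no samples in window: stop (divergence guard)
        return x - win.mean()

    y, p = float(x0), 1.0
    step = mv(y)
    for _ in range(max_iter):
        if step is None or abs(p * step) < tol:
            break
        y_next = y + p * step
        step_next = mv(y_next)
        if step_next is not None and step * step_next < 0:
            p /= 2.0             # would overshoot the valley: shrink the step
            continue
        y, step = y_next, step_next
    return y
```

Started between two well-separated Gaussian modes, the iteration settles near the density minimum between them, mirroring the V0/V1 example of Figure 7.1.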

Table 7.1: Applying the mean shift valley method to decompose the data, comparing, for each of the three modes, the mean and number of points of the generated data with the estimated values. [Numeric entries lost in transcription.]

There is one exceptional case: when there are no local valleys (e.g., a unimodal density), the mean shift valley method diverges. This can easily be avoided by terminating when no samples fall within the window. Next, we apply the mean shift and mean shift valley methods, in (one-dimensional) residual space, to produce a highly robust scale estimator: the Two-Step Scale Estimator.

7.3.2 Two-Step Scale Estimator (TSSE)

We base our method on the assumption that the inliers occupy a relative majority of the data and are Gaussian distributed, while the whole data set may have a multiple-structural distribution. Thus, we propose a robust two-step method to estimate the scale of the inliers:

(1) Use mean shift, with initial center zero (in ordered absolute residual space), to find the local peak, and then use the mean shift valley method to find the valley next to the peak. Note: modes other than that of the inliers will be disregarded, as they lie beyond the obtained valley.

(2) Estimate the scale of the fit by the median scale estimator, using the points within the band centered at the local peak and extending to the valley.

TSSE is very robust to outliers and can resist heavily contaminated data with multiple structures. In the next section, we compare the achievements of our method and five other methods; the experiments show the advantages of the proposed method over the other methods.
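The two steps above can be sketched as follows (a rough 1-D sketch of ours with a flat kernel; the bandwidth h, starting the valley search at peak + h, and taking the median over the band [0, valley] are simplifying choices for illustration, not the exact formulation of the chapter):

```python
import numpy as np

def mean_shift_1d(data, x0, h, max_iter=500, tol=1e-7):
    """Plain flat-kernel mean shift: move toward the local density peak."""
    data = np.asarray(data, dtype=float)
    y = float(x0)
    for _ in range(max_iter):
        win = data[np.abs(data - y) <= h]
        if win.size == 0:
            break
        y_new = win.mean()
        if abs(y_new - y) < tol:
            break
        y = y_new
    return y

def mean_shift_valley(data, x0, h, max_iter=500, tol=1e-7):
    """Mean shift valley of section 7.3.1 (flat kernel, oscillation guard)."""
    data = np.asarray(data, dtype=float)

    def mv(x):
        win = data[np.abs(data - x) <= h]
        return None if win.size == 0 else x - win.mean()

    y, p = float(x0), 1.0
    step = mv(y)
    for _ in range(max_iter):
        if step is None or abs(p * step) < tol:
            break
        y_next = y + p * step
        step_next = mv(y_next)
        if step_next is not None and step * step_next < 0:
            p /= 2.0
            continue
        y, step = y_next, step_next
    return y

def tsse(residuals, h):
    """TSSE sketch: (1) mean shift from zero in absolute residual space to
    the inlier peak, then valley seeking past the peak; (2) median scale
    over the band up to the valley (inliers assumed Gaussian)."""
    r = np.abs(np.asarray(residuals, dtype=float))
    peak = mean_shift_1d(r, 0.0, h)
    valley = mean_shift_valley(r, peak + h, h)   # start just beyond the peak
    band = r[r <= valley]                        # other modes lie beyond it
    return 1.4826 * np.median(band)
```

With unit-scale Gaussian inliers and all gross outliers well beyond the valley, the band contains essentially only inliers, so the median scale estimate is no longer corrupted by the outlier modes.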

7.4 Experiments on Robust Scale Estimation

In this section, we investigate the behaviour of several state-of-the-art robust scale estimators that are widely used in the computer vision community and show the weaknesses of these scale estimation techniques. We assume that the parameters of the model have been correctly estimated. In the following experiments, we compare the proposed method, TSSE, with five other robust scale estimators: the median, MAD, ALKS, MSSE, and the revised RESC (according to the revised equation (7.6)). The comparative experiments show that the proposed method achieves better results than the other five robust scale estimators.

The signals were generated as follows: the i'th structure has n_i data points, corrupted by Gaussian noise with zero mean and standard deviation σ_i. A further α data points were randomly distributed in the range (0, 100).

7.4.1 Normal Distribution

First, we generate a simple line signal. One line: x: (0-55), y=30, n_1=10000, σ_1=3; α=0, i.e., 100% inliers. After applying the six robust scale estimators to the signal, we obtained the following estimates: the median (3.0258); MAD (3.0237); ALKS (2.0061); MSSE (2.8036); the revised RESC (2.8696); and TSSE (3.0258). Among these six comparative methods, the median, MAD, and TSSE gave the most accurate results. ALKS gave the worst result. This is because the robust estimate ŝ_k is an underestimate of σ for all values of k (Rousseeuw and Leroy 1987), and because the criterion equation (2.32) estimates the optimal k wrongly: ALKS used only about 15% of the data as inliers. MSSE used 98% of the data points as inliers, which is reasonably good.

7.4.2 Two-mode Distribution

In this subsection, we analyse more complicated data. We generated a step signal so that the data include two structures, i.e. two lines.

A step signal: line 1: x: (0-55), y=40, n_1=3000, σ_1=3; line 2: x: (55-100), y=70, n_2=2000, σ_2=3; α=0. The results that we obtained are as follows: the median (6.3541); MAD (8.8231); ALKS (3.2129); MSSE (2.8679); the revised RESC (2.9295); and TSSE (3.0791). Among these six methods, the median and MAD gave the worst results. This is because the median and MAD scale estimators assume that the residuals of the whole data set follow a Gaussian distribution, an assumption violated by this signal (which contains two modes). The other four robust scale estimators yield good results.

7.4.3 Two-mode Distribution with Random Outliers

Next, we again use the above one-step signal, but we increased the number of outliers so that the data include 80% outliers, i.e., n_1=1000; n_2=750; α=3250. After applying the six methods, the estimated scales of the signal that we obtained are: the median ( ); MAD ( ); ALKS (7.2586); MSSE ( ); the revised RESC ( ); and TSSE (4.1427). From the obtained results, we can see that only the proposed method gave a reasonably good result, while the other five methods all failed to estimate the scale of the inliers when the data involve such a high percentage of outliers.

7.4.4 Breakdown Plot

A Roof Signal. We generate a roof signal containing 500 data points in total. A roof: x: (0-55), y=x+30, n_1, σ=2; x: (55-100), y=140−x, n_2=50, σ=2. At the beginning, we assign 450 data points to n_1 and set the number of uniform outliers α=0; thus, the data include 10% outliers. Then we decrease n_1 and, at the same time, increase α so that the total number of data points remains 500. Finally, n_1=75 and α=375, i.e. the data include 85% outliers. The experiments are repeated 20 times.

Figure 7.2 shows that TSSE yielded the best results among the six comparative methods. The revised RESC method begins to break down when the outliers occupy around 60% of the data. MSSE gave reasonable results when the percentage of outliers is less than 75%, but it broke down when the data include more outliers. Although the breakdown points of the median and the MAD scale estimators are as high as 50%, their results deviated from the true scale even when outliers were less than 50% of the data, and they become more and more biased from the true scale as the percentage of outliers increases. ALKS yielded less accurate results than TSSE, and less accurate results than the revised RESC and MSSE when outliers are less than 60%.

Figure 7.2: Breakdown plot of the six methods (the median scale estimator, MAD, ALKS, MSSE, the revised RESC, and TSSE) in estimating the scale of a roof signal: error in scale versus percentage of outliers.

A Step Signal. We generated another signal: a one-step signal that contains 1000 data points in total. One-step signal: x: (0-55), y=30, n_1, σ=2; x: (55-100), y=40, n_2=100, σ=2. At the beginning, we assign n_1 900 data points and set the number of uniform outliers α=0; thus, the data include 10% outliers. Then, we decrease n_1 and, at the same time, we

increase α so that the total number of data points remains 1000. Finally, n_1=150 and α=750, i.e. the data include 85% outliers. From Figure 7.3, we can see that TSSE gave the most accurate estimate of the scale of the signal. In contrast, the revised RESC begins to break down when the number of outliers is about 50% of the data. MSSE gave reasonable results when the percentage of outliers is less than 70%; however, it broke down when the data include more outliers. The median and the MAD scale estimators become more and more biased as the percentage of outliers increases for the two-structured signal. ALKS yielded less satisfactory results.

Figure 7.3: Breakdown plot of the six methods (the median scale estimator, MAD, ALKS, MSSE, the revised RESC, and TSSE) in estimating the scale of a step signal: error in scale versus percentage of outliers.

Compared with Figure 7.2, we can see that the revised RESC, MSSE, and ALKS yielded less accurate results for the small-scale step signal than for the roof signal, but the results of the proposed TSSE are similarly accurate for both types of signals. Even when the data include 85% outliers, the scale of the inliers recovered by TSSE for the one-step signal is 2.95, which is reasonably good.

Breakdown Plot for the Robust k Scale Estimator. If the data have a Gaussian-like distribution, the median scale estimator (7.2) is only one possible robust k scale estimator (2.31) (corresponding to k=0.5n). We investigated the achievements of the robust k scale estimator (assuming the correct parameters of a model have been found). Let:

    S(q) = d̂_q / Φ⁻¹[(1 + q)/2]    (7.12)

where q is set from 0 to 1. Thus, S(0.5) is the median scale estimator. We generated a one-step signal containing 500 data points in total. One-step signal: x: (0-55), y=30, n_1, σ=1; x: (55-100), y=40, n_2=50, σ=1. At the beginning, n_1=450 and α=0; then we decrease n_1 and, at the same time, increase α until n_1=50 and α=400, i.e. the data include 90% outliers.

Figure 7.4: Breakdown plot of different robust k scale estimators S(q), for q = 0.5, 0.4, 0.3, 0.2, and 0.1.

As Figure 7.4 shows, after finding a robust estimate of the parameters of a model, the accuracy of S(q) increases as q decreases. When the outliers are less than 50% of the whole data, the difference between different values of q is small. However, when the data

include more than 50% outliers, the difference between the various values of q is large. This provides a useful cue for robust estimators which use the median scale method to recover the scale of the inliers.

7.4.5 Performance of TSSE

From the experiments in this section, we can see that the proposed TSSE is a very robust scale estimator, achieving better results than the other five methods. However, we must acknowledge that the accuracy of TSSE is related to the accuracy of the kernel density estimation. In particular, for very few data points, the kernel density estimates will be less accurate. Also, the proposed TSSE may underestimate the scale if the underlying distribution is composed of heavily overlapping Gaussians.

We also note that, for the purposes of this chapter (only), we assume we know the parameters of the model: this is so we can concentrate on estimating the scale of the residuals. However, in practice, one cannot directly estimate the scale: the parameters of a model also need to be estimated. In the next chapter, we propose a new robust estimator, the Adaptive Scale Sample Consensus (ASSC) estimator, which can estimate the parameters and the scale simultaneously.

7.5 Conclusions

In this chapter, we have shown that scale estimation for data involving multiple structures and high percentages of outliers is as yet a relatively unsolved problem. This provides an important warning to the computer vision community: it is necessary to carefully choose a proper scale estimator.

We have also, based on the mean shift algorithm, proposed a simple but efficient mean shift valley technique, which can be used to find local valleys. Furthermore, we have proposed a promising robust scale estimator (TSSE), based on the mean shift and mean shift valley techniques. The experiments compare TSSE with five other state-of-the-art robust scale

estimators, and show the advantages of TSSE over these methods, especially when the data involve a high percentage of outliers and the noise level of the inliers is large. TSSE is a very general method: it can be used to give an initial scale estimate for robust estimators such as M-estimators, etc., and it can also be used to provide an auxiliary estimate of scale (after the parameters of the model to fit have been found) as a component of almost any robust fitting method, such as the Hough Transform (Hough 1962), MDPE (chapter 4), QMDPE (chapter 5), etc.

Chapter 8
Robust Adaptive-Scale Parametric Model Estimation for Computer Vision

8.1 Introduction

Robust model fitting essentially requires the application of two estimators. The first is an estimator for the values of the model's parameters. The second is an estimator for the scale of the noise in the (inlier) data. In the previous chapter, we proposed a novel robust scale estimation technique, the Two-Step Scale estimator (TSSE), and showed the performance of TSSE on heavily contaminated data, assuming that the correct parameters of a model are available. However, in many practical cases, the parameters of a model and the scale of the inliers need to be estimated simultaneously.

In this chapter, based on our previous work (TSSE), we propose a novel robust estimator: the Adaptive Scale Sample Consensus (ASSC) estimator. The ASSC estimator combines Random Sample Consensus (RANSAC) and TSSE, and uses a modified objective function that depends upon both the number of inliers and the corresponding scale.

Discontinuous signals (such as parallel lines/planes, step lines/planes, etc.) often appear in computer vision tasks. A lot of work has been done to investigate the behaviour of robust estimators for discontinuous signals, e.g., (Miller and Stewart 1996; Stewart 1997; Stewart

1999; Chen and Meer 2002). Discontinuous signals are hard to deal with: e.g., most robust estimators break down and yield a "bridge" between the two planes of a one-step signal. ASSC is very robust to discontinuous signals and to data with multiple structures, being able to tolerate more than 80% outliers. The main advantage of ASSC over RANSAC is that prior knowledge about the scale of the inliers is not needed: ASSC can simultaneously estimate the parameters of a model and the scale of the inliers belonging to that model. Experiments on synthetic data show that ASSC is more robust to heavily corrupted data than Least Median of Squares (LMedS), Residual Consensus (RESC), and Adaptive Least K-th order Squares (ALKS). We also apply ASSC to two fundamental computer vision tasks: range image segmentation and robust fundamental matrix estimation. Experiments show very promising results.

In the latter part of this chapter, we extend ASSC to produce the ASRC (Adaptive-Scale Residual Consensus) estimator. ASRC scores a model based on both the residuals of the inliers and the corresponding scale estimate determined by those inliers. The difference between ASRC and ASSC is that in ASSC all inliers are treated the same, i.e., each inlier contributes 1 to the objective function, whereas in ASRC the sizes of the residuals of the inliers are influential.

The main contributions of this chapter can be summarized as follows. By employing TSSE in a RANSAC-like procedure, we propose a highly robust estimator: the Adaptive Scale Sample Consensus (ASSC) estimator. The experiments presented show that ASSC is highly robust to heavily corrupted data with multiple structures and discontinuities (empirically, ASSC can tolerate more than 80% outliers), and that it outperforms several competing methods. ASSC is successfully applied to two fundamental and important computer vision tasks, range image segmentation and fundamental matrix estimation, with promising experimental results.

Finally, we extend ASSC to produce ASRC, which improves the objective function of ASSC by weighting each inlier differently according to the size of the residual of that inlier. Experiments show the advantages of ASRC over ASSC.

This chapter is organized as follows: in section 8.2, the robust ASSC estimator is proposed. In section 8.3, experimental comparisons, using both 2D and 3D examples, are presented. We apply ASSC to range image segmentation in section 8.4 and to fundamental matrix estimation in section 8.5. We introduce ASRC and provide some experiments showing the advantages of ASRC over ASSC in section 8.6. We state our conclusions in section 8.7.

8.2 Adaptive Scale Sample Consensus (ASSC) Estimator Algorithm

In section 2.4.3, we reviewed the RANSAC estimator. The criterion used by RANSAC is to maximize the number of data points within the user-set error bound. Clearly, this bound is related to the scale of the inliers (S). Mathematically, the RANSAC estimate can be written as:

    θ̂ = arg max_θ̂ n_θ̂    (8.1)

where n_θ̂ is the number of points whose absolute residual under the candidate parameters θ̂ is within the error bound (i.e., |r| ≤ 2.5S), and θ̂ is the parameter estimate from one of the randomly chosen p-subsets.

The error bound in RANSAC is crucial to its performance. Provided with a correct error bound for the inliers, the RANSAC method can find a model even when the data contain a large percentage of gross errors. However, when a wrong error bound is given, RANSAC will totally break down even when outliers occupy a relatively small percentage of the whole data (see section ). Thus, the major problem with RANSAC is that the

technique needs prior knowledge of the error bound of the inliers, which is not available in most practical vision tasks.

In this section, based upon our previously proposed TSSE, we propose an adaptive-scale robust estimator, ASSC. We assume that when a model is correctly found, two criteria should be satisfied:

(1) the number of data points (n_θ) near or on the model should be as large as possible;
(2) the residuals of the inliers should be as small as possible; correspondingly, the scale (S_θ) should be as small as possible.

We therefore define our objective function as:

    θ̂ = arg max_θ̂ ( n_θ̂ / S_θ̂ )    (8.2)

Note: when the estimate of the scale is fixed, equation (8.2) is simply another form of RANSAC with the score n_θ scaled by 1/S (i.e., by a fixed constant for all p-subsets), yielding the same results as RANSAC. ASSC is more reasonable than RANSAC because the scale is estimated for each candidate fit, in addition to the fact that it no longer requires a user-defined error bound. The ASSC algorithm is as follows:

(1) Randomly choose one p-subset from the data points, estimate the model parameters using the p-subset, and calculate the ordered absolute residuals of all data points.

(2) Choose the bandwidth by equation (6.6) and calculate an initial scale by the robust k scale estimator (equation (7.12)) using q=0.2.

(3) Apply TSSE to the sorted absolute residuals to estimate the scale of the inliers. At the same time, obtain the probability densities at the local peak, f̂(peak), and at the local valley, f̂(valley), by equation (4.1).

(4) Validate the valley. Let f̂(valley)/f̂(peak) = λ (where 1 > λ ≥ 0). Because the inliers are assumed to have a Gaussian-like distribution, the valley is not sufficiently deep when λ is too large (say, 0.8). If the valley is sufficiently deep, go to step (5); otherwise go to step (1).

(5) Calculate the score, i.e., the objective function of the ASSC estimator.

(6) Repeat steps (1) to (5) many times. Finally, output the parameters and the scale S_1 with the highest score.

Because the robust k scale estimator is biased for data with multiple structures, the final scale of the inliers, S_2, should be refined using the scale S_1 obtained by TSSE. In order to improve the statistical efficiency, a weighted least squares procedure ((Rousseeuw and Leroy 1987), p. 202) is carried out after finding the initial fit.

Instead of estimating a fit involving the absolute majority of the data set, the ASSC estimator finds a fit having a relative majority of the data points. This makes it possible, in practice, for ASSC to obtain a high robustness, so that it can tolerate more than 50% outliers. Indeed, the experiments in the next section show that the ASSC estimator is a very robust estimator for data with multiple structures and a high percentage of outliers.

8.3 Experiments with Data Containing Multiple Structures

In this section, both 2D and 3D examples are given. The results of the proposed method are compared with those of three other popular methods: LMedS, RESC, and ALKS. All four methods use the random sampling scheme that is also at the heart of our method. Note: unlike the experiments in section 7.4, here we do not (of course) assume any knowledge of the parameters of the models in the data. Nor are we aiming to find any particular structure. Due to the random sampling used, the methods will possibly return a different structure on different runs; however, they will generally find the largest structure most often (if one dominates in size).
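Before turning to the experiments, the ASSC loop of section 8.2 can be illustrated with a much-simplified 2D line-fitting sketch (our own illustration: for brevity, the robust k scale estimator of eq. (7.12) with q=0.2 stands in for the full TSSE of step (3), and the valley validation and weighted least squares refinement are omitted):

```python
import numpy as np
from statistics import NormalDist

def assc_line(points, m=500, q=0.2, T=2.5, seed=0):
    """Simplified ASSC loop (eq. 8.2) for 2-D lines y = a*x + b.
    Each random 2-subset yields a candidate fit; its scale is taken
    (as a stand-in for TSSE) from the robust k scale estimator, and
    the fit maximising the score n / S is kept."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    c = NormalDist().inv_cdf((1.0 + q) / 2.0)    # Phi^{-1}((1+q)/2)
    best, best_score = None, -np.inf
    for _ in range(m):
        i, j = rng.choice(len(pts), size=2, replace=False)
        if pts[i, 0] == pts[j, 0]:
            continue                              # degenerate p-subset
        a = (pts[j, 1] - pts[i, 1]) / (pts[j, 0] - pts[i, 0])
        b = pts[i, 1] - a * pts[i, 0]
        r = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
        s = np.quantile(r, q) / c                 # candidate scale estimate
        if s <= 0:
            continue
        n_in = np.count_nonzero(r < T * s)        # inliers of eq. (7.1)
        if n_in / s > best_score:
            best_score, best = n_in / s, (a, b, s)
    return best
```

On a line buried in two-thirds gross outliers, the score n/S strongly prefers candidate fits whose residual distribution is tight around many points, so the recovered (a, b) stays close to the true line without any user-set error bound.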

8.3.1 2D Examples

Figure 8.1: Comparing the performance of the four methods (ASSC, RESC, LMedS, and ALKS): (a) fitting a line with a total of 90% outliers; (b) fitting three lines with a total of 88% outliers; (c) fitting a step with a total of 85% outliers; (d) fitting three steps with a total of 89% outliers.

We generated four kinds of data (a line, three lines, a step, and three steps), each with a total of 500 data points. The signals were corrupted by Gaussian noise with zero mean and standard deviation σ. Among the 500 data points, α data points were randomly distributed in the range (0, 100). The i'th structure has n_i data points.

(a) One line: x: (0-100), y=x, n_1=50; α=450; σ=0.8.

(b) Three lines: x: (25-75), y=75, n_1=60; x: (25-75), y=60, n_2=50; x=25, y: (20-75), n_3=40; α=350; σ=

(c) One step: x: (0-50), y=35, n_1=75; x: (50-100), y=25, n_2=55; α=370; σ=1.1.

(d) Three steps: x: (0-25), y=20, n_1=55; x: (25-50), y=40, n_2=30; x: (50-75), y=60, n_3=30; x: (75-100), y=80, n_4=30; α=355; σ=1.0.

In Figure 8.1, we can see that the proposed ASSC method yields the best results among the four methods, correctly fitting all four signals. Because LMedS has a 50% breakdown point, it failed to fit all four signals. Although ALKS can tolerate more than 50% outliers, it failed in all four cases with such very high outlier content. RESC gave better results than LMedS and ALKS: it succeeded in two cases (the one-line and three-line signals) even when the data involved more than 88% outliers, but it failed to fit the other two signals (Figure 8.1 (c) and (d)).

It should be emphasized that both the bandwidth choice and the scale estimation in the proposed ASSC method are data-driven. No prior knowledge about the bandwidth or the scale is necessary in the proposed method. This is a great improvement over the traditional RANSAC method, where the user must set an a priori scale-related error bound.

8.3.2 3D Examples

Two synthetic 3D signals were generated. Each contained 500 data points and three planar structures. Each plane contains 100 points corrupted by Gaussian noise with standard deviation σ; 200 points are randomly distributed in a region including all three structures. A planar equation can be written as Z=AX+BY+C, and the residual of the point at (X_i, Y_i, Z_i) is r_i = Z_i − AX_i − BY_i − C. (A, B, C; σ) are the parameters to estimate.

In contrast to section 8.3.1, we now attempt to find all structures in the data. In order to extract all planes, we:

(1) apply the robust estimator to the data set and estimate the parameters and scale of a plane;

(2) extract the inliers and remove them from the data set;

(3) Repeat steps (1) and (2) until all planes are extracted.

Figure 8.2: First experiment for 3D multiple-structure data: (a) the 3D data; the results by (b) the ASSC method; (c) by RESC; and (d) by ALKS.

Table 8.1: Estimates of the parameters (A, B, C; σ) provided by each of the robust estimators applied to the data in Figure 8.2.

  True values: Plane A (3.0, 5.0, 0.0; 3.0); Plane B (2.0, 3.0, 0.0; 3.0); Plane C (2.0, 3.0, 80.0; 3.0)
  ASSC:  (3.02, 4.86, 1.66; 3.14); (2.09, 2.99, 0.56; 3.18); (1.79, 2.98, 83.25; 3.78)
  RESC:  (3.69, 5.20, -7.94; 36.94); (4.89, 13.82, ; 51.62) and (-2.88, -1.48, ; 0.47)
  ALKS:  (2.74, 5.08, 1.63; 44.37); (-7.20, 0.91, 198.1; 0.007) and (-0.59, 1.82, 194.06; 14.34)
  LMedS: (1.22, 3.50, 30.36; 51.50); (-0.11, -3.98, ; 31.31) and (-9.59, -1.66, 251.24; 0.0)
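The extract-and-remove loop in steps (1)-(3) can be sketched as follows. This is a minimal illustration rather than the ASSC estimator itself: the robust fit is replaced here by an exhaustive two-point line search in 2D with a fixed inlier tolerance (both simplifying assumptions), but the control flow (fit one structure, remove its inliers, repeat) is the same.

```python
def fit_one_line(points, tol=0.05):
    """Exhaustively try lines through pairs of points and return (a, b, inliers)
    for the line y = a*x + b that has the most points within tol of it."""
    best = (0.0, 0.0, [])
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            (x1, y1), (x2, y2) = points[i], points[j]
            if x1 == x2:
                continue  # this simple sketch skips vertical candidate lines
            a = (y2 - y1) / (x2 - x1)
            b = y1 - a * x1
            inliers = [(x, y) for (x, y) in points if abs(y - (a * x + b)) < tol]
            if len(inliers) > len(best[2]):
                best = (a, b, inliers)
    return best

def extract_all_structures(points, min_inliers=10):
    """Steps (1)-(3): fit one structure, remove its inliers, repeat."""
    structures = []
    remaining = list(points)
    while len(remaining) >= min_inliers:
        a, b, inliers = fit_one_line(remaining)
        if len(inliers) < min_inliers:
            break
        structures.append((a, b, inliers))
        found = set(inliers)
        remaining = [p for p in remaining if p not in found]
    return structures

# Two exact lines, 50 points each: y = 2x and y = 10 - x.
pts = [(k / 10, 2 * (k / 10)) for k in range(50)] + \
      [(k / 10, 10 - k / 10) for k in range(50)]
structures = extract_all_structures(pts)
# structures holds two fits of 50 inliers each, with slopes 2 and -1
```

In the thesis experiments the per-structure fit is ASSC (which also returns the scale of the inliers, so the tolerance need not be fixed by the user); the sequential removal logic is unchanged.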

Figure 8.3: Second experiment for 3D multiple-structure data: (a) the 3D data; the results by (b) the proposed method; (c) by RESC; and (d) by ALKS.

Table 8.2: Estimates of the parameters (A, B, C; σ) provided by each of the robust estimators applied to the data in Figure 8.3.

  True values: Plane A (0.0, 3.0, -60.0; 3.0); Plane B (0.0, 3.0, 0.0; 3.0); Plane C (0.0, 0.0, 40.0; 3.0)
  ASSC:  (0.00, 2.98, ; 2.11); (0.18, 2.93, 0.18; 3.90); (0.08, 0.03, 38.26; 3.88)
  RESC:  (0.51, 3.04, -67.29; 36.40); (6.02, -34.00, ; 101.1) and (0.35, -3.85, ; 0.02)
  ALKS:  (-1.29, 1.03, 14.35; 30.05); (-1.07, -2.07, 84.31; 0.01) and (1.85, , 36.97; 0.08)
  LMedS: (0.25, 0.61, 24.50; 27.06); (-0.04, -0.19, 92.27; 9.52) and (-0.12, -0.60, 92.19; 6.89)

The red circles constitute the first plane extracted; the green stars the second plane extracted; and the blue squares the third extracted plane. The results are shown in Figure 8.2 and Table 8.1, and in Figure 8.3 and Table 8.2 (the results of LMedS, which completely broke down for these 3D data sets, are given only in Table 8.1 and Table 8.2). Note that for RESC we use the revised form in equation (7.6) instead of equation (7.5) for the scale estimate.

Similarly, in the second experiment (Figure 8.3 and Table 8.2), LMedS and ALKS completely broke down for the heavily corrupted data with multiple structures. RESC, although it correctly fitted the first plane, wrongly estimated the scale of the inliers to that plane, and wrongly fitted the second and third planes. Only the proposed method correctly fitted all three planes (Figure 8.3 (b)) and estimated the corresponding scale for each plane.

The proposed method is computationally efficient. We implemented it in MATLAB, with TSSE in Mex. When m is set to 500, the proposed method takes about 1.5 seconds for the 2D examples and about 2.5 seconds for the 3D examples on an AMD 800MHz personal computer.

8.3.3 The Breakdown Plot of the Four Methods

In this subsection, we perform an experiment to draw the breakdown plot of each method (similar to the experiment reported in (Yu, Bui et al. 1994); however, the data that we use are more complicated because they contain two types of outliers: clustered outliers and randomly distributed outliers).

We generate one plane signal with Gaussian noise of unit standard deviation. Both clustered outliers and randomly distributed outliers are added to the data. The clustered outliers have 100 data points and are distributed within a cube. The randomly distributed outliers cover a region containing the plane signal and the clustered outliers. The number of inliers is decreased from 900 to 100; at the same time, the number of randomly distributed outliers is increased from 0 to 750, so that the total number of data points is kept fixed. Thus, the outliers occupy from 10% to 90% of the data.
Examples of data with 20% and 70% outliers are shown in Figure 8.4 (a) and (b) to illustrate the distributions of the inliers and outliers.

If an estimator is sufficiently robust to outliers, it can resist the influence of both clustered outliers and randomly distributed outliers even when the outliers occupy more than 50% of the data. In order to increase the stability of the results, we perform the experiments 20 times, using different random sampling seeds, for each data set involving a different percentage of outliers (10% to 90%). The averaged results are shown in Figure 8.4 (c-e).

Figure 8.4: Breakdown plot of the four methods: (a) example of the data with 20% outliers; (b) example of the data with 80% outliers; (c) the error in the estimate of parameter A, (d) in parameter B, and (e) in parameter C.

From Figure 8.4 (c-e), we can see that our method obtains the best result. Because LMedS has only a 50% breakdown point, it broke down when outliers occupied more than (approximately) 50% of the data. ALKS broke down when outliers reached about 75%. RESC began to break down when outliers exceeded 83% of the whole data. In contrast, the ASSC estimator is the most robust to outliers: it began to break down only at 89% outliers. In fact, when the inliers are about (or less than) 10% of the data, the assumption that inliers

should occupy a relative majority of the data is violated. Bridging between the inliers and the clustered outliers then tends to yield a higher score. The other robust estimators also suffer from the same problem.

8.3.4 Influence of the Noise Level of Inliers on the Results of Robust Fitting

Next, we investigate the influence of the noise level of the inliers on the results of the four chosen robust fitting methods. We use the signal shown in Figure 8.4 (b) with 70% outliers, but we change the standard deviation of the plane signal from 0.1 to 3.0 in fixed increments.

Figure 8.5: The influence of the noise level of inliers on the results of the four methods: plots of the error in the parameters A (a), B (b), and C (c) for different noise levels.

Figure 8.5 shows that LMedS broke down first. This is because LMedS cannot resist the influence of outliers when the outliers occupy more than half of the data points. The ALKS, RESC, and ASSC estimators can all tolerate more than 50% outliers. However, among these three robust estimators, ALKS broke down first: it began to break down when the noise level of the inliers was increased to 1.7. RESC is more robust than ALKS: it began to break down when the noise level of the inliers was increased to 2.3. The ASSC estimator shows the best achievement: even when the noise level was increased to 3.0, the ASSC estimator had not yet broken down.

8.3.5 Influence of the Relative Height of Discontinuous Signals

In this subsection, we investigate the influence of the relative height of discontinuous signals on the performance of the four methods.

Figure 8.6: The influence of the relative height of discontinuous signals on the results of the four methods: (a) two parallel planes; (b) one step signal; (c1-c3) the results for the two parallel planes; (d1-d3) the results for the step signal.

We generate two discontinuous signals: one containing two parallel planes and one containing a one-step plane pair. The signals have unit variance. Randomly distributed outliers covering the regions of the signals are added. Among the total 1000 data points, there are 20% pseudo-outliers and 50% random outliers. The relative height is increased from 1 to 20. Figure 8.6 (a) and (b) show examples of the data distributions of the two signals with relative height 10. The averaged results (over 20 repetitions) obtained by the four robust estimators are shown in Figure 8.6 (c1-c3) and (d1-d3).

From Figure 8.6, we can see that the tendency to bridge becomes stronger as the step decreases. LMedS shows the worst results among the four robust estimators. For the remaining three estimators (ASSC, ALKS, and RESC), from Figure 8.6 (c1-c3) and (d1-d3) we can see that:

For the parallel-plane signal, the results of ALKS are affected most by a small step. RESC shows better results than ALKS; however, ASSC shows the best results.

For the step signal, when the step height is small, all three estimators are affected. As the step height increases, all three estimators show robustness to the signal. However, ASSC achieves the best results for signals with a small step height.

In the next sections, we apply the ASSC estimator to more "real world" computer vision tasks: range image segmentation and fundamental matrix estimation.

8.4 ASSC for Range Image Segmentation

The range image segmentation algorithm that we present here is based on the just-introduced ASSC estimator. Although MDPE, in chapter 4, has similar performance to ASSC, MDPE only outputs the parameters of the model as results; an auxiliary scale estimator is required to provide an estimate of the scale of the inliers. ASSC, however, does not need any auxiliary scale estimator at the post-processing stage: it estimates the scale of the inliers during the process of estimating the parameters of the model.

8.4.1 The Algorithm of ASSC-Based Range Image Segmentation

We employ a hierarchical structure, similar to that in chapter 5. We begin with the bottom level containing 64x64 pixels, obtained by regular sampling of the original 512x512 image. In each level of the hierarchy, we:
(1) Apply the ASSC estimator to obtain the parameters of a plane and the scale of the inliers.
(2) Identify the inliers (in the top level of the hierarchy) corresponding to the estimated plane parameters and scale. If the number of inliers is less than a threshold, go to step (7). This step differs from step (4) of the corresponding algorithm in chapter 5 in that the plane parameters and the corresponding scale of the inliers are obtained simultaneously by ASSC, whereas QMDPE needs an auxiliary scale estimator to obtain the scale of the inliers.
(3) Use normal information to validate the inliers obtained in step (2). This step is similar to step (5) of that algorithm. If the number of validated inliers is small, go to step (7).
(4) Fill in the holes inside the maximum connected component of the validated inliers. Holes may appear because of sensor noise, or because some points have large residuals and lie beyond the range related to the estimated scale.
(5) Assign a label to the points corresponding to the connected component from step (4) and remove these points from the data set that will be further processed. This happens at the top of the hierarchy.
(6) If a point is unlabelled and is not a jump-edge point, treat it as a "left-over" point. After collecting all such points, use the connected-component algorithm to get the maximum connected component. If the number of data points in the maximum connected component of left-over points is smaller than a threshold, go to step (7); otherwise, obtain the data for the current hierarchical level by regularly sampling the maximum connected component obtained in this step, then go to step (1).
(7) Terminate the processing in the current level of the hierarchy and move to the higher level, until the top of the hierarchy is reached.

8.4.2 Experiments on Range Image Segmentation

Figure 8.7: Segmentation of ABW range images from the USF database. (a1, b1, c1) Range images with random noise points; (a2, b2, c2) the ground-truth results for the corresponding range images without added random noise; (a3, b3, c3) segmentation results by the proposed algorithm.

To show the performance of the proposed ASSC-based range image segmentation algorithm, we perform experiments similar to those in section 5.5 by adding random noise

points to the range images taken from the USF ABW range image database (test 16, test 7 and train 5). As shown in Figure 8.7, all of the main surfaces (structures) were recovered by our method. Only a slight distortion appeared on some boundaries of neighbouring surfaces. Similar slight distortion also appears in the experimental results (by QMDPE) in chapter 5. This is because of the sensor noise and the limited accuracy of the estimated normal at each range point. Generally speaking, the more accurate the range data and the estimated normals at the range points are, the less the distortion is.

Figure 8.8: Comparison of the segmentation results for an ABW range image (test 3) from the USF range image database. (a) Range image; (b) the ground truth; (c) the result by the USF; (d) the result by the WSU; (e) the result by the UE; (f) the result by the proposed method.

We also compare our results with those of the University of South Florida (USF), Washington State University (WSU), and University of Edinburgh (UE) methods (Hoover, Jean-Baptiste et al. 1996). Figure 8.8 (c-f) and Figure 8.9 (c-f) show the results obtained by the four methods. From Figure 8.8 (c) and Figure 8.9 (c), we can see that the USF's results contained many noisy points. In both Figure 8.8 (d) and Figure 8.9 (d), the WSU segmenter missed one surface; the WSU segmenter also over-segmented one surface in Figure 8.8 (d). Some boundaries at the junctions of the patches segmented by the USF and WSU methods in Figure 8.9 (c) were relatively seriously distorted. The UE method shows relatively better results than the USF and WSU methods; however, some estimated surfaces are still noisy (see Figure 8.8 (e) and Figure 8.9 (e)). Compared with the other three methods, the proposed method achieved the best results: all surfaces were recovered, the segmented surfaces are relatively clean, and the edges of the segmented patches are reasonably good.

Figure 8.9: Comparison of the segmentation results for an ABW range image (test 13) from the USF range image database. (a) Range image; (b) the ground truth; (c) the result by the USF; (d) the result by the WSU; (e) the result by the UE; (f) the result by the proposed method.

8.5 ASSC for Fundamental Matrix Estimation

8.5.1 Background of Fundamental Matrix Estimation

The fundamental matrix provides constraints (related to epipolar geometry, projectivity, etc.) between corresponding points in multiple views. The estimation of the fundamental matrix is important for several problems: matching, recovery of structure, motion segmentation, etc. (Torr and Zisserman 2000). Since there may be mismatched pairs of points in the data, traditional methods such as the least squares estimator cannot yield accurate results; even worse, the least squares estimator can break down. Robust estimators such as M-estimators, LMedS, RANSAC, MSAC and MLESAC have been applied to estimate the fundamental matrix and improve the accuracy of the results (Torr and Murray 1997).

Let {x_i} and {x'_i} (for i = 1, ..., n) be a set of matched homogeneous image points viewed in image 1 and image 2 respectively. The fundamental matrix F satisfies the constraints:

    x'_i^T F x_i = 0  and  det[F] = 0    (8.3)

We employ the 7-point algorithm (Torr and Murray 1999) to solve for candidate fits, using the Sampson distance as the residual to a fitted fundamental matrix. For the i-th correspondence, the Sampson-distance residual r_i is:

    r_i = k_i / (k_{x_i}^2 + k_{y_i}^2 + k_{x'_i}^2 + k_{y'_i}^2)^{1/2}    (8.4)

where k_i = f_1 x'_i x_i + f_2 x'_i y_i + f_3 x'_i ζ_i + f_4 y'_i x_i + f_5 y'_i y_i + f_6 y'_i ζ_i + f_7 x_i ζ_i + f_8 y_i ζ_i + f_9 ζ_i^2, the f_j are the entries of F, ζ_i is the third homogeneous coordinate, and k_{x_i} denotes the partial derivative of k_i with respect to x_i (and similarly for the other subscripts).

8.5.2 The Experiments on Fundamental Matrix Estimation

First, we generated 300 matches including 120 point pairs of inliers, with unit standard deviation Gaussian noise, and 180 point pairs of random outliers. In practice, the scale of the inliers is not
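The Sampson residual of equation (8.4) can be computed directly from F and one correspondence, since the partial derivatives of k_i are just the first two components of Fx_i and of F^T x'_i. A minimal sketch in plain Python (3-vectors as lists; the example F, a pure-translation fundamental matrix, is an illustrative assumption, not taken from the thesis experiments):

```python
def sampson_residual(F, x, xp):
    """Sampson distance residual for homogeneous points x <-> xp.
    k = xp^T F x; the denominator uses the partial derivatives of k,
    which are the first two entries of F x and of F^T xp."""
    Fx = [sum(F[r][c] * x[c] for c in range(3)) for r in range(3)]
    FTxp = [sum(F[r][c] * xp[r] for r in range(3)) for c in range(3)]
    k = sum(xp[r] * Fx[r] for r in range(3))
    denom = (Fx[0] ** 2 + Fx[1] ** 2 + FTxp[0] ** 2 + FTxp[1] ** 2) ** 0.5
    return k / denom

# Fundamental matrix of a pure translation along the x-axis (assumed example);
# its epipolar lines are the image rows, so matches with equal y satisfy (8.3).
F = [[0, 0, 0],
     [0, 0, -1],
     [0, 1, 0]]
r_good = sampson_residual(F, [1, 2, 1], [3, 2, 1])   # on the epipolar line -> 0
r_bad = sampson_residual(F, [1, 2, 1], [3, 2.5, 1])  # off the epipolar line
```

For this F, k reduces to y - y', so r_good is exactly zero and r_bad is -0.5/sqrt(2).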

available. Thus, for RANSAC and MSAC, the median scale estimator, as recommended in (Torr and Murray 1999), is used to yield an initial scale estimate. The number of random samples is set to the same value for all methods. The experiment was repeated 30 times and the averaged values are shown in Table 8.3, from which we can see that our method yields the best result.

Table 8.3: An experimental comparison for data with 60% outliers. For the ground truth and for each method (ASSC, MSAC, RANSAC, LMedS), the table reports the percentage of inliers correctly classified, the percentage of outliers correctly classified, and the standard deviation of the inliers.

Figure 8.10: A comparison of the correctly identified percentages of inliers (a) and outliers (b), and a comparison of the standard deviation of the residuals of the inliers (c), as functions of the outlier percentage.
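As a concrete illustration of such an initial scale estimate, one common form of the median scale estimator (an assumed simplification here; variants add a small-sample correction factor) multiplies the median absolute residual by 1.4826 so that it is consistent for Gaussian noise:

```python
import random

def median_scale(residuals):
    """Robust initial scale estimate: 1.4826 * median(|r_i|) is consistent
    for the standard deviation of Gaussian inlier residuals."""
    r = sorted(abs(v) for v in residuals)
    n = len(r)
    med = r[n // 2] if n % 2 else 0.5 * (r[n // 2 - 1] + r[n // 2])
    return 1.4826 * med

# Synthetic residuals with true standard deviation 2.0:
random.seed(0)
res = [random.gauss(0.0, 2.0) for _ in range(10001)]
s = median_scale(res)  # close to 2.0
```

Because it is based on the median, this estimate stays reasonable with up to half the residuals replaced by gross outliers, which is why it is a common initializer for RANSAC-style methods.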

Next, we draw the breakdown plot of the four methods. Among the total 300 correspondences, the percentage of outliers is increased from 5% to 70% in increments of 5%. The experiments were repeated 100 times for each percentage of outliers. If a method is robust enough, it should resist the influence of outliers: the correctly identified percentage of inliers should be around 95% (T is set to 1.96 in equation 7.1) and the standard deviation of the inliers should be near 1.0, regardless of the percentage of outliers actually in the data. We set the number of random samples, m, high enough to ensure a high probability of success.

From Figure 8.10 we can see that MSAC, RANSAC, and LMedS all break down when the data involve more than 50% outliers. The standard deviation of the inliers identified by ASSC is the smallest of all the estimates when the percentage of outliers is higher than 50%. Note that ASSC succeeds in finding the inliers and outliers even when the outliers occupy 70% of the whole data set.

Figure 8.11: (a), (b) The image pair; (c) the matches; (d) the inliers found by ASSC; (e), (f) the epipolar geometry.
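The 95% figure quoted above follows directly from the inlier test |r_i| < T * S with T = 1.96: for Gaussian residuals, about 95% of true inliers fall within 1.96 standard deviations. A quick numerical check (pure Python; sample size and seed are arbitrary):

```python
import random

random.seed(1)
T, sigma, n = 1.96, 1.0, 100_000
residuals = [random.gauss(0.0, sigma) for _ in range(n)]
frac = sum(abs(r) < T * sigma for r in residuals) / n
# frac is close to 0.95
```

So, when a method recovers both the model and the correct inlier scale, the fraction of inliers it classifies correctly should hover around 95% no matter how many outliers are present; a falling curve in the breakdown plot signals that the scale or the fit itself has broken down.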

Table 8.4: Experimental results on two frames of the Corridor sequence. For each method (ASSC, MSAC, RANSAC, LMedS), the table reports the number of inliers (i.e., matched correspondences), the mean error of the inliers, and the standard deviation of the inliers.

To conclude, we apply the proposed method to real image frames: two frames of the Corridor sequence (bt.003 and bt.006), shown in Figure 8.11 (a) and (b). Figure 8.11 (c) shows the matches, involving 500 point pairs in total. The inliers (201 correspondences) obtained by the proposed method are shown in Figure 8.11 (d). The epipolar lines (we draw 30 of them) and the epipole, computed from the fundamental matrix estimated by ASSC, are shown in Figure 8.11 (e) and (f). We can see that the proposed method achieves a good result. Because the camera matrices of the two frames are available, we can obtain the ground-truth fundamental matrix and thus evaluate the errors. From Table 8.4, we can see that ASSC performs the best among the four methods.

8.6 A Modified ASSC (ASRC)

8.6.1 Adaptive-Scale Residual Consensus (ASRC)

In the previous experiments, we have seen the robustness of ASSC to outliers and multiple structures. However, all inliers in ASSC (equation 8.2) are treated in the same way, i.e., each inlier contributes equally to the objective function of ASSC. In fact, inliers can have different influence on the result if we take into account the sizes of their residuals. Thus, we modify ASSC to yield a robust Adaptive-Scale Residual Consensus (ASRC) estimator.

    θ̂ = arg max_θ̂ [ ( Σ_{i=1}^{n_θ̂} (1 - r_{i,θ̂} / (S_θ̂ T)) ) / S_θ̂ ]    (8.5)

where n_θ̂ is the number of inliers (i.e., the points satisfying equation 7.1) for the fitted θ̂. From equation (8.5), we can see that when the residual of a data point is zero, the point contributes 1 to the objective function of ASRC; when the residual of a data point is equal to or larger than S_θ̂ T, it contributes nothing to the objective function of ASRC.

8.6.2 Experiments

Figure 8.12: Comparing the performance of five methods: (a) fitting a roof with a total of 87% outliers; (b) fitting an F-figure with a total of 92% outliers; (c) fitting a step with a total of 91% outliers; (d) fitting three steps with a total of 91% outliers.
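The difference between the two objective functions can be made concrete with a toy example (a sketch: the scale S, the constant T and the residuals below are arbitrary illustrative values, not thesis data). ASSC scores a candidate fit by the number of inliers divided by the scale, while the ASRC score of equation (8.5) additionally down-weights inliers with larger residuals:

```python
def assc_score(residuals, S, T=2.5):
    """ASSC objective: number of inliers divided by the scale estimate."""
    inliers = [r for r in residuals if abs(r) < S * T]
    return len(inliers) / S

def asrc_score(residuals, S, T=2.5):
    """ASRC objective (eq. 8.5): each inlier contributes 1 - |r|/(S*T),
    so a zero-residual point adds 1 and a point at the band edge adds 0."""
    inliers = [r for r in residuals if abs(r) < S * T]
    return sum(1.0 - abs(r) / (S * T) for r in inliers) / S

res = [0.0, 1.0, 2.0, 5.0]   # the last point lies outside the band S*T = 2.5
a1 = assc_score(res, S=1.0)  # 3 inliers / scale 1.0 -> 3.0
a2 = asrc_score(res, S=1.0)  # (1 - 0) + (1 - 0.4) + (1 - 0.8) -> 1.8
```

Under ASSC the three inliers are interchangeable; under ASRC a candidate fit whose inliers sit tightly near zero residual outscores one with the same inlier count but residuals spread toward the band edge.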

Figure 8.13: (a) The 3D data with 87% outliers; the extracted results by (b) ASRC; (c) ASSC; (d) RESC; (e) ALKS; and (f) LMedS.

In this subsection, we carry out some experiments showing the advantages of ASRC over ASSC and over the other robust methods (RESC, ALKS, LMedS). We use data similar to those used in section 8.3.1 and section 8.3.2, but with more outliers.

From Figure 8.12 we can see that LMedS (with its 50% breakdown point) failed to fit all four examples. Although ALKS is more robust than LMedS, it also failed to fit the four signals. RESC and ASSC succeeded on the roof signal (87% outliers); however, they both failed in

the other three cases. In contrast, ASRC correctly fits all four signals. ASRC does not break down even when outliers occupy more than 90% of the data, which is an improvement over ASSC.

From Figure 8.13 (d) and (e), we can see that RESC and ALKS, which claim to be robust to data with more than 50% outliers, failed to extract the three planes. This is because the scales estimated (by RESC and ALKS) for the first plane were wrong, which caused these two methods to fail to fit the second and third planes. Because LMedS (in Figure 8.13 (f)) has only a 50% breakdown point, it completely failed to fit data with such high contamination (more than 80% outliers). ASSC, although it correctly fitted the first plane, wrongly fitted the second and third planes. Only the ASRC method correctly fitted and extracted all three planes (Figure 8.13 (b)).

We also successfully applied ASRC to range image segmentation and fundamental matrix estimation, in experiments similar to those in sections 8.4 and 8.5. We recommend reading (Wang and Suter 2004) for the details of those experiments.

8.7 Conclusion

In this chapter, we proposed a very robust Adaptive Scale Sample Consensus (ASSC) estimator. The ASSC method has an objective function that considers both the number of inliers and the corresponding scale estimate for those inliers. ASSC is very robust to multiple-structure data containing high percentages of outliers (more than 80% outliers). The ASSC estimator was compared to several popular robust estimators (LMedS, RESC, and ALKS) and generally achieves better results. Furthermore, we applied ASSC to two important computer vision tasks: range image segmentation and robust fundamental matrix estimation. However, the applications of ASSC are not limited to these two fields. The computational cost of the proposed ASSC method is moderately low, which makes it applicable to many computer vision tasks.

We also improved ASSC: ASRC improves the objective function of ASSC by weighting each inlier differently according to its residual.

Although we have compared against several of the natural competitors from the computer vision and statistics literature (Fischler and Bolles 1981; Rousseeuw 1984; Yu, Bui et al. 1994; Lee, Meer et al. 1998), it is difficult to be comprehensive. For example, in (Scott 2001) the authors also proposed a method which can simultaneously estimate the model parameters and the scale of the inliers. In essence, that method tries to find the fit that produces residuals that are the most Gaussian distributed (or which have a subset that is most Gaussian distributed), and all data points are considered; in contrast, only the data points within the band obtained by the mean shift and mean shift valley are considered in our objective function. Also, we do not assume that the residuals for the best fit will be the best match to a Gaussian distribution.

At a late stage of developing ASSC/ASRC, we became aware that Gotardo et al. proposed an improved robust estimator based on RANSAC and MLESAC (Gotardo, Bellon et al. 2003) and applied it to range image segmentation. However, like RANSAC, this estimator also requires the user to set the scale-related tolerance a priori. In contrast, the ASSC/ASRC method proposed in this chapter does not require any prior information about the scale or tolerance: the parameters of a model and the corresponding scale of inliers are obtained simultaneously from the data.

Chapter 9

Mean Shift for Image Segmentation by Pixel Intensity or Pixel Color

9.1 Introduction

One major task of pattern recognition, image processing, and related areas is to segment an image into homogeneous regions. Image segmentation is the first step towards image understanding, and its success directly affects the quality of image analysis. It has been acknowledged to be one of the most difficult tasks in computer vision and image processing (Cheng, Jiang et al. 2001; Comaniciu and Meer 2002a). Note that the type of image we refer to in this chapter (grey/color) is different from the range images of chapters 5 and 8: a range image contains 3D geometry information, i.e., the value of a pixel in a range image corresponds to a depth/range measurement.

Unlike other vision tasks such as parametric model estimation ((Wang and Suter 2003a; Wang and Suter 2003b); also see chapters 3-5 and chapter 8), fundamental matrix estimation (Torr and Murray 1997), and optical flow calculation ((Wang and Suter 2003c); also see chapter 6), there is no widely accepted model or analytical solution for image

segmentation. There is probably no single "true" segmentation acceptable to all people and under all psychophysical conditions; indeed, acceptable segmentations may have to be defined for the different requirements one may have. All of this increases the difficulty of the segmentation task.

Many image segmentation methods have been proposed during recent decades. Roughly speaking, these methods can be classified into (Cheng, Jiang et al. 2001): (1) histogram thresholding (Kurugollu, Sankur et al. 2001); (2) clustering (Comaniciu and Meer 1997; Zhang and Wang 2000; Chen and Lu 2001); (3) region-growing methods (Adams and Bischof 1994); (4) edge-based methods (Nevatia 1977); (5) physical-model-based methods (Klinker, Shafer et al. 1990); (6) fuzzy approaches (Pal 1992); and (7) neural-network-based methods (Iwata and Nagahashi 1998).

We have employed the mean shift algorithm extensively in our previously presented robust methods. In this chapter, we directly apply the mean shift method to image segmentation based on image intensity or image color. As we stated in chapter 4, the mean shift is, in essence, a form of mode seeking. It achieves a degree of scale selectivity since it works with a smoothed estimate of the underlying density function. In the most commonly used form (Fukunaga and Hostetler 1975; Comaniciu and Meer 2002a), the window size and the smoothing are directly related to a quantity h, the bandwidth of the kernel density estimator employed. Although many authors of papers that employ the mean shift method have remarked that the value of h needs to be chosen with care, the general impression given is that the results are not very sensitive to the choice of h, and that one can generally take a pragmatic hit-and-miss approach.
Thus, in the first part of this chapter we illustrate that there are two issues affected by the setting of h: the rather disastrous appearance of false peaks (where the application of the mean shift process will fail) and the choice of scale (affecting the significance of actual peaks in the underlying density: at large scales the density is very smoothed and local peaks are disregarded or merged). The latter behaviour is much more benign and, indeed, as it performs a type of controlled scale-space analysis, it can be used to advantage. The former is to be avoided at all costs as it will result in completely arbitrary results. (This

behaviour, though, is due to extremely quantized data, such as histogrammed data, and thus may not arise in all applications.) This chapter therefore provides an important warning about the sensitivity of the mean shift to false-peak noise due to quantization. For simplicity, we choose the problem of histogram-based grey-level image segmentation. We show that one can rather simply predict the values of h that will be problematic; thereby, in this setting, we provide a means for a completely automated approach. This negates the need to set a value for any parameter, including h (except that one may repeat the solution with a range of h to perform a type of scale-space analysis).

The general mean shift algorithm considers only the global color (or intensity) information of the image, while neglecting the local color information. In the second part of this chapter, we propose a new method of color image segmentation considering both global information and local homogeneity: we introduce local homogeneity information into the mean shift segmentation algorithm. The proposed method applies the mean shift algorithm in the hue and intensity subspace of HSV, and the cyclic property of the hue component is also taken into account. Experiments on natural color images show promising results.

The contributions of this chapter can be summarized as follows:
- We present the relationship between the grey-level histogram of an image and the mean shift method, and analytically determine the conditions leading to the appearance of false peaks.
- We present an unsupervised peak-valley sliding algorithm for image segmentation.
- We introduce the local homogeneity concept into the mean shift method and propose a color image segmentation method considering the cyclic property of the hue component.
- We carry out several experiments on both grey-level and color images, and the results are promising.

9.2 False-Peak-Avoiding Mean Shift for Image Segmentation

The mean shift (MS) algorithm is sensitive to local peaks. In this section, we show both empirically and analytically that, when using sample data, the reconstructed PDF may have false peaks. We show how the occurrence of the false peaks is related to the bandwidth h of the kernel density estimator, using examples of gray-level image segmentation. It is well known that in MS-based approaches the choice of h is important; here, however, we provide a quantitative relationship between the appearance of false peaks and the value of h. For the gray-level image segmentation problem, we not only show how to avoid the false-peak problem, but also provide a complete unsupervised peak-valley sliding algorithm for gray-level image segmentation.

9.2.1 The Relationship between the Gray-Level Histogram of an Image and the Mean Shift Method

If we are segmenting a gray-level image based only upon the intensity characteristic of pixels, the mean shift equations can be rewritten in terms of the image intensity histogram:

    f̂(x) = ((d+2) / (2 c_d n h^(d+2))) Σ_{t_i ∈ S_h(x)} H(t_i) (h^2 - (t_i - x)^2)    (9.1)

where H(t_i) is the histogram value at gray level t_i (t_i is an integer gray level). The kernel density function in equation (9.1) is related to the discrete gray levels {t_i ∈ S_h(x)} and the corresponding histogram values {H(t_i), t_i ∈ S_h(x)}. Likewise, for the density gradient:

    ∇f̂(x) = ((d+2) / (c_d n h^(d+2))) Σ_{t_i ∈ S_h(x)} H(t_i) (t_i - x)
           = ((d+2) / (c_d n h^(d+2))) [ Σ_{t_i ∈ S_h(x)} H(t_i) ] [ Σ_{t_i ∈ S_h(x)} H(t_i) t_i / Σ_{t_i ∈ S_h(x)} H(t_i) - x ]    (9.2)

The last term in equation (9.2) is called the sample mean shift M_h(x) in discrete gray-level space:

    M_h(x) = Σ_{t_i ∈ S_h(x)} H(t_i) t_i / Σ_{t_i ∈ S_h(x)} H(t_i) - x    (9.3)

Equation (9.3) is derived from the Epanechnikov kernel. (Note: reference (Yang and Liu 2001), employing a similar formulation, used a Gaussian kernel; see equations (15) and (17) in that paper.)

9.2.2 The False Peak Analysis

In implementing the mean shift approach in this setting, we found, to our surprise, that in some cases many peaks appear between two consecutive gray levels near a local maximum of the density (see Figure 9.1 (a) and (b)). We call these peaks false peaks. These false peaks seriously affect the performance of the mean shift method: the mean shift is very sensitive to these noise peaks, and the mean shift loop will stop at a false peak instead of a real local maximum of the density. Here we analytically determine the conditions leading to this problem. For simplicity, we choose a one-dimensional setting (the analysis of the influence of false-peak noise on the mean shift and the mean shift valley can, however, be extended to the multi-dimensional case).

Let f̂(t_k) be the kernel density estimate at gray level t_k; let 0 < δx < 1, d = 1, and c_d = 2. Using equation (9.1) we have:

    f̂(t_k + δx) = (3 / (4 n h^3)) Σ_{t_i ∈ S_h(t_k + δx)} H(t_i) (h^2 - (t_k + δx - t_i)^2)    (9.4)

If h is an integer (h > 0) and t_k + h < 255, then, considering that the t_i form a series of consecutive unsigned integers, we have {t_i : t_i ∈ S_h(t_k + δx)} = {t_i : t_i ∈ S_h(t_k)} ∪ {t_i : t_i = t_k + h}.
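The iteration implied by equation (9.3), moving x by M_h(x) until the shift (approximately) vanishes, can be sketched as follows on a synthetic unimodal histogram (the histogram shape, bandwidth and starting point are illustrative assumptions):

```python
import math

def mean_shift_1d(H, x, h, max_iter=100, eps=1e-8):
    """Iterate x <- x + M_h(x), with M_h(x) as in eq. (9.3): the
    H-weighted mean of the gray levels inside the window S_h(x), minus x."""
    levels = range(len(H))
    for _ in range(max_iter):
        window = [t for t in levels if abs(t - x) < h]
        mass = sum(H[t] for t in window)
        if mass == 0:
            break
        m = sum(H[t] * t for t in window) / mass - x
        x += m
        if abs(m) < eps:
            break
    return x

# A smooth synthetic histogram peaked at gray level 10.
H = [math.exp(-((t - 10) ** 2) / 8.0) for t in range(21)]
mode = mean_shift_1d(H, x=6.0, h=3)
# mode converges near the histogram peak at 10
```

On a smooth histogram the loop climbs to (a neighbourhood of) the density mode; the false-peak analysis that follows shows how, on strongly quantized data, the same loop can instead halt at a spurious stationary point between two consecutive gray levels.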

Figure 9.1: False peak noise. (a) Original probability density distribution with h equal to 5; (b) zoom on a part of (a): many false peaks introduced by A_1 + A_2 in Eq. (9.6); (c)-(e) A_1, A_2, and A_1 + A_2 in Eq. (9.6) with t_k = 95.

Equation (9.4) can be rewritten as:

    \hat{f}(t_k + \delta x) = \frac{3}{4 n h^3} \sum_{t \in S_h(t_k)} H(t) \left( h^2 - (t_k - t)^2 \right)
        + \frac{3}{4 n h^3} \sum_{t \in S_h(t_k)} H(t) \left( 2 \delta x (t - t_k) - \delta x^2 \right)
        + \frac{3}{4 n h^3} H(t_k + h)\, \delta x (2h - \delta x)
      = \hat{f}(t_k)
        + \frac{3}{4 n h^3} \left( \sum_{t \in S_h(t_k)} H(t) \right) \left( 2 \delta x M_h(t_k) - \delta x^2 \right)
        + \frac{3}{4 n h^3} H(t_k + h)\, \delta x (2h - \delta x)        (9.5)

We let:

    A_1 = \frac{3}{4 n h^3} \left( \sum_{t \in S_h(t_k)} H(t) \right) \left( 2 \delta x M_h(t_k) - \delta x^2 \right)
    A_2 = \frac{3}{4 n h^3} H(t_k + h)\, \delta x (2h - \delta x)        (9.6a)

When h >> \delta x, A_2 can be approximated as a linear function of \delta x (see Figure 9.1 (d)). Equation (9.5) can be rewritten as:

    \hat{f}(t_k + \delta x) = \hat{f}(t_k) + A_1 + A_2        (9.6b)

Now we calculate the derivative of \hat{f}(t_k + \delta x) with respect to \delta x:

    \frac{d \hat{f}(t_k + \delta x)}{d \delta x} = \frac{3}{2 n h^3} \left[ \left( \sum_{t \in S_h(t_k)} H(t) \right) \left( M_h(t_k) - \delta x \right) + H(t_k + h) (h - \delta x) \right]        (9.7)

Setting (9.7) equal to zero, we obtain:

    \delta x = \frac{ \left( \sum_{t \in S_h(t_k)} H(t) \right) M_h(t_k) + H(t_k + h)\, h }{ \sum_{t \in S_h(t_k)} H(t) + H(t_k + h) }        (9.8)

Substituting equation (9.3) into equation (9.8), if 0 < \delta x < 1, i.e. if:

    \frac{ - \sum_{t \in S_h(t_k)} H(t) (t - t_k) }{ H(t_k + h) } < h < \frac{ \sum_{t \in S_h(t_k)} H(t) + H(t_k + h) - \sum_{t \in S_h(t_k)} H(t) (t - t_k) }{ H(t_k + h) }        (9.9)

then there will be a false peak appearing between the two consecutive gray levels t_k and t_k + 1. For example, in Figure 9.1, when we apply the mean shift method with initial location at 95, we find that the mean shift stops at 95.7244, instead of the real local maximum density at 101. From equation (9.8), we obtain \delta x = 0.7244, i.e. there is a false peak between 95 and 96.

We let L be the leftmost term in equation (9.9) and R be the rightmost term of (9.9); and we let x_{MS(t_k)} be the point to which the mean shift converges from initial point t_k, corresponding to the local peak.
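A hedged sketch of this prediction (the function name is ours, not thesis code): evaluate \delta x from equation (9.8) for a given histogram, t_k and h, and flag a false peak when it falls strictly between 0 and 1:

```python
import numpy as np

def predict_false_peak(hist, t_k, h):
    """Evaluate delta_x from Eq. (9.8) and report whether a false peak is
    predicted between gray levels t_k and t_k + 1 (i.e. 0 < delta_x < 1)."""
    t = np.arange(256)
    in_win = np.abs(t - t_k) < h                          # t in S_h(t_k)
    weights = hist[in_win].astype(float)                  # H(t)
    H_sum = weights.sum()
    if H_sum == 0:
        return None, False
    M = (weights * t[in_win]).sum() / H_sum - t_k         # M_h(t_k), Eq. (9.3)
    H_edge = float(hist[t_k + h]) if t_k + h < 256 else 0.0   # H(t_k + h)
    delta_x = (H_sum * M + H_edge * h) / (H_sum + H_edge)     # Eq. (9.8)
    return delta_x, 0.0 < delta_x < 1.0
```

On a smooth unimodal histogram peaked at 100, for example, a window centred at t_k = 99 with h = 5 yields a delta_x strictly inside (0, 1), so a false peak between 99 and 100 is predicted.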

Thus, if the condition L < h < R is satisfied, we can predict that there will be a false peak between t_k and t_k + 1 (see Table 9.1).

Table 9.1: False peak prediction. Columns: h, L, R, \delta x, t_k, x_{MS(t_k)}, and whether a false peak appears between t_k and t_k + 1; a false peak occurs in four of the seven settings tested.

The above analysis suggests that one could devise an approach that adaptively adjusts h depending upon whether false peaks are predicted. If a false peak is detected, we can use the following adjustment to avoid its influence:

    y_{k+1} = y_k + ceil(M_h(y_k))     for the MS step        (9.10a)
    y_{k+1} = y_k + floor(MV_h(y_k))   for the MSV step       (9.10b)

9.2.3 An Unsupervised Peak-Valley Sliding Algorithm for Image Segmentation

Consider the peaks {P(i)} and valleys {V(i)}, with V(0) = 0 and V(n) = 255, so that V(0) \le P(1) < V(1) < ... < P(n) \le V(n). The proposed algorithm is described as follows:

(1) Initialise the bandwidth h and the location of the search window.
(2) Apply the mean shift algorithm to obtain peak P_k, with the initial window location V_{k-1}.
(3) Apply the mean shift valley method to obtain valley V_k, with the initial window location P_k.

(4) Repeat steps (2) and (3) until P_k or V_k is equal to or larger than 255.

The question remains as to how many of these peaks are significant. We post-process by step (5).

(5) Validate the peaks and valleys:
(5a) Remove peaks that are too small compared with the largest peak.
(5b) Remove the smaller of two consecutive peaks if they are too close.
(5c) Calculate the normalized contrast (Albiol, Torres et al. 2001) for a valley and its two neighbouring peaks:

    Normalized Contrast = Contrast / Height        (9.11)

where the contrast is the difference between the smaller peak and the valley. Remove the smaller of the two peaks if the normalized contrast is small.

After steps (5a)-(5c), we obtain several significant peaks {PS(1), ..., PS(k)}. The valleys are then chosen as the minima of the valleys between consecutive significant peaks. Thus we have k-1 valleys {VS(1), ..., VS(k-1)}.

(6) Using the obtained valleys, finally obtain k segmented images by {[0, VS(1)], [VS(1), VS(2)], ..., [VS(k-1), 255]}.

9.2.4 Experimental Results

In this section, we use several examples to show the performance of the proposed method in segmenting images. Figure 9.2 demonstrates the segmentation procedure of the proposed method. Figure 9.2 (c)/(d) shows the obtained peaks and valleys before/after validation. Before the validation, ten peaks and ten valleys are obtained; near a local plateau, there will be some insignificant peaks and valleys. After applying step (5) in section 9.2.3, we finally obtain three validated valleys and thus four segmented images, Figure 9.2 (e)-(h). The final result is shown in (i).
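The sliding loop of steps (1)-(4) can be sketched as follows. This is an illustration under stated simplifications, not the thesis algorithm: the proper MS and MSV procedures are stood in for here by a windowed-mean climb and a density-descent slide, and the function names are ours.

```python
import numpy as np

def climb(hist, x, h, max_iter=300):
    """Stand-in for the MS loop: follow the windowed mean uphill."""
    t = np.arange(256)
    for _ in range(max_iter):
        window = np.abs(t - x) < h
        w = hist[window].astype(float)
        if w.sum() == 0:
            break
        step = int(round((w * t[window]).sum() / w.sum() - x))
        if step == 0:
            break
        x = int(np.clip(x + step, 0, 255))
    return x

def slide_down(hist, x, h):
    """Stand-in for the MSV step: slide right while the local density
    (windowed histogram mass) keeps decreasing."""
    def density(c):
        return hist[max(0, c - h):min(255, c + h) + 1].sum()
    while x < 255 and density(x + 1) <= density(x):
        x += 1
    return x

def peak_valley_sliding(hist, h):
    """Steps (1)-(4): alternate peak and valley searches across [0, 255]."""
    peaks, valleys = [], [0]
    while valleys[-1] < 255:
        p = max(climb(hist, valleys[-1] + 1, h), valleys[-1] + 1)
        v = max(slide_down(hist, min(p + 1, 255), h), p + 1)
        peaks.append(min(p, 255))
        valleys.append(min(v, 255))
    return peaks, valleys
```

On a bimodal histogram the raw peak list may contain near-duplicate or insignificant peaks; these are exactly what the validation of step (5) then prunes.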

Figure 9.2: The segmentation results of the proposed method (h = 7). (a) original image of the cameraman; (b) gray-level histogram; (c) peaks and valleys of \hat{f}(x) before merging; (d) final peaks and valleys; (e)-(h) the resulting segmented images; (i) the final segmented image.

Figure 9.3 shows another experiment, on an x-ray medical image. From Figure 9.3, we can see that the x-ray image has been successfully segmented: the background (Figure 9.3 (c)), the bone (Figure 9.3 (d)), and the tissues (Figure 9.3 (e)) were extracted separately.

The proposed algorithm is computationally efficient: it takes about 0.27 seconds using MATLAB code on an AMD 800 MHz personal computer.

Figure 9.3: The application of the proposed method to medical images (h = 2). (a) the original x-ray image; (b) the final peaks and valleys after validation; (c)-(e) the resulting segmented images; (f) the final segmented image.

9.3 Color Image Segmentation Using Global Information and Local Homogeneity

Clustering techniques identify homogeneous clusters of points in the feature space (such as RGB color space, HSV color space, etc.) and then label each cluster as a different region. The homogeneity criterion is usually that of color similarity, i.e., the distance from one cluster to another in the color feature space should be smaller than a threshold. The disadvantage of this approach is that it does not consider local color information between neighbouring pixels.

In this section, we propose a new segmentation method that introduces local homogeneity into a mean shift algorithm. Thus, our mean shift algorithm considers both global and local color information. The method operates in the "Hue-Value" two-dimensional subspace of Hue-Saturation-Value space (see section 9.3.1). Compared with methods applying the mean shift algorithm in LUV or RGB color space, the complexity of the proposed method is lower. The proposed method also considers the cyclic property of the hue component; it needs no prior knowledge about the number of clusters and detects the clusters unsupervised. In (Cheng and Sun 2000), the authors also proposed a peak-finding algorithm; unfortunately, it is heuristically based.

One characteristic of the mean shift vector is that it always points towards the direction of the maximum increase in the density. The converged centres (or windows) correspond to modes (or centres of the regions of high concentration) of the data. The mean shift algorithm has a solid theoretical foundation; the proof of its convergence can be found in (Comaniciu and Meer 1999b; Comaniciu and Meer 2002a).

9.3.1 HSV Color Space

Although RGB (Red, Green, and Blue) is a widely used color space to represent the color information in a color image, HSV (Hue, Saturation, and Value) is sometimes preferred. In the HSV color space, each color is determined by the values of H, S, and V (see Figure 9.4). The first component, Hue, is a specification of the intrinsic color. The second component, Saturation, describes how pure the color is. The last component, Value, is a measure of how bright the color is. The HSI (hue-saturation-intensity), HSB (hue-saturation-brightness), and HSL (hue-saturation-lightness) color spaces are variant forms of the HSV color space (Cheng, Jiang et al. 2001). This class of color space has been widely used in computer vision tasks (Cheng and Sun 2000; Zhang and Wang 2000; Sural, Qian et al. 2002).
The reason this class of color space is preferred to the RGB color space is that it better represents human perception of colors. The advantages of hue over RGB are (Cheng and Sun 2000; Cheng and Y. Sun 2000; Cheng, Jiang et al. 2001):

Hue is invariant to certain types of highlights, shading, and shadows;
The segmentation is performed on only one dimension, and the results of segmentation have fewer segments than when using RGB.

Figure 9.4: HSV color space.

In the HSV color space, hue and value are the most important components. Although the authors of (Cheng and Sun 2000) utilized both hue and intensity to segment color images, they segmented color images in the one-dimensional intensity subspace and then, hierarchically, segmented the results (from the segmentation in the intensity subspace) in the hue subspace. Thus it is easy to over-segment the image. In this chapter, we apply the mean shift algorithm in the hue-value two-dimensional subspace (both hue and intensity are considered simultaneously in our method). The hue and value are scaled so that they range from 0 to 255.

One thing worth mentioning is the cyclic property of the hue component. Because hue is an angle, it has this cyclic property. The cyclic property of the hue is considered both in the mean shift algorithm and in classifying pixels to clusters (see sections 9.3.2 and 9.3.3).
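For instance, an RGB pixel can be mapped to the scaled (H, V) feature pair as follows (a sketch using Python's standard colorsys module; the function name is ours, and the 0-255 scaling is the convention just described):

```python
import colorsys

def hv_features(r, g, b):
    """Map an RGB pixel with 0-255 channels to the (hue, value) pair,
    both scaled to the range [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 255.0, v * 255.0
```

Pure red maps to hue 0, pure blue to hue 170, and the value channel is the largest RGB component rescaled.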

9.3.2 Considering the Cyclic Property of the Hue Component in the Mean Shift Algorithm

Because hue is an angle, its cyclic property must be considered in the mean shift algorithm. The revised mean shift vector can be written as:

    M'_h(x) = \frac{1}{n'} \sum_{X_i \in S'_h(x)} X_i - x        (9.12)

where the converging window centre x is a vector [H, V], n' is the number of data points inside the window S'_h(x), and S'_h(x) is defined so as to respect the hue wrap-around at 255:

    S'_h(x) = { X_i = [H_i, V_i] : |H_i - H| < h and |V_i - V| < h },                   if h \le H \le 255 - h;
    S'_h(x) = { X_i : (H_i < H + h or H_i > H - h + 255) and |V_i - V| < h },           if H < h;
    S'_h(x) = { X_i : (H_i > H - h or H_i < H + h - 255) and |V_i - V| < h },           if H > 255 - h.

When translating the search window, let x_{k+1} = [H_{k+1}, V_{k+1}]. The value component is the ordinary window mean,

    V_{k+1} = \frac{1}{n'} \sum_{X_i \in S'_h(x_k)} V_i        (9.13)

while for the hue component each H_i is first shifted by +255 or -255, where necessary, so that it lies within h of H_k; the mean of the shifted hues is then wrapped back into [0, 255], i.e. H_{k+1} is the window mean of the hues taken modulo 255.

9.3.3 The Proposed Segmentation Method for Color Images

Although the mean shift algorithm has been successfully applied to clustering (Cheng 1995; Comaniciu and Meer 1999b), image segmentation (Comaniciu and Meer 1997; Comaniciu and Meer 2002a), etc., it mainly considers global color information, while neglecting local homogeneity. In this chapter, we introduce a measure of local homogeneity (Cheng and Sun 2000) into the mean shift algorithm. The proposed method

considers both global information and local homogeneity information, as explained in the next section.

Local Homogeneity

In (Cheng and Sun 2000), a measure of local homogeneity has been used in one-dimensional histogram thresholding. The homogeneity consists of two parts: the standard deviation and the discontinuity of the intensities at each pixel of the image. The standard deviation S_j at pixel P_j can be written as:

    S_j = \sqrt{ \frac{1}{n_w} \sum_{I_w \in W_d(P_j)} (I_w - m_j)^2 }        (9.14)

where m_j is the mean of the n_w intensities within the window W_d(P_j), which has a size of d by d and is centred at P_j. A measure of the discontinuity D_j at pixel P_j can be written as:

    D_j = \sqrt{ G_x^2 + G_y^2 }        (9.15)

where G_x and G_y are the gradients at pixel P_j in the x and y directions. Thus, the homogeneity H_j at P_j can be written as:

    H_j = 1 - (S_j / S_max)(D_j / D_max)        (9.16)

From equation (9.16), we can see that the H_j value ranges from 0 to 1. The higher the H_j value, the more homogeneous the region surrounding the pixel P_j. In (Cheng and Sun 2000), the authors applied this measure of homogeneity to the histogram of gray levels. Here, we show that local homogeneity can also be incorporated into the popular mean shift algorithm.

Color Image Segmentation Method

Our proposed method mainly consists of three parts:

Map the image to the feature space, considering both global color information and local homogeneity.
Apply the revised mean shift algorithm (section 9.3.2) to obtain the peaks.
Post-process and assign the pixels to each cluster.

The details of the proposed method are:

1. Map the image to the feature space.

We first compute the local homogeneity value at each pixel of the image. To calculate the standard deviation at each pixel, a 5-by-5 window is used. For the discontinuity estimation, we use a 3-by-3 window. Of course, other window sizes can also be used; however, we find that the window sizes used in our case achieve a good balance of performance and computational efficiency. After computing the homogeneity for each pixel, we use only the pixels with high homogeneity values (near 1.0) and neglect the pixels with low homogeneity values. We map the pixels with high homogeneity values into the hue-value two-dimensional space. Thus, both global and local information are considered.

2. Apply the mean shift algorithm to find the local high-density modes.

We randomly initialise a sufficient number of windows in HV space, each with radius h. We accept a window when the number of data points inside it is large and its centre is not too close to the other accepted windows. After the initial windows have been chosen, we apply the mean shift algorithm, considering the cyclic property of the hue component, to obtain the local peaks.

3. Validate the peaks and label the pixels.

After applying the mean shift algorithm, we obtain many peaks. Obviously, these peaks are not all valid, so we need some post-processing to validate them. Thus we do the following:

Eliminate repeated peaks. Because of the limited accuracy of the mean shift, the same peak obtained by the mean shift may not lie at exactly the same location each time. Thus, we merge repeated peaks that are very close to each other (e.g., whose distance is less than 1.0).

Remove small peaks relative to the maximum peak. Because the mean shift algorithm only finds local peaks, it may stop at small local peaks. Calculate the normalized contrast using equation (9.11), and remove the smaller of two peaks if the normalized contrast is small.

4. After obtaining the validated peaks, we assign pixels to their nearest clusters. In this step, the cyclic property of the hue component is again considered. The distance between the i-th pixel and the j-th cluster is:

    Dist(i, j) = \sqrt{ \alpha \, \min\!\left( |H_i - H_j|, \; 255 - |H_i - H_j| \right)^2 + (V_i - V_j)^2 }        (9.17)

where \alpha is a factor adjusting the relative weight of the hue component over the value component.

The authors of (Zhang and Wang 2000) employed the k-means algorithm to segment the image. The disadvantage of such an approach is the requirement that the user must specify the number of clusters. In comparison, the proposed method is unsupervised in that it needs no prior knowledge about the number of clusters. We illustrate this with the experiments in the next section.

9.3.4 Experiments on Color Image Segmentation

In this section, we test our color image segmentation method on natural color images.
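The local homogeneity map used in step 1 of the method (Eqs. (9.14)-(9.16)) can be sketched as follows, assuming a 5x5 window for the standard deviation and central differences for the gradient, as in the text; the function name is ours:

```python
import numpy as np

def homogeneity(img):
    """Per-pixel homogeneity H_j = 1 - (S_j / S_max)(D_j / D_max)."""
    img = img.astype(float)
    rows, cols = img.shape
    # 5x5 windowed standard deviation S_j, Eq. (9.14)
    pad = np.pad(img, 2, mode='edge')
    windows = np.stack([pad[dy:dy + rows, dx:dx + cols]
                        for dy in range(5) for dx in range(5)])
    S = windows.std(axis=0)
    # gradient magnitude D_j from central differences, Eq. (9.15)
    pad1 = np.pad(img, 1, mode='edge')
    Gx = (pad1[1:-1, 2:] - pad1[1:-1, :-2]) / 2.0
    Gy = (pad1[2:, 1:-1] - pad1[:-2, 1:-1]) / 2.0
    D = np.sqrt(Gx ** 2 + Gy ** 2)
    # Eq. (9.16); a flat image has S_max = D_max = 0, so guard the division
    S_max = S.max() if S.max() > 0 else 1.0
    D_max = D.max() if D.max() > 0 else 1.0
    return 1.0 - (S / S_max) * (D / D_max)
```

Pixels deep inside a uniform region score 1.0 and are kept; pixels on a boundary score lower and are discarded before mapping to (H, V) space.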

In Figure 9.5, part of the procedure of the proposed method is illustrated and the final segmentation results are given. Figure 9.5 (a) shows the original image home. The points in HV space are displayed in Figure 9.5 (b) (without validation by local homogeneity) and (c) (after validation by local homogeneity). The tracks of the mean shift in HV space, with different initialisations, are included in Figure 9.5 (d): the blue lines are the traces of the mean shift procedures; the green dots are the centres of the windows converged to by the mean shift procedures; and the red circles are the final peaks after validation. Figure 9.5 (e) gives the final segmentation results of the proposed method.

From Figure 9.5 (e), we can see that our method obtains good segmentation results. The tree, the house, the roof, and the rims of the curtain and the house are all segmented out separately. The curtain and the sky are segmented into the same cluster. This is because the color of the curtain is blue, which is similar to the color of the sky; from the point of view of color homogeneity, this result is correct.

Figure 9.5: (a) the original image home; (b) the hue-value feature space without considering local homogeneity; (c) the hue-value feature space considering local homogeneity; (d) procedures and results of the data decomposition by the mean shift algorithm with different initialisations; (e) the final results with seven colors obtained by the proposed method with h = 9.
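The nearest-peak assignment of step 4, with the hue-aware distance of Eq. (9.17), can be sketched as follows (a hedged reconstruction with names of our own; alpha weights hue against value as in the text):

```python
import numpy as np

def hue_value_distance(H_i, V_i, H_j, V_j, alpha=1.0):
    """Eq. (9.17): (H, V) distance using the cyclic hue difference
    min(|Hi - Hj|, 255 - |Hi - Hj|)."""
    dH = np.abs(np.asarray(H_i, dtype=float) - H_j)
    dH = np.minimum(dH, 255.0 - dH)          # shorter way around the hue circle
    dV = np.asarray(V_i, dtype=float) - V_j
    return np.sqrt(alpha * dH ** 2 + dV ** 2)

def label_pixels(H, V, peaks, alpha=1.0):
    """Assign each (H, V) pixel to its nearest validated peak."""
    dists = np.stack([hue_value_distance(H, V, ph, pv, alpha)
                      for ph, pv in peaks])
    return dists.argmin(axis=0)
```

A pixel with hue 250 is only 5 hue units away from a peak at hue 0 once the wrap-around is taken into account, so it is correctly assigned to that peak rather than to a mid-range one.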

We also note that the grassland is segmented into two parts. On one hand, one can say the method over-segments the grassland, because both parts belong to the grassland; on the other hand, one can say the method correctly segments the grassland, because the two parts can be seen to have different colors. This again demonstrates that there is no unique solution to image segmentation.

In Figure 9.6 and Figure 9.7, we compare the proposed method with a method employing a similar scheme but without considering the local homogeneity and the cyclic property of the hue component.

Figure 9.6: (a) the original image Jelly beans; (b) the final results with five colors obtained by the proposed method with h = 7; (c) the results with seven colors without considering the local homogeneity and the cyclic property of the hue (h = 7).

Figure 9.7: (a) the original image Splash; (b) the final results with three colors obtained by the proposed method with h = 7; (c) the results with six colors without considering the local homogeneity and the cyclic property of the hue (h = 7).


Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Lecture 9 Fitting and Matching

Lecture 9 Fitting and Matching In ths lecture, we re gong to talk about a number of problems related to fttng and matchng. We wll formulate these problems formally and our dscusson wll nvolve Least Squares methods, RANSAC and Hough

More information

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS by XUNYU PAN (Under the Drecton of Suchendra M. Bhandarkar) ABSTRACT In modern tmes, more and more

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

MOTION BLUR ESTIMATION AT CORNERS

MOTION BLUR ESTIMATION AT CORNERS Gacomo Boracch and Vncenzo Caglot Dpartmento d Elettronca e Informazone, Poltecnco d Mlano, Va Ponzo, 34/5-20133 MILANO boracch@elet.polm.t, caglot@elet.polm.t Keywords: Abstract: Pont Spread Functon Parameter

More information

Fitting: Deformable contours April 26 th, 2018

Fitting: Deformable contours April 26 th, 2018 4/6/08 Fttng: Deformable contours Aprl 6 th, 08 Yong Jae Lee UC Davs Recap so far: Groupng and Fttng Goal: move from array of pxel values (or flter outputs) to a collecton of regons, objects, and shapes.

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

Fitting and Alignment

Fitting and Alignment Fttng and Algnment Computer Vson Ja-Bn Huang, Vrgna Tech Many sldes from S. Lazebnk and D. Hoem Admnstratve Stuffs HW 1 Competton: Edge Detecton Submsson lnk HW 2 wll be posted tonght Due Oct 09 (Mon)

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

12. Segmentation. Computer Engineering, i Sejong University. Dongil Han

12. Segmentation. Computer Engineering, i Sejong University. Dongil Han Computer Vson 1. Segmentaton Computer Engneerng, Sejong Unversty Dongl Han Image Segmentaton t Image segmentaton Subdvdes an mage nto ts consttuent regons or objects - After an mage has been segmented,

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB V. Hotař, A. Hotař Techncal Unversty of Lberec, Department of Glass Producng Machnes and Robotcs, Department of Materal

More information

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers 62626262621 Journal of Uncertan Systems Vol.5, No.1, pp.62-71, 211 Onlne at: www.us.org.u A Smple and Effcent Goal Programmng Model for Computng of Fuzzy Lnear Regresson Parameters wth Consderng Outlers

More information

A Semi-parametric Regression Model to Estimate Variability of NO 2

A Semi-parametric Regression Model to Estimate Variability of NO 2 Envronment and Polluton; Vol. 2, No. 1; 2013 ISSN 1927-0909 E-ISSN 1927-0917 Publshed by Canadan Center of Scence and Educaton A Sem-parametrc Regresson Model to Estmate Varablty of NO 2 Meczysław Szyszkowcz

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES Laser dffracton s one of the most wdely used methods for partcle sze analyss of mcron and submcron sze powders and dspersons. It s quck and easy and provdes

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

The Research of Ellipse Parameter Fitting Algorithm of Ultrasonic Imaging Logging in the Casing Hole

The Research of Ellipse Parameter Fitting Algorithm of Ultrasonic Imaging Logging in the Casing Hole Appled Mathematcs, 04, 5, 37-3 Publshed Onlne May 04 n ScRes. http://www.scrp.org/journal/am http://dx.do.org/0.436/am.04.584 The Research of Ellpse Parameter Fttng Algorthm of Ultrasonc Imagng Loggng

More information

Available online at ScienceDirect. Procedia Environmental Sciences 26 (2015 )

Available online at   ScienceDirect. Procedia Environmental Sciences 26 (2015 ) Avalable onlne at www.scencedrect.com ScenceDrect Proceda Envronmental Scences 26 (2015 ) 109 114 Spatal Statstcs 2015: Emergng Patterns Calbratng a Geographcally Weghted Regresson Model wth Parameter-Specfc

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION?

S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION? S.P.H. : A SOLUTION TO AVOID USING EROSION CRITERION? Célne GALLET ENSICA 1 place Emle Bloun 31056 TOULOUSE CEDEX e-mal :cgallet@ensca.fr Jean Luc LACOME DYNALIS Immeuble AEROPOLE - Bat 1 5, Avenue Albert

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Improved Methods for Lithography Model Calibration

Improved Methods for Lithography Model Calibration Improved Methods for Lthography Model Calbraton Chrs Mack www.lthoguru.com, Austn, Texas Abstract Lthography models, ncludng rgorous frst prncple models and fast approxmate models used for OPC, requre

More information

USING GRAPHING SKILLS

USING GRAPHING SKILLS Name: BOLOGY: Date: _ Class: USNG GRAPHNG SKLLS NTRODUCTON: Recorded data can be plotted on a graph. A graph s a pctoral representaton of nformaton recorded n a data table. t s used to show a relatonshp

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Design of Structure Optimization with APDL

Design of Structure Optimization with APDL Desgn of Structure Optmzaton wth APDL Yanyun School of Cvl Engneerng and Archtecture, East Chna Jaotong Unversty Nanchang 330013 Chna Abstract In ths paper, the desgn process of structure optmzaton wth

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Cluster-Based Profile Monitoring in Phase I Analysis. Yajuan Chen. Doctor of Philosophy In Statistics

Cluster-Based Profile Monitoring in Phase I Analysis. Yajuan Chen. Doctor of Philosophy In Statistics Cluster-Based Profle Montorng n Phase I Analyss Yajuan Chen Dssertaton submtted to the faculty of the Vrgna Polytechnc Insttute and State Unversty n partal fulfllment of the requrements for the degree

More information

Wavefront Reconstructor

Wavefront Reconstructor A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution Real-tme Moton Capture System Usng One Vdeo Camera Based on Color and Edge Dstrbuton YOSHIAKI AKAZAWA, YOSHIHIRO OKADA, AND KOICHI NIIJIMA Graduate School of Informaton Scence and Electrcal Engneerng,

More information

Using Fuzzy Logic to Enhance the Large Size Remote Sensing Images

Using Fuzzy Logic to Enhance the Large Size Remote Sensing Images Internatonal Journal of Informaton and Electroncs Engneerng Vol. 5 No. 6 November 015 Usng Fuzzy Logc to Enhance the Large Sze Remote Sensng Images Trung Nguyen Tu Huy Ngo Hoang and Thoa Vu Van Abstract

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information