ESTIMATION OF PROPER PARAMETER VALUES FOR DOCUMENT BINARIZATION

E. Badekas and N. Papamarkos
Image Processing and Multimedia Laboratory
Department of Electrical & Computer Engineering
Democritus University of Thrace, 67100 Xanthi, Greece
papamark@ee.duth.gr

ABSTRACT

Most of the existing document binarization techniques deal with many parameters that require a priori setting of their values. Because the ground-truth images are unknown, the evaluation of document binarization techniques is subjective and employs human observers for the estimation of the appropriate parameter values. The selection of appropriate values for these parameters is crucial and influences the final binarization. However, there is no predetermined set of parameters that guarantees optimal binarization for all document images. This paper proposes a new technique that allows the estimation of proper parameter values for each document binarization technique. The proposed approach is based on a statistical performance analysis of a set of binarization results, which are obtained by applying various binarization techniques with different parameter values. The proposed statistical performance analysis can also identify the best document binarization result obtained by a set of document binarization techniques.

KEY WORDS

Binarization, Thresholding, Document Processing, Segmentation, Parameter Evaluation and Detection.

1. Introduction

Document binarization is an active research area in image processing. Many binarization techniques for gray-scale and, more recently, for color document images have been proposed [-3]. Most of these techniques include parameters, which require appropriate initial settings of their values. The selected values of the parameter set (PS) may be appropriate for specific document images and possibly for other similar documents, but the estimation of appropriate PS values must be repeated for distinctive images. This paper proposes a new technique for the adaptive estimation of appropriate PS values for various document binarization techniques.

In general, both global [1-5] and local [6-14] binarization techniques exist for document binarization. The global binarization techniques are suitable for converting any gray-scale image into a binary form but are inappropriate for complex document images, and perform even worse with degraded document images. In cases of complex and degraded documents, local binarization techniques give better binarization results. This category includes the techniques of Bernsen [6], Chow and Kaneko [7], Eikvil [8], Mardia and Hainsworth [9], Niblack [10], Taxt [11], Yanowitz and Bruckstein [12], Sauvola and Pietikainen [13-14] and Gatos et al. [31].

In the context of document binarization, the most powerful techniques are probably those that go beyond the image gray-scale values to incorporate the structural characteristics of the characters [15-18], [29]. Methods that are based on stroke analysis, such as the stroke width (SW) and other geometrical characteristics of the characters, belong to this category. The Adaptive Logical Level Technique (ALLT) and its improved versions [15-16], [29], as well as the Improvement of Integrated Function Algorithm (IIFA) [17-18], [29], are two of the most powerful techniques that utilize these characteristics. Finally, there are binarization techniques that are based on generic clustering approaches, such as the Fuzzy C-means algorithm (FCM) [19] and the Kohonen neural network based techniques proposed by Papamarkos et al. [20-23].

Most of the binarization techniques, especially those categorized as local thresholding algorithms, have PS values that must be defined prior to their application to a document image. Clearly, different values of the PS lead to different binarization results, which means that there is no safe set of appropriate PS values for all types of document images. In this paper, a Parameter Estimation Algorithm (PEA) is proposed, which can be used to detect the proper PS values of every document binarization technique.
The estimation is based on the correlation analysis between the different document binarization results obtained by applying a specific binarization technique to a document image, using different PS values. The proposed method is based on the work of Yitzhaky and Peli [28], which is used for edge detection evaluation. In their approach, a specific range and a specific step for each one of the parameters are initially defined. The proper values for the PS are then estimated by comparing the results obtained by all possible combinations of the PS values. The proper PS values are estimated using a Receiver Operating Characteristics (ROC) analysis and a Chi-square test.

In order to improve this algorithm, we use a wide initial range for every parameter and apply an adaptive convergence procedure to estimate the proper parameter values. Specifically, at each iteration of the adaptive procedure, the ranges of the parameters are redefined according to the estimation of the best and the second best binarization result obtained. The adaptive procedure terminates when the ranges of the parameter values cannot be reduced further, and the proper PS values are those obtained from the last iteration.

The proposed technique was extensively tested using a variety of documents, most of which come from the old Greek Parliamentary Proceedings and from the University of Washington database [30]. Binarization results of independent techniques are compared with the proposed evaluation technique and according to their performance in an OCR application. The mean rating values calculated for each binarization technique, from the proposed technique and the human-assessment experiment, are similar, with a variation of ±0.5. All the experiments presented confirm the effectiveness of the proposed technique. Sauvola and Pietikainen's technique seems to work properly, in most of the cases, with the specific document database.

2. Obtaining the best Binarization Result

When a document image is converted to a binary form, the ideal result is not known a priori. This is a major problem in comparative evaluation tests. In order to have comparative results, it is important to estimate a ground truth image and consequently to use this image as a reference. Doing this, we can compare the different binarization results obtained and, therefore, estimate the best of those.
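
The ground truth image used as reference, known as the Estimated Ground Truth (EGT) image, is selected from a list of Potential Ground Truth (PGT) images, as proposed by Yitzhaky and Peli [28] for edge detection evaluation. Consider $N$ document binary images $D_j$, $j = 1, \ldots, N$, obtained by applying one or more document binarization techniques to a gray-scale document image of size $K \times L$. In order to get the best binary image it is necessary to obtain the EGT image first. In turn, the independent binarization results are compared with the EGT image using the Chi-square test. The entire procedure is described below; the background and foreground pixels are represented by 0 and 1, respectively.

Stage 1 For every pixel, the number of binary images that consider it as a foreground pixel is calculated. The results are stored in a matrix $C(x, y)$, $x = 1, \ldots, K$ and $y = 1, \ldots, L$. The values of this matrix lie between 0 and $N$.

Stage 2 $N$ binary images $PGT_i$, $i = 1, \ldots, N$, are produced using the matrix $C(x, y)$. Every $PGT_i$ image is the image that has as foreground pixels all the pixels with $C(x, y) \ge i$.

Stage 3 $PGT_i^0$ and $PGT_i^1$ are defined respectively as the background and foreground pixels in the $PGT_i$ image, while $D_j^0$ and $D_j^1$ represent the background and foreground pixels in the $D_j$ image. For each $PGT_i$ image, four probabilities are defined.

Probability that a pixel is a foreground pixel in both the $PGT_i$ and $D_j$ images:

$$TP_{PGT_i, D_j} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} PGT_i^1 \cap D_j^1 \tag{1}$$

Probability that a pixel is a foreground pixel in the $D_j$ image and a background pixel in the $PGT_i$ image:

$$FP_{PGT_i, D_j} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} PGT_i^0 \cap D_j^1 \tag{2}$$

Probability that a pixel is a background pixel in both the $PGT_i$ and $D_j$ images:

$$TN_{PGT_i, D_j} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} PGT_i^0 \cap D_j^0 \tag{3}$$

Probability that a pixel is a background pixel in the $D_j$ image and a foreground pixel in the $PGT_i$ image:

$$FN_{PGT_i, D_j} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} PGT_i^1 \cap D_j^0 \tag{4}$$

According to the above definitions, for each $PGT_i$ the average value of the four probabilities resulting from its match with each of the individual binarization results $D_j$ is calculated:

$$\overline{TP}_{PGT_i} = \frac{1}{N} \sum_{j=1}^{N} TP_{PGT_i, D_j} \tag{5}$$

$$\overline{FP}_{PGT_i} = \frac{1}{N} \sum_{j=1}^{N} FP_{PGT_i, D_j} \tag{6}$$

$$\overline{TN}_{PGT_i} = \frac{1}{N} \sum_{j=1}^{N} TN_{PGT_i, D_j} \tag{7}$$

$$\overline{FN}_{PGT_i} = \frac{1}{N} \sum_{j=1}^{N} FN_{PGT_i, D_j} \tag{8}$$

Stage 4 In this stage, the sensitivity $TPR_{PGT_i}$ and specificity $(1 - FPR_{PGT_i})$ values are calculated according to the relations:

$$TPR_{PGT_i} = \frac{\overline{TP}_{PGT_i}}{P} \tag{9}$$

$$FPR_{PGT_i} = \frac{\overline{FP}_{PGT_i}}{1 - P} \tag{10}$$

where $P = \overline{TP}_{PGT_i} + \overline{FN}_{PGT_i}$.

Stage 5 This stage is used to obtain the EGT image. The EGT image is selected to be one of the $PGT_i$ images. For each $PGT_i$, the $X^2_{PGT_i}$ value is calculated according to the relation:

$$X^2_{PGT_i} = \frac{(\text{sensitivity} - Q)\,(\text{specificity} - (1 - Q))}{(1 - Q) \cdot Q} \tag{11}$$

where $Q = \overline{TP}_{PGT_i} + \overline{FP}_{PGT_i}$. A histogram of the values of $X^2_{PGT_i}$ is constructed (CT-Chi-square histogram). The best CT is the value of $i$ that maximizes $X^2_{PGT_i}$. The $PGT_i$ image at this CT level is then considered as the EGT image. An example of a CT Chi-square histogram is shown in Figure 1 for $N = 9$. The detected CT level in this example is the fifth.

Figure 1. An example of a CT Chi-square histogram. The fifth level is the CT level.

Stage 6 For each $D_j$ image, four probabilities are defined.

Probability that a pixel is a foreground pixel in both the $D_j$ and EGT images:

$$TP_{D_j, EGT} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} D_j^1 \cap EGT^1 \tag{12}$$

Probability that a pixel is a foreground pixel in the $D_j$ image and a background pixel in the EGT image:

$$FP_{D_j, EGT} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} D_j^1 \cap EGT^0 \tag{13}$$

Probability that a pixel is a background pixel in both the $D_j$ and EGT images:

$$TN_{D_j, EGT} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} D_j^0 \cap EGT^0 \tag{14}$$

Probability that a pixel is a background pixel in the $D_j$ image and a foreground pixel in the EGT image:

$$FN_{D_j, EGT} = \frac{1}{K \cdot L} \sum_{k=1}^{K} \sum_{l=1}^{L} D_j^0 \cap EGT^1 \tag{15}$$

Stage 7 Stages 4 and 5 are repeated to compare each binary image $D_j$ with the EGT image, using the relations (12)-(15) rather than the relations (5)-(8), which were calculated in Stage 3. According to the Chi-square test, the maximum value of $X^2_{D_j, EGT}$ indicates the $D_j$ image which is the estimated best document binarization result. By sorting the values of the Chi-square histogram, the binarization results are sorted according to their quality.

3. Parameter Estimation Algorithm

In the first stage of the proposed evaluation system it is necessary to estimate the proper PS values for each one of the independent document binarization techniques. This estimation is based on the method of Yitzhaky and Peli [28] proposed for edge detection evaluation. However, in order to increase the accuracy of the estimated proper PS values, we improve this algorithm by using a wide initial range for every parameter and an adaptive convergence procedure.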
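
Since the estimation in this section relies on the comparison procedure of Section 2, a minimal Python sketch of that procedure may help to fix ideas. It is an illustration under stated assumptions, not the authors' implementation: the function names (`match`, `chi_square`, `rank_binarizations`) are hypothetical, NumPy is assumed to be available, the input images are assumed to be equally sized arrays with foreground = 1, and a degenerate level (all foreground or all background) is simply scored 0.

```python
import numpy as np

def match(ref, img):
    """Return the (TP, FP, FN) probabilities of img against a reference image."""
    ref, img = ref.astype(bool), img.astype(bool)
    tp = np.mean(ref & img)     # foreground in both
    fp = np.mean(~ref & img)    # foreground in img, background in ref
    fn = np.mean(ref & ~img)    # background in img, foreground in ref
    return tp, fp, fn

def chi_square(tp, fp, fn):
    """Chi-square measure of relation (11), built on relations (9)-(10)."""
    p = tp + fn                 # P: foreground fraction of the reference
    q = tp + fp                 # Q: foreground fraction of the tested image(s)
    if p <= 0 or p >= 1 or q <= 0 or q >= 1:
        return 0.0              # assumption of this sketch: degenerate case
    tpr = tp / p                # sensitivity
    fpr = fp / (1.0 - p)        # 1 - specificity
    return (tpr - q) * ((1.0 - fpr) - (1.0 - q)) / ((1.0 - q) * q)

def rank_binarizations(images):
    """Select the EGT image and rank the binary images D_j against it."""
    stack = np.stack([im.astype(bool) for im in images])
    n = len(images)
    votes = stack.sum(axis=0)                   # Stage 1: C(x, y) in 0..N
    best_level, best_x2, egt = None, -np.inf, None
    for i in range(1, n + 1):
        pgt = votes >= i                        # Stage 2: PGT_i
        # Stage 3: probabilities averaged over all D_j, relations (5)-(8)
        tp, fp, fn = np.mean([match(pgt, d) for d in stack], axis=0)
        x2 = chi_square(tp, fp, fn)             # Stages 4-5
        if x2 > best_x2:
            best_level, best_x2, egt = i, x2, pgt
    # Stages 6-7: compare every D_j with the EGT image and sort by X^2
    scores = [chi_square(*match(egt, d)) for d in stack]
    order = sorted(range(n), key=lambda j: scores[j], reverse=True)
    return best_level, egt, scores, order
```

Calling `rank_binarizations` with a list of binary results returns the selected CT level, the EGT image, the Chi-square score of every result, and the indices of the results sorted from best to worst; `order[0]` and `order[1]` are the best and second best results used by the parameter estimation.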

That is, the ranges of the parameters are redetermined in line with the estimation of the best and second best binarization result obtained in each iteration of the adaptive procedure. This procedure terminates when the ranges of the parameter values cannot be further reduced, and the proper PS values are those obtained during the last iteration. It is important to note that this is an adaptive procedure and is applicable to every document image. The stages of the proposed parameter estimation algorithm, for two parameters $(P_1, P_2)$, are as follows:

Stage 1 Define the initial range of the PS values. Consider $[s_1, e_1]$ the range for the first parameter and $[s_2, e_2]$ the range for the second one.

Stage 2 Define the number of steps that will be used in each iteration. For the two-parameter case, let $St_1$ and $St_2$ be the numbers of steps for the ranges $[s_1, e_1]$ and $[s_2, e_2]$, respectively. In most cases $St_1 = St_2 = 3$.

Stage 3 Calculate the lengths $L_1$ and $L_2$ of each step, according to the following relations:

$$L_1 = \frac{e_1 - s_1}{St_1 - 1}, \qquad L_2 = \frac{e_2 - s_2}{St_2 - 1} \tag{16}$$

Stage 4 In each step, the values of the parameters $P_1$, $P_2$ are updated according to the relations:

$$P_1(i) = s_1 + i \cdot L_1, \quad i = 0, \ldots, St_1 - 1 \tag{17}$$

$$P_2(i) = s_2 + i \cdot L_2, \quad i = 0, \ldots, St_2 - 1 \tag{18}$$

Stage 5 Apply the binarization technique to the document image using all the possible combinations of $(P_1, P_2)$. Thus, $N$ binary images $D_j$, $j = 1, \ldots, N$, are produced, where $N$ is equal to $St_1 \cdot St_2$.

Stage 6 Examine the $N$ binary document results, using the algorithm described in Section 2, to estimate the best and the second best document binarization results. Let $(P_{1B}, P_{2B})$ and $(P_{1S}, P_{2S})$ be the parameter values obtained from the best and the second best binarization results, respectively.

Stage 7 Redefine the ranges for the two parameters as $[s_1', e_1']$ and $[s_2', e_2']$ that will be used during the next iteration of the method, according to the relations:

$$[s_1', e_1'] = \begin{cases} [P_{1S}, P_{1B}] & \text{if } P_{1B} > P_{1S} \\ [P_{1B}, P_{1S}] & \text{if } P_{1B} < P_{1S} \\ \left[\dfrac{s_1 + A}{2}, \dfrac{e_1 + A}{2}\right] & \text{if } P_{1B} = P_{1S} = A \end{cases}$$

$$[s_2', e_2'] = \begin{cases} [P_{2S}, P_{2B}] & \text{if } P_{2B} > P_{2S} \\ [P_{2B}, P_{2S}] & \text{if } P_{2B} < P_{2S} \\ \left[\dfrac{s_2 + A}{2}, \dfrac{e_2 + A}{2}\right] & \text{if } P_{2B} = P_{2S} = A \end{cases}$$

Stage 8 Adjust the steps $St_1'$, $St_2'$ to the ranges that will be used in the next iteration, according to the relations:

$$St_1' = \begin{cases} e_1' - s_1' & \text{if } e_1' - s_1' < St_1 \\ St_1 & \text{otherwise} \end{cases} \qquad St_2' = \begin{cases} e_2' - s_2' & \text{if } e_2' - s_2' < St_2 \\ St_2 & \text{otherwise} \end{cases}$$

Stage 9 If $St_1' \cdot St_2' \ge 3$, go to Stage 3 and repeat all the stages. The iterations terminate when the calculated new steps for the next iteration have a product less than 3 ($St_1' \cdot St_2' < 3$). The proper PS values are those estimated during Stage 6 of the last iteration.

4. Comparing the results of different binarization techniques

The proposed evaluation technique can be extended to estimate the best binarization result by comparing the binary images obtained by independent techniques. Specifically, the best document binarization results obtained from the independent techniques using the estimated proper PS values are compared through the procedure described in Section 2. That is, the final best document binarization result is obtained as follows:

Stage 1 Estimate the proper PS values for each document binarization technique, using the PEA described in Section 3.

Stage 2 Obtain the document binarization results from each one of the independent binarization techniques by using their proper PS values.

Stage 3 Compare the binary images obtained in Stage 2 and estimate the final best document binarization result by using the algorithm described in Section 2.

5. The binarization techniques incorporated in the evaluation system

In order to achieve satisfactory document binarization results, a number of powerful binarization techniques are included in the proposed evaluation system. Two of them are global, three are local and two are based on structural characteristics of the characters. In particular, the incorporated binarization techniques are:

1. Otsu's technique [1]
2. Fuzzy C-Means (FCM) [19]
3. Niblack's technique [10]
4. Sauvola and Pietikainen's technique [13-14]
5. Bernsen's technique [6]
6. Adaptive Logical Level Technique (ALLT) [15-16], [29]
7. Improvement of Integrated Function Algorithm (IIFA) [17-18], [29]

6. Experimental Results

Experiment 1

This experiment demonstrates the application of the proposed technique to a large number of document images obtained from the old Greek Parliamentary Proceedings and the University of Washington database [30]. The goal of this experiment is to evaluate the seven independent binarization techniques and to decide which of them gives the best results. For each document image, the best binarization result of each independent binarization technique is obtained. These results are rated and sorted according to their chi-square test values obtained by the proposed evaluation method. The rating value for a document binarization technique can be between 1 (best) and 7 (worst). The mean rating value for each binarization technique is then calculated and the histogram shown in Figure 2 is constructed using these values.

Figure 2. The histogram constructed by the mean rating values (one bar per technique: Sauvola, Otsu, Bernsen, Niblack, FCM, IIFA, ALLT).

The minimum value of this histogram is assigned to the binarization technique which has the best performance for the specific document image database.
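
The per-technique best results rated in this experiment come from the PEA of Section 3. As a rough illustration of that algorithm, the following Python sketch runs the range-refinement loop for any number of parameters. It is a sketch under assumptions rather than the authors' code: `score_all` is a hypothetical callback that receives all parameter combinations of one iteration and returns one score per combination (higher is better; in the paper's setting it would binarize the image with each combination and return the Chi-square values of Section 2), step counts are truncated to integers, a single-step range is treated as one grid point, and `max_iter` is a safety limit that is not part of the described algorithm.

```python
from itertools import product

def new_range(s, e, best, second):
    """Stage 7: shrink [s, e] using the best and second best parameter values."""
    if best != second:
        return min(best, second), max(best, second)
    return (s + best) / 2.0, (e + best) / 2.0      # case P_B = P_S = A

def new_steps(s, e, steps):
    """Stage 8: cap the step count by the width of the new range."""
    width = e - s
    return int(width) if width < steps else steps  # truncation is an assumption

def estimate_parameters(score_all, ranges, steps, max_iter=50):
    """Adaptive grid refinement over the parameter ranges (Stages 3-9)."""
    ranges, steps = list(ranges), list(steps)
    best = None
    for _ in range(max_iter):
        grids = []
        for (s, e), st in zip(ranges, steps):
            length = (e - s) / (st - 1) if st > 1 else 0.0             # Stage 3
            grids.append([s + i * length for i in range(max(st, 1))])  # Stage 4
        combos = list(product(*grids))             # Stage 5: all combinations
        scores = score_all(combos)                 # Stage 6: rank the results
        order = sorted(range(len(combos)), key=lambda j: scores[j], reverse=True)
        best = combos[order[0]]
        second = combos[order[1]] if len(order) > 1 else best
        ranges = [new_range(s, e, b, c)
                  for (s, e), b, c in zip(ranges, best, second)]
        steps = [new_steps(s, e, st) for (s, e), st in zip(ranges, steps)]
        step_product = 1
        for st in steps:
            step_product *= st
        if step_product < 3:                       # Stage 9: stop refining
            break
    return best
```

With a toy score such as the negative squared distance to a known optimum, the loop narrows both ranges around that optimum in a few iterations; real use would replace the toy score with the ranking of the actual binarization results.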

The mean chi-square values obtained for each binarization technique are presented in the histogram shown in Figure 3. According to the evaluation results, it is concluded that Sauvola and Pietikainen's technique gives, in most of the cases, the best document binarization results. This conclusion agrees with other evaluation tests, such as the test performed by Sezgin and Sankur [27]. It should be noted that Niblack's technique was used without any post-processing step, which results in this technique achieving the worst mean rating value.

Figure 3. The histogram constructed by the mean chi-square values (one bar per technique: Sauvola, Otsu, Bernsen, Niblack, FCM, IIFA, ALLT).

Experiment 2

In order to prove the effectiveness of the proposed evaluation technique, a human-assessment experiment has been performed in which a group of people was asked to compare the visual results obtained in Experiment 1 by the independent binarization techniques. In particular, these results were printed and handed out to a number of persons, who were asked to rank the images according to their quality. The mean rating values obtained in this experiment were similar to the values obtained in the previous experiment, with a variation of ±0.5. The corresponding histogram constructed in this human-assessment experiment is shown in Figure 4.

Figure 4. The histogram constructed by the mean rating values in the experiment based on human-assessment (one bar per technique: Sauvola, Otsu, Bernsen, Niblack, FCM, IIFA, ALLT).

7. Conclusions

In this paper a new method is proposed for the estimation of the proper PS values of document binarization techniques and of the best binarization result obtained by a set of independent document binarization techniques. The proposed method is extended to produce an evaluation system for independent document binarization techniques. The estimation of the proper PS values is achieved by applying an adaptive convergence procedure starting from a wide initial range for every parameter. The entire system was extensively tested with a variety of document images. The mean rating values obtained for each binarization technique in the human-assessment experiment are similar to the mean rating values obtained by the proposed evaluation technique. Sauvola and Pietikainen's technique gives, in most of the cases, the best document binarization results.

Acknowledgements

This work is co-funded by the European Social Fund and National Resources (EPEAEK-II) ARCHIMIDES, TEI Serron.

References

[1] N. Otsu, A thresholding selection method from gray-level histogram, IEEE Trans. Systems Man Cybernet. SMC-8, 1978, 62-66.
[2] J. Kittler and J. Illingworth, Minimum error thresholding, Pattern Recognition 19 (1), 1986, 41-47.
[3] S.S. Reddi, S.F. Rudin and H.R. Keshavan, An optimal multiple threshold scheme for image segmentation, IEEE Trans. on Systems, Man and Cybernetics 14 (4), 1984, 661-665.
[4] J.N. Kapur, P.K. Sahoo and A.K. Wong, A new method for gray-level picture thresholding using the entropy of the histogram, Computer Vision Graphics and Image Processing 29, 1985, 273-285.
[5] N. Papamarkos and B. Gatos, A new approach for multithreshold selection, Computer Vision Graphics and Image Processing 56 (5), 1994, 357-370.
[6] J. Bernsen, Dynamic thresholding of grey-level images, Proc. Eighth Int. Conf. Pattern Recognition, Paris, 1986, 1251-1255.
[7] C.K. Chow and T. Kaneko, Automatic detection of the left ventricle from cineangiograms, Computers and Biomedical Research 5, 1972, 388-410.
[8] L. Eikvil, T. Taxt and K. Moen, A fast adaptive method for binarization of document images, Proc. ICDAR, France, 1991, 435-443.
[9] K.V. Mardia and T.J. Hainsworth, A spatial thresholding method for image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 10 (8), 1988, 919-927.
[10] W. Niblack, An Introduction to Digital Image Processing, Englewood Cliffs, N.J., Prentice Hall, 1986, 115-116.
[11] T. Taxt, P.J. Flynn and A.K. Jain, Segmentation of document images, IEEE Trans. Pattern Anal. Mach. Intell. 11 (12), 1989, 1322-1329.
[12] S.D. Yanowitz and A.M. Bruckstein, A new method for image segmentation, Computer Vision, Graphics and Image Processing 46 (1), 1989, 82-95.
[13] J. Sauvola, T. Seppanen, S. Haapakoski and M. Pietikainen, Adaptive Document Binarization, ICDAR, Ulm, Germany, 1997, 147-152.
[14] J. Sauvola and M. Pietikainen, Adaptive Document Image Binarization, Pattern Recognition 33, 2000, 225-236.
[15] M. Kamel and A. Zhao, Extraction of binary character/graphics images from gray-scale document images, CVGIP: Graphical Models Image Process. 55 (3), 1993, 203-217.
[16] Y. Yang and H. Yan, An adaptive logical method for binarization of degraded document images, Pattern Recognition 33, 2000, 787-807.
[17] J.M. White and G.D. Rohrer, Image segmentation for optical character recognition and other applications requiring character image extraction, IBM J. Res. Dev. 27 (4), 1983, 400-411.
[18] O.D. Trier and T. Taxt, Improvement of Integrated Function Algorithm for binarization of document images, Pattern Recognition Letters 16, 1995, 277-283.
[19] Z. Chi, H. Yan and T. Pham, Fuzzy Algorithms: With Applications to Image Processing and Pattern Recognition, World Scientific Publishing, 1996.
[20] N. Papamarkos, A neuro-fuzzy technique for document binarization, Neural Computing & Applications 12 (3-4), 2003, 190-199.
[21] N. Papamarkos, C. Strouthopoulos and I. Andreadis, Multithresholding of color and gray-level images through a neural network technique, Image and Vision Computing 18, 2000, 213-222.
[22] N. Papamarkos and A. Atsalakis, Gray-level reduction using local spatial features, Computer Vision and Image Understanding 78, 2000, 336-350.
[23] N. Papamarkos, A. Atsalakis and C. Strouthopoulos, Adaptive Color Reduction, IEEE Trans. on Systems, Man, and Cybernetics-Part B 32 (1), 2002, 44-56.
[24] O.D. Trier and T. Taxt, Evaluation of binarization methods for document images, IEEE Trans. Pattern Anal. Mach. Intelligence 17 (3), 1995, 312-315.
[25] O.D. Trier and A.K. Jain, Goal-Directed Evaluation of Binarization Methods, IEEE Trans. Pattern Anal. Mach. Intelligence 17 (12), 1995, 1191-1201.
[26] G. Leedham, C. Yan, K. Takru and J.H. Mian, Comparison of Some Thresholding Algorithms for Text/Background Segmentation in Difficult Document Images, Proc. of 7th ICDAR (2), Scotland, 2003, 859-865.
[27] M. Sezgin and B. Sankur, Survey over image thresholding techniques and quantitative performance evaluation, Journal of Electronic Imaging 13 (1), 2004, 146-165.
[28] Y. Yitzhaky and E. Peli, A Method for Objective Edge Detection Evaluation and Detector Parameter Selection, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (8), 2003, 1027-1033.
[29] E. Badekas and N. Papamarkos, A system for document binarization, 3rd International Symposium on Image and Signal Processing and Analysis ISPA 2003, Rome, Italy.
[30] UW: English Document Image Database, University of Washington, Seattle, 1993.
[31] B. Gatos, I. Pratikakis and S.J. Perantonis, Adaptive degraded document image binarization, Pattern Recognition 39, 2006, 317-327.