A NEW FUZZY C-MEANS BASED SEGMENTATION STRATEGY. APPLICATIONS TO LIP REGION IDENTIFICATION

Mihaela Gordan *, Constantine Kotropoulos **, Apostolos Georgakis **, Ioannis Pitas **

* Basis of Electronics Department, Technical University of Cluj-Napoca, Constantin Daicoviciu Street, No. 15, Cluj-Napoca, RO-3400, Romania, mihag@bel.utcluj.ro
** Artificial Intelligence and Information Analysis Laboratory, Department of Informatics, Aristotle University of Thessaloniki, Box 451, GR-54006 Thessaloniki, Greece, {costas, apostolos, pitas}@zeus.csd.auth.gr

ABSTRACT

The problem of lip contour detection is critical in lipreading systems based on contour processing. The typical contour detection strategy, based on image segmentation into homogeneous regions, fails when the mouth images available for lipreading are low-contrast gray level images. Most of the solutions adopted require manual marking of some contour points. Here we propose a new solution from the image segmentation class, suitable for lip contour extraction, using a modified version of the fuzzy c-means image segmentation algorithm. The novelty of the proposed solution lies in the types of features used for segmentation, which include not only luminance information (as in the standard use of fuzzy c-means), but also spatial information about the pixels in the image. After a simple filtering of the outliers, the contours of the resulting segmented objects are extracted. The experimental results obtained are superior to those obtained by standard fuzzy c-means, by other versions of geometrically constrained fuzzy c-means, or by gradient-based edge detection strategies, without the need for manual marking of any contour points. Thus we consider the proposed strategy promising for automatic lip contour extraction applications.

KEYWORDS: fuzzy c-means, spatially constrained image segmentation, lipreading, lip contour detection

1. INTRODUCTION

A relatively large class of lipreading algorithms is based on lip contour analysis. Examples of such algorithms can be found in [1-3]. In these cases, lip contour extraction is needed as the first step.
By lip contour extraction, we usually refer to the process of lip contour detection in the first frame of an audio-visual image sequence. Obtaining the lip contour in subsequent frames is usually referred to as lip tracking. While there are well-developed techniques and algorithms that perform lip contour tracking automatically, the situation is different for lip contour extraction in the first frame. This is a much more difficult task than tracking, due to the lack of good a priori information about the mouth position in the image, the mouth size, the approximate shape of the mouth, the mouth opening etc. So, while in lip contour tracking we have a good initial estimate of the mouth contour from the previous frame, this initial estimate is not available for the first frame and has to be produced by some other means.

Different authors have tried different procedures to extract a good lip contour in the initial frame. Of course, the goal would be to solve this task automatically; approaches like region-based image segmentation and edge detection have been proposed. These methods work quite well in profile images and also in frontal images where the speaker wears lipstick or reflective markers. However, in frontal images without any marking of the lips, the above-mentioned techniques unfortunately fail, and these images are the ones most used for speechreading. The problem of automatic extraction of the lip contour becomes even harder in gray-level images, where the chromatic information differentiating between lips and skin is no longer present. Usually these images have a low contrast, so region-based segmentation and edge detection algorithms fail to provide good results [1,2]. In these cases, the solution adopted is either to mark manually a number of points on the lip contour and obtain the contour by interpolation or geometric modeling, or even to draw the entire lip contour manually.

One of the most successful algorithms for image segmentation into homogeneous regions is the fuzzy c-means algorithm. Many visual applications report the use of fuzzy c-means, e.g. in medical image analysis, soil structure analysis and satellite imagery [4-6]. Unfortunately, experiments demonstrated that it is still not suitable for the segmentation of mouth images into lip and skin regions. The problem is that, in its standard use, the resulting regions are not spatially continuous, because only the gray level uniformity is checked. Therefore, we propose an enhancement of the use of fuzzy c-means that ensures spatially continuous regions after segmentation. Previous approaches to using fuzzy c-means with geometric constraints were reported in [7-9].
These approaches were based on an extra step to update the fuzzy partition, a step in which geometric properties of the pixels in neighborhoods of different sizes (typically 3 × 3) are considered. However, the features used in the fuzzy c-means algorithm were the same as in the standard approach: just the gray levels (or color component intensities) of the pixels.

The approach proposed in this paper is based on the following principle: the features are modified to include information about the spatial position of the pixels, not only their gray level values. For the time being we use the easiest way to include the spatial information: each pixel is represented by its luminance and its x and y coordinates. For the application addressed, on small size images, this technique allows us to obtain very good segmentation results, superior to the standard use of fuzzy c-means and to other versions of spatially constrained fuzzy c-means [7,8], without the need for any manually marked contour points.

2. THE PROPOSED FUZZY C-MEANS BASED IMAGE SEGMENTATION STRATEGY

The need for a new strategy for segmenting the mouth image into two homogeneous regions, namely the lip region and the skin region, comes from the observation that, when the standard fuzzy c-means algorithm is applied to gray level mouth images, the resulting objects are not compact; that is, the pixels are not accurately separated into lip and skin regions. In other words, the gray levels of the pixels alone are not enough to differentiate between the lips and the skin, partly because of the low contrast of the images; many outliers are present in both classes, so they are difficult to filter out. An example is given in Figure 1 (b).

Therefore, we propose the addition of new features to the data to be classified/clustered, aiming at preserving the topology of the neighboring pixels, considering the fact that both lips and skin areas are spatially continuous regions. In other words, neighboring pixels with similar luminance should be kept together, and the distance between classes should take into account both the spatial distance and the gray level distance. To do this, the most straightforward approach is to represent each data point by its spatial position and its gray level. We therefore have a three-dimensional feature space, in which each pixel p of the mouth image is represented by its x and y coordinates and its luminance l:

$$p = (x \;\; y \;\; l)^T \qquad (1)$$

where $x \in [0 \ldots W-1]$, W being the image width; $y \in [0 \ldots H-1]$, H being the image height; and $l \in [0 \ldots L_{Max}-1]$, $L_{Max}$ being the maximum luminance level, e.g. 256. All three components of the feature vector have integer values. Thus, for the W × H sized image, the dataset to be partitioned is $P = \{p_1, p_2, \ldots, p_{WH}\}$. This dataset forms the universe of discourse, to be partitioned into C classes. In our particular application, although the classes are skin and lips, it is sometimes more advantageous to partition the skin into two regions (to ensure their convexity), which leads to C=3 classes instead of C=2.

The partition matrix containing the membership degrees to the C classes (C=2 or C=3), $U = [u_{ij}]$ of size C × WH, and the set of class centers, $V = \{v_1, \ldots, v_C\}$, finally result as the output of the fuzzy c-means algorithm. Here, each $v_i$ is a vector of three components, $v_i = (vx_i \;\; vy_i \;\; vl_i)^T$, comprising an x coordinate, a y coordinate and a luminance value. The partition matrix should satisfy the same constraints as in standard fuzzy c-means (each membership degree lies in [0, 1], and the membership degrees of every pixel sum to 1 over the C classes). With the use of the Euclidean distance, the cost function to be minimized becomes:

$$J_m(U, V) = \sum_{j=1}^{WH} \sum_{i=1}^{C} u_{ij}^{m} \, (p_j - v_i)^T (p_j - v_i) \qquad (2)$$

As in the standard use of fuzzy c-means, we set the weighting coefficient m to m=2. Then, the modified fuzzy c-means algorithm runs iteratively in the following steps (from here on, j denotes the iteration index and k indexes the pixels):

Step 1. Set C to 2 or 3 (depending on the illumination variance of the mouth image).
Step 2. Set the convergence error ε=0.001%.
Step 3. Set m=2.
Step 4. Initialize randomly the partition matrix, $U = U^0$. Set j=0.
Step 5. If j=0 (no previous partition exists yet) or $\max_{i=1,\ldots,C;\; k=1,\ldots,WH} \left| u_{ik}^{j} - u_{ik}^{j-1} \right| > \varepsilon$, go to Step 6. Otherwise, go to Step 7.
Step 6.
6.1. j=j+1;
6.2. Compute the class centers:
$$v_i^{j} = \frac{\sum_{k=1}^{WH} \left(u_{ik}^{j-1}\right)^2 p_k}{\sum_{k=1}^{WH} \left(u_{ik}^{j-1}\right)^2} \quad \text{for each } i = 1, \ldots, C$$
6.3. Update the fuzzy partition:
$$u_{ik}^{j} = \left[ \sum_{n=1}^{C} \frac{(p_k - v_i^{j})^T (p_k - v_i^{j})}{(p_k - v_n^{j})^T (p_k - v_n^{j})} \right]^{-1} \quad \text{for all } i = 1, \ldots, C \text{ and } k = 1, \ldots, WH$$
6.4. Go to Step 5.
Step 7. Set the final partition matrix $U = U^j$ and the final class centers vector $V = V^j$.
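The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the C++ implementation used for the experiments: the function names `build_dataset`, `sq_dist` and `spatial_fcm`, the iteration cap `max_iter`, the fixed random `seed`, and the handling of a pixel that coincides exactly with a class center are all assumptions added for the example. The features are used unscaled, as in equation (1), and ε=0.001% is taken as 1e-5.

```python
import random


def build_dataset(image):
    """Turn a gray-level image (list of rows) into feature vectors (x, y, l)."""
    return [(float(x), float(y), float(l))
            for y, row in enumerate(image)
            for x, l in enumerate(row)]


def sq_dist(p, v):
    """Squared Euclidean distance (p - v)^T (p - v)."""
    return sum((a - b) ** 2 for a, b in zip(p, v))


def spatial_fcm(points, C=2, eps=1e-5, max_iter=200, seed=0):
    """Fuzzy c-means with m = 2 on (x, y, l) features.

    Returns (U, V): U[i][k] is the membership of pixel k in class i,
    V[i] is the center of class i. max_iter and seed are assumptions
    of this sketch, not parameters named in the text.
    """
    rng = random.Random(seed)
    n = len(points)
    # Step 4: random partition matrix whose columns sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(C)]
    for k in range(n):
        s = sum(U[i][k] for i in range(C))
        for i in range(C):
            U[i][k] /= s
    V = []
    for _ in range(max_iter):
        # Step 6.2: centers are means weighted by squared memberships.
        V = []
        for i in range(C):
            w = [u * u for u in U[i]]
            total = sum(w)
            V.append(tuple(sum(wk * p[d] for wk, p in zip(w, points)) / total
                           for d in range(3)))
        # Step 6.3: membership update for m = 2.
        new_U = [[0.0] * n for _ in range(C)]
        for k, p in enumerate(points):
            d = [sq_dist(p, v) for v in V]
            if min(d) == 0.0:
                # Assumption: a pixel lying exactly on a center belongs to it.
                hit = d.index(0.0)
                for i in range(C):
                    new_U[i][k] = 1.0 if i == hit else 0.0
            else:
                inv = [1.0 / di for di in d]
                s = sum(inv)
                for i in range(C):
                    new_U[i][k] = inv[i] / s
        # Step 5: stop when the largest membership change is at most eps.
        change = max(abs(new_U[i][k] - U[i][k])
                     for i in range(C) for k in range(n))
        U = new_U
        if change <= eps:
            break
    return U, V
```

Calling `spatial_fcm(build_dataset(image), C=2)` on a small gray-level image returns the partition matrix and the class centers; because the three features are left unscaled, the relative weight of position and luminance depends on the image size and contrast.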

After the final segmentation results are obtained, we label all the data as belonging to the most plausible class. The class label is encoded by the gray level of the corresponding class center $v_i$, i=1,…,C, namely: for each k, k=1,…,WH,

$$p'_k = (x_k \;\; y_k \;\; vl_i)^T, \quad \text{such that } u_{ik} = \max_{n=1,\ldots,C} u_{nk}$$

The last step of the proposed region-based segmentation scheme is the elimination of the outliers present inside each region. A pixel in the segmented image is defined as an outlier if most of the pixels in a 3 × 3 neighborhood around it belong to a class other than that of the pixel under investigation. In this case, the pixel is flagged as an outlier and, in a second pass, it is moved to the class to which most of its neighbors belong. In this way, the noisy class decisions are eliminated, the result being a set of smooth continuous regions.

3. APPLICATION OF THE PROPOSED STRATEGY TO LIP CONTOUR EXTRACTION IN LOW CONTRAST GRAY LEVEL IMAGES

As stated in the Introduction, the final goal of the proposed new strategy of fuzzy c-means based image segmentation is to improve the lip contour extraction. The critical problem appears in the case of low contrast gray-level mouth images. Here we differentiate two categories of images, which require two different approaches:

(a) medium-low contrast gray level mouth images, with slightly variable illumination. In this case, we can process the entire mouth image at once, but due to the illumination variance inside the mouth image and to the non-convexity of the skin region, a segmentation into C = 3 classes might be needed. Namely, one class will represent the lips, and the other two classes will represent the skin areas, such that each skin area is mostly convex.

(b) very-low contrast gray level mouth images, with variable illumination. In this case, applying the proposed fuzzy c-means algorithm on the entire mouth image at once still fails to provide good segmentation results. A better approach is to divide the mouth image into four subimages: the upper-left, the upper-right, the lower-left and the lower-right ones.
Thus, after applying the segmentation to each subimage and extracting the contour from each segmented subimage, we obtain the piecewise lip contour, whose pieces can finally be joined by interpolation. An example of such a splitting of a mouth image into 4 subimages is given in Figure 1 (a).

For the time being, only lip contour extraction in images with a closed mouth is considered. For open mouth images, the number of classes will increase to include the teeth region, the tongue region etc. We must note here that, when the mouth image is split into 4 subimages, only a segmentation into C = 2 classes is needed, because the skin region in each subimage is now convex.

The first three processing steps (dataset building, the modified fuzzy c-means algorithm and outlier filtering) were described in detail in the previous section. The last step of the segmentation scheme serves the final aim of the proposed region-based segmentation strategy: finding the lip contour. This becomes an easy task once a good region-based segmentation is available. We must note that, when the mouth image is segmented into C = 3 classes, we will probably get in the gradient image an extra boundary between the 2 skin regions, or inside the lip region, which does not represent a lip boundary. These false boundaries can be easily recognized from their shape, from the area they enclose, or from the fact that they are not closed boundaries, as the lip contour must be. The quality of the extracted lip contour will be much better than in the case of classical edge detection schemes or simple fuzzy c-means region-based segmentation schemes.

4. EXPERIMENTAL RESULTS

In order to evaluate the performance of the proposed region segmentation-based strategy in the lip contour detection application, we implemented the system described above in C++. The results are assessed by visual examination, on a test set of different gray level mouth images, of both the segmented images and the extracted lip contour, the latter superimposed on the original mouth images. The proposed segmentation strategy is compared to other variants of fuzzy c-means: standard fuzzy c-means and rule-based neighborhood enhancement (RB-NE) fuzzy c-means [7].

As a test set, we selected two classes of mouth images:

(i) gray level mouth images with medium-low contrast: we manually selected rectangular regions comprising the mouth from the images "Lena", "Lisa" and "Girl";

(ii) gray level mouth images with very low contrast, from the audio-visual database Tulips1 [10]: closed mouth images of the subjects Anthony, Ben, Candace and George. These images were split into four subimages prior to segmentation.

The original images from class (i) and one example test image from class (ii) (subject Anthony) are given in Figure 1 (a). The segmentation results using standard fuzzy c-means, the proposed fuzzy c-means and RB-NE fuzzy c-means are given in Figure 1 (b), (c) and (d), respectively. The better performance of the proposed strategy is visible in all the test cases. The only case in which our algorithm fails, as do the others, is the lower half of the mouth image taken from the Tulips1 database.

Figure 1. Some original mouth images (a) and the results of applying different variants of the fuzzy c-means algorithm for their segmentation: (b) standard fuzzy c-means; (c) the proposed enhancement of fuzzy c-means; (d) RB-NE fuzzy c-means. From left to right, the four original images represent: mouth areas of the images "Girl", "Lena", "Lisa"; the mouth image for subject Anthony from Tulips1, split into 4 subimages.

Figure 2. Examples of resulting region boundaries after the proposed modified fuzzy c-means segmentation; from left to right: for the mouth image of "Lena"; "Lisa"; Anthony (Tulips1).

Some resulting boundaries on the region-based segmented image, overlaid on the original mouth image/subimages, are shown in Figure 2. Here we notice the good quality of the extracted lip contour. The false boundaries present in the images for the 3-class segmentation case can be easily filtered out, since either they are non-closed boundaries (so they cannot be lip boundaries), or the boundary is closed but is not the outermost one, so it cannot be the outer lip contour.

5. CONCLUSIONS

We proposed a new strategy for extracting object boundaries in low contrast gray level images, aiming to solve the problem of lip contour extraction in low contrast gray level mouth images in the first frame. Starting from a well-known algorithm, fuzzy c-means, we modified its standard use by including in the feature vector spatial information about the pixel positions. The experimental results on a set of various low contrast gray level mouth images show better performance of our algorithm, in terms of compactness of the segmented regions, as compared to standard fuzzy c-means or other versions of geometrically guided fuzzy c-means. The proposed solution makes possible a completely automated contour extraction, since no manually marked points on the lip contour are needed. In our future research we will examine the inclusion of some texture features in the feature vector and the use of more elaborate distance measures in the fuzzy c-means algorithm, to allow better shape variability of the clusters.

REFERENCES

[1] R. Kaucic, B. Dalton, A. Blake (1996), Real-time lip tracking for audio-visual speech recognition applications, Proc. European Conf. Computer Vision, Cambridge, UK, pp. 376-387
[2] J. Luettin, N. A. Thacker, S. W. Beet (1996), Active shape models for visual speech feature extraction, Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Sciences, Springer Verlag, Berlin, 150:383-390
[3] I. Matthews, T. Cootes, S. Cox, R. Harvey, J. A. Bangham (1998), Lipreading using shape, shading and scale, Proc. Auditory-Visual Speech Processing, Sydney, Australia, pp. 73-78
[4] S. E. Crane, L. O. Hall (1999), Learning to identify fuzzy regions in magnetic resonance images, 18th Int. Conf. of NAFIPS, pp. 352-356
[5] P. Thitimajshima (2000), A new modified fuzzy c-means algorithm for multispectral satellite images segmentation, Proc. Geoscience and Remote Sensing Symposium, Vol. 4, pp. 1684-1686
[6] D. L. Pham, J. L. Prince (1999), Adaptive fuzzy segmentation of magnetic resonance images, IEEE Transactions on Medical Imaging, Vol. 18, no. 9, pp. 737-752
[7] Y. A. Tolias, S. M. Panas (1998), On applying spatial constraints in fuzzy image clustering using a fuzzy rule-based system, IEEE Signal Processing Letters, Vol. 5, no. 10, pp. 245-247
[8] T. D. Pham (2001), Image segmentation using probabilistic fuzzy c-means clustering, Proc. Int. Conf. on Image Processing, Thessaloniki, Greece, Vol. 1, pp. 722-725
[9] J. C. Noordam, W. H. A. M. van den Broek (2000), Geometrically Guided Fuzzy C-Means Clustering for Multivariate Image Segmentation, Proc. Int. Conf. on Pattern Recognition, pp. 462-465
[10] J. R. Movellan (1995), Visual Speech Recognition with Stochastic Networks, Advances in Neural Information Processing Systems, (G. Tesauro, D. Touretzky, and T. Leen, Eds.), MIT Press, Cambridge, MA, Vol. 7