An Unsupervised Segmentation Framework for Texture Image Queries

Shu-Ching Chen
Distributed Multimedia Information System Laboratory
School of Computer Science
Florida International University
Miami, FL 33199, USA
chens@cs.fiu.edu

Chengcui Zhang
Distributed Multimedia Information System Laboratory
School of Computer Science
Florida International University
Miami, FL 33199, USA
czhang02@cs.fiu.edu

Mei-Ling Shyu
Department of Electrical and Computer Engineering
University of Miami
Coral Gables, FL 33124, USA
shyu@miami.edu

This research was supported in part by NSF CDA-9711582.

Abstract

In this paper, a novel unsupervised segmentation framework for texture image queries is presented. The proposed framework consists of an unsupervised segmentation method for texture images and a multi-filter query strategy. By applying the unsupervised segmentation method to each texture image, a set of texture feature parameters for that image can be extracted automatically. Based upon these parameters, an effective multi-filter query strategy that allows users to issue texture-based image queries is developed. The test results of the proposed framework on 318 texture images obtained from the MIT VisTex and Brodatz databases are presented to show its effectiveness.

1. Introduction

Segmentation is an important part of computer vision and image analysis, wherein regions of interest are identified and extracted for further processing. The definition of suitable similarity and homogeneity measures is a fundamental task in many important applications, ranging from remote sensing to similarity-based retrieval in large image databases such as the query by image content (QBIC) system [4]. Texture segmentation involves the identification of uniformly textured regions in an image. Many techniques have been used for the analysis of textures, such as those in [5, 6]. When the problem is restricted to a set of known textures, retrieval and segmentation are essentially reduced to a supervised classification task, which is amenable to standard techniques from pattern recognition and statistics.
Techniques used for image segmentation include simple statistical models to obtain estimates of probability density functions [7], and intensity and texture measures [10]. Local statistics and edge information have also been used to segment and distinguish regions of interest from the background [9]. Segmentation techniques can be grouped into split-and-merge methods [8], region growing methods [1], and stochastic model based methods [2]. Most of the emerging techniques include a step that chooses a strategy for estimating the parameters of the distributions, which is invariably either maximum likelihood (ML) estimation or maximum a posteriori (MAP) estimation. However, computing the exact MAP estimate of the class label field is considered a hard problem. Also, no methods in the literature can compute the MAP estimates of the class parameters and the pixel labels simultaneously.

There exists a tight relationship between similarity-based texture image retrieval and unsupervised texture segmentation. Image retrieval often requires selecting those images in a database that are most similar to a given query image, while the goal of segmentation is to partition a given image into maximally homogeneous regions. Therefore, both tasks are closely related to similarity measures, since homogeneity can be defined as the average similarity between pairs of local texture patches within a region. In this paper,
we propose an unsupervised texture segmentation method that recognizes the variability of content description depending on the complexity of the image regions and addresses it effectively. The proposed method considers the problem of segmentation as a joint estimation of the partition and the class parameters. This class parameterization enables us to compute the optimal parameters using a simple least squares technique, and the class descriptions are amenable to direct estimation of their parameters without resorting to expensive numerical optimization procedures. By considering both the partition and the class parameters as random variables and estimating them jointly, their MAP estimates are computed simultaneously.

In our framework, we first segment each of the texture images into classes (usually 2 classes), and extract the texture features of each class simultaneously by generating the class parameters during the process of segmentation. Based on the database of texture features, a multi-filter query mechanism is developed to filter out, at the very beginning of the query, most of the biased texture images that are far different from the example query texture image, which can greatly reduce the overhead. The test results are based on the 318 texture images obtained from the MIT VisTex Texture database [11] and the Brodatz database [12].

The rest of the paper is organized as follows. In Section 2, the unsupervised texture segmentation method is presented, and the feature parameters obtained by segmentation are described in detail. Section 3 explains the query strategy. Section 4 gives the test results and the discussions. Conclusions and future work are given in Section 5.

2. Unsupervised Texture Segmentation

In the proposed unsupervised texture segmentation framework, the partition and the class parameters are treated as random variables. The method of partitioning a still image starts with a random partition and employs an iterative algorithm to estimate the partition and the class parameters jointly [3].

2.1. Segmentation Method

Suppose the image is of size $N_r \times N_c$ with intensities given by $Y = \{y_{ij} : 1 \le i \le N_r,\ 1 \le j \le N_c\}$, and there are two classes in the image. Let the partition variable be $c = \{c_1, c_2\}$, and the classes be parameterized by $\theta = \{\theta_1, \theta_2\}$. Also, suppose all the pixel values belonging to class $k$ ($k = 1, 2$) are put into a vector $Y_k$. Each row of the matrix $\Phi$ is given by $(1, i, j, ij)$ and $a_k$ is the vector of parameters $(a_{k0}, \ldots, a_{k3})^T$.

$$y_{ij} = a_{k0} + a_{k1}\, i + a_{k2}\, j + a_{k3}\, ij, \quad \forall (i, j) : y_{ij} \in c_k$$

$$Y_k = \Phi a_k$$

$$\hat{a}_k = (\Phi^T \Phi)^{-1} \Phi^T Y_k$$

The best partition is estimated as that which maximizes the a posteriori probability (MAP) of the partition variable given the image data. Now, the MAP estimates of $c = \{c_1, c_2\}$ and $\theta = \{\theta_1, \theta_2\}$ are given by

$$(\hat{c}, \hat{\theta}) = \arg\max_{(c, \theta)} P(c, \theta \mid Y) = \arg\max_{(c, \theta)} P(Y \mid c, \theta)\, P(c, \theta)$$

Let $J(c, \theta)$ be the functional to be minimized. With appropriate assumptions, this joint estimation can be simplified to the following form:

$$(\hat{c}, \hat{\theta}) = \arg\min_{(c, \theta)} J(c_1, c_2, \theta_1, \theta_2)$$

$$J(c_1, c_2, \theta_1, \theta_2) = \sum_{y_{ij} \in c_1} -\ln p_1(y_{ij}; \theta_1) + \sum_{y_{ij} \in c_2} -\ln p_2(y_{ij}; \theta_2)$$

The algorithm starts with an arbitrary partition of the data and computes the corresponding class parameters. Using these class parameters and the data, a new partition is estimated. Both the partition and the class parameters are iteratively refined until there is no further change in them. After the segmentation, a set of parameters describing both classes is obtained automatically, and part of these parameters are selected for later query use.

2.2. Initial Partitions for Segmentation

The proposed segmentation method starts with a randomly generated initial partition. Hence, different initial partitions may lead to different local minima. The smallest local minimum among them gives the desired solution, though it may not be the global minimum. In the proposed framework, a number of local minima (e.g., 20) are computed and the smallest local minimum is used. Since the computational requirement for each local minimum is very small, the overall computation needed for the best local minimum is modest.
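The iteration described in Sections 2.1 and 2.2 can be sketched in Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the paper leaves the class densities $p_k$ to "appropriate assumptions", so the code assumes Gaussian residuals around the fitted surface $a_{k0} + a_{k1} i + a_{k2} j + a_{k3} ij$ with one variance per class. The function names (`fit_class`, `segment`, `random_line_mask`, `best_of_restarts`), the variance floor, and the way random lines are drawn are all hypothetical.

```python
import numpy as np

def design_matrix(rows, cols):
    """Build Phi: one row (1, i, j, i*j) per pixel coordinate (i, j)."""
    i, j = rows.astype(float), cols.astype(float)
    return np.column_stack([np.ones_like(i), i, j, i * j])

def fit_class(image, mask):
    """Least-squares estimate of a_k for the pixels in mask, plus residual variance."""
    rows, cols = np.nonzero(mask)
    phi = design_matrix(rows, cols)
    y = image[rows, cols].astype(float)
    a, *_ = np.linalg.lstsq(phi, y, rcond=None)
    var = max(float(np.mean((y - phi @ a) ** 2)), 1e-6)  # floor is an assumption
    return a, var

def neg_log_lik(image, a, var):
    """Per-pixel -ln p(y_ij; theta_k) under the ASSUMED Gaussian residual model."""
    rows, cols = np.indices(image.shape)
    pred = (design_matrix(rows.ravel(), cols.ravel()) @ a).reshape(image.shape)
    return 0.5 * np.log(2 * np.pi * var) + (image - pred) ** 2 / (2 * var)

def segment(image, init_mask, max_iter=50):
    """Alternate parameter estimation and relabeling until the partition is stable.

    Returns (mask, J), where mask is True for class 1 and J is the summed cost.
    """
    image = image.astype(float)
    mask = init_mask.copy()
    cost = np.inf
    for _ in range(max_iter):
        if mask.sum() < 4 or (~mask).sum() < 4:
            return mask, np.inf  # a class is too small to fit four parameters
        c1 = neg_log_lik(image, *fit_class(image, mask))
        c2 = neg_log_lik(image, *fit_class(image, ~mask))
        new_mask = c1 <= c2                      # each pixel takes the cheaper class
        cost = float(np.minimum(c1, c2).sum())   # value of J for that relabeling
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return mask, cost

def random_line_mask(shape, rng):
    """Hypothetical straight-line initial partition through a random interior point."""
    rows, cols = np.indices(shape)
    angle = rng.uniform(0, np.pi)
    r0 = rng.uniform(0.25, 0.75) * shape[0]
    c0 = rng.uniform(0.25, 0.75) * shape[1]
    return (rows - r0) * np.cos(angle) + (cols - c0) * np.sin(angle) > 0

def best_of_restarts(image, n_starts=20, seed=0):
    """Run several random starts and keep the smallest local minimum of J."""
    rng = np.random.default_rng(seed)
    runs = [segment(image, random_line_mask(image.shape, rng)) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[1])
```

For example, `best_of_restarts(image, n_starts=20)` mirrors the "20 local minima" setting mentioned above; the selection rule here uses only the value of $J$ and ignores any other candidate selection criteria.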
Two methods are used to generate those twenty initial partition candidates. In the straight-line partition method, the area of the original texture image is partitioned by an arbitrarily generated straight line across the whole image area. The different areas separated by the straight line represent different classes. In many cases, the randomly generated straight-line partitions are good enough to obtain the desired initial partition, but in many other cases they do not work well. In order to obtain a good initial partition as quickly as possible, the predefined template method is also
used to generate the initial partitions. Eight predefined templates are selected as candidates in the selection of the desired initial partition.

Another important issue about the initial partition is how to select the best one among those candidates. The criteria for evaluating the candidates involve two aspects. One is the local minimum, and the other is the standard deviation of each class within a texture image. Two candidates are chosen: the one with the lowest local minimum and the one with the lowest standard deviation. Then, the global minima of these two candidates are computed and the one with the lower global minimum is chosen as the final partition.

3. Query Strategy

After the segmentation of each texture image, a set of parameters for each image is obtained automatically. Some of these parameters are selected for query use. Since the proposed segmentation method uses functions of the spatial coordinates of the pixels as the mathematical description of a class, the parameters related to spatial information should be able to represent the spatial distribution features of textures.

Parameter $a_{k0}$: After the segmentation, each pixel within a texture has its class identification. For example, the class identification for each pixel is either 1 or 2 when there are two classes. As mentioned earlier, each class is parameterized by a vector of parameters $(a_{k0}, \ldots, a_{k3})^T$. In other words, this parameter vector contains not only the spatial distribution information of the texture, but also the information of the intensity values within that class. Furthermore, among the four parameters in the vector, $a_{k0}$ is usually far larger than the other three. Therefore, given that the number of classes is 2, two $a_{k0}$ parameters (one for each class) are obtained for each texture.

Parameter $Cov$: It is the covariance matrix of the matrix $coord_k$ ($k$ = 1 or 2). This parameter represents the spatial distribution pattern of each class.

$$Cov = (Cov_1, Cov_2)$$

$$Cov_k = \frac{coord_k \, coord_k^T}{N_k}$$

$$coord_k = (I_k, J_k)^T - m_k \mathbf{1}_{N_k}^T$$

where $I_k$ and $J_k$ are column vectors with each row being the $i$ coordinate and the $j$ coordinate of a pixel in $c_k$, respectively. Here, $m_k$ is a column vector with 2 elements representing the means of the $i$ coordinates and the $j$ coordinates of $c_k$, $N_k$ is the number of pixels in $c_k$, and $\mathbf{1}_{N_k}$ is a unity vector of $N_k$ elements (i.e., all of them have the value 1).

Parameter $Var\_Mean$: During the process of segmentation, low-level features such as the variance and mean value of each class can also be obtained without any excessive computation cost. Since the texture image is well segmented after the segmentation phase, using the low-level features of each class as the query criteria is expected to achieve good query results.

[Figure 1. The multi-filter query architecture. Diagram elements: Segmentation; Texture Image Database; Example Query Image; AK Filter; Covariance Filter; Var-Mean Filter; Query results, displayed by rankings in decreasing order.]

Since we use the Euclidean distance for comparing two feature vectors, the smaller the dimension of the vector is, the better the performance is. Notice that each texture image has only two $a_{k0}$ parameters. Though the information included in parameter $a_{k0}$ may not be enough to achieve good query results by itself, the overall computation cost can be reduced significantly if it is used as the first-level filter in the query strategy. Hence, a multi-filter query mechanism is developed in the proposed framework. Figure 1 shows the architecture of the multi-filter query strategy. As can be seen from this figure, the multi-filter query mechanism includes the $a_{k0}$ (AK) filter, the Covariance filter, and the Var-Mean filter. The idea is to use the spatial distribution information obtained through segmentation to filter out the biased texture images, and to use the classified $Var\_Mean$ to rank the retrieved images.

The ranking of the retrieved texture images is relatively simple. The sum of the weighted Euclidean distances on the $Var\_Mean$ for each class and on the overall $Var\_Mean$ between the query image and the retrieved image is used to determine the ranking.
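The cascade of AK, Covariance, and Var-Mean filters can be sketched as follows. This is a rough illustration under stated assumptions rather than the framework's actual code: the feature layout (a dictionary with keys `ak0`, `cov`, and `var_mean`), the tolerance parameters `ak_tol` and `cov_tol`, and the function names are hypothetical, and the paper does not say how the filter cut-offs are chosen. The covariance helper follows the definition of the spatial covariance of the pixel coordinates of one class.

```python
import numpy as np

def class_covariance(mask):
    """2x2 covariance of the (i, j) coordinates of the pixels in one class."""
    coords = np.stack(np.nonzero(mask)).astype(float)        # shape (2, N_k)
    centered = coords - coords.mean(axis=1, keepdims=True)   # subtract m_k
    return centered @ centered.T / coords.shape[1]

def multi_filter_query(query, database, ak_tol, cov_tol, weights):
    """Filter by ak0, then by covariance, then rank survivors by Var-Mean distance.

    Each feature record is a dict (hypothetical layout):
      'ak0'      -> array of shape (2,),      one a_k0 per class
      'cov'      -> array of shape (2, 2, 2), one 2x2 matrix per class
      'var_mean' -> array of shape (3, 2),    (variance, mean) for class 1,
                    class 2, and the whole image
    Returns a list of (name, score) pairs, best match first.
    """
    # First-level filter: cheap comparison on the two a_k0 values.
    survivors = [name for name, feat in database.items()
                 if np.linalg.norm(feat["ak0"] - query["ak0"]) <= ak_tol]
    # Second-level filter: spatial distribution pattern of each class.
    survivors = [name for name in survivors
                 if np.linalg.norm(database[name]["cov"] - query["cov"]) <= cov_tol]

    def score(name):
        diff = database[name]["var_mean"] - query["var_mean"]
        per_group = np.sqrt(np.sum(diff ** 2, axis=1))  # one Euclidean distance per row
        return float(np.sum(weights * per_group))       # weighted sum of distances

    return sorted(((name, score(name)) for name in survivors), key=lambda pair: pair[1])
```

A smaller score means a closer match, so the first entry of the returned list corresponds to Rank 1. Placing the two-element `ak0` comparison first reflects the point that low-dimensional distances are the cheapest way to discard most candidates.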
The weights are derived from the experimental results.

4. Test Results and Discussions

4.1. Image Retrieval Results

In order to test the performance of the proposed framework, 318 natural texture images, mostly obtained from the MIT VisTex Texture database and the Brodatz database, are used. For the images from Brodatz, we partition each of the 512 × 512 images into 6 subimages (with overlap). Each texture image is of size 240 rows by 180 columns. In the proposed framework, the similarity query is used. An example of the query looks like "Show me more texture images which are similar in texture patterns to the query image."
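The subimage extraction for the Brodatz images can be illustrated with a short sketch. The text only states that each 512 × 512 image yields 6 overlapping subimages of 240 rows by 180 columns; the 2 × 3 grid with evenly spaced window origins used below is an assumption, as are the function names.

```python
import numpy as np

def tile_starts(length, window, count):
    """Evenly spaced window origins that cover [0, length) with `count` windows."""
    return np.linspace(0, length - window, count).round().astype(int)

def split_into_subimages(image, sub_shape=(240, 180), grid=(2, 3)):
    """Cut an image into grid[0] * grid[1] subimages of size sub_shape.

    With a 512 x 512 input, the ASSUMED 2 x 3 grid gives row origins (0, 272)
    and column origins (0, 166, 332), so neighboring columns overlap by 14 pixels.
    """
    row_starts = tile_starts(image.shape[0], sub_shape[0], grid[0])
    col_starts = tile_starts(image.shape[1], sub_shape[1], grid[1])
    return [image[r:r + sub_shape[0], c:c + sub_shape[1]]
            for r in row_starts for c in col_starts]
```

Under this layout the overlap occurs only between horizontally adjacent subimages, because three 180-column windows do not fit side by side in 512 columns.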
[Figure 2. Texture query results after the segmentation. (a) Query results for query image texture208. (b) Segmentation results. Example query image texture208 is on the top left. Matches of the images are listed from top left to bottom right in decreasing order of their similarities. Panels and ranks: texture208 (query); texture210, Rank 1: 0.07844; texture215, Rank 2: 0.27975; texture214, Rank 3: 0.33361; texture17, Rank 4: 0.383.]

Figures 2(a)-(b) show the query results for example query image texture208, which is an image from the MIT VisTex database. Figure 2(a) shows the first four original texture images being retrieved. The example query image texture208 is on the top left, and the matches are listed from top left to bottom right in decreasing order of their similarities. The corresponding ranks of the matches are also given below the name of each original texture image, as shown in Figure 2(a). The rank indicates how similar a match is to the example query image; a smaller value means a closer match. Figure 2(b) shows the segmentation results of the texture images in Figure 2(a).

From the observations of the segmentation results, we can see that the texture pattern of image texture210 is the closest to the query image texture208. The spatial distributions within each class are very similar to each other, as are the variance and mean features. As for the retrieved images texture215 and texture214, they have spatial distributions in texture patterns similar to that of the query image, but their variance and mean distributions are quite different from those of the query image. In addition, the spatial distribution of image texture17 looks close to that of the query image, but not as close as those of texture215 and texture214. Another observation is that since texture210 is the closest to the query image, its corresponding rank value is almost four times smaller than that of image texture215 (0.07844 versus 0.27975), which is significant enough to represent its high similarity with the query image.

[Figure 3. Texture query results after the segmentation. (a) Query results for query image texture301. (b) Segmentation results. Example query image texture301 is on the top left. Matches of the images are listed from top left to bottom right in decreasing order of their similarities. Panels: texture301 (query), texture304, texture302, texture303, texture306, texture305, texture307, texture309.]
Figure 3 shows another set of query results, for example query image texture301, which comes from the Brodatz database. The recall number is [illegible]. It is clear that the top matches include all the subimages that come from the same original image as the query image.

By analyzing the query results for the example query images, it is very promising to see that the proposed framework for texture segmentation and query can reasonably retrieve those texture images that have texture patterns similar to the example query image. Moreover, since the proposed segmentation method is an unsupervised simultaneous partition and class parameter estimation algorithm, all the needed feature parameters can be obtained automatically and indexed offline without any user interaction. In the experiments, the accuracy of the segmentation results for texture images exceeds 85 percent. In addition, the use of multiple filters (the AK, Covariance, and Var-Mean filters) greatly reduces the number of retrieved images at each step, which is essential to reduce the computation cost and obtain quick answers for the issued queries. For example, when texture208 is used as the example query image, the number of retrieved images drops by over 70 percent after the AK filter.

5. Conclusion and Future Work

In this paper, an unsupervised segmentation framework for texture image queries was proposed. By using a novel and effective segmentation method, a set of feature parameters for each class within an image is extracted automatically without any user intervention. Based on these feature parameters, the proposed framework supports texture image queries effectively. Moreover, a multi-filter mechanism is used in the query procedure to greatly reduce the number of image candidates and, at the same time, reduce the query processing time. Furthermore, applying the segmentation method to partitioning natural images also gives good results. One of the potentials of the proposed segmentation method is that it can also deal with the situation of multiple classes (more than two). The idea is to consider the number of classes as another random variable.
Our future work will focus on generalizing the proposed framework to handle the cases when the number of classes is more than two, so that it can partition the image more reasonably and precisely, which is essential to the accuracy of the queries.

References

[1] J. M. Beaulieu and M. Goldberg, Hierarchy in picture segmentation: A stepwise optimization approach, IEEE Trans. on PAMI, 11(2):-163, February 1989.
[2] R. Chellappa and A. Jain, Markov Random Fields: Theory and Applications, New York: McGraw-Hill Book Company, 1993.
[3] S.-C. Chen, S. Sista, M.-L. Shyu, and R. L. Kashyap, Augmented Transition Networks as Video Browsing Models for Multimedia Databases and Multimedia Information Systems, 11th International Conference on Tools with Artificial Intelligence (ICTAI 99), pp. 175-182, Nov. 1999.
[4] M. Flickner et al., Query by image and video content: The QBIC system, IEEE Computer, pp. 23-32, Sept. 1995.
[5] R. M. Haralick, Statistical and structural approaches to texture, Proceedings of IEEE, vol. 67, pp. 786-804, 1979.
[6] W. Y. Ma and B. S. Manjunath, Texture Features and Learning Similarity, Proc. IEEE International Conference on Computer Vision and Pattern Recognition, San Francisco, CA, pp. 425-430, June 1996.
[7] D. Nair and J. K. Aggarwal, A focused target segmentation paradigm, 4th European Conference on Computer Vision, vol. 1, pp. 579-588, Cambridge, UK, April 1996.
[8] T. Pavlidis, Structural Pattern Recognition, Springer-Verlag, 1991.
[9] K. Price, Image segmentation: A comment on studies in global and local histogram-guided relaxation algorithms, IEEE Trans. on PAMI, 6(2):247-249, March 1984.
[10] M. Spann and A. Grace, Adaptive segmentation of noisy and textured images, Pattern Recognition, 27(12):1717-1733, December 1994.
[11] http://www-white.media.mit.edu/vismod/imagery/VisionTexture/vistex.html
[12] http://www.ux.his.no/~tranden/brodatz.html