Feature Extraction and Selection for Image Retrieval


Xiang Sean Zhou, Ira Cohen, Qi Tian, Thomas S. Huang
Beckman Institute for Advanced Science and Technology
University of Illinois at Urbana-Champaign, Urbana, IL 61801
{xzhou, iracohen, qitian, huang}@ifp.uiuc.edu

ABSTRACT

In this paper the feature extraction process is analyzed and a new set of edge features is proposed. A revised edge-based structural feature extraction approach is introduced. A principal feature selection algorithm is also proposed for new-feature analysis and feature selection. The results of the PFA are tested and compared to the original feature set, to random selections, and to those from Principal Component Analysis and multivariate linear discriminant analysis. The experiments showed that the proposed features perform better than wavelet moments for image retrieval in a real-world image database, and that the features selected by the proposed algorithm yield results comparable to the original feature set and better than random sets.

Keywords: feature extraction, feature selection, content-based image retrieval, principal component analysis, discriminant analysis.

1. INTRODUCTION

The image retrieval problem can, in some cases, be regarded as a pattern classification problem, where each image is assumed, as ground truth, to belong to a specific class. Query-by-example then amounts to finding the class and returning the images within that class. Usually, however, no class labels are available, so this can only be achieved by comparing the similarity among images. Applications include medical image database retrieval, aerial/satellite image analysis and retrieval, etc., where the user is interested in pre-defined targets such as tumors or vehicles. In other cases, however, the assumption of fixed, static class-membership assignments does not hold. One example is trademark database retrieval: even though trademarks are categorized (by hand!) according to their primary component, a new trademark design will be rejected if it is believed to cause confusion with an existing one, regardless of the category assignments.
Another example can be illustrated by the image in Figure 1: as a query image submitted by the user, it is possible that the user is looking for cars, or for autumn, or for a house, etc. In general this is the case for every image in the database, i.e., each image has multiple possible labels, or class memberships. To complicate the issue further, different users can interpret the same image differently and thus assign different labels in their minds, making training or pre-classification much more difficult. We refer to the retrieval process in this case as dynamic matching.

Figure 1: An image of multiple objects

In this paper we discuss the feature extraction and selection process with the aforementioned characteristics of the image retrieval problem in mind. Especially for feature selection, which is defined as the process of selecting a subset from a set of original features, different cases will lead to different inputs and algorithms, which in turn will give different subsets of features. Without class labels, Principal Component Analysis (PCA), or a novel feature selection algorithm, Principal Feature Analysis (PFA), can be applied to transform the features to a reduced space, or to select the subset of the features that contains the most information, respectively. PFA is introduced in Section 3. For the image classification case, when some class labels are available, multivariate linear discriminant analysis (MLDA) is applied to transform the features into the Most Discriminating Feature space. The experimental results are compared in Section 4.

For the dynamic matching problem, the contribution from each feature component would vary from time to time if discriminant analysis were conducted for different labeling schemes for the images. So the best way is dynamic on-line weighting of the features through relevance feedback []. If a reduced feature space is desired, it can only be derived from the results of sufficiently many trials of the relevance feedback process, so that a feature is deleted or ignored to a great extent only if it does not contribute in all or most of the dynamic weighting schemes.
(See Figure 3.) Feature extraction is the process of creating a representation for, or a transformation from, the original data. There has been a philosophical difference in attacking the problem of effective and efficient feature extraction. One is the human-perception-centered approach, i.e., based on human visual perception and

psychological experiments: compute the measurements of certain perception-based features (e.g., smoothness for texture), and then select the best mathematical representation for them. The other is the machine-centered approach, in which a unified computing scheme is selected for the extraction of certain ad hoc features (e.g., co-occurrence features, or wavelet moments for texture). The emphasis is first put on the computing aspect, to provide an efficient algorithm to compute the numbers; then the effectiveness, or the correlation between the numbers and human perception, is established by experiments. Examples of the former include Tamura texture features [2], color quantization based on human perception, etc. A glance at Appendix I of Haralick [1] will give a taste of the latter approach; so will feature extraction from invisible bands for hyper-spectral image analysis, where there is no input to the human perception system at all.

The perception-centered approach has obvious advantages in serving the master, or user, of the image retrieval system (the human), given that matching human performance is the ultimate goal of the system. But in other cases, where human perception is not the target and does not perform well, while the computer sees something the human does not, the machine-centered approach is more appropriate. Both approaches have been used for color and texture feature extraction in the literature. But in terms of edge-based structure information, much less attention has been paid in either direction. (We define structural features as features in between texture and shape. Texture captures the spatial distribution of illuminance variations in the special form of repeating patterns. Shape represents a specific edge feature that is related to object contour. Neither pays attention to information represented in non-repeating illuminance patterns, or, more specifically, edge patterns in general, which is the category that edge-based structural features are defined to represent.)
There have been attempts to use perception-based edge features for city/building image retrieval, i.e., to use the features that humans use for recognizing buildings: edge direction, co-termination (L or U junctions), parallel lines [7], etc. We propose a machine-centered approach for general-purpose edge-based structural feature extraction. To illustrate the motivation, consider the edge maps in Figure 2. In none of the three edge maps can shape be easily extracted, yet a human subject can tell that the top two are the Lena image and a building image. This is not the case for the fox edge map at the bottom. But what can a computer do? Not much, if understanding of the content (i.e., automatic segmentation and recognition of the object) is the goal. But the goal here is image retrieval and matching only, i.e., to find a representation which gives similar numbers whenever two images have similar content, and vice versa. The question then becomes how to effectively and efficiently represent the information carried in an edge map. In Section 2 we propose a revised version of our previously proposed water-filling algorithm [7] to extract more features from an edge map for the purpose of image matching and retrieval. (Note that this can work only to the extent that the invariance property required by the retrieval task is satisfied in the corresponding edge maps.) We also used the proposed Principal Feature Analysis algorithm to test whether the new features carry enough new information.

Figure 2: Edge maps

The scope of this paper is depicted in Figure 3, where the modules marked with a * are the emphasis of this paper.

Figure 3: Scope of this paper. Original image -> feature extraction* (color, texture, edge) -> feature transformation/selection* and dynamic weighting. Perception-centered examples: perception-based color quantization, Tamura texture features, perception-based edge features. Machine-centered examples: histograms, moments, wavelet moments, co-occurrences, water-filling features*. Without labels: PCA, PFA*; with labels: discriminant analysis*; dynamic labels: dynamic weighting.
2. EDGE-BASED STRUCTURE FEATURE EXTRACTION

In this paper we use the water-filling algorithm to extract edge/structural features. The advantages of this algorithm include efficiency (it is a linear-time algorithm) and effectiveness (multiple features corresponding to different human perceptions can be extracted simultaneously).

2.1 The Water-Filling Algorithm

We propose an algorithm to extract features from the edge map directly, without edge linking or shape representation. The idea

is to look for measures of edge length and of edge structure and complexity via a very efficient graph traversal algorithm. In this paper we use 4-connectivity for simplicity. The algorithm also assumes that a thinning operation has been performed on the edge map, so that all the edges are one pixel wide.

To illustrate the algorithm, first consider the simple case of an 8-by-8 edge map with all the edge pixels connected (Fig. 4: shaded pixels are edge pixels). The algorithm does a first raster scan of the edge map and starts a traversal (think of it as filling in water) at the first edge pixel encountered that has fewer than two neighbors, i.e., it starts at an end point. In Fig. 4 the pixel with label 1 is the first end point encountered. (In case all edge pixels have at least two neighbors, i.e., there are no end points but only loops, e.g., Fig. 6, the traversal starts at the first unfilled edge pixel during a second scan; two raster scans are necessary to avoid possible misses.) The waterfront then flows along the edges in the order indicated by the numbers. Note that when there is more than one possible path to take at a point, the waterfront forks (e.g., at pixels 6 and 9 in Fig. 5).

Figure 4 and Figure 5: example edge maps, with pixels numbered in filling order

One can see that this algorithm can be regarded as a simulation of the flooding of connected canal systems (i.e., connected edges), hence the name "water-filling algorithm". The assumptions implied by the algorithm are: 1) we have an unlimited water supply at the starting pixel; 2) water flows at a constant speed in all directions; 3) the waterfront stops only at a dead end, or at a point where all possible directions have already been filled. When there is more than one set of connected edges in the edge map (e.g., the case in Fig. 5), the algorithm fills all the sets independently, sequentially or in parallel. As water fills the canals (edges), various information is extracted and stored as feature primitives. Feature vectors can then be constructed from these feature primitives.
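The traversal just described can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation (function and variable names are ours): a breadth-first flood over a thinned, 4-connected binary edge map that fills one set of connected edges from a chosen start pixel and records two of the primitives defined below, the filling time and the water amount.

```python
from collections import deque

def water_fill(edge, start):
    """Flood one set of connected edges from `start` on a binary map
    (list of lists, 1 = edge pixel). Returns (filling_time, water_amount).
    A sketch of the idea only; fork/loop counting is omitted."""
    h, w = len(edge), len(edge[0])
    dist = {start: 0}              # time step at which each edge pixel is filled
    front = deque([start])
    while front:
        y, x = front.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-connectivity
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and edge[ny][nx] and (ny, nx) not in dist:
                dist[(ny, nx)] = dist[(y, x)] + 1           # waterfront advances
                front.append((ny, nx))
    filling_time = max(dist.values())   # steps until the last pixel is reached
    water_amount = len(dist)            # number of edge pixels filled
    return filling_time, water_amount
```

For instance, a straight 5-pixel horizontal edge filled from one end gives a filling time of 4 and a water amount of 5; counting forks would additionally track how many unfilled neighbors each dequeued pixel expands into.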
The time complexity of this algorithm is linear, proportional to the number of edge points in the image.

2.2 Feature Extraction

2.2.1 Feature Primitives

We propose the concept of feature primitives, which are defined as the quantities associated with or calculated from an image that can serve as bases for constructing feature vectors, often through their statistics or entropies. Feature primitives can be used as feature vectors directly as well, but often they are not compact enough. For example, co-occurrence matrices are the feature primitives for the co-occurrence texture features, most of which are moments, correlations, and entropies [1]; and wavelet transform coefficients can be regarded as feature primitives for wavelet-based texture features such as wavelet moments. In our case, we propose the following quantities as structural feature primitives:

1). Filling time: Filling time is the time for water to fill a set of connected edges. For Figs. 4 through 6, the filling times are {}, {4, , }, and {}, respectively. Using different starting pixels, the filling time can vary in a range [t, 2t], where t is the minimum filling time over all possible selections of the starting pixel. This is easily proved as follows: denote by S the starting pixel that gives the minimum filling time t. Since we assume water runs in both directions along an edge, water can reach S from any other starting pixel, say P, within time t, and then the waterfront can go from S to reach any pixels left unfilled within time t; so the filling time from P is <= 2t. To minimize the variation in filling time due to the selection of starting pixels, we can impose additional constraints on the selection (e.g., choose only end points), or choose different starting pixels and average the results. To achieve scaling invariance, normalize the filling time by the image size, for example by dividing it by (width + height).

2). Fork count: Fork count is the total number of branches the waterfront has forked into during the filling of a set of edges.
If we consider the initial waterfront as one branch to start with, then the fork counts for Figs. 4 through 6 are {3}, {, 3, }, and {6}, respectively. If we choose an end pixel as the starting pixel whenever possible, the fork count is almost invariant to the starting-pixel selection; the variation is within ±1, depending on whether the water starts from the middle of an edge or from an end. Also, if multiple waterfronts collide at one intersection, even though the water does not actually fork there, the potential forks should be counted to achieve this near-invariance (e.g., an extra fork is counted both at the upper 9 and the lower 10 in Fig. 6, but none at , since it is not a potential fork point in any way).

3). Loop count: Loop count is the number of simple loops (or simple cycles, as defined in Cormen et al., 1997, p. 88 [3]) in a set of connected edges. For example, in Figs. 4 through 6, the loop counts are {}, {0, , 0}, and {3}, respectively. Loop count is invariant to rotation.

Figure 6 and Figure 7: an edge map whose edges form loops, and its corresponding graph

To get the loop count during the water-filling process, we make use of the following "theorem of splashes":

If we assume that when two waterfronts collide we see one "splash", or, more generally, that when n waterfronts collide at one intersection we see n-1 splashes (think of it as n-1 waterfronts colliding with the first waterfront sequentially), then the number of splashes equals the number of simple loops. For example, in Fig. 6 three splashes are recorded; hence the loop count is 3.

Proof: Treat the set of connected edges as a graph: regard the starting point, the fork points, and the collision/splash points as the nodes of the graph, and the water branches as its edges. For example, the graph corresponding to Fig. 6 is shown in Fig. 7. The water-filling process is then a traversal of the graph, and the total number of splashes is the total number of revisits to nodes, which is the number of simple cycles/loops in the graph.

The above theorem provides a way of recording loop counts within the water-filling process with very little overhead computation.

4). Water amount: Water amount is the total amount of water used to fill up the set of edges, in number of pixels; i.e., it is the edge pixel count. In Figs. 4 through 6, the water amounts are {8}, {4, 8, }, and {9}, respectively.

5). Horizontal (vertical) cover: Horizontal (vertical) cover is the width (height) of the rectangular bounding box of the set of edges. In Figs. 4 through 6, the horizontal covers are {}, {6, , }, and {7}, respectively.

6). Longest horizontal (vertical) flow: Longest horizontal (vertical) flow is the longest horizontal (vertical) edge in the set of connected edges. For Figs. 4 through 6, the longest vertical flows are {6}, {8, 6, }, and {6}, respectively.

Note that there exist many other possibilities for selecting the feature primitives. The final selection should depend on the specific application, i.e., on what information is important and most discriminative for the classification.

2.2.2 Edge/structural feature formation

Based on the feature primitives in Sec. 2.2.1, we can construct edge/structural features from their statistics.
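As a toy illustration of such aggregation (all values and names below are invented, not taken from the paper's figures), forming joint features from per-edge-set primitive lists is simple bookkeeping:

```python
# Hypothetical primitives for three connected edge sets of one edge map.
filling_times = [4, 2, 5]
fork_counts   = [1, 3, 2]
loop_counts   = [0, 1, 0]

# MaxFillingTime and its associated ForkCount: pick the edge set with the
# longest filling time, report (its filling time, its fork count).
i = max(range(len(filling_times)), key=filling_times.__getitem__)
mft_fc = (filling_times[i], fork_counts[i])

# MaxForkCount and its associated FillingTime (possibly a different edge set).
j = max(range(len(fork_counts)), key=fork_counts.__getitem__)
mfc_ft = (fork_counts[j], filling_times[j])

# GlobalLoopCount and MaxLoopCount over all edge sets.
glc_mlc = (sum(loop_counts), max(loop_counts))
```

With these invented primitives, `mft_fc` is `(5, 2)`, `mfc_ft` is `(3, 2)`, and `glc_mlc` is `(1, 1)`; histogram features (FTH&FC, WAH) would bin the same lists instead of taking extrema.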
For example: moments (e.g., average filling time); order statistics (e.g., maximum loop count); distributions (e.g., water-amount histogram); etc. In the following we discuss some examples, with emphasis on their meanings from a human-perception point of view.

1). MaxFillingTime and the associated ForkCount (MFT&FC): MaxFillingTime (MFT) is defined as max{filling times}. For Figs. 4 through 6, the MFT is , 4, and , respectively, and the associated ForkCount (FC) is 3, , and 6, respectively. So the MFT&FC vectors for Figs. 4 through 6 are (, 3), (4, ), and (, 6), respectively. MFT&FC are features most probably associated with a salient object (the one with the longest edge) in the image. The MFT conveys a rough measure of the size (edge length) of this object, while the associated FC gives a measure of the complexity of its structure (complexity of the edges).

2). MaxForkCount and the associated FillingTime (MFC&FT): Defined similarly to MFT&FC, these are also features most probably associated with a salient object in the image. The MFC conveys a rough measure of the complexity of the object, which may or may NOT be the same object as the previous one. For Figs. 4 and 6, the MFC&FT is the same as the MFT&FC; but for Fig. 5, the MFC&FT vector is (3, ).

3). GlobalLoopCount and MaxLoopCount (GLC&MLC): GlobalLoopCount is defined as sum{loop counts}, and MaxLoopCount as max{loop counts}. This feature vector can capture structural information such as the windows in building images, or can be used toward character detection and recognition applications.

4). FillingTime Histogram and the associated average ForkCount within each bin (FTH&FC): This is a global feature over all sets of connected edges in the edge map. It represents the edge map by the distribution of edge lengths. Noise or a changing background with short edges may affect only part of the histogram, leaving the portion depicting the salient objects unchanged. Thus, by proper weighting of the components (e.g., by relevance feedback []), we can achieve robust retrieval.

5). WaterAmount Histogram (WAH): This is also a global feature with multiple components.
It is another measure of the distribution of edge length or density. Note again that there can be many other ways to construct feature vectors, such as the moments of the filling times, fork counts, loop counts, or water amounts, etc.

3. FEATURE TRANSFORMATION AND SELECTION

Figure 8: the 1st PC direction and the most discriminating projection direction

For any handcrafted feature set, data-dependent analysis can always be carried out to select an optimal projection or transformation that best represents the information carried in the original feature set, or that reduces the dimensionality with minimum information loss. When there are no labels for the data, principal component analysis [9] gives the best linear transform of the original feature set in terms of reconstruction error. If class labels are (partially) available,

then multivariate linear discriminant analysis (MLDA) [] gives the best linear transformation in terms of maximizing the ratio of inter-class scatter to in-class scatter, i.e., the MLDA vectors have the most discriminating power among all linear transformations. The resulting transformations of PCA and MLDA are in general different: Figure 8 shows the data scatter of the whole dataset and of two classes as three ellipses, with the two projection directions drawn as dotted lines.

Both of the above techniques require the original feature set for the calculation of the transformed features. If computation is an issue in the feature extraction process, or if some of the original feature components carry special meaning and the user wants to keep them unchanged while dimension reduction is still desired, then a feature selection technique is needed. [8] gives a detailed comparison of feature selection algorithms for labeled data. Here we propose a principal feature selection algorithm for unlabeled data, i.e., an algorithm that outputs the subset of the features that best represents the original data.

3.1 Principal Component Analysis (PCA)

Principal components are the projections of the original features onto the eigenvectors corresponding to the largest eigenvalues of the covariance matrix of the original feature set. Principal components provide a linear representation of the original data using the least number of components, with the mean-squared error minimized. (See [9] for details.)

3.2 Principal Feature Analysis (PFA)

Let X be a zero-mean n-dimensional random feature vector and let Σ be the covariance matrix of X (which could be in correlation form as well). Let A be a matrix whose columns are the orthonormal eigenvectors of Σ, computed using the singular value decomposition of Σ:

    Σ = A Λ A^T,  where Λ = diag(λ_1, ..., λ_n), λ_1 ≥ λ_2 ≥ ... ≥ λ_n, and A^T A = I_n.    (1)

Let A_q be the first q columns of A, and let V_1, V_2, ..., V_n ∈ R^q be the rows of the matrix A_q.
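Equation (1) is the standard eigendecomposition of the covariance matrix; as a quick numerical sanity check (our own illustration, on made-up data), both the reconstruction Σ = A Λ A^T and the orthonormality A^T A = I_n can be verified directly:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))   # 500 samples of a 4-dim feature vector
X -= X.mean(axis=0)                 # zero-mean, as the derivation assumes
Sigma = np.cov(X, rowvar=False)

lam, A = np.linalg.eigh(Sigma)      # symmetric eigendecomposition (ascending)
order = np.argsort(lam)[::-1]       # reorder so that lambda_1 >= ... >= lambda_n
lam, A = lam[order], A[:, order]

# Eq. (1): Sigma = A Lambda A^T, with orthonormal eigenvectors A^T A = I_n
assert np.allclose(A @ np.diag(lam) @ A.T, Sigma)
assert np.allclose(A.T @ A, np.eye(4))
```

`numpy.linalg.eigh` returns eigenvalues in ascending order, hence the explicit reordering to match the convention of equation (1).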
The vector V_i corresponds to the i-th feature (variable) in the vector X, and the coefficients of V_i correspond to the weights of that feature on each axis of the subspace. Features that are highly correlated will have similar absolute-value weight vectors (the absolute value is necessary since changing the sign of one variable changes the signs of the corresponding weights but has no statistical significance [9]). In order to find the best subset, we use the structure of these rows to first find the features which are highly correlated with each other, and then choose from each group of correlated features the one which represents that group optimally in terms of high spread in the lower dimension, reconstruction, and insensitivity to noise. The algorithm can be summarized in the following five steps:

Step 1: Compute the sample covariance matrix, or use the true covariance matrix if it is available. In some cases it is preferable to use the correlation matrix instead of the covariance matrix, defined as the n x n matrix whose (i, j)-th entry is

    ρ_ij = E[x_i x_j] / sqrt(E[x_i^2] E[x_j^2])

This representation is preferred in cases where the features have very different variances from each other, since using the regular covariance form would cause the PCA to put very heavy weights on the features with the highest variances. (See [9] for more details.)

Step 2: Compute the principal components and eigenvalues of the covariance/correlation matrix, as defined in equation (1).

Step 3: Choose the subspace dimension q and construct the matrix A_q from A. q can be chosen by deciding how much of the variability of the data is to be retained; the retained variability can be computed as

    Variability retained = ( Σ_{i=1..q} λ_i / Σ_{i=1..n} λ_i ) × 100%

Step 4: Cluster the vectors V_1, V_2, ..., V_n ∈ R^q into p ≥ q clusters using the K-means algorithm, with the Euclidean distance as the distance measure. The vectors are clustered into p clusters and the mean of each cluster is computed. This is an iterative stage which repeats until the p clusters are found and no longer change.
The reason for choosing p greater than q in some cases is that, if the same retained variability as with PCA is desired, a slightly higher number of features is needed (usually 1-2 more are enough).

Step 5: In each cluster, find the vector V_i which is closest to the mean of the cluster, and choose the corresponding feature x_i as a principal feature. This yields a choice of p features. The reason for choosing the vector nearest to the mean is twofold: this feature can be thought of as the central feature of that cluster, the one most dominant in it, and the one which holds the least redundant information about features in other clusters. Thus it satisfies both of the properties we want to achieve: a large spread in the lower-dimensional space, and a good representation of the original data. Note that if a cluster has just two components in it, the chosen feature will be the one

which has the highest variance of its coefficients, since this means that it has better separability power in the subspace. By choosing the principal features with this algorithm, we choose the subset that represents the entire feature set well, both in terms of retaining the variations in the feature space (by means of the clustering procedure) and in terms of keeping the prediction error at a minimum (by choosing the feature whose vector is closest to the mean of its cluster). (See [9] for more details.)

3.3 Multivariate Linear Discriminant Analysis (MLDA)

Assume that S_w is the in-class scatter matrix and S_b is the inter-class scatter matrix; then the most discriminating projection directions are the eigenvectors of S_w^{-1} S_b associated with the largest eigenvalues. (For details see [].)

4. Experiments and Comparisons

The first set of experiments is designed to test the performance of the water-filling features against the widely used wavelet moments as texture measures. The water-filling features used in these experiments are MFT&FC (see Section 2 for definitions), MFC&FT, GLC&MLC, and FTH&FC (7 bins), a total of 8 feature components per image. Euclidean distance is used as the distance measure. The dataset from Corel consists of 769 images, 400 of which are airplanes and 00 of which are American eagles. Table 1 shows the comparison in terms of the average number of hits in the top 10, 20, 40, and 80 returns, for 00 airplanes and 00 eagles as the query images, respectively.

Table 1. Water-filling (WF) versus wavelet variances (WV)

Airplanes   Top 10   Top 20   Top 40   Top 80
WF          3.600    6.900    0.900    8.0300
WV          3.300    .700     9.9400   7.0700

Eagles      Top 10   Top 20   Top 40   Top 80
WF          .638     3.369    4.93     6.788
WV          .9808    .873     4.437    6.769

Principal feature analysis is applied to the joint feature set, i.e., the union of the water-filling features and the wavelet moment features. For the three datasets, 1, 2, and 3 pairs of features from different sets, respectively, are clustered together, all with low correlation coefficients (< 0.), which indicates that the water-filling features contain mostly information that is new relative to what is already expressed by the wavelet moments.
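The five-step PFA procedure of Section 3.2 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: all names are ours, a plain K-means is inlined to keep the sketch self-contained, and clustering the absolute-value weight vectors is our reading of the absolute-value remark in Section 3.2.

```python
import numpy as np

def kmeans(rows, p, iters=100, seed=0):
    """Plain K-means with Euclidean distance (Step 4)."""
    rng = np.random.default_rng(seed)
    centers = rows[rng.choice(len(rows), size=p, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((rows[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        new = np.array([rows[labels == k].mean(0) if (labels == k).any() else centers[k]
                        for k in range(p)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def pfa(X, var_retained=0.9, extra=1, seed=0):
    """Return indices of the selected principal features of X (samples x features)."""
    X = X - X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)                     # Step 1: covariance form
    lam, A = np.linalg.eigh(Sigma)                      # Step 2: eigendecomposition
    order = np.argsort(lam)[::-1]
    lam, A = lam[order], A[:, order]
    ratios = np.cumsum(lam) / lam.sum()
    q = int(np.searchsorted(ratios, var_retained)) + 1  # Step 3: retained variability
    V = np.abs(A[:, :q])                                # rows V_i, absolute weights
    p = min(q + extra, X.shape[1])                      # p >= q clusters, at most n
    labels, centers = kmeans(V, p, seed=seed)           # Step 4: cluster the rows
    chosen = []                                         # Step 5: feature nearest each mean
    for k in range(p):
        members = np.flatnonzero(labels == k)
        if members.size:
            d = ((V[members] - centers[k]) ** 2).sum(axis=1)
            chosen.append(int(members[np.argmin(d)]))
    return sorted(chosen)
```

For example, `pfa(X)` on a 300 x 6 data matrix returns the indices of the selected features, at most one per cluster.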
The second set of experiments is designed to test the output of the principal feature analysis algorithm (PFA) for the purpose of image retrieval. Table 2 shows the retrieval results on the same sets of query images as in the first experiment (00 airplanes and 00 eagles). The fourth test in each case corresponds to the same number of features selected, but spanning a smaller number of clusters; one can see that the performance is consistently lower. Table 3 shows the test results on the VISTEX texture database from the MIT Media Lab. In this database the original images are tiled into 4x4 = 16 sub-images for testing purposes, so there are 52 big images and 832 resulting sub-images. Since we have the classes and labels, MLDA is tested against all the other cases. One can clearly see that MLDA outperforms them all, which is reasonable since it utilizes extra information.

Please note that the last test case in Table 3 has more feature components (6) than the number of principal features selected (5); but since these components are chosen as clustered ones (features , 4, 8 and 10 are all clustered into one cluster for this dataset!), the performance is even worse, although one might intuitively expect it to perform better since they span all three wavelet sub-bands!

Table 2. Principal features (PF) performance comparison using 10 wavelet variances (WV) for the Corel dataset

Airplanes                    Top 10   Top 20   Top 40   Top 80
10 WVs                       3.300    .700     9.9400   7.0700
7 PCs                        3.3700   .700     9.9000   6.8000
7 PFs (1,2,6,7,8,9,10)       3.800    6.400    .00      9.900
7 feats (1,2,3,4,8,9,10)     3.000    .3400    9.00     .600

Eagles                       Top 10   Top 20   Top 40   Top 80
10 WVs                       .9808    .873     4.437    6.769
7 PCs                        .96      .796     4.0      6.69
7 PFs (1,2,6,7,8,9,10)       .44      .96      4.673    7.
7 feats (1,2,3,4,8,9,10)     .7788    .000     3.7      .700

Table 3. Principal features (PF) performance comparison using 10 wavelet variances (WV) for the VISTEX dataset

VISTEX                   Top 10   Top 20   Top 40   Top 80
10 WVs                   8.63     .6       3.7308   4.77
PCs                      8.00     .897     3.77     4.6370
MLDA                     8.370    .67      4.800    .67
5 PFs (1,2,6,7,10)       7.67     .393     3.69     4.43
6 feats ( ,3,4,6,8,10)   7.339    0.78     .9087    4.4

Table 4.
Principal Features (PF) Performance Comparison using 9 color moments and 10 wavelet moments (WM)

40 random queries in Corel      Top 10    Top 20
All: 9 color + 10 WM            4.98      8.3
PFs (7 color, WM)               4.90      8.

random selected (I) (7 color, WM)      3.88      6.6
random selected (II) (7 color, WM)     4.8       6.8
random selected (III) (7 color, WM)    4.40      7.8

Table 4 shows results for the 9 color moments plus 10 wavelet moments on the Corel database. 7 of the 9 color features, and a subset of the 10 wavelet features, are selected by PFA. 40 randomly chosen query images are used, and the performance of the original feature set and of the principal feature set is tested against 3 randomly selected subsets of the same size. The results clearly show that the principal features yield results comparable to the original set and significantly better than any of the random picks.

5. CONCLUSIONS

In this paper we proposed a revised edge-based structural feature extraction approach, following guidelines obtained by summarizing the existing feature extraction approaches. A principal feature selection algorithm is also proposed for new-feature analysis and feature selection. The results of PFA are tested and compared to the original feature set, to random selections, and to the results of principal component analysis and multivariate linear discriminant analysis. The experiments show that the proposed algorithm yields results comparable to the original set and better than the random sets.

6. REFERENCES

[1] Haralick, R. M., K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Systems, Man, and Cybernetics (Nov. 1973).
[2] Tamura, H., S. Mori, and T. Yamawaki, "Textural features corresponding to visual perception," IEEE Trans. Systems, Man, and Cybernetics, 8(6), 1978.
[3] Cormen, T. H., C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, McGraw-Hill, New York, 1997.
[4] Flickner, M., et al., "Query by image and video content: The QBIC system," IEEE Computer, 1995.
[5] Gonzalez, R. C. and R. E. Woods, Digital Image Processing, Addison-Wesley, 1992.
[6] Hu, M. K., "Visual pattern recognition by moment invariants," IRE Trans. Information Theory, 8, 1962.
[7] Iqbal, Q. and J. K. Aggarwal, "Applying perceptual grouping to content-based image retrieval: Building images," Proc. IEEE CVPR 1999, 4-48, 1999.
[8] Jain, A.
K., Fundamentals of Digital Image Processing, Prentice Hall, 1989.
[9] Jolliffe, I. T., Principal Component Analysis, Springer-Verlag, New York, 1986.
[10] Laine, A. and J. Fan, "Texture classification by wavelet packet signatures," IEEE Trans. Pattern Anal. Machine Intell., 1993.
[11] Rui, Y., T. S. Huang, M. Ortega, and S. Mehrotra, "Relevance feedback: A power tool in interactive content-based image retrieval," IEEE Trans. on Circuits and Systems for Video Technology, Vol. 8, No. 5, 644-655, Sept. 1998.
[12] Smith, J. R. and S.-F. Chang, "Transform features for texture classification and discrimination in large image databases," Proc. IEEE ICIP, 1994.
[13] Tabb, M. and N. Ahuja, "Multiscale image segmentation by integrated edge and region detection," IEEE Trans. Image Processing, Vol. 6, No. 5, May 1997.
[14] Vasconcelos, N. and A. Lippman, "Embedded mixture modeling for efficient probabilistic content-based indexing and retrieval," in SPIE Multimedia Storage and Archiving Systems III, Boston, 1998.
[15] Wilks, S. S., Mathematical Statistics, Wiley, New York, 1962.
[16] Zahn, C. T. and R. Z. Roskies, "Fourier descriptors for plane closed curves," IEEE Trans. Computers, 1972.
[17] Zhou, X. S., Y. Rui, and T. S. Huang, "Water-filling: a novel way for image structural feature extraction," Proc. ICIP, 1999.
[18] Jain, A. and D. Zongker, "Feature selection: Evaluation, application, and small sample performance," IEEE Trans. PAMI, Vol. 19, Feb. 1997.
[19] Cohen, I., Q. Tian, X. S. Zhou, and T. S. Huang, "Feature selection and dimensionality reduction using principal feature analysis," submitted to the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI-2000) Workshop on Fusion of Domain Knowledge with Data for Decision Support, Stanford University, Stanford, CA, June 30, 2000.