COMPLEX WAVELET TRANSFORM-BASED COLOR INDEXING FOR CONTENT-BASED IMAGE RETRIEVAL

Nader Safavian and Shohreh Kasaei
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
skasaei@sharif.edu

ABSTRACT

With the rapid establishment of digital libraries and multimedia databases, the need for an efficient search algorithm is also increasing. In this paper, a new approach for content-based image indexing and retrieval is presented. The proposed method is based on a combination of multiresolution analysis and color characteristics of the image. In order to obtain better retrieval results, the image texture features are combined with the color features to form a powerful discriminating feature vector for each image. The texture features are obtained with the dual-tree complex wavelet transform (DT CWT). According to the new algorithm, the image is divided into different sublayers, each of which contains only pixels in areas with similar spatial frequency characteristics. We present a computationally efficient implementation of this content-based image retrieval scheme. Simulation results show the efficiency of the proposed algorithm.

Key words: color indexing, texture analysis, content-based image retrieval, complex wavelet transform, and histogram.

1. INTRODUCTION

Content-based image retrieval (CBIR) has become very attractive in the past decade. The first important step in every algorithm dealing with this matter is to find a way of describing the content of every image in the database. At first, Swain and Ballard [1] used a simple color histogram to index each image. This method has been shown to be very effective and has become popular in indexing applications due to its low complexity. Color histograms are computationally efficient and generally insensitive to small changes in camera position. However, a color histogram provides only a very coarse characterization of an image (the information regarding the spatial positions of the colors is not included). Consequently, two images with similar histograms can have dramatically different appearances.
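The last point can be made concrete with a minimal sketch. The uniform RGB quantization, the function name `color_histogram`, and the two toy 4x4 images are assumptions made only for illustration; they are not the quantizer or data used in this paper.

```python
from collections import Counter

def color_histogram(image, levels=4):
    """Count pixels per quantized RGB bin (levels**3 bins in total).

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    The uniform RGB quantization is an assumption made for this sketch.
    """
    step = 256 // levels
    hist = Counter()
    for row in image:
        for r, g, b in row:
            hist[(r // step, g // step, b // step)] += 1
    return hist

RED, BLUE = (255, 0, 0), (0, 0, 255)

# Left half red, right half blue.
split_image = [[RED, RED, BLUE, BLUE] for _ in range(4)]
# The same sixteen pixels rearranged as a checkerboard.
checkerboard = [[RED if (i + j) % 2 == 0 else BLUE for j in range(4)]
                for i in range(4)]

same_histogram = color_histogram(split_image) == color_histogram(checkerboard)
same_pixels = split_image == checkerboard
```

Both toy images contain eight red and eight blue pixels, so their histograms coincide even though the spatial layouts differ completely.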
When working with large databases, the chance of having (visually) different images with similar histograms increases [2], and consequently the effectiveness of the method is reduced. To overcome this difficulty, researchers have proposed different approaches. One common approach is to divide the image into several subimages and assign a separate color histogram to each subimage [3]. The problem with that method is that it is computationally expensive and requires a huge amount of storage memory. This method is also vulnerable to translation and rotation of color images. Mandel et al. [4] reduce the computational complexity by representing the color histogram in terms of its moments. In that research it is shown that the Legendre moments provide superior retrieval performance when compared to regular moments. In other approaches, histograms are augmented with local spatial properties. A famous method of this type is the color coherence vector (CCV) by Pass and Zabih [5], where the image pixels in a given histogram bin are partitioned into two classes based on their spatial coherence. A pixel is labeled coherent if it lies in a contiguous region of the same color whose size exceeds a minimum amount, and incoherent otherwise. The performance of this method is better than that of the conventional histogram, but it suffers from low computation speed. Huang et al. [6] propose the color correlogram, which includes the spatial correlation of colors along with the global distribution of local spatial correlation of colors. It outperforms the CCV method but also suffers from a very high computational cost. Han et al. [7] argue that a conventional color histogram considers neither the color similarity across different bins nor the color dissimilarity in the same bin, and therefore it is sensitive to noisy interference such as illumination changes and quantization error.
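A small sketch may help illustrate the argument of Han et al. It uses a hypothetical two-bin gray-level histogram with hard bin edges; the helper names and the toy pixel lists are assumptions chosen for illustration and do not come from [7].

```python
def gray_histogram(pixels, bins=2):
    """Normalized histogram of gray levels in 0..255 with hard bin edges."""
    width = 256 // bins
    counts = [0] * bins
    for p in pixels:
        counts[min(p // width, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def l1_distance(h1, h2):
    """Plain L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

dark_gray = [127] * 10  # just below the bin edge at 128
mid_gray = [128] * 10   # just above the bin edge
black = [0] * 10        # very different, but in the same bin as 127

# Nearly identical colors that fall into different bins.
near_but_split = l1_distance(gray_histogram(dark_gray), gray_histogram(mid_gray))
# Clearly different colors that share a bin.
far_but_together = l1_distance(gray_histogram(dark_gray), gray_histogram(black))
```

With these inputs the two almost identical grays are maximally distant, while black and dark gray are indistinguishable, which is the kind of quantization sensitivity that a soft bin membership is meant to reduce.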
To address these concerns, they present a new color histogram representation called the fuzzy color histogram (FCH), which considers the color similarity of each pixel's color to all the histogram bins through a fuzzy-set membership function. The problem with this method is that it is a rather complicated approach. Another problem associated with all the previous works is that they do not consider the way humans judge the similarity of the retrieved images. Qiu and Lam [8] used frequency layered color indexing (LCI) in their research. They separated every image into different layers according to its frequency content, using simple Gaussian filters [9]. In our approach, we consider the properties of the human visual system and use them in proposing a new algorithm for image retrieval. Although color is the most effective cue for image retrieval, incorporating other features (such as texture) enhances the retrieval performance. In this paper, we propose a new method in which we take advantage of both color and texture properties of images to improve the performance of the image indexing and retrieval algorithm.

The paper is organized as follows. In Section 2 the details of the proposed technique, along with a brief introduction to the dual-tree complex wavelet transform, are given. The simulation results are illustrated in Section 3, followed by the conclusion presented in Section 4.

2. PROPOSED ALGORITHM

The best judge of the performance of an image retrieval system is the final user, the human. Therefore, in order to propose a "good" retrieval system, we should take into account the way a human perceives similarity between images. According to the frequency analysis theory [10], there are several frequency channels perceived by the human visual system. Each of these frequency channels responds only to a limited bandwidth of the image constructed on the retina. Consequently, we can consider them as bandpass filters, and it is convenient to think of the visual system as a filter bank consisting of several filters, where each filter's response covers certain areas of the spatial frequency spectrum. In an image, a busy or sharp area consists of high frequency components and a smooth area contains lower frequency components. The flat areas can be associated with the interior of objects or with backgrounds, while the busy areas can be textured surfaces or object boundaries. Physically, different frequency components in an image may be related to different objects or boundaries.
As a result, judging the similarity of images by the human visual system can occur as follows: an image is decomposed into different frequency components and the corresponding components of different images are compared. We have based our algorithm on this philosophy. A schematic block diagram of the proposed algorithm is presented in Fig. 1. As can be seen in this figure, we first pass an image through a filter bank. The output of each filter is used to define a separate layer. Each individual layer (which contains pixels with similar frequency distributions) is used to build its own index. Then, we combine these features to construct the total feature vector of the image. To compare two images, we compare the index of each layer with the index of its counterpart in the other image. To design our algorithm we should consider two important issues. The first is how to implement the filter bank and the second is how to index each layer.

[Figure 1 block labels: Color Image, Gray-Scale Image, CWT]
Figure 1. Block diagram of the proposed layer separation stage.

For the filter bank we use the CWT [11] to take advantage of its numerous special properties. In Section 2.1 we present a brief description of the CWT. Since the information of the busy and sharp areas, along with the information of the flat areas, is present in the grayscale components of the image, for the texture analysis stage we convert the color (RGB) image into a grayscale image before applying the CWT to it. Multiple thresholds are then applied to the outputs of the filters. We obtain the binary images using (1) and the layers using (2):

$$ b_k(i,j) = \begin{cases} 1, & \text{if } T_{k-1} < y(i,j) \le T_k \\ 0, & \text{otherwise} \end{cases} \tag{1} $$

$$ L_k(i,j) = \begin{cases} x(i,j), & \text{if } b_k(i,j) = 1 \\ \text{Empty}, & \text{otherwise} \end{cases} \tag{2} $$

Here $y(i,j)$ is the filter output at pixel $(i,j)$, $x(i,j)$ is the corresponding pixel of the original color image, and $T_{k-1}$ and $T_k$ are the thresholds that bound layer $k$.

2.1. Dual-Tree Complex Wavelet Transform

Kingsbury, in his distinguished work, introduced the dual-tree complex wavelet transform [11], which is similar to the Gabor filter but is orthogonal and can be computed faster. The frequency responses of the CWT are shown in Fig. 2.
Just like a typical Gabor filter, there are 6 orientations at each of the 4 scales (the number of scales is arbitrary, but the number of orientations is fixed). The main advantages of the DT CWT over the real DWT are that the complex wavelets are approximately shift invariant (meaning that the obtained texture features are likely to be invariant to translations in images) and have separate subbands for positive and negative orientations (6 orientations). Note that the conventional separable real wavelets suffer from a lack of shift invariance, provide just three orientations with poor directional selectivity, and cannot distinguish between the +45° and -45° directions. The CWT attains these properties by replacing the tree structure of the conventional wavelet transform with a dual tree. At each scale, one of the trees produces the real part of the dual-tree complex wavelet coefficients, while the other tree produces the imaginary part.
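The steps described so far can be tied together in a minimal sketch: the outputs of the two trees are combined into complex coefficients, their magnitudes are scaled so that the maximum is 1, a binary image is formed as in (1), and a layer is extracted as in (2). Several things here are assumptions made for illustration only: that the filter response being thresholded is the coefficient magnitude, that it is available at the same resolution as the image, that the lower bound in (1) is strict, and that "Empty" is represented by `None`. All function names are hypothetical, and the tiny arrays stand in for real filter outputs.

```python
def magnitudes(real_tree, imag_tree):
    """Combine the two tree outputs into complex coefficients and take |.|."""
    return [[abs(complex(re, im)) for re, im in zip(r_row, i_row)]
            for r_row, i_row in zip(real_tree, imag_tree)]

def normalize(y):
    """Scale a response map so that its maximum value is 1."""
    peak = max(max(row) for row in y)
    return [[v / peak for v in row] for row in y] if peak > 0 else y

def binary_image(y, t_low, t_high):
    """Equation (1), assuming t_low < y <= t_high marks membership."""
    return [[1 if t_low < v <= t_high else 0 for v in row] for row in y]

def extract_layer(x, b):
    """Equation (2): keep the original pixel where b == 1, else None ("Empty")."""
    return [[px if keep else None for px, keep in zip(x_row, b_row)]
            for x_row, b_row in zip(x, b)]

# Toy 2x3 example; the strings stand in for color pixels.
real_tree = [[0.0, 0.0, 1.0], [0.0, 6.0, 0.5]]
imag_tree = [[7.0, 1.0, 0.0], [2.0, 8.0, 0.5]]
image = [["a", "b", "c"], ["d", "e", "f"]]

y = normalize(magnitudes(real_tree, imag_tree))
mask = binary_image(y, t_low=0.5, t_high=1.0)
layer = extract_layer(image, mask)
```

Only the two positions with the strongest responses (normalized magnitudes 0.7 and 1.0) survive the threshold, so the resulting layer keeps pixels "a" and "e" and marks every other position as empty.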
Figure 2. Contours of 70% peak magnitudes of the CWT filters at scales of 3 and 4 (adopted from [11]).

The extra redundancy allows a significant reduction of aliasing terms and causes the complex wavelets to become approximately shift invariant. Translation causes large changes to the phase of the wavelet coefficients, but their magnitude (and hence energy) is much more stable. By using even and odd filters alternately in these trees, it is possible to achieve overall complex impulse responses with symmetric real parts and antisymmetric imaginary parts. In Fig. 3, the structure of the dual-tree CWT is illustrated. In our proposed method, we extract the mean, standard deviation, and kurtosis of the magnitude of each of the six subimages of the CWT at the first stage as the texture features.

3. EXPERIMENTAL RESULTS

As indicated in Fig. 1, there are one low-pass filter and six high-pass filters in the CWT. We use the low-pass filter to construct a layer corresponding to the low frequency components of the image (which relate to the background or interior parts) and the six outputs of the high-pass filters to build the layers indicating details (such as boundaries or textured parts).

Figure 3. Dual trees of real filters for the implementation of the DT CWT (adopted from [11]).

The threshold values are chosen to be the maximum pixel values of the output filtered images divided by two. Therefore, if we scale the maximum pixel value of each subimage to 1, we set $T_{k-1} = 0.5$ and $T_k = 1$ in equation (1). To make the algorithm computationally efficient, we have used only three high-pass filters instead of six and a 54-bin color quantizer ({H, S, V}: {6, 3, 3}) to build the index of each layer. As such, the feature vector length is 234 (4 × 54 + 18 = 234): four layers (one low-pass and three high-pass) with 54 bins each, plus 18 texture features (three statistics for each of the six subimages). To compare two histograms, the $L_1$-norm distance metric is used. If $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$ denote the histograms of two counterpart layers, then their difference is:

$$ D(X, Y) = \sum_{i=1}^{n} \frac{|x_i - y_i|}{1 + x_i + y_i} \tag{3} $$

3.1. Retrieval Performance Evaluation

Our database consists of images mainly downloaded from http://www.cs.washington.edu/research/imagedatabase/groundtruth/ as well as other images gathered in our laboratory; it contains more than 2000 images. The database is divided into different categories, where each category contains similar and related images. The ground truth answers are also provided, so each image has an index indicating its ground truth answers. We have implemented the algorithm using the Matlab 6.5 package on a Pentium IV (2 GHz) and have run the algorithms on this database. For each image, the feature extraction stage of our algorithm took only about 2 seconds. We used three different criteria to evaluate the performance of the algorithm, as follows:

$$ P_j = \frac{\text{number of retrieved and relevant elements in the first } j \text{ positions}}{j} \tag{4} $$

$$ R_j = \frac{\text{number of retrieved and relevant elements in the first } j \text{ positions}}{\text{total number of relevant elements in the collection}} \tag{5} $$

In Table 1, the values of these criteria for five different categories contained in our database are listed. We have also proposed another recall criterion, which is a more appropriate measure. If $Q_i$ is the $i$th query image and $Q_i^{(1)}, Q_i^{(2)}, \ldots, Q_i^{(N)}$ are the $N$ correct answers to this query, then we propose to use the following recall measure:

$$ AR(l) = \frac{\left|\left\{ Q_i^{(k)} \mid \operatorname{rank}\!\left(Q_i^{(k)}\right) \le l \right\}\right|}{N} \tag{6} $$
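The matching rule in (3) and the criteria in (4) and (5) can be sketched as follows. This is a minimal illustration and not the authors' Matlab implementation: the function names, the three-bin toy histograms, and the choice of relevant items are all hypothetical.

```python
def distance(x, y):
    """Equation (3): sum of |x_i - y_i| / (1 + x_i + y_i) over all bins."""
    return sum(abs(a - b) / (1 + a + b) for a, b in zip(x, y))

def rank_database(query, database):
    """Return database keys sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query, database[name]))

def precision_at(ranking, relevant, j):
    """Equation (4): relevant items among the first j results, divided by j."""
    return sum(1 for name in ranking[:j] if name in relevant) / j

def recall_at(ranking, relevant, j):
    """Equation (5): relevant items among the first j results, divided by
    the total number of relevant items."""
    return sum(1 for name in ranking[:j] if name in relevant) / len(relevant)

# Toy index: each entry is a three-bin histogram for one layer.
query = [4, 0, 2]
database = {
    "a": [4, 0, 2],
    "b": [3, 1, 2],
    "c": [0, 5, 1],
    "d": [4, 1, 1],
}
relevant = {"a", "d"}  # hypothetical ground truth for this query

ranking = rank_database(query, database)
p2 = precision_at(ranking, relevant, 2)
r3 = recall_at(ranking, relevant, 3)
```

For this toy index the distances are 0 for "a", 0.625 for "b", 0.75 for "d", and about 1.88 for "c", so one of the two relevant items appears in the first two positions and both appear in the first three.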
In Fig. 4, the recall performance of the histogram, the correlogram, and our method are shown. Figures 5 and 6 show the performance of the proposed algorithm when retrieving the image shown in the top left of each figure.

Table 1. Comparison of different methods [C: correlogram, O: our method].

Category         Method  P5    P10   P15   P20   R5    R10   R15   R20
Cherries         O       1     1     1     1     .04   .08   .2    .3
Cherries         C       .8    .8    .9    .85   .04   .06   .12   .16
Football Field   O       1     1     1     .9    .21   .46   .78   .81
Football Field   C       1     .9    .8    .85   .18   .41   .71   .78
Red Rose         O       1     1     1     1     .12   .28   .29   .46
Red Rose         C       .8    .7    .8    .8    .11   .2    .3    .39
Greenland        O       1     1     .9    .8    .14   .23   .31   .36
Greenland        C       1     .9    .8    .7    .11   .19   .28   .31
Ancient Species  O       1     1     1     .9    .09   .2    .31   .46
Ancient Species  C       1     1     .8    .83   .09   .21   .25   .41

[Figure 4 plot; legend: correlogram, LCI, new method]
Figure 4. Comparison of three methods according to (6).

4. CONCLUSIONS

In this paper we proposed a new method for content-based image retrieval. We tried to take into account the properties of the human visual system and incorporated texture features to obtain better retrieval results. We utilized the dual-tree complex wavelet transform, which is shown to be very efficient for extracting texture features from images, and we showed its superiority over the Gabor filter. The experimental results show the efficiency of the proposed algorithm.

ACKNOWLEDGEMENT

The authors would like to thank Prof. Nick G. Kingsbury from the University of Cambridge for providing us with the Matlab codes of the dual-tree complex wavelet transform. This research was in part supported by a grant from ITRC.

5. REFERENCES

[1]. M. Swain and D. Ballard, Color Indexing, Int. J. Comput. Vis., vol. 7, pp. 11-32, 1991.
[2]. A. W. M. Smeulders et al., Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. on Pattern Anal. Machine Intell., vol. 22, pp. 1349-1380, Dec. 2000.
[3]. M. Stricker and A. Dimai, Spectral Covariance and Fuzzy Regions for Image Indexing, Mach. Vis. Applicat., vol. 10, pp. 66-73, 1997.
[4]. M. K. Mandel and T. Aboulnasr, Fast Wavelet Histogram Techniques for Image Indexing, IEEE Trans. on Consumer Electron., vol. 42, Aug. 1996.
[5]. G. Pass and R. Zabih, Histogram Refinement for Content Based Image Retrieval, in Proc. IEEE Workshop on Applications of Computer Vision, 1996, pp. 96-102.
[6]. J. Huang et al., Spatial Color Indexing and Applications, Int. J. Comput. Vis., pp. 245-268, 1999.
[7]. J. Han and K. Ma, Fuzzy Color Histogram and Its Use in Color Image Retrieval, IEEE Trans. on Image Processing, vol. 11, no. 8, Aug. 2002.
[8]. G. Qiu and K. Lam, Frequency Layered Color Indexing for Content-Based Image Retrieval, IEEE Trans. on Image Processing, vol. 12, no. 1, Jan. 2003.
[9]. P. J. Burt and E. H. Adelson, The Laplacian Pyramid as a Compact Image Code, IEEE Trans. on Commun., vol. COM-31, pp. 532-540, 1983.
[10]. A. Mojsilovic et al., Matching and Retrieval Based on the Vocabulary and Grammar of Color Patterns, IEEE Trans. on Image Processing, vol. 9, pp. 38-54, Jan. 2000.
[11]. N. Kingsbury, Image Processing with Complex Wavelets, Phil. Trans. Royal Society, 1999.
Figure 5. Performance of the proposed retrieval algorithm. The image in the top left is the query image and images from left to right and top to bottom are the retrieved images. (Due to the lack of related images in the database, it has retrieved the last three images that have been the most related images contained in the database.)

Figure 6. Performance of the proposed retrieval algorithm. The image in the top left is the query image and images from left to right and top to bottom are the retrieved images.