A Matching Algorithm for Content-Based Image Retrieval

Sue J. Cho
Department of Computer Science
Seoul National University
Seoul, Korea

Suk I. Yoo
Department of Computer Science
Seoul National University
Seoul, Korea

Abstract

A content-based image retrieval system retrieves an image from a database using visual information. Among approaches to expressing visual aspects in queries, "query by sketch" is the most convenient and expressive. However, the query drawn by the user is typically quite different from the target image. In this paper, a matching algorithm for imperfect queries is presented. The algorithm measures the similarity between the query and each image stored in the database based on their topological structures, which are represented by prime edge graphs. Experimental results show that the system retrieves the intended image with a high similarity score even from a partial or shifted query.

Keywords: content-based image retrieval, image database, query by sketch, matching, similarity

1 Introduction

A content-based image retrieval system retrieves an image from a database using visual information such as color, texture, or shape. In most systems, the user queries by presenting an example image that has the intended feature [4,5,6]. Although this approach has advantages in effective query processing, it is inferior in expressive power, and the user cannot represent all intended features in the query. To fully reflect the user's intention, an alternative approach that accepts a user-drawn sketch as a query has been developed [1,2,3]. In this approach, the type of matching algorithm the system employs limits the features that the user can represent in the query, and vice versa. In a system that uses an edge-matching algorithm, the user draws a boundary sketch as a query [1,3]. However, in a system that allows only boundary information in the query, the user cannot represent color information. In a system that uses a wavelet-matching algorithm, the user draws a rough painting as a query [2], but even roughly reproducing an image is very difficult.

In this paper, it is assumed that the user draws a query by placing colored objects in appropriate positions, where the position of each individual object may not be correct and not all objects may be reproduced. The query takes the form of a rough map of colored objects in which the attributes of the objects and their positional relationships play an important role. The system matches this rough drawing against each image in the database to retrieve the target image. Since the query image drawn by the user tends to be quite different from the target image, classical matching algorithms based on pixel values are not very effective in discriminating the target image from the rest of the database.

In this paper, a matching algorithm is described that effectively retrieves an image from a partial or shifted query. An image is segmented into a collection of objects. With these objects and their topological structure, the similarity between two images is measured. The rest of this paper gives a brief description of the data model of images, and presents the algorithms and some experimental results.

2 Image Conversion

An image is segmented into a collection of objects. Here, the meaning of an object is somewhat different from what we call an object in real life. An object is a set of pixels that are connected and have the same (quantized) color. Each object is represented as a set of attributes such as color, size, position, major axis, and minor axis. When an image is inserted into the database, it is converted to a collection of objects and these attributes are automatically extracted. Since it is assumed that a user draws a query on an object basis, the query image can be recognized as a collection of objects without any segmentation process. This query image is converted to a graph
called a prime edge graph, which represents the topological structure of the objects in the image. In a prime edge graph, each node represents an object and an edge between two nodes represents the positional relation of the corresponding objects.

Figure 1: Query Conversion (block diagram: Query, Complete Digraph, Edge Labeling, Edge Pruning, Prime Edge Graph)

Figure 2: Edge Labeling. First two bits (x): 10 = left, 01 = right, 11 = none. Last two bits (y): 10 = up, 01 = down, 11 = none. The eight direction codes are up-left 10, up 14, up-right 6, left 11, right 7, down-left 9, down 13, and down-right 5; for example, 11 = 1011 (2).

2.1 Query Conversion Algorithm

The query image is converted to a prime edge graph through several steps; a block diagram is shown in figure 1.

Step 1 (Generation of a complete directed graph): From a query image $Q$, generate a complete directed graph $G_q(V, E)$. Each node $v_i$ in $V$ represents an object in the query image $Q$. An edge $e_{ij}$ is inserted between every two nodes $v_i$ and $v_j$, where $i \neq j$.

Step 2 (Edge labeling): Label each edge $e_{ij}$ in $E$ with a number $L(e_{ij})$ according to the positional relation of the two nodes $v_i$ and $v_j$. The number assigned to each of the eight directions is shown in figure 2. The number for each direction is a 4-bit code based on the x, y positions of the start and end points. The first and second bits of the 4-bit code represent the relation of the x values of the start and end points: following figure 2, they are 1 and 0 if the end point lies to the left of the start point, 0 and 1 if it lies to the right, and 1 and 1 if neither holds. The last two bits represent the relation of the y values of the start and end points in the same way (10 for up, 01 for down, 11 for none). For example, if $v_i$ is to the left of $v_j$, then $L(e_{ij}) = 0111$ and $L(e_{ji}) = 1011$, and their decimal representations are 7 and 11 respectively.

Step 3 (Edge pruning): Repeat the following process until no change occurs:

    if $L(e_{ij}) \,\&\, L(e_{jk}) = L(e_{ik})$ then delete($e_{ik}$)

where $\&$ denotes the bitwise-and operator. If the result of the bitwise-and operation between two edge labels $L(e_{ij})$ and $L(e_{jk})$ is equal to $L(e_{ik})$, it means that $v_i$, $v_j$, and $v_k$ are placed in one direction in this order. Even though $e_{ik}$ is deleted, the positional relation of $v_i$ and $v_k$ can still be found from $L(e_{ij})$ and $L(e_{jk})$.
For example, if a node u is to the left of v and v is to the left of w, it is trivial that u is to the left of w. All such transitive edges can be eliminated.

Step 4 (Prime edge graph): Finally, one of the two edges $e_{ij}$ and $e_{ji}$ between two nodes $v_i$ and $v_j$ is deleted as follows: for every $e_{ij}$, if $i > j$, then delete($e_{ij}$).

Steps 3 and 4 minimize the number of comparisons in the later matching process. Figure 3 shows an example of query conversion. The query image in figure 3(a) consists of 4 objects. According to the positional relation of every two objects and figure 2, all edges are labeled (figure 3(b)). Figure 3(c) shows the resulting prime edge graph. Note that $e_{13}$ is deleted since the bitwise-and of $L(e_{12})$ and $L(e_{23})$ is $5 \,\&\, 13 = 5 = L(e_{13})$. The 12 edges in figure 3(b) are reduced to 3 edges in figure 3(c) by steps 3 and 4, while the topological structure of the objects is preserved.

Figure 3: Prime edge graph. (a) Query image with objects v1 to v4. (b) Edge labels $L(e_{ij})$, with start node $v_i$ in rows and end node $v_j$ in columns:

         v1   v2   v3   v4
    v1    0    5    5    5
    v2   10    0   13    5
    v3   10   14    0    5
    v4   10   10   10    0

(c) Resulting prime edge graph on v1, v2, v3, v4.
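The four conversion steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the tolerance parameter `tol`, the image-coordinate convention (x grows rightward, y grows downward), and the example coordinates are all assumptions. The pruning step also tests the condition against the original labels in a single pass rather than repeating until no change occurs, which is an assumption made so that the result does not depend on deletion order.

```python
from itertools import permutations


def axis_code(start, end, tol=0.0):
    """Two-bit code for one axis: 0b10 if the end point has the smaller
    coordinate (left / up), 0b01 if larger (right / down), 0b11 otherwise."""
    if end < start - tol:
        return 0b10
    if end > start + tol:
        return 0b01
    return 0b11


def edge_label(p, q, tol=0.0):
    """Step 2: 4-bit label of the edge from p to q (x code, then y code).
    Assumes image coordinates: x grows rightward, y grows downward."""
    return (axis_code(p[0], q[0], tol) << 2) | axis_code(p[1], q[1], tol)


def prime_edge_graph(points, tol=0.0):
    """Return {(i, j): label} for the prime edges of the given positions."""
    n = len(points)
    # Steps 1-2: complete directed graph with a label on every edge.
    label = {(i, j): edge_label(points[i], points[j], tol)
             for i, j in permutations(range(n), 2)}
    # Step 3: e_ik is redundant if some j gives L(e_ij) & L(e_jk) == L(e_ik).
    # Assumption: the test uses the original labels of all edges, so the
    # outcome does not depend on the order in which edges are deleted.
    pruned = {(i, k) for i, j, k in permutations(range(n), 3)
              if label[i, j] & label[j, k] == label[i, k]}
    # Step 4: of each remaining pair e_ij / e_ji, keep only the one with i < j.
    return {e: lab for e, lab in label.items()
            if e not in pruned and e[0] < e[1]}


# Hypothetical coordinates chosen to reproduce the labels of figure 3(b);
# nodes are 0-indexed here, so (0, 1) corresponds to e_12.
points = [(0, 0), (1, 1), (1, 2), (2, 3)]
print(prime_edge_graph(points))  # {(0, 1): 5, (1, 2): 13, (2, 3): 5}
```

With these coordinates the sketch keeps three edges, matching the count reported for figure 3(c). Objects at identical positions (label 1111) are not handled, and a non-zero `tol` would be needed for hand-drawn queries in which "none" is rarely exact.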
Figure 4: Image retrieval (block diagram: each image taken from the image database passes through object matching and prime edge matching against the prime edge graph; at each step it is accepted or rejected, and accepted images are scored and returned as retrieved images)

3 Image Retrieval

The process of retrieving images with a prime edge graph is shown in figure 4. For each collection of objects that represents an image in the database, a similarity score is measured through two-step matching: object matching and prime edge matching. Finally, the accepted images are displayed in order of similarity score.

3.1 Object Matching

A database image $I$ is a collection of objects $\{c_1, c_2, \ldots, c_n\}$ and each object $c_j$ is a list of attributes. Let $A_k(x)$ be the function that returns the $k$-th attribute value. For example, $A_0(c_j)$ is the color of $c_j$, and $A_1(c_j)$ is the normalized size of $c_j$. For each node $v_i$ in $G_p$, the candidate objects in an image $I$ are found according to the match score. The match score $M(c_j)$ of a candidate object $c_j$ is obtained as follows:

$$M(c_j) = \sum_k w_k \cdot \frac{A_k(v_i) - |A_k(c_j) - A_k(v_i)|}{A_k(v_i)},$$

where $w_k$ is the weight of the $k$-th attribute. If $M(c_j)$ exceeds some threshold $T$, $c_j$ is marked as a candidate for $v_i$. Since attributes are compared directly, they must have normalized values. For effective segmentation and matching, colors are quantized and coded so that neighboring colors produce a high match score. If there exists a node in $G_p$ that has no candidate at all, $I$ is rejected.

3.2 Prime Edge Matching

The prime edge matching step checks the consistency between each prime edge $e_{ij}$ and the positions of the candidate objects of $v_i$ and $v_j$. If no candidates of $v_i$ and $v_j$ are consistent with $L(e_{ij})$, $I$ is rejected. Finally, one candidate object for each node in $G_p$ remains. The similarity score $S$ of $I$ is computed by summing and normalizing the match scores of the finally selected candidates:

$$S = 100 \cdot \frac{\sum_k M(f_k)}{n},$$

where $f_k$ is the finally selected candidate for the $k$-th node and $n$ is the number of nodes in $G_p$.

4 Experimental Results

The presented matching algorithm is tested on an image database with 400 images. First, 3 queries drawn by different people are tested to retrieve the image in figure 5.
Second, to evaluate the algorithm, the similarity scores produced by other algorithms are compared with those of the presented algorithm. Finally, some interesting tests that retrieve images having a certain component are performed.

4.1 Robustness to Imperfect Queries

The query image $Q_1$ in figure 6(a) contains 2 objects, $Q_2$ in figure 6(b) contains 4 objects, and $Q_3$ in figure 6(c) contains 2 objects whose positions are quite different from those in the target image in figure 5. The retrieved images for each query and their similarity scores are shown. Even from a partial or shifted query, the system retrieved the intended image with a high similarity score.

Figure 5: Target image and the objects set

Figure 6: Example queries and retrieved images
(a) Retrieval results for $Q_1$ ($S_1 = 92.8$, $S_2 = 90.2$, $S_3 = 76.7$)
(b) Retrieval results for $Q_2$ ($S_1 = 96.4$)
(c) Retrieval results for $Q_3$ ($S_1 = 95.5$, $S_2 = 87.6$, $S_3 = 74.1$)

4.2 Retrieval Effectiveness

The presented algorithm has two distinctive characteristics. First, the similarity scores are calculated on an object basis, so the user can retrieve an image with a partial query. Second, the topological structure of the objects is compared instead of their positions, so even when the query image is shifted or the distances between objects are not correct, the target image can be retrieved. To bring out these features clearly, three comparison algorithms are developed.

In the first algorithm, the similarity score is calculated as follows:

$$S^1 = 100 \cdot \frac{\sum_k M(f_k)}{m},$$

where $m$ is the number of objects in $I$, used instead of the number of nodes in $G_p$. Although this algorithm is also based on objects, the similarity score cannot be high if only a small subset of the objects is represented in the query.

In the second algorithm, queries are not converted to prime edge graphs and the prime edge matching step is omitted. In the object matching step, the position attributes are compared, and the final candidate is the one that has the highest match score. The similarity score is calculated as follows:

$$S^2 = 100 \cdot \frac{\sum_k M(f'_k)}{n},$$

where $f'_k$ is the final candidate for the $k$-th object in the query image $Q$ and $n$ is the number of objects in $Q$.

In the third algorithm, the final candidate is selected as in the second algorithm and the similarity score is calculated as follows:

$$S^3 = 100 \cdot \frac{\sum_k M(f'_k)}{m}$$

In figure 7, $S$, $S^1$, $S^2$, and $S^3$ are the similarity scores of the image in figure 5 with the queries in figure 6, calculated by the presented algorithm, the first algorithm, the second algorithm, and the third algorithm, respectively.

4.3 Retrieving Unknown Images

Sometimes it is very useful to retrieve images that contain some figures. The user may not have seen the image, and may not know whether such an image exists in the image database. For example, a person who is making a Christmas card may want to retrieve an image of Santa Claus. In this case, the user does not care where Santa Claus is, and does not know whether there is a Santa Claus image. For this type of query, our matching algorithm shows good performance with the weight of the size attribute minimized. Figure 8 shows some experiments. In figure 8(a), to retrieve a Santa image, a red hat and a white beard are drawn in the query. There are several Santa images in the database, but only the best-matched image is shown in this example.
Figure 7: Comparison with other algorithms (similarity scores, on a scale of 0 to 100, for $S$, $S^1$, $S^2$, and $S^3$)

Figure 8: Retrieving unknown images
(a) Santa query: there is a Santa Claus image
(b) Strawberry query: there is no strawberry image
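As a rough illustration of the two-step matching of Section 3 and the scores compared in Section 4.2, the following Python sketch computes $S$ for one database image. It rests on several assumptions that the paper does not specify: the data layout (lists of attribute/position pairs), the use of strict label equality as the consistency test, the exhaustive search that picks the consistent assignment with the highest total match score, the requirement that distinct nodes map to distinct objects, and all example values.

```python
from itertools import product


def edge_label(p, q):
    """4-bit positional label of the edge from p to q (coding of figure 2).
    Assumes image coordinates: x grows rightward, y grows downward."""
    def code(a, b):
        return 0b10 if b < a else 0b01 if b > a else 0b11
    return (code(p[0], q[0]) << 2) | code(p[1], q[1])


def match_score(node_attrs, obj_attrs, weights):
    """M(c_j) for one query node and one database object.
    Attribute values are assumed to be normalized and non-zero."""
    return sum(w * (v - abs(c - v)) / v
               for w, v, c in zip(weights, node_attrs, obj_attrs))


def similarity(query, prime_edges, image, weights, threshold):
    """query, image: lists of (attributes, position) pairs.
    prime_edges: {(i, j): L(e_ij)} over query node indices.
    Returns the similarity score S, or None if the image is rejected."""
    # Object matching: collect candidates whose match score exceeds T.
    candidates = []
    for attrs, _ in query:
        cands = []
        for j, (obj_attrs, _) in enumerate(image):
            m = match_score(attrs, obj_attrs, weights)
            if m > threshold:
                cands.append((m, j))
        if not cands:
            return None  # a node without any candidate rejects the image
        candidates.append(cands)

    # Prime edge matching (assumed strategy): try every assignment of one
    # candidate per node and keep the best one consistent with all edges.
    best = None
    for choice in product(*candidates):
        objs = [j for _, j in choice]
        if len(set(objs)) < len(objs):
            continue  # assumption: distinct nodes map to distinct objects
        consistent = all(
            edge_label(image[objs[i]][1], image[objs[j]][1]) == lab
            for (i, j), lab in prime_edges.items())
        if consistent:
            total = sum(m for m, _ in choice)
            if best is None or total > best:
                best = total
    if best is None:
        return None
    return 100 * best / len(query)


# Hypothetical data: attributes are (color code, normalized size).
query = [((4.0, 0.2), (10, 10)), ((8.0, 0.1), (30, 10))]
prime_edges = {(0, 1): 0b0111}  # node 0 is to the left of node 1
image = [((4.0, 0.2), (50, 40)), ((8.0, 0.1), (90, 40)), ((1.0, 0.5), (20, 20))]
print(similarity(query, prime_edges, image, (0.5, 0.5), 0.8))
```

In this hypothetical case the query is both partial (two of three objects) and shifted, yet both nodes find a perfectly matching candidate in the right arrangement, so the score divided by the number of query nodes is 100. Dividing the same sum by `len(image)` instead, as in the first comparison algorithm, would give about 66.7, which shows why that variant penalizes partial queries.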
5 Discussion and Future Work

The content-based retrieval system using the algorithm described in this paper has two important advantages. First, the system can retrieve the intended image with a high similarity score even from a partial or shifted query. Second, the query can be drawn more easily.

The proposed method also has some limitations. First, the matching time increases as the database grows. Currently, to speed up the matching, an image can be rejected at two points during the matching process. In object matching, if there exists a node in $G_p$ that has no candidate at all, the matching of $I$ is terminated immediately. In prime edge matching, if a prime edge is found with which no candidate object is consistent, the image is rejected. However, for very large databases, a more powerful speed-up scheme must be devised along with an effective database management scheme. Second, for some kinds of images, such as textured images, the presented method is not so effective. To retrieve such images effectively, a supplementary query scheme is needed. It is not difficult to combine a routine that processes the query with global features such as color or texture.

References

[1] A. D. Bimbo and P. Pala. Visual Image Retrieval by Elastic Matching of User Sketches. IEEE Trans. Pattern Analysis and Machine Intelligence, 19(2), Feb 1997.
[2] C. E. Jacobs, A. Finkelstein, and D. H. Salesin. Fast Multiresolution Image Querying. In Proceedings of SIGGRAPH '95, pages 277-286, ACM, New York, 1995.
[3] T. Kato, T. Kurita, N. Otsu, and K. Hirata. A Sketch Retrieval Method for Full Color Image Database. In Proceedings of 11th IAPR, pages 530-533, IEEE, 1992.
[4] P. M. Kelly, M. Cannon, and D. R. Hush. Query by image example: the CANDID approach. SPIE Vol. 2420 Storage and Retrieval for Image and Video Databases III, pages 238-248, 1995.
[5] F. Liu and R. W. Picard. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE Trans. Pattern Analysis and Machine Intelligence, 18(7):722-733, July 1996.
[6] C. Schmid and R. Mohr. Image Retrieval Using Local Characterization. In Proceedings of ICIP-96, pages 781-783, IEEE, 1996.