KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval

Chia-Wei Ku, Been-Chian Chien, Guan-Bin Chen, Li-Ji Gaou, Rong-Sing Huang, and Siao-En Wang

Knowledge, Information, and Database System Laboratory
Department of Computer Science and Information Engineering
National University of Tainan, Tainan, Taiwan
cooperku@msn.com bcchien@mail.nutn.edu.tw

Abstract. The personal photo retrieval task at ImageCLEF 2012 is a pilot task for testing QBE-based retrieval scenarios in the scope of personal information retrieval. This pilot task is organized as two subtasks: visual concept retrieval and event retrieval. In this paper, we develop a framework that combines different visual features, EXIF data, and similarity measures based on two clustering methods to retrieve relevant images with similar visual concepts. We first analyze and select effective visual features, including color, shape, texture, and descriptor features, as the basic elements of recognition. A flexible similarity measure is then defined to achieve high-precision image retrieval automatically. The experimental results show that the proposed framework is effective under several distinct evaluation measures.

Keywords: Image retrieval, Concept retrieval, Feature clustering, Similarity measure

1 Introduction

The main aim of the ImageCLEF 2012 personal photo retrieval task is to provide a test bed for image retrieval based on given query images [1]. The task is further divided into two subtasks: visual concept retrieval and event retrieval. Compared with traditional image retrieval, the topics of this task are more abstract or more general, which can make image retrieval more difficult. The benchmark data set used in this task consists of 5,555 images downloaded from Flickr. Both subtasks use the same data set.

Visual concept retrieval is a great challenge for developers. Some of the concepts are abstract, like the topic sign, and some are very subjective, like art object. Different people may even hold different opinions on the same image.

Event retrieval aims to find images showing the same kind of event. Some of the target topics, like fire and conference, are too general to define as a visual concept. Some of the events in this subtask are connected with geographical topics. Thus, most of the topics are difficult to retrieve visually. In such cases, EXIF features may provide much more information about the event concept.

In our participation in the ImageCLEF 2012 personal photo retrieval task, we developed a framework for visual concept retrieval and event retrieval. First, we selected 7 visual features from the feature set provided for the task. Each selected feature is used individually to cluster all the images into groups. We then define the similarity degree for the visual features and the EXIF information. Next, the similarity measures for the different image features are integrated to estimate the similarity score between each image and the query image. The clusters of each feature are used to help weight the image similarity. Finally, the framework combines and ranks the similarity degrees between an image and the different QBE images to retrieve the photos with the same concept.

The remainder of this paper is organized as follows. We describe the features provided by the organizers that we used in Section 2. Section 3 introduces the proposed similarity measures and retrieval methods. In Section 4, we present the experimental results of the proposed framework. Finally, we conclude the paper with a discussion and future work.

2 Process of Image Features

2.1 Visual Features

The original data sets in the personal photo retrieval task provide 19 extracted visual features. After preliminary tests, 7 of the 19 features were selected: AutoColorCorrelogram [2], BIC [3], CEDD [4], Color Structure, Edge Histogram, FCTH [5], and SURF [6]. The selected features cover different kinds of popular visual perception, including color, shape, and texture. SURF is a robust scale-invariant and rotation-invariant descriptor feature. The features are summarized in Table 1.

Table 1. The selected 7 features.
Categories (columns): Color, Shape, Texture, Descriptor.
Visual features (rows): AutoColorCorrelogram, BIC, CEDD, Color Structure, Edge Histogram, FCTH, SURF.

Visual Features Clustering. We first cluster the images by each individual visual feature to find groups of images with similar visual features. Two different clustering methods are developed, one for the SURF descriptor and one for the other visual features.
We describe the clustering algorithms in the following.

SURF feature clustering. The SURF descriptor is scale-invariant and rotation-invariant. In this paper we define the matching pair to measure the similarity between two images. If the SURF descriptor $d_i$ of the image $I_i$ matches a descriptor $d_j$ in the image $I_j$ and vice versa, the descriptors $d_i$ and $d_j$ form a matching pair. The distance between two images $I_i$ and $I_j$ is defined as

$$ dist_{SURF}(I_i, I_j) = \frac{1}{N_{mp}(I_i, I_j)}, \tag{1} $$

where $N_{mp}(I_i, I_j)$ is the number of matching pairs between the two images $I_i$ and $I_j$. The larger $N_{mp}$ is, the more similar the two images are. Based on the matching-pair measure, we propose the clustering algorithm for SURF descriptors shown in Table 2. Before describing the detailed algorithm, we define two cluster distances: the intra-cluster distance $D_{intra}(C_k)$ and the inter-cluster distance $D_{inter}(I_i, C_k)$.

Table 2. The clustering algorithm for the SURF feature.

Algorithm: SurfCluster
Input: the set of images $I$
Output: the clusters of images $C$

    $C = \{\}$;
    while ( $\min\{dist_{SURF}(I_i, I_j)\} < \varepsilon$ )   // $\varepsilon$ is the threshold of the SURF distance
        Case 1: $I_i \in C_k$ and $I_j \in C_l$ for $C_k, C_l \in C$ and $C_k \neq C_l$
            if ( $D_{intra}(C_k \cup C_l) \le \delta_1 \min\{D_{intra}(C_k), D_{intra}(C_l)\}$ )   // $\delta_1$ is a constant
                $C = C \cup \{C_k \cup C_l\} - \{C_k\} - \{C_l\}$;
            else
                SurfCluster($C_k \cup C_l$);
            end if
        Case 2: $I_i \in C_k$ and $I_j \notin C_k$ for $C_k \in C$
            if ( $D_{inter}(I_j, C_k) \le \delta_2 D_{intra}(C_k)$ )   // $\delta_2$ is a constant
                $C = C \cup \{C_k \cup \{I_j\}\} - \{C_k\}$;
            else
                $C = C \cup \{\{I_i, I_j\}\}$;
            end if
        Case 3: $I_i \notin C_k$ and $I_j \notin C_k$ for all $C_k \in C$
            $C = C \cup \{\{I_i, I_j\}\}$;
    end while
    for $C_k, C_l$ in $C$
        if ( $C_k \cap C_l \neq \emptyset$ )
            if ( $D_{intra}(C_k \cup C_l) \le \delta_1 \min\{D_{intra}(C_k), D_{intra}(C_l)\}$ )
                $C = C \cup \{C_k \cup C_l\} - \{C_k\} - \{C_l\}$;
            else
                SurfCluster($C_k \cup C_l$);
            end if
        end if
    end for
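The matching-pair count and the distance of Eq. (1) can be illustrated with a small sketch. The paper does not state how one descriptor "matches" another, so the code below assumes a mutual nearest-neighbour test under Euclidean distance. The function names `nearest`, `count_matching_pairs`, and `dist_surf` are hypothetical, and returning infinity for zero matching pairs is also an assumption.

```python
import math
from typing import List, Sequence

Descriptor = Sequence[float]


def nearest(d: Descriptor, pool: List[Descriptor]) -> int:
    """Index of the descriptor in `pool` closest to `d` (Euclidean distance)."""
    return min(range(len(pool)), key=lambda k: math.dist(d, pool[k]))


def count_matching_pairs(a: List[Descriptor], b: List[Descriptor]) -> int:
    """Count pairs (d_i, d_j) where d_j is the nearest neighbour of d_i in b
    and d_i is the nearest neighbour of d_j in a (assumed matching rule)."""
    if not a or not b:
        return 0
    return sum(1 for i, d in enumerate(a) if nearest(b[nearest(d, b)], a) == i)


def dist_surf(a: List[Descriptor], b: List[Descriptor]) -> float:
    """Eq. (1): the reciprocal of the number of matching pairs.
    Assumption: images without any matching pair are infinitely far apart."""
    n_mp = count_matching_pairs(a, b)
    return math.inf if n_mp == 0 else 1.0 / n_mp


# Toy 2-D "descriptors"; real SURF descriptors have 64 or 128 dimensions.
image_a = [(0.0, 0.0), (10.0, 10.0)]
image_b = [(0.0, 1.0), (10.0, 11.0), (5.0, 5.0)]
print(count_matching_pairs(image_a, image_b), dist_surf(image_a, image_b))
```

The brute-force search is quadratic in the number of descriptors, which is acceptable for a sketch but would likely need an index structure on a full collection.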

$$ D_{intra}(C_k) = \frac{1}{|C_k|^2} \sum_{I_i, I_j \in C_k} dist_{SURF}(I_i, I_j), \quad \text{for } I_i, I_j \in C_k; \tag{2} $$

$$ D_{inter}(I_i, C_k) = \frac{1}{|C_k|} \sum_{I_j \in C_k} dist_{SURF}(I_i, I_j), \quad \text{for } I_j \in C_k. \tag{3} $$

According to our observation, if the number of matching pairs is larger than four, the images look visually similar. Hence, we define the similarity for the SURF feature as

$$ SURF(I_i, I_j) = \max\left\{1, \frac{N_{mp}(I_i, I_j) - 4}{4}\right\}. \tag{4} $$

Other Visual Features. For the other visual features, the clustering method considers only the similarity between two images, using the distance measures of Table 3. The detailed algorithm is listed in Table 4.

Table 3. Features and their distance measures.

| Visual Feature | Distance Measure |
|---|---|
| AutoColorCorrelogram | $L_1$ measure |
| BIC | $L_1$ measure |
| CEDD | Tanimoto measure |
| Color Structure | $L_1$ measure |
| Edge Histogram | $L_1$ measure |
| FCTH | Tanimoto measure |

Table 4. The clustering algorithm for general visual features.

Algorithm: VisualCluster
Input: the set of images $I$
Output: the clusters of images $C$

    $C = \{\}$;
    for $I_i \in I$
        $C = C \cup \{\{I_i\}\}$;
    end for
    for $C_k, C_l$ in $C$
        if ( $\min\{dist(C_k, C_l)\} < \varepsilon$ )   // $\varepsilon$ is the threshold of the minimum distance
            $C = C \cup \{C_k \cup C_l\} - \{C_k\} - \{C_l\}$;
        end if
    end for
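A minimal sketch of the distance measures of Table 3 and the threshold-based merging of Table 4 is given below. It rests on several assumptions not stated in the paper: the Tanimoto measure is taken as one minus the Tanimoto coefficient, the distance between two clusters is the minimum pairwise image distance, and merging repeats until no pair of clusters is closer than the threshold. Under those assumptions the result equals the connected components of the graph linking images closer than the threshold, which the sketch computes with a union-find structure. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def l1_distance(x: Vector, y: Vector) -> float:
    """L1 (city-block) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def tanimoto_distance(x: Vector, y: Vector) -> float:
    """One minus the Tanimoto coefficient (assumed form of the measure)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return 0.0 if denom == 0 else 1.0 - dot / denom


def visual_cluster(features: List[Vector],
                   dist: Callable[[Vector, Vector], float],
                   eps: float) -> List[List[int]]:
    """Group image indices so that clusters closer than `eps` are merged."""
    n = len(features)
    parent = list(range(n))  # every image starts in its own cluster

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist(features[i], features[j]) < eps:
                parent[find(i)] = find(j)  # merge the two clusters

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


toy = [(0, 0), (1, 0), (10, 10), (11, 10), (50, 50)]
print(visual_cluster(toy, l1_distance, eps=2))
```

The same routine could be applied to the GPS and time values of Section 2.2 by passing one-element or two-element vectors with `l1_distance`.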

2.2 Textual Features

The textual features are mainly extracted from the EXIF data. There are 63 EXIF features in total, for example ApertureValue, BrightnessValue, ColorSpace, CompressedBitsPerPixel, and Contrast. However, only two features, the GPS and the time, were considered and used in our methods. The values of the GPS and the time are also clustered by the clustering algorithm for general visual features shown in Table 4, using the $L_1$ distance measure.

3 The Measure for Similarity Image Retrieval

3.1 Normalization of Visual Features

The ranges of the feature distances differ considerably across the visual features. Before all the features are combined to measure the similarity of images, a normalization step is necessary. We use the approximation proposed by Abramowitz and Stegun [7] to compute the normalized values. The approximation step is very fast and accurate. Let $x$ be the similarity between two images for an image feature; the normalization is calculated by

$$ \Phi(x) = 1 - \phi(x)\left(b_1 t + b_2 t^2 + b_3 t^3 + b_4 t^4 + b_5 t^5\right) + \varepsilon(x), \quad t = \frac{1}{1 + b_0 x}, \tag{5} $$

where $\phi(x)$ is the normal probability density function of the similarity degrees among all images in the feature, $b_0$ to $b_5$ are constants, and the absolute error $|\varepsilon(x)|$ is smaller than $7.5 \times 10^{-8}$.

3.2 Similarity Measures of Image Features

The Similarity Measure of Visual Features. Let $I_i$, $I_j$ denote two images. Then the visual similarity between the images $I_i$ and $I_j$, $V(I_i, I_j)$, is defined as

$$ V(I_i, I_j) = SURF(I_i, I_j) \sum_{k} w_k f_k(I_i, I_j), \tag{6} $$

where $f_k(I_i, I_j)$ is the similarity between the images $I_i$ and $I_j$ for the $k$-th feature and $w_k$ is the weight of the $k$-th feature. Two weighting methods, the cluster weighting and the non-cluster weighting, are proposed as follows.

Cluster Weighting. We use the clustering results of Section 2 to weight the features automatically. If a query image belongs to a cluster for a specific visual feature, the average similarity between the query image and each image in the cluster is used as the weight of that visual feature.
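The normalization of Eq. (5) in Section 3.1 can be sketched as follows. The constants are those of formula 26.2.17 in Abramowitz and Stegun [7]. The paper does not say how a raw feature value is standardized before the approximation is applied, so the `normalize` helper, which subtracts a mean and divides by a standard deviation, is an assumption, as are all function names.

```python
import math

# Constants of the Abramowitz and Stegun polynomial approximation (26.2.17).
B0 = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)


def normal_pdf(x: float) -> float:
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Approximate standard normal CDF as in Eq. (5), without the error term.
    The polynomial is valid for x >= 0; symmetry handles negative x."""
    if x < 0:
        return 1.0 - normal_cdf(-x)
    t = 1.0 / (1.0 + B0 * x)
    poly = sum(b * t ** (k + 1) for k, b in enumerate(B))
    return 1.0 - normal_pdf(x) * poly


def normalize(value: float, mean: float, std: float) -> float:
    """Map a raw feature value into (0, 1).
    Assumption: values are standardized with the feature's mean and std."""
    return normal_cdf((value - mean) / std)


for x in (-1.0, 0.0, 0.5, 2.0):
    exact = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    print(x, normal_cdf(x), abs(normal_cdf(x) - exact))
```

Comparing against `math.erf` in the loop gives a quick check that the approximation error stays within the stated bound.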

Non-Cluster Weighting. In this method, the weights $w_k$ are set to 1, except that the weights of the AutoColorCorrelogram, Color Structure, and SURF features are double those of the other visual features.

The Similarity Measure of the GPS Feature. Two similarity measures are proposed for the geographical distance.

Boolean measure. The Boolean measure of the GPS feature is defined as

$$ G_{(B)}(I_i, I_j) = \begin{cases} 1 & \text{if } GPS(I_i) \text{ and } GPS(I_j) \text{ are in the same cluster,} \\ 0 & \text{otherwise;} \end{cases} \tag{7} $$

where $GPS(I_i)$ and $GPS(I_j)$ denote the values of the GPS feature in the EXIF data of $I_i$ and $I_j$.

Similarity measure. The continuous similarity measure on geographical distance is defined as

$$ G_{(S)}(I_i, I_j) = \frac{1}{1 + e^{\lambda \left( dist(GPS(I_i), GPS(I_j)) - radius \right)}}, \tag{8} $$

where $\lambda$ and $radius$ are smoothing parameters, and $dist(GPS(I_i), GPS(I_j))$ is the real geographical distance on earth between the two positions $GPS(I_i)$ and $GPS(I_j)$.

The Similarity Measure of the Time Feature. Two similarity measures are proposed for the time difference.

Boolean measure. The Boolean measure of the time feature is defined as

$$ T_{(B)}(I_i, I_j) = \begin{cases} 1 & \text{if } T(I_i) \text{ and } T(I_j) \text{ are in the same cluster,} \\ 0 & \text{otherwise;} \end{cases} \tag{9} $$

where $T(I_i)$ and $T(I_j)$ denote the time feature in the EXIF data of $I_i$ and $I_j$.

Similarity measure. The continuous similarity measure on time is defined as

$$ T_{(S)}(I_i, I_j) = \frac{1}{dist(T(I_i), T(I_j))}, \tag{10} $$

where $dist(T(I_i), T(I_j))$ denotes the real time difference in seconds between the two timestamps $T(I_i)$ and $T(I_j)$.

3.3 The Ranking of Image Similarity

Finally, we define the similarity between two images $I_i$ and $I_j$ by integrating the features $V(I_i, I_j)$, $G_{(\cdot)}(I_i, I_j)$, and $T_{(\cdot)}(I_i, I_j)$ into a linear combination. The image similarity $Sim(I_i, I_j)$ is defined as

$$ Sim(I_i, I_j) = w_V V(I_i, I_j) + w_G G_{(\cdot)}(I_i, I_j) + w_T T_{(\cdot)}(I_i, I_j). \tag{11} $$
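The linear combination of Eq. (11), together with the maximum over the query images that Section 3.3 uses for ranking, can be sketched as below. The Boolean measures are computed from cluster labels as in Eqs. (7) and (9). Treating an image without a GPS or time value as belonging to no cluster is an assumption, and all names (`Weights`, `boolean_measure`, `image_similarity`, `rank_images`) are hypothetical. The example weights are those reported for the run that combines all features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Weights:
    visual: float  # w_V
    gps: float     # w_G
    time: float    # w_T


def boolean_measure(cluster_of: Dict[str, int], a: str, b: str) -> float:
    """Eqs. (7) and (9): 1 if both images share a cluster, else 0.
    Assumption: an image missing from `cluster_of` has no value and scores 0."""
    ca, cb = cluster_of.get(a), cluster_of.get(b)
    return 1.0 if ca is not None and ca == cb else 0.0


def image_similarity(w: Weights, visual: float, gps: float, time: float) -> float:
    """Eq. (11): weighted linear combination of the three similarities."""
    return w.visual * visual + w.gps * gps + w.time * time


def rank_images(candidates: List[str], queries: List[str],
                sim: Callable[[str, str], float]) -> List[str]:
    """Score each candidate by its best similarity to any query image,
    then sort by descending score (ties keep their input order)."""
    best = {c: max(sim(c, q) for q in queries) for c in candidates}
    return sorted(candidates, key=lambda c: best[c], reverse=True)


w = Weights(visual=0.45, gps=0.18, time=0.22)
gps_cluster = {"a": 0, "b": 0, "c": 1}  # image "d" has no GPS value
print(image_similarity(w, visual=0.5,
                       gps=boolean_measure(gps_cluster, "a", "b"),
                       time=0.0))
```

Keeping the per-feature similarities as plain numbers passed into `image_similarity` makes it easy to swap the Boolean measures for the continuous ones without touching the combination step.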

Given a set of query images $Q_j$, $1 \le j \le m$, the similarity between each query image $Q_j$ and an image $I_i$ in the image set is measured by $Sim(I_i, Q_j)$. The maximum similarity $Sim(I_i, Q_j)$ is the similarity degree of the image $I_i$ for the visual concept given by the $m$ query images. It is formally defined as

$$ \max_{1 \le j \le m} \{ Sim(I_i, Q_j) \}. \tag{12} $$

4 Experiments and Discussion

4.1 Experimental Environments

The system is implemented on a 2.33 GHz PC with 3.00 GB RAM running Microsoft Windows XP SP3. The developed software and related systems are written in Java, so the system is cross-platform. The methods in the five runs used different image features, which are shown in Table 5. The notation in the table is as follows: V stands for the visual features, G denotes the GPS feature, and T is the time feature. The parameters C and N denote the cluster weighting and the non-cluster weighting, respectively. Finally, B represents the Boolean measures and S the similarity measures.

Table 5. Features we used in our methods.

| Method | Visual Features | GPS | Time |
|---|---|---|---|
| V | C | | |
| V + G | N | B | |
| V + T | N | | B |
| V + G + T | N | B | B |
| G + T | | S | S |
| T | | | S |

4.2 Results of Subtask 1: Retrieval of Visual Concepts

In this subtask, 24 visual concept queries, out of a total of 32 concepts, were given for evaluation. The retrieval results for the visual concepts are evaluated by three different measures: precision, NDCG (normalized discounted cumulative gain) [8], and MAP (mean average precision). The experimental results are shown in Table 6. As Table 6 shows, Run 5, which uses all of the image features, is the best for all measures. Second place goes to Run 2, which uses the time feature only. Run 3, with the visual features and the GPS feature, takes third place. Run 1 and Run 4 are worse than these three runs.

The results show that the visual features alone are not useful for most of the visual concepts in the task. The reason is that most of the concept topics are semantically related to each other, and the QBE images share few common characteristics in their visual features. When the visual features are combined with the EXIF features, the performance increases noticeably. The GPS feature helps us find images photographed at neighboring positions easily. Geography-related topics like Asian temple & palace and temple (ancient) have good results. However, some topics that were not expected to perform well, like animals and submarine scene, also returned high precision. The main reason is that a photographer generally tends to take pictures of similar topics at the same place. Although the GPS feature is precise for geography-related topics, missing values of the GPS feature degrade the precision greatly. Some non-geographical topics, like clouds, have clearly bad results in the runs that use the GPS feature.

The time feature is also an important factor in searching personal photos, since images photographed within a short time span are usually very similar or related in visual concept. As the above discussion indicates, the fact that Run 5 obtains the best results shows that our image similarity measure can combine the different image features effectively.

Table 6. Performance on retrieval of visual concepts.

| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|
| Features | V | T | V + G | G + T | V + G + T |
| Weights | $w_V = 1$ | $w_T = 1$ | $w_V = 0.45$, $w_G = 0.17$ | $w_G = 0.975$, $w_T = 0.025$ | $w_V = 0.45$, $w_G = 0.18$, $w_T = 0.22$ |
| P@5 | 0.6750 | 0.8000 | 0.7667 | 0.6500 | 0.8333 |
| P@10 | 0.6125 | 0.7292 | 0.6583 | 0.6500 | 0.7833 |
| P@15 | 0.5778 | 0.6667 | 0.6222 | 0.6083 | 0.7222 |
| P@20 | 0.5354 | 0.6354 | 0.6104 | 0.5771 | 0.6896 |
| P@30 | 0.4486 | 0.6083 | 0.5639 | 0.5611 | 0.6347 |
| P@100 | 0.3054 | 0.4117 | 0.3925 | 0.3925 | 0.4379 |
| NDCG@5 | 0.5701 | 0.5858 | 0.5800 | 0.4073 | 0.6405 |
| NDCG@10 | 0.5062 | 0.5348 | 0.5184 | 0.4268 | 0.6017 |
| NDCG@15 | 0.4798 | 0.5028 | 0.4951 | 0.4123 | 0.5658 |
| NDCG@20 | 0.4545 | 0.4836 | 0.4872 | 0.4066 | 0.5459 |
| NDCG@30 | 0.4016 | 0.4728 | 0.4615 | 0.4046 | 0.5213 |
| NDCG@100 | 0.3303 | 0.4144 | 0.3979 | 0.3717 | 0.4436 |
| MAP@30 | 0.0632 | 0.0952 | 0.0906 | 0.0854 | 0.1026 |
| MAP@100 | 0.0930 | 0.1589 | 0.1558 | 0.1518 | 0.1777 |

4.3 Results of Subtask 2: Retrieval of Events

In this subtask, a total of 15 different event queries are given to find the pictures of the same event. Each query contains three QBE images. The evaluation uses precision, NDCG, and MAP, as in subtask 1.
The experimental results are shown in Table 7. As Table 7 shows, the best results are obtained by Run 2 and Run 5. Run 1, which uses only the visual features, is still the worst, as in subtask 1. Run 3, which uses the visual and the GPS features, is a little better than Run 4, which uses the visual and the time features.
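For readers unfamiliar with the reported measures, precision at a cutoff and NDCG can be sketched as follows. This is an illustration only: it assumes the widely used logarithmic discount `1 / log2(rank + 1)`, and the official evaluation of the task may use a different NDCG variant. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def precision_at_k(relevance: Sequence[float], k: int) -> float:
    """Fraction of the top-k results that are relevant (gain > 0)."""
    return sum(1 for r in relevance[:k] if r > 0) / k


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with the assumed log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int,
              all_gains: Optional[Sequence[float]] = None) -> float:
    """DCG divided by the DCG of an ideal ordering.
    `all_gains` holds the gains of every judged image for the query; if it
    is omitted, the ideal ordering is built from the ranked list itself."""
    pool = all_gains if all_gains is not None else gains
    best = dcg_at_k(sorted(pool, reverse=True), k)
    return 0.0 if best == 0 else dcg_at_k(gains, k) / best


ranked = [1, 0, 1, 1, 0]  # relevance of the top five results of one query
print(precision_at_k(ranked, 5), ndcg_at_k(ranked, 5))
```

Averaging these per-query values over all queries of a subtask gives figures of the kind listed in the result tables.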

Because event queries usually describe images of something happening within a specific time span or location, the time and the GPS features are relatively important here. For example, the topics Australia, Bali, and Egypt are geographically related, while activity topics like conference, party, and rock concert are temporally related. Hence, the EXIF data provided with the images are very useful in this event retrieval subtask. Unlike in subtask 1, Run 5, which combines all features, is not the overall best run. The reason might be that the event queries are not strongly related to the visual concept but are highly dependent on time and location. However, the proposed similarity measure did not degrade the precision much.

Table 7. Performance on retrieval of events.

| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|
| Features | V | G + T | V + G | V + T | V + G + T |
| Weights | $w_V = 1$ | $w_G = 0.975$, $w_T = 0.025$ | $w_V = 0.45$, $w_G = 0.17$ | $w_V = 0.45$, $w_T = 0.22$ | $w_V = 0.45$, $w_G = 0.18$, $w_T = 0.22$ |
| P@5 | 0.6533 | 1.0000 | 0.9333 | 0.9200 | 1.0000 |
| P@10 | 0.5800 | 1.0000 | 0.9000 | 0.8733 | 1.0000 |
| P@15 | 0.5156 | 0.9644 | 0.8533 | 0.8400 | 0.9644 |
| P@20 | 0.4833 | 0.9333 | 0.8100 | 0.7867 | 0.9267 |
| P@30 | 0.4467 | 0.8889 | 0.7622 | 0.6956 | 0.8756 |
| P@100 | 0.2693 | 0.6787 | 0.5740 | 0.4613 | 0.6307 |
| NDCG@5 | 0.6904 | 1.0000 | 0.9417 | 0.9201 | 1.0000 |
| NDCG@10 | 0.6247 | 1.0000 | 0.9153 | 0.8877 | 1.0000 |
| NDCG@15 | 0.5727 | 0.9837 | 0.8884 | 0.8681 | 0.9841 |
| NDCG@20 | 0.5446 | 0.9697 | 0.8636 | 0.8357 | 0.9655 |
| NDCG@30 | 0.5186 | 0.9586 | 0.8458 | 0.7854 | 0.9489 |
| NDCG@100 | 0.4101 | 0.9126 | 0.8042 | 0.6638 | 0.8601 |
| MAP@30 | 0.1100 | 0.3305 | 0.2800 | 0.2287 | 0.3225 |
| MAP@100 | 0.1484 | 0.5533 | 0.4282 | 0.3179 | 0.4947 |

5 Conclusion

In this paper we proposed a framework and similarity measures that combine different image features for retrieving images from a set of conceptual photos. The proposed method can handle the visual concept retrieval subtask in part. In the event retrieval subtask, however, the time and position information are more important than the visual features. Although the proposed method can adjust the weights to fit the requirements, many problems remain to be solved.
In most cases, the proposed framework retrieved the relevant images using manually set weights. Feature selection is important when retrieving an individual concept. For example, the experimental results show that the GPS and the time features are very useful for retrieval in this data set. However, they may not be as effective in other data sets. Selecting and weighting the features automatically remains a challenge in the task.

This is the first year that this pilot task has been run at ImageCLEF. The data set seems too small for evaluating modern applications. Further, the concept queries often contain some images that are visually irrelevant. The procedure for determining the concepts and their relevant images may need to be revised before the data set can serve as a benchmark.

References

1. Zellhöfer, D.: Overview of the Personal Photo Retrieval Pilot Task at ImageCLEF 2012. In: CLEF 2012 working notes, Rome, Italy (2012)
2. Huang, J., Kumar, S., Mitra, M., Zhu, W. J., Zabih, R.: Image Indexing Using Color Correlograms. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 762-768 (1997)
3. Stehling, R. O., Nascimento, M. A., Falcão, A. X.: A Compact and Efficient Image Retrieval Approach Based on Border/Interior Pixel Classification. In: Proceedings of the Eleventh International Conference on Information and Knowledge Management, pp. 102-109 (2002)
4. Chatzichristofis, S. A., Boutalis, Y. S.: CEDD: Color and Edge Directivity Descriptor. A Compact Descriptor for Image Indexing and Retrieval. In: Proceedings of the 6th International Conference on Computer Vision Systems, vol. 5008/2008, pp. 312-322, Springer (2008)
5. Chatzichristofis, S. A., Boutalis, Y. S.: FCTH: Fuzzy Color and Texture Histogram a Low Level Feature for Accurate Image Retrieval. In: Proceedings of the Ninth International Workshop on Image Analysis for Multimedia Interactive Services, pp. 191-196 (2008)
6. Bay, H., Tuytelaars, T., Gool, L. V.: SURF: Speeded-up Robust Features. In: 9th European Conference on Computer Vision, vol. 3951/2006, pp. 404-417, Springer (2006)
7. Abramowitz, M., Stegun, I. A.: Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. ISBN: 0-486-61272-4. Dover Publications (1965)
8. Järvelin, K., Kekäläinen, J.: Cumulated Gain-based Evaluation of IR Techniques. ACM Transactions on Information Systems, vol. 20, no. 4, 422-446 (2002)