2010 2nd International Conference on Information and Multimedia Technology (ICIMT 2010)
IPCSIT vol. 42 (2012) © (2012) IACSIT Press, Singapore
DOI: 10.7763/IPCSIT.2012.V42.10

Index Weight Decision Based on AHP for Information Retrieval on Mobile Device

Yuanyuan Wu 1, Ye Tian 1, Wendong Wang 1, Xirong Que 1, Xiangyang Gong 1, Jian Ma 2, Canfeng Chen 2, Xiaogang Yang 2
1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
2 Nokia Research Center, Beijing, China
yuanyuan32660@sina.com, humaty@163.com, wdwang@bupt.edu.cn, rongqx@bupt.edu.cn, xygong@bupt.edu.cn, jian.j.ma@gmail.com, canfeng-david.chen@nokia.com, extxiaogang..yang@nokia.com

Abstract. Index weight decision is important to the ranking result of information retrieval on mobile devices. Many current methods for determining index weights are subjective and complicated. We therefore introduce an index weight decision technique based on the AHP method so that better retrieval performance can be obtained. We describe in detail how the AHP method is applied to obtain proper index weights in a mobile application, and then implement a prototype project to test the feasibility of the technique. The experimental results show that the index weight decision technique is effective in improving the performance of searching contents on mobile devices.

Keywords: AHP; Search Engine; Mobile Application; Index; Weights

1. Introduction
With the rapid development of the mobile market, content retrieval on mobile devices is playing an increasingly important role in daily life. A mobile device normally holds several kinds of objects, such as conversations, photos, calendars, emails and contacts. Retrieving relevant objects by taking a certain mobile application object as an entry is a hot research point nowadays. When a user is browsing a photo, the user may wish to find information related to the photo, such as the contact information of the people who took the photo, or the conversations held during the time the photo was taken.
The requirements for effective indexing and searching of relevant contents on mobile devices are therefore growing rapidly.

The created index directly affects the ranking of search results, and the field weights in turn affect the created index. That is, changing the weight values changes the scores of the fields, and the result set is then sorted accordingly, so that the documents containing those fields move forward or backward in the ranking. By default, the document score is independent of the weight value. However, once the default weight value is changed, the document score depends on it: the larger the weight value, the higher the document score. Thus, setting field weights in the process of indexing is significant.

There are several ways of setting field weights, but in most cases they are subjective and rely merely on the experience of the developers. To solve this problem, in this paper we exploit an AHP-based method to set the field weights such that objective and reasonable weights can be obtained.

The rest of the paper is organized as follows. In Section 2, we briefly introduce the related work. We then describe the AHP method and the framework of the mobile application in Section 3, and show how the method is applied to set field weights in Section 4. In Section 5, the prototype project that sets field weights with the AHP method is evaluated and the experimental results are analyzed. Finally, we conclude our work and describe the future work in Section 6.

2. Related Work
Content retrieval on mobile devices has become a hot topic, and effective indexing and searching play important roles in the field. Setting field weights in the process of indexing has therefore become a meaningful research point. Hye-Jin Jeong [4] proposes a technique that determines the weight of the index. The technique updates the weight of the index on the basis of the weight of terms, and calculates the weight for the term in each paragraph. Ning Zhou et al. [6] introduce a method to create an index in Lucene and to utilize the index such that the weighted frequent items can be found. They also implement a weighted association rule mining algorithm based on the Apriori algorithm, which searches the weighted frequent items until a satisfactory result is obtained. However, the methods described above are mainly used in web search applications; they are complicated and need to update the index frequently, which makes them unsuitable for searching on mobile devices. Therefore we propose a novel approach which can be well applied in mobile applications and produces effective performance.

3. System Framework
3.1 AHP Introduction
The Analytic Hierarchy Process (AHP) by Saaty [1] is an analysis method that combines qualitative and quantitative aspects. It decomposes a complex problem into various component factors, and groups those factors according to their dominating relations in order to form a progressive hierarchical structure. Through pairwise comparisons of the factors, the relative importance of the factors is determined. Through comprehensive judgment, the total order of relative importance in the decision solution is then determined. Figure 1 [2] depicts the model of AHP.
The characteristic of the method is that the weight of each factor is calculated from the relative importance of the various factors, which makes the result more scientific by better avoiding subjectivity and arbitrariness.

Figure 1. AHP flow chart

Due to its practicability and effectiveness in complex decision-making problems, AHP was quickly applied in economic planning and management, energy policy and distribution, behavioral science, transportation, agriculture, education, medical treatment, environment, etc. However, it is rarely used in computer science, especially in information retrieval.

3.2 Mobile Application Based on AHP
In this paper we exploit the AHP model for retrieval in mobile applications. The mobile application objects, such as photo, conversation, email, calendar and contact, can each be taken as an entry to search for relevant objects. Figure 2 shows the architecture of the mobile application, which creates the index based on AHP.

Figure 2. Architecture of mobile application based on AHP

The raw data, including the photos, conversations, emails, calendars and contacts, are collected on the mobile device. When these data are indexed, the AHP method determines the weight of each field, which is stored in the index files. Once indexing is done, users can input search conditions on the mobile device, and the system retrieves data from the index after analyzing the conditions. Finally, the data obtained from the index are filtered and ranked by the system, and the search results are displayed on the mobile device. The architecture shows that the AHP method plays an important role in determining the field weights in the process of indexing. In the next section, we discuss in detail the implementation of setting the index weights based on AHP.

4. System Implementation
In this section, we describe in detail how AHP is applied to set index weights, based on a prototype project.

4.1 Establish hierarchical analysis structure model
On the basis of a thorough analysis of the practical problem, the factors influencing content retrieval on mobile devices are decomposed into several layers. The factors of one layer influence those of the upper layer and dominate those of the lower layer. When setting the index weights with AHP, the hierarchical structure model can be established as shown in Figure 3.

Figure 3. Hierarchical structure of mobile application

The top layer A is called the goal layer; it represents the problem to be solved with the analytic hierarchy process (AHP). Here, the goal of using the AHP method is to obtain proper index weights.

The intermediate layer B is called the criterion layer, which links the goal layer A and the measure layer C. In this system the layer is needed to determine the final weights for the goal layer, and it is in turn influenced by the measure layer. Photo, conversation, calendar, email and contact are considered in this layer.

The bottom layer C is called the measure layer, which stands for the policies or measures to solve the problem. In this system the layer contains the detailed factors of each object in the criterion layer. The factors in layer C directly influence the result of layer B.

4.2 Establish the judgment matrix
From the goal layer A to the measure layer C of the hierarchical structure model, the judgment matrix is established through pairwise comparison of the relevant factors in the same layer. The comparison result represents the degree of importance of the lower layer to the upper one, and it is expressed with the 1-9 scale definition shown in Figure 4.

Figure 4. 1-9 scale and definition

In this system, each factor of the criterion layer B is compared with the others in the same layer, so that its influence on the goal layer A can be obtained. That is, the relative importance among photo, conversation, calendar, email and contact needs to be calculated. The factors of the measure layer C are then compared pairwise in order to get their influence on the relevant factor of the criterion layer B. For example, photo has the attributes time, GPS and keywords. The comparison among those attributes gives their degree of importance to the photo.

4.3 Determine the index weights
The steps of calculating the index weights are as follows.

Based on the judgment matrix, calculate $\bar{W}_i$ for each row $i$ of the matrix, where $b_{ij}$ is the entry in row $i$ and column $j$ and $n$ is the order of the matrix:

$$\bar{W}_i = \sqrt[n]{\prod_{j=1}^{n} b_{ij}} \quad (i = 1, 2, 3, \ldots, n) \qquad (1)$$

Normalize the vector $\bar{W} = [\bar{W}_1, \bar{W}_2, \ldots, \bar{W}_n]^T$:

$$W_i = \frac{\bar{W}_i}{\sum_{j=1}^{n} \bar{W}_j} \quad (i = 1, 2, \ldots, n) \qquad (2)$$

$W = [W_1, W_2, \ldots, W_n]^T$ is the desired eigenvector.

Calculate the maximum eigenvalue $\lambda_{max}$ of the judgment matrix $A$:

$$\lambda_{max} = \sum_{i=1}^{n} \frac{(AW)_i}{n W_i} \qquad (3)$$

where $(AW)_i$ represents the $i$-th component of the vector $AW$.

In order to test the consistency of the judgment matrix, the consistency index CI needs to be calculated:

$$CI = \frac{\lambda_{max} - n}{n - 1} \qquad (4)$$

When CI = 0, the matrix has complete consistency. To test whether the consistency of the judgment matrix is satisfactory, CI is compared with the average random consistency index RI. For matrices of order 1 to 9, the RI values are shown in Figure 5.

Figure 5. RI values

When CR = CI/RI < 0.10, the matrix is considered to have satisfactory consistency. Otherwise the matrix needs to be adjusted.

With the above steps, the index weights of layer B to layer A and of layer C to layer B can be calculated. In this application, the weight of each factor in the criterion layer is calculated, and the results show the relative importance of each factor of the criterion layer to the goal layer. For example, if the weight of the conversation is larger than that of the photo, the conversation is more important than the photo for the goal of obtaining proper index weights. The same holds for the measure layer with respect to the criterion layer. Based on the weights of layer B to layer A and of layer C to layer B, the final weights of layer C to layer A can be calculated by multiplying the weight of each layer-C factor by the weight of its parent factor in layer B, and they are exactly what we desire. That is, the weight of each field is obtained, which reaches the goal of getting proper index weights in the application.

5. Performance Evaluation
Based on the implementation of the prototype project, we collect 800 photos, 800 conversations, 26 calendars, 1080 emails and 524 contacts on the mobile device. These data are gathered from 50 scenes over two weeks. For example, we set a scene of inviting friends to watch flowers, in which we send 35 conversations and 20 emails, set 4 calendars and 10 contacts, and take 40 photos. All of the data are collected to support the performance evaluation of the index weights with AHP.

5.1 Results of the application with AHP
1) Survey on the desired index weights: We distribute questionnaires to 30 graduate students in the field of computer science. From the data collected with the 30 questionnaires, which are designed on the basis of the 1-9 scale, the judgment matrices are generated.
Figure 6 shows a sample of the questionnaire, which represents a student's selection of the relative importance between the fields.

Figure 6. Sample of the questionnaire

From the sample, we can see that the user thinks time is slightly more important than GPS for photo searching. Based on the 1-9 scale, the value of time to GPS is 3, and the value of GPS to time is 1/3 in the judgment matrix.

2) Calculation of the index weights: On the basis of the data collected from the questionnaires and the calculation method introduced in Section 4, the weights of layer B to layer A are obtained as shown in TABLE I. In the table, B1 represents the photo, B2 the conversation, B3 the calendar, B4 the email, and B5 the contact.

TABLE I. A-B JUDGMENT MATRIX AND WEIGHTS

A     B1    B2    B3    B4    B5    W
B1    1     1/5   1/3   1/6   1/6   0.0440
B2    5     1     4     1     2     0.3242
B3    3     1/4   1     1/4   1/4   0.0841
B4    6     1     4     1     1     0.2928
B5    6     1/2   4     1     1     0.2549

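As an illustration, the calculation steps of Section 4.3 can be sketched in Python and applied to the judgment matrix of TABLE I. This is a minimal sketch rather than the prototype's actual code: the function names are hypothetical, and the RI values are the commonly cited Saaty values, assumed here because Figure 5 is not reproduced in the text.

```python
import math

# Commonly cited Saaty random-index values for orders 1..9. Assumption: the
# paper's Figure 5 is an image, so its exact RI values are not in the text.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def reciprocal_matrix(n, upper):
    """Build an n x n judgment matrix from upper-triangle scores.

    `upper` maps (i, j) with i < j to the 1-9 scale value of factor i
    relative to factor j; the mirrored entry is filled with its reciprocal.
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j] = float(value)
        a[j][i] = 1.0 / value
    return a


def ahp_weights(a):
    """Root method: n-th root of each row product (1), then normalize (2)."""
    n = len(a)
    roots = [math.prod(row) ** (1.0 / n) for row in a]
    total = sum(roots)
    return [r / total for r in roots]


def consistency(a, w):
    """Return (lambda_max, CI, CR) following (3), (4) and CR = CI / RI."""
    n = len(a)
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / (n * w[i]) for i in range(n))
    if n < 3:  # matrices of order 1 or 2 are always consistent
        return lambda_max, 0.0, 0.0
    ci = (lambda_max - n) / (n - 1)
    return lambda_max, ci, ci / RANDOM_INDEX[n]


# Upper triangle of TABLE I (B1 photo, B2 conversation, B3 calendar,
# B4 email, B5 contact), using 0-based indices.
table_1 = reciprocal_matrix(5, {
    (0, 1): 1 / 5, (0, 2): 1 / 3, (0, 3): 1 / 6, (0, 4): 1 / 6,
    (1, 2): 4, (1, 3): 1, (1, 4): 2,
    (2, 3): 1 / 4, (2, 4): 1 / 4,
    (3, 4): 1,
})
weights = ahp_weights(table_1)
lambda_max, ci, cr = consistency(table_1, weights)
print([round(x, 4) for x in weights])
print(round(lambda_max, 4), round(ci, 4), round(cr, 4), cr < 0.10)
```

Under these assumptions the computed weights should agree with the W column of TABLE I up to rounding, and the consistency ratio should fall below the 0.10 threshold. The same functions apply unchanged to the layer C matrices of each criterion.
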
The weights of layer C to layer B are also obtained. Here we take the photo B1 as an example; the weights of time, GPS and keywords to the photo are shown in TABLE II. In the table, C1 represents time, C2 GPS, and C3 keywords.

TABLE II. B1-C JUDGMENT MATRIX AND WEIGHTS

B1    C1    C2    C3    W
C1    1     1     1/4   0.2212
C2    1     1     1/4   0.2212
C3    4     4     1     0.5576

Finally, the weights of layer C to layer A are calculated as shown in TABLE III. They reflect the degree of importance of each field to the search results: the larger the weight, the more important the field. To enlarge the differences among the field weights, each weight is multiplied by 200 to get the final field weight.

TABLE III. A-C FINAL WEIGHTS

A-B            B-C             C to A    Field Weight
B1 (0.0440)    C1 (0.2212)     0.0097    1.94
               C2 (0.2212)     0.0097    1.94
               C3 (0.5576)     0.0245    4.90
B2 (0.3242)    C4 (0.0963)     0.0312    6.24
               C5 (0.0973)     0.0315    6.30
               C6 (0.3828)     0.1241    24.82
               C7 (0.4236)     0.1373    27.46
B3 (0.0841)    C8 (0.1041)     0.0088    1.76
               C9 (0.3878)     0.0326    6.52
               C10 (0.3878)    0.0326    6.52
               C11 (0.1203)    0.0102    2.04
B4 (0.2928)    C12 (0.0733)    0.026     4.52
               C13 (0.0572)    0.0168    3.36
               C14 (0.1637)    0.0479    9.58
               C15 (0.3852)    0.1128    22.56
               C16 (0.3166)    0.0928    18.56
B5 (0.2549)    C17 (0.5453)    0.1390    27.80
               C18 (0.1653)    0.0421    8.42
               C19 (0.0948)    0.0242    4.84
               C20 (0.0763)    0.0194    3.88
               C21 (0.1183)    0.0302    6.04

5.2 Evaluation of the results
Here, we use the most popular metrics, precision ratio and recall ratio, to evaluate the search performance. The relevance ratio is then used to evaluate the document ranking [3].

3) Precision Ratio and Recall Ratio Evaluation: Precision ratio is defined as the ratio of the searched relevant items to the total searched items, as shown in (5) [4].

$$PR = \frac{SRD}{SD} \qquad (5)$$

where PR is the precision ratio, SRD is the number of searched relevant documents and SD is the total number of searched documents [4].
Recall ratio is defined in (6) [4]:

$$RR = \frac{SRD}{RD} \qquad (6)$$

where RR is the recall ratio, SRD is the number of searched relevant documents and RD is the total number of relevant documents [4]. Figure 7 shows the test results for the precision and recall metrics. We can see that the index weights determined by the AHP method in this paper yield improved search performance, with higher precision ratio and recall ratio.
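
The two metrics in (5) and (6) can be sketched for a single query as follows. The function name and the document IDs are hypothetical and serve only to illustrate the definitions; they are not taken from the prototype.

```python
def precision_recall(searched, relevant):
    """Return (PR, RR) for one query.

    searched: IDs of the documents returned by the search (SD = its size).
    relevant: IDs of all documents relevant to the query (RD = its size).
    """
    searched, relevant = set(searched), set(relevant)
    srd = len(searched & relevant)  # searched relevant documents (SRD)
    pr = srd / len(searched) if searched else 0.0
    rr = srd / len(relevant) if relevant else 0.0
    return pr, rr


# Hypothetical query: 4 documents returned, 3 relevant ones exist in total.
pr, rr = precision_recall(["p1", "c7", "e3", "k2"], ["p1", "e3", "m9"])
print(pr, rr)
```

In this hypothetical query two of the four returned documents are relevant, so the precision ratio is one half, and two of the three relevant documents are found, so the recall ratio is two thirds. Empty inputs are mapped to 0.0 here as a design choice to avoid division by zero.
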

Figure 7. Precision and recall of results

4) Relevance Ratio Evaluation: Relevance ratio is suitable for evaluating the ranking performance determined by the index weights.

$$relevance = \frac{\sum_{i=1}^{n} R_{score}}{\sum_{i=1}^{n} R_{max}} \qquad (7)$$

where $R_{score}$ is the relevance of a paper as evaluated by the user, within the range 0-3. The values are defined as: 0 = Not relevant, 1 = Normal, 2 = Relevant, 3 = Very relevant. $R_{max}$ is the maximum relevance value, which is 3. $n$ represents the number of documents within the top % of ranked papers [4]. The relevance ratio of the test results is shown in Figure 8. We can see that the ranking performance is improved significantly after using the AHP method, especially for the search precision of the photo, conversation and email.

Figure 8. Relevance ratio of results

6. Conclusion and Future Work
In this paper, we introduce the AHP method to obtain objective and scientific index weights. First, we describe the AHP method and the mobile application based on AHP. Second, we describe in detail how the field weights are set with AHP in the application. Finally, we develop a prototype project that sets field weights with the AHP method and evaluate the performance of the application with AHP. The results show that the index weights generated by AHP improve the ranking of search results and retrieve satisfactory contents on mobile devices. However, many factors other than the index weights also affect the ranking of search results, so there is still a long way to go toward better ranking in the field of searching on mobile devices.

7. References
[1] T.L. Saaty. Decision Making with Dependence and Feedback: the Analytic Network Process. Pittsburgh: RWS, 2001.
[2] Li Yongxi, Wang Kaizhuo. Index Weight Technology in Threat Evaluation Based on Improved Grey Theory. Intelligent Information Technology Application Workshops, 2008.
[3] Seon-Mi Woo, Chun-Sik Yoo, Yong-Sung Kim. User-Centered Document Ranking Technique using Term Association Analysis. Journal of Korea Information Science Society, Vol.28, No.2, pp.149-156, 2001.
[4] Hye-Jin Jeong. Index Weighting Based on the Weight of the Index of Each Tag. Future Generation Communication and Networking Symposia, 2008.
[5] Hye-Jin Jeong, Yong-Sung Kim. Index Weight Decision Technique for Search Reliable Documents. Future Generation Communication and Networking, 2008.
[6] Ning Zhou, JiaXi Wu, Shaolong Zhang, Hongqi Chen, XiangRong Zhang. Mining Weighted Association Rules with Lucene Index. Wireless Communications, Networking and Mobile Computing, 2007.
[7] Mei Yang, Meihong Deng, Rongya Jia, Yue Liu. Research on index weight based on improved Grey Relational Analysis. Machine Learning and Cybernetics (ICMLC), 2010.