2010 2nd International Conference on Information and Multimedia Technology (ICIMT 2010)
IPCSIT vol. 42 (2012), IACSIT Press, Singapore
DOI: 10.7763/IPCSIT.2012.V42.10

Index Weight Decision Based on AHP for Information Retrieval on Mobile Device

Yuanyuan Wu 1, Ye Tian 1, Wendong Wang 1, Xirong Que 1, Xiangyang Gong 1, Jian Ma 2, Canfeng Chen 2, Xiaogang Yang 2
1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
2 Nokia Research Center, Beijing, China
yuanyuan32660@sina.com, humaty@163.com, wdwang@bupt.edu.cn, rongqx@bupt.edu.cn, xygong@bupt.edu.cn, jian.j.ma@gmail.com, canfeng-david.chen@nokia.com, extxiaogang..yang@nokia.com

Abstract. Index weight decision is important to the ranking results of information retrieval on mobile devices. Many current methods for determining index weights are subjective and complicated. We therefore introduce an index weight decision technique based on the AHP method so that better retrieval performance can be obtained. We describe in detail how the AHP method is applied to obtain proper index weights in a mobile application, and then implement a prototype project to test the feasibility of the technique. The experimental results show that the index weight decision technique is effective in improving the performance of searching contents on mobile devices.

Keywords: AHP; Search Engine; Mobile Application; Index; Weights.

1. Introduction

With the rapid development of the mobile market, content retrieval on mobile devices is playing an increasingly important role in daily life. A mobile device normally holds several kinds of objects, such as conversations, photos, calendars, emails and contacts. Retrieving relevant objects by taking a certain mobile application object as an entry is an active research topic. When browsing a photo, a user may wish to find information related to it, such as the contact information of the people who took the photo, or the conversations that took place around the time the photo was taken.
The requirements for effective indexing and searching of relevant contents on mobile devices are therefore growing rapidly. The created index directly affects the ranking of search results, and the field weights in turn affect the created index. That is, changing the weight values changes the scores of the fields, and the result set is then sorted accordingly, so that the documents containing those fields move forward or backward in the ranking. With the default weight value, the document score does not depend on the weight. Once the default is changed, however, the document score depends on the weight value: the larger the weight, the higher the document score. Setting field weights during indexing is therefore significant. There are several ways of setting field weights, but in most cases they are subjective and rely merely on the experience of the developers. To solve this problem, in this paper we use an AHP-based method to set the field weights so that objective and reasonable weights can be obtained.

The rest of the paper is organized as follows. In Section 2, we briefly introduce related work. In Section 3, we describe the AHP method and how we apply it to set field weights. In Sections 4 and 5, a prototype project that sets field weights with the AHP method is developed and the experimental results are analyzed. Finally, we conclude our work and describe future work in Section 6.

2. Related Work

Content retrieval on mobile devices has been a hot topic, and effective indexing and searching play important roles in the field. Setting field weights during indexing has therefore become a meaningful research point. Hye-Jin Jeong [4] proposes a technique that determines the weight of the index. The technique updates the weight of the index on the basis of the weight of terms, and calculates the weight for the term in each paragraph. Ning Zhou et al. [6] introduce a method to create an index in Lucene and to use that index so that the weighted frequent items can be found. They also implement a weighted association rule mining algorithm based on the Apriori algorithm, which searches the weighted frequent items until a satisfactory result is obtained. However, the methods described above are mainly used in web search applications. They are complicated and need to update the index frequently, which makes them unsuitable for searching on mobile devices. We therefore propose a novel approach that can be applied well in mobile applications and performs effectively.

3. System Framework

3.1 AHP Introduction

The Analytic Hierarchy Process (AHP) by Saaty [1] is an analysis method that combines qualitative and quantitative aspects. It decomposes a complex problem into its component factors and groups those factors according to their dominance relations to form a hierarchical structure. The relative importance of the factors is determined through pairwise comparisons, and the overall ordering of relative importance in the decision is determined through comprehensive judgment. Figure 1 [2] depicts the model of AHP.
The characteristic of the method is that the weight of each factor is calculated from the relative importance of the various factors, which makes it more scientific by avoiding subjectivity and arbitrariness.

Figure 1. AHP flow chart

Due to its practicability and effectiveness in complex decision-making problems, AHP was quickly applied in economic planning and management, energy policy and distribution, behavioral science, transportation, agriculture, education, medical treatment, environment, etc. However, it is rarely used in computer science, especially in information retrieval.

3.2 Mobile Application Based on AHP

In this paper we use the AHP model for retrieval in mobile applications. Mobile application objects, such as photos, conversations, emails, calendars and contacts, can be taken as an entry to search for relevant objects. Figure 2 shows the architecture of the mobile application, which creates the index based on AHP.

Figure 2. Architecture of mobile application based on AHP

The raw data, including the photos, conversations, emails, calendars and contacts, are collected on the mobile device. When indexing those data, the AHP method determines the weight of each field, which is stored in the index files. Once indexing is done, users can input search conditions on the mobile device, and the system retrieves data from the index after analyzing the conditions. Finally, the search results are displayed on the mobile device after the data obtained from the index has been filtered and ranked by the system. The architecture shows that the AHP method plays an important role in determining the field weights during indexing. In the next section, we discuss in detail how the index weights are set based on AHP.

4. System Implementation

In this section, we describe in detail how AHP is applied to set index weights, based on a prototype project.

4.1 Establish hierarchical analysis structure model

On the basis of a thorough analysis of the practical problem, the factors influencing content retrieval on mobile devices are decomposed into several layers. The factors in one layer influence those of the layer above and dominate those of the layer below. When setting the index weights with AHP, the hierarchical structure model can be established as shown in Figure 3.

Figure 3. Hierarchical structure of mobile application

The top layer A is called the goal layer. It states the problem to be solved with the analytic hierarchy process (AHP). Here the goal is to obtain proper index weights.

The intermediate layer B is called the criterion layer, which links goal layer A and measure layer C. In this system, the layer is needed to determine the final weights in the goal layer, and it is in turn influenced by the measure layer. Photo, conversation, calendar, email and contact are considered in this layer.

The bottom layer C is called the measure layer, which stands for the policies or measures used to solve the problem. In this system, the layer contains the detailed factors of each object in the criterion layer. The factors in layer C directly influence the result of layer B.

4.2 Establish the judgment matrix

From goal layer A to measure layer C of the hierarchical structure model, the judgment matrix is established by pairwise comparison of the relevant factors in the same layer. Each comparison result represents the degree of importance of a lower-layer factor to the upper layer, and is expressed on the 1-9 scale defined in Figure 4.

Figure 4. 1-9 scale and definition

In this system, each factor of criterion layer B is compared with the others in the same layer to obtain its influence on goal layer A. That is, the relative importance among photo, conversation, calendar, email and contact needs to be calculated. The factors of measure layer C are then compared pairwise to obtain their influence on the relevant factor of criterion layer B. For example, a photo has the attributes time, GPS and keywords. The comparison among those attributes gives their degree of importance to the photo.

4.3 Determine the index weights

The steps of calculating the index weights are as follows.

Based on the n x n judgment matrix A = (b_ij), calculate $\bar{W}_i$ for each row of the matrix:

$$ \bar{W}_i = \left( \prod_{j=1}^{n} b_{ij} \right)^{1/n}, \quad i = 1, 2, \ldots, n \qquad (1) $$

Normalize the vector $\bar{W} = [\bar{W}_1, \bar{W}_2, \ldots, \bar{W}_n]^T$:

$$ W_i = \frac{\bar{W}_i}{\sum_{k=1}^{n} \bar{W}_k}, \quad i = 1, 2, \ldots, n \qquad (2) $$

$W = [W_1, W_2, \ldots, W_n]^T$ is the desired eigenvector.

Calculate the maximum eigenvalue $\lambda_{\max}$ of the judgment matrix:

$$ \lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(AW)_i}{W_i} \qquad (3) $$

where $(AW)_i$ represents the i-th component of the vector AW.

To test the consistency of the judgment matrix, the consistency index (CI) needs to be calculated.

$$ CI = \frac{\lambda_{\max} - n}{n - 1} \qquad (4) $$

where n is the order of the judgment matrix. When CI = 0, the matrix is completely consistent. To test whether the consistency of the judgment matrix is satisfactory, CI is compared with the average random consistency index RI. For matrices of order 1 to 9, the RI values are shown in Figure 5.

Figure 5. RI values

When CR = CI/RI < 0.10, the matrix is considered to have satisfactory consistency. Otherwise the matrix needs to be adjusted.

With the above steps, the index weights of layer B to layer A and of layer C to layer B can be calculated. In this application, the weight of each factor in the criterion layer is calculated, and the results show the relative importance of each factor of the criterion layer to the goal layer. For example, when the weight of the conversation is larger than that of the photo, the conversation is more important than the photo for the goal of obtaining proper index weights. The same holds for the measure layer relative to the criterion layer. Based on the weights of layer B to layer A and of layer C to layer B, the final weights of layer C to layer A can be calculated, and they are exactly what we want. That is, the weight of each field is obtained, which reaches the goal of getting proper index weights in the application.

5. Performance Evaluation

Based on the implementation of the prototype project, we collected 800 photos, 800 conversations, 26 calendars, 1080 emails and 524 contacts on a mobile device. Those data were gathered based on 50 scenes over two weeks. For example, we set a scene of inviting friends to watch flowers, in which we sent 35 conversations and 20 emails, set 4 calendars and 10 contacts, and took 40 photos. All of the data were collected to support the performance evaluation of the index weights obtained with AHP.

5.1 Results of the application with AHP

1) Survey on the desired index weights: We distributed questionnaires to 30 graduate students in the field of computer science. From the data collected with the 30 questionnaires, which were designed on the basis of the 1-9 scale, the judgment matrices are generated.
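The text does not say how the answers of several respondents are combined into one judgment matrix. A minimal sketch, assuming an element-wise geometric mean (a common choice in group AHP because it keeps the matrix reciprocal), is given here. The function name `aggregate_judgments` and both respondent matrices are hypothetical and are not taken from the questionnaires.

```python
import math


def aggregate_judgments(matrices):
    """Element-wise geometric mean of same-sized judgment matrices.

    Assumption: this aggregation rule is illustrative only; the source
    does not state which rule was actually used.
    """
    k = len(matrices)
    n = len(matrices[0])
    return [
        [math.prod(m[i][j] for m in matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical respondents comparing time, GPS and keywords for photos.
respondent_a = [[1, 3, 1 / 2], [1 / 3, 1, 1 / 4], [2, 4, 1]]
respondent_b = [[1, 1, 1 / 4], [1, 1, 1 / 4], [4, 4, 1]]

group = aggregate_judgments([respondent_a, respondent_b])
for row in group:
    print([round(value, 3) for value in row])
```

Because each entry and its mirror are averaged in the same way, the aggregated matrix still satisfies b_ji = 1 / b_ij, so it can be fed into the weight calculation of Section 4.3 unchanged.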
Figure 6 shows a sample of the questionnaire, which records a student's selection of the relative importance between fields.

Figure 6. Sample of the questionnaire

From the sample, we can see that the user considers time slightly more important than GPS for photo searching. On the 1-9 scale, the value of time to GPS is 3, and the value of GPS to time is 1/3 in the judgment matrix.

2) Calculation of the index weights: On the basis of the data collected from the questionnaires and the calculation method introduced in Section 4, the weights of layer B to layer A are obtained, as shown in TABLE I. In the table, B1 represents the photo, B2 represents the conversation, B3 represents the calendar, B4 represents the email, and B5 represents the contact.

TABLE I. A-B JUDGMENT MATRIX AND WEIGHTS

A-B   B1    B2    B3    B4    B5    W
B1    1     1/5   1/3   1/6   1/6   0.0440
B2    5     1     4     1     2     0.3242
B3    3     1/4   1     1/4   1/4   0.0841
B4    6     1     4     1     1     0.2928
B5    6     1/2   4     1     1     0.2549
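As a rough check, the steps of Section 4.3 can be applied to the layer B judgment matrix above. The sketch below assumes that equation (1) is the n-th root of each row product (the geometric-mean method), and it uses Saaty's commonly tabulated RI values because the RI figure is not reproduced in the text. The names `reciprocal_matrix`, `ahp_weights` and `TABLE_I_UPPER` are hypothetical.

```python
import math

# Saaty's commonly tabulated random consistency index for orders 1..9.
# Assumption: these stand in for the RI figure, which is not in the text.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def reciprocal_matrix(n, upper):
    """Build an n x n judgment matrix from upper-triangle judgments.

    `upper` maps (i, j) with i < j to the 1-9 scale value of item i over
    item j; the mirrored entry is filled with the reciprocal.
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j] = float(value)
        a[j][i] = 1.0 / value
    return a


def ahp_weights(a):
    """Return (weights, lambda_max, ci, cr) for judgment matrix `a`."""
    n = len(a)
    # Eq. (1): n-th root of each row product.
    roots = [math.prod(row) ** (1.0 / n) for row in a]
    # Eq. (2): normalize so the weights sum to 1.
    total = sum(roots)
    w = [r / total for r in roots]
    # Eq. (3): lambda_max as the mean of (A W)_i / W_i.
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    # Eq. (4) and the consistency ratio CR = CI / RI.
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, lambda_max, ci, cr


# Upper triangle of TABLE I; order: photo, conversation, calendar,
# email, contact (B1..B5).
TABLE_I_UPPER = {
    (0, 1): 1 / 5, (0, 2): 1 / 3, (0, 3): 1 / 6, (0, 4): 1 / 6,
    (1, 2): 4, (1, 3): 1, (1, 4): 2,
    (2, 3): 1 / 4, (2, 4): 1 / 4,
    (3, 4): 1,
}

weights, lambda_max, ci, cr = ahp_weights(reciprocal_matrix(5, TABLE_I_UPPER))
print([round(w, 4) for w in weights])
print(round(lambda_max, 4), round(ci, 4), round(cr, 4))
```

Under these assumptions the computed weights should agree with the W column of TABLE I to within rounding, and the consistency ratio falls below the 0.10 threshold.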

The weights of layer C to layer B are also obtained. Here we take the photo B1 as an example. The weights of time, GPS and keywords to the photo are shown in TABLE II. In the table, C1 represents time, C2 represents GPS, and C3 represents keywords.

TABLE II. B1-C JUDGMENT MATRIX AND WEIGHTS

B1-C  C1    C2    C3    W
C1    1     1     1/4   0.2212
C2    1     1     1/4   0.2212
C3    4     4     1     0.5576

Finally, the weights of layer C to layer A are calculated, as shown in TABLE III. They reflect the degree of importance of each field to the search results: the larger the weight, the more important the field. To enlarge the difference among the field weights, each weight is multiplied by 200 to get the final field weight.

TABLE III. A-C FINAL WEIGHTS

A-B          B-C           C to A        Field Weight
B1(0.0440)   C1(0.2212)    C1(0.0097)    C1(1.94)
             C2(0.2212)    C2(0.0097)    C2(1.94)
             C3(0.5576)    C3(0.0245)    C3(4.90)
B2(0.3242)   C4(0.0963)    C4(0.0312)    C4(6.24)
             C5(0.0973)    C5(0.0315)    C5(6.30)
             C6(0.3828)    C6(0.1241)    C6(24.82)
             C7(0.4236)    C7(0.1373)    C7(27.46)
B3(0.0841)   C8(0.1041)    C8(0.0088)    C8(1.76)
             C9(0.3878)    C9(0.0326)    C9(6.52)
             C10(0.3878)   C10(0.0326)   C10(6.52)
             C11(0.1203)   C11(0.0102)   C11(2.04)
B4(0.2928)   C12(0.0733)   C12(0.026)    C12(4.52)
             C13(0.0572)   C13(0.0168)   C13(3.36)
             C14(0.1637)   C14(0.0479)   C14(9.58)
             C15(0.3852)   C15(0.1128)   C15(22.56)
             C16(0.3166)   C16(0.0928)   C16(18.56)
B5(0.2549)   C17(0.5453)   C17(0.1390)   C17(27.80)
             C18(0.1653)   C18(0.0421)   C18(8.42)
             C19(0.0948)   C19(0.0242)   C19(4.84)
             C20(0.0763)   C20(0.0194)   C20(3.88)
             C21(0.1183)   C21(0.0302)   C21(6.04)

5.2 Evaluation of the results

Here we use the popular metrics of precision ratio and recall ratio to evaluate search performance. The relevance ratio is then used to evaluate the document ranking [3].

3) Precision Ratio and Recall Ratio Evaluation: Precision ratio is defined as the ratio of retrieved relevant items to the total retrieved items, as shown in (5) [4].

$$ PR = \frac{SRD}{SD} \qquad (5) $$

where PR is the precision ratio, SRD is the number of retrieved relevant documents and SD is the total number of retrieved documents [4].
Recall ratio is defined in (6) [4]:

$$ RR = \frac{SRD}{RD} \qquad (6) $$

where RR is the recall ratio, SRD is the number of retrieved relevant documents and RD is the total number of relevant documents [4].

Figure 7 shows the test results for the precision and recall metrics. We can see that the index weights determined by the AHP method in this paper yield improved search performance, with higher precision ratio and recall ratio.
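Equations (5) and (6) can be sketched in a few lines. The function name `precision_recall` and the document identifiers below are hypothetical and serve only to illustrate the two ratios. They are not data from the experiment.

```python
def precision_recall(retrieved, relevant):
    """Return (PR, RR) as defined in equations (5) and (6).

    SRD is the number of retrieved documents that are also relevant,
    SD the number of retrieved documents, RD the number of relevant ones.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    srd = len(retrieved_set & relevant_set)
    pr = srd / len(retrieved_set) if retrieved_set else 0.0
    rr = srd / len(relevant_set) if relevant_set else 0.0
    return pr, rr


# Hypothetical search: four results returned, three documents truly relevant.
pr, rr = precision_recall(
    retrieved=["photo_1", "photo_2", "conv_1", "email_1"],
    relevant=["photo_1", "conv_1", "email_2"],
)
print(pr, rr)
```

In this made-up case two of the four retrieved documents are relevant, and two of the three relevant documents were found, so precision is 2/4 and recall is 2/3.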

Figure 7. Precision and recall of results

4) Relevance Ratio Evaluation: Relevance ratio is suitable for evaluating the ranking performance determined by the index weights.

$$ relevance = \frac{\sum_{i=1}^{n} R_{score}}{\sum_{i=1}^{n} R_{max}} \qquad (7) $$

where $R_{score}$ is the relevance of a paper as evaluated by the user, within the range 0-3. The values are defined as: 0 = Not relevant, 1 = Normal, 2 = Relevant, 3 = Very relevant. $R_{max}$ is the maximum relevance value, which is 3. n represents the number of documents within the top % of ranked papers [4].

The relevance ratio of the test results is shown in Figure 8. We can see that the ranking performance improves significantly after using the AHP method, especially for the search precision of the photo, conversation and email.

Figure 8. Relevance ratio of results

6. Conclusion and Future Work

In this paper, we introduce the AHP method to obtain objective and scientific index weights. First, we describe the AHP method and the mobile application based on AHP. Second, we describe in detail how the field weights are set with AHP in the application. Finally, we develop a prototype project that sets field weights with the AHP method and evaluate the performance of the application. The results show that the index weights generated by AHP improve the ranking of search results and return satisfactory contents on the mobile device. However, many factors other than the index weights affect the ranking of search results, so there is still a long way to go toward better ranking for search on mobile devices.

7. References

[1] T.L. Saaty. Decision Making with Dependence and Feedback: the Analytic Network Process. Pittsburgh: RWS, 2001.
[2] Li Yongxi, Wang Kaizhuo. Index Weight Technology in Threat Evaluation Based on Improved Grey Theory. Intelligent Information Technology Application Workshops, 2008.
[3] Seon-Mi Woo, Chun-Sik Yoo, Yong-Sung Kim. User-Centered Document Ranking Technique using Term Association Analysis. Journal of Korea Information Science Society, Vol. 28, No. 2, pp. 49-56, 2001.
[4] Hye-Jin Jeong. Index Weighting Based on the Weight of the Index of Each Tag. Future Generation Communication and Networking Symposia, 2008.
[5] Hye-Jin Jeong, Yong-Sung Kim. Index Weight Decision Technique for Search Reliable Documents. Future Generation Communication and Networking, 2008.
[6] Ning Zhou, JiaXi Wu, Shaolong Zhang, Hongqi Chen, XiangRong Zhang. Mining Weighted Association Rules with Lucene Index. Wireless Communications, Networking and Mobile Computing, 2007.
[7] Mei Yang, Meihong Deng, Rongya Jia, Yue Liu. Research on index weight based on improved Grey Relational Analysis. Machine Learning and Cybernetics (ICMLC), 2010.