Using Query Contexts in Information Retrieval

Jing Bai 1, Jian-Yun Nie 1, Hugues Bouchard 2, Guihong Cao 1


1 Department IRO, University of Montreal, CP. 6128, succursale Centre-ville, Montreal, Quebec, H3C 3J7, Canada
{baijing, nie, caogui}@iro.umontreal.ca
2 Yahoo! Inc., Montreal, Quebec, Canada
bouchard@yahoo-inc.com

ABSTRACT
User query is an element that specifies an information need, but it is not the only one. Studies in the literature have found many contextual factors that strongly influence the interpretation of a query. Recent studies have tried to consider the user's interests by creating a user profile. However, a single profile for a user may not be sufficient for the variety of queries of that user. In this study, we propose to use query-specific contexts instead of user-centric ones, including context around query and context within query. The former specifies the environment of a query such as the domain of interest, while the latter refers to context words within the query, which is particularly useful for the selection of relevant term relations. In this paper, both types of context are integrated in an IR model based on language modeling. Our experiments on several TREC collections show that each of the context factors brings significant improvements in retrieval effectiveness.

Categories and Subject Descriptors
H.3.3 [Information storage and retrieval]: Information Search and Retrieval - Retrieval Models

General Terms
Algorithms, Performance, Experimentation, Theory.

Keywords
Query contexts, Domain model, Term relation, Language model.

1. INTRODUCTION
Queries, especially short queries, do not provide a complete specification of the information need. Many relevant terms can be absent from queries, and the terms included may be ambiguous. These issues have been addressed in a large number of previous studies. Typical solutions include expanding either document or query representation [19][35] by exploiting different resources [24][31], using word sense disambiguation [25], etc.
In these studies, however, it has been generally assumed that the query is the only element available about the user's information need. In reality, a query is always formulated in a search context. As has been found in many previous studies [2][14][2][21][26], contextual factors have a strong influence on relevance judgments. These factors include, among many others, the user's domain of interest, knowledge, preferences, etc. All these elements specify the contexts around the query, so we call them context around query in this paper. It has been demonstrated that a user's query should be placed in its context for a correct interpretation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'07, July 23-27, 2007, Amsterdam, The Netherlands. Copyright 2007 ACM ...$5.00.

Recent studies have investigated the integration of some contexts around the query [9][3][23]. Typically, a user profile is constructed to reflect the user's domains of interest and background. A user profile is used to favor the documents that are more closely related to the profile. However, a single profile for a user can group a variety of different domains, which are not always relevant to a particular query. For example, if a user working in computer science issues a query "Java hotel", the documents on "Java language" will be incorrectly favored. A possible solution to this problem is to use query-related profiles or models instead of user-centric ones. In this paper, we propose to model topic domains, among which the related one(s) will be selected for a given query. This method allows us to select a more appropriate query-specific context around the query.
Another strong contextual factor identified in the literature is domain knowledge, or domain-specific term relations, such as "program → computer" in computer science. Using this relation, one would be able to expand the query "program" with the term "computer". However, domain knowledge is available only for a few domains (e.g. "Medicine"). The shortage of domain knowledge has led to the utilization of general knowledge for query expansion [31], which is more available from resources such as thesauri, or can be automatically extracted from documents [24][27]. However, the use of general knowledge gives rise to an enormous problem of knowledge ambiguity [31]: we are often unable to determine if a relation applies to a query. For example, usually little information is available to determine whether "program → computer" is applicable to queries "Java program" and "TV program". Therefore, the relation has been applied to all queries containing "program" in previous studies, leading to a wrong expansion for "TV program".

Looking at the two query examples, however, people can easily determine whether the relation is applicable, by considering the context words "Java" and "TV". So the important question is how we can use these context words in queries to select the appropriate relations to apply. These context words form a context within query. In some previous studies [24][31], context words in a query have been used to select expansion terms suggested by term relations, which are, however, context-independent (such as "program → computer"). Although improvements are observed in some cases, they are limited. We argue that the problem stems from the lack of necessary context information in relations themselves, and a more radical solution lies in the addition of contexts in relations. The method we propose is to add context words into the condition of a relation, such as "{Java, program} → computer", to limit its applicability to the appropriate context.

This paper aims to make contributions on the following aspects:

- Query-specific domain model: We construct more specific domain models instead of a single user model grouping all the domains. The domain related to a specific query is selected (either manually or automatically) for each query.
- Context within query: We integrate context words in term relations so that only appropriate relations can be applied to the query.
- Multiple contextual factors: Finally, we propose a framework based on the language modeling approach to integrate multiple contextual factors.

Our approach has been tested on several TREC collections. The experiments clearly show that both types of context can result in significant improvements in retrieval effectiveness, and their effects are complementary. We will also show that it is possible to determine the query domain automatically, and this results in comparable effectiveness to a manual specification of domain.

This paper is organized as follows. In section 2, we review some related work and introduce the principle of our approach. Section 3 presents our general model. Then sections 4 and 5 describe respectively the domain model and the knowledge model. Section 6 explains the method for parameter training. Experiments are presented in section 7 and conclusions in section 8.

2. CONTEXTS AND UTILIZATION IN IR
There are many contextual factors in IR: the user's domain of interest, knowledge about the subject, preference, document recency, and so on [2][14]. Among them, the user's domain of interest and knowledge are considered to be among the most important ones [2][21]. In this section, we review some of the studies in IR concerning these aspects.

2.1 Domain of interest and context around query
A domain of interest specifies a particular background for the interpretation of a query. It can be used in different ways. Most often, a user profile is created to encompass all the domains of interest of a user [23]. In [5], a user profile contains a set of topic categories of ODP (Open Directory Project) identified by the user.
The documents (Web pages) classified in these categories are used to create a term vector, which represents the whole domains of interest of the user. On the other hand, [9][15][26][3], as well as Google Personalized Search [12], use the documents read by the user, stored on the user's computer or extracted from the user's search history. In all these studies, we observe that a single user profile (usually a statistical model or vector) is created for a user without distinguishing the different topic domains. The systematic application of the user profile can incorrectly bias the results for queries unrelated to the profile. This situation can often occur in practice, as a user can search for a variety of topics outside the domains that he has previously searched in or identified.

A possible solution to this problem is the creation of multiple profiles, one for each separate domain of interest. The domains related to a query are then identified according to the query. This will enable us to use a more appropriate query-specific profile, instead of a user-centric one. This approach is used in [18], in which ODP directories are used. However, only a small-scale experiment has been carried out. A similar approach is used in [8], where domain models are created using ODP categories and user queries are manually mapped to them. However, the experiments showed variable results. It remains unclear whether domain models can be effectively used in IR.

In this study, we also model topic domains. We will carry out experiments on both automatic and manual identification of query domains. Domain models will also be integrated with other factors. In the following discussion, we will call the topic domain of a query a context around query, to contrast with another context within query that we will introduce.

2.2 Knowledge and context within query
Due to the unavailability of domain-specific knowledge, general knowledge resources such as Wordnet and term relations extracted automatically have been used for query expansion [27][31]. In both cases, the relations are defined between two single terms such as "t1 → t2".
If a query contains term t1, then t2 is always considered as a candidate for expansion. As we mentioned earlier, we are faced with the problem of relation ambiguity: some relations apply to a query and some others should not. For example, "program → computer" should not be applied to "TV program" even if the latter contains "program". However, little information is available in the relation to help us determine if an application context is appropriate.

To remedy this problem, approaches have been proposed to make a selection of expansion terms after the application of relations [24][31]. Typically, one defines some sort of global relation between the expansion term and the whole query, which is usually a sum of its relations to every query word. Although some inappropriate expansion terms can be removed because they are only weakly connected to some query terms, many others remain. For example, if the relation "program → computer" is strong enough, "computer" will have a strong global relation to the whole query "TV program" and it still remains as an expansion term.

It is possible to integrate stronger control on the utilization of knowledge. For example, [17] defined strong logical relations to encode knowledge of different domains. If the application of a relation leads to a conflict with the query (or with other pieces of evidence), then it is not applied. However, this approach requires encoding all the logical consequences, including contradictions, in knowledge, which is difficult to implement in practice.

In our earlier study [1], a simpler and more general approach is proposed to solve the problem at its source, i.e. the lack of context information in term relations: by introducing stricter conditions in a relation, for example "{Java, program} → computer" and "{algorithm, program} → computer", the applicability of the relations will be naturally restricted to correct contexts. As a result, "computer" will be used to expand queries "Java program" or "program algorithm", but not "TV program". This principle is similar to that of [33] for word sense disambiguation.
However, we do not explicitly assign a meaning to a word; rather, we try to distinguish word usages in different contexts. From this point of view, our approach is more similar to word sense discrimination [27].

In this paper, we use the same approach and we will integrate it into a more global model with other context factors. As the context words added into relations allow us to exploit the word context within the query, we call such factors context within query. Within-query context exists in many queries. In fact, users often do not use a single ambiguous word such as "Java" as a query (if they are aware of its ambiguity). Some context words are often used together with it. In these cases, contexts within query are created and can be exploited.

2.3 Query profile and other factors
Many attempts have been made in IR to create query-specific profiles. We can consider implicit feedback or blind feedback [7][16][29][32][35] in this family. A short-term feedback model is created for the given query from feedback documents, which has been proven to be effective in capturing some aspects of the user's intent behind the query. In order to create a good query model, such a query-specific feedback model should be integrated.

There are many other contextual factors ([26]) that we do not deal with in this paper. However, it seems clear that many factors are complementary. As found in [32], a feedback model creates a local context related to the query, while the general knowledge or the whole corpus defines a global context. Both types of contexts have been proven useful [32]. A domain model specifies yet another type of useful information: it reflects a set of specific background terms for a domain, for example "pollution", "rain", "greenhouse", etc. for the domain of "Environment". These terms are often presumed when a user issues a query such as "waste cleanup" in the domain. It is useful to add them into the query. We see a clear complementarity among these factors. It is then useful to combine them together in a single IR model.

In this study, we will integrate all the above factors within a unified framework based on language modeling. Each contextual component determines a different ranking score, and the final document ranking combines all of them. This is described in the following section.

3. GENERAL IR MODEL
In the language modeling framework, a typical score function is defined in KL-divergence as follows:

$$Score(Q, D) = \sum_{t \in V} P(t \mid \theta_Q) \log P(t \mid \theta_D) \propto -KL(\theta_Q \,\|\, \theta_D) \tag{1}$$

where $\theta_D$ is a (unigram) language model created for a document D, $\theta_Q$ a language model for the query Q, and V the vocabulary.
Smoothing on the document model is recognized to be crucial [35], and one of the common smoothing methods is the Jelinek-Mercer interpolation smoothing:

$$P(t \mid \theta'_D) = (1 - \lambda) P(t \mid \theta_D) + \lambda P(t \mid \theta_C) \tag{2}$$

where $\lambda$ is an interpolation parameter and $\theta_C$ the collection model.

In the basic language modeling approaches, the query model is estimated by Maximum Likelihood Estimation (MLE) without any smoothing. In such a setting, the basic retrieval operation is still limited to keyword matching, according to a few words in the query. To improve retrieval effectiveness, it is important to create a more complete query model that better represents the information need. In particular, all the related and presumed words should be included in the query model. Several methods have been proposed to build a more complete query model, using feedback documents [16][35] or using term relations [1][1][34]. In these cases, we construct two models for the query: the initial query model containing only the original terms, and a new model containing the added terms. They are then combined through interpolation.

In this paper, we generalize this approach and integrate more models for the query. Let us use $\theta_Q^0$ to denote the original query model, $\theta_Q^F$ for the feedback model created from feedback documents, $\theta_Q^{Dom}$ for a domain model and $\theta_Q^K$ for a knowledge model created by applying term relations. $\theta_Q^0$ can be created by MLE. $\theta_Q^F$ has been used in several previous studies [16][35]. In this paper, $\theta_Q^F$ is extracted using the 2 blind feedback documents. We will describe the details to construct $\theta_Q^{Dom}$ and $\theta_Q^K$ in Sections 4 and 5.

Given these models, we create the following final query model by interpolation:

$$P(t \mid \theta_Q) = \sum_{i \in X} \alpha_i P(t \mid \theta_Q^i) \tag{3}$$

where X = {0, Dom, K, F} is the set of all component models and $\alpha_i$ (with $\sum_{i \in X} \alpha_i = 1$) are their mixture weights. Then the document score in Equation (1) is extended as follows:

$$Score(Q, D) = \sum_{t \in V} \sum_{i \in X} \alpha_i P(t \mid \theta_Q^i) \log P(t \mid \theta_D) = \sum_{i \in X} \alpha_i \, Score_i(Q, D) \tag{4}$$

where $Score_i(Q, D) = \sum_{t \in V} P(t \mid \theta_Q^i) \log P(t \mid \theta_D)$ is the score according to each component model.
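Equations (2) to (4) can be illustrated with a minimal sketch. The toy documents, the component models, the weights and all function names below are assumptions made for illustration only; they are not the authors' implementation or data.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def smoothed_doc_prob(term, doc_model, coll_model, lam):
    """Jelinek-Mercer smoothing, Equation (2)."""
    return (1 - lam) * doc_model.get(term, 0.0) + lam * coll_model.get(term, 0.0)


def component_score(query_model, doc_model, coll_model, lam):
    """Score_i(Q, D) = sum_t P(t | theta_Q^i) * log P(t | theta_D)."""
    score = 0.0
    for term, p_q in query_model.items():
        p_d = smoothed_doc_prob(term, doc_model, coll_model, lam)
        if p_q > 0 and p_d > 0:  # terms unseen in the whole collection are skipped
            score += p_q * math.log(p_d)
    return score


def interpolate(components, weights):
    """Equation (3): final query model as a weighted mixture of components."""
    mixed = {}
    for name, model in components.items():
        for term, p in model.items():
            mixed[term] = mixed.get(term, 0.0) + weights[name] * p
    return mixed


def mixture_score(components, weights, doc_model, coll_model, lam):
    """Equation (4): weighted sum of the component scores."""
    return sum(
        weights[name] * component_score(model, doc_model, coll_model, lam)
        for name, model in components.items()
    )


# Toy data (hypothetical): two documents and a two-component query model.
docs = {
    "d1": "java program computer code compiler".split(),
    "d2": "tv program channel show evening".split(),
}
coll_model = mle([t for toks in docs.values() for t in toks])
components = {
    "0": mle("java program".split()),  # original query model
    "K": {"computer": 1.0},            # toy knowledge model
}
weights = {"0": 0.7, "K": 0.3}         # illustrative mixture weights
scores = {
    name: mixture_score(components, weights, mle(toks), coll_model, lam=0.5)
    for name, toks in docs.items()
}
```

Because the score is linear in the query model, scoring with the interpolated model of Equation (3) gives the same value as the weighted sum of component scores in Equation (4).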
Here we can see that our strategy of enhancing the query model by contextual factors is equivalent to document re-ranking, which is used in [5][15][3].

The remaining problem is to construct domain models and the knowledge model, and to combine all the models (parameter setting). We describe this in the following sections.

4. CONSTRUCTING AND USING DOMAIN MODELS
As in previous studies, we exploit a set of documents already classified in each domain. These documents can be identified in two different ways:

1) One can take advantage of an existing domain hierarchy and the documents manually classified in it, such as ODP. In that case, a new query should be classified into the same domains either manually or automatically.
2) A user can define his own domains. By assigning a domain to his queries, the system can gather a set of answers to the queries automatically, which are then considered to be in-domain documents. The answers could be those that the user has read, browsed through, or judged relevant to an in-domain query, or they can be simply the top-ranked retrieval results.

An earlier study [4] has compared the above two strategies using TREC queries 51-150, for which a domain has been manually assigned. These domains have been mapped to ODP categories. It is found that both approaches mentioned above are equally effective and result in comparable performance. Therefore, in this study, we only use the second approach. This choice is also motivated by the possibility to compare between manual and automatic assignment of domain to a new query. This will be explained in detail in our experiments.

Whatever the strategy, we will obtain a set of documents for each domain, from which a language model can be extracted. If maximum likelihood estimation is used directly on these documents, the resulting domain model will contain both domain-specific terms and general terms, and the former do not emerge. Therefore, we employ an EM process to extract the specific part of the domain as follows: we assume that the documents in a domain are generated by a domain-specific model (to be extracted) and a general language model (collection model). Then the likelihood of a document in the domain can be formulated as follows:

$$P(D \mid \theta'_{Dom}) = \prod_{t \in D} \left[ (1 - \eta) P(t \mid \theta_{Dom}) + \eta P(t \mid \theta_C) \right]^{c(t; D)} \tag{5}$$

where c(t; D) is the count of t in document D and $\eta$ is a smoothing parameter (which will be fixed at .5 as in [35]). The EM algorithm is used to extract the domain model $\theta_{Dom}$ that maximizes $P(Dom \mid \theta'_{Dom})$ (where Dom is the set of documents in the domain), that is:

$$\theta_{Dom} = \arg\max_{\theta_{Dom}} P(Dom \mid \theta'_{Dom}) = \arg\max_{\theta_{Dom}} \prod_{D \in Dom} \prod_{t \in D} \left[ (1 - \eta) P(t \mid \theta_{Dom}) + \eta P(t \mid \theta_C) \right]^{c(t; D)} \tag{6}$$

This is the same process as the one used to extract the feedback model in [35]. It is able to extract the most specific words of the domain from the documents while filtering out the common words of the language. This can be observed in the following table, which shows some words in the domain model of "Environment" before and after EM iterations (5 iterations).

[Table 1. Term probabilities before/after EM. Columns: Term, Initial, Final, Change. Domain terms listed: air, environment, rain, pollution, storm, flood, tornado, greenhouse. Common terms listed, with the changes that remain legible: year, system (-99%), program, million (-99%), make (-95%), company (-99%), president (-99%), month (-95%). The probability values themselves are not recoverable.]

Given a set of domain models, the related ones have to be assigned to a new query. This can be done manually by the user or automatically by the system using query classification. We will compare both approaches.

Query classification has been investigated in several studies [18][28]. In this study, we use a simple classification method: the selected domain is the one with which the query's KL-divergence score is the lowest, i.e.:

$$Dom_Q = \arg\min_{Dom} \sum_{t \in Q} P(t \mid \theta_Q^0) \log \frac{P(t \mid \theta_Q^0)}{P(t \mid \theta_{Dom})} \tag{7}$$

This classification method is an extension to Naïve Bayes as shown in [22].
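The EM extraction of Equations (5) and (6) and the classification rule of Equation (7) can be sketched as follows. This is a minimal illustration under stated assumptions: the update rules are the standard two-component mixture EM with a fixed mixing weight, the iteration count is an arbitrary choice, the `floor` used for terms absent from a domain model is an assumption (Equation (7) as written does not say how zero probabilities are handled), and the toy documents are invented.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def extract_domain_model(domain_docs, coll_model, eta=0.5, iterations=20):
    """EM for the domain-specific part of a domain/collection mixture.

    Assumes every term of the domain documents has a non-zero
    probability in coll_model (the domain is part of the collection).
    """
    counts = Counter(t for doc in domain_docs for t in doc)
    total = sum(counts.values())
    theta = {t: c / total for t, c in counts.items()}  # start from MLE
    for _ in range(iterations):
        # E-step: probability that an occurrence of t came from the domain model
        resp = {}
        for t, p in theta.items():
            specific = (1 - eta) * p
            resp[t] = specific / (specific + eta * coll_model[t])
        # M-step: re-estimate the domain model from the expected counts
        norm = sum(counts[t] * resp[t] for t in counts)
        theta = {t: counts[t] * resp[t] / norm for t in counts}
    return theta


def classify_query(query_tokens, domain_models, floor=1e-9):
    """Pick the domain with the lowest KL divergence from the query model."""
    q_model = mle(query_tokens)

    def kl(dom_model):
        return sum(
            p * math.log(p / max(dom_model.get(t, 0.0), floor))
            for t, p in q_model.items()
        )

    return min(domain_models, key=lambda name: kl(domain_models[name]))


# Toy data (hypothetical): two tiny domains sharing some general terms.
env_docs = [["pollution", "rain", "year", "system"],
            ["rain", "flood", "year", "pollution"]]
fin_docs = [["bank", "stock", "year", "system"],
            ["stock", "market", "year", "bank"]]
coll_model = mle([t for doc in env_docs + fin_docs for t in doc])
domain_models = {
    "Environment": extract_domain_model(env_docs, coll_model),
    "Finance": extract_domain_model(fin_docs, coll_model),
}
env_mle = mle([t for doc in env_docs for t in doc])
```

In this toy setting, the general term "year" loses probability mass relative to its MLE value while "pollution" gains, which mirrors the filtering effect the table above is meant to show.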
The score depending on the domain model is then as follows:

$$Score_{Dom}(Q, D) = \sum_{t \in V} P(t \mid \theta_{Dom}) \log P(t \mid \theta_D) \tag{8}$$

Although the above equation requires using all the terms in the vocabulary, in practice, only the strongest terms in the domain model are useful and the terms with low probabilities are often noise. Therefore, we only retain the top 1 strongest terms. The same strategy is used for the Knowledge model.

Although domain models are more refined than a single user profile, the topics in a single domain can still be very different, making the domain model too large. This is particularly true for large domains such as "Science and technology" defined in TREC queries. Using such a large domain model as the background can introduce many noise terms. Therefore, we further construct a sub-domain model more related to the given query, by using a subset of in-domain documents that are related to the query. These documents are the top-ranked documents retrieved with the original query within the domain. This approach is indeed a combination of domain and feedback models. In our experiments, we will see that this further specification of sub-domain is necessary in some cases, but not in all, especially when the Feedback model is also used.

5. EXTRACTING CONTEXT-DEPENDENT TERM RELATIONS FROM DOCUMENTS
In this paper, we extract term relations from the document collection automatically. In general, a term relation can be represented as A → B. Both A and B have been restricted to single terms in previous studies. A single term in A means that the relation is applicable to all the queries containing that term. As we explained earlier, this is the source of many wrong applications. The solution we propose is to add more context terms into A, so that it is applicable only when all the terms in A appear in a query. For example, instead of creating a context-independent relation "Java → program", we will create "{Java, computer} → program", which means that "program" is selected when both "Java" and "computer" appear in a query. The term added in the condition specifies a stricter context to apply the relation.
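The applicability rule just described (a relation fires only when every term of its condition A appears in the query) can be sketched in a few lines. The representation of a relation as a `(condition, target)` pair and the function name are assumptions for illustration; the example relations are the ones used in the text.

```python
def applicable_expansions(query_terms, relations):
    """Return targets of relations whose whole condition set occurs in the query."""
    query = set(query_terms)
    return {target for condition, target in relations if condition <= query}


# Context-dependent relations: two-term conditions.
context_dependent = [
    (frozenset({"java", "program"}), "computer"),
    (frozenset({"algorithm", "program"}), "computer"),
]
# Context-independent counterpart: a single-term condition.
context_independent = [
    (frozenset({"program"}), "computer"),
]

java_expansion = applicable_expansions(["java", "program"], context_dependent)
tv_expansion = applicable_expansions(["tv", "program"], context_dependent)
tv_expansion_naive = applicable_expansions(["tv", "program"], context_independent)
```

With the two-term condition, "TV program" receives no expansion, whereas the single-term relation would still add "computer" to it.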
We call this type of relation a context-dependent relation. In principle, the addition is not restricted to one term. However, we will make this restriction for the following reasons:

- User queries are usually very short. Adding more terms into the condition will create many rarely applicable relations;
- In most cases, an ambiguous word such as "Java" can be effectively disambiguated by one useful context word such as "computer" or "hotel";
- The addition of more terms will also lead to a higher space and time complexity for extracting and storing term relations.

The extraction of relations of type "{t_j, t_k} → t_i" can be performed using mining algorithms for association rules [13]. Here, we use a simple co-occurrence analysis. Windows of fixed size (1 words in our case) are used to obtain co-occurrence counts of three terms, and the probability $P(t_i \mid t_j t_k)$ is determined as follows:

$$P(t_i \mid t_j t_k) = \frac{c(t_i, t_j, t_k)}{\sum_{t_l} c(t_l, t_j, t_k)} \tag{9}$$

where $c(t_i, t_j, t_k)$ is the count of co-occurrences.

In order to reduce the space requirement, we further apply the following filtering criteria:

- The two terms in the condition should appear at least a certain number of times together in the collection (1 in our case) and they should be related. We use the following pointwise mutual information as a measure of relatedness (MI > 0) [6]:

$$MI(t_j, t_k) = \log \frac{P(t_j, t_k)}{P(t_j) P(t_k)}$$

- The probability of a relation should be higher than a threshold (.1 in our case).

Having a set of relations, the corresponding Knowledge model is defined as follows:

$$P(t_i \mid \theta_Q^K) = \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k) P(t_j t_k \mid \theta_Q^0) = \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k) P(t_j \mid \theta_Q^0) P(t_k \mid \theta_Q^0) \tag{10}$$

where $(t_j t_k)$ means any combination of two terms in the query. This is a direct extension of the translation model proposed in [3] to our context-dependent relations. The score according to the Knowledge model is then defined as follows:

$$Score_K(Q, D) = \sum_{t_i \in V} \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k) P(t_j \mid \theta_Q^0) P(t_k \mid \theta_Q^0) \log P(t_i \mid \theta_D) \tag{11}$$

Again, only the top 1 expansion terms are used.

6. MODEL PARAMETERS
There are several parameters in our model: $\lambda$ in Equation (2) and $\alpha_i$ ($i \in$ {0, Dom, K, F}) in Equation (3). As the parameter $\lambda$ only affects the document model, we will set it to the same value in all our experiments. The value $\lambda$ = .5 is determined to maximize the effectiveness of the baseline models (see Section 7.2) on the training data: TREC queries 1-50 and documents on Disk 2.

The mixture weights $\alpha_i$ of component models are trained on the same training data using the following method of line search [11] to maximize the Mean Average Precision (MAP): each parameter is considered as a search direction. We start by searching in one direction, testing all the values in that direction, while keeping the values in other directions unchanged. Each direction is searched in turn, until no improvement in MAP is observed.

In order to avoid being trapped at a local maximum, we started from 1 random points and the best setting is selected.

7. EXPERIMENTS
7.1 Setting
The main test data are those from TREC 1-3 ad hoc and filtering tracks, including queries 1-150, and documents on Disks 1-3. The choice of this test collection is due to the availability of a manually specified domain for each query. This allows us to compare with an approach using automatic domain identification. Below is an example of topic:

<num> Number: 13
<dom> Domain: Law and Government
<title> Topic: Welfare Reform

We only use topic titles in all our tests. Queries 1-50 are used for training and 51-150 for testing.
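The direction-by-direction line search of Section 6 can be sketched as follows. This is only an illustration under several assumptions: the objective is a stand-in function (in the paper it is MAP on the training queries, which requires a full retrieval run), the weights are renormalized to sum to 1 after each change (the text does not say how the constraint is maintained), and the grid, the number of restarts, the sweep limit and the `target` optimum are invented for the example.

```python
import random


def normalize(weights):
    """Rescale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


def line_search(objective, names, grid, restarts=10, max_sweeps=100, seed=0):
    """Coordinate-wise search with random restarts; returns (weights, value)."""
    rng = random.Random(seed)
    best_weights, best_value = None, float("-inf")
    for _ in range(restarts):
        weights = normalize({n: rng.random() + 1e-6 for n in names})
        value = objective(weights)
        for _ in range(max_sweeps):
            improved = False
            for name in names:  # one search direction per parameter
                # Try every grid value for this weight, others kept in proportion.
                candidates = [normalize({**weights, name: g}) for g in grid]
                best_cand = max(candidates, key=objective)
                cand_value = objective(best_cand)
                if cand_value > value + 1e-12:
                    weights, value, improved = best_cand, cand_value, True
            if not improved:  # a full sweep brought no gain: stop
                break
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


# Stand-in objective (hypothetical): peaks at an arbitrary target setting.
names = ["0", "Dom", "K", "F"]
target = {"0": 0.15, "Dom": 0.15, "K": 0.15, "F": 0.55}


def toy_objective(weights):
    return -sum((weights[n] - target[n]) ** 2 for n in names)


grid = [i / 100 for i in range(1, 101)]
best_weights, best_value = line_search(toy_objective, names, grid)
```

In a real setting, `toy_objective` would be replaced by a function that runs retrieval with the given mixture weights on the training queries and returns MAP.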
13 domains are defined in these queries and their distributions among the two sets of queries are shown in Fig. 1. We can see that the distribution varies strongly between domains and between the two query sets.

We have also tested on TREC 7 and 8 data. For this series of tests, each collection is used in turn as training data while the other is used for testing. Some statistics of the data are described in Tab. 2.

[Figure 1. Distribution of domains, for Query 1-50 and Query 51-150. Domains: Environment, Finance, Int. Economics, Int. Finance, Int. Politics, Int. Relations, Law&Gov., Medical&Bio., Military, Politics, Sci.&Tech., US Economics, US Politics.]

[Table 2. TREC collection statistics. Columns: Collection, Document, Size (GB), Voc., # of Doc., Query. Rows include Training (Disk 2) and Disks 1-3; the numeric entries are not recoverable.]

All the documents are preprocessed using the Porter stemmer in Lemur and the standard stoplist is used. Some queries (4, 5 and 3 in the three query sets) only contain one word. For these queries, the knowledge model is not applicable.

On domain models, we examine several questions:

- When the query domain is specified manually, is it useful to incorporate the domain model?
- If the query domain is not specified, can it be determined automatically? How effective is this method?
- We described two ways to gather documents for a domain: either using documents judged relevant to queries in the domain or using documents retrieved for these queries. How do they compare?

On the Knowledge model, in addition to testing its effectiveness, we also want to compare the context-dependent relations with context-independent ones. Finally, we will see the impact of each component model when all the factors are combined.

7.2 Baseline Methods
Two baseline models are used: the classical unigram model without any expansion, and the model with Feedback. In all the experiments, document models are created using Jelinek-Mercer smoothing. This choice is made according to the observation in [36] that the method performs very well for long queries. In our case, as queries are expanded, they perform similarly to long queries.
In our preliminary tests, we also found that this method performed better than the other methods (e.g. Dirichlet), especially for the main baseline method with the Feedback model. Table 3 shows the retrieval effectiveness on all the collections.

Table 3. Baseline models ("..." marks values that are not recoverable; the Recall and P@ rows are not recoverable)
                     Unigram Model
                     Without FB     With FB
  Disks 1-3          ...            ... (+49.3%)
  (2nd collection)   ...            ... (+31.4%)
  (3rd collection)   ...            ... (+21.87%)

7.3 Knowledge Models
This model is combined with both baseline models (with or without feedback). We also compare the context-dependent knowledge model with the traditional context-independent term relations (defined between two single terms), which are used to expand queries. The latter selects expansion terms with the strongest global relation to the query. This relation is measured by the sum of relations to each of the query terms. This method is equivalent to [24]. It is also similar to the translation model [3]. We call it the Co-occurrence model in Table 4. A t-test is also performed for statistical significance.

Table 4. Knowledge models ("..." marks values that are not recoverable; the Recall and P@ rows are not recoverable)
                     Co-occurrence                        Knowledge model
                     Without FB       With FB             Without FB           With FB
  Disks 1-3          .1884 (+2.%)     ... (+3.75%)**      .2164 (+37.83%)      ... (+5.8%)**
  (2nd collection)   ... (+1.8%)      ... (+8.%)*         .2157 (+3.25%)       ... (+1.34%)**
  (3rd collection)   ... (+5.53%)     .2926 (+.58%)       .2724 (+14.12%)++    .37 (+3.37%)

(The column Without FB is compared to the baseline model without feedback, while With FB is compared to the baseline with feedback. ++ and + mean significant changes in t-test with respect to the baseline without feedback, at the level of p<.1 and p<.5, respectively. ** and * are similar but compared to the baseline model with feedback.)

As we can see, simple co-occurrence relations can produce relatively strong improvements; but context-dependent relations can produce much stronger improvements in all cases, especially when feedback is not used. All the improvements over the co-occurrence model are statistically significant (this is not shown in the table). The large differences between the two types of relation clearly show that context-dependent relations are more appropriate for query expansion. This confirms the hypothesis we made, that by incorporating context information into relations, we can better determine the appropriate relations to apply and thus avoid introducing inappropriate expansion terms.
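To make concrete how the context-dependent expansion evaluated above is obtained, Equations (9) and (10) can be sketched as follows. This is a minimal illustration, not the authors' code: the sliding-window counting scheme, the window size and the toy documents are assumptions, and the filtering criteria of Section 5 (minimum pair count, MI > 0, probability threshold, top-ranked terms only) are left out for brevity.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mle(tokens):
    """Maximum likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def count_triples(docs, window):
    """c(t_i, t_j, t_k): windows in which t_i co-occurs with the pair {t_j, t_k}."""
    counts = defaultdict(Counter)  # frozenset({t_j, t_k}) -> Counter over t_i
    for tokens in docs:
        for start in range(max(1, len(tokens) - window + 1)):
            terms = sorted(set(tokens[start:start + window]))
            for pair in combinations(terms, 2):
                for term in terms:
                    if term not in pair:
                        counts[frozenset(pair)][term] += 1
    return counts


def relation_prob(counts, t_i, t_j, t_k):
    """Equation (9): P(t_i | t_j t_k)."""
    pair_counts = counts.get(frozenset((t_j, t_k)))
    if not pair_counts:
        return 0.0
    return pair_counts[t_i] / sum(pair_counts.values())


def knowledge_model(query_tokens, counts):
    """Equation (10): expansion weights from all term pairs of the query."""
    q0 = mle(query_tokens)
    model = Counter()
    for t_j, t_k in combinations(sorted(q0), 2):
        pair_counts = counts.get(frozenset((t_j, t_k)))
        if not pair_counts:
            continue
        total = sum(pair_counts.values())
        for t_i, n in pair_counts.items():
            model[t_i] += (n / total) * q0[t_j] * q0[t_k]
    return dict(model)


# Toy collection (hypothetical).
docs = [
    "java program computer code".split(),
    "java program computer compiler".split(),
    "tv program channel show".split(),
]
counts = count_triples(docs, window=4)
java_model = knowledge_model(["java", "program"], counts)
tv_model = knowledge_model(["tv", "program"], counts)
```

On this toy collection, "computer" is suggested for "java program" but not for "tv program", because the relation is conditioned on the pair of query terms rather than on "program" alone.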
The following example can further confirm this observation, where we show the strongest expansion terms suggested by both types of relation for the query #384 "space station moon" (terms are stemmed; weights are given where they remain legible):

Co-occurrence Relations: year, power, time, develop .8932, offic .8485, oper, earth .7843, work .781, radio .771, system .7627, build, includ .7377, state .776, program .762, nation .6937, open .6889, servic .689, air .6734, space .6685, nuclear .6521, full .6425, make .641, compani .6262, peopl .6244, project .6147, unit .6114, gener .636, dai .629

Context-Dependent Relations: space, mar, earth, man .3777, program .3377, project .2691, base, orbit .2519, build .2542, mission, call, explor .2161, launch, develop, shuttl, plan, flight, station .1645, intern .162, energi, oper, power, transport, construct .1216, nasa, nation, perman, japan, apollo .1997, lunar .1898

In comparison with the baseline model with feedback (Tab. 3), we see that the improvements made by the Knowledge model alone are slightly lower. However, when both models are combined, there are additional improvements over the Feedback model, and these improvements are statistically significant in 2 cases out of 3. This demonstrates that the impacts produced by feedback and term relations are different and complementary.

7.4 Domain Models
In this section, we test several strategies to create and use domain models, by exploiting the domain information of the query set in various ways.

Strategies for creating domain models:
C1 - With the relevant documents for the in-domain queries: this strategy simulates the case where we have an existing directory in which documents relevant to the domain are included.
C2 - With the top-1 documents retrieved with the in-domain queries: this strategy simulates the case where the user specifies a domain for his queries without judging document relevance, and the system gathers related documents from his search history.

Strategies for using domain models:
U1 - The domain model is determined by the user manually.
U2 - The domain model is determined by the system.

7.4.1 Creating Domain models
We test strategies C1 and C2.
In this series of tests, each of the queries is used in turn as the test query while the other queries and their relevant documents (C1) or top-ranked retrieved documents (C2) are used to create domain models. The same method is used on queries 1-50 to tune the parameters.

Table 5. Domain models with relevant documents (C1) ("..." marks values that are not recoverable; the Recall and P@ rows are not recoverable)
                     Domain                               Sub-Domain
                     Without FB       With FB             Without FB           With FB
  Disks 1-3 (U1)     .17 (+8.28%)     ... (+4.69%)**      .1918 (+22.17%)      ... (+4.99%)**
  (2nd collection)   ... (+3.56%)     ... (+9.79%)*       .1842 (+11.23%)      ... (+1.66%)**
  (3rd collection)   ... (+2. 3%)     .2957 (+1.65%)      .2563 (+7.37%)       .2967 (+1.99%)

Table 6. Domain models with top-1 documents (C2) ("..." marks values that are not recoverable; the Recall and P@ rows are not recoverable)
                     Domain                               Sub-Domain
                     Without FB       With FB             Without FB           With FB
  Disks 1-3 (U1)     .1718 (+9.43%)   ... (+4.78%)**      .1799 (+14.59%)      ... (+4.61%)**
  (2nd collection)   ... (+6.58%)     ... (+1.6%)**       .1785 (+7.79%)       ... (+9.97%)**
  (3rd collection)   ... (+1.97%)     .2949 (+1.38%)      .2441 (+2.26%)       .2961 (+1.79%)

We also compare the domain models created with all the in-domain documents (Domain) and with only the top-1 retrieved documents in the domain with the query (Sub-Domain). In these tests, we use manual identification of query domain for Disks 1-3 (U1), but automatic identification for TREC7 and 8.

First, it is interesting to notice that the incorporation of domain models can generally improve retrieval effectiveness in all the cases. The improvements on Disks 1-3 and TREC7 are statistically significant. However, the improvement scales are smaller than those obtained with the Feedback and Relation models. Looking at the distribution of the domains (Fig. 1), this observation is not surprising: for many domains, we only have few training queries, thus few in-domain documents to create domain models. In addition, topics in the same domain can vary greatly, in particular in large domains such as "science and technology", "international politics", etc.

Second, we observe that the two methods to create domain models perform equally well (Tab. 6 vs. Tab. 5). In other words, providing relevance judgments for queries does not add much advantage for the purpose of creating domain models. This may seem surprising. An analysis immediately shows the reason: a domain model (in the way we created it) only captures term distribution in the domain. Relevant documents for all in-domain queries vary greatly. Therefore, in some large domains, characteristic terms have variable effects on queries. On the other hand, as we only use term distribution, even if the top documents retrieved for the in-domain queries are irrelevant, they can still contain domain characteristic terms similarly to relevant documents. Thus both strategies produce very similar effects. This result opens the door for a simpler method that does not require relevance judgments, for example using search history.

Third, without the Feedback model, the sub-domain models constructed with relevant documents perform much better than the whole domain models (Tab. 5). However, once the Feedback model is used, the advantage disappears.
On one hand, this confirms our earlier hypothesis that a domain may be too large to be able to suggest relevant terms for new queries in the domain. It indirectly validates our first hypothesis that a single user model or profile may be too large, so smaller domain models are preferred. On the other hand, sub-domain models capture similar characteristics to Feedback model. So when the latter is used, sub-domain models become superfluous. However, if domain models are constructed with top-ranked documents (Tab. 6), sub-domain models make much less difference. This can be explained by the fact that the domains constructed with top-ranked documents tend to be more uniform than relevant documents with respect to term distribution, as the top retrieved documents usually have stronger statistical correspondence with the queries than the relevant documents.

Determining Query Domain Automatically

It is not realistic to always ask users to specify a domain for their queries. Here, we examine the possibility of identifying query domains automatically. Table 7 shows the results with this strategy using both strategies for domain model construction. We can observe that the effectiveness is only slightly lower than that produced with manual identification of query domain (Tab. 5 & 6, Domain models). This shows that automatic domain identification is a way to select a domain model that is about as effective as manual identification. This also demonstrates the feasibility of using domain models for queries when no domain information is provided.

Table 7. Automatic query domain identification
Dom. with rel. doc. (C1) Dom. with top-1 doc. (C2)
Without FB With FB Without FB With FB
Disks 1-3 (+5.1%) (+4.27%)** .167 (+6.37%) (+4.48%)** Recall P@

Table 8. Complete models (C1)
Man. dom. id. (U1) All Doc. Domain Auto. dom. id.
.251 (+6.7%) ** .2489 (+6.19%) ** Recall / P@
(+13.14%) ** Recall /4 674 N/A 3 14 P@
(+4.13%) ** Recall /4 728 N/A P@1 .52

Looking at the accuracy of the automatic domain identification, however, it is surprisingly low: for queries 51-150, only 38% of the determined domains correspond to the manual identifications.
This is much lower than the above 80% rates reported in [18]. A detailed analysis reveals that the main reason is the closeness of several domains in TREC queries (e.g. International relations, International politics, Politics). However, in this situation, wrong domains assigned to queries are not always irrelevant and useless. For example, even when a query in International relations is classified in International politics, the latter domain can still suggest useful terms to the query. Therefore, the relatively low classification accuracy does not mean low usefulness of the domain models.

7.5 Complete Models

The results with the complete model are shown in Table 8. This model integrates all the components described in this paper: Original query model, Feedback model, Domain model and Knowledge model. We have tested both strategies to create domain models, but the differences between them are very small. So we only report the results with the relevant documents.

Our first observation is that the complete models produce the best results. All the improvements over the baseline model (with feedback) are statistically significant. This result confirms that the integration of contextual factors is effective. Compared to the other results, we see consistent, although in some cases small, improvements over all the partial models.

Looking at the mixture weights, which may reflect the importance of each model, we observed that the best settings in all the collections vary in the following ranges: $0.1 \le \alpha_0 \le 0.2$, $0.1 \le \alpha_{Dom} \le 0.2$, $0.1 \le \alpha_K \le 0.2$ and $0.5 \le \alpha_F \le 0.6$. We see that the most important factor is Feedback model. This is also the single factor which produced the highest improvements over the original query model. This observation seems to indicate that this model has the highest capability to capture the information need behind the query. However, even with lower weights, the other models do have strong impacts on the final effectiveness. This demonstrates the benefit of integrating more contextual factors in IR.
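To make the role of the mixture weights concrete, the following sketch interpolates four component query models linearly. It is a toy illustration, not the paper's implementation: the component distributions are invented, the weights are merely picked from inside the reported ranges, the requirement that they sum to one is an assumption, and the function name `mix_query_models` is hypothetical.

```python
def mix_query_models(components, weights, tol=1e-9):
    """Linear interpolation of term distributions: P(t) = sum_i w_i * P_i(t).

    components: dict name -> {term: probability}
    weights:    dict name -> mixture weight (assumed here to sum to 1)
    """
    if set(components) != set(weights):
        raise ValueError("components and weights must use the same names")
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("mixture weights must sum to 1")
    mixed = {}
    for name, model in components.items():
        for term, prob in model.items():
            mixed[term] = mixed.get(term, 0.0) + weights[name] * prob
    return mixed


# Invented toy distributions for the four component models.
toy_components = {
    "original": {"java": 1.0},
    "feedback": {"java": 0.5, "program": 0.5},
    "domain": {"computer": 1.0},
    "knowledge": {"program": 0.5, "language": 0.5},
}
# Weights chosen inside the ranges reported above; the exact values are arbitrary.
toy_weights = {"original": 0.2, "feedback": 0.6, "domain": 0.1, "knowledge": 0.1}

toy_query_model = mix_query_models(toy_components, toy_weights)
```

With these toy numbers the feedback component contributes most of the probability mass, yet terms that appear only in the domain or knowledge components still enter the final query model with non-zero probability.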

8. CONCLUSIONS

Traditional IR approaches usually consider the query as the only element available for the user information need. Many previous studies have investigated the integration of some contextual factors in IR models, typically by incorporating a user profile. In this paper, we argue that a single user profile (or model) can contain too large a variety of different topics, so that new queries can be incorrectly biased. Similarly to some previous studies, we propose to model topic domains instead of the user.

Previous investigations on context focused on factors around the query. We showed in this paper that factors within the query are also important: they help select the appropriate term relations to apply in query expansion.

We have integrated the above contextual factors, together with feedback model, in a single language model. Our experimental results strongly confirm the benefit of using contexts in IR. This work also shows that the language modeling framework is appropriate for integrating many contextual factors.

This work can be further improved on several aspects, including other methods to extract term relations, to integrate more context words in conditions and to identify query domains. It would also be interesting to test the method on Web search using user search history. We will investigate these problems in our future research.

9. REFERENCES

[1] Bai, J., Nie, J.Y., Cao, G., Context-dependent term relations for information retrieval, EMNLP'06, 2006.
[2] Belkin, N.J., Interaction with texts: Information retrieval as information seeking behavior, Information Retrieval '93: Von der Modellierung zur Anwendung, Konstanz: Krause & Womser-Hacker.
[3] Berger, A., Lafferty, J., Information retrieval as statistical translation, SIGIR'99.
[4] Bouchard, H., Nie, J.Y., Modèles de langue appliqués à la recherche d'information contextuelle, Conf. en Recherche d'Information et Applications (CORIA), Lyon, 2006.
[5] Chirita, P.A., Paiu, R., Nejdl, W., Kohlschütter, C., Using ODP metadata to personalize search, SIGIR, 2005.
[6] Church, K.W., Hanks, P., Word association norms, mutual information, and lexicography, ACL.
[7] Croft, W.B., Cronen-Townsend, S., Lavrenko, V., Relevance feedback and personalization: A language modeling perspective, In: The DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries, 2006.
[8] Croft, W.B., Wei, X., Context-based topic models for query modification, CIIR Technical Report, University of Massachusetts, 2005.
[9] Dumais, S., Cutrell, E., Cadiz, J., Jancke, G., Sarin, R., Robbins, D.C., Stuff I've seen: a system for personal information retrieval and re-use, SIGIR'03, 2003.
[10] Fang, H., Zhai, C., Semantic term matching in axiomatic approaches to information retrieval, SIGIR'06, 2006.
[11] Gao, J., Qi, H., Xia, X., Nie, J.-Y., Linear discriminative model for information retrieval, SIGIR'05, 2005.
[12] Google Personalized Search.
[13] Hipp, J., Guntzer, U., Nakhaeizadeh, G., Algorithms for association rule mining - a general survey and comparison, SIGKDD Explorations, 2(1), 2000.
[14] Ingwersen, P., Järvelin, K., Information retrieval in context: IRiX, SIGIR Forum, 39, 2004.
[15] Kim, H.-R., Chan, P.K., Personalized ranking of search results with learned user interest hierarchies from bookmarks, WEBKDD'05 Workshop at ACM-KDD, 2005.
[16] Lavrenko, V., Croft, W.B., Relevance-based language models, SIGIR'01, 2001.
[17] Lau, R., Bruza, P., Song, D., Belief revision for adaptive information retrieval, SIGIR'04, 2004.
[18] Liu, F., Yu, C., Meng, W., Personalized web search by mapping user queries to categories, CIKM'02.
[19] Liu, X., Croft, W.B., Cluster-based retrieval using language models, SIGIR'04, 2004.
[20] Morris, R.C., Toward a user-centered information service, JASIS, 45: pp. 2-3.
[21] Park, T.K., Toward a theory of user-based relevance: A call for a new paradigm of inquiry, JASIS, 45.
[22] Peng, F., Schuurmans, D., Wang, S., Augmenting Naive Bayes Classifiers with Statistical Language Models, Inf. Retr., 7(3-4), 2004.
[23] Pitkow, J., Schütze, H., Cass, T., Cooley, R., Turnbull, D., Edmonds, A., Adar, E., Breuel, T., Personalized Search, Communications of ACM, 45: pp. 5-55, 2002.
[24] Qiu, Y., Frei, H.P., Concept based query expansion, SIGIR'93.
[25] Sanderson, M., Retrieving with good sense, Inf. Ret., 2(1), 2000.
[26] Schamber, L., Eisenberg, M.B., Nilan, M.S., A reexamination of relevance: Towards a dynamic, situational definition, Information Processing and Management, 26(6), 1990.
[27] Schütze, H., Pedersen, J.O., A cooccurrence-based thesaurus and two applications to information retrieval, Information Processing and Management, 33(3).
[28] Shen, D., Pan, R., Sun, J-T., Pan, J.J., Wu, K., Yin, J., Yang, Q., Query enrichment for web-query classification, ACM-TOIS, 24(3), 2006.
[29] Shen, X., Tan, B., Zhai, C., Context-sensitive information retrieval using implicit feedback, SIGIR'05, pp. 43-5, 2005.
[30] Teevan, J., Dumais, S.T., Horvitz, E., Personalizing search via automated analysis of interests and activities, SIGIR'05, 2005.
[31] Voorhees, E., Query expansion using lexical-semantic relations, SIGIR'94.
[32] Xu, J., Croft, W.B., Query expansion using local and global document analysis, SIGIR'96, pp. 4-11.
[33] Yarowsky, D., Unsupervised word sense disambiguation rivaling supervised methods, ACL.
[34] Zhou, X., Hu, X., Zhang, X., Lin, X., Song, I-Y., Context-sensitive semantic smoothing for the language modeling approach to genomic IR, SIGIR'06, 2006.
[35] Zhai, C., Lafferty, J., Model-based feedback in the language modeling approach to information retrieval, CIKM'01, 2001.
[36] Zhai, C., Lafferty, J., A study of smoothing methods for language models applied to ad hoc information retrieval, SIGIR.
