Using Query Contexts in Information Retrieval Jing Bai 1, Jian-Yun Nie 1, Hugues Bouchard 2, Guihong Cao 1 1 Department IRO, University of Montreal


Jing Bai 1, Jian-Yun Nie 1, Hugues Bouchard 2, Guihong Cao 1
1 Department IRO, University of Montreal, C.P. 6128, succursale Centre-ville, Montreal, Quebec, H3C 3J7, Canada
{baijing, nie, caogui}@iro.umontreal.ca
2 Yahoo! Inc., Montreal, Quebec, Canada
bouchard@yahoo-inc.com

ABSTRACT
The user query is an element that specifies an information need, but it is not the only one. Studies in the literature have found many contextual factors that strongly influence the interpretation of a query. Recent studies have tried to consider the user's interests by creating a user profile. However, a single profile for a user may not be sufficient for the variety of queries that the user issues. In this study, we propose to use query-specific contexts instead of user-centric ones, including context around query and context within query. The former specifies the environment of a query, such as the domain of interest, while the latter refers to context words within the query, which is particularly useful for the selection of relevant term relations. In this paper, both types of context are integrated in an IR model based on language modeling. Our experiments on several TREC collections show that each of the context factors brings significant improvements in retrieval effectiveness.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Retrieval Models

General Terms
Algorithms, Performance, Experimentation, Theory.

Keywords
Query contexts, Domain model, Term relation, Language model.

1. INTRODUCTION
Queries, especially short queries, do not provide a complete specification of the information need. Many relevant terms can be absent from queries, and the terms that are included may be ambiguous. These issues have been addressed in a large number of previous studies. Typical solutions include expanding either the document or the query representation [19][35] by exploiting different resources [24][31], using word sense disambiguation [25], etc.
In these studies, however, it has generally been assumed that the query is the only element available about the user's information need. In reality, a query is always formulated in a search context. As has been found in many previous studies [2][14][2][21][26], contextual factors have a strong influence on relevance judgments. These factors include, among many others, the user's domain of interest, knowledge, preferences, etc. All these elements specify the contexts around the query, so we call them context around query in this paper. It has been demonstrated that the user's query should be placed in its context for a correct interpretation.

Recent studies have investigated the integration of some contexts around the query [9][3][23]. Typically, a user profile is constructed to reflect the user's domains of interest and background, and it is used to favor the documents that are more closely related to the profile. However, a single profile for a user can group a variety of different domains, which are not always relevant to a particular query. For example, if a user working in computer science issues the query "Java hotel", the documents on the "Java language" will be incorrectly favored. A possible solution to this problem is to use query-related profiles or models instead of user-centric ones. In this paper, we propose to model topic domains, among which the related one(s) will be selected for a given query. This method allows us to select a more appropriate query-specific context around the query.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'07, July 23-27, 2007, Amsterdam, The Netherlands. Copyright 2007 ACM ...$5.00.
Another strong contextual factor identified in the literature is domain knowledge, or domain-specific term relations, such as "program → computer" in computer science. Using this relation, one would be able to expand the query "program" with the term "computer". However, domain knowledge is available only for a few domains (e.g. "Medicine"). The shortage of domain knowledge has led to the utilization of general knowledge for query expansion [31], which is more readily available from resources such as thesauri, or can be automatically extracted from documents [24][27]. However, the use of general knowledge gives rise to an enormous problem of knowledge ambiguity [31]: we are often unable to determine whether a relation applies to a query. For example, usually little information is available to determine whether "program → computer" is applicable to the queries "Java program" and "TV program". Therefore, the relation has been applied to all queries containing "program" in previous studies, leading to a wrong expansion for "TV program".

Looking at the two query examples, however, people can easily determine whether the relation is applicable by considering the context words "Java" and "TV". So the important question is how we can use these context words in queries to select the appropriate relations to apply. These context words form a context within query. In some previous studies [24][31], context words in a query have been used to select expansion terms suggested by term relations, which are, however, context-independent (such as "program → computer"). Although improvements are observed in some cases, they are limited. We argue that the problem stems from the lack of necessary context information in the relations themselves, and that a more radical solution lies in the addition of contexts to relations. The method we propose is to add context words into the condition of a relation, such as "{Java, program} → computer", to limit its applicability to the appropriate context.

This paper aims to make contributions on the following aspects:
- Query-specific domain model: We construct more specific domain models instead of a single user model grouping all the domains. The domain related to a specific query is selected (either manually or automatically) for each query.
- Context within query: We integrate context words in term relations so that only appropriate relations can be applied to the query.
- Multiple contextual factors: Finally, we propose a framework based on the language modeling approach to integrate multiple contextual factors.

Our approach has been tested on several TREC collections. The experiments clearly show that both types of context can result in significant improvements in retrieval effectiveness, and that their effects are complementary. We will also show that it is possible to determine the query domain automatically, and that this results in effectiveness comparable to a manual specification of the domain.

This paper is organized as follows. In Section 2, we review some related work and introduce the principle of our approach. Section 3 presents our general model. Sections 4 and 5 then describe the domain model and the knowledge model, respectively. Section 6 explains the method for parameter training. Experiments are presented in Section 7 and conclusions in Section 8.

2. CONTEXTS AND UTILIZATION IN IR
There are many contextual factors in IR: the user's domain of interest, knowledge about the subject, preference, document recency, and so on [2][14]. Among them, the user's domain of interest and knowledge are considered to be among the most important ones [2][21]. In this section, we review some of the studies in IR concerning these aspects.

2.1 Domain of interest and context around query
A domain of interest specifies a particular background for the interpretation of a query. It can be used in different ways. Most often, a user profile is created to encompass all the domains of interest of a user [23]. In [5], a user profile contains a set of topic categories of ODP (Open Directory Project) identified by the user.
The documents (Web pages) classified in these categories are used to create a term vector, which represents all the domains of interest of the user. On the other hand, [9][15][26][3], as well as Google Personalized Search [12], use the documents read by the user, stored on the user's computer or extracted from the user's search history. In all these studies, we observe that a single user profile (usually a statistical model or vector) is created for a user without distinguishing the different topic domains. The systematic application of the user profile can incorrectly bias the results for queries unrelated to the profile. This situation can often occur in practice, as a user can search for a variety of topics outside the domains that he has previously searched in or identified.

A possible solution to this problem is the creation of multiple profiles, one for each separate domain of interest. The domains related to a query are then identified according to the query. This enables us to use a more appropriate query-specific profile instead of a user-centric one. This approach is used in [18], in which ODP directories are used. However, only a small-scale experiment has been carried out. A similar approach is used in [8], where domain models are created using ODP categories and user queries are manually mapped to them. However, the experiments showed variable results. It remains unclear whether domain models can be effectively used in IR.

In this study, we also model topic domains. We will carry out experiments on both automatic and manual identification of query domains. Domain models will also be integrated with other factors. In the following discussion, we will call the topic domain of a query a context around query, to contrast it with another context within query that we will introduce.

2.2 Knowledge and context within query
Due to the unavailability of domain-specific knowledge, general knowledge resources such as WordNet and term relations extracted automatically have been used for query expansion [27][31]. In both cases, the relations are defined between two single terms, such as "t1 → t2".
If a query contains term t1, then t2 is always considered as a candidate for expansion. As we mentioned earlier, we are faced with the problem of relation ambiguity: some relations apply to a query and others should not. For example, "program → computer" should not be applied to "TV program" even though the latter contains "program". However, little information is available in the relation to help us determine whether an application context is appropriate.

To remedy this problem, approaches have been proposed to make a selection of expansion terms after the application of relations [24][31]. Typically, one defines some sort of global relation between the expansion term and the whole query, which is usually a sum of its relations to every query word. Although some inappropriate expansion terms can be removed because they are only weakly connected to some query terms, many others remain. For example, if the relation "program → computer" is strong enough, "computer" will have a strong global relation to the whole query "TV program" and will still remain as an expansion term.

It is possible to integrate stronger control on the utilization of knowledge. For example, [17] defined strong logical relations to encode knowledge of different domains. If the application of a relation leads to a conflict with the query (or with other pieces of evidence), then it is not applied. However, this approach requires encoding all the logical consequences, including contradictions, in the knowledge, which is difficult to implement in practice.

In our earlier study [1], a simpler and more general approach is proposed to solve the problem at its source, i.e. the lack of context information in term relations: by introducing stricter conditions in a relation, for example "{Java, program} → computer" and "{algorithm, program} → computer", the applicability of the relations is naturally restricted to the correct contexts. As a result, "computer" will be used to expand the queries "Java program" or "program algorithm", but not "TV program". This principle is similar to that of [33] for word sense disambiguation.
However, we do not explicitly assign a meaning to a word; rather, we try to differentiate between word usages in different contexts. From this point of view, our approach is more similar to word sense discrimination [27].

In this paper, we use the same approach and integrate it into a more global model with other context factors. As the context words added into relations allow us to exploit the word context within the query, we call such factors context within query. Within-query context exists in many queries. In fact, users often do not use a single ambiguous word such as "Java" as a query (if they are aware of its ambiguity). Some context words are often used together with it. In these cases, contexts within query are created and can be exploited.

2.3 Query profile and other factors
Many attempts have been made in IR to create query-specific profiles. We can consider implicit feedback or blind feedback [7][16][29][32][35] as belonging to this family. A short-term feedback model is created for the given query from feedback documents, which has been proven effective in capturing some aspects of the user's intent behind the query. In order to create a good query model, such a query-specific feedback model should be integrated.

There are many other contextual factors ([26]) that we do not deal with in this paper. However, it seems clear that many factors are complementary. As found in [32], a feedback model creates a local context related to the query, while the general knowledge or the whole corpus defines a global context. Both types of contexts have been proven useful [32]. The domain model specifies yet another type of useful information: it reflects a set of specific background terms for a domain, for example "pollution", "rain", "greenhouse", etc. for the domain of "Environment". These terms are often presumed when a user issues a query such as "waste cleanup" in the domain. It is useful to add them into the query. We see a clear complementarity among these factors. It is therefore useful to combine them in a single IR model.

In this study, we will integrate all the above factors within a unified framework based on language modeling. Each component contextual factor will determine a different ranking score, and the final document ranking combines all of them. This is described in the following section.

3. GENERAL IR MODEL
In the language modeling framework, a typical score function is defined in terms of KL-divergence as follows:

$$\mathrm{Score}(Q, D) = \sum_{t \in V} P(t \mid \theta_Q) \log P(t \mid \theta_D) \propto -KL(\theta_Q \,\|\, \theta_D) \tag{1}$$

where $\theta_D$ is a (unigram) language model created for a document $D$, $\theta_Q$ a language model for the query $Q$, and $V$ the vocabulary.
Smoothing of the document model is recognized to be crucial [35], and one of the common smoothing methods is Jelinek-Mercer interpolation smoothing:

$$P(t \mid \theta'_D) = (1 - \lambda)\, P(t \mid \theta_D) + \lambda\, P(t \mid \theta_C) \tag{2}$$

where $\lambda$ is an interpolation parameter and $\theta_C$ the collection model.

In the basic language modeling approaches, the query model is estimated by Maximum Likelihood Estimation (MLE) without any smoothing. In such a setting, the basic retrieval operation is still limited to keyword matching, according to a few words in the query. To improve retrieval effectiveness, it is important to create a more complete query model that better represents the information need. In particular, all the related and presumed words should be included in the query model. Several methods have been proposed to create a more complete query model, using feedback documents [16][35] or term relations [1][1][34]. In these cases, we construct two models for the query: the initial query model containing only the original terms, and a new model containing the added terms. They are then combined through interpolation.

In this paper, we generalize this approach and integrate more models for the query. Let us use $\theta_Q^0$ to denote the original query model, $\theta_Q^F$ the feedback model created from feedback documents, $\theta_Q^{Dom}$ a domain model and $\theta_Q^K$ a knowledge model created by applying term relations. $\theta_Q^0$ can be created by MLE. $\theta_Q^F$ has been used in several previous studies [16][35]. In this paper, $\theta_Q^F$ is extracted using the 2 blind feedback documents. We will describe the details of constructing $\theta_Q^{Dom}$ and $\theta_Q^K$ in Sections 4 and 5.

Given these models, we create the following final query model by interpolation:

$$P(t \mid \theta_Q) = \sum_{i \in X} \alpha_i\, P(t \mid \theta_Q^i) \tag{3}$$

where $X = \{0, Dom, K, F\}$ is the set of all component models and the $\alpha_i$ (with $\sum_{i \in X} \alpha_i = 1$) are their mixture weights. The document score in Equation (1) is then extended as follows:

$$\mathrm{Score}(Q, D) = \sum_{t \in V} \sum_{i \in X} \alpha_i\, P(t \mid \theta_Q^i) \log P(t \mid \theta_D) = \sum_{i \in X} \alpha_i\, \mathrm{Score}_i(Q, D) \tag{4}$$

where $\mathrm{Score}_i(Q, D) = \sum_{t \in V} P(t \mid \theta_Q^i) \log P(t \mid \theta_D)$ is the score according to each component model.
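The interpolation in Equations (2) to (4) can be illustrated with a minimal Python sketch. The function names, the toy documents and the weights below are hypothetical and chosen only for illustration; skipping terms whose smoothed probability is zero is also an assumption of this sketch, not something stated in the paper.

```python
import math
from collections import Counter

def mle(tokens):
    """Maximum likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def smoothed_doc_prob(term, doc_model, coll_model, lam=0.5):
    """Equation (2): Jelinek-Mercer interpolation with the collection model."""
    return (1 - lam) * doc_model.get(term, 0.0) + lam * coll_model.get(term, 0.0)

def component_score(query_model, doc_model, coll_model, lam=0.5):
    """Score_i(Q, D) = sum_t P(t | theta_Q^i) * log P(t | theta_D)."""
    score = 0.0
    for term, p_q in query_model.items():
        p_d = smoothed_doc_prob(term, doc_model, coll_model, lam)
        if p_q > 0 and p_d > 0:  # assumption: terms unseen everywhere are skipped
            score += p_q * math.log(p_d)
    return score

def mix_query_models(component_models, weights):
    """Equation (3): interpolate the component query models."""
    mixed = {}
    for name, model in component_models.items():
        for term, p in model.items():
            mixed[term] = mixed.get(term, 0.0) + weights[name] * p
    return mixed

def combined_score(component_models, weights, doc_model, coll_model, lam=0.5):
    """Equation (4): weighted sum of the component scores."""
    return sum(
        weights[name] * component_score(model, doc_model, coll_model, lam)
        for name, model in component_models.items()
    )

# Toy data (hypothetical): an original query model "0" and a knowledge model "K".
coll = mle("java program computer hotel island travel program tv".split())
doc = mle("java program computer".split())
models = {"0": {"java": 0.5, "program": 0.5}, "K": {"computer": 1.0}}
weights = {"0": 0.7, "K": 0.3}

mixed = mix_query_models(models, weights)
score_by_components = combined_score(models, weights, doc, coll)
score_by_mixture = component_score(mixed, doc, coll)
```

Because the score is linear in the query model, scoring with the mixed model and summing the weighted component scores give the same value, which is why the final ranking can be computed component by component.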
Here we can see that our strategy of enhancing the query model with contextual factors is equivalent to document re-ranking, which is used in [5][15][3].

The remaining problem is to construct the domain models and the knowledge model and to combine all the models (parameter setting). We describe this in the following sections.

4. CONSTRUCTING AND USING DOMAIN MODELS
As in previous studies, we exploit a set of documents already classified in each domain. These documents can be identified in two different ways:
1) One can take advantage of an existing domain hierarchy and the documents manually classified in it, such as ODP. In that case, a new query should be classified into the same domains either manually or automatically.
2) A user can define his own domains. By assigning a domain to his queries, the system can gather a set of answers to the queries automatically, which are then considered to be in-domain documents. The answers could be those that the user has read, browsed through, or judged relevant to an in-domain query, or they can simply be the top-ranked retrieval results.

An earlier study [4] compared the above two strategies using TREC queries 51-150, for which a domain has been manually assigned. These domains were mapped to ODP categories. It was found that both approaches are equally effective and result in comparable performance. Therefore, in this study, we only use the second approach. This choice is also motivated by the possibility of comparing manual and automatic assignment of a domain to a new query. This will be explained in detail in our experiments.

Whatever the strategy, we obtain a set of documents for each domain, from which a language model can be extracted. If maximum likelihood estimation is used directly on these documents, the resulting domain model will contain both domain-specific terms and general terms, and the former will not emerge. Therefore, we employ an EM process to extract the specific part of the domain as follows: we assume that the documents in a domain are generated by a domain-specific model (to be extracted) and a general language model (the collection model). The likelihood of a document $D$ in the domain can then be formulated as follows:

$$P(D \mid \theta'_{Dom}) = \prod_{t \in D} \left[ (1 - \eta)\, P(t \mid \theta_{Dom}) + \eta\, P(t \mid \theta_C) \right]^{c(t; D)} \tag{5}$$

where $c(t; D)$ is the count of $t$ in document $D$ and $\eta$ is a smoothing parameter (which will be fixed at .5 as in [35]). The EM algorithm is used to extract the domain model $\theta_{Dom}$ that maximizes $P(Dom \mid \theta'_{Dom})$ (where $Dom$ is the set of documents in the domain), that is:

$$\theta_{Dom} = \arg\max_{\theta_{Dom}} P(Dom \mid \theta'_{Dom}) = \arg\max_{\theta_{Dom}} \prod_{D \in Dom} \prod_{t \in D} \left[ (1 - \eta)\, P(t \mid \theta_{Dom}) + \eta\, P(t \mid \theta_C) \right]^{c(t; D)} \tag{6}$$

This is the same process as the one used to extract the feedback model in [35]. It is able to extract the most specific words of the domain from the documents while filtering out the common words of the language. This can be observed in Table 1, which shows some words in the domain model of "Environment" before and after EM iterations (5 iterations).

Table 1. Term probabilities before/after EM (columns: Term, Initial, Final, Change; most numeric entries are not legible in this transcription)
Domain-specific terms listed: air, environment, rain, pollution, storm, flood, tornado, greenhouse.
Common terms listed, with the legible changes: year, system (-99%), program, million (-99%), make (-95%), company (-99%), president (-99%), month (-95%).

Given a set of domain models, the related ones have to be assigned to a new query. This can be done manually by the user or automatically by the system using query classification. We will compare both approaches.

Query classification has been investigated in several studies [18][28]. In this study, we use a simple classification method: the selected domain is the one with which the query's KL-divergence score is the lowest, i.e.:

$$\theta_Q^{Dom} = \arg\min_{\theta_{Dom}} \sum_{t \in Q} P(t \mid \theta_Q^0) \log \frac{P(t \mid \theta_Q^0)}{P(t \mid \theta_{Dom})} \tag{7}$$

This classification method is an extension of Naïve Bayes, as shown in [22].
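The EM extraction of Equations (5) and (6) and the KL-based classification of Equation (7) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the E-step and M-step follow the standard two-component mixture updates, the probability floor used in the classifier is an assumption added to avoid division by zero, and all names and toy data are hypothetical.

```python
import math
from collections import Counter

def extract_domain_model(domain_docs, coll_model, eta=0.5, iterations=5):
    """EM for the mixture (1 - eta) * P(t|theta_Dom) + eta * P(t|theta_C)."""
    counts = Counter(t for doc in domain_docs for t in doc)
    total = sum(counts.values())
    theta = {t: c / total for t, c in counts.items()}  # start from the MLE
    for _ in range(iterations):
        # E-step: probability that an occurrence of t came from the domain model.
        resp = {}
        for t, p in theta.items():
            dom = (1 - eta) * p
            bg = eta * coll_model.get(t, 0.0)
            resp[t] = dom / (dom + bg) if dom + bg > 0 else 0.0
        # M-step: re-estimate the domain model from the expected counts.
        norm = sum(counts[t] * resp[t] for t in counts)
        theta = {t: counts[t] * resp[t] / norm for t in counts}
    return theta

def classify_query(query_model, domain_models, floor=1e-9):
    """Equation (7): choose the domain with the lowest KL divergence.

    The floor on P(t | theta_Dom) is an assumption of this sketch."""
    def kl(domain_model):
        return sum(
            p_q * math.log(p_q / max(domain_model.get(t, 0.0), floor))
            for t, p_q in query_model.items() if p_q > 0
        )
    return min(domain_models, key=lambda name: kl(domain_models[name]))

# Toy data (hypothetical): "year" is frequent in the whole collection.
docs = [["pollution", "rain", "year"], ["pollution", "year", "storm"]]
collection = {"year": 0.5, "the": 0.47, "pollution": 0.01, "rain": 0.01, "storm": 0.01}
env_model = extract_domain_model(docs, collection)
domains = {"Environment": env_model, "Finance": {"bank": 0.5, "stock": 0.5}}
chosen = classify_query({"pollution": 0.5, "rain": 0.5}, domains)
```

In this toy run the common word "year" loses probability mass relative to its maximum likelihood estimate while "pollution" gains, which is the filtering effect described above.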
The score depending on the domain model is then as follows:

$$\mathrm{Score}_{Dom}(Q, D) = \sum_{t \in V} P(t \mid \theta_Q^{Dom}) \log P(t \mid \theta_D) \tag{8}$$

Although the above equation requires using all the terms in the vocabulary, in practice only the strongest terms in the domain model are useful, and the terms with low probabilities are often noise. Therefore, we only retain the top 1 strongest terms. The same strategy is used for the Knowledge model.

Although domain models are more refined than a single user profile, the topics in a single domain can still be very different, making the domain model too large. This is particularly true for large domains such as "Science and technology" defined in TREC queries. Using such a large domain model as the background can introduce many noise terms. Therefore, we further construct a sub-domain model more related to the given query, by using a subset of in-domain documents that are related to the query. These documents are the top-ranked documents retrieved with the original query within the domain. This approach is in effect a combination of domain and feedback models. In our experiments, we will see that this further specification of the sub-domain is necessary in some cases, but not in all, especially when the Feedback model is also used.

5. EXTRACTING CONTEXT-DEPENDENT TERM RELATIONS FROM DOCUMENTS
In this paper, we extract term relations from the document collection automatically. In general, a term relation can be represented as A → B. Both A and B have been restricted to single terms in previous studies. A single term in A means that the relation is applicable to all the queries containing that term. As we explained earlier, this is the source of many wrong applications. The solution we propose is to add more context terms into A, so that the relation is applicable only when all the terms in A appear in a query. For example, instead of creating a context-independent relation "Java → program", we will create "{Java, computer} → program", which means that "program" is selected when both "Java" and "computer" appear in a query. The term added to the condition specifies a stricter context in which to apply the relation.
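The applicability rule for relations with a two-term condition can be made concrete with a short sketch. The relation table, its probabilities and the function name below are hypothetical and serve only to illustrate the matching rule that every condition term must occur in the query.

```python
def applicable_relations(query_terms, relations):
    """Return (condition, target, probability) for every relation whose
    whole condition set is contained in the query."""
    terms = set(query_terms)
    matches = []
    for condition, targets in relations.items():
        if condition <= terms:  # all condition terms appear in the query
            for target, prob in targets.items():
                matches.append((sorted(condition), target, prob))
    return matches

# Hypothetical relation table: condition set -> {expansion term: probability}.
RELATIONS = {
    frozenset({"java", "program"}): {"computer": 0.3},
    frozenset({"java", "hotel"}): {"island": 0.2},
}

java_program = applicable_relations(["java", "program"], RELATIONS)
tv_program = applicable_relations(["tv", "program"], RELATIONS)
single_word = applicable_relations(["program"], RELATIONS)
```

With this table, "computer" is suggested for the query "Java program", while "TV program" and the single-word query "program" trigger no relation at all.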
We call this type of relation a context-dependent relation.

In principle, the addition is not restricted to one term. However, we will make this restriction for the following reasons:
- User queries are usually very short. Adding more terms into the condition will create many rarely applicable relations;
- In most cases, an ambiguous word such as "Java" can be effectively disambiguated by one useful context word such as "computer" or "hotel";
- The addition of more terms will also lead to a higher space and time complexity for extracting and storing term relations.

The extraction of relations of type "{t_j, t_k} → t_i" can be performed using mining algorithms for association rules [13]. Here, we use a simple co-occurrence analysis. Windows of fixed size (1 words in our case) are used to obtain co-occurrence counts of three terms, and the probability $P(t_i \mid t_j t_k)$ is determined as follows:

$$P(t_i \mid t_j t_k) = \frac{c(t_i, t_j, t_k)}{\sum_{t_l} c(t_l, t_j, t_k)} \tag{9}$$

where $c(t_i, t_j, t_k)$ is the count of co-occurrences.

In order to reduce the space requirement, we further apply the following filtering criteria:
- The two terms in the condition should appear together at least a certain number of times in the collection (1 in our case) and they should be related. We use the following pointwise mutual information as a measure of relatedness (MI > 0) [6]:

$$MI(t_j, t_k) = \log \frac{P(t_j, t_k)}{P(t_j)\, P(t_k)}$$

- The probability of a relation should be higher than a threshold (.1 in our case).

Given a set of relations, the corresponding Knowledge model is defined as follows:

$$P(t_i \mid \theta_Q^K) = \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k)\, P(t_j t_k \mid \theta_Q^0) = \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k)\, P(t_j \mid \theta_Q^0)\, P(t_k \mid \theta_Q^0) \tag{10}$$

where $(t_j t_k) \in Q$ means any combination of two terms in the query. This is a direct extension of the translation model proposed in [3] to our context-dependent relations. The score according to the Knowledge model is then defined as follows:

$$\mathrm{Score}_K(Q, D) = \sum_{t_i \in V} \sum_{(t_j t_k) \in Q} P(t_i \mid t_j t_k)\, P(t_j \mid \theta_Q^0)\, P(t_k \mid \theta_Q^0) \log P(t_i \mid \theta_D) \tag{11}$$

Again, only the top 1 expansion terms are used.

6. MODEL PARAMETERS
There are several parameters in our model: $\lambda$ in Equation (2) and $\alpha_i$ ($i \in \{0, Dom, K, F\}$) in Equation (3). As the parameter $\lambda$ only affects the document model, we set it to the same value in all our experiments. The value $\lambda$ = .5 is determined to maximize the effectiveness of the baseline models (see Section 7.2) on the training data: TREC queries 1-50 and documents on Disk 2.

The mixture weights $\alpha_i$ of the component models are trained on the same training data using the following method of line search [11] to maximize the Mean Average Precision (MAP): each parameter is considered as a search direction. We start by searching in one direction, testing all the values in that direction while keeping the values in the other directions unchanged. Each direction is searched in turn, until no improvement in MAP is observed. In order to avoid being trapped at a local maximum, we started from 1 random points and the best setting is selected.

7. EXPERIMENTS
7.1 Setting
The main test data are those from the TREC 1-3 ad hoc and filtering tracks, including queries 1-150 and documents on Disks 1-3. The choice of this test collection is due to the availability of a manually specified domain for each query. This allows us to compare with an approach using automatic domain identification. Below is an example of a topic:

<num> Number: 13
<dom> Domain: Law and Government
<title> Topic: Welfare Reform

We only use topic titles in all our tests. Queries 1-50 are used for training and 51-150 for testing.
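Returning to the parameter training of Section 6, the direction-by-direction line search with random restarts can be sketched as below. Several details are assumptions of this sketch rather than statements from the paper: the candidate grid, the number of restarts, searching over unnormalized values that are rescaled to sum to 1 before evaluation, and the toy objective that stands in for MAP on the training queries.

```python
import random

def normalize(raw):
    """Rescale raw positive values so that the mixture weights sum to 1."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

def line_search(evaluate, names, grid, restarts=5, seed=0):
    """Search one direction at a time until no direction improves the objective."""
    rng = random.Random(seed)
    best_weights, best_value = None, float("-inf")
    for _ in range(restarts):
        raw = {name: rng.choice(grid) for name in names}  # random starting point
        value = evaluate(normalize(raw))
        improved = True
        while improved:
            improved = False
            for name in names:  # each parameter is one search direction
                for candidate in grid:
                    trial = dict(raw)
                    trial[name] = candidate  # other directions stay unchanged
                    trial_value = evaluate(normalize(trial))
                    if trial_value > value:
                        raw, value, improved = trial, trial_value, True
        if value > best_value:
            best_weights, best_value = normalize(raw), value
    return best_weights, best_value

def toy_objective(weights):
    """Hypothetical stand-in for MAP: peaks at an arbitrary target setting."""
    target = {"0": 0.2, "Dom": 0.1, "K": 0.2, "F": 0.5}
    return -sum((weights[n] - target[n]) ** 2 for n in target)

GRID = [0.1 * k for k in range(1, 11)]
best_weights, best_value = line_search(toy_objective, ["0", "Dom", "K", "F"], GRID)
```

In a real experiment `evaluate` would run retrieval on the training queries with the given weights and return MAP; the restarts reduce the risk of stopping at a poor local maximum.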
13 domains are defined in these queries and their distributions among the two sets of queries are shown in Fig. 1. We can see that the distribution varies strongly between domains and between the two query sets. We have also tested on TREC 7 and 8 data. For this series of tests, each collection is used in turn as training data while the other is used for testing. Some statistics of the data are described in Tab. 2.

[Figure 1. Distribution of domains. Series: Query 1-50 and Query 51-150. Domains: Environment, Finance, Int. Economics, Int. Finance, Int. Politics, Int. Relations, Law & Gov., Medical & Bio., Military, Politics, Sci. & Tech., US Economics, US Politics.]

[Table 2. TREC collection statistics. Columns: Collection, Document, Size (GB), Voc., # of Doc., Query. The numeric entries are not legible in this transcription.]

All the documents are preprocessed using the Porter stemmer in Lemur, and the standard stoplist is used. Some queries (4, 5 and 3 in the three query sets) only contain one word. For these queries, the knowledge model is not applicable.

On domain models, we examine several questions:
- When the query domain is specified manually, is it useful to incorporate the domain model?
- If the query domain is not specified, can it be determined automatically? How effective is this method?
- We described two ways to gather documents for a domain: either using documents judged relevant to queries in the domain or using documents retrieved for these queries. How do they compare?

On the Knowledge model, in addition to testing its effectiveness, we also want to compare the context-dependent relations with context-independent ones. Finally, we will see the impact of each component model when all the factors are combined.

7.2 Baseline Methods
Two baseline models are used: the classical unigram model without any expansion, and the model with Feedback. In all the experiments, document models are created using Jelinek-Mercer smoothing. This choice is made according to the observation in [36] that the method performs very well for long queries. In our case, as queries are expanded, they perform similarly to long queries.
In our preliminary tests, we also found that this method performed better than the other methods (e.g. Dirichlet), especially for the main baseline method with the Feedback model. Table 3 shows the retrieval effectiveness on all the collections.

Table 3. Baseline models (Unigram Model; columns: Without FB, With FB; rows per collection: AvgP, Recall, P@10)
Legible entries, relative change in AvgP with FB, in row order: (+49.3%), (+31.4%), (+21.87%). The remaining values are not legible in this transcription.

7.3 Knowledge Models
This model is combined with both baseline models (with or without feedback). We also compare the context-dependent knowledge model with the traditional context-independent term relations (defined between two single terms), which are used to expand queries. The latter selects the expansion terms with the strongest global relation to the query. This relation is measured by the sum of the relations to each of the query terms. This method is equivalent to [24]. It is also similar to the translation model [3]. We call it the Co-occurrence model in Table 4. A t-test is also performed for statistical significance.

Table 4. Knowledge models (rows per collection: AvgP, Recall, P@10; only the AvgP entries below are legible)
Co-occurrence, Without FB | Co-occurrence, With FB | Knowledge model, Without FB | Knowledge model, With FB
.1884 (+2.%) | (+3.75%)** | .2164 (+37.83%) | (+5.8%)**
(+1.8%) | (+8.%)* | .2157 (+3.25%) | (+1.34%)**
(+5.53%) | .2926 (+.58%) | .2724 (+14.12%)++ | .37 (+3.37%)
(The column Without FB is compared to the baseline model without feedback, while With FB is compared to the baseline with feedback. ++ and + mean significant changes in the t-test with respect to the baseline without feedback, at the level of p<.1 and p<.5, respectively. ** and * are similar but compared to the baseline model with feedback.)

As we can see, simple co-occurrence relations can produce relatively strong improvements, but context-dependent relations produce much stronger improvements in all cases, especially when feedback is not used. All the improvements over the co-occurrence model are statistically significant (this is not shown in the table). The large differences between the two types of relation clearly show that context-dependent relations are more appropriate for query expansion. This confirms our hypothesis that, by incorporating context information into relations, we can better determine the appropriate relations to apply and thus avoid introducing inappropriate expansion terms.
The following example further confirms this observation. We show the strongest expansion terms suggested by both types of relation for the query #384 "space station moon" (terms are stemmed; weights are given where legible):

Co-occurrence Relations: year, power, time, develop .8932, offic .8485, oper, earth .7843, work .781, radio .771, system .7627, build, includ .7377, state .776, program .762, nation .6937, open .6889, servic .689, air .6734, space .6685, nuclear .6521, full .6425, make .641, compani .6262, peopl .6244, project .6147, unit .6114, gener .636, dai .629

Context-Dependent Relations: space, mar, earth, man .3777, program .3377, project .2691, base, orbit .2519, build .2542, mission, call, explor .2161, launch, develop, shuttl, plan, flight, station .1645, intern .162, energi, oper, power, transport, construct .1216, nasa, nation, perman, japan, apollo .1997, lunar .1898

In comparison with the baseline model with feedback (Tab. 3), we see that the improvements made by the Knowledge model alone are slightly lower. However, when both models are combined, there are additional improvements over the Feedback model, and these improvements are statistically significant in 2 cases out of 3. This demonstrates that the impacts produced by feedback and term relations are different and complementary.

7.4 Domain Models
In this section, we test several strategies to create and use domain models, by exploiting the domain information of the query set in various ways.

Strategies for creating domain models:
C1 - With the relevant documents for the in-domain queries: this strategy simulates the case where we have an existing directory in which documents relevant to the domain are included.
C2 - With the top-1 documents retrieved with the in-domain queries: this strategy simulates the case where the user specifies a domain for his queries without judging document relevance, and the system gathers related documents from his search history.

Strategies for using domain models:
U1 - The domain model is determined by the user manually.
U2 - The domain model is determined by the system.

7.4.1 Creating Domain models
We test strategies C1 and C2.
In this series of tests, each of the queries is used in turn as the test query, while the other queries and their relevant documents (C1) or top-ranked retrieved documents (C2) are used to create domain models. The same method is used on queries 1-50 to tune the parameters.

Table 5. Domain models with relevant documents (C1) (rows per collection: AvgP, Recall, P@10; the first collection is Disks 1-3 (U1); only the AvgP entries below are legible)
Domain, Without FB | Domain, With FB | Sub-Domain, Without FB | Sub-Domain, With FB
.17 (+8.28%) | (+4.69%)** | .1918 (+22.17%) | (+4.99%)**
(+3.56%) | (+9.79%)* | .1842 (+11.23%) | (+1.66%)**
(+2. 3%) | .2957 (+1.65%) | .2563 (+7.37%) | .2967 (+1.99%)

Table 6. Domain models with top-1 documents (C2) (same layout as Table 5)
Domain, Without FB | Domain, With FB | Sub-Domain, Without FB | Sub-Domain, With FB
.1718 (+9.43%) | (+4.78%)** | .1799 (+14.59%) | (+4.61%)**
(+6.58%) | (+1.6%)** | .1785 (+7.79%) | (+9.97%)**
(+1.97%) | .2949 (+1.38%) | .2441 (+2.26%) | .2961 (+1.79%)

We also compare the domain models created with all the in-domain documents (Domain) and with only the top-1 retrieved documents in the domain with the query (Sub-Domain). In these tests, we use manual identification of the query domain for Disks 1-3 (U1), but automatic identification for TREC 7 and 8.

First, it is interesting to notice that the incorporation of domain models generally improves retrieval effectiveness in all the cases. The improvements on Disks 1-3 and TREC 7 are statistically significant. However, the improvement scales are smaller than with the Feedback and Relation models. Looking at the distribution of the domains (Fig. 1), this observation is not surprising: for many domains, we only have a few training queries, and thus few in-domain documents with which to create domain models. In addition, topics in the same domain can vary greatly, in particular in large domains such as "science and technology", "international politics", etc.

Second, we observe that the two methods of creating domain models perform equally well (Tab. 6 vs. Tab. 5). In other words, providing relevance judgments for queries does not add much advantage for the purpose of creating domain models. This may seem surprising, but an analysis immediately shows the reason: a domain model (in the way we created it) only captures the term distribution in the domain. Relevant documents for all in-domain queries vary greatly. Therefore, in some large domains, characteristic terms have variable effects on queries. On the other hand, as we only use the term distribution, even if the top documents retrieved for the in-domain queries are irrelevant, they can still contain domain-characteristic terms, similarly to relevant documents. Thus both strategies produce very similar effects. This result opens the door to a simpler method that does not require relevance judgments, for example using search history.

Third, without the Feedback model, the sub-domain models constructed with relevant documents perform much better than the whole domain models (Tab. 5). However, once the Feedback model is used, the advantage disappears.
On one hand, this confirms our earlier hypothesis that a domain may be too large to be able to suggest relevant terms for new queries in the domain. It indirectly validates our first hypothesis that a single user model or profile may be too large, so smaller domain models are preferred. On the other hand, sub-domain models capture similar characteristics to Feedback model. So when the latter is used, sub-domain models become superfluous. However, if domain models are constructed with top-ranked documents (Tab. 6), sub-domain models make much less difference. This can be explained by the fact that the domains constructed with top-ranked documents tend to be more uniform than relevant documents with respect to term distribution, as the top retrieved documents usually have stronger statistical correspondence with the queries than the relevant documents.

Determining Query Domain Automatically

It is not realistic to always ask users to specify a domain for their queries. Here, we examine the possibility of identifying query domains automatically. Table 7 shows the results of this approach with both strategies for domain model construction. We can observe that the effectiveness is only slightly lower than that produced with manual identification of query domain (Tab. 5 & 6, Domain models). This shows that automatic domain identification is as effective a way to select a domain model as manual identification. It also demonstrates the feasibility of using domain models for queries when no domain information is provided.

Table 7. Automatic query domain identification
Dom. with rel. doc. (C1): Without FB | With FB; Dom. with top-1 doc. (C2): Without FB | With FB
Disks 1-3: (+5.1%) | (+4.27%)** | .167 (+6.37%) | (+4.48%)**
Recall P@

Table 8. Complete models (C1)
Man. dom. id. (U1) All Doc. Domain Auto. dom. id.
.251 (+6.7%)** | .2489 (+6.19%)**
Recall / P@
(+13.14%)**
Recall /4 674 N/A 3 14 P@
(+4.13%)**
Recall /4 728 N/A P@1.52

Looking at the accuracy of the automatic domain identification, however, it is surprisingly low: for queries 51-15, only 38% of the determined domains correspond to the manual identifications.
This is much lower than the above 80% rates reported in [18]. A detailed analysis reveals that the main reason is the closeness of several domains in TREC queries (e.g. "International relations", "International politics", "Politics"). However, in this situation, wrong domains assigned to queries are not always irrelevant and useless. For example, even when a query in "International relations" is classified in "International politics", the latter domain can still suggest useful terms to the query. Therefore, the relatively low classification accuracy does not mean low usefulness of the domain models.

7.5 Complete Models

The results with the complete model are shown in Table 8. This model integrates all the components described in this paper: Original query model, Feedback model, Domain model and Knowledge model. We have tested both strategies to create domain models, but the differences between them are very small. So we only report the results with the relevant documents.

Our first observation is that the complete models produce the best results. All the improvements over the baseline model (with feedback) are statistically significant. This result confirms that the integration of contextual factors is effective. Compared to the other results, we see consistent, although in some cases small, improvements over all the partial models.

Looking at the mixture weights, which may reflect the importance of each model, we observed that the best settings in all the collections vary in the following ranges: $0.1 \le \alpha_0 \le 0.2$, $0.1 \le \alpha_{Dom} \le 0.2$, $0.1 \le \alpha_K \le 0.2$ and $0.5 \le \alpha_F \le 0.6$. We see that the most important factor is Feedback model. This is also the single factor which produced the highest improvements over the original query model. This observation seems to indicate that this model has the highest capability to capture the information need behind the query. However, even with lower weights, the other models do have strong impacts on the final effectiveness. This demonstrates the benefit of integrating more contextual factors in IR.
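The role of the mixture weights can be illustrated with a small sketch. It assumes that the component models are combined by linear interpolation, P(t) = sum_i alpha_i * P_i(t), which is the usual reading of "mixture weights" but is not spelled out in this section. The function name `mix_models`, the component names and the toy distributions are hypothetical, and the weights are merely one setting inside the ranges reported above.

```python
def mix_models(models, weights):
    """Linearly interpolate unigram models: P(t) = sum_i weights[i] * P_i(t).

    models:  dict mapping component name -> {term: probability}
    weights: dict mapping component name -> mixture weight (must sum to 1)
    """
    if set(models) != set(weights):
        raise ValueError("models and weights must name the same components")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    # Vocabulary of the mixture: every term seen in any component.
    vocab = set().union(*(m.keys() for m in models.values()))
    return {
        t: sum(weights[name] * models[name].get(t, 0.0) for name in models)
        for t in vocab
    }


# Toy component models (hypothetical values, for illustration only).
models = {
    "original": {"java": 1.0},
    "feedback": {"java": 0.5, "coffee": 0.5},
    "domain": {"programming": 1.0},
    "knowledge": {"language": 1.0},
}
# One setting inside the reported ranges; the four weights sum to 1.
weights = {"original": 0.15, "feedback": 0.55, "domain": 0.15, "knowledge": 0.15}
query_model = mix_models(models, weights)
```

Even with a weight of only 0.15, the domain and knowledge components add terms that are absent from both the original query and the feedback model, which is one way to read the observation that lower-weighted models still affect the final effectiveness.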

8. CONCLUSIONS

Traditional IR approaches usually consider the query as the only element available for the user information need. Many previous studies have investigated the integration of some contextual factors in IR models, typically by incorporating a user profile. In this paper, we argue that a single user profile (or model) can contain too large a variety of different topics, so that new queries can be incorrectly biased. Similarly to some previous studies, we propose to model topic domains instead of the user. Previous investigations on context focused on factors around the query. We showed in this paper that factors within the query are also important: they help select the appropriate term relations to apply in query expansion. We have integrated the above contextual factors, together with feedback model, in a single language model. Our experimental results strongly confirm the benefit of using contexts in IR. This work also shows that the language modeling framework is appropriate for integrating many contextual factors.

This work can be further improved on several aspects, including other methods to extract term relations, to integrate more context words in conditions and to identify query domains. It would also be interesting to test the method on Web search using user search history. We will investigate these problems in our future research.

9. REFERENCES

[1] Bai, J., Nie, J.Y., Cao, G., Context-dependent term relations for information retrieval, EMNLP'06, 2006.
[2] Belkin, N.J., Interaction with texts: Information retrieval as information seeking behavior, Information Retrieval '93: Von der Modellierung zur Anwendung, Konstanz: Krause & Womser-Hacker.
[3] Berger, A., Lafferty, J., Information retrieval as statistical translation, SIGIR'99.
[4] Bouchard, H., Nie, J.Y., Modèles de langue appliqués à la recherche d'information contextuelle, Conf. en Recherche d'Information et Applications (CORIA), Lyon, 2006.
[5] Chirita, P.A., Paiu, R., Nejdl, W., Kohlschütter, C., Using ODP metadata to personalize search, SIGIR, 2005.
[6] Church, K. W., Hanks, P., Word association norms, mutual information, and lexicography, ACL.
[7] Croft, W. B., Cronen-Townsend, S., Lavrenko, V., Relevance feedback and personalization: A language modeling perspective, In: The DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries, 2006.
[8] Croft, W. B., Wei, X., Context-based topic models for query modification, CIIR Technical Report, University of Massachusetts, 2005.
[9] Dumais, S., Cutrell, E., Cadiz, J., Jancke, G., Sarin, R., Robbins, D. C., Stuff I've seen: a system for personal information retrieval and re-use, SIGIR'03, 2003.
[10] Fang, H., Zhai, C., Semantic term matching in axiomatic approaches to information retrieval, SIGIR'06, 2006.
[11] Gao, J., Qi, H., Xia, X., Nie, J.-Y., Linear discriminative model for information retrieval, SIGIR'05, 2005.
[12] Google Personalized Search.
[13] Hipp, J., Guntzer, U., Nakhaeizadeh, G., Algorithms for association rule mining - a general survey and comparison, SIGKDD Explorations, 2(1), 2000.
[14] Ingwersen, P., Järvelin, K., Information retrieval in context: IRiX, SIGIR Forum, 39, 2004.
[15] Kim, H.-R., Chan, P.K., Personalized ranking of search results with learned user interest hierarchies from bookmarks, WEBKDD'05 Workshop at ACM-KDD, 2005.
[16] Lavrenko, V., Croft, W. B., Relevance-based language models, SIGIR'01, 2001.
[17] Lau, R., Bruza, P., Song, D., Belief revision for adaptive information retrieval, SIGIR'04, 2004.
[18] Liu, F., Yu, C., Meng, W., Personalized web search by mapping user queries to categories, CIKM'02.
[19] Liu, X., Croft, W. B., Cluster-based retrieval using language models, SIGIR'04, 2004.
[20] Morris, R.C., Toward a user-centered information service, JASIS, 45: pp. 2-3.
[21] Park, T.K., Toward a theory of user-based relevance: A call for a new paradigm of inquiry, JASIS, 45.
[22] Peng, F., Schuurmans, D., Wang, S., Augmenting Naive Bayes Classifiers with Statistical Language Models, Inf. Retr., 7(3-4), 2004.
[23] Pitkow, J., Schütze, H., Cass, T., Cooley, R., Turnbull, D., Edmonds, A., Adar, E., Breuel, T., Personalized Search, Communications of ACM, 45: pp. 5-55, 2002.
[24] Qiu, Y., Frei, H.P., Concept based query expansion, SIGIR'93.
[25] Sanderson, M., Retrieving with good sense, Inf. Ret., 2(1), 2000.
[26] Schamber, L., Eisenberg, M.B., Nilan, M.S., A reexamination of relevance: Towards a dynamic, situational definition, Information Processing and Management, 26(6), 1990.
[27] Schütze, H., Pedersen, J.O., A cooccurrence-based thesaurus and two applications to information retrieval, Information Processing and Management, 33(3).
[28] Shen, D., Pan, R., Sun, J-T., Pan, J.J., Wu, K., Yin, J., Yang, Q., Query enrichment for web-query classification, ACM-TOIS, 24(3), 2006.
[29] Shen, X., Tan, B., Zhai, C., Context-sensitive information retrieval using implicit feedback, SIGIR'05, pp. 43-5, 2005.
[30] Teevan, J., Dumais, S.T., Horvitz, E., Personalizing search via automated analysis of interests and activities, SIGIR'05, 2005.
[31] Voorhees, E., Query expansion using lexical-semantic relations, SIGIR'94.
[32] Xu, J., Croft, W.B., Query expansion using local and global document analysis, SIGIR'96, pp. 4-11.
[33] Yarowsky, D., Unsupervised word sense disambiguation rivaling supervised methods, ACL.
[34] Zhou, X., Hu, X., Zhang, X., Lin, X., Song, I-Y., Context-sensitive semantic smoothing for the language modeling approach to genomic IR, SIGIR'06, 2006.
[35] Zhai, C., Lafferty, J., Model-based feedback in the language modeling approach to information retrieval, CIKM'01, 2001.
[36] Zhai, C., Lafferty, J., A study of smoothing methods for language models applied to ad hoc information retrieval, SIGIR.


More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Extraction of User Preferences from a Few Positive Documents

Extraction of User Preferences from a Few Positive Documents Extracton of User Preferences from a Few Postve Documents Byeong Man Km, Qng L Dept. of Computer Scences Kumoh Natonal Insttute of Technology Kum, kyungpook, 730-70,South Korea (Bmkm, lqng)@se.kumoh.ac.kr

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques Enhancement of Infrequent Purchased Product Recommendaton Usng Data Mnng Technques Noraswalza Abdullah, Yue Xu, Shlomo Geva, and Mark Loo Dscplne of Computer Scence Faculty of Scence and Technology Queensland

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

A Generation Model to Unify Topic Relevance and Lexicon-based Sentiment for Opinion Retrieval

A Generation Model to Unify Topic Relevance and Lexicon-based Sentiment for Opinion Retrieval A Generaton Model to Unfy Topc Relevance and Lexcon-based Sentment for Opnon Retreval Mn Zhang State key lab of Intellgent Tech.& Sys, Dept. of Computer Scence, Tsnghua Unversty, Bejng, 00084, Chna 86-0-6279-2595

More information

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Parameter estimation for incomplete bivariate longitudinal data in clinical trials

Parameter estimation for incomplete bivariate longitudinal data in clinical trials Parameter estmaton for ncomplete bvarate longtudnal data n clncal trals Naum M. Khutoryansky Novo Nordsk Pharmaceutcals, Inc., Prnceton, NJ ABSTRACT Bvarate models are useful when analyzng longtudnal data

More information

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

The Effect of Similarity Measures on The Quality of Query Clusters

The Effect of Similarity Measures on The Quality of Query Clusters The effect of smlarty measures on the qualty of query clusters. Fu. L., Goh, D.H., Foo, S., & Na, J.C. (2004). Journal of Informaton Scence, 30(5) 396-407 The Effect of Smlarty Measures on The Qualty of

More information

Professional competences training path for an e-commerce major, based on the ISM method

Professional competences training path for an e-commerce major, based on the ISM method World Transactons on Engneerng and Technology Educaton Vol.14, No.4, 2016 2016 WIETE Professonal competences tranng path for an e-commerce maor, based on the ISM method Ru Wang, Pn Peng, L-gang Lu & Lng

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Keyword-based Document Clustering

Keyword-based Document Clustering Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of

More information

CHAPTER 2 DECOMPOSITION OF GRAPHS

CHAPTER 2 DECOMPOSITION OF GRAPHS CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng

More information

Personalized Concept-Based Clustering of Search Engine Queries

Personalized Concept-Based Clustering of Search Engine Queries IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 Personalzed Concept-Based Clusterng of Search Engne Queres Kenneth Wa-Tng Leung, Wlfred Ng, and Dk Lun Lee Abstract The exponental growth of nformaton

More information