A new query expansion method based on query logs mining¹


International Journal on Asian Language Processing, 19 (1)

Zhu Kunpeng, Wang Xiaolong, Liu Yuanchao
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Email: {kpzhu, wangxl, ycliu}@insun.hit.edu.cn

Abstract: Query expansion has long been suggested as an effective way to improve the performance of information retrieval systems by adding relevant terms to the original queries. However, most previous research has been limited to extracting new terms from a subset of relevant documents and has not exploited information about user interactions. In this paper, we propose a method for automatic query expansion based on user interactions recorded in query logs. The central idea is to extract correlations among queries by analyzing the common documents that users selected for them, so that the expansion terms come from the associated queries rather than from the relevant documents. In particular, we argue that queries should be dealt with in different ways according to their ambiguity degrees, which can be calculated from the log information. We verify this method on a large-scale query log collection. The experimental results show that the method makes good use of the knowledge of user interactions and can remarkably improve search performance.

Keywords: Query expansion, log mining, information retrieval, search engine

1. Introduction

With the rapid growth of information on the World Wide Web, more and more users need search engine technology to help them exploit such an extremely valuable resource. Although many search engine systems have been successfully deployed, current search systems are still far from optimal because they use simple keywords to search and rank relevant documents.
A well-known limitation of current search engine systems is the difficulty of dealing with synonymy (different words describing the same thing) and polysemy (the same word describing different things). For example, a farmer may use the query 苹果 (apple) to get information about the fruit, while computer enthusiasts may use the same query to find results about the computer brand. When such a query is issued, it is difficult for the search engine to decide which information the user wishes to get. Another problem of search engines is that web users typically submit very short queries, and the average length of web queries is less than two words (Wen J. R. 2001). Short queries do not provide sufficient indications for an effective selection of relevant documents and thus negatively affect the performance of web search in terms of both precision and recall.

¹ Supported by National Natural Science Foundation of China ( , ) and The National High Technology Research and Development Program of China (2006AA01Z197, 2007AA01Z172)

To overcome the above problems, researchers have focused on using query expansion techniques to help users formulate a better query. Query expansion is a method for improving the effectiveness of information retrieval through the reformulation of queries by providing additional contextual information to the original queries. It has been shown to perform very well over large data sets, especially with short input queries (Kraft R. 2004, Carmel D. 2002).

However, previous query expansion methods have been limited to extracting expansion terms from a subset of documents and have not exploited information about user interactions. Search engines accumulate large amounts of click-through data from their users, from which we can know which queries have been used to retrieve which documents. These query logs provide valuable information for extracting relationships between queries and documents, which can be used in query expansion.

Another problem of current query expansion is that most proposed methods are applied uniformly to all queries. In fact, we think that queries should not all be handled in the same manner, because we find that some queries do not need expansion. This has also been found in (Dou Z. C. 2007).
For example, for the query Google, almost all users consistently select the result that redirects to Google's homepage, and therefore none of the expansion strategies could provide significant benefits to users.

In this paper, we suggest a new query expansion method based on the analysis of user logs. By considering whether queries should be expanded and by mining correlations among user queries from user logs, our query expansion method can achieve significant improvements in retrieval effectiveness compared to current query expansion techniques.

The remainder of this paper is structured as follows. Section 2 discusses previous work on query expansion. Section 3 introduces the whole procedure of our query expansion method step by step. Section 4 presents empirical evidence of the effectiveness of our method and investigates the experimental results in more detail. Finally, Section 5 summarizes our findings.

2. Query Expansion Based on Relevance Feedback

There have been many prior attempts at query expansion. In this paper, we focus on the related work that performs query expansion based on relevance feedback information (Rocchio J. 1971, Salton G. 1990). In this approach, the results returned for the initial query are marked as relevant or irrelevant according to the user's information need, and expansion terms are extracted from the relevant documents. The first approaches were explicit (Rocchio J. 1971, Okabe M. 2005), in the sense that the user was the one choosing the relevant results; various methods were then applied to extract new terms related to the query and the selected documents. Unfortunately, in a real search context, users are usually reluctant to make the extra effort to provide such relevance feedback information (Kelly D. 2003).

To overcome the difficulty caused by the lack of sufficient relevance judgments, an automatic feedback technique called pseudo-relevance feedback (also known as blind feedback) is commonly used. This method makes the conjecture that, in the absence of any other relevance judgment, the top few documents retrieved on the basis of an initial query are relevant (Attar L. 1977, Croft W.B. 1979). Expansion terms are extracted from the top-ranked documents to formulate a new query for a second retrieval cycle (Lam-Adesina M. 2001, Carpineto C. 2001). However, pseudo-relevance feedback is highly dependent on the quality of the documents retrieved in the initial retrieval. In cases where the top-ranked documents have little relevance to the query, this method will not work well, and it may even introduce irrelevant terms into the query and degrade the performance.

Another group of relevance feedback techniques is implicit feedback, in which an IR system makes inferences about relevance from searcher interaction, removing the need for users to explicitly indicate which documents are relevant (Kelly D. 2003, Morita M. 1994). Several previous studies have shown that implicit information may be helpful for inferring the user's information need and can improve retrieval accuracy through query expansion. Some query expansion methods based on implicit feedback have been proposed in (Cui H. 2003, Lv Y. H. 2006); the implicit information they use is click-through data collected over a long time period in query logs. These query logs provide valuable indications of the kinds of documents that users intend to retrieve by formulating a query with a particular set of terms, and expansion terms can be selected from the sets or the results of past queries.

One important assumption behind these methods is that the clicked documents are relevant to the query. This presumption is not always right. However, although click information is not as accurate as explicit relevance judgment, the user's choice does suggest a certain degree of relevance. In fact, users usually do not make their choices randomly.
Even if some of the document clicks are erroneous, we can expect that most users do click on documents that are relevant. Some previous work on using query logs also strongly supports this assumption (Bar-Yossef 2008, Wen 2002, Billerbeck 2003 and Zhang 2006). Therefore, query logs can be taken as a very valuable resource containing abundant relevance feedback data.

In this paper, we present a new query expansion method based on query logs mining. At the same time, in order to avoid the problem of query drift, we utilize the clicked results of the present search process as another type of implicit feedback information to deduce the user's information need. Our work differs from the existing work in two important aspects.

First, we introduce a method to evaluate the quality of user queries, which can be measured by calculating the Kullback-Leibler Distance (Cover T. 1991) among documents in query logs. Query expansion can strongly improve the performance of short queries and ambiguous queries. But this technique cannot achieve the same goal on an accurate query; some newly added terms will introduce the problem of query drift and degrade the performance. So, we believe that queries should not all be dealt with in the same way, and that a measurement of query quality is essential to judge whether a query needs to be expanded, which has not been studied before.

Second, we propose a new query expansion method based on query logs, in which relevant expansion terms are selected from past queries by analyzing the relation between queries and documents under the language modeling framework. Compared to the existing work, the difference is that we extract the terms from past queries rather than from the relevant documents. The experiments show that our method gets better performance in some aspects.

3. Query Expansion Method Based on Logs Mining

The query expansion method based on logs mining presented in this paper is composed of two parts: measurement of query quality with ambiguity analysis, and term expansion with query log mining. In this section, the details of these two parts are described.

3.1 Measurement of Query Quality with Ambiguity Analysis

A good query should be general enough to cover all relevant documents and specific enough to select only relevant ones. But this rule cannot be used to evaluate the quality of user queries, because the relevant documents are unknown in advance. In fact, many queries need to be expanded because of their ambiguity, such as the query 苹果 mentioned above. In this study, we propose a new method to measure the quality of a query based on the calculation of its ambiguity degree, with query logs adopted as the data resource.

In query logs, the original form of each click-through record is:

record := <session_id> <query_text> <rank> <order> <page_url>

Here session_id is a unique value assigned by the search engine to identify a query task, rank is the position of the document among all returned results, and order is the position of the click among the clicked documents. Our first task is to extract query sessions from the original log data. A query session is formed by the records with the same session id and can be defined as follows:

session := <query_text> [clicked documents]

Each session contains one query and the set of documents that the user clicked on. Because most queries are repeated, one query_text can correspond to one or more sessions.

The central idea of our method is that, for the same query, the clicked documents in the same session should be related to each other and similar in content, but those in different sessions are not necessarily related. For example, the clicked documents of the query mouse may be about rodents or about computer devices; these two types of documents are unrelated because the query is ambiguous. So the content differences between the clicked documents of different query sessions can be used to measure the ambiguity degree of a query.
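The session extraction step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `ClickRecord` type, the `build_sessions` helper, and the example URLs are hypothetical names introduced here, and the records are assumed to be already parsed into the five fields of the record format.

```python
from collections import defaultdict
from typing import NamedTuple


class ClickRecord(NamedTuple):
    """One click-through record (hypothetical container for the five log fields)."""
    session_id: str
    query_text: str
    rank: int       # position of the document among all returned results
    order: int      # position of the click among the clicked documents
    page_url: str


def build_sessions(records):
    """Group records by session id and return {query_text: [session, ...]}.

    Each session is the list of clicked URLs, sorted by click order.
    """
    by_session = defaultdict(list)
    for rec in records:
        by_session[rec.session_id].append(rec)

    sessions = defaultdict(list)
    for recs in by_session.values():
        recs.sort(key=lambda r: r.order)
        # All records of one session share the same query text.
        sessions[recs[0].query_text].append([r.page_url for r in recs])
    return dict(sessions)


# Made-up example data: the query "mouse" appears in two sessions.
records = [
    ClickRecord("s1", "mouse", 3, 2, "rodents.example/b"),
    ClickRecord("s1", "mouse", 1, 1, "rodents.example/a"),
    ClickRecord("s2", "mouse", 2, 1, "hardware.example/x"),
    ClickRecord("s3", "google", 1, 1, "google.example/home"),
]
sessions = build_sessions(records)
```

In this toy log the two sessions of "mouse" point at different kinds of pages, which is the situation the ambiguity measure is meant to detect.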
In our method, we assume that the clicked documents in the same session are related and can be regarded as generated by one language model. The ambiguity degree can then be calculated by evaluating the Kullback-Leibler Distance (KLD) among these language models. KLD is often used in information theory to measure the divergence of two probability distributions, and it can also be used to evaluate the degree of irrelevance between two language models. Given a query q, we can get a collection of sessions from the log data, denoted by S(q) = {s_1, s_2, ..., s_n}, where each session is represented by the sequence of its clicked documents, s_i = {d_1, d_2, ..., d_m}. The inner ambiguity degree of a query, IA(q), is then:

$$IA(q) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{KLD(p(s_i) \,\|\, p(s_j)) + KLD(p(s_j) \,\|\, p(s_i))}{2} \tag{1}$$

That is, the average divergence between the sessions. Here p(s_i) is the probability distribution of the language model that generates the document set s_i, and KLD(p(s_i) || p(s_j)) is defined as:

$$KLD(p(s_i) \,\|\, p(s_j)) = \sum_{t} p(t \mid s_i) \log \frac{p(t \mid s_i)}{p(t \mid s_j)} \tag{2}$$
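Equations (1) and (2) can be sketched in Python as follows, taking the session distributions p(t|s) as given. This is a minimal sketch under stated assumptions: the function names are hypothetical, the natural logarithm is used because the text does not fix the base in Eq. (2), every distribution is assumed to be a dict over the same vocabulary with non-zero probabilities, and the sum in Eq. (1) is read as running over all ordered pairs of distinct sessions.

```python
import math
from itertools import permutations


def kld(p, q):
    """KLD(p || q) = sum_t p(t) * log(p(t) / q(t)).

    Assumes q[t] > 0 wherever p[t] > 0 (true for smoothed models).
    """
    return sum(pt * math.log(pt / q[t]) for t, pt in p.items() if pt > 0)


def inner_ambiguity(session_models):
    """Average symmetrized KLD over all ordered pairs of distinct sessions."""
    n = len(session_models)
    if n < 2:
        return 0.0  # a single session gives no evidence of divergence
    total = sum(
        (kld(p, q) + kld(q, p)) / 2
        for p, q in permutations(session_models, 2)
    )
    return total / (n * (n - 1))


# Toy distributions over a two-word vocabulary (made-up numbers).
p1 = {"rodent": 0.9, "usb": 0.1}
p2 = {"rodent": 0.1, "usb": 0.9}
ia_clear = inner_ambiguity([p1, p1])      # identical sessions
ia_ambiguous = inner_ambiguity([p1, p2])  # divergent sessions
```

Identical session models give an inner ambiguity of zero, and the value grows as the sessions' term distributions diverge.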

In order to compute the score of formula (2), we need to estimate the value of p(t|s), which is the conditional probability of occurrence of word t in s. The estimate for p(t|s) is:

$$p(t \mid s) = \alpha \sum_{i=1}^{m} \lambda_i P(t \mid d_i) + (1 - \alpha) P(t \mid S) \tag{3}$$

Here α is the interpolation weight, determined empirically, that smooths the language models so that non-zero probability can be assigned to terms that do not appear in a given document. P(t|S) is the global background collection model. λ_i is a weighting parameter determined by the rank of d_i among the clicked documents, and P(t|d_i) is the maximum likelihood estimate of the probability of term t under the term distribution of document d_i. The values of λ_i and P(t|d_i) can be calculated by the following formulas:

$$\lambda_i = \frac{0.5}{n} + \frac{[\ldots]}{n(n\,[\ldots])} \tag{4}$$

$$P(t \mid d) = \frac{tf(t, d)}{|d|} \tag{5}$$

Here tf(t, d) is the raw term frequency of term t in document d, and |d| is the total number of terms in the document.

We also define an outer ambiguity, which comes from the idea of (Cronen-Townsend S., 2002). They use the concept of a clarity score to quantify a query's ambiguity, which is the relative entropy between a query language model and the corresponding collection language model. The outer ambiguity of the query can be defined as the reciprocal of the clarity score:

$$OA(q) = \frac{1}{\sum_{t \in V} P(t \mid q) \log_2 \frac{P(t \mid q)}{P_{coll}(t)}} \tag{6}$$

According to the above formulas, we can compute the ambiguity degree for a given query:

$$A(q) = \beta \, OA(q) + (1 - \beta) \, IA(q) \tag{7}$$

where β is an adjustable parameter. The inner ambiguity degree represents the difference between the related documents of the query. Intuitively, if a query is clear, the clicked documents in its sessions will be focused on the same topic, and the term distributions of these documents should be approximately similar. The outer ambiguity degree represents the difference between the related documents and the global document collection. Therefore, the ambiguity degree of a clear query is smaller than that of an ambiguous one.
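The smoothed session model of Eqs. (3) and (5) and the combination of Eqs. (6) and (7) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, `alpha=0.8` is an arbitrary placeholder for the empirically determined weight, and uniform weights are used as a stand-in for the rank-based weights λ_i of Eq. (4). The background model is assumed to cover every term of the clicked documents.

```python
import math
from collections import Counter


def mle(doc_tokens):
    """Eq. (5): P(t|d) = tf(t, d) / |d|."""
    counts = Counter(doc_tokens)
    return {t: c / len(doc_tokens) for t, c in counts.items()}


def session_model(docs, background, alpha=0.8, lambdas=None):
    """Eq. (3): p(t|s) = alpha * sum_i lambda_i * P(t|d_i) + (1 - alpha) * P(t|S)."""
    if lambdas is None:
        # Assumption: uniform weights instead of the rank-based weights of Eq. (4).
        lambdas = [1.0 / len(docs)] * len(docs)
    doc_models = [mle(d) for d in docs]
    model = {}
    for t, p_bg in background.items():
        mixed = sum(lam * m.get(t, 0.0) for lam, m in zip(lambdas, doc_models))
        model[t] = alpha * mixed + (1 - alpha) * p_bg
    return model


def outer_ambiguity(query_model, collection_model):
    """Eq. (6): reciprocal of the clarity score (relative entropy in bits).

    Undefined when the query model equals the collection model (clarity 0).
    """
    clarity = sum(
        p * math.log2(p / collection_model[t])
        for t, p in query_model.items() if p > 0
    )
    return 1.0 / clarity


def ambiguity(oa, ia, beta=0.4):
    """Eq. (7): A(q) = beta * OA(q) + (1 - beta) * IA(q)."""
    return beta * oa + (1 - beta) * ia


# Toy data: a four-word vocabulary with a uniform background model.
background = {"cat": 0.25, "mouse": 0.25, "usb": 0.25, "cheese": 0.25}
model = session_model([["mouse", "cheese"], ["mouse", "cat"]], background)
oa = outer_ambiguity({"mouse": 1.0}, background)
```

Because of the background term, words that never occur in the session's documents (here "usb") still receive non-zero probability, which keeps the KLD of Eq. (2) finite.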
In our test, we set β = 0.4, because we think the inner ambiguity degree is more important for the calculation. We normalize the value of the ambiguity degree to the range from 0 to 1, define a maximum length of query expansion, named θ, and use A(q)·θ to set the number of query expansion terms. The idea is that if a query is more ambiguous, more terms should be added for expansion, and if a query is clearer, fewer terms should be added in order to avoid importing irrelevant words.

3.2 Query Expansion with Logs Mining

There are two steps in our approach to expanding an ambiguous query. The first step is to get the candidate terms from the associated queries, and the second step is to determine which candidate words should be added to the new query. In this section, the details of these two steps are described.

In the first step, we use the information about clicked documents to create the correlations between queries. Generally, we assume that a query is relevant to the documents that the user clicked on, and each record of log data suggests such a relationship. If two queries are related to the same clicked documents, we believe these two queries are associated with each other in some way, and the terms in the associated queries can be used as the candidate terms for query expansion. Here, we use the conditional probability P(q_j | q_i) to calculate the correlation between q_i and q_j:

$$P(q_j \mid q_i) = \frac{P(q_i, q_j)}{P(q_i)} = \frac{\sum_{d_k \in D} P(q_i, q_j, d_k)}{P(q_i)} = \frac{\sum_{d_k \in D} P(q_j \mid q_i, d_k) \, P(q_i, d_k)}{P(q_i)} \tag{8}$$

Here we suppose that P(q_j | q_i, d_k) = P(q_j | d_k): because the relation between the queries is created by the document, d_k separates q_j from q_i. We then get the following formula:

$$P(q_j \mid q_i) = \frac{\sum_{d_k \in D} P(q_j \mid d_k) \, P(d_k \mid q_i) \, P(q_i)}{P(q_i)} = \sum_{d_k \in D} P(q_j \mid d_k) \, P(d_k \mid q_i) \tag{9}$$

In formula (9), P(d_k | q_i) is the conditional probability that the clicked document is d_k when the query is q_i, and P(q_j | d_k) is the conditional probability that the query is q_j when the clicked document is d_k. The two conditional probabilities can be estimated as follows:

$$P(d_k \mid q_i) = \frac{f(q_i, d_k)}{f(q_i)} \tag{10}$$

$$P(q_j \mid d_k) = \frac{f(q_j, d_k)}{f(d_k)} \tag{11}$$

Here f(q_i, d_k) is the co-occurrence frequency of query q_i and document d_k in the log data, and f(q_i) is the frequency of q_i in the log data. Likewise, f(q_j, d_k) is the co-occurrence frequency of query q_j and document d_k, and f(d_k) is the frequency of d_k in the log data. By calculating these frequencies, we can get the collection of queries related to q_i, and the terms in these queries can be used for query expansion.
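The correlation of Eqs. (9) to (11) can be sketched directly from click counts. This is a minimal sketch under assumptions: the `query_correlations` helper is hypothetical, the log is assumed to be reduced to (query, document) pairs with one pair per click record, and f(q) and f(d) are taken to be the number of click records containing that query or document.

```python
from collections import Counter, defaultdict


def query_correlations(clicks, q_i):
    """Return {q_j: P(q_j | q_i)} for every other query sharing a clicked document."""
    clicks = list(clicks)
    f_qd = Counter(clicks)                 # f(q, d): co-occurrence frequency
    f_q = Counter(q for q, _ in clicks)    # f(q): frequency of the query
    f_d = Counter(d for _, d in clicks)    # f(d): frequency of the document

    queries_by_doc = defaultdict(set)
    for q, d in f_qd:
        queries_by_doc[d].add(q)

    scores = defaultdict(float)
    for (q, d), count in f_qd.items():
        if q != q_i:
            continue
        p_d_given_qi = count / f_q[q_i]                # Eq. (10)
        for q_j in queries_by_doc[d]:
            if q_j == q_i:
                continue
            p_qj_given_d = f_qd[(q_j, d)] / f_d[d]     # Eq. (11)
            scores[q_j] += p_qj_given_d * p_d_given_qi  # one term of Eq. (9)
    return dict(scores)


# Made-up click log: "apple" shares d1 with "iphone" and d2 with "apple pie".
clicks = [
    ("apple", "d1"), ("apple", "d1"), ("apple", "d2"),
    ("iphone", "d1"),
    ("apple pie", "d2"), ("apple pie", "d3"),
]
related = query_correlations(clicks, "apple")
```

In this toy log, P(iphone | apple) = (1/3)(2/3) = 2/9 and P(apple pie | apple) = (1/2)(1/3) = 1/6, so "iphone" is the more strongly associated query.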
The weights of the terms can be calculated by the following formula:

$$P(t \mid q_i) = \sum_{q_j \ \text{s.t.}\ t \in q_j} P(q_j \mid q_i) \tag{12}$$
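The term weighting of Eq. (12), together with the A(q)·θ cut-off on the number of expansion terms, can be sketched as follows. Several details are assumptions made for illustration only: queries are split on whitespace (Chinese queries would need a word segmenter), terms already in the original query are skipped, A(q)·θ is rounded to an integer, and ties are broken alphabetically. The `expansion_terms` name and the example numbers are hypothetical.

```python
from collections import defaultdict


def expansion_terms(related, ambiguity, theta=40, original_terms=()):
    """Rank candidate terms by Eq. (12) and keep the top A(q) * theta of them.

    related:   {q_j: P(q_j | q_i)} for the queries associated with q_i
    ambiguity: A(q_i), assumed to be normalized to [0, 1]
    """
    weights = defaultdict(float)
    for q_j, p in related.items():
        for term in set(q_j.split()):      # assumption: whitespace tokenization
            if term not in original_terms:
                weights[term] += p         # sum over queries containing the term
    k = round(ambiguity * theta)           # assumption: round to an integer count
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Made-up correlations for the query "apple".
related = {"iphone": 2 / 9, "apple pie": 1 / 6, "apple iphone price": 0.1}
top = expansion_terms(related, ambiguity=0.05, theta=40, original_terms={"apple"})
```

With A(q) = 0.05 and θ = 40 only two terms are kept; a more ambiguous query would receive a proportionally longer expansion.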

In the second step, we sort the candidate terms by their weights and set the number of terms to A(q)·θ. We set θ = 40 based on experience. The top A(q)·θ terms are used for query expansion.

4. Evaluations and Analysis

4.1 Experimental Data and Methodology

Due to the characteristics of our query expansion method, we cannot conduct experiments on standard test collections such as the TREC data, since they do not contain the user logs that we need. We test our method on a dataset collected from the query logs of the Sogou (搜狗) search engine. It covers one month of log data, and about 80% of the queries in it contain Chinese words. Approximately 24 million query records and 3 million distinct queries are identified. We select two hundred test queries randomly according to the overall frequency distribution and extract about one million query sessions from the log data. As for the document set, we collect about ten thousand pages from the Internet according to the records in the query logs to form the test corpus. In this data set, each document has been retrieved and viewed by users with a certain query, so we can get sufficient click-through information to expand a query with our method.

In order to demonstrate the effectiveness of our method, three experiments were carried out. The first investigates the correlation between query lengths and ambiguity degrees. In the second, we extract ten queries from the query set and illustrate the performance of query expansion on them. Finally, the experimental results of our query expansion method are compared with other systems.

4.2 Results

Fig 1. Distribution of query lengths (x-axis: query length; y-axis: proportion of queries)

Figure 1 illustrates the distribution of query lengths according to the number of words. In our experiment, we notice that 35% of the queries contain only one keyword and 32% of the queries contain two keywords. The average length of all queries is […]. The result shows that most people like to use short queries to retrieve information. We do not select queries containing more than 5 words, because these queries are seldom used and we cannot get enough log data for the calculation.

Figure 2 illustrates the relation between the query lengths and the statistics of their ambiguity degrees. Let the ambiguity of query q_i be a_i = A(q_i); then the average ā and the variance σ are defined as:

$$\bar{a} = \frac{1}{n} \sum_{i=1}^{n} a_i \tag{13}$$

$$\sigma = \frac{1}{n} \sum_{i=1}^{n} (a_i - \bar{a})^2 \tag{14}$$

We observe that the average values for short queries are higher than those for long queries. This verifies that short queries are more ambiguous than long queries and that query expansion should be applied to short queries more than to long queries. The results also support the effectiveness of our method for measuring the ambiguity of queries. But it should be emphasized that not all short queries are bad queries. The variance analysis shows that query length alone is not a sufficient criterion for measuring the quality of queries. The variance describes the deviation of the data from its mean. We observe that the variance is larger when the query length is 2 or 3, which means the ambiguity values of queries in these two groups fluctuate more widely around their mean value.

Fig 2. Average and variance analysis of ambiguity degree (x-axis: query length; series: average, variance)

In the second experiment, we extract ten queries from the query set, which are shown in Table 1. Each query is given in both a short and a long version in order to see how query expansion affects retrieval results on short queries and long queries. In our experiment, the long queries are queries whose length is 4 or 5, and the short queries contain only one word. After pre-processing the documents, including phrasing and removing stop words and useless characters, we get a thesaurus that contains about sixty thousand words. The results are the precision-recall performance of these queries, which is evaluated manually.

ID   Short Queries           Long Queries
1    苹果 (apple)            苹果褐斑病防治 (apple brown spot disease control)
2    成都 (Chengdu)          成都旅游景点 (Chengdu tourist attractions)
3    足球 (football)         足球过人技术视频 (football dribbling technique videos)
4    网易 (NetEase)          网易邮箱申请 (NetEase mailbox registration)
5    比尔盖茨 (Bill Gates)   比尔盖茨慈善基金 (Bill Gates charitable foundation)
6    经济 (economy)          国际经济形势 (international economic situation)
7    DNA                     DNA 提取侦破技术 (DNA extraction techniques for criminal investigation)
8    汽车 (car)              汽车保险计算方法 (car insurance calculation method)
9    华为 (Huawei)           华为招聘信息 (Huawei recruitment information)
10   手机 (mobile phone)     手机生产厂家 (mobile phone manufacturers)

Table 1. List of Queries in Both the Long Query Set and the Short Query Set

The retrieval results are shown in Table 2. According to the calculation of the ambiguity degree, we believe the queries in the Short Queries set are more ambiguous than the queries in the Long Queries set, so the average precision of the Short Queries set should be lower than that of the Long Queries set. Like the retrieval process, query expansion is also affected by the ambiguity of the original queries. Compared with an accurate query, the query expansion method can achieve a larger improvement on an ambiguous one. The results confirm this expectation. Without query expansion, the average precision on the Short Queries set is 22.63%, which is lower than the 28.80% of the Long Queries set. The improvement gained with query expansion on the Short Queries set is observably higher than that obtained on the Long Queries set, and the results show that applying query expansion to the Short Queries set is more valuable.

Row       Short Queries With QE   Long Queries With QE
1         (+54.07)                (23.01)
2         (+62.39)                (35.19)
3         (+74.79)                (33.92)
4         (+76.37)                (35.97)
5         (+84.07)                (+39.35)
6         (+94.73)                (+36.83)
7         (+90.65)                (+41.88)
8         (+85.28)                (+39.47)
9         (+88.75)                (+41.99)
10        (+85.44)                (+38.51)
Average   (+74.95)                (+34.69)

Table 2. Comparison with and without QE on both Long Query Set and Short Query Set. The original columns are Recall, Short Queries Without QE, Short Queries With QE, Long Queries Without QE, and Long Queries With QE; only the parenthesized values of the two With QE columns are preserved in this copy.

The results in Table 2 also show that query expansion cannot achieve the same performance on accurate queries as on ambiguous ones; some newly added terms introduce the problem of query drift and degrade the performance.

In order to evaluate our query expansion method, we compare its performance not only with that of the original queries, but also with that of local context analysis (LCA), which extracts the expansion terms from the related documents. The results are shown in Fig 3.

Fig 3. Comparison of query expansion (precision versus recall; series: Baseline, QE on LCA, QE on log)

For local context analysis, we use 30 expansion terms from the 100 top-ranked documents for query expansion. The smoothing factor δ in local context analysis is set to 0.1. The experiments show that query expansion techniques can greatly improve precision and recall in information retrieval, especially for document collections with a wide range of content. The results also show that query expansion based on query logs performs better than the other systems. The reason for the poorer performance of QE on LCA is that the initial search results are unsatisfactory. This affects the performance of the expansion algorithm, causing irrelevant terms to be added to the original query, which therefore fails to achieve better results. In our method, the expansion algorithm is based on mining large-scale query logs: relevant expansion terms are selected from past queries by analyzing the relation between queries and documents under the language modeling framework. Our method can effectively reduce the expansion of irrelevant terms and lessen the bad impact of unsatisfactory initial search results.

5. Conclusions

In this article, we presented a new method for query expansion based on query logs mining. This method first calculates the ambiguity degree of the query by exploiting the user logs.
The result can be used to measure the quality of the query and to decide the expansion length of the query. In the next step, we use the information about clicked documents to create the correlations between queries, and high-quality expansion terms are selected from past queries by analyzing the relation between queries and documents. This is an effective way to avoid the problem of query drift by reducing irrelevant expansion terms.

We tested our method on a data set extracted from a real Web environment. A series of experiments conducted on the data set showed that the query expansion method based on query logs mining can achieve substantial improvements in performance. It also outperforms local context analysis, which is one of the most effective earlier query expansion methods. Our experiments also show that query expansion is more effective for ambiguous queries than for clear queries. This supports the view that queries should not all be dealt with in the same way and that a measurement of query quality is essential to judge whether a query needs to be expanded, because some expansion terms can degrade the performance of high-quality queries.

6. References

Wen, J. R., Nie, J. Y., and Zhang, H. J., 2001, Clustering user queries of a search engine. Proceedings of the 10th International World Wide Web Conference.

Kraft, R., and Zien, J., 2004, Mining anchor text for query refinement. Proceedings of the 13th International Conference on World Wide Web.

Carmel, D., Farchi, E., Petruschka, Y., and Soffer, A., 2002, Automatic query refinement using lexical affinities with maximal information gain. Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval.

Dou, Z. C., Song, R. H., and Wen, J. R., 2007, A large-scale evaluation and analysis of personalized search strategies. Proceedings of the 16th International World Wide Web Conference.

Rocchio, J., 1971, Relevance feedback in information retrieval. In The SMART Retrieval System: Experiments in Automatic Document Processing.

Salton, G., and Buckley, C., 1990, Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4).

Okabe, M., Umemura, K., and Yamada, S., 2005, Query expansion with the minimum user feedback by transductive learning. Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing.

Kelly, D., and Teevan, J., 2003, Implicit feedback for inferring user preference: A bibliography. ACM SIGIR Forum, 37(2).

Attar, L., and Fraenkel, A.S., 1977, Local feedback in full-text retrieval systems. Journal of the Association for Computing Machinery, 24(3).

Croft, W.B., and Harper, D.J., 1979, Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35(4).

Lam-Adesina, M., and Jones, G. J. F., 2001, Applying summarization techniques for term selection in relevance feedback. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.

Carpineto, C., De Mori, R., Romano, G., and Bigi, B., 2001, An information-theoretic approach to automatic query expansion. ACM Transactions on Information Systems, 19(1).

Morita, M., and Shinoda, Y., 1994, Information filtering based on user behavior analysis and best match text retrieval. Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.

Cui, H., Wen, J. R., Nie, J. Y., and Ma, W. Y., 2003, Query expansion by mining user logs. IEEE Transactions on Knowledge and Data Engineering, 15(4).

Lv, Y. H., Sun, L., Zhang, J. L., Nie, J. Y., Chen, W., and Zhang, W., 2006, An iterative implicit feedback approach to personalized search. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL.

Bar-Yossef, Z., and Gurevich, M., 2008, Mining search engine query logs via suggestion sampling. Proceedings of the 34th International Conference on Very Large Data Bases.

Wen, J. R., Nie, J. Y., and Zhang, H. J., 2002, Query clustering using user logs. ACM Transactions on Information Systems, 20(1).

Billerbeck, B., Scholer, F., Williams, H. E., and Zobel, J., 2003, Query expansion using associated queries. Proceedings of the 12th International Conference on Information and Knowledge Management.

Zhang, Z., and Nasraoui, O., 2006, Mining search engine query logs for query recommendation. Proceedings of the 15th International World Wide Web Conference.

Cover, T., and Thomas, J., 1991, Elements of Information Theory. New York: John Wiley and Sons.

Cronen-Townsend, S., Zhou, Y., and Croft, W. B., 2002, Quantifying query ambiguity. Proceedings of Human Language Technology, pp. 94-98.


More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

An Iterative Implicit Feedback Approach to Personalized Search

An Iterative Implicit Feedback Approach to Personalized Search An Iteratve Implct Feedback Approach to Personalzed Search Yuanhua Lv 1, Le Sun 2, Junln Zhang 2, Jan-Yun Ne 3, Wan Chen 4, and We Zhang 2 1, 2 Insttute of Software, Chnese Academy of Scences, Beng, 100080,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information
