arXiv: v1 [cs.IR] 23 Nov 2017


A Deep Relevance Matching Model for Ad-hoc Retrieval

Jiafeng Guo, Yixing Fan, Qingyao Ai, W. Bruce Croft

CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Center for Intelligent Information Retrieval, University of Massachusetts Amherst, MA, USA

ABSTRACT

In recent years, deep neural networks have led to exciting breakthroughs in speech recognition, computer vision, and natural language processing (NLP) tasks. However, there have been few positive results of deep models on ad-hoc retrieval tasks. This is partially due to the fact that many important characteristics of the ad-hoc retrieval task have not been well addressed in deep models yet. Typically, the ad-hoc retrieval task is formalized as a matching problem between two pieces of text in existing work using deep models, and treated as equivalent to many NLP tasks such as paraphrase identification, question answering and automatic conversation. However, we argue that the ad-hoc retrieval task is mainly about relevance matching while most NLP matching tasks concern semantic matching, and there are some fundamental differences between these two matching tasks. Successful relevance matching requires proper handling of the exact matching signals, query term importance, and diverse matching requirements. In this paper, we propose a novel deep relevance matching model (DRMM) for ad-hoc retrieval. Specifically, our model employs a joint deep architecture at the query term level for relevance matching. By using matching histogram mapping, a feed forward matching network, and a term gating network, we can effectively deal with the three relevance matching factors mentioned above. Experimental results on two representative benchmark collections show that our model can significantly outperform some well-known retrieval models as well as state-of-the-art deep matching models.

Keywords

Relevance Matching, Semantic Matching, Neural Models, Ad-hoc Retrieval, Ranking Models

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. CIKM '16, October 24-28, 2016, Indianapolis, IN, USA. © 2016 ACM. ISBN /16/10... $15.00

1. INTRODUCTION

Machine learning methods have been successfully applied to information retrieval (IR) in recent years. Typically, a ranking function which produces a relevance score given a query and document pair is learned based on a set of human defined features. However, handcrafting features can be time-consuming, incomplete and over-specified. On the other hand, deep neural networks, as a representation learning method, are able to discover from the training data the hidden structures and features at different levels of abstraction that are useful for the tasks. Recently, deep models have been applied to a variety of applications in computer vision [16], speech recognition [10] and NLP [25, 17], and have yielded significant performance improvements. Given the success of deep learning in these domains, it seems that deep learning should have a major impact on IR. However, there have been few positive results of deep models on IR tasks, especially ad-hoc retrieval tasks, until now.

Without loss of generality, when applying deep models to ad-hoc retrieval, the task is typically formalized as a matching problem between two pieces of text (i.e., the query and document). Such a matching problem formalization is often considered general in the sense that it can cover both ad-hoc retrieval tasks as well as many NLP tasks such as paraphrase identification, question answering (QA), and automatic conversation [17, 11].
A variety of deep matching models have been proposed to solve this matching problem, which can be categorized into two types according to their model architecture. One is the representation-focused model, which tries to build a good representation for a single text with a deep neural network, and then conducts matching between the compositional and abstract text representations. Examples include DSSM [12], C-DSSM [23, 8] and ARC-I [11]. The other is the interaction-focused model, which first builds local interactions (i.e., local matching signals) between two pieces of text, and then uses deep neural networks to learn hierarchical interaction patterns for matching. Examples include DeepMatch [17], ARC-II [11] and MatchPyramid [19].

However, in this work, we argue that the matching problems in many NLP tasks and the ad-hoc retrieval task are fundamentally different. Most NLP tasks concern semantic matching, i.e., identifying the semantic meaning and inferring the semantic relations between two pieces of text, while the ad-hoc retrieval task is mainly about relevance matching, i.e., identifying whether a document is relevant to a given query. We point out three major differences between these two matching problems which may lead to significantly different architecture design for the deep matching models. We also show that most existing deep matching models are designed for semantic matching rather than relevance matching.

Based on these differences, we propose a deep relevance matching model (DRMM) for ad-hoc retrieval by explicitly modeling the three major factors in relevance matching. Overall, our model is an interaction-focused model which employs a joint deep architecture at the query term¹ level for relevance matching. Specifically, we first build local interactions between each pair of terms from a query and a document based on term embeddings. For each query term, we map the variable-length local interactions into a fixed-length matching histogram. Based on this fixed-length matching histogram, we then employ a feed forward matching network to learn hierarchical matching patterns and produce a matching score. Finally, the overall matching score is generated by aggregating the scores from each query term with a term gating network computing the aggregation weights. We show how our major model designs, including matching histogram mapping, a feed forward matching network, and a term gating network, address the three key factors in relevance matching for ad-hoc retrieval.

Figure 1: Two types of deep matching models: (a) Representation-focused models employ a Siamese (symmetric) architecture over the text inputs; (b) Interaction-focused models employ a hierarchical deep architecture over the local interaction matrix.

We evaluate the effectiveness of the proposed DRMM based on two representative ad-hoc retrieval benchmark collections. For comparison, we take into account some well-known traditional retrieval models, as well as several state-of-the-art deep matching models either designed for the general matching problem or proposed specifically for the ad-hoc retrieval task. The empirical results show that the existing deep matching models cannot compete with the traditional retrieval models on these benchmark collections, while our model can outperform all the baseline models significantly in terms of all the evaluation metrics.

The major contributions of this paper include:

1. We point out three major differences between semantic matching and relevance matching, which may lead to significantly different architecture design of the deep matching models.
2. We propose a novel deep relevance matching model for ad-hoc retrieval by explicitly addressing the three key factors of relevance matching.
3. We conduct rigorous comparisons over state-of-the-art retrieval models on benchmark collections and analyze the deficiencies of existing deep matching models and advantages of the DRMM.

¹ Here we use "term" to denote the indexed units in search systems, which could be stemmed words or phrases.

2. AD-HOC RETRIEVAL AS A MATCHING PROBLEM

According to existing literature [12, 17], the core problem in ad-hoc retrieval, i.e., the computation of the relevance for a document given a particular query, can be formalized as a text matching problem as follows. Given two texts $T_1$ and $T_2$, the degree of matching is typically measured as a score produced by a scoring function based on the representation of each text:

$$\mathrm{match}(T_1, T_2) = F\big(\Phi(T_1), \Phi(T_2)\big),$$

where $\Phi$ is a function to map each text to a representation vector, and $F$ is the scoring function based on the interactions between them. Such a text matching problem is considered general since it also describes many NLP tasks, such as paraphrase identification, question answering, and automatic conversation [17, 11].

A variety of deep matching models have been proposed either for the specific ad-hoc retrieval task or for the general matching problem. Depending on how you choose the two functions, existing deep matching models can be categorized into two types. The first one, the representation-focused model, tries to build a good representation for a single text with a deep neural network, and then conducts matching between two compositional and abstract text representations. In this approach, $\Phi$ is a complex representation mapping function while $F$ is a relatively simple matching function. For example, in DSSM [12], $\Phi$ is a feed forward neural network, while $F$ is the cosine similarity function. In C-DSSM [23, 8], $\Phi$ is a convolutional neural network (CNN) [16], while $F$ is the cosine similarity function. In ARC-I [11], $\Phi$ is a CNN, while $F$ is a multi-layer perceptron (MLP).
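
As an illustration of the formalization match(T1, T2) = F(Φ(T1), Φ(T2)), the following Python sketch instantiates a toy representation-focused matcher. It is not DSSM, C-DSSM or ARC-I: here Φ simply averages hand-written two-dimensional word vectors and F is cosine similarity. The names `TOY_VECTORS`, `phi`, `cosine` and `match` are hypothetical and exist only for this example.

```python
import math

# Hand-written toy vectors (an assumption for illustration, not learned embeddings).
TOY_VECTORS = {
    "car": [1.0, 0.0],
    "truck": [0.8, 0.6],
    "rent": [0.0, 1.0],
}

def phi(text):
    """Phi: map a text to a single vector by averaging its known word vectors."""
    vecs = [TOY_VECTORS[w] for w in text.split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(column) / len(vecs) for column in zip(*vecs)]

def cosine(u, v):
    """F: cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def match(t1, t2):
    """match(T1, T2) = F(Phi(T1), Phi(T2)) with a symmetric (Siamese) Phi."""
    return cosine(phi(t1), phi(t2))

print(match("car", "truck"))
print(match("car", "rent"))
```

Because each text is compressed into one vector before any comparison takes place, F in this kind of architecture never sees individual term-to-term matches.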
Without loss of generality, all the model architectures of representation-focused models can be viewed as a Siamese (symmetric) architecture over the text inputs, as shown in Figure 1(a).

The second one, the interaction-focused model, first builds the local interactions between two texts based on some basic representations, and then uses deep neural networks to learn the hierarchical interaction patterns for matching. In this approach, $\Phi$ is usually a simple mapping function while $F$ is a complex deep model. For example, in DeepMatch [17], $\Phi$ simply maps each text to a sequence of words, while $F$ is a feed forward neural network powered by a topic model over the word interaction matrix. In ARC-II [11] and MatchPyramid [19], $\Phi$ maps each text to a sequence of word vectors, while $F$ is a CNN over the interaction matrix between word vectors from the two texts. Without loss of generality, all the model architectures of interaction-focused models can be viewed as a hierarchical deep architecture over the local interaction matrix, as shown in Figure 1(b).

Although various deep matching models have been proposed under such a general matching problem formalization, most of them have only been demonstrated to be effective on a set of NLP tasks such as paraphrase identification and QA [11, 26]. There have been few positive results on the ad-hoc retrieval task. Even the deep models specially designed for Web search, e.g., DSSM and C-DSSM, were only evaluated on <query, doc title> pairs, which are not a typical ad-hoc retrieval setting. If we directly apply these deep matching models on some benchmark retrieval collections, e.g., TREC collections, we find relatively poor performance compared to traditional ranking models, such as the language model [31] and BM25 [22]. All these observations raise some questions such as: Is matching in ad-hoc retrieval really the same as that in NLP tasks? Are the existing deep matching models suitable for the ad-hoc retrieval task?

3. SEMANTIC MATCHING VS. RELEVANCE MATCHING

In this section, we discuss the differences between text matching in ad-hoc retrieval and other NLP tasks. The matching in many NLP tasks, such as paraphrase identification, question answering and automatic conversation, is mainly concerned with semantic matching, i.e., identifying the semantic meaning and inferring the semantic relations between two pieces of text. In these semantic matching tasks, the two texts are usually homogeneous and consist of a few natural language sentences, such as question/answer sentences, or dialogs. To infer the semantic relations between natural language sentences, semantic matching emphasizes the following three factors:

Similarity matching signals: It is important, or even critical, to capture the semantic similarity/relatedness between words, phrases and sentences, as compared with exact matching signals.
For example, in paraphrase identification, one needs to identify whether two sentences convey the same meaning with different expressions. In automatic conversation, one aims to find a proper response semantically related to the previous dialog, which may not share any common words or phrases between them.

Compositional meanings: Since texts in semantic matching usually consist of natural language sentences with grammatical structures, it is more beneficial to use the compositional meaning of the sentences based on such grammatical structures rather than treating them as a set/sequence of words [25]. For example, in question answering, most questions have clear grammatical structures which can help identify the compositional meaning that reflects what the question is about.

Global matching requirement: Semantic matching usually treats the two pieces of text as a whole to infer the semantic relations between them, leading to a global matching requirement. This is partially related to the fact that most texts in semantic matching have limited lengths and thus the topic scope is concentrated. For example, two sentences are considered as paraphrases if the whole meaning is the same, and a good answer fully answers the question.

The matching in ad-hoc retrieval, on the contrary, is mainly about relevance matching, i.e., identifying whether a document is relevant to a given query. In this task, the query is typically short and keyword based, while the document can vary considerably in length, from tens of words to thousands or even tens of thousands of words. To estimate the relevance between a query and a document, relevance matching is focused on the following three factors:

Exact matching signals: Although term mismatch is a critical problem in ad-hoc retrieval and has been tackled using different semantic similarity signals, the exact matching of terms in documents with those in queries is still the most important signal in ad-hoc retrieval due to the indexing and search paradigm in modern search engines.
For example, Fang and Zhai [7] proposed the semantic term matching constraint which states that matching an original query term exactly should always contribute no less to the relevance score than matching a semantically related term multiple times. This also explains why some traditional retrieval models, e.g., BM25, can work reasonably well purely based on exact matching signals.

Query term importance: Since queries are mainly short and keyword based without complex grammatical structures in ad-hoc retrieval, it is important to take into account term importance, while the compositional relation among the query terms is usually the simple "and" relation in operational search. For example, given the query "bitcoin news", a relevant document is expected to be about "bitcoin" and "news", where the term "bitcoin" is more important than "news" in the sense that a document describing other aspects of "bitcoin" would be more relevant than a document describing "news" of other things. In the literature, there have been many formal studies on retrieval models showing the importance of term discrimination [5, 6].

Diverse matching requirement: In ad-hoc retrieval, a relevant document can be very long and there have been different hypotheses concerning document length [22] in the literature, leading to a diverse matching requirement. Specifically, the Verbosity Hypothesis assumes that a long document is like a short document, covering a similar scope but with more words. In this case, the relevance matching might be global if we assume short documents have a concentrated topic. On the contrary, the Scope Hypothesis assumes a long document consists of a number of unrelated short documents concatenated together. In this way, the relevance matching could happen in any part of a relevant document, and we do not require the document as a whole to be relevant to a query.

As we can see, there are significant differences between relevance matching in ad-hoc retrieval and semantic matching in many NLP tasks.
These differences affect the design of deep model architectures and it may be difficult to find a one-fit-all solution to such different matching problems. If we revisit the existing deep matching models, we find that most of them concern semantic matching rather than relevance matching. For example, the representation-focused models such as DSSM, C-DSSM and ARC-I focus on the compositional meaning of the texts and fit the global matching requirement. In these models, detailed matching signals and, especially, exact matching signals are lost since they defer the interaction between two texts until their individual representations have been created [11]. Although the interaction-focused models such as DeepMatch, ARC-II and MatchPyramid preserve both exact and similarity matching signals, they do not differentiate these signals but treat them as equally important. These models focus on learning the composition of local interactions without addressing term importance. In particular, the convolutional structures in ARC-II and MatchPyramid are designed to learn positional regularities, which may work well under the global matching requirement but fail under the diverse matching requirement. (There is more discussion on this in Section 4.)

4. DEEP RELEVANCE MATCHING MODEL

Based on the above analysis, we propose a novel deep matching model specifically designed for relevance matching in ad-hoc retrieval by explicitly addressing the three factors described in Section 3. We refer to our model as a deep relevance matching model (DRMM). Overall, our model is similar to interaction-focused models rather than representation-focused models since the latter would inevitably lose the detailed matching signals which are critical for relevance matching in ad-hoc retrieval.

Figure 2: Architecture of the Deep Relevance Matching Model.

Specifically, our model employs a joint deep architecture at the query term level over the local interactions between query and document terms for relevance matching. We first build local interactions between each pair of terms from a query and a document based on term embeddings. For each query term, we then transform the variable-length local interactions into a fixed-length matching histogram. Based on the fixed-length matching histogram, we employ a feed forward matching network to learn hierarchical matching patterns and produce a matching score for each query term. Finally, the overall matching score is generated by aggregating the scores from each single query term with a term gating network computing the aggregation weights. The model architecture is depicted in Figure 2.
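
The pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the histogram is a plain count over five bins with the last bin reserved for exact matches, the layers use tanh, the gate is a softmax over one scalar input per query term, and all parameters in the usage example are set by hand rather than learned.

```python
import math

def cosine(u, v):
    """Local interaction: cosine similarity between two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def histogram(sims, n_bins=5):
    """Count similarities into n_bins - 1 equal-width bins over [-1, 1),
    plus a final bin reserved for exact matches (similarity of 1)."""
    width = 2.0 / (n_bins - 1)
    counts = [0.0] * n_bins
    for s in sims:
        if s >= 1.0 - 1e-9:  # tolerance for floating-point rounding
            counts[-1] += 1.0
        else:
            idx = int((s + 1.0) / width)
            counts[min(max(idx, 0), n_bins - 2)] += 1.0
    return counts

def term_score(hist, layers):
    """Feed forward matching network with tanh layers shared by all query terms.
    `layers` is a list of (W, b) pairs; the last layer must have one output."""
    z = hist
    for W, b in layers:
        z = [math.tanh(sum(w * x for w, x in zip(row, z)) + bias)
             for row, bias in zip(W, b)]
    return z[0]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def drmm_score(query_vecs, doc_vecs, gate_inputs, layers, w_g):
    """Relevance score: gated sum of the per-query-term matching scores."""
    scores = [term_score(histogram([cosine(q, d) for d in doc_vecs]), layers)
              for q in query_vecs]
    gates = softmax([w_g * x for x in gate_inputs])  # one scalar gate input per term
    return sum(g * s for g, s in zip(gates, scores))

# Toy usage with hand-set (untrained) parameters.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.6, 0.8]]
layers = [([[0.0, 0.0, 0.0, 0.0, 1.0]], [0.0])]  # one layer reading only the exact-match bin
print(drmm_score(query, doc, gate_inputs=[1.0, 1.0], layers=layers, w_g=0.0))
```

In this toy setup only the first query term has an exact match in the document, so it alone contributes a non-zero term score, and the uniform gate averages the two term scores.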
More formally, suppose both query and document are represented as a set of term vectors denoted by $q = \{w_1^{(q)}, \dots, w_M^{(q)}\}$ and $d = \{w_1^{(d)}, \dots, w_N^{(d)}\}$, where $w_i^{(q)}, i = 1, \dots, M$ and $w_j^{(d)}, j = 1, \dots, N$ denotes a query term vector and a document term vector, respectively, and $s$ denotes the final relevance score, we have

$$z_i^{(0)} = h\big(w_i^{(q)} \otimes d\big), \quad i = 1, \dots, M$$
$$z_i^{(l)} = \tanh\big(W^{(l)} z_i^{(l-1)} + b^{(l)}\big), \quad i = 1, \dots, M, \; l = 1, \dots, L$$
$$s = \sum_{i=1}^{M} g_i z_i^{(L)}$$

where $\otimes$ denotes the interaction operator between a query term and the document terms, $h$ denotes the mapping function from local interactions to matching histogram, $z_i^{(l)}, l = 0, \dots, L$ denotes the intermediate hidden layers for the $i$-th query term, and $g_i, i = 1, \dots, M$ denotes the aggregation weight produced by the term gating network. $W^{(l)}$ denotes the $l$-th weight matrix and $b^{(l)}$ denotes the $l$-th bias term, which are shared across different query terms. Note that we adopt cosine similarity, a widely used measure for semantic closeness in neural embeddings [18, 20], as the interaction operator between each pair of term vectors from a query and a document.

In our work, we assume the term vectors are learned a priori using existing neural embedding models such as Word2Vec [18]. We do not learn term vectors in our deep relevance matching model for the following reasons: 1) Reliable term representations can be better acquired from large scale unlabeled text collections rather than from the limited ground truth data for ad-hoc retrieval; 2) By using the a priori learned term vectors, we can focus the learning of our model on relevance matching patterns and considerably reduce the model complexity. In the following, we will describe the major components of our model, including the matching histogram mapping, feed forward matching network, and term gating network in detail, and discuss how they address the three key factors of relevance matching in ad-hoc retrieval.

Matching Histogram Mapping: The input of our deep relevance matching model is the local interactions between each pair of terms from a query and a document.
A major problem is that the size of local interactions is not fixed due to the varied lengths of queries and documents. Previous interaction-based models view the local interactions as a matching matrix by preserving the sequential term orders in both queries and documents. Clearly the matching matrix is a position preserving representation, which is useful if the learning task is position related. However, according to the diverse matching requirement, relevance matching is not position related since it could happen in any position in a long document. Thus the matching matrix may not be a suitable representation for ad-hoc retrieval due to the potentially noisy positional signals in it.

In our work, we adopt a strength preserving representation, namely a matching histogram, which groups local interactions according to different levels of signal strengths rather than their positions. Specifically, since the local interaction (i.e., cosine similarity between term vectors) is within the interval $[-1, 1]$, we discretize the interval into a set of ordered bins and accumulate the count of local interactions in each bin. In this work, we consider fixed bin size and treat exact matching as a separate bin. Other discretization schemes could be explored in future work. For example, suppose the bin size is set as 0.5, we will obtain five bins $\{[-1, -0.5), [-0.5, 0), [0, 0.5), [0.5, 1), [1, 1]\}$ in an ascending order. Given a query term "car" and a document (car, rent, truck, bump, injunction, runway), and the corresponding local interactions based on cosine similarity are (1, 0.2, 0.7, 0.3, -0.1, 0.1), we will obtain a matching histogram as [0, 1, 3, 1, 1].

We explore three ways of the matching histogram mapping:

Count-based Histogram (CH): This is the simplest way of transformation as described above which directly takes the count of local interactions in each bin as the histogram value.

Normalized Histogram (NH): We normalize the count value in each bin by the total count to focus on the relative rather than the absolute number of different levels of interactions.

LogCount-based Histogram (LCH): We apply logarithm over the count value in each bin, both to reduce the range, and to allow our model to more easily learn multiplicative relationships [1].

We compare our matching histogram representation with previous matching matrix representations to show the advantages.
Firstly, by setting exact matching as a separate bin, the matching histogram clearly distinguishes the exact matching signals from similarity matching signals, while in a matching matrix all the signals are mixed together. Secondly, to solve the problem of variable size in the matching matrix, a zero-padding scheme is often adopted in previous methods [11]. However, the zero-padding scheme introduces additional interaction signals which may be unfair for short documents. In contrast, we map the variable-size interactions into a fixed-length matching histogram without introducing any additional signals.

Feed forward Matching Network: Based on the matching histogram above, we employ a feed forward matching network to learn the hierarchical matching patterns and produce a matching score for each query term. Since our model follows the approach of interaction-focused models, we discuss the major differences between the learning of our feed forward matching network and that in previous interaction-focused models.

Existing interaction-focused models, e.g., ARC-II and MatchPyramid, employ a CNN to learn hierarchical matching patterns over the matching matrix. These models are basically position-aware using convolutional units with a local receptive field and learning positional regularities in matching patterns. This may be suitable for the image recognition task, and work well on semantic matching problems due to the global matching requirement (i.e., all the positions are important). However, it may not be suitable for the ad-hoc retrieval task, since such positional regularity may not exist in relevance matching due to the diverse matching requirement discussed in Section 3. Besides, since CNN parameters are position related, these models will treat both exact matching and similarity matching signals equally. Our deep relevance matching model, on the contrary, aims to extract hierarchical matching patterns from different levels of interaction signals rather than different positions. The position-free and strength-focused property makes it better at handling the diverse matching requirement in ad-hoc retrieval.
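
The matching histogram mapping with its CH, NH and LCH variants can be made concrete with a short Python sketch, checked against the "car" example from the text. The function name `matching_histogram` and its parameters are hypothetical. One detail is an assumption rather than something stated in the text: the LCH variant is computed as log(1 + count) so that empty bins stay finite.

```python
import math

def matching_histogram(sims, bin_size=0.5, mode="CH"):
    """Map a variable-length list of cosine similarities to a fixed-length histogram.

    Bins of width `bin_size` cover [-1, 1); one extra bin holds exact matches.
    mode: "CH" (counts), "NH" (counts / total), "LCH" (log(1 + count), an assumption).
    """
    n_regular = int(round(2.0 / bin_size))
    counts = [0.0] * (n_regular + 1)
    for s in sims:
        if s >= 1.0 - 1e-9:  # exact match, with a small floating-point tolerance
            counts[-1] += 1.0
        else:
            idx = int(math.floor((s + 1.0) / bin_size))
            counts[min(max(idx, 0), n_regular - 1)] += 1.0
    if mode == "CH":
        return counts
    if mode == "NH":
        total = sum(counts)
        return [c / total for c in counts] if total > 0 else counts
    if mode == "LCH":
        return [math.log(1.0 + c) for c in counts]
    raise ValueError("mode must be 'CH', 'NH' or 'LCH'")

# Local interactions for query term "car" from the example in the text.
sims = [1.0, 0.2, 0.7, 0.3, -0.1, 0.1]
print(matching_histogram(sims))          # -> [0.0, 1.0, 3.0, 1.0, 1.0]
print(matching_histogram(sims, mode="NH"))
print(matching_histogram(sims, mode="LCH"))
```

Whatever the document length, the output always has the same number of entries, and the exact match remains isolated in the last position.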
Meanwhile, since the matching histogram directly distinguishes exact matching signals from the rest, our model can naturally learn the importance of exact matching signals.

There have been some interaction-focused models that employ special pooling strategies to turn the position-aware interactions into strength-based fixed-length representations. For example, MV-LSTM [26] used the K-max pooling strategy [13] to select the top K strongest interaction signals from the matching matrix as the input of an MLP. However, such a pooling strategy simply truncates the signals and thus will be strongly biased towards long documents, since it is more likely for long documents to contain more strong signals. The pooling strategy is applied over the entire matching matrix in MV-LSTM, making it possible that the top K strongest signals all come from the interactions between a single query term and the document terms. In contrast, our model does not rely on any pooling strategy to truncate the interactions so that we can avoid these problems.

Term Gating Network: One significant difference of our model from existing interaction-focused models is that we employ a joint deep architecture at the query term level. In this way, our model can explicitly model query term importance. This is achieved by using the term gating network, which produces an aggregation weight for each query term controlling how much the relevance score on that query term contributes to the final relevance score. Specifically, we employ the softmax function as the gating function:

$$g_i = \frac{\exp\big(w_g x_i^{(q)}\big)}{\sum_{j=1}^{M} \exp\big(w_g x_j^{(q)}\big)}, \quad i = 1, \dots, M,$$

where $w_g$ denotes the weight vector of the term gating network and $x_i^{(q)}, i = 1, \dots, M$ denotes the $i$-th query term input. We tried different inputs for the gating function as follows:

Term Vector (TV): Inspired by the work [32] where term embeddings can be leveraged to learn the term weights in queries, we use query term vectors as the input of the gating function. In this method, $x_i^{(q)}$ denotes the $i$-th query term vector, and $w_g$ is a weight vector with the same dimensionality as the term vectors.
Inverse Document Frequency (IDF): An important signal of term importance in ad-hoc retrieval is the inverse document frequency. We also tried this simple but powerful signal in the gating function. In this method, $x_i^{(q)}$ denotes the inverse document frequency of the $i$-th query term, and $w_g$ reduces to a single parameter.

4.1 Model Training

Since the ad-hoc retrieval task is fundamentally a ranking problem, we employ a pairwise ranking loss such as hinge loss to train our deep relevance matching model. Given a triple $(q, d^+, d^-)$, where document $d^+$ is ranked higher than document $d^-$ with respect to query $q$, the loss function is defined as:

$$\mathcal{L}(q, d^+, d^-; \Theta) = \max\big(0, 1 - s(q, d^+) + s(q, d^-)\big)$$

where $s(q, d)$ denotes the predicted matching score for $(q, d)$, and $\Theta$ includes the parameters for the feed forward matching network and those for the term gating network. The optimization is relatively straightforward with standard backpropagation [29]. We apply the stochastic gradient descent method Adagrad [4] with mini-batches (20 in size), which can be easily parallelized on a single machine with multiple cores. For regularization, we find that the early stopping [9] strategy works well for our model.

5. EXPERIMENTS

In this section, we conduct experiments to demonstrate the effectiveness of our proposed model.

5.1 Data Sets

To conduct experiments, we use two TREC collections, Robust04 and ClueWeb-09-Cat-B. The details of the two collections are provided in Table 1. As we can see, they represent different sizes and genres of heterogeneous text collections. Robust04 is a small news dataset. Its topics are collected from the TREC Robust Track. ClueWeb-09-Cat-B, on the other hand, is a large Web collection, whose topics are accumulated from TREC Web Tracks 2009, 2010, and [...]. Note that ClueWeb-09-Cat-B is filtered to the set of documents with spam scores in the 60th percentile, using the Waterloo Fusion spam scores [3]. For both datasets, we made use of both the title and the description of each TREC topic in our experiments.

Table 1: Statistics of the TREC collections used in this study. The ClueWeb-09-Cat-B collection has been filtered to the set of documents in the 60th percentile of spam scores.

                     Robust04    ClueWeb-09-Cat-B
Vocabulary           0.6M        38M
Document Count       0.5M        34M
Collection Length    252M        26B
Query Count
The retrieval experiments described in this section are implemented using the Galago Search Engine. During indexing and retrieval, both documents and query words are white-space tokenized, lowercased, and stemmed using the Krovetz stemmer [15]. Stopword removal is performed on query words during retrieval using the INQUERY stop list [2].

5.2 Baselines and Experimental Settings

We adopt three types of baseline methods for comparison, including traditional retrieval models, representation-focused deep matching models and interaction-focused deep matching models.

Traditional retrieval models include:

QL: Query likelihood model based on Dirichlet smoothing [31] is one of the best performing language models.

BM25: The BM25 formula [22] is another highly effective retrieval model that represents the classical probabilistic retrieval model.

Representation-focused deep matching models include:

DSSM_T/DSSM_D: DSSM [12] is a state-of-the-art deep matching model for Web search. In the original paper, the model was evaluated based on <query, doc title> pairs where doc title is extracted from the title field. We denote this model as DSSM_T. Since other baseline models and our model are based on the full text of the documents, we also evaluated the DSSM model under the same setting, denoted by DSSM_D. Since DSSM needs large scale training data due to its huge parameter size, we directly used the released model (trained on a large click-through dataset) in our experiments.

C-DSSM_T/C-DSSM_D: C-DSSM [23, 8] is a similar deep matching model to DSSM for Web search, replacing the feed forward neural network with a convolutional neural network. For the same reason as DSSM, we also made use of the released model directly and adopt two versions of the C-DSSM model, one based on title fields of documents denoted as C-DSSM_T and the other based on the whole document denoted as C-DSSM_D.

ARC-I: ARC-I [11] is a general representation-focused deep matching model that has been tested on a set of NLP tasks including sentence completion, response matching, and paraphrase identification.
We implemented the ARC-I model according to the original paper since there is no publicly available code.

Interaction-focused deep matching models are as follows:

ARC-II: ARC-II [11] was proposed by the authors of ARC-I, but focuses on learning hierarchical matching patterns from local interactions using a CNN. We also implemented ARC-II since there is no publicly available code.

MP: MatchPyramid [19] is another state-of-the-art interaction-focused deep matching model and has been tested on two NLP tasks, paraphrase identification and paper citation matching. There are three variants of the model based on different interaction operators, denoted as MP_IND, MP_COS, and MP_DOT. We obtained the original implementation of the model from the authors for comparison.

We refer to our proposed deep relevance matching model as DRMM. With different types of histogram mapping functions (i.e., CH, NH and LCH) and term gating functions (i.e., TV and IDF), we obtained six different variants of our proposed model. For example, DRMM_CH×IDF refers to DRMM with the count-based histogram and the term gating network using inverse document frequency.

Term Embeddings: For all the models based on term embedding inputs, including ARC-I, ARC-II, MatchPyramid and DRMM, we used 300-dimensional term vectors trained with the Continuous Bag-of-Words (CBOW) model [18] on the Robust04 and ClueWeb-09-Cat-B collections, respectively. Specifically, we used 10 as the context window size, 10 negative samples, and subsampling of frequent words with a sampling threshold of 10^-4, as suggested by Word2Vec. Each corpus was pre-processed by removing HTML tags and stemming. We also discarded from the vocabulary all the terms that occur fewer than 10 times in the corpus, which resulted in a vocabulary of size 0.1M and 4.1M on the Robust04 and ClueWeb-09-Cat-B collections, respectively. To address out-of-vocabulary (OOV) terms (i.e., some rare terms or numbers not trained by CBOW) in queries, we follow the practice in previous work [14] and only allow exact matching between such query terms and document terms.

Network Configurations: For network configurations (e.g., numbers of layers and hidden nodes), we tune the hyper-parameters on a validation set (taken as part of the training set). For ARC-I, ARC-II and MatchPyramid, we tried both the default configurations in their original papers and other settings. We find that models with fewer layers and feature maps perform better, probably due to the limited training data in TREC collections. Specifically, for ARC-I and ARC-II, we use 3-word windows, 64 feature maps and 6 layers (two for convolutions, two for max-pooling and two full connection layers). For MatchPyramid, we use one convolutional layer, one dynamic pooling layer and two full connection layers. The number of feature maps is 8 and the kernel size is set to 3 × 3. For DRMM, we use a four-layer architecture throughout all experiments, i.e., one histogram input layer (30 nodes), two hidden layers in the feed forward matching network (5 nodes and 1 node respectively), and one output layer (1 node) with the term gating network for the final matching score.

5.3 Evaluation Methodology

Given the limited number of queries for each collection, we conduct 5-fold cross-validation to minimize over-fitting without reducing the number of learning instances. Topics for each collection are randomly divided into 5 folds. The parameters for each model are tuned on 4 of the 5 folds. The final fold in each case is used to evaluate the optimal parameters. This process is repeated 5 times, once for each fold.
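The fold procedure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names `five_fold_splits` and `cross_validated_score`, the `tune` and `evaluate` callbacks, and the fixed `seed` are all hypothetical, and the fold-level scores are combined by a plain average.

```python
import random


def five_fold_splits(topic_ids, n_folds=5, seed=0):
    """Yield (train_topics, test_topics) pairs, one per held-out fold.

    Hypothetical helper: topics are shuffled once with a fixed seed and
    dealt round-robin into n_folds folds.
    """
    topics = list(topic_ids)
    random.Random(seed).shuffle(topics)
    folds = [topics[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        # The remaining folds are merged into the tuning set.
        train = [t for i, fold in enumerate(folds) if i != held_out for t in fold]
        yield train, test


def cross_validated_score(topic_ids, tune, evaluate, n_folds=5, seed=0):
    """Average a fold-level metric over all held-out folds.

    `tune(train_topics)` returns model parameters (assumed callback);
    `evaluate(params, test_topics)` returns a metric such as MAP (assumed callback).
    """
    scores = []
    for train, test in five_fold_splits(topic_ids, n_folds, seed):
        params = tune(train)
        scores.append(evaluate(params, test))
    return sum(scores) / len(scores)
```

With this layout every topic is used for evaluation exactly once and is never part of the folds its own parameters were tuned on, which is the property the procedure relies on when the number of queries is small.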
Mean average precision (MAP) is the optimized metric for all retrieval models. Throughout this paper each displayed evaluation statistic is the average of the five fold-level evaluation values. For evaluation, the top-ranked 1,000 documents are compared using mean average precision (MAP), normalized discounted cumulative gain at rank 20 (nDCG@20), and precision at rank 20 (P@20). Statistical differences between models are computed using the Fisher randomization test [24] (α = 0.05).

Note that for all the deep matching models, we adopt a re-ranking strategy for efficient computation. An initial retrieval is performed using the QL model to obtain the top 2,000 ranked documents. We then use the deep matching models to re-rank these top results. The top-ranked 1,000 documents are then used for comparison.

5.4 Retrieval Performance and Analysis

This section presents the performance results of different retrieval models over the two benchmark datasets. A summary of results is displayed in Table 2.

As we can see, all the representation-focused models perform significantly worse than the traditional retrieval models, demonstrating the unsuitability of these models for relevance matching. Both DSSM_T and C-DSSM_T work better than their whole-document counterparts on ClueWeb-09-Cat-B, showing that models designed for a global matching requirement cannot handle the diverse matching requirements in long documents. Note that we do not report the performance of DSSM_T and C-DSSM_T on Robust04 since there is no title field in many subsets of this collection. The ARC-I model, although trained on the corresponding corpus, performs even worse than DSSM and C-DSSM. A possible reason is that ARC-I concatenates the query and document representations for computing the matching score, which may be less effective than the cosine function in DSSM and C-DSSM.

When we look at the interaction-focused models, we find that these baseline models cannot compete with the traditional retrieval models either.
Among these models, ARC-II outperforms ARC-I by directly learning from local interactions, but performs worse than the MatchPyramid models due to its indirect local interactions (i.e., the local interaction is based on the weighted sum of query and document term vectors rather than cosine similarity or dot product), which is consistent with previous results in [11, 19]. Moreover, the best performing interaction-focused model, MP_COS, consistently outperforms all the representation-focused models on both test collections. When comparing the MatchPyramid models, we find that both MP_IND and MP_COS perform much better than MP_DOT. Note that MP_IND is purely based on exact matching signals, while MP_COS and MP_DOT involve both exact and similarity matching signals; exact matching signals are always stronger than similarity signals in MP_COS, but this may not be true in MP_DOT. The performance gap between MP_DOT and the other two MPs indicates the importance of the exact matching signals in relevance matching. In fact, when evaluated on the semantic matching tasks in [19], MP_DOT performed better than the other two MPs even though it cannot differentiate the exact matching signals from the rest, demonstrating the significant differences between semantic matching and relevance matching.

As for our proposed DRMMs, we have the following observations: (1) NH-based models perform significantly worse than CH-based models, while LCH-based models achieve the best performance on both collections. The low performance of NH-based models may be related to the loss of document length information after normalization, which is important in ad-hoc retrieval [6]. Meanwhile, the good performance of LCH-based models indicates that deep neural networks can benefit from input signals with a reduced range and from a nonlinear transformation that is useful for learning multiplicative relationships [1]; (2) The term gating function based on inverse document frequency works better than that based on term vectors. There are two possible reasons for this result.
Firstly, term vectors do not contain sufficient information about term importance. Secondly, the learning of the model might be dominated by the term gating network when we use term vectors as the input, since there are more parameters (i.e., 300 parameters) in the gating network than in the feed forward matching network (i.e., 155 parameters).

Finally, we can see that the best performing DRMM (i.e., DRMM_LCH×IDF) is significantly better than all the existing deep matching models as well as the traditional retrieval models. For example, on ClueWeb-09-Cat-B topic titles, the relative improvement of our model over the best performing baseline (i.e., BM25) is about 11.9%, 14.7%, and 12% in terms of MAP, nDCG@20 and P@20, respectively. Another interesting finding is that on the Robust04 collection, the performance of DRMM_LCH×IDF on topic descriptions is comparable to that on topic titles, which is seldom observed for previous models. This also demonstrates the potential of our model in handling long queries in ad-hoc retrieval.

Table 2: Comparison of different retrieval models over the Robust04 and ClueWeb-09-Cat-B collections. Significant improvement or degradation with respect to QL is indicated (+/-) (p-value ≤ 0.05).
Columns, for each collection: MAP, nDCG@20 and P@20 on topic titles; MAP, nDCG@20 and P@20 on topic descriptions.
Traditional Retrieval Baselines: QL, BM25.
Representation-Focused Matching Baselines: DSSM_T and C-DSSM_T (ClueWeb-09-Cat-B only), DSSM_D, C-DSSM_D, ARC-I.
Interaction-Focused Matching Baselines: ARC-II, MP_IND, MP_COS, MP_DOT.
Our Approach: DRMM_CH×TV, DRMM_NH×TV, DRMM_LCH×TV, DRMM_CH×IDF, DRMM_NH×IDF, DRMM_LCH×IDF.

5.5 Analysis on DRMM model

We conducted experiments to verify the effectiveness of different components in the DRMM and to analyze the effect of term embedding dimensions. Through these experiments, we try to gain a better understanding of the DRMM.

5.5.1 Impact of Different Model Components

To study the effect of different model components, we compare the original DRMM_LCH×IDF with several simpler versions of the model. Firstly, we removed the term gating network and used a simple sum to aggregate the scores from all the query terms. Since the aggregation weight is uniform, we denote this model as DRMM_LCH×UNI. We also tried removing the histogram mapping layer but kept the rest unchanged.
To turn the variable-length local interactions into a fixed-length representation, we adopted two pooling strategies. One is dynamic pooling as in [25, 19], which keeps the position information, and the other is K-max pooling as in [26], which turns the positional signals into strength-related signals. For a fair comparison, we require the size of the representation after pooling to be the same as the size of the matching histogram (i.e., 30). Note that although the matching network structure is the same, the learned model is significantly different due to the change of the input. The matching model based on dynamic pooling is a position-aware model, while the model based on K-max pooling is learned with respect to the top strong interaction signals. We denote the former model as DRMM_DYN×IDF and the latter as DRMM_KMAX×IDF.

Figure 3: Comparison of several simpler versions of DRMM over topic titles of the two test collections in terms of MAP (models shown: DRMM_LCH×IDF, DRMM_LCH×UNI, DRMM_DYN×IDF, DRMM_KMAX×IDF).

Table 3: Performance comparison of DRMM over different dimensionality of term embeddings trained by CBOW on the Robust04 collection.
Columns: Topic, Embedding, MAP.
Rows, for both topic titles and topic descriptions: CBOW-50d, CBOW-100d, CBOW-300d, CBOW-500d.

The comparison results over the topic titles on the two test collections in terms of MAP are depicted in Figure 3. As we can see, without the term gating network, DRMM_LCH×UNI performs slightly worse than the original DRMM. Specifically, the relative MAP drop of DRMM_LCH×UNI compared with DRMM_LCH×IDF is about 6.8% and 3.5% on Robust04 and ClueWeb-09-Cat-B, respectively. The results demonstrate the effectiveness of differentiating query term importance in relevance matching. Besides, we find that DRMM_DYN×IDF, based on position-related signals, performs significantly worse than the other two models based on strength-related signals (i.e., DRMM_LCH×IDF and DRMM_KMAX×IDF). The results indicate that ad-hoc retrieval is more likely to be a strength-related task than a position-related task. When comparing DRMM_KMAX×IDF and the original DRMM_LCH×IDF, we find that DRMM_KMAX×IDF works quite well on Robust04 but fails on ClueWeb-09-Cat-B. A possible reason is that the document length variation on Web data (i.e., ClueWeb-09-Cat-B) is much larger than that on news data (i.e., Robust04), leading to the failure of the K-max pooling method, which has a potential bias towards very long documents.
This further demonstrates the effectiveness of our matching histogram mapping and the corresponding histogram-based feed forward matching network.

5.5.2 Impact of Term Embeddings

Since we leverage a priori learned term embeddings in our model, we further study the effect of embedding dimensionality on the retrieval performance. Here we report the performance results on the Robust04 collection using term embeddings trained by the CBOW model with 50, 100, 300, and 500 dimensions, respectively. As shown in Table 3, the performance first increases and then slightly drops as the dimensionality increases. Term embeddings of different dimensionality provide different granularity of semantic similarity; they may also require different amounts of training data. With lower dimensionality, the similarity between term embeddings might be coarse and hurt the relevance matching performance. However, with larger dimensionality, one may need more data to train reliable term embeddings. Our results suggest that 300 dimensions are sufficient for learning term embeddings effective for relevance matching on the Robust04 collection.

6. RELATED WORK

By formalizing ad-hoc retrieval as a text matching problem, deep matching models can be applied to this task so that features can be automatically acquired in an end-to-end way. In recent years, a variety of deep matching models have been proposed for text matching problems. As mentioned before, we can categorize the existing deep matching models into two major types, namely representation-focused models and interaction-focused models. We have described several representative deep matching models in these two classes in previous sections, including DSSM, C-DSSM, ARC-I, ARC-II and MatchPyramid. Here we will discuss some other related work in this direction.

In the class of representation-focused models, Qiu et al. [21] proposed the Convolutional Neural Tensor Network (CNTN) for community-based question answering. The CNTN model is similar to ARC-I, using CNNs to build the representation of each piece of text.
The major difference between CNTN and ARC-I is that CNTN employs a tensor layer rather than an MLP on top of the two CNNs to compute the matching score between the two pieces of text. In [25], Socher et al. proposed an Unfolding Recursive Autoencoder (uRAE) for paraphrase identification. They first employed recursive autoencoders to build hierarchical compositional text representations based on syntactic trees, and then conducted matching at different levels for the identification task. In [30], Yin et al. introduced MultiGranCNN, which employs a CNN to obtain hierarchical representations of texts, and then computes the matching score based on the interactions between these multigranular representations.

In the class of interaction-focused models, Wang et al. [28] proposed Deep Match Tree (DeepMatch_tree) for the short text matching problem. Different from DeepMatch [17], which builds local interactions between texts based on semantic topics, DeepMatch_tree defines interactions in the product space of dependency trees. A deep neural network is then leveraged to make a matching decision on the two short texts on the basis of these local interactions. In [27], Wan et al. introduced Match-SRNN to model the recursive matching structure in the local interactions so that long-distance dependency between the interactions can be captured. The proposed model was evaluated on two tasks, community-based question answering and paper citation matching.

Most of these deep matching models are designed for the semantic matching problem, which is significantly different from the relevance matching problem in ad-hoc retrieval. In this work, we introduce a model specifically designed for the relevance matching problem.

7. CONCLUSIONS

In this paper, we point out that there are significant differences between semantic matching for many NLP tasks and relevance matching for the ad-hoc retrieval task. Many existing deep matching models designed for the semantic matching problem thus may not fit the ad-hoc retrieval task. Based on this analysis, we propose a novel deep relevance matching model for ad-hoc retrieval that explicitly addresses the three factors in relevance matching. The proposed model contains three major components, i.e., matching histogram mapping, a feed forward matching network, and a term gating network. Experimental results on two representative benchmark datasets show that our model can significantly outperform traditional retrieval models as well as state-of-the-art deep matching models.

For future work, we would like to leverage larger training data, e.g., click-through logs, to train a deeper DRMM so that we can further explore the potential of the proposed model on ad-hoc retrieval. We may also include phrase embeddings so that phrases can be treated as a whole rather than as separate terms. In this way, we expect the local interactions to better reflect the meaning by using the proper semantic units in language, leading to better retrieval performance.

8. ACKNOWLEDGMENTS

This work was supported in part by the Center for Intelligent Information Retrieval, in part by the 973 Program of China under Grant No. 2014CB and 2013CB329606, in part by the National Natural Science Foundation of China, and in part by the Youth Innovation Promotion Association CAS.

9. REFERENCES

[1] C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In ICML. ACM.
[2] J. P. Callan, W. B. Croft, and J. Broglio. Trec and tipster experiments with inquery. IPM, 31(3).
[3] G. V. Cormack, M. D. Smucker, and C. L. Clarke. Efficient and effective spam filtering and re-ranking for large web datasets. Information retrieval, 14(5).
[4] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. JMLR, 12.
[5] H. Fang, T. Tao, and C. Zhai. A formal study of information retrieval heuristics. In SIGIR. ACM.
[6] H. Fang, T. Tao, and C. Zhai. Diagnostic evaluation of information retrieval models. TOIS, 29(2):7.
[7] H. Fang and C. Zhai. Semantic term matching in axiomatic approaches to information retrieval. In SIGIR. ACM.
[8] J. Gao, P. Pantel, M. Gamon, X. He, L. Deng, and Y. Shen. Modeling interestingness with deep neural networks. EMNLP, October.
[9] R. C. S. L. L. Giles. Overfitting in neural nets: Backpropagation, conjugate gradient, and early stopping. In NIPS, volume 13, page 402. MIT Press.
[10] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. Signal Processing Magazine, 29(6):82-97.
[11] B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neural network architectures for matching natural language sentences. In NIPS.
[12] P.-S. Huang, X. He, J. Gao, L. Deng, A. Acero, and L. Heck. Learning deep structured semantic models for web search using clickthrough data. In CIKM. ACM.
[13] N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. arXiv preprint.
[14] T. Kenter and M. de Rijke. Short text similarity with word embeddings. In CIKM. ACM.
[15] R. Krovetz. Viewing morphology as an inference process. In SIGIR. ACM.
[16] Y. LeCun and Y. Bengio. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10):1995.
[17] Z. Lu and H. Li. A deep architecture for matching short texts. In NIPS.
[18] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS.
[19] L. Pang, Y. Lan, J. Guo, J. Xu, S. Wan, and X. Cheng. Text matching as image recognition.
[20] J. Pennington, R. Socher, and C. D. Manning. Glove: Global vectors for word representation. In EMNLP.
[21] X. Qiu and X. Huang. Convolutional neural tensor network architecture for community-based question answering. In IJCAI.
[22] S. E. Robertson and S. Walker. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR. ACM.
[23] Y. Shen, X. He, J. Gao, L. Deng, and G. Mesnil. Learning semantic representations using convolutional neural networks for web search. In WWW.
[24] M. D. Smucker, J. Allan, and B. Carterette. A comparison of statistical significance tests for information retrieval evaluation. In CIKM. ACM.
[25] R. Socher, E. H. Huang, J. Pennin, C. D. Manning, and A. Y. Ng. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In NIPS, 2011.

A Deep Relevance Matching Model for Ad-hoc Retrieval

A Deep Relevance Matching Model for Ad-hoc Retrieval A Deep Relevance Matching Model for Ad-hoc Retrieval Jiafeng Guo 1, Yixing Fan 1, Qingyao Ai 2, W. Bruce Croft 2 1 CAS Key Lab of Web Data Science and Technology, Institute of Computing Technology, Chinese

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Discriminative Dictionary Learning with Pairwise Constraints

Discriminative Dictionary Learning with Pairwise Constraints Dscrmnatve Dctonary Learnng wth Parwse Constrants Humn Guo Zhuoln Jang LARRY S. DAVIS UNIVERSITY OF MARYLAND Nov. 6 th, Outlne Introducton/motvaton Dctonary Learnng Dscrmnatve Dctonary Learnng wth Parwse

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY SSDH: Sem-supervsed Deep Hashng for Large Scale Image Retreval Jan Zhang, and Yuxn Peng arxv:607.08477v2 [cs.cv] 8 Jun 207 Abstract Hashng

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Combining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval

Combining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval Combnng Multple Resources, Evdence and Crtera for Genomc Informaton Retreval Luo S 1, Je Lu 2 and Jame Callan 2 1 Department of Computer Scence, Purdue Unversty, West Lafayette, IN 47907, USA ls@cs.purdue.edu

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25 Transformaton Networks for Target-Orented Sentment Classfcaton 1 Xn L 1, Ldong Bng 2, Wa Lam 1, Be Sh 1 1 The Chnese Unversty of Hong Kong 2 Tencent AI Lab ACL 2018 1 Jont work wth Tencent AI Lab Transformaton

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
