Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling
Chenyan Xiong, Zhengzhong Liu, Jamie Callan, and Tie-Yan Liu*
Carnegie Mellon University & Microsoft Research*
Slides 2-4: Document Understanding in Search
Components: Document; Bag-of-Words (Word Matches) and Bag-of-Entities (Entity Semantics); Interaction-Based Ranking Models
Bag-of-Terms: Effective & Efficient
Mostly Frequency Signals
Slides 5-7: Shallow Understanding of Bag-of-Terms
Frequency != Importance
Figure: Rank of Title Entities in Their Wiki Pages (Top 1, Top 2, Top 3, Other). 57% not the most frequent.
Example callout: "My name only appears once."
Slides 8-11: The Entity Salience Task
Estimate entity importance in documents [Dunietz and Gillick 2014]
Annotated NYT: ~half a million news articles, with manual summaries
Candidate Entities: entity annotations, ~200 per article
Salient Labels: appearance in the summary, ~28 per article
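The labeling rule on this slide (a candidate entity counts as salient when it also appears in the manual summary) can be sketched in a few lines. This is only an illustration: the function name `salience_labels` and the toy entity IDs are hypothetical and do not come from the talk.

```python
def salience_labels(document_entities, summary_entities):
    """Label each candidate entity 1 if it also appears in the summary, else 0."""
    summary = set(summary_entities)
    # Deduplicate candidates while keeping first-mention order.
    candidates = list(dict.fromkeys(document_entities))
    return {entity: int(entity in summary) for entity in candidates}


# Toy example with made-up entity IDs (not real NYT annotations).
doc = ["nfl", "concussion", "helmet", "nfl", "new_york"]
summary = ["nfl", "concussion"]
labels = salience_labels(doc, summary)
```

Note that the label ignores how often an entity is mentioned, which is the point of the task: salience is defined by the summary, not by frequency.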
Slide 12: This Work
Kernel Entity Salience Estimation:
- Represent entities by Knowledge-Enriched Embeddings
- Model entity interactions by a Kernel Interaction Model
- Learn to estimate salience end-to-end
- Improve web search by domain adaptation
Slide 13: Intuition
From counting frequency to modeling interactions
Slide 14: KNOWLEDGE-ENRICHED EMBEDDING
Learn to represent entities using embeddings
Integrate knowledge graph semantics
Slides 15-20: Step 1: Knowledge-Enriched Embedding
Example target entity: Concussion, with description words "Concussion mild traumatic injury"
1. Map entities to embeddings (to be learned): the embedding layer gives the entity embedding $\vec{e}_i$
2. Introduce words in the entity description: word embeddings $\vec{w}_1, \ldots, \vec{w}_l$
3. Compose words by CNN filters: $C_p = W^c \cdot \vec{w}_{p:p+h}$
4. Max-pool to the description embedding: $\vec{v}_{D_i} = \max(C_1, \ldots, C_{l-h})$
5. Combine to the Knowledge-Enriched Embedding (KEE): $\vec{v}_{e_i} = W^v \cdot (\vec{e}_i \sqcup \vec{v}_{D_i})$
KEE combines Data-Driven Embeddings (the entity embedding) with Knowledge Graph Semantics (the description embedding).
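The five build steps of Step 1 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `matvec` and `knowledge_enriched_embedding`, the weight matrices `W_c` (CNN filters) and `W_v` (final projection), the window convention (h consecutive words per window), and all toy numbers are hypothetical. It also assumes the description has at least h words.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def knowledge_enriched_embedding(e_i, desc_words, W_c, W_v, h):
    """Combine an entity embedding with a CNN-composed description embedding.

    e_i:        entity embedding (list of floats)
    desc_words: word embeddings of the entity description (list of lists)
    W_c:        CNN filters, one row per filter, each of length h * word_dim
    W_v:        projection applied to the concatenation [e_i ; v_D]
    h:          window size in words (assumed convention)
    """
    # Slide a window of h words over the description and flatten each window.
    windows = [
        [x for w in desc_words[p:p + h] for x in w]
        for p in range(len(desc_words) - h + 1)
    ]
    # Apply every filter to every window position.
    C = [matvec(W_c, window) for window in windows]
    # Max-pool over positions, separately for each filter.
    v_D = [max(column) for column in zip(*C)]
    # Concatenate entity and description embeddings, then project.
    return matvec(W_v, e_i + v_D)


# Toy numbers, chosen only so the result is easy to check by hand.
e_i = [0.5, -0.5]
desc_words = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
W_c = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
W_v = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, -1.0]]
kee = knowledge_enriched_embedding(e_i, desc_words, W_c, W_v, h=2)
```

In a trained model the embeddings and both weight matrices would be learned jointly; here they are fixed so the data flow is visible.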
Slide 21: KERNEL INTERACTION MODEL
Model term interactions in the embedding space
Capture multi-level interactions using kernels
Slides 22-26: Step 2: Kernel Interaction Model
1. Model entity interactions in the embedding space: cosine similarity between the embedding of the target entity $\vec{v}_{e_i}$ and the embeddings of the document entities $\vec{v}_{e_j}$
2. Use RBF kernels to capture multi-level interactions [Xiong et al. 2017] (K-NRM). Similar != Related; let the data decide.
$\phi_k(e_i, e_j) = \exp\left(-\frac{(\cos(\vec{v}_{e_i}, \vec{v}_{e_j}) - \mu_k)^2}{2\sigma_k^2}\right)$
$\Phi_K(e_i, e_j) = \{\phi_1(e_i, e_j), \ldots, \phi_K(e_i, e_j)\}$
3. Multi-level interactions as votes to the target entity (entity kernels):
$\Phi(e_i, E) = \sum_{e_j \in E} \Phi_K(e_i, e_j)$
4. Model word-entity interactions as well (word kernels $\Phi(e_i, W)$ over the embeddings of document words):
$KIM(e_i, d) = \Phi(e_i, E) \sqcup \Phi(e_i, W)$
5. Kernel scores as features for downstream tasks: $\Phi(e_i, E)$ holds multi-level votes from other entities, $\Phi(e_i, W)$ holds multi-level votes from other words
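A small sketch of the kernel votes described in Step 2: cosine similarity between the target entity and every other entity or word, passed through K RBF kernels and summed per kernel. The function names, the kernel means `mus`, the widths `sigmas`, and the toy embeddings are assumptions made for illustration; the talk does not state the kernel settings.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kernel_votes(target, others, mus, sigmas):
    """For each kernel k, sum the RBF response over all embeddings in `others`."""
    votes = [0.0] * len(mus)
    for other in others:
        sim = cosine(target, other)
        for k, (mu, sigma) in enumerate(zip(mus, sigmas)):
            votes[k] += math.exp(-((sim - mu) ** 2) / (2 * sigma ** 2))
    return votes


def kim(target, entity_embs, word_embs, mus, sigmas):
    """Concatenate entity-kernel votes and word-kernel votes."""
    return (kernel_votes(target, entity_embs, mus, sigmas)
            + kernel_votes(target, word_embs, mus, sigmas))


# Hypothetical kernels centred at similarity 1, 0 and -1.
mus = [1.0, 0.0, -1.0]
sigmas = [0.1, 0.1, 0.1]
target = [1.0, 0.0]
entities = [[1.0, 0.0], [0.0, 1.0]]   # one identical, one orthogonal entity
words = [[-1.0, 0.0]]                 # one opposite word
features = kim(target, entities, words, mus, sigmas)
```

Each kernel acts as a soft count of neighbours at one similarity level, so the feature vector records how many strong, weak, and negative connections the target entity has in the document.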
Slides 27-30: Step 3: End-to-End Learning for Salience
Combine word and entity kernels to the salience score (entity votes and word votes):
$f(e_i, d) = W^s \cdot (\Phi(e_i, E) \sqcup \Phi(e_i, W)) + b^s$
Pairwise learning to rank with hinge loss (salient entity > others):
$\sum_{e^+, e^- \in d} \max(0, 1 - f(e^+, d) + f(e^-, d))$
Learn end-to-end: allocate the embedding space by kernels
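The scoring layer and the pairwise hinge loss of Step 3 can be written out as follows. This is a sketch of the loss value only (no gradient step), and the function names, the margin default of 1, and the toy scores are illustrative assumptions.

```python
def salience_score(features, weights, bias):
    """Linear layer over the concatenated kernel features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def pairwise_hinge_loss(salient_scores, other_scores, margin=1.0):
    """Sum of max(0, margin - f(e+) + f(e-)) over all (salient, other) pairs."""
    return sum(
        max(0.0, margin - s_pos + s_neg)
        for s_pos in salient_scores
        for s_neg in other_scores
    )


# Toy scores for one document: two salient entities, two non-salient ones.
salient = [2.0, 0.5]
others = [0.0, 1.0]
loss = pairwise_hinge_loss(salient, others)

# The score itself is a weighted sum of kernel features plus a bias.
score = salience_score([1.0, 0.0, 2.0], [0.5, 0.5, 0.25], bias=0.1)
```

Only pairs where the salient entity does not beat the other entity by the margin contribute, so a well-ranked document yields zero loss.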
Slide 31: SALIENCE EXPERIMENTS
Can we do better than counting frequency?
Slides 32-36: Salience Estimation Performance
Figure: two bar charts (Precision@) comparing Freq, LTR, EMB, and KESM.
Freq: Frequency Count. EMB: Raw embeddings. LTR: Feature-based learning to rank. KESM: Kernel Entity Salience Model.
- Frequency is a strong indicator of importance
- …% with linguistic and semantic features
- Without kernels, no gains from embeddings
- …%~30% with our Kernel Salience Model
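The charts report Precision@k. As a reminder of what that metric measures, here is a minimal sketch; the function name, the toy ranking, and the cutoffs are hypothetical, and the cutoffs used in the talk are not recoverable from this transcript.

```python
def precision_at_k(ranked_entities, salient, k):
    """Fraction of the top-k ranked entities that are labeled salient (k > 0)."""
    top_k = ranked_entities[:k]
    return sum(1 for entity in top_k if entity in salient) / k


# Toy ranking produced by some salience model (made-up entity IDs).
ranked = ["nfl", "helmet", "concussion", "new_york"]
salient = {"nfl", "concussion"}
p_at_1 = precision_at_k(ranked, salient, 1)
p_at_2 = precision_at_k(ranked, salient, 2)
```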
Slide 37: AD HOC SEARCH TASK
How to improve search using entity salience
Slides 38-41: Step 4: Adapt to Web Search
Ranking by the salience of query entities in the document:
$f(q, d) = W^r \cdot \sum_{e_i \in q} \log \frac{\Phi(e_i, E) \sqcup \Phi(e_i, W)}{|E|}$
The kernel scores are the interactions between query entities and document terms.
- Pre-train by the NYT salience labels
- Serve as ranking features
- Use in standard learning to rank
Generalize
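The ranking formula on this slide, as far as it can be read from the transcript, sums the log of each query entity's kernel scores, normalized by the number of document entities, and feeds the result to a linear ranker. The sketch below follows that reading. The names `ranking_features` and `ranking_score`, the `eps` clamp that guards against log(0), and the toy numbers are assumptions for illustration.

```python
import math


def ranking_features(query_entity_kims, num_doc_entities, eps=1e-10):
    """Element-wise sum over query entities of log(kernel score / |E|).

    query_entity_kims: one kernel-score vector per query entity
    num_doc_entities:  |E|, the number of entities in the document
    eps:               assumed floor to avoid log(0); not from the talk
    """
    features = [0.0] * len(query_entity_kims[0])
    for kim_vector in query_entity_kims:
        for k, score in enumerate(kim_vector):
            features[k] += math.log(max(score, eps) / num_doc_entities)
    return features


def ranking_score(features, weights):
    """Linear combination of the salience-based ranking features."""
    return sum(w * x for w, x in zip(weights, features))


# Two query entities, two kernels, a document with |E| = 2 entities.
kims = [[2.0, 4.0], [8.0, 1.0]]
features = ranking_features(kims, num_doc_entities=2)
score = ranking_score(features, weights=[1.0, 1.0])
```

In the setup described on the slide, the kernel model would be pre-trained on the NYT salience labels and these features would then be used inside a standard learning-to-rank system.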
Slides 42-45: Ranking Performance on TREC
Figure: bar chart comparing LTR, ESR, Conv-KNRM, and KERM.
- LTR: Word Match
- ESR: Entity Frequency, +6%. Entity frequencies and knowledge graph embedding. [WWW 2017]
- Conv-KNRM: User Clicks, +3%. N-gram soft matches pre-trained on Bing search log. [WSDM 2018]
- KERM: Entity Salience, +5%. Entity salience pre-trained on NYT salience. (This Work)
Results on ClueWeb09-B. Similar results on ClueWeb12-B13.
Slides 46-49: Conclusion
- Understanding: From counting frequency to modeling interaction (Knowledge-Enriched Embedding, Kernel-Based Interaction Model)
- Data-Driven: Learn interaction patterns and embeddings end-to-end
- Generalizable: Better salience estimation leads to better search (Fine-Grained Text Processing, Information Retrieval Systems)
Slide 50: Many thanks to my co-authors! Come to our KG4IR workshop! Code and data will be available on my website. QUESTIONS?
Slide 51: The Entity Salience Task: Semantic Scholar
Predicting which entities appear in the paper title: salient entities should be mentioned in the title.
Document: paper abstract & title, one million documents
Candidate Entities: entity annotations, ~70 entities per abstract
Salient Labels: those that appear in the title, ~7 entities per paper
A Study of MatchPyramid Models on Ad hoc Retrieval Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng Institute of Computing Technology, Chinese Academy of Sciences Text Matching Many text based
More informationInformation Retrieval
Information Retrieval Natural Language Processing: Lecture 12 30.11.2017 Kairit Sirts Homework 4 things that seemed to work Bidirectional LSTM instead of unidirectional Change LSTM activation to sigmoid
More informationNatural Language Processing. SoSe Question Answering
Natural Language Processing SoSe 2017 Question Answering Dr. Mariana Neves July 5th, 2017 Motivation Find small segments of text which answer users questions (http://start.csail.mit.edu/) 2 3 Motivation
More informationInterpreting Document Collections with Topic Models. Nikolaos Aletras University College London
Interpreting Document Collections with Topic Models Nikolaos Aletras University College London Acknowledgements Mark Stevenson, Sheffield Tim Baldwin, Melbourne Jey Han Lau, IBM Research Talk Outline Introduction
More informationCMSC 476/676 Information Retrieval Midterm Exam Spring 2014
CMSC 476/676 Information Retrieval Midterm Exam Spring 2014 Name: You may consult your notes and/or your textbook. This is a 75 minute, in class exam. If there is information missing in any of the question
More informationColumbia University High-Level Feature Detection: Parts-based Concept Detectors
TRECVID 2005 Workshop Columbia University High-Level Feature Detection: Parts-based Concept Detectors Dong-Qing Zhang, Shih-Fu Chang, Winston Hsu, Lexin Xie, Eric Zavesky Digital Video and Multimedia Lab
More informationarxiv: v1 [cs.ir] 29 Apr 2018
Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA {js2, jxiao3, xhe7, shang7, sinhas, hanj}@illinoisedu
More informationAutomatically Building Research Reading Lists
Automatically Building Research Reading Lists Michael D. Ekstrand 1 Praveen Kanaan 1 James A. Stemper 2 John T. Butler 2 Joseph A. Konstan 1 John T. Riedl 1 ekstrand@cs.umn.edu 1 GroupLens Research Department
More informationQuery Intent Detection using Convolutional Neural Networks
Query Intent Detection using Convolutional Neural Networks Homa B. Hashemi, Amir Asiaee, Reiner Kraft QRUMS workshop - February 22, 2016 Query Intent Detection michelle obama age Query Intent Detection
More informationThe Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System
The Stanford/Technicolor/Fraunhofer HHI Video Semantic Indexing System Our first participation on the TRECVID workshop A. F. de Araujo 1, F. Silveira 2, H. Lakshman 3, J. Zepeda 2, A. Sheth 2, P. Perez
More informationMachine Learning for Media Data Analysis
27 OKTOBER 2010 Machine Learning for Media Data Analysis Assistant Machine Learning and Computer Vision Aarhus University, alexandrosiosifidis@engaudk Overview Media Data Analysis applications Image analysis/recognition/segmentation
More informationarxiv: v3 [cs.ir] 21 Jul 2017
PACRR: A Position-Aware Neural IR Model for Relevance Matching Kai Hui MPI for Informatics Saarbrücken Graduate School of Computer Science Andrew Yates MPI for Informatics Klaus Berberich htw saar MPI
More informationWeb Document Clustering using Semantic Link Analysis
Web Document Clustering using Semantic Link Analysis SOMJIT ARCH-INT, Ph.D. Semantic Information Technology Innovation (SITI) LAB Department of Computer Science, Faculty of Science, Khon Kaen University,
More informationAuthor: Yunqing Xia, Zhongda Xie, Qiuge Zhang, Huiyuan Zhao, Huan Zhao Presenter: Zhongda Xie
Author: Yunqing Xia, Zhongda Xie, Qiuge Zhang, Huiyuan Zhao, Huan Zhao Presenter: Zhongda Xie Outline 1.Introduction 2.Motivation 3.Methodology 4.Experiments 5.Conclusion 6.Future Work 2 1.Introduction(1/3)
More informationExploiting Global Impact Ordering for Higher Throughput in Selective Search
Exploiting Global Impact Ordering for Higher Throughput in Selective Search Michał Siedlaczek [0000-0002-9168-0851], Juan Rodriguez [0000-0001-6483-6956], and Torsten Suel [0000-0002-8324-980X] Computer
More informationCIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets
CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets Arjumand Younus 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group,
More informationMidterm Exam Search Engines ( / ) October 20, 2015
Student Name: Andrew ID: Seat Number: Midterm Exam Search Engines (11-442 / 11-642) October 20, 2015 Answer all of the following questions. Each answer should be thorough, complete, and relevant. Points
More informationAComparisonofRetrievalModelsusingTerm Dependencies
AComparisonofRetrievalModelsusingTerm Dependencies Samuel Huston and W. Bruce Croft Center for Intelligent Information Retrieval University of Massachusetts Amherst 140 Governors Dr, Amherst, Massachusetts,
More informationQuery Difficulty Prediction for Contextual Image Retrieval
Query Difficulty Prediction for Contextual Image Retrieval Xing Xing 1, Yi Zhang 1, and Mei Han 2 1 School of Engineering, UC Santa Cruz, Santa Cruz, CA 95064 2 Google Inc., Mountain View, CA 94043 Abstract.
More informationTowards Efficient and Effective Semantic Table Interpretation Ziqi Zhang
Towards Efficient and Effective Semantic Table Interpretation Ziqi Zhang Department of Computer Science, University of Sheffield Outline Define semantic table interpretation State-of-the-art and motivation
More informationFederated Text Search
CS54701 Federated Text Search Luo Si Department of Computer Science Purdue University Abstract Outline Introduction to federated search Main research problems Resource Representation Resource Selection
More informationBing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer
Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web
More informationRanking with Query-Dependent Loss for Web Search
Ranking with Query-Dependent Loss for Web Search Jiang Bian 1, Tie-Yan Liu 2, Tao Qin 2, Hongyuan Zha 1 Georgia Institute of Technology 1 Microsoft Research Asia 2 Outline Motivation Incorporating Query
More informationPapers for comprehensive viva-voce
Papers for comprehensive viva-voce Priya Radhakrishnan Advisor : Dr. Vasudeva Varma Search and Information Extraction Lab, International Institute of Information Technology, Gachibowli, Hyderabad, India
More informationLearning Ranking Functions with SVMs
Learning Ranking Functions with SVMs CS4780/5780 Machine Learning Fall 2012 Thorsten Joachims Cornell University T. Joachims, Optimizing Search Engines Using Clickthrough Data, Proceedings of the ACM Conference
More information