Time-aware Approaches to Information Retrieval

1 Time-aware Approaches to Information Retrieval
Nattiya Kanhabua
Department of Computer and Information Science
Norwegian University of Science and Technology
24 February 2012

2 Motivation
Searching temporal document collections, i.e., documents created or edited over time
  E.g., web archives, news archives, blogs, or emails
Example information need: retrieve documents about Pope Benedict XVI written before 2005
Term-based IR approaches may give unsatisfactory results

3 Wayback Machine (retrieved on 15 January 2011)
A web archive search tool by the Internet Archive
Query by a URL
No keyword query
No relevance ranking

4 Google News Archive Search
A news archive search tool by Google
Query by keywords
Rank results by relevance or date
Does not consider terminology changes over time

5 Objective of PhD thesis
Study problems of temporal search
Propose approaches to solve the problems
Main research question
  How to exploit temporal information in documents, queries, and external sources in order to improve retrieval effectiveness?

6 Outline of contributions
Part I - Content Analysis
  RQ1: How to determine the time of non-timestamped documents?
Part II - Query Analysis
  RQ2: How to determine the time of queries?
  RQ3: How to handle terminology changes over time?
  RQ4: How to predict the effectiveness of temporal queries?
  RQ5: How to predict a suitable time-aware ranking?
Part III - Retrieval and Ranking Models
  RQ6: How to model time in retrieval and ranking?
  RQ7: How to combine different features and time for ranking?

7 PART I - CONTENT ANALYSIS

8 RQ1: Determining time of documents
Problem statements
  It is difficult to find a trustworthy time for web documents
    Time gap between crawling and indexing
    Decentralization and relocation of web documents
    No standard metadata for time/date
  For a given document with an uncertain timestamp, can the contents be used to determine the timestamp with sufficiently high confidence?
Illustration
  User: "I found a bible-like document, but I have no idea when it was created."
  System: "Let me see... This document was probably written in 850 A.C., with 95% confidence."

9 Preliminaries
Temporal Language Models [de Jong 2005]
  Based on the statistical usage of words over time
  Compare each word of a non-timestamped document with a reference corpus
  Tentative timestamp: the time partition whose word usage overlaps most with the document
Example
  A non-timestamped document contains the words: tsunami, Thailand
  Temporal language models (reference corpus):
    Partition   Word
    1999        tsunami
    1999        Japan
    1999        tidal wave
    2004        tsunami
    2004        Thailand
    2004        earthquake
  Similarity scores
    Score(1999) = 1
    Score(2004) = 1 + 1 = 2
  The most likely timestamp is 2004
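The toy example on this slide can be reproduced with a few lines of Python. This is only a sketch of the word-overlap intuition: the function names are made up for illustration, and the temporal language models of [de Jong 2005] use probabilistic scores (a normalized log-likelihood ratio is mentioned later in the talk) rather than the plain overlap count assumed here.

```python
from collections import defaultdict

# Reference corpus from the slide, as (partition, word) pairs.
REFERENCE_CORPUS = [
    (1999, "tsunami"), (1999, "Japan"), (1999, "tidal wave"),
    (2004, "tsunami"), (2004, "Thailand"), (2004, "earthquake"),
]

def build_partition_vocab(reference_corpus):
    """Group the words of the reference corpus by time partition."""
    vocab = defaultdict(set)
    for partition, word in reference_corpus:
        vocab[partition].add(word)
    return vocab

def overlap_scores(doc_words, vocab):
    """Score each partition by the number of document words it contains.

    Assumption: a plain overlap count stands in for the real similarity score.
    """
    doc = set(doc_words)
    return {partition: len(doc & words) for partition, words in vocab.items()}

def most_likely_partition(doc_words, vocab):
    """Return the partition with the highest overlap score."""
    scores = overlap_scores(doc_words, vocab)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    vocab = build_partition_vocab(REFERENCE_CORPUS)
    print(overlap_scores(["tsunami", "Thailand"], vocab))
    print(most_likely_partition(["tsunami", "Thailand"], vocab))
```

With the slide's data, the overlap count gives 1 for 1999 and 2 for 2004, so 2004 is chosen as the tentative timestamp.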

10 Improving document dating
Three enhancement techniques:
1. Semantic-based data preprocessing
2. Search statistics to enhance similarity scores
3. Temporal entropy as term weights
Nattiya Kanhabua and Kjetil Nørvåg, Improving Temporal Language Models For Determining Time of Non-Timestamped Documents, In Proceedings of European Conference on Research and Advanced Technology for Digital Libraries (ECDL)

11 Improving document dating: 1. Semantic-based data preprocessing
Intuition: Direct comparison between extracted words and corpus partitions has limited accuracy
Approach: Integrate semantic-based techniques into document preprocessing

12 Improving document dating: 2. Search statistics to enhance similarity scores
Intuition: Search statistics from Google Zeitgeist (GZ) can increase the probability of a tentative time partition
Approach: Linearly combine a GZ score with the normalized log-likelihood ratio

13 Improving document dating: 3. Temporal entropy as term weights
Intuition: A term weight depends on how good the term is for separating time partitions (how discriminative it is)
Approach: Propose temporal entropy, based on the term selection method presented by Lochbaum and Streeter
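The slides do not give the formula for temporal entropy. A minimal sketch, assuming the usual entropy-style term weight in the spirit of Lochbaum and Streeter: TE(w) = 1 + (1 / log N) * sum over partitions p of P(p|w) * log P(p|w), where N is the number of time partitions and P(p|w) is the share of the term's occurrences that fall in partition p. The function name and the input layout are hypothetical.

```python
import math

def temporal_entropy(term_counts, num_partitions):
    """Entropy-based weight of a term over time partitions (assumed formula).

    term_counts maps a partition to the frequency of the term in it.
    The weight is 1.0 for a term concentrated in a single partition and
    close to 0.0 for a term spread evenly over all partitions.
    """
    total = sum(term_counts.values())
    if total == 0 or num_partitions < 2:
        # Assumption: an unseen term, or a single partition, carries no weight.
        return 0.0
    entropy_sum = 0.0
    for count in term_counts.values():
        if count > 0:
            p = count / total  # P(partition | term)
            entropy_sum += p * math.log(p)
    return 1.0 + entropy_sum / math.log(num_partitions)

if __name__ == "__main__":
    # A term seen in only one of four partitions is highly discriminative.
    print(temporal_entropy({2004: 10}, 4))
    # A term spread evenly over four partitions is not.
    print(temporal_entropy({2001: 5, 2002: 5, 2003: 5, 2004: 5}, 4))
```

Under this assumed definition, a term that helps separate time partitions gets a weight near 1, which is the behaviour the intuition on the slide asks for.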

14 Experiments
Collection
  9,000 documents collected from the Internet Archive
  8-year time span, 15 news sources
  1,000 documents randomly selected for testing
Results
  The proposed techniques improve over the baseline
  Precision = the fraction of documents correctly dated
Open issue
  The effectiveness of document dating is still limited
  It is highly dependent on the quality of the reference corpus

15 PART II - QUERY ANALYSIS

16-19 Challenges with temporal queries
Semantic gaps: lacking knowledge about
1. possibly relevant time of queries
2. terminology changes over time
[Figure: for a given query, the system suggests relevant times (time 1, time 2, ..., time k)]

20 RQ2: Determining time of queries
Problem statements
  1.5% of web queries explicitly contain a temporal expression [Nunes 2008]
    Time is a part of the query, e.g., U.S. Presidential election 2008
  About 7% of web queries have an implicit temporal intent [Metzler 2009]
    Time is not given in the query, e.g., Germany World Cup or tsunami
  It is difficult to achieve high accuracy using only keywords
    Relevant documents are associated with a particular time that is not given in the query

21 Our contributions
1. Determining the time of queries when no time is given
2. Re-ranking search results using the determined time
Nattiya Kanhabua and Kjetil Nørvåg, Determining Time of Queries for Re-ranking Search Results, In Proceedings of the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL)

22 Determining time of queries
Approach I. Dating using keywords*
Approach II. Dating using top-k documents*
  Queries are short keywords
  Inspired by pseudo-relevance feedback
Approach III. Using timestamps of top-k documents
  No temporal language models are used
*Using the temporal language models proposed by de Jong et al.
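Approach III can be sketched as a simple vote over the publication years of the top-k retrieved documents. The slides do not say how the timestamps are aggregated, so the frequency count below is an assumption, and the function name is hypothetical.

```python
from collections import Counter

def query_times_from_top_k(ranked_doc_years, k=5):
    """Suggest candidate query times from the timestamps of the top-k documents.

    ranked_doc_years holds the publication years of the retrieved documents,
    in rank order. Assumption: years are ordered by how often they occur
    among the top-k documents.
    """
    counts = Counter(ranked_doc_years[:k])
    return [year for year, _ in counts.most_common()]

if __name__ == "__main__":
    years = [2005, 2004, 2005, 2006, 2005, 1999]
    print(query_times_from_top_k(years, k=5))
```

Only the first k entries are used, which matches the later observation in the talk that a smaller k tends to give better precision.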

23 Re-ranking search results
Intuition: documents published close to the time of a query are more relevant
Assign document priors based on publication dates
[Figure: the time of the query is determined (2005, 2004, 2006, ...); the initial results retrieved from the news archive (e.g., D 2009, D 2005) are re-ranked accordingly]
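One way to picture the re-ranking step is a linear mix of the original retrieval score and a time-based document prior. The slides do not specify the prior or how it is combined with the score, so the rank-based prior, the mixing weight alpha, and all names below are assumptions made for illustration.

```python
def time_prior(pub_year, query_years):
    """Prior that favours documents published at a determined query time.

    Assumption: the i-th candidate year (0-based) receives weight 1 / (1 + i),
    and documents from any other year receive 0.
    """
    if pub_year in query_years:
        return 1.0 / (1 + query_years.index(pub_year))
    return 0.0

def rerank(results, query_years, alpha=0.5):
    """Re-rank (doc_id, retrieval_score, pub_year) tuples with a time prior.

    alpha is a hypothetical mixing weight between text score and time prior.
    """
    rescored = [
        (doc_id, (1 - alpha) * score + alpha * time_prior(year, query_years))
        for doc_id, score, year in results
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    initial = [("D2009", 0.9, 2009), ("D2005", 0.8, 2005)]
    print(rerank(initial, query_years=[2005, 2004, 2006]))
```

In this example the document from 2005 moves ahead of the one from 2009, mirroring the figure on the slide.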

24 Experiments: Part 1
Determining the time of queries
  Precision = the fraction of queries correctly dated
Collection
  NYT Corpus, containing over 1.8M articles
  30 time-sensitive queries from TREC Robust2004
Results
  The smaller the top-k, the better the precision (k=5 > k=10 > k=15)
  The larger the granularity g, the better the precision (g=12-month > g=6-month)

25 Experiments: Part 2
Re-ranking of search results
Collection
  TREC Robust2004, 30 time-sensitive queries
  NYT Corpus, 24 queries from Google Zeitgeist
Results
  Approach III (no temporal language models) outperforms all other approaches
  Using publication dates is more accurate than the dating process
Open issue
  Time can improve retrieval effectiveness, provided that query dating becomes more accurate

26 Challenges of temporal search
Semantic gaps: lacking knowledge about
1. possibly relevant time of queries
2. terminology changes over time

27 RQ3: Handling terminology changes
Problem statements
  Queries are composed of named entities (people, organizations, locations)
  Named entities are highly dynamic in appearance, i.e., relationships between terms change over time
  E.g., changes of roles, name alterations, or semantic shift
Scenario 1
  Query: Pope Benedict XVI, written before 2005
  Documents about Joseph Alois Ratzinger are relevant
Scenario 2
  Query: Hillary R. Clinton, written from 1997 to 2002
  Documents about New York Senator and First Lady of the United States are relevant

28 QUEST Demo

29 Our contributions
Discover time-based synonyms using Wikipedia
  Generally, synonyms are words with similar meanings
  In this work, synonyms are alternative names of an entity
Improve the accuracy of the time periods of synonyms
Query expansion using time-based synonyms
Nattiya Kanhabua and Kjetil Nørvåg, Exploiting Time-based Synonyms in Searching Document Archives, In Proceedings of the ACM/IEEE Conference on Digital Libraries (JCDL)

30-32 Recognize named entities

33 Find synonyms
Find a set of entity-synonym relationships at time t_k
For each entity e_i ∈ E_tk, extract anchor texts from article links:
  Entity: President_of_the_United_States
  Synonym: George W. Bush
  Time: 11/2004
[Figure: anchor texts such as "George W. Bush", "President George W. Bush", and "President Bush (43)" in links that point to entity articles, e.g., President_of_the_United_States]
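A minimal sketch of the anchor-text extraction, assuming the input is raw wikitext in which internal links are written as [[Target]] or [[Target|anchor text]]. The regular expression, the function name, and the sample sentence are illustrative only; a real pipeline over Wikipedia revisions would also need to handle redirects, templates, and section links.

```python
import re

# Matches [[Target]] and [[Target|anchor text]] links in wikitext.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")

def entity_synonym_pairs(wikitext, snapshot_time):
    """Extract (entity, synonym, time) triples from the links of one snapshot.

    The link target is taken as the entity and the anchor text as the synonym.
    snapshot_time is the timestamp of the article revision being processed.
    """
    triples = []
    for match in WIKI_LINK.finditer(wikitext):
        target = match.group(1).strip()
        anchor = (match.group(2) or target).strip()
        entity = target.replace(" ", "_")
        triples.append((entity, anchor, snapshot_time))
    return triples

if __name__ == "__main__":
    # Made-up sentence that reproduces the example triple from the slide.
    text = "[[President of the United States|George W. Bush]] visited [[Norway]]."
    for triple in entity_synonym_pairs(text, "11/2004"):
        print(triple)
```

The first triple printed corresponds to the slide's example: entity President_of_the_United_States, synonym George W. Bush, time 11/2004.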

34 Initial results
Time periods are not accurate
Note: the times of synonyms are the timestamps of Wikipedia articles (8 years)

35 Enhancement using NYT
Analyze the NYT Corpus to discover more accurate time periods
  20-year time span
Use the burst detection algorithm [Kleinberg 2003]
  Time periods of synonyms = burst intervals

36-38 Query expansion (QUEST Demo)
1. A user enters an entity as a query
2. The system retrieves synonyms with respect to the query
3. The user selects synonyms to expand the query
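The expansion step can be sketched as a lookup of synonyms whose time period overlaps the period the user is searching. Everything in the block is hypothetical: the synonym table, its year intervals (invented for illustration, not taken from the thesis), and the OR-query syntax. In the demo the user picks the synonyms by hand; the automatic overlap filter below only stands in for that choice.

```python
def periods_overlap(start_a, end_a, start_b, end_b):
    """True if the closed intervals [start_a, end_a] and [start_b, end_b] intersect."""
    return start_a <= end_b and start_b <= end_a

def expand_query(entity, search_period, synonym_table):
    """Expand an entity query with synonyms valid during the search period.

    synonym_table maps an entity to a list of (synonym, (start, end)) entries.
    """
    start, end = search_period
    terms = [entity]
    for synonym, (syn_start, syn_end) in synonym_table.get(entity, []):
        if periods_overlap(start, end, syn_start, syn_end) and synonym not in terms:
            terms.append(synonym)
    return " OR ".join(f'"{term}"' for term in terms)

if __name__ == "__main__":
    # Illustrative table; the year intervals are made up for this sketch.
    table = {
        "Pope Benedict XVI": [
            ("Joseph Alois Ratzinger", (1990, 2005)),
            ("Benedict XVI", (2005, 2012)),
        ]
    }
    print(expand_query("Pope Benedict XVI", (1997, 2004), table))
```

For a search period before 2005, only the earlier name is added, which matches Scenario 1 from the problem statement.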

39 Experiments
Part 1 - Synonym detection
  Collection
    The whole history of English Wikipedia
    All pages and revisions, 03/2001 to 03/ month snapshots, about 2.8 terabytes
  Result
    500 randomly selected entity-synonym relationships were evaluated
    Accuracy 51% for all types of entities
    Accuracy 73% for people, organizations, and companies
Part 2 - Query expansion
  Collection
    TREC Robust2004 Track (250 queries)
    NewsLibrary.com, over 100M U.S. news articles (20 temporal queries)
  Result
    Baseline: probabilistic model without query expansion
    Query expansion (QE) significantly improves effectiveness over the baseline for both collections
Open issues
  Only the name changes of famous persons can be discovered

40-41 Query prediction problems
Two problems are addressed
1. Performance prediction
  Predict the retrieval effectiveness with respect to a ranking model
  [Figure: for a query, predict precision = ?, recall = ?, MAP = ?]
2. Ranking prediction
  Predict the ranking model that is most suitable
  [Figure: for a query, predict the ranking that gives max(precision), max(recall), max(MAP)]

42 RQ4: Query performance prediction
Problem statement
  Predict the effectiveness (e.g., MAP) that a query will achieve, in advance of or during retrieval [Hauff 2010]
  High MAP indicates a good query; low MAP indicates a poor query
Objective
  Apply query enhancement techniques to improve the overall performance
  Query suggestion is applied to poor queries
To the best of our knowledge, predicting the performance of temporal queries has never been done before

43 Discussion
Contributions
  First study of performance prediction for temporal queries
  Propose 10 time-based pre-retrieval predictors
  Both text and time are considered
Experiment
  Collection: NYT Corpus and 40 temporal queries [Berberich 2010]
Results
  Time-based predictors outperform keyword-based predictors
  Combined predictors outperform single predictors in most cases
Open issues
  Increase the number of queries
  Consider time uncertainty
Nattiya Kanhabua and Kjetil Nørvåg, Time-based Query Performance Predictors (poster), In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)

44 RQ5: Time-aware ranking prediction
Problem statement
  Two time dimensions: publication time and content time
    Content time = temporal expressions mentioned in documents
  For temporal queries, effectiveness differs depending on whether ranking uses publication time or content time

45 Discussion
Contributions
  First study of the impact on effectiveness of ranking models using the two time dimensions
  Three features from analyzing top-k documents
    Temporal KL-divergence [Diaz 2004]
    Content clarity [Cronen-Townsend 2002]
    Divergence of retrieval scores [Peng 2010]
Results
  A small number of top-k documents achieves better performance
  The larger k is, the more irrelevant documents are introduced into the analysis
Open issue
  Compared with the optimal case, there is still room for improvement
Nattiya Kanhabua, Klaus Berberich and Kjetil Nørvåg, Time-aware Ranking Prediction (under submission)
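Of the three features, temporal KL-divergence is the easiest to illustrate: it compares the distribution of timestamps among the top-k documents with the distribution over the whole collection. The sketch below uses unweighted year counts as a simplifying assumption; the formulation in [Diaz 2004] may weight documents differently, and the function name is hypothetical.

```python
import math
from collections import Counter

def temporal_kl(top_k_years, collection_years):
    """KL divergence between the top-k and collection timestamp distributions.

    Both arguments are lists of publication years. A high value means the
    top-k documents are concentrated in time relative to the collection.
    """
    top_counts = Counter(top_k_years)
    coll_counts = Counter(collection_years)
    top_total = sum(top_counts.values())
    coll_total = sum(coll_counts.values())
    divergence = 0.0
    for year, count in top_counts.items():
        if coll_counts[year] == 0:
            # The top-k documents are assumed to come from the collection.
            raise ValueError(f"year {year} does not occur in the collection")
        p_query = count / top_total
        p_collection = coll_counts[year] / coll_total
        divergence += p_query * math.log(p_query / p_collection)
    return divergence

if __name__ == "__main__":
    collection = [2001, 2002, 2003, 2004] * 5
    print(temporal_kl([2004] * 5, collection))   # results concentrated in one year
    print(temporal_kl(collection, collection))   # same shape as the collection
```

A query whose top-k results cluster in one year scores high, while a query whose results follow the collection's own distribution scores zero.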

46 PART III - RETRIEVAL AND RANKING MODELS

47 RQ6: Time-aware ranking models
Problem statements
  Time must be explicitly modeled in order to increase effectiveness
  Time uncertainty should be taken into account
    Two temporal expressions can refer to the same time period even though they are written differently
Example
  Given the query Independence Day 2011, a retrieval model relying on term matching will fail to retrieve documents mentioning July 4, 2011
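The Independence Day example can be made concrete by mapping temporal expressions to time intervals and comparing the intervals instead of the words. This is only a sketch under stated assumptions: the tiny holiday table assumes the U.S. holiday on July 4, the parser covers just three expression shapes, and all names are hypothetical rather than part of the ranking methods compared in the thesis.

```python
from datetime import date, datetime

# Assumption: "Independence Day" means the U.S. holiday on July 4.
NAMED_DAYS = {"independence day": (7, 4)}

def to_interval(expression):
    """Map a temporal expression to a (start, end) pair of dates."""
    text = expression.strip()
    parts = text.rsplit(" ", 1)
    # Shape 1: a named day followed by a year, e.g. "Independence Day 2011".
    if len(parts) == 2 and parts[1].isdigit() and parts[0].lower() in NAMED_DAYS:
        month, day = NAMED_DAYS[parts[0].lower()]
        exact = date(int(parts[1]), month, day)
        return (exact, exact)
    # Shape 2: a full date, e.g. "July 4, 2011".
    try:
        exact = datetime.strptime(text, "%B %d, %Y").date()
        return (exact, exact)
    except ValueError:
        pass
    # Shape 3: a bare year, e.g. "2011", which covers the whole year.
    if text.isdigit():
        year = int(text)
        return (date(year, 1, 1), date(year, 12, 31))
    raise ValueError(f"unsupported temporal expression: {expression!r}")

def intervals_overlap(a, b):
    """True if two (start, end) date intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

if __name__ == "__main__":
    query_time = to_interval("Independence Day 2011")
    doc_time = to_interval("July 4, 2011")
    print(query_time == doc_time, intervals_overlap(query_time, doc_time))
```

Once both expressions are normalized, the query and the document match on time even though they share no date terms.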

48 Discussion
Contributions
  Analyze and compare five ranking methods
Experiment
  Collection: NYT Corpus and 40 temporal queries [Berberich 2010]
Result
  TSU outperforms the other methods significantly for most metrics
Conclusions
  Although TSU performs best, its use is limited for document collections with no time metadata
  LMT and LMTU can be applied to any collection without time metadata, but extraction of temporal expressions is needed
Nattiya Kanhabua and Kjetil Nørvåg, A Comparison of Time-aware Ranking Methods (poster), In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)

49 RQ7: Ranking related news predictions
Problem statement
  Can the combination of time and other features help improve retrieval effectiveness?
A new task called ranking related news predictions
  Retrieve predictions related to a news story in news archives
  Rank them according to their relevance to the news story

50 Related news predictions

51 Contributions
Define the task of ranking related news predictions
  Searching the future was proposed in [Baeza-Yates 2005]
Propose four classes of features
  Term similarity, entity-based similarity, topic similarity, and temporal similarity
Rank predictions using learning-to-rank [Liu 2009]
Make available a dataset with over 6000 judgments
Nattiya Kanhabua, Roi Blanco and Michael Matthews, Ranking Related News Predictions, In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)

52 Experiments
NYT Corpus
  More than 25% of documents contain at least one prediction
Feature analysis
  Topic features play an important role in ranking
  The top-5 features with the lowest weights are entity-based features
Open issues
  Extract predictions from other sources, e.g., Wikipedia, blogs, comments, etc.
  Sentiment analysis for future-related information

53 Conclusions
Solutions to all research questions:
Part I - Content Analysis
  RQ1: How to determine the time of non-timestamped documents?
Part II - Query Analysis
  RQ2: How to determine the time of queries?
  RQ3: How to handle terminology changes over time?
  RQ4: How to predict the effectiveness of temporal queries?
  RQ5: How to predict a suitable time-aware ranking?
Part III - Retrieval and Ranking Models
  RQ6: How to model time in retrieval and ranking?
  RQ7: How to combine different features and time for ranking?

54 Publications
Nattiya Kanhabua and Kjetil Nørvåg, Improving Temporal Language Models For Determining Time of Non-Timestamped Documents, In Proceedings of European Conference on Research and Advanced Technology for Digital Libraries (ECDL)
Nattiya Kanhabua and Kjetil Nørvåg, Using temporal language models for document dating, In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2009
Nattiya Kanhabua and Kjetil Nørvåg, Determining Time of Queries for Re-ranking Search Results, In Proceedings of the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL)
Nattiya Kanhabua and Kjetil Nørvåg, Exploiting Time-based Synonyms in Searching Document Archives, In Proceedings of the ACM/IEEE Conference on Digital Libraries (JCDL)
Nattiya Kanhabua and Kjetil Nørvåg, QUEST: query expansion using synonyms over time, In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD)
Nattiya Kanhabua and Kjetil Nørvåg, Time-based Query Performance Predictors (poster), In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)
Nattiya Kanhabua and Kjetil Nørvåg, A Comparison of Time-aware Ranking Methods (poster), In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)
Nattiya Kanhabua, Roi Blanco and Michael Matthews, Ranking Related News Predictions, In Proceedings of the 34th Annual ACM SIGIR Conference (SIGIR)
Nattiya Kanhabua, Klaus Berberich and Kjetil Nørvåg, Time-aware Ranking Prediction, Technical Report

55 References
[Baeza-Yates 2005] R. A. Baeza-Yates. Searching the future. In Proceedings of SIGIR workshop on mathematical/formal methods in information retrieval MF/IR, SIGIR '05, 2005.
[Berberich 2010] K. Berberich, S. J. Bedathur, O. Alonso, and G. Weikum. A language modeling approach for temporal information needs. In Proceedings of the 32nd European Conference on IR Research on Advances in Information Retrieval, ECIR '10, 2010.
[Cronen-Townsend 2002] S. Cronen-Townsend, Y. Zhou, and W. B. Croft. Predicting query performance. In Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '02, 2002.
[Diaz 2004] F. Diaz and R. Jones. Using temporal profiles of queries for precision prediction. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '04, 2004.
[Hauff 2010] C. Hauff, L. Azzopardi, D. Hiemstra, and F. de Jong. Query performance prediction: Evaluation contrasted with effectiveness. In Proceedings of the 32nd European Conference on IR Research on Advances in Information Retrieval, ECIR '10, April 2010.
[de Jong 2005] F. de Jong, H. Rode, and D. Hiemstra. Temporal language models for the disclosure of historical text. In Humanities, computers and cultural heritage: Proceedings of the 16th International Conference of the Association for History and Computing, AHC '05, 2005.
[Kleinberg 2003] J. Kleinberg. Bursty and hierarchical structure in streams. Data Min. Knowl. Discov., 7, October 2003.
[Liu 2009] T-Y. Liu. Learning to rank for information retrieval. Found. Trends Inf. Retr., 3(3), March 2009.
[Metzler 2009] D. Metzler, R. Jones, F. Peng, and R. Zhang. Improving search relevance for implicitly temporal queries. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, SIGIR '09, 2009.
[Nunes 2008] S. Nunes, C. Ribeiro, and G. David. Use of temporal expressions in web search. In Proceedings of the 30th European Conference on IR Research on Advances in Information Retrieval, ECIR '08, 2008.
[Peng 2010] J. Peng, C. Macdonald, and I. Ounis. Learning to select a ranking function. In Proceedings of the 32nd European Conference on IR Research on Advances in Information Retrieval, ECIR '10, 2010.

56 Thank you


More information

Information Retrieval

Information Retrieval Information Retrieval CSC 375, Fall 2016 An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have

More information

A Survey of Temporal Web Search Experience

A Survey of Temporal Web Search Experience A Survey of Temporal Web Search Experience Hideo Joho Faculty of Library, Information and Media Science / Research Center for Knowledge Communities, University of Tsukuba, Japan hideo@slis.tsukuba.ac.jp

More information

Using Web Snippets and Web Query-logs to Measure Implicit Temporal Intents in Queries

Using Web Snippets and Web Query-logs to Measure Implicit Temporal Intents in Queries Using Web Snippets and Web Query-logs to Measure Implicit Temporal Intents in Queries Ricardo Campos, Mário Jorge Alípio, Gaël Dias To cite this version: Ricardo Campos, Mário Jorge Alípio, Gaël Dias.

More information

GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries

GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries Ricardo Campos 1,2,6, Gaël Dias 4, Alípio Mário Jorge 1,3, and Célia Nunes 5, 6 1 LIAAD INESC TEC 2 Polytechnic Institute of Tomar,

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Diversifying Query Suggestions Based on Query Documents

Diversifying Query Suggestions Based on Query Documents Diversifying Query Suggestions Based on Query Documents Youngho Kim University of Massachusetts Amherst yhkim@cs.umass.edu W. Bruce Croft University of Massachusetts Amherst croft@cs.umass.edu ABSTRACT

More information

A Practical Passage-based Approach for Chinese Document Retrieval

A Practical Passage-based Approach for Chinese Document Retrieval A Practical Passage-based Approach for Chinese Document Retrieval Szu-Yuan Chi 1, Chung-Li Hsiao 1, Lee-Feng Chien 1,2 1. Department of Information Management, National Taiwan University 2. Institute of

More information

Inferring User Search for Feedback Sessions

Inferring User Search for Feedback Sessions Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department

More information

Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track

Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track Jeffrey Dalton University of Massachusetts, Amherst jdalton@cs.umass.edu Laura

More information

Chapter 5 5 Performance prediction in Information Retrieval

Chapter 5 5 Performance prediction in Information Retrieval Chapter 5 5 Performance prediction in Information Retrieval Information retrieval performance prediction has been mostly addressed as a query performance issue, which refers to the performance of an information

More information

{dhgupta, jannik.stroetgen, Abstract

{dhgupta, jannik.stroetgen, Abstract Dhruv Gupta 1,2, Jannik Strötgen 1, and Klaus Berberich 1,3 1 Max Planck Institute for Informatics, Saarbrücken, Germany 2 Saarbrücken Graduate School of Compute Science, Saarbrücken, Germany 3 htw saar,

More information

A BELIEF NETWORK MODEL FOR EXPERT SEARCH

A BELIEF NETWORK MODEL FOR EXPERT SEARCH A BELIEF NETWORK MODEL FOR EXPERT SEARCH Craig Macdonald, Iadh Ounis Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK craigm@dcs.gla.ac.uk, ounis@dcs.gla.ac.uk Keywords: Expert

More information

Predicting Query Performance via Classification

Predicting Query Performance via Classification Predicting Query Performance via Classification Kevyn Collins-Thompson Paul N. Bennett Microsoft Research 1 Microsoft Way Redmond, WA USA 98052 {kevynct,paul.n.bennett}@microsoft.com Abstract. We investigate

More information

Verbose Query Reduction by Learning to Rank for Social Book Search Track

Verbose Query Reduction by Learning to Rank for Social Book Search Track Verbose Query Reduction by Learning to Rank for Social Book Search Track Messaoud CHAA 1,2, Omar NOUALI 1, Patrice BELLOT 3 1 Research Center on Scientific and Technical Information 05 rue des 03 frères

More information

Experiments on Related Entity Finding Track at TREC 2009 Qing Yang,Peng Jiang, Chunxia Zhang, Zhendong Niu

Experiments on Related Entity Finding Track at TREC 2009 Qing Yang,Peng Jiang, Chunxia Zhang, Zhendong Niu Experiments on Related Entity Finding Track at TREC 2009 Qing Yang,Peng Jiang, Chunxia Zhang, Zhendong Niu School of Computer, Beijing Institute of Technology { yangqing2005,jp, cxzhang, zniu}@bit.edu.cn

More information

Processing Structural Constraints

Processing Structural Constraints SYNONYMS None Processing Structural Constraints Andrew Trotman Department of Computer Science University of Otago Dunedin New Zealand DEFINITION When searching unstructured plain-text the user is limited

More information

Temporal Information Retrieval

Temporal Information Retrieval Foundations and Trends R in Information Retrieval Vol. 9, No. 2 (2015) 91 208 c 2015 N. Kanhabua, R. Blanco, and K. Nørvåg DOI: 10.1561/1500000043 Temporal Information Retrieval Nattiya Kanhabua L3S Research

More information

External Query Reformulation for Text-based Image Retrieval

External Query Reformulation for Text-based Image Retrieval External Query Reformulation for Text-based Image Retrieval Jinming Min and Gareth J. F. Jones Centre for Next Generation Localisation School of Computing, Dublin City University Dublin 9, Ireland {jmin,gjones}@computing.dcu.ie

More information

TempWeb rd Temporal Web Analytics Workshop

TempWeb rd Temporal Web Analytics Workshop TempWeb 2013 3 rd Temporal Web Analytics Workshop Stuff happens continuously: exploring Web contents with temporal information Omar Alonso Microsoft 13 May 2013 Disclaimer The views, opinions, positions,

More information

Metric Spaces for Temporal Information Retrieval

Metric Spaces for Temporal Information Retrieval Metric Spaces for Temporal Information Retrieval Matteo Brucato1, Danilo Montesi2 1 University of Massachusetts Amherst, USA 2 University of Bologna, Italy Presented by: Matteo Brucato matteo@cs.umass.edu

More information

Retrieval and Feedback Models for Blog Distillation

Retrieval and Feedback Models for Blog Distillation Retrieval and Feedback Models for Blog Distillation Jonathan Elsas, Jaime Arguello, Jamie Callan, Jaime Carbonell Language Technologies Institute, School of Computer Science, Carnegie Mellon University

More information

Web Spam. Seminar: Future Of Web Search. Know Your Neighbors: Web Spam Detection using the Web Topology

Web Spam. Seminar: Future Of Web Search. Know Your Neighbors: Web Spam Detection using the Web Topology Seminar: Future Of Web Search University of Saarland Web Spam Know Your Neighbors: Web Spam Detection using the Web Topology Presenter: Sadia Masood Tutor : Klaus Berberich Date : 17-Jan-2008 The Agenda

More information

Understanding the use of Temporal Expressions on Persian Web Search

Understanding the use of Temporal Expressions on Persian Web Search Understanding the use of Temporal Expressions on Persian Web Search Behrooz Mansouri Mohammad Zahedi Ricardo Campos Mojgan Farhoodi Alireza Yari Ricardo Campos TempWeb 2018 @ WWW Lyon, France, Apr 23,

More information

Sentiment analysis under temporal shift

Sentiment analysis under temporal shift Sentiment analysis under temporal shift Jan Lukes and Anders Søgaard Dpt. of Computer Science University of Copenhagen Copenhagen, Denmark smx262@alumni.ku.dk Abstract Sentiment analysis models often rely

More information

Mining the Web for Multimedia-based Enriching

Mining the Web for Multimedia-based Enriching Mining the Web for Multimedia-based Enriching Mathilde Sahuguet and Benoit Huet Eurecom, Sophia-Antipolis, France Abstract. As the amount of social media shared on the Internet grows increasingly, it becomes

More information

RSDC 09: Tag Recommendation Using Keywords and Association Rules

RSDC 09: Tag Recommendation Using Keywords and Association Rules RSDC 09: Tag Recommendation Using Keywords and Association Rules Jian Wang, Liangjie Hong and Brian D. Davison Department of Computer Science and Engineering Lehigh University, Bethlehem, PA 18015 USA

More information

Combining fields for query expansion and adaptive query expansion

Combining fields for query expansion and adaptive query expansion Information Processing and Management 43 (2007) 1294 1307 www.elsevier.com/locate/infoproman Combining fields for query expansion and adaptive query expansion Ben He *, Iadh Ounis Department of Computing

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Developing Focused Crawlers for Genre Specific Search Engines

Developing Focused Crawlers for Genre Specific Search Engines Developing Focused Crawlers for Genre Specific Search Engines Nikhil Priyatam Thesis Advisor: Prof. Vasudeva Varma IIIT Hyderabad July 7, 2014 Examples of Genre Specific Search Engines MedlinePlus Naukri.com

More information

Leveraging Temporal Query-Term Dependency for Time-Aware Information Access

Leveraging Temporal Query-Term Dependency for Time-Aware Information Access Leveraging Temporal -Term Dependency for Time-Aware Information Access Bilel Moulahi, Lynda Tamine and Sadok Ben Yahia IRIT, University of Toulouse, France Faculty of Science of Tunis, University of Tunis

More information

Automatic Generation of Query Sessions using Text Segmentation

Automatic Generation of Query Sessions using Text Segmentation Automatic Generation of Query Sessions using Text Segmentation Debasis Ganguly, Johannes Leveling, and Gareth J.F. Jones CNGL, School of Computing, Dublin City University, Dublin-9, Ireland {dganguly,

More information

WebSci and Learning to Rank for IR

WebSci and Learning to Rank for IR WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles

More information

The Role of Language Evolution in Digital Archives

The Role of Language Evolution in Digital Archives The Role of Language Evolution in Digital Archives Thomas Risse, Helge Holzmann (L3S) Nina Tahmasebi, Uni Gothenburg L3S Research Center Web Science @ L3S Preserving, understanding and shaping the Web

More information

Open Research Online The Open University s repository of research publications and other research outputs

Open Research Online The Open University s repository of research publications and other research outputs Open Research Online The Open University s repository of research publications and other research outputs A Study of Document Weight Smoothness in Pseudo Relevance Feedback Conference or Workshop Item

More information

IITH at CLEF 2017: Finding Relevant Tweets for Cultural Events

IITH at CLEF 2017: Finding Relevant Tweets for Cultural Events IITH at CLEF 2017: Finding Relevant Tweets for Cultural Events Sreekanth Madisetty and Maunendra Sankar Desarkar Department of CSE, IIT Hyderabad, Hyderabad, India {cs15resch11006, maunendra}@iith.ac.in

More information

CADIAL Search Engine at INEX

CADIAL Search Engine at INEX CADIAL Search Engine at INEX Jure Mijić 1, Marie-Francine Moens 2, and Bojana Dalbelo Bašić 1 1 Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia {jure.mijic,bojana.dalbelo}@fer.hr

More information

NTCIR-11 Temporalia. Temporal Information Access

NTCIR-11 Temporalia. Temporal Information Access NTCIR-11 Temporalia Temporal Information Access Hideo Joho (Univ. of Tsukuba) Adam Jatowt (Kyoto Univ.) Roi Blanco (Yahoo! Research) Hajime Naka (Univ. of Tsukuba) Shuhei Yamamoto (Univ. of Tsukuba) https://sites.google.com/site/ntcirtemporalia

More information

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15 ISSN

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15  ISSN DIVERSIFIED DATASET EXPLORATION BASED ON USEFULNESS SCORE Geetanjali Mohite 1, Prof. Gauri Rao 2 1 Student, Department of Computer Engineering, B.V.D.U.C.O.E, Pune, Maharashtra, India 2 Associate Professor,

More information

Estimating Embedding Vectors for Queries

Estimating Embedding Vectors for Queries Estimating Embedding Vectors for Queries Hamed Zamani Center for Intelligent Information Retrieval College of Information and Computer Sciences University of Massachusetts Amherst Amherst, MA 01003 zamani@cs.umass.edu

More information

Telling Experts from Spammers Expertise Ranking in Folksonomies

Telling Experts from Spammers Expertise Ranking in Folksonomies 32 nd Annual ACM SIGIR 09 Boston, USA, Jul 19-23 2009 Telling Experts from Spammers Expertise Ranking in Folksonomies Michael G. Noll (Albert) Ching-Man Au Yeung Christoph Meinel Nicholas Gibbins Nigel

More information

Query Reformulation for Clinical Decision Support Search

Query Reformulation for Clinical Decision Support Search Query Reformulation for Clinical Decision Support Search Luca Soldaini, Arman Cohan, Andrew Yates, Nazli Goharian, Ophir Frieder Information Retrieval Lab Computer Science Department Georgetown University

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Information Retrieval Potsdam, 14 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the course book Outline 2 1 Introduction 2 Indexing Block Document

More information

An Improvement of Search Results Access by Designing a Search Engine Result Page with a Clustering Technique

An Improvement of Search Results Access by Designing a Search Engine Result Page with a Clustering Technique An Improvement of Search Results Access by Designing a Search Engine Result Page with a Clustering Technique 60 2 Within-Subjects Design Counter Balancing Learning Effect 1 [1 [2www.worldwidewebsize.com

More information

INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM

INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM INCORPORATING SYNONYMS INTO SNIPPET BASED QUERY RECOMMENDATION SYSTEM Megha R. Sisode and Ujwala M. Patil Department of Computer Engineering, R. C. Patel Institute of Technology, Shirpur, Maharashtra,

More information

arxiv: v1 [cs.dl] 28 Jan 2017

arxiv: v1 [cs.dl] 28 Jan 2017 How to Search the Internet Archive Without Indexing It Nattiya Kanhabua 1, Philipp Kemkes 2, Wolfgang Nejdl 2 Tu Ngoc Nguyen 2, Felipe Reis 2, Nam Khanh Tran 2 1 Department of Computer Science, Aalborg

More information

A Document-centered Approach to a Natural Language Music Search Engine

A Document-centered Approach to a Natural Language Music Search Engine A Document-centered Approach to a Natural Language Music Search Engine Peter Knees, Tim Pohle, Markus Schedl, Dominik Schnitzer, and Klaus Seyerlehner Dept. of Computational Perception, Johannes Kepler

More information

Learning to Reweight Terms with Distributed Representations

Learning to Reweight Terms with Distributed Representations Learning to Reweight Terms with Distributed Representations School of Computer Science Carnegie Mellon University August 12, 215 Outline Goal: Assign weights to query terms for better retrieval results

More information

Extracting Visual Snippets for Query Suggestion in Collaborative Web Search

Extracting Visual Snippets for Query Suggestion in Collaborative Web Search Extracting Visual Snippets for Query Suggestion in Collaborative Web Search Hannarin Kruajirayu, Teerapong Leelanupab Knowledge Management and Knowledge Engineering Laboratory Faculty of Information Technology

More information

Event-based Time Label Propagation for Automatic Dating of News Articles

Event-based Time Label Propagation for Automatic Dating of News Articles Event-based Time Label Propagation for Automatic Dating of News Articles Tao Ge Baobao Chang Sujian Li Zhifang Sui Key Laboratory of Computational Linguistics, Ministry of Education School of Electronics

More information

Tools and Infrastructure for Supporting Enterprise Knowledge Graphs

Tools and Infrastructure for Supporting Enterprise Knowledge Graphs Tools and Infrastructure for Supporting Enterprise Knowledge Graphs Sumit Bhatia, Nidhi Rajshree, Anshu Jain, and Nitish Aggarwal IBM Research sumitbhatia@in.ibm.com, {nidhi.rajshree,anshu.n.jain}@us.ibm.com,nitish.aggarwal@ibm.com

More information

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Walid Magdy, Gareth J.F. Jones Centre for Next Generation Localisation School of Computing Dublin City University,

More information

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22 Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion

More information

Blog Site Search Using Resource Selection

Blog Site Search Using Resource Selection Blog Site Search Using Resource Selection Jangwon Seo jangwon@cs.umass.edu Center for Intelligent Information Retrieval Department of Computer Science University of Massachusetts, Amherst Amherst, MA 01003

More information

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments Hui Fang ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign Abstract In this paper, we report

More information

Reducing Redundancy with Anchor Text and Spam Priors

Reducing Redundancy with Anchor Text and Spam Priors Reducing Redundancy with Anchor Text and Spam Priors Marijn Koolen 1 Jaap Kamps 1,2 1 Archives and Information Studies, Faculty of Humanities, University of Amsterdam 2 ISLA, Informatics Institute, University

More information

EFFICIENT APPROACH FOR DETECTING HARD KEYWORD QUERIES WITH MULTI-LEVEL NOISE GENERATION

EFFICIENT APPROACH FOR DETECTING HARD KEYWORD QUERIES WITH MULTI-LEVEL NOISE GENERATION EFFICIENT APPROACH FOR DETECTING HARD KEYWORD QUERIES WITH MULTI-LEVEL NOISE GENERATION B.Mohankumar 1, Dr. P. Marikkannu 2, S. Jansi Rani 3, S. Suganya 4 1 3 4Asst Prof, Department of Information Technology,

More information

Federated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16

Federated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16 Federated Search Jaime Arguello INLS 509: Information Retrieval jarguell@email.unc.edu November 21, 2016 Up to this point... Classic information retrieval search from a single centralized index all ueries

More information

University of Glasgow at the Web Track: Dynamic Application of Hyperlink Analysis using the Query Scope

University of Glasgow at the Web Track: Dynamic Application of Hyperlink Analysis using the Query Scope University of Glasgow at the Web Track: Dynamic Application of yperlink Analysis using the uery Scope Vassilis Plachouras 1, Fidel Cacheda 2, Iadh Ounis 1, and Cornelis Joost van Risbergen 1 1 University

More information

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval DCU @ CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval Walid Magdy, Johannes Leveling, Gareth J.F. Jones Centre for Next Generation Localization School of Computing Dublin City University,

More information

BUPT at TREC 2009: Entity Track

BUPT at TREC 2009: Entity Track BUPT at TREC 2009: Entity Track Zhanyi Wang, Dongxin Liu, Weiran Xu, Guang Chen, Jun Guo Pattern Recognition and Intelligent System Lab, Beijing University of Posts and Telecommunications, Beijing, China,

More information

Relevance Models for Topic Detection and Tracking

Relevance Models for Topic Detection and Tracking Relevance Models for Topic Detection and Tracking Victor Lavrenko, James Allan, Edward DeGuzman, Daniel LaFlamme, Veera Pollard, and Steven Thomas Center for Intelligent Information Retrieval Department

More information

ImgSeek: Capturing User s Intent For Internet Image Search

ImgSeek: Capturing User s Intent For Internet Image Search ImgSeek: Capturing User s Intent For Internet Image Search Abstract - Internet image search engines (e.g. Bing Image Search) frequently lean on adjacent text features. It is difficult for them to illustrate

More information