Learning Temporal-Dependent Ranking Models


Learning Temporal-Dependent Ranking Models
Miguel Costa, Francisco Couto, Mário Silva
LaSIGE @ Faculty of Sciences, University of Lisbon; IST/INESC-ID, University of Lisbon
37th Annual ACM SIGIR Conference, Gold Coast, Australia, July 10, 2014

Our Memory is in Digital Form
- E-books
- Web photo galleries
- Forums
- Blogs
- Online newspapers
- Social networks

The Web is Ephemeral
- 50 days: 50% of documents are changed (Cho and Garcia-Molina, 2000)
- 1 year: 80% of documents become inaccessible (Ntoulas, Cho and Olson, 2004)
- 27 months: 13% of web references disappear (http://webcitation.org/, 2007)

Will we face a Digital Dark Age?

2014: Web Archiving Initiatives
- 68+ initiatives in 33 countries
- 534+ billion web contents archived since 1996 (17 PB)

PWA Search System
- Available since 2010: http://archive.pt
- 1.2 billion documents, searchable by full-text and URL, ranging from 1996 to 2013

URL Search

SAPO.PT 1997

Full-text Search: finding the most relevant results among 149.648.512 matches

How to find the best search results for a given query in a Web Archive?
Typical solution: combine a set of proven ranking features using learning-to-rank (L2R) algorithms.

Contributions
We describe how to leverage the temporal dimension of web data by:
1. designing novel ranking features that exploit correlations between archived data and relevance
2. designing a novel ranking framework that learns models considering variations of data over time

Temporal Features

Long-term Document Persistence
- Predominant user information need: navigational
- Query-independent ranking features do not work well:
  - much smaller volume of clicks
  - sparser web-graphs
- We need alternatives: are long-term persistent documents more relevant?
- How to measure persistence?
  - lifespan
  - number of versions

Lifespan & Relevance
[Chart: fraction of documents with a lifespan longer than 1 year (0-100%), by relevance level (not relevant, relevant, very relevant); 14 years of web snapshots analyzed]
Documents with higher relevance tend to have a longer lifespan.

# Versions & Relevance
[Chart: fraction of documents with more than 10 versions (0-100%), by relevance level (not relevant, relevant, very relevant); 14 years of web snapshots analyzed]
Documents with higher relevance tend to have more versions.

Modeling Document Persistence
  f(d) = log_y(x)
Parameters:
- x = #versions (or lifespan) of document d
- y = maximum #versions (or lifespan) of a document in the collection
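The feature above can be sketched in a few lines of Python. This is a minimal sketch under my own naming; the talk only gives the formula f(d) = log_y(x), which maps the most persistent document in the collection to 1.

```python
import math

def persistence_feature(value, collection_max):
    """Persistence feature f(d) = log_y(x): x is the document's #versions
    (or lifespan), y is the collection maximum of that quantity.
    Normalizes into [0, 1]; the most persistent document scores 1.0."""
    if value <= 1 or collection_max <= 1:
        return 0.0  # log is undefined or negative below these bounds
    return math.log(value, collection_max)

# e.g. a document with 50 versions when the collection maximum is 1000
score = persistence_feature(50, 1000)  # ~0.566
```

The guard clause is my addition: the slide does not say how documents with a single version are handled.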

Temporal-Dependent Ranking Models

Temporal-Dependent Ranking
The web has different characteristics over time:
- more sites and pages
- longer contents
- different technologies
- slightly different language
- denser web-graphs
Should we use a single model that fits all data?
No: [Kang & Kim 2003; Geng et al. 2008; Bian et al. 2010]

Temporal Intervals
[Diagram: models M_1, M_2, M_3 trained over a timeline; the slope α sets each period's learning contribution]
- A single model uses all data (does not split data by time).
- Closer periods are more likely to hold similar web characteristics.
- Example: 3 intervals, T = { [t1,t2], ]t2,t3], ]t3,t4] }

Temporal-Dependent Models
A standard L2R model minimizes a loss L over all m training instances:
  model = argmin_f Σ_{i=1..m} L( f(x_i, ω), y_i )
where:
- L = loss function
- x_i = input query-document feature vector
- y_i = relevance label
- ω = parameters
- m = # instances
The temporal-dependent model weights each instance by a temporal weight function Υ:
  TD model = argmin_f Σ_{i=1..m} Υ(x_i, T_k) · L( f(x_i, ω), y_i )
with
  Υ(x_i, T_k) = 1, if x_i ∈ T_k
  Υ(x_i, T_k) = 1 − α · distance(x_i, T_k) / |T|, if x_i ∉ T_k
where α is the slope.
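A minimal sketch of such a temporal weight function, assuming intervals are indexed by position and distance is measured in whole intervals; the names and the exact normalization are illustrative, not taken from the talk.

```python
def temporal_weight(doc_interval, model_interval, n_intervals, alpha):
    """Weight Υ(x_i, T_k): 1.0 for instances inside the model's own
    interval, decaying with temporal distance otherwise; the slope alpha
    controls how fast distant data stops contributing."""
    distance = abs(doc_interval - model_interval)  # in whole intervals
    if distance == 0:
        return 1.0
    return max(0.0, 1.0 - alpha * distance / n_intervals)

# with alpha = 0.5 and 4 intervals, an adjacent interval weighs 0.875
w = temporal_weight(1, 2, 4, 0.5)
```

The clamp to 0.0 is my assumption, so that very distant data never receives a negative weight.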

Global Loss Function
Results of temporal models are sub-optimal and hard to combine. Instead, minimize a global loss function (correlation and overlap between models are considered):
  (model_1, …, model_n) = argmin_{f_1,…,f_n} Σ_{i=1..m} Σ_{j=1..n} Υ(x_i, T_j) · L( f_j(x_i, ω), y_i )
where n = # temporal intervals. Scoring follows the global loss function:
  score(x_i) = Σ_{j=1..n} Υ(x_i, T_j) · f_j(x_i, ω)
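The scoring rule can be sketched as follows; the interval models and weight function here are toy stand-ins for illustration, not the trained RankSVM models from the talk.

```python
def td_score(x, doc_interval, models, weight_fn):
    """Temporal-dependent ensemble score: each interval model f_j
    contributes its prediction, weighted by the temporal weight Υ."""
    return sum(weight_fn(doc_interval, j) * f_j(x)
               for j, f_j in enumerate(models))

# toy usage: two hypothetical interval models and a simple weight
models = [lambda x: 0.2 * x, lambda x: 0.5 * x]
weight = lambda d, j: 1.0 if d == j else 0.5
score = td_score(2.0, 0, models, weight)  # 1.0*0.4 + 0.5*1.0 = 0.9
```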

Experimental Setup

Research Questions
1. Do temporal features extracted from web archives improve web archive IR?
   - Created a L2R dataset.
   - L2R algorithms used: AdaRank, RankSVM, Random Forests.
   - L2R algorithms compared using the dataset with and without temporal features.
2. Does the temporal-dependent ranking framework outperform L2R single models?
   - L2R algorithms used: RankSVM and TD RankSVM.
   - Temporal-dependent models compared with single models.

Dataset for L2R in Web Archives
- 39 608 quadruples <query, version, grade, features>
- 50 queries randomly sampled from logs
- 843 versions assessed on average per query
- 3-level scale of relevance
- 68 ranking features extracted (including temporal)
LETOR file format (relevance grade, query id, features, document version id):
  2 qid:21 1:0.70 2:0.34 3:0.27 ... 68:0.86 # id114746079
  0 qid:22 1:0.05 2:0.18 3:0.14 ... 68:0.43 # id172346033
  1 qid:22 1:0.75 2:0.33 3:0.84 ... 68:0.54 # id456334535
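A line in this format can be parsed with a short helper. This is a sketch; the function name and the returned shape are my own choices, not part of the LETOR specification.

```python
def parse_letor_line(line):
    """Parse one LETOR-format line:
    '<grade> qid:<q> <fid>:<val> ... # <doc id>'.
    Returns (grade, query id, {feature id: value}, document id)."""
    body, _, comment = line.partition('#')   # comment holds the doc id
    tokens = body.split()
    grade = int(tokens[0])
    qid = tokens[1].split(':', 1)[1]
    features = {int(k): float(v) for k, v in
                (t.split(':', 1) for t in tokens[2:])}
    return grade, qid, features, comment.strip()

grade, qid, feats, doc = parse_letor_line(
    "2 qid:21 1:0.70 2:0.34 3:0.27 # id114746079")
```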

Evaluation Methodology
Test collection (based on the Cranfield paradigm):
- Corpus: 6 web collections, 255M contents, 8.9 TB (broad crawls, selective crawls, integrated collections)
- Topics: 50 navigational, with date range (e.g. the page of the Publico newspaper before 2000)
- Relevance judgments: 3 judges, 3-level scale of relevance, 267 822 versions assessed
- Metrics: NDCG@k, P@k (k = 1, 5, 10)
- 5-fold cross-validation: 3 folds for training, 1 for validation, 1 for testing
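NDCG@k, the main metric above, can be sketched as follows, using the common 2^grade − 1 gain and log2 discount; the talk does not specify which gain variant was used, so treat this as one standard formulation.

```python
import math

def dcg_at_k(grades, k):
    """DCG@k over graded relevance labels, gain 2^grade - 1."""
    return sum((2 ** g - 1) / math.log2(i + 2)
               for i, g in enumerate(grades[:k]))

def ndcg_at_k(grades, k):
    """NDCG@k: DCG of the ranking divided by DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0

# a perfect ordering of a 3-level scale scores 1.0
val = ndcg_at_k([2, 1, 0], 3)
```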

Results

Temporal Features vs. Without Temporal Features

             without temporal features          with all 68 features
  Metric    AdaRank  RankSVM  Rand.Forests   AdaRank  RankSVM  Rand.Forests
  NDCG@1     0.380    0.500      0.550        0.400    0.530      0.650
  NDCG@5     0.427    0.485      0.610        0.426    0.546      0.665
  NDCG@10    0.470    0.523      0.650        0.476    0.571      0.688

Temporal features yield a +10% improvement. All results show a statistical significance of p<0.05 with a two-sided paired t-test.

Temporal-dependent models vs. single-models (without temporal features)
[Plot: NDCG@10 (0.46-0.58) vs. number of time intervals (14, 7, 4, 2, 1), using 14 years of web collections; one curve per slope α ∈ {0.25, 0.5, 0.75, 1, 1.25, 1.5}; annotations mark regions where the learning contribution is too large or too small]
The best temporal-dependent models outperform the single model by +5%.

Temporal-dependent models vs. single-models (with temporal features)
[Plot: NDCG@10 (0.51-0.61) vs. number of time intervals (14, 7, 4, 2, 1), using 14 years of web collections; one curve per slope α ∈ {0.25, 0.5, 0.75, 1, 1.25, 1.5}]
The best temporal-dependent models outperform the single model by +3.3%.

Conclusions

Conclusions
The evolution of web data over time can be exploited to improve the ranking of search results:
- by designing novel temporal features: relevant documents tend to have a longer lifespan and more versions;
- by considering time when learning models: a model per period outperforms a single model (combined techniques produce the best results).
Web archives are an excellent source to provide temporal information to web search systems.

Resources
- Public service since 2010: http://archive.pt
- OpenSearch API: http://code.google.com/p/pwa-technologies/wiki/opensearch
- Test collection to support evaluation: https://code.google.com/p/pwa-technologies/wiki/testcollection
- L2R dataset for web archive IR research: http://code.google.com/p/pwa-technologies/wiki/l2r4wair
- All code available under the LGPL license: https://code.google.com/p/pwa-technologies/

Thank you. Questions? migcosta@gmail.com