Modeling Rich Interactions in Session Search: Georgetown University at TREC 2014 Session Track


1 Modeling Rich Interactions in Session Search: Georgetown University at TREC 2014 Session Track
Jiyun Luo, Xuchu Dong and Grace Hui Yang
Department of Computer Science, Georgetown University

2 Introduction
Session search: document retrieval for an entire search session.
The TREC Session Track provides log data that records:
- a sequence of query changes q_1, q_2, ..., q_{n-1}, q_n
- the ranked list for each past query
- clicked-document information and dwell time
TREC 2014 Session Track tasks:
- RL1: using the last query of a session
- RL2: using any information in the current session
- RL3: using information from other sessions
We use ClueWeb12 Category A as our corpus.

3 Outline
- Introduction
- Methods and Approaches
  - Ad-hoc Retrieval Model (Ad-hoc)
  - Query Change Retrieval Model (QCM)
  - Weighted QCM
  - User-Click Model
  - Clustering
  - Session Performance Prediction and Replacement
- Submissions
- Evaluation Results
- Conclusion

4 Ad-hoc Retrieval Model (Ad-hoc)
Multinomial language modeling with Dirichlet smoothing.
The term weight is P(t|d). Its formula and the value of the Dirichlet smoothing parameter μ are not preserved in this transcription.
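The slide's formula is not preserved. The textbook Dirichlet-smoothed estimate is P(t|d) = (tf(t,d) + μ P(t|C)) / (|d| + μ), and the sketch below implements that standard form as a stand-in for whatever the slide showed. The function name, the example document, the collection probability and the value of μ are all illustrative assumptions, not values from the talk.

```python
from collections import Counter


def dirichlet_term_prob(term, doc_tokens, collection_prob, mu):
    """Standard Dirichlet-smoothed P(t|d).

    term            -- the query term t
    doc_tokens      -- the document d as a list of tokens
    collection_prob -- P(t|C), the term's probability in the whole collection
    mu              -- Dirichlet smoothing parameter (value here is hypothetical)
    """
    tf = Counter(doc_tokens)[term]  # raw count of t in d; 0 if t is absent
    return (tf + mu * collection_prob) / (len(doc_tokens) + mu)


# Hypothetical example: an 8-token document and made-up collection statistics.
doc = "hydropower dams change river flow and hydropower output".split()
p_seen = dirichlet_term_prob("hydropower", doc, collection_prob=0.001, mu=2000)
p_unseen = dirichlet_term_prob("solar", doc, collection_prob=0.001, mu=2000)
```

A term that never occurs in the document still gets a non-zero probability from the collection model, which is the point of smoothing.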

5 Query Change Retrieval Model (QCM)
Idea: query change is an important form of user feedback.
Dongyi Guan, Sicong Zhang, and Hui Yang. Utilizing query change for session search. (SIGIR '13).
Query change Δq_i is defined as the syntactic editing changes between two adjacent queries: Δq_i = q_i - q_{i-1}.
- Added terms: +Δq_i
- Removed terms: -Δq_i
- Theme terms: q_theme

Table 1. An example of query change (session 52)
Query                           +Δq_i         -Δq_i         q_theme
q_1 = hydropower efficiency     -             -             -
q_2 = hydropower environment    environment   efficiency    hydropower
q_3 = hydropower damage         damage        environment   hydropower
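The decomposition in Table 1 can be sketched with simple term-set comparisons. This is a simplification: it treats theme terms as the terms shared by two adjacent queries, which reproduces the table's example but is an assumption about how q_theme is derived. The function name is hypothetical.

```python
def query_change(prev_query, curr_query):
    """Split the change between two adjacent queries into
    added terms (+dq), removed terms (-dq) and theme terms (shared terms).

    Assumes whitespace tokenization and exact term matching.
    """
    prev_terms = prev_query.split()
    curr_terms = curr_query.split()
    added = [t for t in curr_terms if t not in prev_terms]
    removed = [t for t in prev_terms if t not in curr_terms]
    theme = [t for t in curr_terms if t in prev_terms]
    return added, removed, theme


# Session 52 from Table 1.
q1 = "hydropower efficiency"
q2 = "hydropower environment"
q3 = "hydropower damage"
change_2 = query_change(q1, q2)
change_3 = query_change(q2, q3)
```

For session 52 this yields "environment" added and "efficiency" removed at q_2, with "hydropower" as the theme term in both steps.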

6 Query Change Retrieval Model (QCM)
The relevance score between one query q_i and a document d is calculated by:
$$\mathrm{Score}(q_i, d) = \log P(q_i \mid d) + \alpha w_{\mathrm{Theme}} - \beta w_{\mathrm{Add,In}} + \epsilon w_{\mathrm{Add,Out}} - \delta w_{\mathrm{Remove}}$$
- log P(q_i | d): current reward/relevance score
- +α w_Theme: increase weights for theme terms
- -β w_Add,In: decrease weights for old added terms
- +ε w_Add,Out: increase weights for novel added terms
- -δ w_Remove: decrease weights for removed terms

7 Query Change Retrieval Model (QCM)
The relevance score between one query q_i and a document d is calculated by:
$$\mathrm{Score}(q_i, d) = \log P(q_i \mid d) + \alpha w_{\mathrm{Theme}} - \beta w_{\mathrm{Add,In}} + \epsilon w_{\mathrm{Add,Out}} - \delta w_{\mathrm{Remove}}$$
The QCM model combines all queries in a session with a discount factor γ:
$$\mathrm{Score}_{qcm}(q_{1..n}, d) = \sum_{i=1}^{n} \gamma^{\,n-i}\, \mathrm{Score}(q_i, d)$$
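A minimal sketch of the two steps on this slide: the per-query score adds the theme and novel-added-term weights and subtracts the old-added-term and removed-term weights, and the session score discounts older queries by γ^(n-i). The slides do not say how the four w terms are computed, so they are passed in as plain numbers here. All parameter values and function names are illustrative assumptions.

```python
def qcm_score(log_p_q_d, w_theme, w_add_in, w_add_out, w_remove,
              alpha, beta, epsilon, delta):
    """Per-query QCM score: reward theme and novel added terms,
    penalize old added terms and removed terms."""
    return (log_p_q_d
            + alpha * w_theme
            - beta * w_add_in
            + epsilon * w_add_out
            - delta * w_remove)


def session_score(per_query_scores, gamma):
    """Combine Score(q_i, d) for i = 1..n with discount gamma^(n - i),
    so the last query gets weight 1 and earlier queries get less."""
    n = len(per_query_scores)
    return sum(gamma ** (n - i) * score
               for i, score in enumerate(per_query_scores, start=1))


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
single = qcm_score(-2.0, w_theme=1.0, w_add_in=0.5, w_add_out=0.25, w_remove=0.5,
                   alpha=2.0, beta=2.0, epsilon=4.0, delta=2.0)
combined = session_score([1.0, 2.0, 3.0], gamma=0.5)
```

With γ = 0.5 and three queries, the weights are 0.25, 0.5 and 1, so the most recent query dominates the combined score.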

8 Weighted QCM
Weighted QCM combines queries based on query quality, which is indicated by user clicks.
- Strong SAT-Click: a clicked document with dwell time >= 30 seconds
- Weak SAT-Click: a clicked document with dwell time >= 10 seconds and < 30 seconds

9 Weighted QCM
Weighted QCM combines queries based on query quality, which is indicated by user clicks.
- Strong SAT-Click: a clicked document with dwell time >= 30 seconds
- Weak SAT-Click: a clicked document with dwell time >= 10 seconds and < 30 seconds
$$\mathrm{Score}_{wqcm}(q_{1..n}, d) = \sum_{q_i \in \mathrm{good}} \mathrm{Score}(q_i, d) + \omega \sum_{q_i \in \mathrm{bad}} \mathrm{Score}(q_i, d)$$
- The good query set: queries bringing at least one SAT-Click, plus the current query
- The bad query set: queries bringing no SAT-Click
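The click thresholds and the good/bad split can be sketched as follows. The weighting shown is one reading of the slide's formula, which is badly garbled in the transcription: good queries contribute their full score and bad queries are scaled by ω. That reading, the function names and the example numbers are assumptions.

```python
def classify_click(dwell_seconds):
    """Return 'strong', 'weak', or None using the slide's dwell-time thresholds."""
    if dwell_seconds >= 30:
        return "strong"
    if dwell_seconds >= 10:
        return "weak"
    return None


def weighted_qcm(query_scores, query_dwells, omega):
    """Sum per-query scores, scaling 'bad' queries by omega.

    query_scores -- Score(q_i, d) for each query in session order
    query_dwells -- for each query, the dwell times (seconds) of its clicks
    A query is 'good' if it is the current (last) query or has any SAT-Click.
    """
    last = len(query_scores) - 1
    total = 0.0
    for i, (score, dwells) in enumerate(zip(query_scores, query_dwells)):
        is_good = i == last or any(classify_click(t) for t in dwells)
        total += score if is_good else omega * score
    return total


# Hypothetical session: query 0 has only a 5 s click (bad), query 1 has a
# 35 s click (good), query 2 is the current query (good by definition).
total = weighted_qcm([1.0, 2.0, 4.0], [[5], [35, 2], []], omega=0.5)
```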

10 User-Click Model
We boost a document's ranking score if it is SAT-Clicked by users.
Session-Level User-Click Model for RL2:
- The final score combines the score from the QCM model with a boost from the session-level user-click model.
- The boost is Ψ points for a Strong SAT-Click and θ points for a Weak SAT-Click, summed over the whole session.
- The boost is normalized to (0,1).
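A sketch of the session-level boost described above: each SAT-Click on a document adds Ψ or θ points, and the per-document totals are then normalized. The slide does not say how the normalization is done, so dividing by the session total is an assumption (it gives values in (0, 1]). The values of Ψ and θ and the function name are also hypothetical.

```python
def session_click_boost(clicks, psi, theta):
    """Compute a normalized click boost per document for one session.

    clicks -- list of (doc_id, dwell_seconds) pairs from the session log
    psi    -- points for a Strong SAT-Click (dwell >= 30 s)
    theta  -- points for a Weak SAT-Click (10 s <= dwell < 30 s)
    """
    raw = {}
    for doc_id, dwell in clicks:
        if dwell >= 30:
            points = psi
        elif dwell >= 10:
            points = theta
        else:
            continue  # not a SAT-Click, no boost
        raw[doc_id] = raw.get(doc_id, 0.0) + points
    total = sum(raw.values())
    # Assumed normalization: share of the session's total points.
    return {doc_id: pts / total for doc_id, pts in raw.items()} if total else {}


# Hypothetical click log: d1 is clicked twice, d3's click is too short to count.
boost = session_click_boost(
    [("d1", 40), ("d2", 12), ("d1", 15), ("d3", 3)], psi=2.0, theta=1.0)
```

How the boost is merged with the QCM score is not recoverable from the transcription, so the sketch stops at the boost itself.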

11 User-Click Model
Topic-Level User-Click Model for RL3:
- The boost comes from the topic-level user-click model.
- It is similar to the session-level user-click model, but the calculation is done over the whole session cluster.
- A session cluster is a set of sessions that share similar search topics.

12 Clustering
Topic ID is not obtainable in real search practice, so we cluster sessions by comparing query similarity:
- Convert all queries in one session to a term vector
- Assign the idf value as the weight of each dimension
- Cluster sessions based on the Euclidean distance between these vectors
We use the K-means clustering algorithm and set K = 60.
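The vectorization and distance steps above can be sketched in plain Python. Two assumptions are made for the sake of a self-contained example: idf is computed over the sessions themselves rather than over the corpus, and each dimension holds the term's idf if the term occurs in the session and 0 otherwise. The K-means step (K = 60 in the talk) would then run on these vectors and is not shown.

```python
import math


def session_vectors(sessions):
    """Turn each session (a list of query strings) into an idf-weighted vector.

    Returns (vocab, vectors); vectors[k][j] is idf(vocab[j]) if session k
    contains that term, else 0.0. idf is computed over sessions (assumption).
    """
    term_sets = [set(t for q in s for t in q.split()) for s in sessions]
    vocab = sorted(set().union(*term_sets))
    n = len(sessions)
    idf = {t: math.log(n / sum(t in ts for ts in term_sets)) for t in vocab}
    vectors = [[idf[t] if t in ts else 0.0 for t in vocab] for ts in term_sets]
    return vocab, vectors


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical sessions: two about hydropower, one about flights.
sessions = [
    ["hydropower efficiency", "hydropower environment"],
    ["hydropower damage"],
    ["cheap flights", "cheap flights boston"],
]
vocab, vectors = session_vectors(sessions)
d_same_topic = euclidean(vectors[0], vectors[1])
d_other_topic = euclidean(vectors[0], vectors[2])
```

The two hydropower sessions end up closer to each other than either is to the flights session, which is the behavior the clustering step relies on.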

13 Session Performance Prediction and Replacement
For sessions that share similar search topics:
- predict their performance
- replace bad sessions' results with good sessions' results
Predicting session performance:
- Extract several features (n) from the sessions
- Rank sessions by a formula over these features (the formula is garbled in this transcription; only the condition "= TRUE" is legible)

14 Session Performance Prediction and Replacement
Table 2. Features extracted for each session (cells marked [garbled] are not recoverable from this transcription)
Feature  Definition
F1       Search intent is comparison
F2       No user click in session s
F3       [garbled] 5s [garbled] time in a session
F4       # of unique terms in session s [operator lost] 20
F5       [garbled]
F6       Session s does not contain the most frequent search term in T(s)
F7       # of unique terms in session s [operator lost] 6
F8       [garbled]
* T(s) means a session cluster including session s

15 Outline
- Introduction
- Methods and Approaches
  - Ad-hoc Retrieval Model (Ad-hoc)
  - Query Change Retrieval Model (QCM)
  - Weighted QCM
  - User-Click Model
  - Clustering
  - Session Performance Prediction and Replacement
- Submissions
- Evaluation Results
- Conclusion

16 Our Submissions
Runs: GUS14RUN1, GUS14RUN2, GUS14RUN3
RL1: Ad-hoc Retrieval Model
RL2:
- GUS14RUN1: Weighted QCM (ω=0.65), Session-Level User-Click Model
- GUS14RUN2: Weighted QCM (ω=0.8), Session-Level User-Click Model
RL3:
- GUS14RUN1: Weighted QCM (ω=0.65), Topic-Level User-Click Model
- GUS14RUN2: Weighted QCM (ω=0.8), Topic-Level User-Click Model
- GUS14RUN3: Weighted QCM (ω=0.8), Topic-Level User-Click Model using topic ids, Session Performance Prediction and Replacement
RL3 in RUN1 and RUN2 uses session clusters based on query similarity. RL3 in RUN3 uses session clusters based on topic id.
Why? Similar queries lead to similar retrieval lists in our system, so query-similarity clusters are not useful when applying the session replacement strategy.

17 Evaluation Results
Results table: columns GUS14RUN1, GUS14RUN2, GUS14RUN3, Max, Med; rows RL1, RL2, RL3 (the values are not preserved in this transcription).
- 2nd rank in task RL1, 1st rank in tasks RL2 and RL3
  - Adjusting term weights based on query change is effective
  - Combining queries in a session is useful for the Session Track
  - User clicks are effective for predicting relevance
- A small performance drop from RL2 to RL3 in RUN1 and RUN2
  - Clustering sessions based on query similarity may work, but it needs more work to refine
- A small increase from RL2 to RL3 in RUN3
  - For sessions sharing the same search topic, replacing poor sessions' results with good sessions' results is practical

18 Conclusion
- Achieved a 20.9% increase from RL1 to RL2 by utilizing
  - query change feedback
  - user click feedback
- Achieved a 4% increase from RL2 to RL3 by
  - Topic-Level User-Click Model
  - session performance prediction and replacement

19 Thanks!
Jiyun Luo, Xuchu Dong and Grace Hui Yang
Department of Computer Science, Georgetown University


More information

A Multiple-stage Approach to Re-ranking Clinical Documents

A Multiple-stage Approach to Re-ranking Clinical Documents A Multiple-stage Approach to Re-ranking Clinical Documents Heung-Seon Oh and Yuchul Jung Information Service Center Korea Institute of Science and Technology Information {ohs, jyc77}@kisti.re.kr Abstract.

More information

Ranking with Query-Dependent Loss for Web Search

Ranking with Query-Dependent Loss for Web Search Ranking with Query-Dependent Loss for Web Search Jiang Bian 1, Tie-Yan Liu 2, Tao Qin 2, Hongyuan Zha 1 Georgia Institute of Technology 1 Microsoft Research Asia 2 Outline Motivation Incorporating Query

More information

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline Relevance Feedback and Query Reformulation Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price IR on the Internet, Spring 2010 1 Outline Query reformulation Sources of relevance

More information

TREC 2015 Dynamic Domain Track Overview

TREC 2015 Dynamic Domain Track Overview TREC 2015 Dynamic Domain Track Overview Hui Yang Department of Computer Science Georgetown University huiyang@cs.georgetown.edu Ian Soboroff NIST ian.soboroff@nist.gov John Frank Diffeo MIT jrf@diffeo.com

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

Document indexing, similarities and retrieval in large scale text collections

Document indexing, similarities and retrieval in large scale text collections Document indexing, similarities and retrieval in large scale text collections Eric Gaussier Univ. Grenoble Alpes - LIG Eric.Gaussier@imag.fr Eric Gaussier Document indexing, similarities & retrieval 1

More information

Information Retrieval

Information Retrieval Information Retrieval Learning to Rank Ilya Markov i.markov@uva.nl University of Amsterdam Ilya Markov i.markov@uva.nl Information Retrieval 1 Course overview Offline Data Acquisition Data Processing Data

More information

Boolean Model. Hongning Wang

Boolean Model. Hongning Wang Boolean Model Hongning Wang CS@UVa Abstraction of search engine architecture Indexed corpus Crawler Ranking procedure Doc Analyzer Doc Representation Query Rep Feedback (Query) Evaluation User Indexer

More information

A Formal Approach to Score Normalization for Meta-search

A Formal Approach to Score Normalization for Meta-search A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003

More information

Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization

Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization Romain Deveaud 1 and Florian Boudin 2 1 LIA - University of Avignon romain.deveaud@univ-avignon.fr

More information

Towards Prac+cal Relevance Ranking for 10 Million Books

Towards Prac+cal Relevance Ranking for 10 Million Books Towards Prac+cal Relevance Ranking for 10 Million Books wwww.hathitrust.orgww.hathit rust.org Tom Burton- West Informa+on Retrieval Programmer Digital Library Produc+on Service University of Michigan Library

More information

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016 Advanced Topics in Information Retrieval Learning to Rank Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR July 14, 2016 Before we start oral exams July 28, the full

More information

From Neural Re-Ranking to Neural Ranking:

From Neural Re-Ranking to Neural Ranking: From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing Hamed Zamani (1), Mostafa Dehghani (2), W. Bruce Croft (1), Erik Learned-Miller (1), and Jaap Kamps (2)

More information

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Effective Latent Space Graph-based Re-ranking Model with Global Consistency Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case

More information

PERSONALIZED TAG RECOMMENDATION

PERSONALIZED TAG RECOMMENDATION PERSONALIZED TAG RECOMMENDATION Ziyu Guan, Xiaofei He, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can Wang Zhejiang University, China Univ. of Illinois/Univ. of Michigan 1 Booming of Social Tagging Applications

More information

Instructor: Stefan Savev

Instructor: Stefan Savev LECTURE 2 What is indexing? Indexing is the process of extracting features (such as word counts) from the documents (in other words: preprocessing the documents). The process ends with putting the information

More information

A Taxonomy of Web Search

A Taxonomy of Web Search A Taxonomy of Web Search by Andrei Broder 1 Overview Ø Motivation Ø Classic model for IR Ø Web-specific Needs Ø Taxonomy of Web Search Ø Evaluation Ø Evolution of Search Engines Ø Conclusions 2 1 Motivation

More information

Keyword search in databases: the power of RDBMS

Keyword search in databases: the power of RDBMS Keyword search in databases: the power of RDBMS 1 Introduc

More information

A Study of Methods for Negative Relevance Feedback

A Study of Methods for Negative Relevance Feedback A Study of Methods for Negative Relevance Feedback Xuanhui Wang University of Illinois at Urbana-Champaign Urbana, IL 61801 xwang20@cs.uiuc.edu Hui Fang The Ohio State University Columbus, OH 43210 hfang@cse.ohiostate.edu

More information

Behavioral Data Mining. Lecture 18 Clustering

Behavioral Data Mining. Lecture 18 Clustering Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval CS276: Informa*on Retrieval and Web Search Pandu Nayak and Prabhakar Raghavan Lecture 12: Clustering Today s Topic: Clustering Document clustering Mo*va*ons Document

More information

Notes: Notes: Primo Ranking Customization

Notes: Notes: Primo Ranking Customization Primo Ranking Customization Hello, and welcome to today s lesson entitled Ranking Customization in Primo. Like most search engines, Primo aims to present results in descending order of relevance, with

More information