Modeling Rich Interactions in Session Search Georgetown University at TREC 2014 Session Track
Transcription
1 Modeling Rich Interactions in Session Search
Georgetown University at TREC 2014 Session Track
Jiyun Luo, Xuchu Dong and Grace Hui Yang
Department of Computer Science, Georgetown University
2 Introduction
- Session search: document retrieval for an entire search session.
- The TREC Session Track provides log data which records:
  - a sequence of query changes q_1, q_2, ..., q_{n-1}, q_n
  - the ranked list for each past query
  - document click information and dwell time
- TREC 2014 Session Track:
  - RL1: using the last query of a session
  - RL2: using any information in the current session
  - RL3: using information from other sessions
- We use ClueWeb12 Category A as our corpus.
3 Outline
- Introduction
- Methods and Approaches
  - Ad-hoc Retrieval Model (Ad-hoc)
  - Query Change Retrieval Model (QCM)
  - Weighted QCM
  - User-Click Model
  - Clustering
  - Session Performance Prediction and Replacement
- Submissions
- Evaluation Results
- Conclusion
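4 Ad-hoc Retrieval Model (Ad-hoc)
- Multinomial language modeling + Dirichlet smoothing.
- Term weight P(t|d), in the standard Dirichlet-smoothed form: $P(t \mid d) = \frac{\#(t, d) + \mu P(t \mid C)}{|d| + \mu}$, where #(t, d) is the count of term t in document d, P(t|C) is the collection language model, and |d| is the document length.
- μ is the Dirichlet smoothing parameter; the value it is set to is missing from the transcription.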
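As an illustration of the ad-hoc model above, here is a minimal Python sketch of Dirichlet-smoothed term weights and the resulting query log-likelihood. The function names, the toy document, the collection probabilities, and the value `mu=2000.0` are all assumptions made for the example; the slide's actual μ is not preserved.

```python
import math
from collections import Counter


def dirichlet_prob(term, doc_tokens, collection_prob, mu):
    """P(t|d) = (count(t, d) + mu * P(t|C)) / (|d| + mu)."""
    counts = Counter(doc_tokens)
    return (counts[term] + mu * collection_prob) / (len(doc_tokens) + mu)


def query_log_likelihood(query_tokens, doc_tokens, collection_probs, mu):
    """log P(q|d) under a multinomial language model: sum of log P(t|d)."""
    return sum(
        math.log(dirichlet_prob(t, doc_tokens, collection_probs[t], mu))
        for t in query_tokens
    )


# Toy data (hypothetical): one six-token document and made-up collection statistics.
doc = "hydropower plants affect the river environment".split()
collection_probs = {"hydropower": 0.001, "environment": 0.002}

# mu=2000.0 is an illustrative value only, not the one used in the talk.
score = query_log_likelihood(["hydropower", "environment"], doc, collection_probs, mu=2000.0)
print(score)
```

A larger μ pulls P(t|d) toward the collection model, which matters most for short documents.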
5 Query Change Retrieval Model (QCM)
- Idea: query change is an important form of user feedback.
  Dongyi Guan, Sicong Zhang, and Hui Yang. Utilizing query change for session search. (SIGIR '13).
- Query change Δq_i is defined as the syntactic editing changes between two adjacent queries: $\Delta q_i = q_i - q_{i-1}$
  - +Δq_i: added terms
  - −Δq_i: removed terms
  - q_theme: theme terms
Table 1. An example of query change (session 52)
  Query                            +Δq_i (added)   −Δq_i (removed)   q_theme
  q_1 = hydropower efficiency      -               -                 -
  q_2 = hydropower environment     environment     efficiency        hydropower
  q_3 = hydropower damage          damage          environment       hydropower
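The query-change decomposition can be sketched in a few lines of Python. This is a simplified illustration: it treats queries as whitespace-separated terms and takes theme terms to be the terms shared by adjacent queries, which is an assumption of the sketch rather than the exact procedure of the cited paper. The function name `query_change` is hypothetical.

```python
def query_change(prev_query, curr_query):
    """Split the change from prev_query to curr_query into
    added terms (+dq), removed terms (-dq), and theme terms."""
    prev_terms = prev_query.split()
    curr_terms = curr_query.split()
    added = [t for t in curr_terms if t not in prev_terms]
    removed = [t for t in prev_terms if t not in curr_terms]
    theme = [t for t in curr_terms if t in prev_terms]
    return added, removed, theme


# Session 52 from Table 1.
queries = ["hydropower efficiency", "hydropower environment", "hydropower damage"]
for prev, curr in zip(queries, queries[1:]):
    added, removed, theme = query_change(prev, curr)
    print(curr, "| added:", added, "| removed:", removed, "| theme:", theme)
```

On the session 52 queries this reproduces the added, removed, and theme terms listed in Table 1.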
6 Query Change Retrieval Model (QCM)
The relevance score between one query q_i and a document d is calculated by:
$Score(q_i, d) = \log P(q_i \mid d) + \alpha W_{Theme} - \beta W_{Add,In} + \epsilon W_{Add,Out} - \delta W_{Remove}$
- log P(q_i|d): current reward/relevance score
- +αW_Theme: increase weights for theme terms
- −βW_Add,In: decrease weights for old added terms
- +εW_Add,Out: increase weights for novel added terms
- −δW_Remove: decrease weights for removed terms
7 Query Change Retrieval Model (QCM)
The relevance score between one query q_i and a document d is calculated by:
$Score(q_i, d) = \log P(q_i \mid d) + \alpha W_{Theme} - \beta W_{Add,In} + \epsilon W_{Add,Out} - \delta W_{Remove}$
The QCM model combines all queries in a session with a discount factor γ:
$Score_{qcm}(q_{1..n}, d) = \sum_{i=1}^{n} \gamma^{n-i} Score(q_i, d)$
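The two scoring steps can be illustrated with a short Python sketch. It assumes the signs implied by the slide annotations (theme and novel added terms are rewarded, old added and removed terms are penalized). All numeric values and the function names `qcm_score` and `session_score` are made up for the example.

```python
def qcm_score(log_p, w_theme, w_add_in, w_add_out, w_remove,
              alpha, beta, epsilon, delta):
    """Score(q_i, d): the log-likelihood adjusted by the four query-change weights."""
    return (log_p
            + alpha * w_theme      # reward theme terms
            - beta * w_add_in      # penalize old added terms
            + epsilon * w_add_out  # reward novel added terms
            - delta * w_remove)    # penalize removed terms


def session_score(per_query_scores, gamma):
    """Score_qcm(q_1..n, d) = sum_i gamma^(n-i) * Score(q_i, d), with i = 1..n."""
    n = len(per_query_scores)
    return sum(gamma ** (n - i) * s
               for i, s in enumerate(per_query_scores, start=1))


# Illustrative numbers only: three queries in a session, gamma = 0.5.
single = qcm_score(-2.0, 1.0, 1.0, 1.0, 1.0,
                   alpha=2.0, beta=1.0, epsilon=0.5, delta=0.25)
combined = session_score([1.0, 2.0, 3.0], gamma=0.5)
print(single, combined)
```

Because the exponent is n − i, the current query q_n always has weight 1 and earlier queries are discounted geometrically.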
8 Weighted QCM
Weighted QCM combines queries based on query quality, which is indicated by user clicks.
- Strong SAT-Click: a clicked document with dwell time >= 30 seconds
- Weak SAT-Click: a clicked document with dwell time >= 10 seconds and < 30 seconds
9 Weighted QCM
Weighted QCM combines queries based on query quality, which is indicated by user clicks.
- Strong SAT-Click: a clicked document with dwell time >= 30 seconds
- Weak SAT-Click: a clicked document with dwell time >= 10 seconds and < 30 seconds
$Score_{wqcm}(q_{1..n}, d) = \sum_{q_i \in good} Score(q_i, d) + \omega \sum_{q_i \in bad} Score(q_i, d)$
- The good query set: queries bringing at least one SAT-Click, plus the current query
- The bad query set: queries bringing no SAT-Click
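A hedged Python sketch of the weighted combination follows. It assumes the combination is the sum of scores over the good query set plus ω times the sum over the bad query set. The per-query scores, dwell times, and the names `click_type` and `weighted_qcm` are hypothetical.

```python
def click_type(dwell_seconds):
    """Classify a click by dwell time using the thresholds from the slide."""
    if dwell_seconds >= 30:
        return "strong"
    if dwell_seconds >= 10:
        return "weak"
    return None  # not a SAT-Click


def weighted_qcm(query_scores, query_dwell_times, omega):
    """Combine per-query scores Score(q_i, d) for one document.

    query_dwell_times[i] lists the dwell times of the clicks made for query i.
    A query is 'good' if it brought at least one SAT-Click or is the current
    (last) query; every other query is 'bad' and is down-weighted by omega.
    """
    last = len(query_scores) - 1
    total = 0.0
    for i, (score, dwells) in enumerate(zip(query_scores, query_dwell_times)):
        has_sat_click = any(click_type(t) is not None for t in dwells)
        if i == last or has_sat_click:
            total += score
        else:
            total += omega * score
    return total


# Illustrative session: query 1 has a strong SAT-Click, query 2 only a 4 s click,
# query 3 is the current query with no clicks yet.
result = weighted_qcm([1.0, 2.0, 3.0], [[35.0], [4.0], []], omega=0.5)
print(result)
```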
10 User-Click Model
We boost a document's ranking score if it is SAT-Clicked by users.
Session Level User-Click Model for RL2:
- The ranking score combines the score from the QCM model with the boost from the session-level User-Click model.
- The boost gives Ψ points for a Strong SAT-Click and θ points for a Weak SAT-Click, summed over the whole session and normalized to (0,1).
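The session-level boost could be sketched as below. The slide does not say how the points are normalized to (0,1) or how the boost is combined with the QCM score, so this sketch assumes a placeholder squashing x / (1 + x) and a simple addition. The values of `psi` and `theta` and all names are hypothetical.

```python
from collections import defaultdict


def session_click_boost(clicks, psi, theta):
    """clicks: iterable of (doc_id, dwell_seconds) pairs for one session.

    Adds psi points per Strong SAT-Click (>= 30 s) and theta points per
    Weak SAT-Click (>= 10 s and < 30 s), then squashes each total with
    x / (1 + x); the squashing is a placeholder for the unspecified normalization.
    """
    points = defaultdict(float)
    for doc_id, dwell in clicks:
        if dwell >= 30:
            points[doc_id] += psi
        elif dwell >= 10:
            points[doc_id] += theta
    return {doc_id: p / (1.0 + p) for doc_id, p in points.items()}


def boosted_score(qcm_score, doc_id, boosts):
    """Add the click boost to the QCM score (additive combination is assumed)."""
    return qcm_score + boosts.get(doc_id, 0.0)


clicks = [("d1", 45), ("d1", 12), ("d2", 5), ("d3", 15)]
boosts = session_click_boost(clicks, psi=2.0, theta=1.0)
print(boosts, boosted_score(-1.0, "d1", boosts))
```

The topic-level variant would run the same accumulation over all sessions in a cluster rather than over a single session.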
11 User-Click Model
Topic Level User-Click Model for RL3:
- The boost comes from the topic-level User-Click model. It is similar to the session-level User-Click model, but the calculation is done for the whole session cluster.
- A session cluster is a set of sessions that share similar search topics.
12 Clustering
Topic ID is not obtainable in real search practice, so we cluster sessions by comparing query similarity:
- Convert all queries in one session to a term vector
- Assign the idf value as the weight of each dimension
- Cluster sessions based on the Euclidean distance between these vectors
We use the K-means clustering algorithm and set K = 60.
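The clustering steps above can be sketched in plain Python. This is a toy illustration under stated assumptions: idf is computed over the sessions themselves (the slide does not say which collection the idf comes from), the K-means initialization simply takes the first k vectors, and the example uses k = 2 rather than the K = 60 of the talk. All names and the sample sessions are hypothetical.

```python
import math


def session_vectors(sessions):
    """sessions: list of sessions, each a list of query strings.
    Returns (vocab, vectors), one idf-weighted term vector per session."""
    term_sets = [{t for q in s for t in q.split()} for s in sessions]
    vocab = sorted(set().union(*term_sets))
    n = len(sessions)
    idf = {t: math.log(n / sum(t in ts for ts in term_sets)) for t in vocab}
    vectors = [[idf[t] if t in ts else 0.0 for t in vocab] for ts in term_sets]
    return vocab, vectors


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(vectors, k, iterations=10):
    """Basic K-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: euclidean(v, centroids[c]))
                  for v in vectors]
        # Update step: mean of the members of each non-empty cluster.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


sessions = [
    ["hydropower efficiency", "hydropower environment"],
    ["french lick resort", "french lick casino"],
    ["hydropower damage"],
    ["french lick hotel"],
]
vocab, vectors = session_vectors(sessions)
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy run the two hydropower sessions end up in one cluster and the two "french lick" sessions in the other.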
13 Session Performance Prediction and Replacement
For sessions that share similar search topics:
- predict their performance
- replace bad sessions' results with good sessions' results
Predict session performance:
- Extract several features (n) from the sessions
- Rank sessions by a scoring formula over these features [the formula is garbled in the transcription; the legible fragments are "= 1", "= TRUE", and an index range ending in n]
14 Session Performance Prediction and Replacement
Table 2. Features extracted for each session (F_1 to F_8). Only part of the definitions is legible in the transcription; the fragments, in their original order, are:
- Search intent is comparison.
- No user-click in session s.
- A condition involving a 5 s threshold (garbled).
- A condition ending in "time in a session" (garbled).
- # of unique terms in session s compared with 20 (comparison operator lost).
- A garbled inequality.
- Session s does not contain the most frequent search term in T(s).
- # of unique terms in session s compared with 6 (comparison operator lost), followed by a garbled inequality.
* T(s) means a session cluster including session s.
16 Our Submissions
Table columns: GUS14RUN1, GUS14RUN2, GUS14RUN3; rows: RL1, RL2, RL3. Cell contents, in transcription order:
- Weighted QCM (ω=0.65) + Session Level User-Click Model
- Weighted QCM (ω=0.65) + Topic Level User-Click Model
- Ad-hoc Retrieval Model
- Weighted QCM (ω=0.8) + Session Level User-Click Model
- Weighted QCM (ω=0.8) + Topic Level User-Click Model
- Weighted QCM (ω=0.8) + Topic Level User-Click Model using topic ids + Session Performance Prediction and Replacement
Notes:
- RL3 in RUN1 and RUN2 uses session clusters based on query similarity.
- RL3 in RUN3 uses session clusters based on topic id. Why? Similar queries lead to similar retrieval lists in our system, so query-similarity clusters are not useful when applying the session replacement strategy.
17 Evaluation Results
[Results table with columns GUS14RUN1, GUS14RUN2, GUS14RUN3, Max, Med and rows RL1, RL2, RL3; the values are missing from the transcription.]
- 2nd rank in task RL1, 1st rank in tasks RL2 and RL3:
  - Adjusting term weights based on query change is effective.
  - Combining queries in a session is useful for the Session Track.
  - User clicks are effective for predicting relevance.
- A small performance drop from RL2 to RL3 in RUN1 and RUN2:
  - Clustering sessions based on query similarity may work, but it needs more work to refine.
- A small increase from RL2 to RL3 in RUN3:
  - For sessions sharing the same search topics, replacing poor sessions' results with good sessions' results is practical.
18 Conclusion
- Achieve a 20.9% increase from RL1 to RL2 by utilizing:
  - query change feedback
  - user click feedback
- Achieve a 4% increase from RL2 to RL3 by:
  - Topic Level User-Click Model
  - Session performance prediction and replacement
19 Thanks!
Jiyun Luo, Xuchu Dong and Grace Hui Yang
Department of Computer Science, Georgetown University