Fractional Similarity: Cross-lingual Feature Selection for Search
- Derek Gilbert
- 6 years ago
Transcription
1 Fractional Similarity: Cross-lingual Feature Selection for Search
- Jagadeesh Jagarlamudi, University of Maryland, College Park, USA
- Joint work with Paul N. Bennett, Microsoft Research, Redmond, USA
2 Using All the Data
- [Diagram: an existing search engine, with its training data (judgments, clicks, etc.), helps a search engine in a new language/market]
- Query and result documents are in the same language
3 Problem Statement
- Improve a foreign-language ranker using English search engine data
  - Query and result documents are in the same language
  - Different from CLIR
- Obtaining relevance judgments is expensive
- Potentially advantageous
  - Increased training data
  - Quality of behavioral data
- Trivial solution may not be optimal
  - Amount of signal carried by features can be different
  - Differences in languages/regions
4 Outline
- Problem Statement
- Approaches and Sub-problems
- Challenges in capturing a feature similarity
- Our Approach
  - Fractional Similarity
  - Cross-lingual Feature Selection
- Experiments
- Conclusion
5 Approaches and sub-problems
1. Using the queries in both languages
   a) Linguistically non-ambiguous queries [Gao et al. 2008], e.g. Harry Potter
   b) Joint Relevance Estimation for bilingual queries [Gao et al. 2009]
   c) Multi-PRF [Chinnakotla et al. ACL 2010, SIGIR 2010]
2. Feature-based Transfer Learning
   - 17% of unique queries are translations
   - Natural approach in Learning to Rank situations
   - Query-document pair as a feature vector, e.g. PageRank, QueryWordsInDoc, etc.
   - Amount of signal carried by a feature can vary based on language
6 Matching feature distributions
- E.g., the feature Norm: estimate its pdf using kernel density estimation
- [Figure: estimated pdfs; axes: normalized Norm feature value and probability]
- Quantify the similarity, and determine when the distributions are different
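The kernel density estimation step on this slide can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `gaussian_kde`, the Gaussian kernel, the normal-reference bandwidth rule, and all feature values are assumptions made for the example.

```python
import math
from statistics import stdev


def gaussian_kde(samples, bandwidth=None):
    """Return a function x -> estimated density, using a Gaussian kernel.

    Assumption: when no bandwidth is given, the normal-reference rule of
    thumb h = 1.06 * sigma * n^(-1/5) is used.
    """
    n = len(samples)
    if bandwidth is None:
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive (are all samples identical?)")
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def pdf(x):
        # Average of Gaussian bumps centred on each observed value.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return pdf


# Hypothetical normalized values of one feature in two languages.
english_values = [0.1, 0.2, 0.2, 0.3, 0.3, 0.3, 0.4, 0.4, 0.5]
german_values = [0.3, 0.4, 0.5, 0.5, 0.6, 0.6, 0.6, 0.7, 0.8]

english_pdf = gaussian_kde(english_values)
german_pdf = gaussian_kde(german_values)

# Evaluate both estimates on a common grid so the curves can be compared.
grid = [i / 10 for i in range(11)]
for x in grid:
    print(f"{x:.1f}  en={english_pdf(x):.3f}  de={german_pdf(x):.3f}")
```

Evaluating both estimates on the same grid is what makes the two curves directly comparable; how to turn that comparison into a similarity score is the subject of the later slides.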
7 Challenges in capturing similarity
- Query-document correlation
  - Consider the feature NoQueryWordsInDocTitle (for the query "Harry Potter")
  - Top-10 candidate result documents are likely to take the same value (of 2)
  - 50 queries with 20 results on average give 1000 training instances
- Query-set variance
  - Different types of queries based on the region / characteristics of the language
  - Can't capture the differences using significance tests
- Normalization of the feature
- Continuous and discrete features
8 Outline
- Problem Statement
- Approaches and Sub-problems
- Challenges in capturing a feature similarity
- Our Approach
  - Fractional Similarity
  - Cross-lingual Feature Selection
- Experiments
- Conclusion
9 Our approach
1. Fractional Similarity
   - Significance tests such as the T-test may not be useful, due to query-set variance
   - Robust method to verify whether two populations have the same mean
   - Making features comparable to each other
2. Cross-lingual Feature Selection
   - Identify features that are similar across languages
   - A direct application compares means of a feature across languages
   - Use of log-likelihood at random points
     - Compare distributions
     - Query-document correlation
10-29 Fractional Similarity (one figure, built up step by step over slides 10 to 29)
- [Figure: a "Sample" and a reference sample ("Sample ref"), followed by "Combined" samples for α = 0, 0.2, 0.4, 0.6, 0.8 and 1.0; α is written here for the mixing fraction, whose symbol is missing from the transcription]
- p-values are computed using the T-test
- frac is computed from the p-value p_ef (the right-hand side of "frac = ..." is only partly legible)
- As α increases, p_ef decreases and so does frac
- Fractional Similarity: the maximum allowed α such that frac ≥ C (binary search for α in [0, 1])
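The binary search described on the last of these slides can be sketched as below. This is one possible reading of the slides, not the authors' implementation. The mixing scheme (replacing the first fraction `alpha` of the reference sample with values from the other language), the use of p_ref as the denominator of frac, the normal approximation to the t distribution, the default threshold `c=0.5`, and the names `welch_p_value` and `fractional_similarity` are all assumptions.

```python
import math
from statistics import NormalDist, mean, variance


def welch_p_value(x, y):
    """Two-sided p-value for equal means (Welch statistic).

    Assumption: the t distribution is approximated by a standard normal,
    which is only reasonable for fairly large samples.
    """
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0.0:
        return 1.0 if mean(x) == mean(y) else 0.0
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def fractional_similarity(sample, sample_ref, other, c=0.5, iters=20):
    """Largest mixing fraction alpha in [0, 1] with frac(alpha) >= c.

    sample, sample_ref: two samples from the same language.
    other: values from the second language, at least len(sample_ref) of them.
    """
    if len(other) < len(sample_ref):
        raise ValueError("need at least len(sample_ref) values in `other`")
    p_ref = welch_p_value(sample, sample_ref)  # within-language reference
    if p_ref == 0.0:
        raise ValueError("reference p-value is zero; frac is undefined")

    def frac(alpha):
        k = round(alpha * len(sample_ref))
        combined = other[:k] + sample_ref[k:]  # assumed mixing scheme
        return welch_p_value(sample, combined) / p_ref

    if frac(1.0) >= c:
        return 1.0
    lo, hi = 0.0, 1.0  # frac(lo) >= c > frac(hi) holds throughout
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if frac(mid) >= c:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical per-query feature statistics.
english_a = [i % 10 for i in range(100)]
english_b = [(3 * i) % 10 for i in range(100)]
german = [(7 * i) % 10 + 2 for i in range(100)]
print(fractional_similarity(english_a, english_b, german))
```

The search relies on frac shrinking as more of the other language is mixed in. With noisy data that monotonicity holds only approximately, so the returned value should be read as an estimate.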
30 Cross-lingual Feature Selection
- Direct application of Fractional Similarity will compare means
  - Instead we want to compare pdfs
- Use of log-likelihood
  - We estimate a pdf from English
  - Compute the likelihood of English and German samples
  - Enables comparison of pdfs
  - Applicable to both discrete and continuous features
- Query-document correlation
  - Query-based sampling and an aggregate statistic of the query
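A rough sketch of the log-likelihood idea for a discrete feature follows, under stated assumptions: the pdf is an add-one smoothed histogram fitted on English values, and the per-query aggregate statistic is the mean log-likelihood of that query's documents. The slide names neither choice, and the function names and query data are hypothetical.

```python
import math
from collections import Counter


def fit_discrete_pdf(values, smoothing=1.0):
    """Smoothed histogram estimate; unseen values get a small non-zero mass."""
    counts = Counter(values)
    # One extra smoothing slot stands in for any value never seen in training.
    total = len(values) + smoothing * (len(counts) + 1)

    def pdf(v):
        return (counts.get(v, 0) + smoothing) / total

    return pdf


def per_query_log_likelihood(pdf, feature_by_query):
    """One aggregate number per query: mean log-likelihood of its documents."""
    return {
        query: sum(math.log(pdf(v)) for v in values) / len(values)
        for query, values in feature_by_query.items()
    }


# Hypothetical NoQueryWordsInDocTitle values, grouped by query.
english = {"harry potter": [2, 2, 2, 1], "weather": [1, 1, 0, 1]}
german = {"wetter berlin": [0, 0, 1, 0], "bahn": [1, 0, 0, 0]}

# Fit the pdf on English only, then score both languages against it.
english_pdf = fit_discrete_pdf([v for vals in english.values() for v in vals])
english_ll = per_query_log_likelihood(english_pdf, english)
german_ll = per_query_log_likelihood(english_pdf, german)
print(english_ll)
print(german_ll)
```

Aggregating per query gives one value per query rather than one per document, which is one way to address the query-document correlation noted on the slide. The two lists of per-query values can then be compared with a mean-based test.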
31 Experiments
- Data sets
  - Between English and German
  - Queries and documents are sampled from a web search engine
  - Graded human relevance judgments
  - 347 common features
  - # Queries: 15K (English), 7K (German)
  - # urls / query and # Features: table values missing from the transcription
- LambdaMART for training a ranker
  - Outperformed other approaches in Yahoo LETOR challenge
32 Using English data
- Adapt: train a ranker on English and adapt it to German
- Align: simply train on both English and German data
- [Results table, values missing from the transcription. Columns: German Only, Eng_adapt+German (Δ over baseline), Eng_align+German (Δ over baseline)]
- Simply adding English training data of the common features performed better
33 Cross-lingual Feature Selection
- Rank all the common features based on similarity score
- Drop the data of mismatched features
- Add the filtered English data to German training data
- Baseline: use all the common features (Align from the previous slide)
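The three steps on this slide can be sketched as below. The slides do not say how the data of a mismatched feature is dropped, so blanking the English value to `None` is an assumption, as are the function names, the feature `TitleMatch`, and all scores and values. The 25% drop rate echoes the figure quoted on the Discussion slide.

```python
def select_similar_features(similarity, drop_fraction=0.25):
    """Rank features by similarity score and keep the most similar ones."""
    ranked = sorted(similarity, key=similarity.get, reverse=True)
    n_drop = int(drop_fraction * len(ranked))
    return ranked[: len(ranked) - n_drop]


def filter_english_rows(rows, kept_features):
    """Blank out (None) the values of mismatched features in English rows."""
    kept = set(kept_features)
    return [
        {name: (value if name in kept else None) for name, value in row.items()}
        for row in rows
    ]


# Hypothetical similarity scores for four common features.
similarity = {"PageRank": 0.9, "QueryWordsInDoc": 0.7, "TitleMatch": 0.5, "Norm": 0.2}
kept = select_similar_features(similarity)

english_rows = [{"PageRank": 0.3, "QueryWordsInDoc": 2, "TitleMatch": 1, "Norm": 1.2}]
german_rows = [{"PageRank": 0.1, "QueryWordsInDoc": 1, "TitleMatch": 0, "Norm": 0.4}]

# German data is kept whole; only the English rows are filtered.
training_rows = german_rows + filter_english_rows(english_rows, kept)
print(kept)
print(training_rows)
```

Whether a ranker can consume missing values directly depends on the learner; an implementation might instead substitute a default value or train separate feature columns per language.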
34 Discussion
- Using English training data is helpful
- Cross-lingual Feature Selection improves further
- Fractional Similarity identifies similar features better than KL
  - Evidenced by the consistent improvement over KL
  - For both methods, the improvement drops at higher ranks
- Aggressive feature selection also hurts
  - Removing ~25% gave the best performance
- Theoretical arguments are outlined in the paper.
35 Conclusions
- For features with high variance
  - Traditional significance tests are not useful
  - They give almost zero p-values
- Fractional Similarity overcomes this by using intra-language variance
  - Increased robustness
- Not limited to the IR setting
  - General to situations with correlated instances
- Applicable to both discrete and continuous features
  - Appropriate selection of pdf estimation technique
36 Thank You