1 Learning to Rank. Tie-Yan Liu, Microsoft Research Asia. CCIR 2011, Jinan
2 History of Web Search: traditional text retrieval engines; then search engines powered by link analysis. 2011/10/22 Tie-Yan CCIR
3 Typical Search Engine Structure. Query-time (online) computing: User Interface, Caching, and Ranking, backed by the Inverted Index, Page Authority scores from Link Analysis, and Cached Pages. Offline computing: the Crawler fetches pages from the Web; the Page Parser extracts pages, links, and anchors; the Link Graph Builder produces the link map and link graph for Link Analysis; the Index Builder builds the inverted index; page and site statistics are also collected.
4 Challenges to New Search Engines. The same structure is shared by many search engines; which one can succeed? Search engines with a longer history have accumulated much experience in system tuning and many heuristics in ranking. It is hard for newly-born search engines to compete with the market leaders, because of this lack of experience and domain knowledge.
5 Challenges to New Search Engines. Question: Can a new search engine obtain effective ranking heuristics and tune its system well without going through this long history? Answer: Heuristics can be accumulated manually, and effective ranking models can also be learned from examples; systems can be tuned manually, and can also be optimized automatically. New ranking mechanism: learning to rank, i.e., automatically learning ranking models using machine learning technologies!
6 Many Search Engines Employ Learning to Rank Technologies! Bing's ranking model is trained using a machine learning method called RankNet (LambdaRank and LambdaMART later on). Bing is catching up with Google very quickly: by 2011, Bing had gained about 30% market share.
7 History of Web Search: traditional text retrieval engines; search engines powered by link analysis; search engines powered by learning to rank.
8 Outline: What is learning to rank; What's unique in learning to rank; Future of learning to rank.
9 Learning to Rank. General sense: any machine learning technology that can be used to learn a ranking model. Narrow sense: in most recent works, learning to rank is defined as the methodology that learns how to combine features by means of discriminative training. Discriminative training is also demanding: every day search engines receive a lot of user feedback; it is hard to describe users' feedback in a generative manner, but it is definitely important to learn from the feedback and constantly improve the ranking mechanism. The capability of combining a large number of features is also very promising: any new progress on retrieval models can easily be incorporated by including the output of the new model as a feature.
10 Learning to Rank, step by step: (1) collect training data (queries and their labeled documents); (2) extract features for query-document pairs; (3) learn the ranking model by minimizing a loss function on the training data; (4) use the model to answer online queries.
11 Learning to Rank Algorithms: Least Square Retrieval Function (TOIS 1989), Query refinement (WWW 2008), SVM-MAP (SIGIR 2007), Nested Ranker (SIGIR 2006), ListNet (ICML 2007), Pranking (NIPS 2002), LambdaRank (NIPS 2006), MPRank (ICML 2007), FRank (SIGIR 2007), MHR (SIGIR 2007), RankBoost (JMLR 2003), Learning to retrieve information (SCC 1995), LDM (SIGIR 2005), Large margin ranker (NIPS 2002), RankNet (ICML 2005), Ranking SVM (ICANN 1999), IRSVM (SIGIR 2006), Discriminative model for IR (SIGIR 2004), SVM Structure (JMLR 2005), OAP-BPM (ICML 2003), Subset Ranking (COLT 2006), GPRank (LR4IR 2007), QBRank (NIPS 2007), GBRank (SIGIR 2007), Constraint Ordinal Regression (ICML 2005), McRank (NIPS 2007), SoftRank (LR4IR 2007), AdaRank (SIGIR 2007), CCA (SIGIR 2007), ListMLE (ICML 2008), RankCosine (IP&M 2007), Supervised Rank Aggregation (WWW 2007), Relational ranking (WWW 2008), Learning to order things (NIPS 1998), Round robin ranking (ECML 2003).
12 Learning to Rank Algorithms Revisited. Much early work on learning to rank regards ranking as an application and tries to adapt existing machine learning algorithms to the ranking problem. Regression: treat relevance degrees as real values. Classification: treat relevance degrees as categories. Pairwise classification: reduce ranking to classifying the order between each pair of documents.
13 Example: Subset Ranking (D. Cossock and T. Zhang, COLT 2006). Regard the relevance degree as a real number and use regression to learn the ranking function: $L(f; x_j, y_j) = (f(x_j) - y_j)^2$. Regression-based.
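As a sketch of this regression-based (pointwise) view, not of the exact Subset Ranking algorithm: fit a linear function by least squares on (feature vector, relevance) pairs, then rank a query's documents by predicted score. The toy data below is invented for illustration.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares fit of w minimizing sum_j (x_j.w - y_j)^2."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def rank_by_score(X, w):
    """Document indices sorted by descending predicted relevance."""
    return np.argsort(-(X @ w))

# Toy query: 3 documents, 2 features, graded relevance labels.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, 0.0, 1.0])
w = fit_linear_regression(X, y)
ranking = rank_by_score(X, w)   # best document first
```

Note that the squared loss penalizes deviations from the absolute labels, even when the induced ordering is already correct; this is exactly the mismatch discussed on slide 16.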
14 Example: McRank (P. Li, et al., NIPS 2007). Multi-class classification is used to learn the ranking function. For document $x_j$, the output of the classifier is $\hat{y}_j$; the loss function is a surrogate of the classification error $I_{\{\hat{y}_j \neq y_j\}}$. Ranking is produced by combining the outputs of the classifiers: with $\hat{p}_{j,k} = P(\hat{y}_j = k)$, the score is $f(x_j) = \sum_{k=1}^{K} k \cdot \hat{p}_{j,k}$. Classification-based.
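The scoring step above (turning class probabilities into an expected relevance grade) can be sketched as follows; the probability matrix is invented for illustration, and grades are taken as 0..K-1.

```python
import numpy as np

def mcrank_scores(class_probs):
    """class_probs: (n_docs, K) matrix, row j holding P(y_j = k) for k = 0..K-1.
    Returns the expected relevance grade per document."""
    K = class_probs.shape[1]
    grades = np.arange(K)          # relevance grades 0, 1, ..., K-1
    return class_probs @ grades    # sum_k k * p_{j,k}

probs = np.array([[0.1, 0.2, 0.7],   # doc 0: probably grade 2
                  [0.6, 0.3, 0.1],   # doc 1: probably grade 0
                  [0.2, 0.6, 0.2]])  # doc 2: probably grade 1
scores = mcrank_scores(probs)
ranking = np.argsort(-scores)        # rank by expected grade, best first
```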
15 Example: Ranking SVM (R. Herbrich, et al., Advances in Large Margin Classifiers, 2000; T. Joachims, KDD 2002). Ranking SVM is rooted in the framework of SVM; kernel tricks can also be applied to it, so as to handle complex non-linear problems. Use $x_u - x_v$ as a positive instance whenever document $u$ is preferred to document $v$, and use SVM to perform binary classification on these instances to learn the model parameter $w$: $\min \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \sum_{u,v: y^{(i)}_{u,v}=1} \xi^{(i)}_{u,v}$, subject to $w^T(x^{(i)}_u - x^{(i)}_v) \ge 1 - \xi^{(i)}_{u,v}$ if $y^{(i)}_{u,v} = 1$, and $\xi^{(i)}_{u,v} \ge 0$, $i = 1, \ldots, n$. Pairwise classification-based.
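The pairwise reduction can be sketched as below. Plain subgradient descent on the hinge-loss objective stands in for a real SVM solver here, and the feature vectors are invented for illustration.

```python
import numpy as np

def train_ranking_svm(pairs, C=1.0, lr=0.1, epochs=200):
    """pairs: list of (x_preferred, x_other) tuples. Each preference
    becomes a positive instance x_u - x_v that should score > 0."""
    diffs = np.array([u - v for u, v in pairs])
    w = np.zeros(diffs.shape[1])
    for _ in range(epochs):
        margins = diffs @ w
        # Subgradient of 0.5*||w||^2 + C * sum_i max(0, 1 - w.d_i)
        grad = w - C * diffs[margins < 1].sum(axis=0)
        w -= lr * grad
    return w

x_a = np.array([2.0, 1.0])   # labeled most relevant
x_b = np.array([1.0, 1.0])
x_c = np.array([0.0, 2.0])   # labeled least relevant
w = train_ranking_svm([(x_a, x_b), (x_b, x_c)])
scores = [x @ w for x in (x_a, x_b, x_c)]   # should come out decreasing
```

Because only score differences enter the objective, the learned model orders documents without committing to absolute score values, which is the appeal of the pairwise approach.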
16 Are They the Right Approaches? These reductions do not reflect the full nature of ranking. In ranking, one cares about the order among documents, not about absolute scores or categories. Top positions in the ranked list are more important. The notion of query plays an important role: only documents associated with the same query can be compared to each other and ranked one after another, and each query contributes equally to the overall evaluation measure (see the definitions of MAP and NDCG).
17 New Research is Needed. New algorithms, to capture the unique properties of ranking (relative order, position, query, etc.) in a principled manner: the listwise approach to learning to rank. New theorems, to understand the theoretical nature of learning to rank algorithms and guarantee their performance: statistical learning theory for ranking.
18 The Listwise Approach
19 Defining Ranking Loss is Non-trivial! An example: Model f: f(a)=3, f(b)=0, f(c)=1, giving the ranked list A C B. Model h: h(a)=4, h(b)=6, h(c)=3, giving B A C. Ground truth g: g(a)=6, g(b)=4, g(c)=3, giving A B C. Question: which model is better (closer to the ground truth)? Based on Euclidean distance: sim(f,g) < sim(g,h). Based on pairwise comparison: sim(f,g) = sim(g,h). However, according to NDCG, f should be closer to g!
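The three comparisons on this slide can be checked numerically. The sketch below assumes the common exponential-gain NDCG (gain 2^rel - 1, discount 1/log2(rank + 1)); the slide does not state which gain and discount it uses.

```python
import math

g = {'a': 6, 'b': 4, 'c': 3}   # ground-truth relevance -> order A B C
f = {'a': 3, 'b': 0, 'c': 1}   # model f -> order A C B
h = {'a': 4, 'b': 6, 'c': 3}   # model h -> order B A C

def euclidean(s):
    """Euclidean distance between score vectors s and g."""
    return math.sqrt(sum((s[d] - g[d]) ** 2 for d in g))

def pairwise_disagreements(s):
    """Number of document pairs ordered differently by s and g."""
    docs = sorted(g)
    return sum(1 for i, d1 in enumerate(docs) for d2 in docs[i+1:]
               if (s[d1] - s[d2]) * (g[d1] - g[d2]) < 0)

def ndcg(s):
    """NDCG of the ranked list induced by scores s, judged against g."""
    order = sorted(g, key=s.get, reverse=True)
    dcg = sum((2 ** g[d] - 1) / math.log2(r + 2) for r, d in enumerate(order))
    ideal = sorted(g, key=g.get, reverse=True)
    idcg = sum((2 ** g[d] - 1) / math.log2(r + 2) for r, d in enumerate(ideal))
    return dcg / idcg
```

Running these confirms the slide: h is closer by Euclidean distance, both models make exactly one pairwise error, yet f has the higher NDCG because it puts the most relevant document first.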
20 Listwise Loss Functions. A more informative representation for a ranked list: a permutation probability distribution $P(\pi \mid f)$. Permutations and ranked lists are in one-to-one correspondence.
21 Defining Permutation Probability. The probability of a permutation $\pi$ is defined with the Plackett-Luce model: $P_{PL}(\pi \mid f) = \prod_{j=1}^{m} \frac{\exp(f(x_{\pi(j)}))}{\sum_{k=j}^{m} \exp(f(x_{\pi(k)}))}$. Example: $P_{PL}(ABC \mid f) = \frac{\exp f(A)}{\exp f(A) + \exp f(B) + \exp f(C)} \cdot \frac{\exp f(B)}{\exp f(B) + \exp f(C)} \cdot \frac{\exp f(C)}{\exp f(C)}$, i.e., P(A ranked No.1) × P(B ranked No.2 | A ranked No.1) × P(C ranked No.3 | A ranked No.1, B ranked No.2), where P(B ranked No.2 | A ranked No.1) = P(B ranked No.1) / (1 - P(A ranked No.1)).
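The product formula above can be sketched directly: each factor is the softmax probability of the document placed at position j among the documents not yet placed. The scores are invented for illustration.

```python
import math
from itertools import permutations

def plackett_luce(scores, permutation):
    """P_PL(permutation | f); scores: dict doc -> f(doc)."""
    remaining = list(permutation)
    prob = 1.0
    for doc in permutation:
        denom = sum(math.exp(scores[d]) for d in remaining)
        prob *= math.exp(scores[doc]) / denom
        remaining.remove(doc)
    return prob

f = {'A': 3.0, 'B': 0.0, 'C': 1.0}
p_abc = plackett_luce(f, ['A', 'B', 'C'])

# The probabilities of all m! permutations sum to 1.
total = sum(plackett_luce(f, list(p)) for p in permutations(f))
```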
22 Distance between Ranked Lists. Using KL divergence to measure the difference between the permutation probability distributions: dis(f,g) = 0.46, which is smaller than dis(g,h), agreeing with NDCG that f is closer to g.
23 K-L Divergence Loss. ListNet (ICML 2007): $L(f; x, g) = D(P_g \,\|\, P_{PL}(f(x)))$, where $P_g(\pi) = 1$ if $\pi = \pi_g$ and $0$ otherwise. ListMLE (ICML 2008), an efficient variant of ListNet: $L(f; x, g) = -\log P_{PL}(\pi_g \mid f(x))$. ListNet and ListMLE are regarded as among the most effective learning to rank algorithms.
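A minimal sketch of the ListMLE loss, the negative log Plackett-Luce likelihood of the ground-truth permutation under the model scores. (Hedged: real implementations subtract the maximum score before exponentiating for numerical stability, omitted here for brevity; the toy scores are invented.)

```python
import math

def listmle_loss(scores, truth_permutation):
    """-log P_PL(truth | scores); scores: dict doc -> f(doc)."""
    loss, remaining = 0.0, list(truth_permutation)
    for doc in truth_permutation:
        # Each position contributes log(sum of remaining exp-scores) - score.
        denom = sum(math.exp(scores[d]) for d in remaining)
        loss += math.log(denom) - scores[doc]
        remaining.remove(doc)
    return loss

scores = {'A': 2.0, 'B': 1.0, 'C': 0.0}
# Scores that agree with the truth give a lower loss than reversed scores.
low = listmle_loss(scores, ['A', 'B', 'C'])
high = listmle_loss(scores, ['C', 'B', 'A'])
```

Because the loss decomposes over positions, it can be computed in O(m) per list after sorting, which is what makes ListMLE efficient compared with summing over all m! permutations.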
24 Other Work on Listwise Ranking. Listwise loss functions: AdaRank, a boosting approach to listwise ranking (SIGIR 2007); PermuRank, a structured SVM approach to listwise ranking (SIGIR 2008). Listwise ranking functions: C-CRF, which defines a listwise ranking function using conditional random fields (NIPS 2008); R-RSVM, which defines a listwise ranking function using relational SVM (WWW 2008).
25 Listwise Ranking Has Become an Important Branch of Learning to Rank
26 Statistical Learning Theory for Ranking
27 Why Theory? In practice, one can only observe experimental results on relatively small datasets. Such empirical results might not be reliable: a small training set cannot fully realize the potential of a learning algorithm, and a small test set cannot reflect the true performance of an algorithm, since the real query space is huge. Statistical learning theory analyzes the performance of an algorithm as the training data grows infinitely and the test data is randomly sampled.
28 Generalization Analysis. In the training phase, one learns a model by minimizing the empirical risk on the training data. In the test phase, one evaluates the expected risk of the model on any sample. Generalization analysis is concerned with bounding the difference between the expected and empirical risks as the amount of training data approaches infinity.
29 Generalization in Learning to Rank. Training process: minimize a loss on finite data (e.g., the likelihood loss) over n training queries, each with labeled documents (Doc 1, Label 1, ..., Doc m, Label m), to obtain the ranking model. Test process: a measure on infinite data, e.g., (1 - NDCG), over queries and web documents. Can this process generalize, i.e., is Test Measure ≤ Training Loss + ε(n, m, F), where n is the number of queries, m the number of documents per query, and F the ranking function class?
30 How to Get There: Test Measure ≤ Training Loss + ε(n, m, F), established in two steps: Test Measure ≤ Test Loss, and Test Loss ≤ Training Loss + ε(n, m, F).
31 Step 1: Test Loss ≤ Training Loss + ε(n, m, F)? This is generalization in terms of the loss. To perform this generalization analysis, we need to make probabilistic assumptions on the data generation.
32 Previous Assumptions: Document Ranking (Agarwal et al., 2005; Clemencon et al., 2007). Documents (Doc 1, Label 1), ..., (Doc m, Label m) are sampled directly, with no notion of query, yet in learning to rank the test is conducted at the query level! Under this assumption, deep and shallow training sets correspond to the same generalization ability.
33 Previous Assumptions: Subset Ranking (Lan et al., 2008; Lan et al., 2009). Queries query 1, ..., query n are sampled, but each query is represented by a deterministic subset of m documents and their labels. In reality, training documents are sampled, and different numbers of training documents lead to different performance of the ranking model! Under this assumption, more training documents will not enhance, and may even hurt, the generalization ability.
34 Two-layer Sampling (NIPS 2010). Queries are sampled according to a query distribution, and the documents (feature vectors and labels) associated with different queries are sampled according to different distributions. Different from document ranking, the sampling of queries is modeled; different from subset ranking, the sampling of documents for each query is considered. Elements in two-layer sampling are neither independent nor identically distributed.
35 Two-layer Generalization Bound: Test Loss ≤ Training Loss + ε(n, m, F). Decomposition: the two-layer error splits into a query-layer error and a doc-layer error (conditioned on the query sample). Concentration: introduce ghost query samples and fixed-size pseudo document samples to obtain the doc-layer-reduced two-layer Rademacher average (RA); introduce a ghost document sample for each query to obtain the query-layer-reduced two-layer RA; together these bound the two-layer RA.
36 Discussion: Deep or Shallow? With a budget to label only C documents, there is an optimal tradeoff between n and m. For example, if the ranking function class satisfies certain conditions (given as formulas on the slide), an optimal tradeoff between n and m can be derived.
37 Step 2: Ranking Measure ≤ Loss Function?
38 Loss Function vs. Ranking Measure. The loss function in ListMLE, $L(f; x, \pi_y) = -\log P_{PL}(\pi_y \mid f(x))$, is based on the scores produced by the ranking model. The measure, 1 - NDCG (Normalized Discounted Cumulative Gain), combines a gain cumulated over positions, a position discount, and a normalization, and is based on the ranked list obtained by sorting the scores.
39 Challenge. The relationship between loss functions and ranking measures is unclear due to their different mathematical forms. In contrast, for classification, both losses and measures are defined with respect to individual documents, and their relationship is clear.
40 Essential Loss for Ranking (NIPS 2009). Model ranking as a sequence of classifications. Given the ground-truth permutation (e.g., A B C D) and the prediction of the ranking function f, at each step the classifier outputs the document with the largest ranking score among the remaining documents; the essential loss is the weighted classification error accumulated over the steps in the sequence.
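The sequence-of-classifications idea can be sketched as below. Hedged: the position weights w_t are left generic here, whereas the NIPS 2009 paper derives specific weights so that the essential loss upper-bounds (1 - NDCG) and (1 - MAP); the toy scores are invented.

```python
def essential_loss(scores, truth, weights=None):
    """scores: dict doc -> f(doc); truth: list of docs, best first.
    At step t, the 'classifier' picks the highest-scoring remaining
    document; an error is counted (with weight w_t) if that is not the
    ground-truth document for position t, which is then removed."""
    if weights is None:
        weights = [1.0] * len(truth)
    remaining, loss = list(truth), 0.0
    for t, true_doc in enumerate(truth):
        predicted = max(remaining, key=lambda d: scores[d])
        if predicted != true_doc:
            loss += weights[t]
        remaining.remove(true_doc)
    return loss

perfect = essential_loss({'A': 3, 'B': 2, 'C': 1}, ['A', 'B', 'C'])  # 0 errors
one_err = essential_loss({'A': 3, 'B': 1, 'C': 2}, ['A', 'B', 'C'])  # B, C swapped
```

Note the property stated on the next slide: the loss is zero exactly when the induced ranking matches the ground truth at every step.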
41 Essential Loss vs. Ranking Measures. (1) Both (1 - NDCG) and (1 - MAP) are upper bounded by the essential loss. (2) A zero value of the essential loss is a necessary and sufficient condition for zero values of (1 - NDCG) and (1 - MAP).
42 Essential Loss vs. Surrogate Losses. (1) Many pairwise and listwise loss functions are upper bounds of the essential loss. (2) Therefore, these pairwise and listwise loss functions are also upper bounds of (1 - NDCG) and (1 - MAP).
43 Learning Theory for Ranking. (1) + (2) build the foundation of a statistical learning theory for ranking: a guarantee on the test performance (in terms of the ranking measure) given the training performance (in terms of the loss function). Many people have started to look into this important field, inspired by our work.
44 Summary and Outlook
45 Learning to Rank is Really Hot! Hundreds of publications at SIGIR, ICML, NIPS, etc. Several benchmark datasets released. One or two sessions at SIGIR every recent year. Several workshops at SIGIR, ICML, NIPS, etc. Several tutorials at SIGIR, WWW, ACL, etc. A special issue of the Information Retrieval Journal. The Yahoo! Learning to Rank Challenge. Several books published on the topic.
46 Wide Applications of Learning to Rank: document retrieval, question answering, multimedia retrieval, text summarization, online advertising, collaborative filtering, machine translation.
47 Future Work: Challenges of Theories. Tighter generalization bounds / convergence rates. Statistical consistency. Coverage of more learning to rank algorithms. Sample selection bias.
48 Future Work: Challenges from Real Applications. Large-scale learning to rank. Robust learning to rank. Online, incremental, and active learning to rank. Transfer learning to rank. Structural learning to rank (diversity, whole-page relevance).
49 References
Department of Computer Science and Technology Adaptive Dropout Training for SVMs Jun Zhu Joint with Ning Chen, Jingwei Zhuo, Jianfei Chen, Bo Zhang Tsinghua University ShanghaiTech Symposium on Data Science,
More informationEasy Samples First: Self-paced Reranking for Zero-Example Multimedia Search
Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Lu Jiang 1, Deyu Meng 2, Teruko Mitamura 1, Alexander G. Hauptmann 1 1 School of Computer Science, Carnegie Mellon University
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 4, 2016 Outline Multi-core v.s. multi-processor Parallel Gradient Descent Parallel Stochastic Gradient Parallel Coordinate Descent Parallel
More informationOne-Pass Ranking Models for Low-Latency Product Recommendations
One-Pass Ranking Models for Low-Latency Product Recommendations Martin Saveski @msaveski MIT (Amazon Berlin) One-Pass Ranking Models for Low-Latency Product Recommendations Amazon Machine Learning Team,
More informationOpportunities and challenges in personalization of online hotel search
Opportunities and challenges in personalization of online hotel search David Zibriczky Data Science & Analytics Lead, User Profiling Introduction 2 Introduction About Mission: Helping the travelers to
More informationBayesian model ensembling using meta-trained recurrent neural networks
Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya
More informationSupervised Clustering of Label Ranking Data
Supervised Clustering of Label Ranking Data Mihajlo Grbovic, Nemanja Djuric, Slobodan Vucetic {mihajlo.grbovic, nemanja.djuric, slobodan.vucetic}@temple.edu SIAM SDM 202, Anaheim, California, USA Temple
More informationSupplementary A. Overview. C. Time and Space Complexity. B. Shape Retrieval. D. Permutation Invariant SOM. B.1. Dataset
Supplementary A. Overview This supplementary document provides more technical details and experimental results to the main paper. Shape retrieval experiments are demonstrated with ShapeNet Core55 dataset
More informationLearning Better Data Representation using Inference-Driven Metric Learning
Learning Better Data Representation using Inference-Driven Metric Learning Paramveer S. Dhillon CIS Deptt., Univ. of Penn. Philadelphia, PA, U.S.A dhillon@cis.upenn.edu Partha Pratim Talukdar Search Labs,
More informationPerformance Measures for Multi-Graded Relevance
Performance Measures for Multi-Graded Relevance Christian Scheel, Andreas Lommatzsch, and Sahin Albayrak Technische Universität Berlin, DAI-Labor, Germany {christian.scheel,andreas.lommatzsch,sahin.albayrak}@dai-labor.de
More informationSearch Evaluation. Tao Yang CS293S Slides partially based on text book [CMS] [MRS]
Search Evaluation Tao Yang CS293S Slides partially based on text book [CMS] [MRS] Table of Content Search Engine Evaluation Metrics for relevancy Precision/recall F-measure MAP NDCG Difficulties in Evaluating
More informationApplying Supervised Learning
Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains
More informationClassification. 1 o Semestre 2007/2008
Classification Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 Single-Class
More informationInformation Retrieval
Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,
More informationCombining PGMs and Discriminative Models for Upper Body Pose Detection
Combining PGMs and Discriminative Models for Upper Body Pose Detection Gedas Bertasius May 30, 2014 1 Introduction In this project, I utilized probabilistic graphical models together with discriminative
More informationMachine Learning Techniques for Data Mining
Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationApplication of Support Vector Machine Algorithm in Spam Filtering
Application of Support Vector Machine Algorithm in E-Mail Spam Filtering Julia Bluszcz, Daria Fitisova, Alexander Hamann, Alexey Trifonov, Advisor: Patrick Jähnichen Abstract The problem of spam classification
More informationSlides for Data Mining by I. H. Witten and E. Frank
Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-
More informationDetecting Malicious Activity with DNS Backscatter Kensuke Fukuda John Heidemann Proc. of ACM IMC '15, pp , 2015.
Detecting Malicious Activity with DNS Backscatter Kensuke Fukuda John Heidemann Proc. of ACM IMC '15, pp. 197-210, 2015. Presented by Xintong Wang and Han Zhang Challenges in Network Monitoring Need a
More informationCSE 573: Artificial Intelligence Autumn 2010
CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke Zettlemoyer Most slides over the course adapted from Dan Klein. 1 Announcements Syllabus revised Machine
More informationMODEL SELECTION AND REGULARIZATION PARAMETER CHOICE
MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE REGULARIZATION METHODS FOR HIGH DIMENSIONAL LEARNING Francesca Odone and Lorenzo Rosasco odone@disi.unige.it - lrosasco@mit.edu June 3, 2013 ABOUT THIS
More informationarxiv: v2 [cs.ir] 27 Feb 2019
Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm arxiv:1809.05818v2 [cs.ir] 27 Feb 2019 ABSTRACT Ziniu Hu University of California, Los Angeles, USA bull@cs.ucla.edu Recently a number
More informationInstructor: Stefan Savev
LECTURE 2 What is indexing? Indexing is the process of extracting features (such as word counts) from the documents (in other words: preprocessing the documents). The process ends with putting the information
More informationPredicting Query Performance on the Web
Predicting Query Performance on the Web No Author Given Abstract. Predicting performance of queries has many useful applications like automatic query reformulation and automatic spell correction. However,
More informationLecture 5: Information Retrieval using the Vector Space Model
Lecture 5: Information Retrieval using the Vector Space Model Trevor Cohn (tcohn@unimelb.edu.au) Slide credits: William Webber COMP90042, 2015, Semester 1 What we ll learn today How to take a user query
More informationEvaluating search engines CE-324: Modern Information Retrieval Sharif University of Technology
Evaluating search engines CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)
More informationFMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationInstance-Based Learning: Nearest neighbor and kernel regression and classificiation
Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Emily Fox University of Washington February 3, 2017 Simplest approach: Nearest neighbor regression 1 Fit locally to each
More informationHigh Accuracy Retrieval with Multiple Nested Ranker
High Accuracy Retrieval with Multiple Nested Ranker Irina Matveeva University of Chicago 5801 S. Ellis Ave Chicago, IL 60637 matveeva@uchicago.edu Chris Burges Microsoft Research One Microsoft Way Redmond,
More informationApplied Bayesian Nonparametrics 5. Spatial Models via Gaussian Processes, not MRFs Tutorial at CVPR 2012 Erik Sudderth Brown University
Applied Bayesian Nonparametrics 5. Spatial Models via Gaussian Processes, not MRFs Tutorial at CVPR 2012 Erik Sudderth Brown University NIPS 2008: E. Sudderth & M. Jordan, Shared Segmentation of Natural
More informationRegularization and model selection
CS229 Lecture notes Andrew Ng Part VI Regularization and model selection Suppose we are trying select among several different models for a learning problem. For instance, we might be using a polynomial
More informationSVM in Oracle Database 10g: Removing the Barriers to Widespread Adoption of Support Vector Machines
SVM in Oracle Database 10g: Removing the Barriers to Widespread Adoption of Support Vector Machines Boriana Milenova, Joseph Yarmus, Marcos Campos Data Mining Technologies Oracle Overview Support Vector
More informationThe Offset Tree for Learning with Partial Labels
The Offset Tree for Learning with Partial Labels Alina Beygelzimer IBM Research John Langford Yahoo! Research June 30, 2009 KDD 2009 1 A user with some hidden interests make a query on Yahoo. 2 Yahoo chooses
More information