A Few Things to Know about Machine Learning for Web Search

1 A Few Things to Know about Machine Learning for Web Search. Hang Li, Noah's Ark Lab, Huawei Technologies. AIRS 2012, Tianjin, China, Dec. 19, 2012

2 Talk Outline: my projects at MSRA; some conclusions from our research on web search

3 My Past Projects at MSRA
Text Mining ( ): development of SQL Server 2005 Text Mining
Enterprise Search ( ): development of Office 2007, 2010, 2012 SharePoint Search
Web Search ( ): development of Live Search 2008, Bing 2009

4 Research on Machine Learning for Web Search
Learning to Rank (Tie-Yan Liu, Jun Xu, Tao Qin, etc.): LETOR dataset [Liu+ 07], ListNet [Cao+ 07], ListMLE [Xia+ 09], AdaRank [Xu+ 07], IR SVM [Cao+ 06]
Importance Ranking (Tie-Yan Liu, Bin Gao, etc.): BrowseRank [Liu+ 08]
Semantic Matching (Relevance) (Gu Xu, Jun Xu, Jingfang Xu, etc.): CRF [Guo+ 08], NERQ [Guo+ 09], LogLinear [Wang+ 11], RLSI [Wang+ 11], RMLS [Wu+ 12], SRK [Bu+ 12]
Search Log Mining (Daxin Jiang, Yunhua Hu, etc.): Context-aware Search [Cao+ 08] [Cao+ 09] [Xiang+ 11], Intent Mining [Hu+ 12]

5 Research on Machine Learning for Web Search (cont.). We tried to address the fundamental computer science problems, i.e., to develop fundamental models (algorithms). Performance can be further improved with additional engineering effort.

6 Some Conclusions from Our Research
Machine learning based ranking and rule-based ranking both have pros and cons
State-of-the-art learning to rank algorithms
More features, better performance
No single signal for relevance is enough
Matching (feature) is more important than ranking (model)
Matching can be performed at multiple levels
Click data is useful
Browse data is useful
Flexibility is key for handling queries
List of useful features in ranking
Spelling errors in queries can be corrected first

7 Beyond Search. Other applications have similar problems: online advertising, question answering, recommender systems. The techniques can be applied to these applications as well.

8 Machine Learning based Ranking vs Rule based Ranking. Two types of signals: relevance (matching) and importance. The higher the scores are, the better the relevance is. Simplest model: linear combination, which makes a rule-based approach possible. Precise tuning needs either a learning-based approach (learning to rank) or a rule-based approach.
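
The "simplest model" on this slide can be sketched as a weighted sum of one relevance score and one importance score. This is only an illustration under stated assumptions: the function names, the weights, and the document scores below are hypothetical and do not come from the talk.

```python
from typing import Dict, List, Tuple


def linear_score(relevance: float, importance: float,
                 w_rel: float = 0.7, w_imp: float = 0.3) -> float:
    """Linear combination of a relevance signal and an importance signal.

    The weights are hypothetical; in a rule-based system they would be set
    by hand, in learning to rank they would be fitted to training data.
    """
    return w_rel * relevance + w_imp * importance


def rank(docs: Dict[str, Tuple[float, float]],
         w_rel: float = 0.7, w_imp: float = 0.3) -> List[str]:
    """Return document ids sorted by descending combined score."""
    return sorted(
        docs,
        key=lambda d: linear_score(docs[d][0], docs[d][1], w_rel, w_imp),
        reverse=True,
    )


# Hypothetical (relevance, importance) pairs for three documents.
docs = {"a": (0.9, 0.1), "b": (0.5, 0.9), "c": (0.2, 0.2)}
print(rank(docs))            # relevance-heavy weighting
print(rank(docs, 0.3, 0.7))  # importance-heavy weighting
```

With only two weights, hand tuning is still feasible, which is what makes the rule-based approach possible for this model; with many features the weights are usually learned.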

9 Machine Learning based Ranking vs Rule based Ranking
Aspect | Learning based | Rule based
Update of model | Easy | Hard
Fine tuning | Hard to control | Easy to control
Creation of model | Optimized for average cases | Can be optimized to avoid worst cases
Creation of training data | Necessary | Not necessary

10 State of the Art Learning to Rank Algorithms: LambdaMART, LambdaRank, ListNet, AdaRank, Rank SVM, IR SVM, RankNet, RankBoost. LambdaMART performed the best in the Yahoo competition, etc. The differences among the above rankers are small.

11 More Features, Better Performance. The more features the ranker (ranking model) uses, the better the performance usually is, even with redundant features (e.g., BM25 and tf-idf). This holds in terms of NDCG and the Cranfield evaluation.
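
Since the slide measures performance with NDCG, a minimal sketch of that metric may help. It assumes the common variant with gain 2^label - 1 and a log2 position discount; other variants exist, and the relevance labels in the example are made up.

```python
import math
from typing import Sequence


def dcg(labels: Sequence[int], k: int) -> float:
    """Discounted cumulative gain of the top-k labels, in ranked order."""
    return sum((2 ** g - 1) / math.log2(i + 2)
               for i, g in enumerate(labels[:k]))


def ndcg(labels: Sequence[int], k: int) -> float:
    """DCG normalized by the DCG of the ideal (label-sorted) ranking."""
    ideal = dcg(sorted(labels, reverse=True), k)
    return dcg(labels, k) / ideal if ideal > 0 else 0.0


# Graded relevance labels of the results, in the order the ranker returned them.
print(ndcg([3, 2, 1, 0], 4))  # already in ideal order
print(ndcg([0, 1], 2))        # the only relevant result is ranked second
```

Comparing rankers trained with different feature sets then amounts to averaging this value over the queries of a labeled test set.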

12 No Signal (Feature) is Enough. It is not possible to use just one type of signal. Power law distribution (long tail): the head is easy, but the tail is hard. Signals are represented in multiple fields: title, anchor, URL, click.

13 Matching (Feature) vs Ranking (Model). In traditional IR, ranking = matching: $f(q,d) = f_{\mathrm{BM25}}(q,d)$ or $f(q,d) = P_{\mathrm{LMIR}}(d \mid q)$. In web search, ranking and matching become separated, and learning to rank becomes state-of-the-art: $f(q,d) = f_{\mathrm{BM25}}(q,d) + g_{\mathrm{PageRank}}(d)$. Matching = feature learning for ranking: Learning to Match.
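
To make the BM25 matching function named on this slide concrete, here is a small sketch of one common BM25 variant, with a smoothed, non-negative IDF and the usual k1 and b parameters. The parameter values, the tokenized toy documents, and the function name are assumptions for illustration, not taken from the talk.

```python
import math
from collections import Counter
from typing import List, Sequence


def bm25_scores(query: Sequence[str], docs: List[List[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score every (non-empty, tokenized) document against the query."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


docs = [["distance", "sun", "earth"],
        ["hot", "dog", "recipe"],
        ["sun", "sun", "tan"]]
scores = bm25_scores(["sun", "earth"], docs)
print(scores)
```

In the web search setting described on the slide, a score like this would be one matching feature among many, combined with query-independent importance scores by the ranking model.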

14 Same Search Intent, Different Query Representations. Example: distance between Sun and Earth
"how far" earth sun
"how far" sun
"how far" sun earth
average distance earth sun
average distance from earth to sun
average distance from the earth to the sun
distance between earth & sun
distance between earth and sun
distance between earth and the sun
distance from earth to the sun
distance from sun to earth
distance from sun to the earth
distance from the earth to the sun
distance from the sun to earth
distance from the sun to the earth
distance of earth from sun
distance between earth sun
how far away is the sun from earth
how far away is the sun from the earth
how far earth from sun
how far earth is from the sun
how far from earth is the sun
how far from earth to sun
how far from the earth to the sun
distance between sun and earth

15 Matching at Multiple Levels (by level of semantics)
Structure: match between structures of query & document title, e.g. "how far is sun from earth" vs "distance between sun and earth"
Topic: match between topics of query & document, e.g. "Microsoft Office" vs "Microsoft PowerPoint, Word, Excel"
Word sense: match between word senses in query & document, e.g. "utube" vs "youtube", "NY" vs "New York"
Phrase: match between phrases in query & document, e.g. "hot dog" vs "hot dog"
Term: match between terms in query & document, e.g. "NY" vs "NY", "youtube" vs "youtube"

16 Click Data. Queries associated with a page in click data can be viewed as metadata of the page. Useful streams (fields): title, anchor, URL, click, and body. Web search technologies: first generation: traditional IR; second generation: anchor text, PageRank; third generation: click data, learning to rank, etc.
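
The idea of treating clicked queries as page metadata can be sketched as building a "click" field per page from a log of (query, clicked URL) pairs. The log format, the URLs, and the normalization step are hypothetical; a production pipeline would also need filtering of noisy clicks.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, Tuple


def build_click_field(
    click_log: Iterable[Tuple[str, str]],
) -> DefaultDict[str, Counter]:
    """Map each URL to a bag of the queries that led to clicks on it."""
    field: DefaultDict[str, Counter] = defaultdict(Counter)
    for query, url in click_log:
        # Light normalization so trivially different queries are merged.
        field[url][query.strip().lower()] += 1
    return field


# Hypothetical click log: (query, clicked URL).
log = [
    ("distance sun earth", "u1"),
    ("Distance Sun Earth", "u1"),
    ("how far is the sun", "u1"),
    ("hot dog", "u2"),
]
click_field = build_click_field(log)
print(click_field["u1"].most_common())
```

The resulting bag of queries can then be matched against a new query in the same way as the title or anchor fields, for example with BM25.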

17 Browse Data. PageRank is not as powerful as people may expect. Number of visits is a good, strong signal of page importance. BrowseRank (continuous-time Markov process).
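
The continuous-time Markov process view can be illustrated with a textbook relation: the stationary probability of a page is proportional to the stationary probability of the embedded (jump) chain multiplied by the mean staying time on the page. The sketch below assumes a small, row-stochastic, irreducible and aperiodic transition matrix and known mean staying times; it is not the estimation procedure of the BrowseRank paper, and all numbers are made up.

```python
from typing import List


def embedded_stationary(P: List[List[float]], iters: int = 200) -> List[float]:
    """Stationary distribution of the embedded chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def browse_importance(P: List[List[float]],
                      mean_stay: List[float]) -> List[float]:
    """Weight each page's visit probability by its mean staying time."""
    pi = embedded_stationary(P)
    weighted = [p * t for p, t in zip(pi, mean_stay)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Two pages visited equally often, but users stay three times longer on page 1.
print(browse_importance([[0.5, 0.5], [0.5, 0.5]], [10.0, 30.0]))
```

Unlike link-based PageRank, both inputs here would come from user behavior: transitions from observed browsing and staying times from observed dwell time.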

18 Flexibility Is Key for Handling Queries. Four types of queries: noun phrases; multiple noun phrases; titles of books, songs, etc.; natural language questions (about 1%). The system needs to handle variants of expressions (cf. distance between sun and earth). String Re-writing Kernel [Bu+ 12] for tackling the flexibility of queries.

19 List of Useful Features. Features can be defined in multiple fields: title, anchor, URL, click, body. Useful features: BM25; n-gram BM25; exact match; translation between queries and titles; topic model; latent matching model; PageRank; BrowseRank.

20 Spelling Error Correction. English queries contain spelling errors. Formalized as a string transformation problem: CRF [Guo+ 08]. Spelling error correction should be done only when confident, e.g. "mlss singapore" = "miss singapore" or "machine learning summer school singapore"? Spelling error correction does not depend on documents. Other query rewriting depends on documents, e.g. "seattle best hotel" vs "seattle best hotels", "arms reduction" vs "arm reduction".
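
"Correct only when confident" can be sketched as a threshold on the probability of the best candidate rewrite. The candidate probabilities would come from some correction model (the talk mentions a CRF); the numbers, the threshold, and the function name below are hypothetical.

```python
from typing import Dict


def correct_if_confident(query: str, candidates: Dict[str, float],
                         threshold: float = 0.9) -> str:
    """Return the best candidate only if its probability clears the threshold.

    `candidates` maps candidate queries (possibly including the original)
    to model probabilities; otherwise the query is left unchanged.
    """
    if not candidates:
        return query
    best, prob = max(candidates.items(), key=lambda kv: kv[1])
    return best if prob >= threshold else query


# Ambiguous case from the slide: the model is not sure, so keep the query.
print(correct_if_confident(
    "mlss singapore",
    {"miss singapore": 0.55, "mlss singapore": 0.45}))
# Clear misspelling: the model is confident, so rewrite.
print(correct_if_confident(
    "machne learning",
    {"machine learning": 0.97, "machne learning": 0.03}))
```

Leaving an uncertain query untouched is the safer default, because a wrong correction replaces the user's intent before any document is even considered.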

21 Some Conclusions from Our Research
Machine learning based ranking and rule-based ranking both have pros and cons
State-of-the-art learning to rank algorithms
More features, better performance
No single signal for relevance is enough
Matching (feature) is more important than ranking (model)
Matching can be performed at multiple levels
Click data is useful
Browse data is useful
Flexibility is key for handling queries
List of useful features in ranking
Spelling errors in queries can be corrected first

22 References
Wei Wu, Zhengdong Lv, Hang Li. Regularized Mapping to Latent Structures and Its Application to Web Search. Under review.
Yunhua Hu, Yanan Qian, Hang Li, Daxin Jiang, Jian Pei, Qinghua Zheng. Mining Query Subtopics from Search Log Data. In Proceedings of the 35th Annual International ACM SIGIR Conference (SIGIR'12).
Fan Bu, Hang Li, Xiaoyan Zhu. String Re-Writing Kernel. In Proceedings of the 50th Annual Meeting of Association for Computational Linguistics (ACL'12). (ACL'12 Best Student Paper Award).
Hang Li. A Short Introduction to Learning to Rank. IEICE Transactions on Information and Systems, E94-D(10).
Quan Wang, Jun Xu, Hang Li, Nick Craswell. Regularized Latent Semantic Indexing. In Proceedings of the 34th Annual International ACM SIGIR Conference (SIGIR'11).
Ziqi Wang, Gu Xu, Hang Li, Ming Zhang. A Fast and Accurate Method for Approximate String Search. In Proceedings of the 49th Annual Meeting of Association for Computational Linguistics: Human Language Technologies (ACL-HLT'11), 52-61.
Hang Li. Learning to Rank for Information Retrieval and Natural Language Processing. Synthesis Lectures on Human Language Technology, Lecture 12, Morgan & Claypool Publishers.
Biao Xiang, Daxin Jiang, Jian Pei, Xiaohui Sun, Enhong Chen, Hang Li. Context-Aware Ranking in Web Search. In Proceedings of the 33rd Annual International ACM SIGIR Conference (SIGIR'10).
Jiafeng Guo, Gu Xu, Xueqi Cheng, Hang Li. Named Entity Recognition in Query. In Proceedings of the 32nd Annual International ACM SIGIR Conference (SIGIR'09), 2009.

23 References
Huanhuan Cao, Daxin Jiang, Jian Pei, Enhong Chen, Hang Li. Towards Context-aware Search by Learning a Very Large Variable Length Hidden Markov Model from Search Logs. In Proceedings of the 18th World Wide Web Conference (WWW'09).
Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li. Context-Aware Query Suggestion by Mining Click-Through and Session Data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'08). (SIGKDD'08 Best Application Paper Award).
Yuting Liu, Bin Gao, Tie-Yan Liu, Ying Zhang, Zhiming Ma, Shuyuan He, Hang Li. BrowseRank: Letting Users Vote for Page Importance. In Proceedings of the 31st Annual International ACM SIGIR Conference (SIGIR'08). (SIGIR'08 Best Student Paper Award).
Jiafeng Guo, Gu Xu, Hang Li, Xueqi Cheng. A Unified and Discriminative Model for Query Refinement. In Proceedings of the 31st Annual International ACM SIGIR Conference (SIGIR'08).
Fen Xia, Tie-Yan Liu, Jue Wang, Wensheng Zhang, Hang Li. Listwise Approach to Learning to Rank: Theory and Algorithm. In Proceedings of the 25th International Conference on Machine Learning (ICML'08).
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, Hang Li. Learning to Rank: From Pairwise Approach to Listwise Approach. In Proceedings of the 24th International Conference on Machine Learning (ICML'07).
Tie-Yan Liu, Jun Xu, Tao Qin, Wenying Xiong, Hang Li. LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval. In Proceedings of SIGIR 2007 Workshop on Learning to Rank for Information Retrieval, 2007.

24 References
Jun Xu, Hang Li. AdaRank: A Boosting Algorithm for Information Retrieval. In Proceedings of the 30th Annual International ACM SIGIR Conference (SIGIR'07).
Yunbo Cao, Jun Xu, Tie-Yan Liu, Hang Li, Yalou Huang, Hsiao-Wuen Hon. Adapting Ranking SVM to Document Retrieval. In Proceedings of the 29th Annual International ACM SIGIR Conference (SIGIR'06), 2006.

25 Thank You! Contact:

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

WebSci and Learning to Rank for IR

WebSci and Learning to Rank for IR WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles

More information

Learning to Rank: A New Technology for Text Processing

Learning to Rank: A New Technology for Text Processing TFANT 07 Tokyo Univ. March 2, 2007 Learning to Rank: A New Technology for Text Processing Hang Li Microsoft Research Asia Talk Outline What is Learning to Rank? Ranking SVM Definition Search Ranking SVM

More information

Learning to Rank. Tie-Yan Liu. Microsoft Research Asia CCIR 2011, Jinan,

Learning to Rank. Tie-Yan Liu. Microsoft Research Asia CCIR 2011, Jinan, Learning to Rank Tie-Yan Liu Microsoft Research Asia CCIR 2011, Jinan, 2011.10 History of Web Search Search engines powered by link analysis Traditional text retrieval engines 2011/10/22 Tie-Yan Liu @

More information

Learning to Rank for Information Retrieval

Learning to Rank for Information Retrieval Learning to Rank for Information Retrieval Tie-Yan Liu Learning to Rank for Information Retrieval Tie-Yan Liu Microsoft Research Asia Bldg #2, No. 5, Dan Ling Street Haidian District Beijing 100080 People

More information

Information Retrieval

Information Retrieval Information Retrieval Learning to Rank Ilya Markov i.markov@uva.nl University of Amsterdam Ilya Markov i.markov@uva.nl Information Retrieval 1 Course overview Offline Data Acquisition Data Processing Data

More information

Heterogeneous Graph-Based Intent Learning with Queries, Web Pages and Wikipedia Concepts

Heterogeneous Graph-Based Intent Learning with Queries, Web Pages and Wikipedia Concepts Heterogeneous Graph-Based Intent Learning with Queries, Web Pages and Wikipedia Concepts Xiang Ren, Yujing Wang, Xiao Yu, Jun Yan, Zheng Chen, Jiawei Han University of Illinois, at Urbana Champaign MicrosoD

More information

Learning to Rank for Information Retrieval. Tie-Yan Liu Lead Researcher Microsoft Research Asia

Learning to Rank for Information Retrieval. Tie-Yan Liu Lead Researcher Microsoft Research Asia Learning to Rank for Information Retrieval Tie-Yan Liu Lead Researcher Microsoft Research Asia 4/20/2008 Tie-Yan Liu @ Tutorial at WWW 2008 1 The Speaker Tie-Yan Liu Lead Researcher, Microsoft Research

More information

Learning to Rank for Information Retrieval and Natural Language Processing Second Edition

Learning to Rank for Information Retrieval and Natural Language Processing Second Edition MORGAN& CLAYPOOL PUBLISHERS Learning to Rank for Information Retrieval and Natural Language Processing Second Edition Hang Li SyntheSiS LectureS on human Language technologies Graeme Hirst, Series Editor

More information

Linking Entities in Tweets to Wikipedia Knowledge Base

Linking Entities in Tweets to Wikipedia Knowledge Base Linking Entities in Tweets to Wikipedia Knowledge Base Xianqi Zou, Chengjie Sun, Yaming Sun, Bingquan Liu, and Lei Lin School of Computer Science and Technology Harbin Institute of Technology, China {xqzou,cjsun,ymsun,liubq,linl}@insun.hit.edu.cn

More information

Ranking with Query-Dependent Loss for Web Search

Ranking with Query-Dependent Loss for Web Search Ranking with Query-Dependent Loss for Web Search Jiang Bian 1, Tie-Yan Liu 2, Tao Qin 2, Hongyuan Zha 1 Georgia Institute of Technology 1 Microsoft Research Asia 2 Outline Motivation Incorporating Query

More information

Analysis of Trail Algorithms for User Search Behavior

Analysis of Trail Algorithms for User Search Behavior Analysis of Trail Algorithms for User Search Behavior Surabhi S. Golechha, Prof. R.R. Keole Abstract Web log data has been the basis for analyzing user query session behavior for a number of years. Web

More information

Learning to Rank. from heuristics to theoretic approaches. Hongning Wang

Learning to Rank. from heuristics to theoretic approaches. Hongning Wang Learning to Rank from heuristics to theoretic approaches Hongning Wang Congratulations Job Offer from Bing Core Ranking team Design the ranking module for Bing.com CS 6501: Information Retrieval 2 How

More information

Ph.D. in Computer Science & Technology, Tsinghua University, Beijing, China, 2007

Ph.D. in Computer Science & Technology, Tsinghua University, Beijing, China, 2007 Yiqun Liu Associate Professor & Department co-chair Department of Computer Science and Technology Email yiqunliu@tsinghua.edu.cn URL http://www.thuir.org/group/~yqliu Phone +86-10-62796672 Fax +86-10-62796672

More information

LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1

LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1 LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1 1 Microsoft Research Asia, No.49 Zhichun Road, Haidian

More information

University of Delaware at Diversity Task of Web Track 2010

University of Delaware at Diversity Task of Web Track 2010 University of Delaware at Diversity Task of Web Track 2010 Wei Zheng 1, Xuanhui Wang 2, and Hui Fang 1 1 Department of ECE, University of Delaware 2 Yahoo! Abstract We report our systems and experiments

More information

Entity and Knowledge Base-oriented Information Retrieval

Entity and Knowledge Base-oriented Information Retrieval Entity and Knowledge Base-oriented Information Retrieval Presenter: Liuqing Li liuqing@vt.edu Digital Library Research Laboratory Virginia Polytechnic Institute and State University Blacksburg, VA 24061

More information

CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets

CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets Arjumand Younus 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group,

More information

IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM

IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM Myomyo Thannaing 1, Ayenandar Hlaing 2 1,2 University of Technology (Yadanarpon Cyber City), near Pyin Oo Lwin, Myanmar ABSTRACT

More information

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

TriRank: Review-aware Explainable Recommendation by Modeling Aspects TriRank: Review-aware Explainable Recommendation by Modeling Aspects Xiangnan He, Tao Chen, Min-Yen Kan, Xiao Chen National University of Singapore Presented by Xiangnan He CIKM 15, Melbourne, Australia

More information

2008 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA 2008) Chengdu, China December 2008

2008 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA 2008) Chengdu, China December 2008 2008 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA 2008) Chengdu, China 13-15 December 2008 IEEE Catalog Number: ISBN: CFP0881F-PRT 978-1-4244-3427-5 TABLE OF CONTENTS

More information

arxiv: v2 [cs.ir] 4 Dec 2018

arxiv: v2 [cs.ir] 4 Dec 2018 Qingyao Ai CICS, UMass Amherst Amherst, MA, USA aiqy@cs.umass.edu Xuanhui Wang xuanhui@google.com Nadav Golbandi nadavg@google.com arxiv:1811.04415v2 [cs.ir] 4 Dec 2018 ABSTRACT Michael Bendersky bemike@google.com

More information

NUS-I2R: Learning a Combined System for Entity Linking

NUS-I2R: Learning a Combined System for Entity Linking NUS-I2R: Learning a Combined System for Entity Linking Wei Zhang Yan Chuan Sim Jian Su Chew Lim Tan School of Computing National University of Singapore {z-wei, tancl} @comp.nus.edu.sg Institute for Infocomm

More information

Inferring User Search for Feedback Sessions

Inferring User Search for Feedback Sessions Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department

More information

Fall Lecture 16: Learning-to-rank

Fall Lecture 16: Learning-to-rank Fall 2016 CS646: Information Retrieval Lecture 16: Learning-to-rank Jiepu Jiang University of Massachusetts Amherst 2016/11/2 Credit: some materials are from Christopher D. Manning, James Allan, and Honglin

More information

Fractional Similarity : Cross-lingual Feature Selection for Search

Fractional Similarity : Cross-lingual Feature Selection for Search : Cross-lingual Feature Selection for Search Jagadeesh Jagarlamudi University of Maryland, College Park, USA Joint work with Paul N. Bennett Microsoft Research, Redmond, USA Using All the Data Existing

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Ontology-Based Web Query Classification for Research Paper Searching

Ontology-Based Web Query Classification for Research Paper Searching Ontology-Based Web Query Classification for Research Paper Searching MyoMyo ThanNaing University of Technology(Yatanarpon Cyber City) Mandalay,Myanmar Abstract- In web search engines, the retrieval of

More information

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22 Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion

More information

Predicting Next Search Actions with Search Engine Query Logs

Predicting Next Search Actions with Search Engine Query Logs 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology Predicting Next Search Actions with Search Engine Query Logs Kevin Hsin-Yih Lin Chieh-Jen Wang Hsin-Hsi

More information

Learning to Rank for Faceted Search Bridging the gap between theory and practice

Learning to Rank for Faceted Search Bridging the gap between theory and practice Learning to Rank for Faceted Search Bridging the gap between theory and practice Agnes van Belle @ Berlin Buzzwords 2017 Job-to-person search system Generated query Match indicator Faceted search Multiple

More information

A Deep Relevance Matching Model for Ad-hoc Retrieval

A Deep Relevance Matching Model for Ad-hoc Retrieval A Deep Relevance Matching Model for Ad-hoc Retrieval Jiafeng Guo 1, Yixing Fan 1, Qingyao Ai 2, W. Bruce Croft 2 1 CAS Key Lab of Web Data Science and Technology, Institute of Computing Technology, Chinese

More information

Comment Extraction from Blog Posts and Its Applications to Opinion Mining

Comment Extraction from Blog Posts and Its Applications to Opinion Mining Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan

More information

Deep condolence to Professor Mark Everingham

Deep condolence to Professor Mark Everingham Deep condolence to Professor Mark Everingham Towards VOC2012 Object Classification Challenge Generalized Hierarchical Matching for Sub-category Aware Object Classification National University of Singapore

More information

Linking Entities in Chinese Queries to Knowledge Graph

Linking Entities in Chinese Queries to Knowledge Graph Linking Entities in Chinese Queries to Knowledge Graph Jun Li 1, Jinxian Pan 2, Chen Ye 1, Yong Huang 1, Danlu Wen 1, and Zhichun Wang 1(B) 1 Beijing Normal University, Beijing, China zcwang@bnu.edu.cn

More information

A Brief Review of Representation Learning in Recommender 赵鑫 RUC

A Brief Review of Representation Learning in Recommender 赵鑫 RUC A Brief Review of Representation Learning in Recommender Systems @ 赵鑫 RUC batmanfly@qq.com Representation learning Overview of recommender systems Tasks Rating prediction Item recommendation Basic models

More information

Improving Recommendations Through. Re-Ranking Of Results

Improving Recommendations Through. Re-Ranking Of Results Improving Recommendations Through Re-Ranking Of Results S.Ashwini M.Tech, Computer Science Engineering, MLRIT, Hyderabad, Andhra Pradesh, India Abstract World Wide Web has become a good source for any

More information

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH Sandhya V. Kawale Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

University of Illinois at Urbana-Champaign, Urbana, Illinois, U.S.

University of Illinois at Urbana-Champaign, Urbana, Illinois, U.S. Hongning Wang Contact Information Research Interests Education 2205 Thomas M. Siebel Center Department of Computer Science WWW: sifaka.cs.uiuc.edu/~wang296/ University of Illinois at Urbana-Champaign E-mail:

More information

Microsoft Research Asia at the Web Track of TREC 2009

Microsoft Research Asia at the Web Track of TREC 2009 Microsoft Research Asia at the Web Track of TREC 2009 Zhicheng Dou, Kun Chen, Ruihua Song, Yunxiao Ma, Shuming Shi, and Ji-Rong Wen Microsoft Research Asia, Xi an Jiongtong University {zhichdou, rsong,

More information

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Dipak J Kakade, Nilesh P Sable Department of Computer Engineering, JSPM S Imperial College of Engg. And Research,

More information

Northeastern University in TREC 2009 Million Query Track

Northeastern University in TREC 2009 Million Query Track Northeastern University in TREC 2009 Million Query Track Evangelos Kanoulas, Keshi Dai, Virgil Pavlu, Stefan Savev, Javed Aslam Information Studies Department, University of Sheffield, Sheffield, UK College

More information

A new click model for relevance prediction in web search

A new click model for relevance prediction in web search A new click model for relevance prediction in web search Alexander Fishkov 1 and Sergey Nikolenko 2,3 1 St. Petersburg State Polytechnical University jetsnguns@gmail.com 2 Steklov Mathematical Institute,

More information

Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER

Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER Chalmers University of Technology University of Gothenburg

More information

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016 Advanced Topics in Information Retrieval Learning to Rank Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR July 14, 2016 Before we start oral exams July 28, the full

More information

Multimodal Information Spaces for Content-based Image Retrieval

Multimodal Information Spaces for Content-based Image Retrieval Research Proposal Multimodal Information Spaces for Content-based Image Retrieval Abstract Currently, image retrieval by content is a research problem of great interest in academia and the industry, due

More information

Log Linear Model for String Transformation Using Large Data Sets

Log Linear Model for String Transformation Using Large Data Sets Log Linear Model for String Transformation Using Large Data Sets Mr.G.Lenin 1, Ms.B.Vanitha 2, Mrs.C.K.Vijayalakshmi 3 Assistant Professor, Department of CSE, Podhigai College of Engineering & Technology,

More information

CWS: : A Comparative Web Search System

CWS: : A Comparative Web Search System CWS: : A Comparative Web Search System Jian-Tao Sun, Xuanhui Wang, Dou Shen Hua-Jun Zeng, Zheng Chen Microsoft Research Asia University of Illinois at Urbana-Champaign Hong Kong University of Science and

More information

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task Chieh-Jen Wang, Yung-Wei Lin, *Ming-Feng Tsai and Hsin-Hsi Chen Department of Computer Science and Information Engineering,

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Detecting Multilingual and Multi-Regional Query Intent in Web Search

Detecting Multilingual and Multi-Regional Query Intent in Web Search Detecting Multilingual and Multi-Regional Query Intent in Web Search Yi Chang, Ruiqiang Zhang, Srihari Reddy Yahoo! Labs 701 First Avenue Sunnyvale, CA 94089 {yichang,ruiqiang,sriharir}@yahoo-inc.com Yan

More information

A New Approach to Query Segmentation for Relevance Ranking in Web Search

A New Approach to Query Segmentation for Relevance Ranking in Web Search Noname manuscript No. (will be inserted by the editor) A New Approach to Query Segmentation for Relevance Ranking in Web Search Haocheng Wu Yunhua Hu Hang Li Enhong Chen Received: date / Accepted: date

More information

Learning to Mine Query Subtopics from Query Log

Learning to Mine Query Subtopics from Query Log Learning to Mine Query Subtopics from Query Log Zhenzhong Zhang, Le Sun, Xianpei Han Institute of Software, Chinese Academy of Sciences, Beijing, China {zhenzhong, sunle, xianpei}@nfs.iscas.ac.cn Abstract

More information

A Study of MatchPyramid Models on Ad hoc Retrieval

A Study of MatchPyramid Models on Ad hoc Retrieval A Study of MatchPyramid Models on Ad hoc Retrieval Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng Institute of Computing Technology, Chinese Academy of Sciences Text Matching Many text based

More information

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Effective Latent Space Graph-based Re-ranking Model with Global Consistency Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case

More information
