Effective Latent Space Graph-based Re-ranking Model with Global Consistency


 Shonda Watkins
1 Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12,
2 Outline
  - Introduction
  - Related work
  - Methodology
      - Graph-based re-ranking model
      - Learning a latent space graph
      - A case study and the overall algorithm
  - Experiments
  - Conclusions and Future Work
3 Introduction: Problem definition
  - Given a set of documents D, each represented as a term vector d_i = x_i
  - Relevance scores computed using a vector space model (VSM) or a language model (LM)
  - A connected graph over the documents
      - Explicit links (e.g., hyperlinks)
      - Implicit links (e.g., inferred from the content information)
  - Many other features
  - How can we leverage the interconnections between documents/entities to improve the ranking of retrieved results with respect to the query?
  - [Figure: graph of documents d_1 to d_5 and a query q]
4 Introduction
  - Initial ranking scores: relevance
  - Graph structure: centrality (importance, authority)
  - Simple method: combine those two parts linearly
  - Limitations of the simple method
      - Does not make full use of the information
      - Treats each part individually
  - What have we done?
      - Propose a joint regularization framework
      - Combine the content with link information in a latent space graph
5 Related work (Structural re-ranking model | Regularization framework | Learning a latent space)
  - Structural re-ranking model: uses some variations of PageRank and HITS
      - Centrality within graphs (Kurland and Lee, SIGIR 05 & SIGIR 06)
      - Improve Web search results using an affinity graph (Zhang et al., SIGIR 05)
      - Improve an initial ranking by random walk in entity-relation networks (Minkov et al., SIGIR 06)
  - Limitation: linear combination; the content and the links are treated individually
6 Related work (Structural re-ranking model | Regularization framework | Learning a latent space)
  - Regularization framework
      - Graph Laplacians for label propagation (two classes) (Zhu et al., ICML 03; Zhou et al., NIPS 03)
      - Extend the graph harmonic function to multiple classes (Mei et al., WWW 08)
      - Score regularization to adjust ad hoc retrieval scores (Diaz, CIKM 05)
      - Enhance learning to rank with parameterized regularization models (Qin et al., WWW 08)
  - Limitations: query-independent settings; multiple relationships between objects are not considered
7 Related work (Structural re-ranking model | Regularization framework | Learning a latent space)
  - Learning a latent space
      - Latent Semantic Analysis (LSA) (Deerwester et al., JASIS 90)
      - Probabilistic LSI (pLSI) (Hofmann, SIGIR 99)
      - pLSI + PHITS (Cohn and Hofmann, NIPS 00)
      - Combine content and link for classification using matrix factorization (Zhu et al., SIGIR 07)
  - These methods use joint factorization to learn the latent features
  - Difference: we leverage the latent features to build a latent space graph
8 Methodology
  - Graph-based re-ranking model
  - Learning a latent space graph
  - Case study: Expert finding
9 Graph-based re-ranking model (III. Methodology)
  - Intuition
      - Global consistency: similar documents are most likely to have similar ranking scores with respect to a query
      - The initial ranking scores provide invaluable information
  - Regularization framework [equation not preserved in the transcript; its annotated parts are: parameter, global consistency term, fit to the initial scores]
10 Graph-based re-ranking model (III. Methodology)
  - Optimization problem
  - A closed-form solution
  - Connection with other methods
      - μ_α → 0: returns the initial scores
      - μ_α → 1: a variation of the PageRank-based model
      - μ_α ∈ (0, 1): combines both sources of information simultaneously
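The equations for this slide did not survive transcription, so the following is only a minimal sketch under stated assumptions. It assumes the regularizer takes the normalized-graph form used in the label-propagation work cited earlier (Zhou et al., NIPS 03), which gives a closed form f* = (1 - μ_α) (I - μ_α S)^(-1) f0 with S = D^(-1/2) W D^(-1/2). The function name `rerank_closed_form` and its arguments are hypothetical. The sketch is consistent with the limiting cases listed on the slide: with μ_α = 0 it returns the initial scores unchanged.

```python
import numpy as np

def rerank_closed_form(W, f0, mu_alpha):
    """Re-rank initial scores f0 over a graph with symmetric weights W.

    Assumed form (not shown in the transcript):
        f* = (1 - mu_alpha) * (I - mu_alpha * S)^{-1} f0
        S  = D^{-1/2} W D^{-1/2},  D = diag(row sums of W)
    Requires 0 <= mu_alpha < 1 so that the linear system is non-singular.
    """
    W = np.asarray(W, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    degree = W.sum(axis=1)
    # Isolated nodes (zero degree) get a zero scaling factor instead of a division by zero.
    safe_degree = np.where(degree > 0, degree, 1.0)
    inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(safe_degree), 0.0)
    S = inv_sqrt[:, None] * W * inv_sqrt[None, :]
    n = W.shape[0]
    # Solve the linear system rather than forming the inverse explicitly.
    return (1.0 - mu_alpha) * np.linalg.solve(np.eye(n) - mu_alpha * S, f0)


if __name__ == "__main__":
    # Chain graph d0 - d1 - d2; only d0 has a non-zero initial score.
    W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
    f0 = np.array([1.0, 0.0, 0.0])
    print(rerank_closed_form(W, f0, 0.5))
```

In the chain example, the score of d0 propagates to d1 and, more weakly, to d2, which is the global-consistency behaviour the slide describes.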
12 Learning a latent space graph (III. Methodology)
  - Objective: incorporate the content with link information (or relational data) simultaneously
  - Latent Semantic Analysis
  - Joint factorization: combine the content with relational data
  - Build latent space graph: calculate the weight matrix W
13 Latent Semantic Analysis (III. Methodology: Learning a latent space graph)
  - Map documents to a vector space of reduced dimensionality
  - SVD is performed on the matrix
  - Keep the largest k singular values
  - Reformulated as an optimization problem
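The LSA step above can be sketched with a truncated SVD. This is an illustration, not the authors' code: `lsa_embed` and its return convention are hypothetical names, and the input is assumed to be a dense document-term matrix. The link to an optimization problem is the standard one: the top-k SVD gives the rank-k matrix X Vᵀ that minimizes the Frobenius-norm reconstruction error of C.

```python
import numpy as np

def lsa_embed(C, k):
    """Return (X, V): k-dimensional document coordinates and term loadings.

    X @ V.T is the best rank-k approximation of C in the Frobenius norm.
    """
    C = np.asarray(C, dtype=float)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    # numpy returns singular values in descending order, so the first k are the largest.
    X = U[:, :k] * s[:k]   # documents in the latent space
    V = Vt[:k, :].T        # terms in the latent space
    return X, V


if __name__ == "__main__":
    # A rank-2 toy document-term matrix: 4 documents, 5 terms.
    left = np.array([[1, 0], [2, 0], [0, 1], [0, 3]], dtype=float)
    right = np.array([[1, 1, 0, 0, 1], [0, 0, 2, 1, 0]], dtype=float)
    C = left @ right
    X, V = lsa_embed(C, k=2)
    print(np.allclose(X @ V.T, C))
```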
14 Embedding multiple relational data (III. Methodology: Learning a latent space graph)
  - Taking the papers as an example
      - Paper-term matrix C
      - Paper-author matrix A
  - A unified optimization problem, solved with conjugate gradient
  - [Diagram labels: A (N×L), C (N×M), X (N×K), V_C, V_A]
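The unified objective itself is not preserved in the transcript. As a hedged sketch, assume it has the common joint-factorization form ||C - X V_Cᵀ||² + λ ||A - X V_Aᵀ||² + γ (||X||² + ||V_C||² + ||V_A||²), where the shared matrix X holds the latent paper features. The slide names conjugate gradient as the solver; the code below swaps in alternating least squares instead, purely because it fits in a few lines. The names `joint_factorize`, `joint_loss`, `lam` and `gamma` are hypothetical.

```python
import numpy as np

def joint_loss(C, A, X, V_C, V_A, lam, gamma):
    """Assumed objective: two reconstruction errors plus L2 regularization."""
    fit_c = np.sum((C - X @ V_C.T) ** 2)
    fit_a = np.sum((A - X @ V_A.T) ** 2)
    reg = np.sum(X ** 2) + np.sum(V_C ** 2) + np.sum(V_A ** 2)
    return fit_c + lam * fit_a + gamma * reg

def joint_factorize(C, A, k, lam=1.0, gamma=1e-3, iters=50, seed=0):
    """Alternating least squares on the assumed objective (not conjugate gradient)."""
    C = np.asarray(C, dtype=float)
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((C.shape[0], k))
    eye = np.eye(k)
    losses = []
    for _ in range(iters):
        XtX = X.T @ X
        # Each update is the exact minimizer for one block with the others fixed.
        V_C = np.linalg.solve(XtX + gamma * eye, X.T @ C).T
        V_A = np.linalg.solve(lam * XtX + gamma * eye, lam * (X.T @ A)).T
        G = V_C.T @ V_C + lam * (V_A.T @ V_A) + gamma * eye
        X = np.linalg.solve(G, (C @ V_C + lam * (A @ V_A)).T).T
        losses.append(joint_loss(C, A, X, V_C, V_A, lam, gamma))
    return X, V_C, V_A, losses


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_true = rng.standard_normal((6, 2))
    C = X_true @ rng.standard_normal((2, 8))   # toy paper-term matrix
    A = X_true @ rng.standard_normal((2, 4))   # toy paper-author matrix
    X, V_C, V_A, losses = joint_factorize(C, A, k=2)
    print(losses[0], losses[-1])
```

The rows of X are the latent features that the next slide uses to build the graph.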
16 Build latent space graph (III. Methodology: Learning a latent space graph)
  - The edge weight w_ij is defined [equation not preserved in the transcript], giving the weight matrix W
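Because the definition of w_ij is missing from the transcript, the sketch below makes two explicit assumptions: edge weights are cosine similarities between latent feature vectors, and each document is linked only to its k_nn nearest neighbors (the experiments later vary such a k_nn). The function name `knn_graph` is hypothetical, and the actual weighting in the talk may differ.

```python
import numpy as np

def knn_graph(X, k_nn):
    """Build a symmetric weight matrix W from latent features X (one row per document).

    Assumption: w_ij is the cosine similarity (clipped at 0) if j is among the
    k_nn nearest neighbors of i or vice versa, and 0 otherwise.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms > 0, norms, 1.0)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops from the neighbor search
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(sim[i])[::-1][:k_nn]
        W[i, neighbors] = np.maximum(sim[i, neighbors], 0.0)
    # Keep an edge if either endpoint selected the other.
    return np.maximum(W, W.T)


if __name__ == "__main__":
    X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    print(knn_graph(X, k_nn=1))
```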
18 Case study: Application to expert finding (III. Methodology)
  - Use a statistical language model to calculate the initial ranking scores: the probability of a query given a document
  - Infer a document model θ_d for each document
  - Score a document by the probability of the query being generated by the document model θ_d
  - This is the product of the probabilities of the query terms under the document model (assumption: the terms are independent)
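The query-likelihood score described above can be sketched as follows. The slide does not say how θ_d is estimated, so the Jelinek-Mercer smoothing with a collection model used here is an assumption, as are the names `query_log_likelihood` and `lam`. The sum of logs is the log of the product over independent query terms.

```python
import math
from collections import Counter

def query_log_likelihood(query_terms, doc_terms, collection_tf, collection_len, lam=0.5):
    """Return log p(q | theta_d) = sum over query terms of log p(t | theta_d).

    Assumed smoothing: p(t | theta_d) = (1 - lam) * tf(t, d) / |d| + lam * cf(t) / |collection|.
    """
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_doc = tf[term] / doc_len if doc_len else 0.0
        p_col = collection_tf.get(term, 0) / collection_len
        p = (1.0 - lam) * p_doc + lam * p_col
        if p == 0.0:
            return float("-inf")  # term unseen in both document and collection
        score += math.log(p)
    return score


if __name__ == "__main__":
    d1 = "graph ranking graph model".split()
    d2 = "image retrieval model".split()
    collection_tf = Counter(d1 + d2)
    collection_len = len(d1) + len(d2)
    query = ["graph", "ranking"]
    print(query_log_likelihood(query, d1, collection_tf, collection_len))
    print(query_log_likelihood(query, d2, collection_tf, collection_len))
```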
19 Case study: Application to expert finding (III. Methodology)
  - Expert finding: identify a list of experts in the academic field for a given query topic (e.g., data mining → Jiawei Han, etc.)
  - Publications as representative of their expertise
      - Use the DBLP dataset to obtain the publications
      - Authors have expertise in the topics of their papers
      - Overall aggregation of their publications
  - Refine the ranking scores of papers, then aggregate the refined scores to re-rank the experts
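The last step, aggregating refined paper scores into expert scores, might look like the sketch below. The transcript does not state the aggregation function, so a plain sum over each author's papers is assumed. The function name `rank_experts` and the toy identifiers are hypothetical.

```python
from collections import defaultdict

def rank_experts(paper_scores, paper_authors):
    """Aggregate per-paper scores into per-author scores and sort descending.

    Assumption: an author's score is the sum of the refined scores of their papers.
    Ties are broken alphabetically so the output is deterministic.
    """
    totals = defaultdict(float)
    for paper, score in paper_scores.items():
        for author in paper_authors.get(paper, []):
            totals[author] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    paper_scores = {"p1": 0.6, "p2": 0.3, "p3": 0.4}
    paper_authors = {"p1": ["alice"], "p2": ["alice", "bob"], "p3": ["bob"]}
    for author, score in rank_experts(paper_scores, paper_authors):
        print(author, round(score, 3))
```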
20 Case study: Application to expert finding (III. Methodology)
21 Experiments: DBLP Collection
  - A subset of the DBLP records (15-CONF)
  - Statistics of the 15-CONF collection [table not preserved in the transcript]
22 Benchmark Dataset (IV. Experiments)
  - A benchmark dataset with 16 topics and expert lists
23 Evaluation Metrics (IV. Experiments)
  - Precision at rank n (P@n)
  - Mean Average Precision (MAP)
  - Bpref: a score that is a function of the number of non-relevant candidates
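The slide's formulas were lost, so here is a small sketch of the two metrics using their standard textbook definitions. Bpref is left out because the transcript does not preserve the exact variant used. All function names are illustrative.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked candidates that are relevant."""
    return sum(1 for candidate in ranked[:n] if candidate in relevant) / n

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant candidate appears."""
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) pairs, one per topic."""
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]
    relevant = {"a", "c"}
    print(precision_at_n(ranked, relevant, 2))
    print(average_precision(ranked, relevant))
```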
24 Preliminary Experiments (IV. Experiments)
  - Evaluation results (%) [table not preserved in the transcript]
  - PRRM may not improve the performance
  - GBRM achieves the best results
25 Details of the results (IV. Experiments)
26 Effect of parameter μ_α (IV. Experiments)
  - μ_α → 0: returns the initial scores (baseline)
  - μ_α → 1: discards the initial scores and considers only the global consistency over the graph
27 Effect of parameter μ_α (IV. Experiments)
  - Robust; achieves the best results when μ_α ∈ (0.5, 0.7)
28 Effect of graph construction (IV. Experiments)
  - Different dimensionality (k_d) of the latent feature, which is used to calculate the weight matrix W
  - Results become better for greater k_d, because a higher-dimensional space can better capture the similarities
  - k_d = 50 achieves better results than tf.idf
29 Effect of graph construction (IV. Experiments)
  - Different number of nearest neighbors (k_nn)
  - Performance tends to degrade a little with increasing k_nn
  - k_nn = 10 achieves the best results
  - Average processing time increases linearly with k_nn
30 Conclusions and Future Work
  - Conclusions
      - Leverage the graph-based model for the query-dependent ranking problem
      - Integrate the latent space with the graph-based re-ranking model
      - Address the expert finding task in the academic field using the proposed method
      - The improvement from our proposed model is promising
  - Future work
      - Extend our framework to consider more features
      - Apply the framework to other applications and large-scale datasets
31 Q&A. Thanks!
More informationUnsupervised Learning
Unsupervised Learning Learning without Class Labels (or correct outputs) Density Estimation Learn P(X) given training data for X Clustering Partition data into clusters Dimensionality Reduction Discover
More informationMixture Models and the EM Algorithm
Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Finite Mixture Models Say we have a data set D = {x 1,..., x N } where x i is
More informationROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015
ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti
More informationUnderstanding the Query: THCIB and THUIS at NTCIR10 Intent Task. Junjun Wang 2013/4/22
Understanding the Query: THCIB and THUIS at NTCIR10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion
More informationTRANSDUCTIVE LINK SPAM DETECTION
TRANSDUCTIVE LINK SPAM DETECTION Denny Zhou Microsoft Research http://research.microsoft.com/~denzho Joint work with Chris Burges and Tao Tao Presenter: Krysta Svore Link spam detection problem Classification
More informationEfficient Algorithms may not be those we think
Efficient Algorithms may not be those we think Yann LeCun, Computational and Biological Learning Lab The Courant Institute of Mathematical Sciences New York University http://yann.lecun.com http://www.cs.nyu.edu/~yann
More informationLargeScale Face Manifold Learning
LargeScale Face Manifold Learning Sanjiv Kumar Google Research New York, NY * Joint work with A. Talwalkar, H. Rowley and M. Mohri 1 Face Manifold Learning 50 x 50 pixel faces R 2500 50 x 50 pixel random
More informationReviewer Profiling Using Sparse Matrix Regression
Reviewer Profiling Using Sparse Matrix Regression Evangelos E. Papalexakis, Nicholas D. Sidiropoulos, Minos N. Garofalakis Technical University of Crete, ECE department 14 December 2010, OEDM 2010, Sydney,
More information