Automatically Building Research Reading Lists

Similar documents
Automatically Building Research Reading Lists

Recommender Systems: User Experience and System Issues

Study on Recommendation Systems and their Evaluation Metrics PRESENTATION BY : KALHAN DHAR

Towards Time-Aware Semantic enriched Recommender Systems for movies

Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach

The Design and Implementation of an Intelligent Online Recommender System

The Tourism Recommendation of Jingdezhen Based on Unifying User-based and Item-based Collaborative filtering

Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack

Recommender Systems: User Experience and System Issues. About me. Scope of Recommenders. A Quick Introduction. Wide Range of Algorithms

DynamicLens: A Dynamic User-Interface for a Meta-Recommendation System

Making Recommendations Better: An Analytic Model for Human- Recommender Interaction

Collaborative Filtering using Euclidean Distance in Recommendation Engine

arxiv: v1 [cs.ir] 1 Jul 2016

SimRank : A Measure of Structural-Context Similarity

Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem

A Constrained Spreading Activation Approach to Collaborative Filtering

A Recursive Prediction Algorithm for Collaborative Filtering Recommender Systems

A Time-based Recommender System using Implicit Feedback

Part 11: Collaborative Filtering. Francesco Ricci

Semantically Rich Recommendations in Social Networks for Sharing, Exchanging and Ranking Semantic Context

Application of Dimensionality Reduction in Recommender System -- A Case Study

Fusion-based Recommender System for Improving Serendipity

A Scalable, Accurate Hybrid Recommender System

Visualizing Recommendation Flow on Social Network

Information Filtering SI650: Information Retrieval

Local is Good: A Fast Citation Recommendation Approach

Towards QoS Prediction for Web Services based on Adjusted Euclidean Distances

Design of Research Buddy: Personalized Research Paper Recommendation System

Comparing State-of-the-Art Collaborative Filtering Systems

Author(s): Rahul Sami, 2009

Part 11: Collaborative Filtering. Francesco Ricci

RecDB: Towards DBMS Support for Online Recommender Systems

Recommender Systems - Content, Collaborative, Hybrid

Letting Users Choose Recommender Algorithms: An Experimental Study

Lecture 9: I: Web Retrieval II: Webology. Johan Bollen Old Dominion University Department of Computer Science

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

A System for Identifying Voyage Package Using Different Recommendations Techniques

Recommender Systems using Collaborative Filtering D Yogendra Rao

Privacy-Preserving Collaborative Filtering using Randomized Perturbation Techniques

CS249: ADVANCED DATA MINING

Slope One Predictors for Online Rating-Based Collaborative Filtering

On the real-time web as a source of recommendation knowledge. Garcia Esparza, Sandra; O'Mahony, Michael P.; Smyth, Barry

Content Bookmarking and Recommendation

Learning Bidirectional Similarity for Collaborative Filtering

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

Survey on Collaborative Filtering Technique in Recommendation System

UMass at TREC 2017 Common Core Track

Recommender System using Collaborative Filtering Methods: A Performance Evaluation

EigenRank: A Ranking-Oriented Approach to Collaborative Filtering

Rethinking the Recommender Research Ecosystem: Reproducibility, Openness, and LensKit

COMP 4601 Hubs and Authorities

Collaborative Filtering based on User Trends

Content-Based Recommendation for Web Personalization

Michele Gorgoglione Politecnico di Bari Viale Japigia, Bari (Italy)

Supporting Scholarly Search with Keyqueries

BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering

A Constrained Spreading Activation Approach to Collaborative Filtering

Advances in Natural and Applied Sciences. Information Retrieval Using Collaborative Filtering and Item Based Recommendation

Recommendation Based on Co-clustring Algorithm, Co-dissimilarity and Spanning Tree

Personalized Information Retrieval

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Recommender System for Online Dating Service. KSI MFF UK Malostranské nám. 25, Prague 1, Czech Republic

A Random Walk Method for Alleviating the Sparsity Problem in Collaborative Filtering

Performance Comparison of Algorithms for Movie Rating Estimation

Collaborative Filtering using a Spreading Activation Approach

Seminar Collaborative Filtering. KDD Cup. Ziawasch Abedjan, Arvid Heise, Felix Naumann

Literature Survey on Various Recommendation Techniques in Collaborative Filtering

Project Report. An Introduction to Collaborative Filtering

SOFIA: Social Filtering for Niche Markets

Comparing Performance of Recommendation Techniques in the Blogsphere

Chapter 6 Recommender Systems for the Web. J. Ben Schafer, Joseph A. Konstan, and John T. Riedl

Solving the Sparsity Problem in Recommender Systems Using Association Retrieval

RSLIS at INEX 2012: Social Book Search Track

James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence!

Music Recommendation Engine

A Visual Interface for Social Information Filtering

CAPTURING USER BROWSING BEHAVIOUR INDICATORS

ExcUseMe: Asking Users to Help in Item Cold-Start Recommendations

Rocchio Algorithm to Enhance Semantically Collaborative Filtering

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

Experiences from Implementing Collaborative Filtering in a Web 2.0 Application

Information Retrieval

Bookmark Hierarchies and Collaborative Recommendation

Passage Retrieval and other XML-Retrieval Tasks. Andrew Trotman (Otago) Shlomo Geva (QUT)

A Modified Algorithm to Handle Dangling Pages using Hypothetical Node

Prowess Improvement of Accuracy for Moving Rating Recommendation System

Cognizable Recommendation System using Spatial Ratings with Collaborative Filter Shilpa Mathapati 1, Bhuvaneswari Raju 2

KNOW At The Social Book Search Lab 2016 Suggestion Track

Bibliometrics: Citation Analysis

Web Personalization & Recommender Systems

5/13/2009. Introduction. Introduction. Introduction. Introduction. Introduction

Know your neighbours: Machine Learning on Graphs

Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs

Constructing Effective and Efficient Topic-Specific Authority Networks for Expert Finding in Social Media

NLMF: NonLinear Matrix Factorization Methods for Top-N Recommender Systems

Identifying Attack Models for Secure Recommendation

Towards a Scalable Social Recommender Engine for Online Marketplaces: The Case of Apache Solr

arxiv: v1 [cs.ir] 10 Sep 2018

(E-AggNN) Implementation of Effective and Efficient Algorithm for flexible aggregate similarity Search on Recommender System

Recommender Systems: User Experience and System Issues. About me. Scope of Recommenders. A Quick Introduction. Wide Range of Algorithms

Transcription:

Automatically Building Research Reading Lists Michael D. Ekstrand 1 Praveen Kanaan 1 James A. Stemper 2 John T. Butler 2 Joseph A. Konstan 1 John T. Riedl 1 ekstrand@cs.umn.edu 1 GroupLens Research Department of Computer Science and Engineering University of Minnesota 2 University of Minnesota Libraries RecSys 2010 Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 1 / 35

Recommending Research Papers Way too many papers to read need to find the right ones so we can get on to research! Many papers, many reviewers who is most qualified to judge the work? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 2 / 35

Recommending Research Papers Way too many papers to read need to find the right ones so we can get on to research! Many papers, many reviewers who is most qualified to judge the work? Working on a bibliography or reference list who should have cited you? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 2 / 35

Building a Reading List Alice is a new grad student who took a class on adaptive web technologies. The professor assigned some readings, and she wants to learn more. Herlocker et al. An algorithmic framework for performing collaborative filtering. In Proc. SIGIR 1999. Hoffman. Latent semantic models for collaborative filtering. In TOIS, volume 22, issue 1 (January 2004). Guy et al. Personalized recommendation of social software items based on social relations. In Proc. RecSys 2009. Karypis. Evaluation of Item-Based Top-N Recommendation Algorithms. In Proc. CIKM 2001. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 3 / 35

Building a Reading List Alice is a new grad student who took a class on adaptive web technologies. The professor assigned some readings, and she wants to learn more. Herlocker et al. An algorithmic framework for performing collaborative filtering. In Proc. SIGIR 1999. Hoffman. Latent semantic models for collaborative filtering. In TOIS, volume 22, issue 1 (January 2004). Guy et al. Personalized recommendation of social software items based on social relations. In Proc. RecSys 2009. Karypis. Evaluation of Item-Based Top-N Recommendation Algorithms. In Proc. CIKM 2001. How can we find important ones to read? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 3 / 35

The Reading List Task Input List of papers on topic of interest Magic happens here Output List of important papers to read Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 4 / 35

Domain Characteristics Traditional CF methods are blind to item content or relationships. The citation web has a defined structure. Can we harness this to improve recommendation? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 5 / 35

Questions Can we harness domain structure to improve recommendation of research papers? What algorithms perform well at building reading lists? How can we measure this performance? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 6 / 35

Approach Analyze Task and Domain Design Candidate Algorithms Prune Algorithm Pool (offline) Evaluate Performance (user study) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 7 / 35

Approach Analyze Task and Domain Design Candidate Algorithms Prune Algorithm Pool (offline) Evaluate Performance (user study) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 8 / 35

Designing Candidate Algorithms Analyze Task and Domain Design Candidate Algorithms Prune Algorithm Pool (offline) Evaluate Performance (user study) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 9 / 35

Algorithms We adapt and combine off-the-shelf algorithms. Collaborative filtering (CF) Item-item CF over sets of items [Karypis, 2001] Content-based filtering (CBF) Lemur toolkit [Ogilvie and Callan, 2002] in BM25 mode with recommended baseline parameters Graph ranking PageRank [Page et al., 1999] HITS authority scores [Kleinberg, 1999] SALSA [Lempel and Moran, 2000] Relative algorithms [White and Smyth, 2003] Biased HITS k-step Markov importance Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 10 / 35

Applying Item-Item CF No users - papers are both users and items [McNee et al., 2002]. A B C A B C Becomes a form of co-citation analysis. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 11 / 35

Rank-Weighting CF Inspired by [Karypis, 2001]: user influence in item similarity does not need to be the same! Karypis s approach: normalize user purchase vectors to unit vectors so users with few purchases influence item similarity more. Our adaptation: weight citation vectors by graph rank. Highly-ranked papers have more influence. â = r(a)a Use weighted citation vectors to compute cited paper similarity. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 12 / 35

Using Weights with CBF Two approaches Subgraph ranking Build subgraph from CBF results and rank its nodes. Query Graph CBF Expand Rank Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 13 / 35

Using Weights with CBF Two approaches Linear blending Blend CBF scores with node weights in the global graph. Query Graph CBF Rank αl(a) + βr(a) Learned coefficients with multivariate logistic regression (more on this later). Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 14 / 35

Hybridizing Recommenders [Burke, 2002] Several ways of blending algorithms Use CF as input to CBF Use CBF as input to CF Blend output from CBF and CF to produce result Basic algorithms + all hybrids = 177 algorithms. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 15 / 35

Hybridizing Recommenders [Burke, 2002] Several ways of blending algorithms Use CF as input to CBF Use CBF as input to CF Blend output from CBF and CF to produce result Basic algorithms + all hybrids = 177 algorithms. CF CBF Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 15 / 35

Hybridizing Recommenders [Burke, 2002] Several ways of blending algorithms Use CF as input to CBF Use CBF as input to CF Blend output from CBF and CF to produce result Basic algorithms + all hybrids = 177 algorithms. CBF CF Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 15 / 35

Hybridizing Recommenders [Burke, 2002] Several ways of blending algorithms Use CF as input to CBF Use CBF as input to CF Blend output from CBF and CF to produce result Basic algorithms + all hybrids = 177 algorithms. CF CBF Fusion Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 15 / 35

Pruning the Algorithm Pool Analyze Task and Domain Design Candidate Algorithms Prune Algorithm Pool (offline) Evaluate Performance (user study) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 16 / 35

Designing an Evaluation Strategy Goal: Measure algorithm performance at supporting a specific task. Accuracy isn t enough [McNee et al., 2006]. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 17 / 35

Offline Evaluation Use metadata from ACM Digital Library. Simulate introductory reading list with hold-out test on articles in ACM Computing Surveys. Hold out 5 items from each citation list Attempt to recommend back the 5 items Skip surveys with less than 15 resolved citations Result: 220 survey articles for training and testing. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 18 / 35

Interlude - Learning CBF Blending Coefficients Use half the articles to train the CBF blend. Learned multivariate logistic regression with response of 1 for articles in the holdout set, 0 for other articles. s(a) = αl(a) + βr(a) Since we re ranking, intercept and log don t matter. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 19 / 35

Offline Evaluation (cont) Measure using half-life utility metric [Breese et al., 1998]: R a = i u a,i 2 (i 1)/(α 1) α = 5, the length of our reading lists u a,i = 1 if article i is cited by survey a, 0 otherwise Aggregate to compute fraction of potential utility achieved (in range [0, 1]). R = 1 R nra max a a Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 20 / 35

Offline Results CF Half life utility 0.10 0.05 0.15 0.20 0.30 0.25 cf hits cf plain cf full unit cf unit cf pagerank cf salsa Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 21 / 35

Offline Results Half life utility 0.30 0.25 0.20 0.15 0.10 0.05 CF CBF CBF CF CF CBF Fusion cbf blend hits cbf hits cbf hits biased cf pagerank cbf cf salsa cbf blend salsa cf pagerank cbf blend salsa cf salsa fuse cbf blend hits cf pagerank fuse cbf blend hits cf unit fuse cbf blend hits cbf blend pagerank cf salsa cbf cf pagerank cbf cf salsa cf unit cf pagerank cf salsa Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 22 / 35

Offline Results CBF Half life utility 0.02 0.04 0.06 0.08 0.10 cbf pagerank cbf k markov cbf salsa cbf cbf blend pagerank cbf hits biased cbf blend salsa cbf blend hits cbf hits Salsa performs like PageRank Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 23 / 35

Selected Algorithms Chose 3 algorithms that performed well but have differing structures. CBF with Biased HITS in subgraph ranking configuration CBF fed into PageRank-weighted CF PageRank-weighted CF Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 24 / 35

Evaluating Performance Analyze Task and Domain Design Candidate Algorithms Prune Algorithm Pool (offline) Evaluate Performance (user study) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 25 / 35

User-Based Evaluation Asked graduate students to provide query sets of 5-10 papers and evaluate 3 5-item reading lists. Relevance of individual papers Importance of individual papers Quality of reading list as a whole Relative ranking of reading lists All questions were set in the context of introducing a new researcher to the topic. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 26 / 35

User-Based Results Rank score 2.5 2.0 1.5 Average score 4 3 2 1 Relevance Importance Familiarity cbf cbf cf cf cbf cbf cf cf Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 31 / 35

User-Based Results Rank score 2.5 2.0 1.5 Average score 4 3 2 1 Relevance Importance Familiarity cbf cbf cf cf cbf cbf cf cf CF performed the best (surprisingly) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 31 / 35

Input Query (again) Herlocker et al. An algorithmic framework for performing collaborative filtering. In Proc. SIGIR 1999. Hoffman. Latent semantic models for collaborative filtering. In TOIS, volume 22, issue 1 (January 2004). Guy et al. Personalized recommendation of social software items based on social relations. In Proc. RecSys 2009. Karypis. Evaluation of Item-Based Top-N Recommendation Algorithms. In Proc. CIKM 2001. How do we do? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 32 / 35

CF s Recommender Systems Recommendations Sarwar et al. Item-based collaborative fitlering recommendation algorithms. In Proc. WWW 2001. Resnick et al. GroupLens: An open architecture for collaborative filtering of netnews. In Proc. CSCW 1994. Shardanand and Maes. Social information filtering: algorithms for automating word of mouth. In Proc. CHI 1995. Deshpande and Karypis. Item-based top-n recommendation algorithms. In TOIS, volume 22, issue 1 (January 2004). Herlocker et al. Evaluating collaborative filtering recommender systems. In TOIS, volume 22, issue 1 (January 2004). Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 33 / 35

Task-Driven Design and Evaluation Design recommenders to support human information needs, not just improve prediction error [McNee et al., 2006]. Task and context inform design of both recommenders and evaluation strategies. Opportunity to harness unique characteristics (or personalities ) of specific algorithms. Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 34 / 35

Conclusion Contributions Design and evaluation of recommenders for reading lists Method for biasing CF with authority metrics CF works well for reading lists (surprising) Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 35 / 35

Conclusion Contributions Design and evaluation of recommenders for reading lists Method for biasing CF with authority metrics CF works well for reading lists (surprising) Open Questions How to tell user that they have an important paper? How to more accurately operate within topic scope? Is SALSA misused? Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 35 / 35

Conclusion Contributions Design and evaluation of recommenders for reading lists Method for biasing CF with authority metrics CF works well for reading lists (surprising) Open Questions How to tell user that they have an important paper? How to more accurately operate within topic scope? Is SALSA misused? Questions? ekstrand@cs.umn.edu Funded by NSF grants IIS 05-34939 and IIS 08-08692 Ekstrand et al. (GroupLens/UMN) Automatically Building Reading Lists #recsys2010 35 / 35