Recommender Systems: User Experience and System Issues

Joseph A. Konstan
University of Minnesota
konstan@cs.umn.edu
http://www.grouplens.org
Summer 2005

About me
  Professor of Computer Science & Engineering, Univ. of Minnesota
  Ph.D. (1993) from U.C. Berkeley
    GUI toolkit architecture
  Teaching Interests: HCI, GUI Tools
  Research Interests: General HCI, and...
    Collaborative Information Filtering
    Multimedia Authoring and Systems
    Visualization and Information Management
    Medical/Health Applications and their Delivery

A Quick Introduction
  What are recommender systems?
    Tools to help identify worthwhile stuff
  Filtering interfaces
    E-mail filters, clipping services
  Recommendation interfaces
    Suggestion lists, top-n, offers and promotions
  Prediction interfaces
    Evaluate candidates, predicted ratings

Scope of Recommenders
  Purely Editorial Recommenders
  Content Filtering Recommenders
  Collaborative Filtering Recommenders
  Hybrid Recommenders

Wide Range of Algorithms
  Simple Keyword Vector Matches
  Pure Nearest-Neighbor Collaborative Filtering
  Machine Learning on Content or Ratings

Classic Collaborative Filtering
  MovieLens*
  K-nearest neighbor algorithm
  Model-free, memory-based implementation
  Intuitive application, supports typical interfaces
  *Note: newest releases use an updated architecture/algorithm

CF Classic
  [Diagram: C.F. Engine; Ratings; Correlations]

Submit Ratings
  [Diagram: C.F. Engine; Ratings; Correlations; label: ratings]

Store Ratings
  [Diagram: C.F. Engine; Ratings; Correlations; label: ratings]

Compute Correlations
  [Diagram: C.F. Engine; Ratings; Correlations; label: pairwise corr.]

Request Recommendations
  [Diagram: C.F. Engine; Ratings; Correlations; label: request]

Identify Neighbors
  [Diagram: C.F. Engine; Ratings; Correlations; Neighborhood; label: find good]

Select Items; Predict Ratings
  [Diagram: C.F. Engine; Ratings; Correlations; Neighborhood; labels: predictions, recommendations]

Understanding the Computation
  Movies (columns): Hoop Dreams, Star Wars, Pretty Woman, Titanic, Blimp, Rocky XV
  Ratings per user, in row order (for partially filled rows the column positions are not recoverable):
    Joe:     D A B D ? ?
    John:    A F D F
    Susan:   A A A A A A
    Pat:     D A C
    Jean:    A C A C A
    Ben:     F A F
    Nathan:  D A A
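The steps above (compute pairwise correlations, identify neighbors, predict a rating) can be sketched in a few lines. This is a minimal illustration of user-user k-nearest-neighbor filtering with Pearson correlation, not the MovieLens implementation. The user names, item names, and 1-5 ratings are hypothetical toy data, not the grades from the slide, and the choices to keep only positively correlated neighbors and to fall back to the user's mean are assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data (user -> {item: rating on a 1-5 scale}).
ratings = {
    "ann": {"m1": 5, "m2": 1, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "cat": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
    "dan": {"m1": 5, "m2": 2, "m4": 5},
}


def mean_rating(user):
    """Average of all ratings given by one user."""
    values = ratings[user].values()
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings[a][i] for i in common) / len(common)
    mean_b = sum(ratings[b][i] for i in common) / len(common)
    num = sum((ratings[a][i] - mean_a) * (ratings[b][i] - mean_b) for i in common)
    den = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((ratings[b][i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(user, item, k=2):
    """Predict a rating from the k most correlated users who rated the item."""
    candidates = [
        (pearson(user, other), other)
        for other in ratings
        if other != user and item in ratings[other]
    ]
    # Assumption: only positively correlated users act as neighbors.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbors:
        return mean_rating(user)
    num = sum(w * (ratings[v][item] - mean_rating(v)) for w, v in neighbors)
    den = sum(w for w, _ in neighbors)
    # Offset the user's own mean by the neighbors' weighted deviations.
    return mean_rating(user) + num / den


prediction = predict("ann", "m4")
```

Here `cat` rates in the opposite direction to `ann` and is excluded, so the prediction for `m4` is driven by `dan` and `bob`, who both rated it at or above their own averages.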

[Screenshots: ML-home, ML-comedy, ML-clist, ML-rate, ML-search, ML-slist, ML-buddies]

Talk Roadmap
  Introduction
  Algorithms
  Influencing Users
  Current Research

Collaborative Filtering Algorithms
  Non-Personalized Summary Statistics
  K-Nearest Neighbor
    user-user
    item-item
  Dimensionality Reduction
  Content + Collaborative Filtering
  Graph Techniques
  Clustering
  Classifier Learning

Item-Item Collaborative Filtering

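Item Similarities
  [Figure: ratings matrix with users 1, 2, ..., u, ..., m-1, m as rows and items 1, 2, 3, ..., i, ..., j, ..., n-1, n as columns; the rows in which both item i and item j carry a rating R are the ones used for the similarity computation of s_{i,j}]

Item-Item Matrix Formulation
  [Figure: for user u and target item i, the similarities s_{i,1}, s_{i,3}, ..., s_{i,i-1}, ..., s_{i,m} to the items the user has rated are ranked (1st to 5th), and the 5 closest neighbors feed the prediction]
  Prediction
    weighted sum: raw scores for prediction generation
    regression-based: approximation based on linear regression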
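The item-item formulation above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the slides do not name the similarity measure, so adjusted cosine (each rating offset by that user's mean) is an assumption, as are the toy users, items, and ratings. Dropping non-positive similarities before the weighted sum is also a simplification made for this example.

```python
from math import sqrt

# Hypothetical toy data (user -> {item: rating on a 1-5 scale}).
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"a": 1, "b": 2, "c": 5},
    "u4": {"a": 5, "c": 1},
}


def user_mean(user):
    """Average of all ratings given by one user."""
    values = ratings[user].values()
    return sum(values) / len(values)


def item_similarity(i, j):
    """Adjusted cosine similarity s_{i,j} over users who rated both items."""
    pairs = [
        (r[i] - user_mean(u), r[j] - user_mean(u))
        for u, r in ratings.items()
        if i in r and j in r
    ]
    if not pairs:
        return 0.0
    num = sum(x * y for x, y in pairs)
    den = sqrt(sum(x * x for x, _ in pairs)) * sqrt(sum(y * y for _, y in pairs))
    return num / den if den else 0.0


def predict(user, target, k=5):
    """Weighted sum over the k items most similar to the target that the user rated."""
    rated = ratings[user]
    sims = [(item_similarity(target, j), j) for j in rated if j != target]
    # Assumption: keep only positively similar items as neighbors.
    neighbors = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    if not neighbors:
        return None
    num = sum(s * rated[j] for s, j in neighbors)
    den = sum(s for s, _ in neighbors)
    return num / den


prediction = predict("u4", "b")
```

In this toy data item `b` is rated like item `a` and unlike item `c`, so the prediction for `u4` on `b` follows that user's rating of `a`. Because the similarities depend only on item pairs, they can be precomputed offline, which is where the performance gain of the item-item approach comes from.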

Item-Item Discussion
  Good quality, in sparse situations
  Promising for incremental model building
    Small quality degradation
    Big performance gain

Collaborative Filtering Algorithms
  Non-Personalized Summary Statistics
  K-Nearest Neighbor
  Dimensionality Reduction
    Singular Value Decomposition
    Factor Analysis
  Content + Collaborative Filtering
  Graph Techniques
  Clustering
  Classifier Learning

Dimensionality Reduction
  Latent Semantic Indexing
    Used by the IR community
    Worked well with the vector space model
    Used Singular Value Decomposition (SVD)
  Main Idea
    Term-document matching in feature space
    Captures latent association
    Reduced space is less noisy

SVD: Mathematical Background
  $$R_{m \times n} = U_{m \times r} \, S_{r \times r} \, V^{T}_{r \times n}$$
  $$R_k = U_k \, S_k \, V_k^{T}, \qquad U_k: m \times k, \quad S_k: k \times k, \quad V_k^{T}: k \times n$$
  The reconstructed matrix $R_k = U_k S_k V_k^{T}$ is the closest rank-k matrix to the original matrix $R$.

SVD for Collaborative Filtering
  1. Low dimensional representation
     O(m+n) storage requirement
     [Figure: the m x n matrix represented by an m x k factor and a k x n factor]
  2. Direct Prediction

Singular Value Decomposition
  Reduce dimensionality of problem
    Results in small, fast model
    Richer Neighbor Network
  Incremental Update
    Folding in
    Model Update
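As a sketch of how the rank-k factors support both points above, the singular values can be split evenly between the two factors, which gives an m x k user matrix and a k x n item matrix. The symbols $P$, $Q$, and $\bar{r}_u$ are introduced here for illustration and do not come from the slides. Adding back a per-user mean $\bar{r}_u$ assumes the ratings were mean-centered before the decomposition, which is a common normalization rather than something the talk states.

```latex
\begin{aligned}
R_k &= U_k S_k V_k^{T} = P\,Q,
  \qquad P = U_k S_k^{1/2} \in \mathbb{R}^{m \times k},
  \quad Q = S_k^{1/2} V_k^{T} \in \mathbb{R}^{k \times n} \\
\text{storage} &= k\,(m + n) = O(m + n) \quad \text{for fixed } k \\
\hat{r}_{u,i} &= \bar{r}_u + \sum_{f=1}^{k} P_{u,f}\, Q_{f,i}
\end{aligned}
```

Each prediction is then a k-term dot product between one row of $P$ and one column of $Q$, so no neighborhood search is needed at request time.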

Collaborative Filtering Algorithms
  Non-Personalized Summary Statistics
  K-Nearest Neighbor
  Dimensionality Reduction
  Content + Collaborative Filtering
  Graph Techniques
    Horting: Navigate Similarity Graph
  Clustering
  Classifier Learning
    Rule-Induction Learning
    Bayesian Belief Networks

Talk Roadmap
  Introduction
  Algorithms
  Influencing Users
    Cosley et al., CHI 2003
  Current Research

Does Seeing Predictions Affect User Ratings?
  RERATE: Ask 212 users to rate 40 movies
    10 with no shown prediction
    30 with shown predictions (random order): 10 accurate, 10 up a star, 10 down a star
  Compare ratings to accurate predictions
    Prediction is user's original rating
  Hypothesis: users rate in the direction of the shown prediction

The Study

Seeing Matters
  [Chart: percentage of ratings (0% to 80%) falling Below, At, and Above the prediction, by whether the prediction was shown (Not shown, Shown)]

Accuracy Matters
  [Chart: percentage of ratings (0% to 80%) falling Below, At, and Above the prediction, by prediction manipulation (Down, Accurate, Up)]

Domino Effects?
  The power to manipulate?

Rated, Unrated, Doesn't Matter
  Recap of RERATE effects:
    Showing prediction changed 8% of ratings
    Altering shown prediction changed 12%
  Similar experiment, UNRATED movies
    137 experimental users, 1599 ratings
    Showing prediction changed 8% of ratings
    Altering shown prediction changed 14%

But Users Notice!
  Users are often insensitive
  UNRATED part 2: satisfaction survey
    Control group: only accurate predictions
    Experimental predictions accurate, useful?
    ML predictions overall accurate, useful?
  Manipulated preds less well liked
  Surprise: 24 bad = MovieLens worse!

Talk Roadmap
  Introduction
  Algorithms
  Influencing Users
  Current Research

Current Research Themes
  Beyond Accuracy: Metrics and Algorithms
  New User Orientation
  Influence and Shilling
  Eliciting Participation in On-Line Communities
  Reinventing Conversation
  Beyond Entertainment: Recommending Research Papers

Beyond Accuracy
  What affects user satisfaction?
    Novelty
    Confirmability
    Diversity
    Value
  How to measure?
  Custom tuned algorithms
  Herlocker et al., ACM TOIS 1/2004

New User Orientation
  How do we start new users?
  Interface (McNee et al., UM 2003)
    System-initiated
    User-initiated
    Mixed
  Content (Rashid et al., IUI 2002)
    Popular
    Entropy
    Mix
    Predict Seen

Influence and Shilling
  Influence (Rashid et al., SIAM Data Mining 2005)
    Who has it? How much?
    Measurements?
  Shilling (Lam and Riedl, WWW 2004)
    Attacks
    Defenses
    Algorithm Differences

Eliciting Participation: An Initial Study
  A study of participation in discussions with two factors controlled
    Similarity of tastes
    Awareness of own uniqueness
  Results
    Dissimilarity increased contribution
    Awareness of own uniqueness increased contribution
    Active discussants were not highly active raters
    Participants rated more than a control group

Motivational Follow-Up
  Class projects at CMU used an e-mail campaign to elicit ratings to discover:
    Making users aware of their uniqueness increased rating
    Giving users specific, achievable goals increased rating
  But
    Reminding users of their self-benefit or benefit to others actually decreased the number of ratings!

Other Projects
  Self-Maintaining Communities
    What happens when you let the masses maintain the database?
  Social Preference
    Using economic models to study user behavior

Reinventing Conversation
  Goal: More comprehensive engagement
  Full-factorial design around entities, people, comments/discussion entries

Recommending Research Papers Using Citation Webs
  For a full paper, we can recommend citations
    A paper rates the papers it cites
    Every paper has ratings in the system
  Other citation web mappings are possible, but many have problems

Results Thus Far
  Off-line studies showed promise
  On-line study showed different algorithms met different needs
  General positive attitude from users
    Typically one or two useful recommendations in a set of five
    That's enough to be useful
  Specific value for hybrid content/collab algorithms
  Current work: ACM Digital Library
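The mapping "a paper rates the papers it cites" can be sketched by treating each paper as a user whose citation list is its set of positive ratings. This is a minimal illustration under stated assumptions, not the algorithm evaluated in the studies above. The paper identifiers are hypothetical, Jaccard overlap between citation lists stands in for whatever neighbor weighting the real system uses, and candidates are scored by the summed weight of the neighbors that cite them.

```python
# Hypothetical citation web (paper -> set of papers it cites).
cites = {
    "p1": {"a", "b", "c"},
    "p2": {"a", "b", "d"},
    "p3": {"c", "e"},
    "draft": {"a", "b"},
}


def jaccard(x, y):
    """Overlap between two citation lists, from 0.0 (disjoint) to 1.0 (identical)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def recommend(paper, n=5):
    """Suggest citations for a paper from papers with similar citation lists."""
    own = cites[paper]
    scores = {}
    for other, refs in cites.items():
        if other == paper:
            continue
        weight = jaccard(own, refs)
        if weight == 0:
            continue
        # Candidate citations: cited by the neighbor, not yet cited by this paper.
        for ref in refs - own - {paper}:
            scores[ref] = scores.get(ref, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda r: (-scores[r], r))[:n]


suggestions = recommend("draft")
```

For the toy data, `draft` shares citations with `p1` and `p2` but not with `p3`, so the suggestions come only from the first two papers. Because every paper with a reference list already has "ratings", this mapping avoids the empty-profile problem that new users cause in a movie recommender.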

Directions
  Application-Focused Research
    Awareness service
    Paper and proposal-writing support
    Find people (reviewers/experts)
    Overview of a field

Conclusions
  From humble origins
  Substantial algorithmic research
  HCI and online community research
  Important applications
  Commercial deployment

Acknowledgements
  This work is being supported by grants from the National Science Foundation, and by grants from Net Perceptions, Inc.
  Many people have contributed ideas, time, and energy to this project.
