Non-negative Matrix Factorization for Multimodal Image Retrieval
|
|
- Ezra Greene
- 5 years ago
- Views:
Transcription
1 Non-negative Matrix Factorization for Multimodal Image Retrieval Fabio A. González PhD Bioingenium Research Group Computer Systems and Industrial Engineering Department Universidad Nacional de Colombia Knowledge Discovery and Webmining Lab CECS, The U of L Graduate Seminar, Spring 2010 CECS Department, The University of Louisville F. González (UNAL) NMF for MM IR CECS Grad Sem 1 / 49
2 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 2 / 49
3 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 3 / 49
4 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 4 / 49
5 Content-Based Image Retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 5 / 49
6 Query by Visual Example F. González (UNAL) NMF for MM IR CECS Grad Sem 6 / 49
7 Low-level vs. High-level F. González (UNAL) NMF for MM IR CECS Grad Sem 7 / 49
8 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 8 / 49
9 Semantic Annotation using ML Source: Nuno Vasconcelos, UCSD, F. González (UNAL) NMF for MM IR CECS Grad Sem 9 / 49
10 An Example (1) F. González (UNAL) NMF for MM IR CECS Grad Sem 10 / 49
11 An Example (2) 1 Recall vs Precision Graph - Semantic models Histogram Intersection MetaFeatures Identity Kernel Sobel Precision Recall F. González (UNAL) NMF for MM IR CECS Grad Sem 11 / 49
12 Disadvantages Requires a training set with expert annotations, so it is a costly process F. González (UNAL) NMF for MM IR CECS Grad Sem 12 / 49
13 Disadvantages Requires a training set with expert annotations, so it is a costly process Does not scale for large semantic vocabularies F. González (UNAL) NMF for MM IR CECS Grad Sem 12 / 49
14 Disadvantages Requires a training set with expert annotations, so it is a costly process Does not scale for large semantic vocabularies The mapping from visual features to annotations may lose the visual richness F. González (UNAL) NMF for MM IR CECS Grad Sem 12 / 49
15 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 13 / 49
16 Multimodality Text and images come naturally together in many documents Academic papers, books Newspapers, web pages Medical cases F. González (UNAL) NMF for MM IR CECS Grad Sem 14 / 49
17 Multimodal Retrieval Unstructured text associated to images may be used as semantic annotations Images and texts are complimentary information units Take advantage of interactions between both data modalities F. González (UNAL) NMF for MM IR CECS Grad Sem 15 / 49
18 Multimodal Retrieval Unstructured text associated to images may be used as semantic annotations Images and texts are complimentary information units Take advantage of interactions between both data modalities Problems: Text associated to images is not structured Unclear relationships between keywords and visual patterns Possible presence of noise in both data modalities F. González (UNAL) NMF for MM IR CECS Grad Sem 15 / 49
19 Multimodal Retrieval Unstructured text associated to images may be used as semantic annotations Images and texts are complimentary information units Take advantage of interactions between both data modalities Problems: Text associated to images is not structured Unclear relationships between keywords and visual patterns Possible presence of noise in both data modalities Retrieval scenarios: Cross-modal: find images based on a text query find text based on an image query (image annotation) Visual retrieval based on a visual query F. González (UNAL) NMF for MM IR CECS Grad Sem 15 / 49
20 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 16 / 49
21 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 17 / 49
22 The Netflix Competition Problem: prediction of user ratings for films (collaborative filtering) Y. Koren, R. Bell, and C. Volinsky, Matrix factorization techniques for recommender systems, Computer, vol. 42, no. 8, pp , August 2009 F. González (UNAL) NMF for MM IR CECS Grad Sem 18 / 49
23 The Netflix Competition Problem: prediction of user ratings for films (collaborative filtering) Prize: $1 Million to the first algorithm able to improve Netflix own algorithm results by at least 10% Y. Koren, R. Bell, and C. Volinsky, Matrix factorization techniques for recommender systems, Computer, vol. 42, no. 8, pp , August 2009 F. González (UNAL) NMF for MM IR CECS Grad Sem 18 / 49
24 The Netflix Competition Problem: prediction of user ratings for films (collaborative filtering) Prize: $1 Million to the first algorithm able to improve Netflix own algorithm results by at least 10% Won on Sept/21/2009 by BellKor s Pragmatic Chaos Y. Koren, R. Bell, and C. Volinsky, Matrix factorization techniques for recommender systems, Computer, vol. 42, no. 8, pp , August 2009 F. González (UNAL) NMF for MM IR CECS Grad Sem 18 / 49
25 The Netflix Competition Problem: prediction of user ratings for films (collaborative filtering) Prize: $1 Million to the first algorithm able to improve Netflix own algorithm results by at least 10% Won on Sept/21/2009 by BellKor s Pragmatic Chaos Their approach used Non-negative Matrix Factorization (NMF) to build a latent-factor representation of users and movies Y. Koren, R. Bell, and C. Volinsky, Matrix factorization techniques for recommender systems, Computer, vol. 42, no. 8, pp , August 2009 F. González (UNAL) NMF for MM IR CECS Grad Sem 18 / 49
26 The Latent-Factor Model movies factors movies r r 1m q q 1f p p 1m users users..... = r n1... r nm q n1... q nf p f 1... p fm factors R QP Q, P 0 n m {(i, j) r ij 0} 10 8 f 200 F. González (UNAL) NMF for MM IR CECS Grad Sem 19 / 49
27 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 20 / 49
28 Non-negative Matrix Factorization Problem: to find a factorization Optimization problem: X n m = W n r H r m min A,B X WH 2 s.t. W, H 0 is the Frobenius norm It is a non-convex optimization problem Solution alternatives: Gradient descendent methods Multiplicative updating rules F. González (UNAL) NMF for MM IR CECS Grad Sem 21 / 49
29 Multiplicative Rules Optimization problem: min A,B X WH 2 s.t. W, H 0 Incremental optimization: H aµ H aµ (W T X ) aµ (W T WH) aµ W ia W ia (XH T ) αµ (WHH T ) αµ F. González (UNAL) NMF for MM IR CECS Grad Sem 22 / 49
30 Divergence Optimization Optimization problem: min A,B D(X WH) = ij s.t. W, H 0 Multiplicative Rules: ( X ij log ) X ij (WH) ij X ij + (WH) ij i H aµ H W iax iµ /(WH) iµ aµ i W ia µ W ia W H aµx iµ /(WH) iµ ia µ H aµ F. González (UNAL) NMF for MM IR CECS Grad Sem 23 / 49
31 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 24 / 49
32 PCA and SVD Problem: X n m = W n r H r m Principal Component Analysis (PCA) X = UΣV W = UΣ 1 2, H = Σ 1 2 V PCA = SVD keeping the best Eigenvectors Columns of U are orthonormal There is not restriction on sign. D. D. Lee and H. S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature, vol. 401, no. 6755, pp , October 1999 F. González (UNAL) NMF for MM IR CECS Grad Sem 25 / 49
33 Latent Semantic Indexing (LSI) documents factors documents x x 1m w w 1r h h 1m terms terms..... = x n1... x nm w n1... w nr h r1... h rm X W H factors Documents are represented by the frequency of keywords (terms) Uses SVD to find the factorization Factors = semantic concepts Columns of W are orthonormal F. González (UNAL) NMF for MM IR CECS Grad Sem 26 / 49
34 NMF vs LSI W. Xu, X. Liu, and Y. Gong, Document clustering based on non-negative matrix factorization, in SIGIR 03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval. New York, NY, USA: ACM, 2003, pp F. González (UNAL) NMF for MM IR CECS Grad Sem 27 / 49
35 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 28 / 49
36 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 29 / 49
37 Semantic Space: Multimodal Latent Indexing (I) Objects are described by terms in a textual vocabulary and a visual vocabulary F. González (UNAL) NMF for MM IR CECS Grad Sem 30 / 49
38 Semantic Space: Multimodal Latent Indexing (I) Objects are described by terms in a textual vocabulary and a visual vocabulary Objects are mapped to a latent (semantic) space where objects are represented by a set of latent factors F. González (UNAL) NMF for MM IR CECS Grad Sem 30 / 49
39 Semantic Space: Multimodal Latent Indexing (I) Objects are described by terms in a textual vocabulary and a visual vocabulary Objects are mapped to a latent (semantic) space where objects are represented by a set of latent factors NMF is used to build the latent representation F. González (UNAL) NMF for MM IR CECS Grad Sem 30 / 49
40 Semantic Space: Multimodal Latent Indexing (I) Objects are described by terms in a textual vocabulary and a visual vocabulary Objects are mapped to a latent (semantic) space where objects are represented by a set of latent factors NMF is used to build the latent representation Three main tasks: Multimodal clustering Automatic image annotation Image retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 30 / 49
41 Semantic Space: Multimodal Latent Indexing (II) F. González (UNAL) NMF for MM IR CECS Grad Sem 31 / 49
42 Bag-of-Features Image Representation F. González (UNAL) NMF for MM IR CECS Grad Sem 32 / 49
43 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 33 / 49
44 Multimodal Clustering F. González (UNAL) NMF for MM IR CECS Grad Sem 34 / 49
45 Dual Multimodal Clustering F. González (UNAL) NMF for MM IR CECS Grad Sem 35 / 49
46 Does It Work? Clustering Corel data set: 1000 images, 25 categories Input matrix: 2500 vectors of dimension 1000 Clustering performance comparison: K-means and two NMF variants: K-means NMF (Frobenius norm) NMF (KL divergence) Accuracy *results are average of 10 runs F. González (UNAL) NMF for MM IR CECS Grad Sem 36 / 49
47 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 37 / 49
48 Image Annotation F. González (UNAL) NMF for MM IR CECS Grad Sem 38 / 49
49 Does it Work? plsa has shown good performance for image annotation NMF with KL divergence was shown to be equivalent to plsa W. Xu, X. Liu, and Y. Gong, Document clustering based on non-negative matrix factorization, in SIGIR 03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval. New York, NY, USA: ACM, 2003, pp F. González (UNAL) NMF for MM IR CECS Grad Sem 39 / 49
50 Outline 1 The Problem Content-based image retrieval Semantic image retrieval Multimodal image retrieval 2 Non-Negative Matrix Factorization The Netflix Prize Non-negative matrix factorization NMF vs SVD 3 NMF for Multimodal Retrieval Semantic space Multimodal clustering Image annotation Multimodal retrieval F. González (UNAL) NMF for MM IR CECS Grad Sem 40 / 49
51 Image Retrieval Scenario: visual retrieval based on a visual query F. González (UNAL) NMF for MM IR CECS Grad Sem 41 / 49
52 Image Retrieval Scenario: visual retrieval based on a visual query Images are represented in a latent space using NMF F. González (UNAL) NMF for MM IR CECS Grad Sem 41 / 49
53 Image Retrieval Scenario: visual retrieval based on a visual query Images are represented in a latent space using NMF Some images in the database may not have text content associated F. González (UNAL) NMF for MM IR CECS Grad Sem 41 / 49
54 Image Retrieval Scenario: visual retrieval based on a visual query Images are represented in a latent space using NMF Some images in the database may not have text content associated The image query is projected the latent space F. González (UNAL) NMF for MM IR CECS Grad Sem 41 / 49
55 Image Retrieval Scenario: visual retrieval based on a visual query Images are represented in a latent space using NMF Some images in the database may not have text content associated The image query is projected the latent space Image are retrieved according to their latent space similarity F. González (UNAL) NMF for MM IR CECS Grad Sem 41 / 49
56 Experimental Evaluation Corel Images A subset of 2,500 images extracted from the Corel Database. 25 categories with 100 images each. F. González (UNAL) NMF for MM IR CECS Grad Sem 42 / 49
57 Bag of Features Corel Images Image content representation using parts of images Blocks are extracted from each image and the SIFT descriptor is computed A dictionary of visual patterns is built using k-means A histogram counting the occurrence of each codeblock is constructed for each image F. González (UNAL) NMF for MM IR CECS Grad Sem 43 / 49
58 Results This approach outperforms direct image matching by almost 4X F. González (UNAL) NMF for MM IR CECS Grad Sem 44 / 49
59 Results Some specific performance measures Measure NMF Search P1 Direct Search P P P R R R MAP F. González (UNAL) NMF for MM IR CECS Grad Sem 45 / 49
60 Semantic Space Visualization (I) F. González (UNAL) NMF for MM IR CECS Grad Sem 46 / 49
61 Semantic Space Visualization (II) F. González (UNAL) NMF for MM IR CECS Grad Sem 47 / 49
62 Ongoing Work Bigger data sets: ImageClefMed, Corel5000, Flickr Different image low-level features Kernel NMF Incremental NMF F. González (UNAL) NMF for MM IR CECS Grad Sem 48 / 49
63 Thanks! F. González (UNAL) NMF for MM IR CECS Grad Sem 49 / 49
Non-negative Matrix Factorization for Multimodal Image Retrieval
Non-negative Matrix Factorization for Multimodal Image Retrieval Fabio A. González PhD Machine Learning 2015-II Universidad Nacional de Colombia F. González NMF for MM IR ML 2015-II 1 / 54 Outline 1 The
More informationMultimodal Information Spaces for Content-based Image Retrieval
Research Proposal Multimodal Information Spaces for Content-based Image Retrieval Abstract Currently, image retrieval by content is a research problem of great interest in academia and the industry, due
More informationNMF-based Multimodal Image Indexing for Querying by Visual Example
NMF-based Multimodal Image Indexing for Querying by Visual Example ABSTRACT Fabio A. González Universidad Nacional de Colombia fagonzalezo@unal.edu.co Olfa Nasraoui University of Louisville nasraoui@louisville.edu
More informationCollaborative Filtering for Netflix
Collaborative Filtering for Netflix Michael Percy Dec 10, 2009 Abstract The Netflix movie-recommendation problem was investigated and the incremental Singular Value Decomposition (SVD) algorithm was implemented
More informationAn Empirical Comparison of Collaborative Filtering Approaches on Netflix Data
An Empirical Comparison of Collaborative Filtering Approaches on Netflix Data Nicola Barbieri, Massimo Guarascio, Ettore Ritacco ICAR-CNR Via Pietro Bucci 41/c, Rende, Italy {barbieri,guarascio,ritacco}@icar.cnr.it
More informationA Family of Contextual Measures of Similarity between Distributions with Application to Image Retrieval
A Family of Contextual Measures of Similarity between Distributions with Application to Image Retrieval Florent Perronnin, Yan Liu and Jean-Michel Renders Xerox Research Centre Europe (XRCE) Textual and
More informationTagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation
TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid LEAR team, INRIA Rhône-Alpes, Grenoble, France
More informationRecommender Systems New Approaches with Netflix Dataset
Recommender Systems New Approaches with Netflix Dataset Robert Bell Yehuda Koren AT&T Labs ICDM 2007 Presented by Matt Rodriguez Outline Overview of Recommender System Approaches which are Content based
More informationA Weighted Non-negative Matrix Factorization for Local Representations Λ
A Weighted Non-negative Matrix Factorization for Local Representations Λ David Guillamet, Marco Bressan and Jordi Vitrià Centre de Visió per Computador, Dept. Informàtica Universitat Autònoma de Barcelona,
More informationMultimodal Medical Image Retrieval based on Latent Topic Modeling
Multimodal Medical Image Retrieval based on Latent Topic Modeling Mandikal Vikram 15it217.vikram@nitk.edu.in Suhas BS 15it110.suhas@nitk.edu.in Aditya Anantharaman 15it201.aditya.a@nitk.edu.in Sowmya Kamath
More informationCSC 411 Lecture 18: Matrix Factorizations
CSC 411 Lecture 18: Matrix Factorizations Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 18-Matrix Factorizations 1 / 27 Overview Recall PCA: project data
More informationCluster Analysis (b) Lijun Zhang
Cluster Analysis (b) Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Grid-Based and Density-Based Algorithms Graph-Based Algorithms Non-negative Matrix Factorization Cluster Validation Summary
More informationProgress Report: Collaborative Filtering Using Bregman Co-clustering
Progress Report: Collaborative Filtering Using Bregman Co-clustering Wei Tang, Srivatsan Ramanujam, and Andrew Dreher April 4, 2008 1 Introduction Analytics are becoming increasingly important for business
More informationImageCLEF 2011
SZTAKI @ ImageCLEF 2011 Bálint Daróczy joint work with András Benczúr, Róbert Pethes Data Mining and Web Search Group Computer and Automation Research Institute Hungarian Academy of Sciences Training/test
More informationInformation Retrieval: Retrieval Models
CS473: Web Information Retrieval & Management CS-473 Web Information Retrieval & Management Information Retrieval: Retrieval Models Luo Si Department of Computer Science Purdue University Retrieval Models
More informationBUAA AUDR at ImageCLEF 2012 Photo Annotation Task
BUAA AUDR at ImageCLEF 2012 Photo Annotation Task Lei Huang, Yang Liu State Key Laboratory of Software Development Enviroment, Beihang University, 100191 Beijing, China huanglei@nlsde.buaa.edu.cn liuyang@nlsde.buaa.edu.cn
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu /6/01 Jure Leskovec, Stanford C6: Mining Massive Datasets Training data 100 million ratings, 80,000 users, 17,770
More informationGeneral Instructions. Questions
CS246: Mining Massive Data Sets Winter 2018 Problem Set 2 Due 11:59pm February 8, 2018 Only one late period is allowed for this homework (11:59pm 2/13). General Instructions Submission instructions: These
More informationDimensionality Reduction using Relative Attributes
Dimensionality Reduction using Relative Attributes Mohammadreza Babaee 1, Stefanos Tsoukalas 1, Maryam Babaee Gerhard Rigoll 1, and Mihai Datcu 1 Institute for Human-Machine Communication, Technische Universität
More informationPattern Spotting in Historical Document Image
Pattern Spotting in historical document images Sovann EN, Caroline Petitjean, Stéphane Nicolas, Frédéric Jurie, Laurent Heutte LITIS, University of Rouen, France 1 Outline Introduction Commons Pipeline
More informationCSE 6242 / CX October 9, Dimension Reduction. Guest Lecturer: Jaegul Choo
CSE 6242 / CX 4242 October 9, 2014 Dimension Reduction Guest Lecturer: Jaegul Choo Volume Variety Big Data Era 2 Velocity Veracity 3 Big Data are High-Dimensional Examples of High-Dimensional Data Image
More informationLRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier
LRLW-LSI: An Improved Latent Semantic Indexing (LSI) Text Classifier Wang Ding, Songnian Yu, Shanqing Yu, Wei Wei, and Qianfeng Wang School of Computer Engineering and Science, Shanghai University, 200072
More informationContent-Based Medical Image Retrieval Using Low-Level Visual Features and Modality Identification
Content-Based Medical Image Retrieval Using Low-Level Visual Features and Modality Identification Juan C. Caicedo, Fabio A. Gonzalez and Eduardo Romero BioIngenium Research Group National University of
More informationCPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017
CPSC 340: Machine Learning and Data Mining Recommender Systems Fall 2017 Assignment 4: Admin Due tonight, 1 late day for Monday, 2 late days for Wednesday. Assignment 5: Posted, due Monday of last week
More informationCOMP 465: Data Mining Recommender Systems
//0 movies COMP 6: Data Mining Recommender Systems Slides Adapted From: www.mmds.org (Mining Massive Datasets) movies Compare predictions with known ratings (test set T)????? Test Data Set Root-mean-square
More informationLecture Video Indexing and Retrieval Using Topic Keywords
Lecture Video Indexing and Retrieval Using Topic Keywords B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa International Science Index, Computer and Information Engineering waset.org/publication/10007915
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu Training data 00 million ratings, 80,000 users, 7,770 movies 6 years of data: 000 00 Test data Last few ratings of
More informationFeature Selection for Image Retrieval and Object Recognition
Feature Selection for Image Retrieval and Object Recognition Nuno Vasconcelos et al. Statistical Visual Computing Lab ECE, UCSD Presented by Dashan Gao Scalable Discriminant Feature Selection for Image
More informationCSE 6242 A / CS 4803 DVA. Feb 12, Dimension Reduction. Guest Lecturer: Jaegul Choo
CSE 6242 A / CS 4803 DVA Feb 12, 2013 Dimension Reduction Guest Lecturer: Jaegul Choo CSE 6242 A / CS 4803 DVA Feb 12, 2013 Dimension Reduction Guest Lecturer: Jaegul Choo Data is Too Big To Do Something..
More informationData Mining Techniques
Data Mining Techniques CS 60 - Section - Fall 06 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Recommender Systems The Long Tail (from: https://www.wired.com/00/0/tail/)
More informationCSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)
CSE 158 Lecture 8 Web Mining and Recommender Systems Extensions of latent-factor models, (and more on the Netflix prize) Summary so far Recap 1. Measuring similarity between users/items for binary prediction
More informationTraining-Free, Generic Object Detection Using Locally Adaptive Regression Kernels
Training-Free, Generic Object Detection Using Locally Adaptive Regression Kernels IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIENCE, VOL.32, NO.9, SEPTEMBER 2010 Hae Jong Seo, Student Member,
More informationPerformance Comparison of Algorithms for Movie Rating Estimation
Performance Comparison of Algorithms for Movie Rating Estimation Alper Köse, Can Kanbak, Noyan Evirgen Research Laboratory of Electronics, Massachusetts Institute of Technology Department of Electrical
More informationCSE 494: Information Retrieval, Mining and Integration on the Internet
CSE 494: Information Retrieval, Mining and Integration on the Internet Midterm. 18 th Oct 2011 (Instructor: Subbarao Kambhampati) In-class Duration: Duration of the class 1hr 15min (75min) Total points:
More informationDocument Summarization using Semantic Feature based on Cloud
Advanced Science and echnology Letters, pp.51-55 http://dx.doi.org/10.14257/astl.2013 Document Summarization using Semantic Feature based on Cloud Yoo-Kang Ji 1, Yong-Il Kim 2, Sun Park 3 * 1 Dept. of
More informationMetric Learning for Large Scale Image Classification:
Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost Thomas Mensink 1,2 Jakob Verbeek 2 Florent Perronnin 1 Gabriela Csurka 1 1 TVPA - Xerox Research Centre
More informationMultimodal topic model for texts and images utilizing their embeddings
Multimodal topic model for texts and images utilizing their embeddings Nikolay Smelik, smelik@rain.ifmo.ru Andrey Filchenkov, afilchenkov@corp.ifmo.ru Computer Technologies Lab IDP-16. Barcelona, Spain,
More informationA Bayesian Approach to Hybrid Image Retrieval
A Bayesian Approach to Hybrid Image Retrieval Pradhee Tandon and C. V. Jawahar Center for Visual Information Technology International Institute of Information Technology Hyderabad - 500032, INDIA {pradhee@research.,jawahar@}iiit.ac.in
More informationUse of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University
Use of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University {tedhong, dtsamis}@stanford.edu Abstract This paper analyzes the performance of various KNNs techniques as applied to the
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationData Mining Techniques
Data Mining Techniques CS 6 - Section - Spring 7 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Project Project Deadlines Feb: Form teams of - people 7 Feb:
More informationSemi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction
Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction Pavel P. Kuksa, Rutgers University Yanjun Qi, Bing Bai, Ronan Collobert, NEC Labs Jason Weston, Google Research NY Vladimir
More informationEnhancing Clustering by Exploiting Complementary Data Modalities in the Medical Domain
Enhancing Clustering by Exploiting Complementary Data Modalities in the Medical Domain Samah Jamal Fodeh 1, Ali Haddad 2, Cynthia Brandt 3, Martin Schultz 4, Michael Krauthammer 5 (1,3) Yale University
More informationCSE 258 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)
CSE 258 Lecture 8 Web Mining and Recommender Systems Extensions of latent-factor models, (and more on the Netflix prize) Summary so far Recap 1. Measuring similarity between users/items for binary prediction
More informationDOCUMENT INDEXING USING INDEPENDENT TOPIC EXTRACTION. Yu-Hwan Kim and Byoung-Tak Zhang
DOCUMENT INDEXING USING INDEPENDENT TOPIC EXTRACTION Yu-Hwan Kim and Byoung-Tak Zhang School of Computer Science and Engineering Seoul National University Seoul 5-7, Korea yhkim,btzhang bi.snu.ac.kr ABSTRACT
More informationA Brief Review of Representation Learning in Recommender 赵鑫 RUC
A Brief Review of Representation Learning in Recommender Systems @ 赵鑫 RUC batmanfly@qq.com Representation learning Overview of recommender systems Tasks Rating prediction Item recommendation Basic models
More informationA ew Algorithm for Community Identification in Linked Data
A ew Algorithm for Community Identification in Linked Data Nacim Fateh Chikhi, Bernard Rothenburger, Nathalie Aussenac-Gilles Institut de Recherche en Informatique de Toulouse 118, route de Narbonne 31062
More informationSparse Estimation of Movie Preferences via Constrained Optimization
Sparse Estimation of Movie Preferences via Constrained Optimization Alexander Anemogiannis, Ajay Mandlekar, Matt Tsao December 17, 2016 Abstract We propose extensions to traditional low-rank matrix completion
More informationMinoru SASAKI and Kenji KITA. Department of Information Science & Intelligent Systems. Faculty of Engineering, Tokushima University
Information Retrieval System Using Concept Projection Based on PDDP algorithm Minoru SASAKI and Kenji KITA Department of Information Science & Intelligent Systems Faculty of Engineering, Tokushima University
More informationEntropy Optimized Feature-based Bag-of-Words Representation for Information Retrieval
Entropy Optimized Feature-based Bag-of-Words Representation for Information Retrieval Nikolaos Passalis and Anastasios Tefas 1 Abstract In this paper, we present a supervised dictionary learning method
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu //8 Jure Leskovec, Stanford CS6: Mining Massive Datasets Training data 00 million ratings, 80,000 users, 7,770 movies
More informationSupervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information
Supervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information Duc-Tien Dang-Nguyen, Giulia Boato, Alessandro Moschitti, Francesco G.B. De Natale Department of Information
More informationIPAL at CLEF 2008: Mixed-Modality based Image Search, Novelty based Re-ranking and Extended Matching
IPAL at CLEF 2008: Mixed-Modality based Image Search, Novelty based Re-ranking and Extended Matching Sheng Gao, Jean-Pierre Chevallet and Joo-Hwee Lim IPAL, Institute for Infocomm Research, A*Star, Singapore
More informationBy Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin
By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth Under The Guidance of Dr. Richard Maclin Outline Problem Statement Background Proposed Solution Experiments & Results Related Work Future
More informationWhat is this Song About?: Identification of Keywords in Bollywood Lyrics
What is this Song About?: Identification of Keywords in Bollywood Lyrics by Drushti Apoorva G, Kritik Mathur, Priyansh Agrawal, Radhika Mamidi in 19th International Conference on Computational Linguistics
More informationTag Recommendation for Photos
Tag Recommendation for Photos Gowtham Kumar Ramani, Rahul Batra, Tripti Assudani December 10, 2009 Abstract. We present a real-time recommendation system for photo annotation that can be used in Flickr.
More informationImproving Recognition through Object Sub-categorization
Improving Recognition through Object Sub-categorization Al Mansur and Yoshinori Kuno Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Saitama 338-8570,
More informationLeveraging flickr images for object detection
Leveraging flickr images for object detection Elisavet Chatzilari Spiros Nikolopoulos Yiannis Kompatsiaris Outline Introduction to object detection Our proposal Experiments Current research 2 Introduction
More informationCS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University
CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and
More informationon learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015
on learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015 Vector visual representation Fixed-size image representation High-dim (100 100,000) Generic, unsupervised: BoW,
More informationAnalysis and Latent Semantic Indexing
18 Principal Component Analysis and Latent Semantic Indexing Understand the basics of principal component analysis and latent semantic index- Lab Objective: ing. Principal Component Analysis Understanding
More informationMusic Recommendation with Implicit Feedback and Side Information
Music Recommendation with Implicit Feedback and Side Information Shengbo Guo Yahoo! Labs shengbo@yahoo-inc.com Behrouz Behmardi Criteo b.behmardi@criteo.com Gary Chen Vobile gary.chen@vobileinc.com Abstract
More informationarxiv: v1 [cs.lg] 3 Jan 2018
CLUSTERING OF DATA WITH MISSING ENTRIES Sunrita Poddar, Mathews Jacob Department of Electrical and Computer Engineering, University of Iowa, IA, USA arxiv:1801.01455v1 [cs.lg] 3 Jan 018 ABSTRACT The analysis
More informationCHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING
43 CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING 3.1 INTRODUCTION This chapter emphasizes the Information Retrieval based on Query Expansion (QE) and Latent Semantic
More informationRecommender Systems by means of Information Retrieval
Recommender Systems by means of Information Retrieval Alberto Costa LIX, École Polytechnique 91128 Palaiseau, France costa@lix.polytechnique.fr Fabio Roda LIX, École Polytechnique 91128 Palaiseau, France
More informationMULTIMEDIA ANALYTICS: SYNERGY BETWEEN HUMAN AND MACHINE BY VISUALIZATION
MULTIMEDIA ANALYTICS: SYNERGY BETWEEN HUMAN AND MACHINE BY VISUALIZATION Marcel Worring, Jan Zahálka, Stevan Rudinac Intelligent Systems Lab Amsterdam Amsterdam Data Science University of Amsterdam Amsterdam
More informationVector Space Models: Theory and Applications
Vector Space Models: Theory and Applications Alexander Panchenko Centre de traitement automatique du langage (CENTAL) Université catholique de Louvain FLTR 2620 Introduction au traitement automatique du
More informationPredicting Popular Xbox games based on Search Queries of Users
1 Predicting Popular Xbox games based on Search Queries of Users Chinmoy Mandayam and Saahil Shenoy I. INTRODUCTION This project is based on a completed Kaggle competition. Our goal is to predict which
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationLinking Entities in Chinese Queries to Knowledge Graph
Linking Entities in Chinese Queries to Knowledge Graph Jun Li 1, Jinxian Pan 2, Chen Ye 1, Yong Huang 1, Danlu Wen 1, and Zhichun Wang 1(B) 1 Beijing Normal University, Beijing, China zcwang@bnu.edu.cn
More informationIPL at ImageCLEF 2017 Concept Detection Task
IPL at ImageCLEF 2017 Concept Detection Task Leonidas Valavanis and Spyridon Stathopoulos Information Processing Laboratory, Department of Informatics, Athens University of Economics and Business, 76 Patission
More informationMatrix Co-factorization for Recommendation with Rich Side Information and Implicit Feedback
Matrix Co-factorization for Recommendation with Rich Side Information and Implicit Feedback ABSTRACT Yi Fang Department of Computer Science Purdue University West Lafayette, IN 47907, USA fangy@cs.purdue.edu
More informationPart 11: Collaborative Filtering. Francesco Ricci
Part : Collaborative Filtering Francesco Ricci Content An example of a Collaborative Filtering system: MovieLens The collaborative filtering method n Similarity of users n Methods for building the rating
More informationClassification Based on NMF
E-Mail Classification Based on NMF Andreas G. K. Janecek* Wilfried N. Gansterer Abstract The utilization of nonnegative matrix factorization (NMF) in the context of e-mail classification problems is investigated.
More informationFacial Expression Recognition Using Non-negative Matrix Factorization
Facial Expression Recognition Using Non-negative Matrix Factorization Symeon Nikitidis, Anastasios Tefas and Ioannis Pitas Artificial Intelligence & Information Analysis Lab Department of Informatics Aristotle,
More informationFinding Topic-centric Identified Experts based on Full Text Analysis
Finding Topic-centric Identified Experts based on Full Text Analysis Hanmin Jung, Mikyoung Lee, In-Su Kang, Seung-Woo Lee, Won-Kyung Sung Information Service Research Lab., KISTI, Korea jhm@kisti.re.kr
More informationText Analytics (Text Mining)
CSE 6242 / CX 4242 Text Analytics (Text Mining) Concepts and Algorithms Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko,
More informationEffective Latent Space Graph-based Re-ranking Model with Global Consistency
Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case
More informationThe Pre-Image Problem in Kernel Methods
The Pre-Image Problem in Kernel Methods James Kwok Ivor Tsang Department of Computer Science Hong Kong University of Science and Technology Hong Kong The Pre-Image Problem in Kernel Methods ICML-2003 1
More informationToward Retail Product Recognition on Grocery Shelves
Toward Retail Product Recognition on Grocery Shelves Gül Varol gul.varol@boun.edu.tr Boğaziçi University, İstanbul, Turkey İdea Teknoloji Çözümleri, İstanbul, Turkey Rıdvan S. Kuzu ridvan.salih@boun.edu.tr
More informationBINARY PRINCIPAL COMPONENT ANALYSIS IN THE NETFLIX COLLABORATIVE FILTERING TASK
BINARY PRINCIPAL COMPONENT ANALYSIS IN THE NETFLIX COLLABORATIVE FILTERING TASK László Kozma, Alexander Ilin, Tapani Raiko Helsinki University of Technology Adaptive Informatics Research Center P.O. Box
More informationBig-data Clustering: K-means vs K-indicators
Big-data Clustering: K-means vs K-indicators Yin Zhang Dept. of Computational & Applied Math. Rice University, Houston, Texas, U.S.A. Joint work with Feiyu Chen & Taiping Zhang (CQU), Liwei Xu (UESTC)
More informationLarge-Scale Face Manifold Learning
Large-Scale Face Manifold Learning Sanjiv Kumar Google Research New York, NY * Joint work with A. Talwalkar, H. Rowley and M. Mohri 1 Face Manifold Learning 50 x 50 pixel faces R 2500 50 x 50 pixel random
More informationAdditive Regression Applied to a Large-Scale Collaborative Filtering Problem
Additive Regression Applied to a Large-Scale Collaborative Filtering Problem Eibe Frank 1 and Mark Hall 2 1 Department of Computer Science, University of Waikato, Hamilton, New Zealand eibe@cs.waikato.ac.nz
More informationProbabilistic Graphical Models Part III: Example Applications
Probabilistic Graphical Models Part III: Example Applications Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2014 CS 551, Fall 2014 c 2014, Selim
More informationCPSC 340: Machine Learning and Data Mining. Multi-Dimensional Scaling Fall 2017
CPSC 340: Machine Learning and Data Mining Multi-Dimensional Scaling Fall 2017 Assignment 4: Admin 1 late day for tonight, 2 late days for Wednesday. Assignment 5: Due Monday of next week. Final: Details
More informationCPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016
CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:
More informationMetric Learning for Large-Scale Image Classification:
Metric Learning for Large-Scale Image Classification: Generalizing to New Classes at Near-Zero Cost Florent Perronnin 1 work published at ECCV 2012 with: Thomas Mensink 1,2 Jakob Verbeek 2 Gabriela Csurka
More informationProperty1 Property2. by Elvir Sabic. Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14
Property1 Property2 by Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14 Content-Based Introduction Pros and cons Introduction Concept 1/30 Property1 Property2 2/30 Based on item
More informationNotebook paper: TNO instance search submission 2012
Notebook paper: TNO instance search submission 2012 John Schavemaker, Corné Versloot, Joost de Wit, Wessel Kraaij TNO Technical Sciences Brassersplein 2, 2612 CT, Delft, The Netherlands E-mail of corresponding
More informationvector space retrieval many slides courtesy James Amherst
vector space retrieval many slides courtesy James Allan@umass Amherst 1 what is a retrieval model? Model is an idealization or abstraction of an actual process Mathematical models are used to study the
More informationCHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES
188 CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 6.1 INTRODUCTION Image representation schemes designed for image retrieval systems are categorized into two
More informationChapter 6: Information Retrieval and Web Search. An introduction
Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods
More informationContent-based Dimensionality Reduction for Recommender Systems
Content-based Dimensionality Reduction for Recommender Systems Panagiotis Symeonidis Aristotle University, Department of Informatics, Thessaloniki 54124, Greece symeon@csd.auth.gr Abstract. Recommender
More informationData Distortion for Privacy Protection in a Terrorist Analysis System
Data Distortion for Privacy Protection in a Terrorist Analysis System Shuting Xu, Jun Zhang, Dianwei Han, and Jie Wang Department of Computer Science, University of Kentucky, Lexington KY 40506-0046, USA
More informationOnline Learning for URL classification
Online Learning for URL classification Onkar Dalal 496 Lomita Mall, Palo Alto, CA 94306 USA Devdatta Gangal Yahoo!, 701 First Ave, Sunnyvale, CA 94089 USA onkar@stanford.edu devdatta@gmail.com Abstract
More informationClustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin
Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014
More informationRecommender Systems 6CCS3WSN-7CCSMWAL
Recommender Systems 6CCS3WSN-7CCSMWAL http://insidebigdata.com/wp-content/uploads/2014/06/humorrecommender.jpg Some basic methods of recommendation Recommend popular items Collaborative Filtering Item-to-Item:
More informationDATA-DRIVEN APPROACHES TO IMPROVING INFORMATION ACCESS
Festschrift for Richard M. Shiffrin DATA-DRIVEN APPROACHES TO IMPROVING INFORMATION ACCESS Susan Dumais, Microsoft Research Overview From IU to Industry (Bell Labs 1979, MSR 1997) Themes Practical focus
More informationVideo annotation based on adaptive annular spatial partition scheme
Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory
More information