Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data

1 Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data. Shih-Fu Chang, Department of Electrical Engineering and Department of Computer Science, Columbia University. Joint work with Jun Wang (IBM), Tony Jebara (Columbia U), Wei Liu, Junfeng He, and Yu-Gang Jiang (Fudan U).

2 Graph-based Semi-Supervised Learning. Given a small set of labeled data and a large number of unlabeled data in a high-dimensional feature space: build sparse graphs with local connectivity; propagate information over graphs of large data sets; hopefully robust to noise and scalable to gigantic sets. [Figure: input samples with sparse labels (positive, negative, unlabeled); label propagation on graph; label inference results (positive, negative).]

3 Intuition: capture local structures via a sparse graph. [Figure labels: linear classifier; nonlinear classifier; graph semi-supervised learning through sparse graph construction (e.g., kNN).]

4 Possible Applications: Propagating Labels in Interactive Search & Auto Re-ranking. [Diagram: image/video data; processing (denoising, cropping); feature extraction; compute similarity; graph construction; label propagation; applications (search, browsing). No predefined category. Automatic mode: existing ranking/filtering system, top-rank results, re-ranking over large set. Interactive mode: interactive browse/label, user interface.] S.-F. Chang, Columbia U.

5 Example: Web Search Reranking. Google search: "Statue of Liberty". [Pipeline: keyword search; web search; top images as +, bottom images as -; label diagnosis; diffusion.]

6 Application: Web Search Reranking. Rerank. [Pipeline: keyword search; web images; top images as +, bottom images as -; label diagnosis; diffusion.]

7 Application: Web Search Reranking. Google search: "Tiger". [Pipeline: keyword search; web images; top images as +, bottom images as -; label diagnosis; diffusion.]

8 Application: Web Search Reranking. Rerank. [Pipeline: keyword search; web images; top images as +, bottom images as -; label diagnosis; diffusion.] How to handle noisy labels before propagation? Scalability?
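The "top images as +, bottom images as -" step on the reranking slides can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the two-column label layout, and the cut-off sizes are all assumptions made for the example.

```python
import numpy as np

def initial_labels_from_ranking(n_images, n_top, n_bottom):
    """Build an n_images x 2 label matrix from a ranked list (row 0 = best rank).

    Column 0 marks pseudo-positive images and column 1 pseudo-negative ones.
    All-zero rows are unlabeled. The layout is a hypothetical convention.
    """
    if n_top + n_bottom > n_images:
        raise ValueError("top and bottom sets must not overlap")
    Y = np.zeros((n_images, 2), dtype=int)
    Y[:n_top, 0] = 1                      # top-ranked results treated as positive
    if n_bottom > 0:
        Y[n_images - n_bottom:, 1] = 1    # bottom-ranked results treated as negative
    return Y

# Ten search results: three pseudo-positives, two pseudo-negatives, five unlabeled.
Y0 = initial_labels_from_ranking(n_images=10, n_top=3, n_bottom=2)
```

These pseudo-labels are noisy by construction, which is why the slides ask how to handle noisy labels before propagation.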

9 Background Review. Given a dataset of labeled samples and unlabeled samples, build an undirected graph with the samples as vertices and edges weighted by sample similarity. Define the weight matrix W and the vertex degree (the sum of the weights of the edges incident to a vertex).

10 Example. [Figure: weight matrix W; node degree matrix D; label matrix Y (samples by classes); graph-based SSL produces the label prediction F.]
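The quantities on the background slides can be made concrete with a toy graph. The four-node weight matrix and the labels below are hypothetical values chosen only for illustration, and the row-per-sample, column-per-class layout of the label matrix is an assumption.

```python
import numpy as np

# Symmetric weight matrix of a 4-node undirected graph (hypothetical values).
W = np.array([
    [0.0, 1.0, 0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 2.0],
    [0.0, 0.0, 2.0, 0.0],
])

# Vertex degree = sum of the weights of the incident edges.
degrees = W.sum(axis=1)
D = np.diag(degrees)

# Label matrix: one row per sample, one column per class.
# Node 1 is labeled as class 0, node 3 as class 1, nodes 0 and 2 are unlabeled.
Y = np.array([
    [0, 0],
    [1, 0],
    [0, 0],
    [0, 1],
])
```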

11 Some Options for Constructing a Sparse Graph: distance threshold; k-nearest-neighbor graph; b-matched graph (Huang and Jebara, AISTATS 2007; Jebara, Wang, and Chang, ICML 2009). [The optimization formulas on this slide ("max ...") are not recoverable from the transcription.]

12 Several Ways of Constructing Sparse Graphs. [Figure panels: distance threshold; rank threshold (kNN); b-matching; shown for k, b = 4 and k, b = 6.]

13 Examples of Graph Construction. [Figure: kNN with k = 4; b-matching with b = 4.]

14 Graph Construction: Edge Weighting. Binary weighting; Gaussian kernel weighting; locally linear reconstruction weighting.
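As a rough illustration of the kNN option with binary or Gaussian kernel weighting, the sketch below builds a symmetrized kNN graph with NumPy. The function name `knn_graph`, the "keep an edge if either endpoint selects it" symmetrization, and the bandwidth `sigma` are assumptions for this example. B-matching and locally linear reconstruction weights are not implemented here.

```python
import numpy as np

def knn_graph(X, k, sigma=None):
    """Symmetrized kNN graph; binary weights if sigma is None, else Gaussian."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=2)          # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)          # a point is never its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nearest.ravel()] = True
    A |= A.T                              # keep an edge if either endpoint selects it
    W = np.zeros((n, n))
    W[A] = 1.0 if sigma is None else np.exp(-d2[A] / (2.0 * sigma ** 2))
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
W_binary = knn_graph(X, k=4)
W_gauss = knn_graph(X, k=4, sigma=1.0)
degrees = W_binary.sum(axis=1)
```

With this symmetrization every node has degree at least k but some nodes end up with more, whereas a b-matched graph gives every node exactly b neighbors.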

15 Measure Smoothness: Graph Laplacian. Graph Laplacian and normalized Laplacian; smoothness of a function f over the graph; multi-class extension.
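The formulas on this slide did not survive transcription. The standard definitions it most likely refers to are given below as a hedged reconstruction, assuming W is the symmetric weight matrix, D the diagonal degree matrix, f a real-valued function on the vertices, and F the multi-class prediction matrix with one row per sample.

```latex
% Graph Laplacian and normalized Laplacian
L = D - W, \qquad
\bar{L} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}

% Smoothness of a single function f over the graph
f^\top L f = \tfrac{1}{2} \sum_{i,j} w_{ij} \, (f_i - f_j)^2

% Multi-class case, with F_{i\cdot} the i-th row of F
\operatorname{tr}\!\left(F^\top L F\right)
  = \tfrac{1}{2} \sum_{i,j} w_{ij} \, \lVert F_{i\cdot} - F_{j\cdot} \rVert^2

% Same quantity with the normalized Laplacian
\operatorname{tr}\!\left(F^\top \bar{L} F\right)
  = \tfrac{1}{2} \sum_{i,j} w_{ij} \,
    \left\lVert \frac{F_{i\cdot}}{\sqrt{d_i}} - \frac{F_{j\cdot}}{\sqrt{d_j}} \right\rVert^2
```

A small value means that strongly connected vertices receive similar predictions.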

16 Classical Methods (Zhu et al., ICML03; Zhou et al., NIPS04; Joachims, ICML03). Predict a graph function (F) via cost optimization: the cost of the prediction function combines function smoothness and empirical loss. Local and Global Consistency, LGC (Zhou et al., NIPS04); Gaussian Random Fields, GRF (Zhu et al., ICML03).
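For reference, the two classical methods have well-known closed-form solutions, sketched below with NumPy. The function names, the default value of `alpha`, and the six-node path graph are assumptions for the example. The code inverts dense matrices and is meant only for small graphs.

```python
import numpy as np

def lgc(W, Y, alpha=0.9):
    """Local and Global Consistency: F = (I - alpha * S)^-1 Y, S = D^-1/2 W D^-1/2."""
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(d)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    return np.linalg.solve(np.eye(W.shape[0]) - alpha * S, Y.astype(float))

def grf(W, Y, labeled):
    """Gaussian random field / harmonic solution: clamp labeled rows and solve
    (D_uu - W_uu) F_u = W_ul Y_l for the unlabeled rows."""
    n = W.shape[0]
    labeled = np.asarray(labeled)
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    L = np.diag(W.sum(axis=1)) - W
    F = Y.astype(float)
    F[unlabeled] = np.linalg.solve(
        L[np.ix_(unlabeled, unlabeled)],
        W[np.ix_(unlabeled, labeled)] @ F[labeled],
    )
    return F

# Toy path graph 0-1-2-3-4-5 with one label at each end.
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

F_lgc = lgc(W, Y)
F_grf = grf(W, Y, labeled=[0, 5])
pred_lgc = F_lgc.argmax(axis=1)
pred_grf = F_grf.argmax(axis=1)
```

On this path the harmonic solution interpolates linearly between the two labeled ends, so both methods split the path in the middle.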

17 Empirical Observations (Jebara, Wang, and Chang, ICML 2009). Compare method-graph-weight combinations. B-matching tends to outperform kNN. B-matching is particularly good for GTAM + locally linear reconstruction (LLR) weights. [Figure: result plots with GTAM highlighted.]

18 Noisy Labels and Other Challenges: unbalanced labels; ill label locations; noisy data and labels. [Figure panels: LGC propagation; GRF propagation.]

19 Label Unbalance: A Quick Fix. Normalize labels within each class based on node degrees. [Example figure: label matrix (samples by classes); node degree matrix.]
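One plausible reading of this degree-based normalization is sketched below: each labeled node's entry is weighted by its degree and each class column is rescaled to sum to one, so a class with many labels does not dominate. The exact formula is not recoverable from the slide, so the form used here and the name `normalize_labels` are assumptions.

```python
import numpy as np

def normalize_labels(Y, degrees):
    """Weight each label by node degree and rescale every class column to sum to 1."""
    Y = Y.astype(float)
    class_mass = degrees @ Y                 # total degree of the labeled nodes per class
    Z = np.zeros_like(Y)
    used = class_mass > 0                    # skip classes that have no labels yet
    Z[:, used] = Y[:, used] * degrees[:, None] / class_mass[used]
    return Z

# Unbalanced toy example: three labels for class 0, one label for class 1.
degrees = np.array([2.0, 1.0, 3.0, 4.0, 2.0])
Y = np.array([
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 0],
])
Z = normalize_labels(Y, degrees)
```

After normalization both classes carry the same total label mass, regardless of how many labeled samples each one has.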

20 Dealing with Noisy Labels. Graph Transduction via Alternate Minimization (GTAM; Wang, Jebara, & Chang, ICML 2008) and LDST (Wang, Jiang, & Chang, CVPR 2009). Change the univariate optimization to a bivariate formulation: [formula not recoverable from the transcription].

21 Alternate Optimization. First, given Y, solve for the continuous-valued F*. Then search for the optimal integer Y given F* (gradient descent search).
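The two alternating steps can be sketched as below. This is a reading of the procedure under stated assumptions, not the authors' implementation: the cost is taken to be Q(F, Y) = 1/2 tr(F^T L F + mu (F - Z)^T (F - Z)), with L the normalized Laplacian and Z the degree-normalized labels; the continuous step uses the closed form F* = P Z with P = (L / mu + I)^-1; the discrete step adds the single label with the most negative gradient entry. The names `gtam_sketch` and `normalize_labels`, the value of `mu`, and the stopping rule (run until every node is labeled) are illustrative choices.

```python
import numpy as np

def normalize_labels(Y, degrees):
    """Degree-weighted labels; every nonempty class column sums to 1 (assumed form)."""
    mass = degrees @ Y
    Z = np.zeros_like(Y, dtype=float)
    used = mass > 0
    Z[:, used] = Y[:, used] * degrees[:, None] / mass[used]
    return Z

def gtam_sketch(W, Y0, mu=1.0):
    n = W.shape[0]
    I = np.eye(n)
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(d)
    L = I - W * inv_sqrt[:, None] * inv_sqrt[None, :]   # normalized Laplacian
    P = np.linalg.inv(L / mu + I)                       # continuous step: F* = P Z
    A = P.T @ L @ P + mu * (P - I).T @ (P - I)          # Q(Y) = 1/2 tr(Z^T A Z)
    Y = Y0.astype(float)                                # astype returns a copy
    while True:
        unlabeled = Y.sum(axis=1) == 0
        if not unlabeled.any():
            break
        grad = A @ normalize_labels(Y, d)               # gradient of Q, Z held fixed
        grad[~unlabeled] = np.inf                       # only unlabeled nodes may gain a label
        i, j = np.unravel_index(np.argmin(grad), grad.shape)
        Y[i, j] = 1.0                                   # discrete step: add one label
    F = P @ normalize_labels(Y, d)
    return Y, F

# Two triangles joined by one weak edge, with a single seed label in each triangle.
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1), (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w
Y0 = np.zeros((6, 2))
Y0[0, 0] = 1.0
Y0[5, 1] = 1.0
Y_final, F_final = gtam_sketch(W, Y0)
```

This sketch covers only the add-label direction. Removing suspicious labels, as in label tuning, would need a second search over the labeled entries.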

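22 Alternate Minimization for Label Tuning. [Example: a matrix computed from the cost Q, with entries (3,1) and (1,1) marked for the add-label and delete-label steps; the numeric values are not recoverable from the transcription.] Iteratively repeat the above procedure.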

23 Example: Toy Data. Consider adding labels only. Label propagation by GTAM. Convergence procedure (non-monotonic due to the discrete step size). [Legend: unlabeled, positive, negative.]

24 Label Diagnosis and Self Tuning (LDST; Wang, Jiang, & Chang, CVPR 2009). [Figure panels: initial labels; add label and delete label steps; iteration #2; iteration #6.] Decline of the cost function Q over iterations (with vs. without label tuning).

25 Application: Web Search Reranking. Google search: "Tiger". [Pipeline: keyword search; web images; top images as +, bottom images as -; label diagnosis; diffusion.]

26 Application: Web Search Reranking. Rerank. [Pipeline: keyword search; web images; top images as +, bottom images as -; label diagnosis; diffusion.]

27 Figure 4. Example images of text search results from flickr.com. A total of nine text queries are used: dog, tiger, panda, bird, flower, airplane, forbidden city, statue of liberty, golden bridge.

28 Effects of Graph-based Reranking. VisualRank: Jing & Baluja, 08.

29 Possible Applications: Propagating Labels in Interactive Search & Auto Re-ranking. [Diagram: image/video data; processing (denoising, cropping); feature extraction; compute similarity; graph construction; label propagation; applications (search, browsing). No predefined category. Interactive mode: interactive browse/label, user interface.] S.-F. Chang, Columbia U.

30 Application: Brain Machine Interface for Image Retrieval: denoise unreliable labels from brain signal decoding (joint work with Sajda et al., ACM MM 2009; J. of Neural Engineering, May 2011). Use EEG brain signals to detect targets of interest. Use the image graph to tune and propagate information.

31 The Paradigm. Database (any target that may interest users).

32 The Paradigm. [Diagram: database; neural (EEG) decoder; EEG scores.]

33 The Paradigm. [Diagram: neural (EEG) decoder produces exemplar labels (noisy); the database supplies image features; graph-based semi-supervised learning produces the prediction score.]

34 The Paradigm. [Figure: pre-triage vs. post-triage.]

35 The Paradigm. The human inspects only a small sample set via BCI; the machine filters out noise and retrieves targets from a very large database. General: no predefined target models, no keywords. High throughput: neuro vision as a bootstrap of fast computer vision. [Figure: pre-triage vs. post-triage.]

36 The Neural Signatures of Recognition: the Oddball Effect (D. Linden, Neuroscientist, 2005). [Figure: responses over time to standard, novel (P3a), and target (P3b) stimuli.]

37 Effect of Graph-based Reranking (BCI test). Top (noisy) results of brain EEG signal detection vs. top results after graph-based label denoising and propagation. P-R curve significantly improved.

38 More Example Results. [Figure, two examples: top 20 results of EEG detection vs. top 20 results of the hybrid system (BCI VPM).]

39 Graphs over a Million Points and More. kNN graph construction plus label prediction is infeasible for large-scale tasks. Idea: AnchorGraph Regularization; complexity: [expression lost in transcription], with number of anchors m << n (W. Liu, J. He, S.-F. Chang, ICML 2010). [Plot: time complexity vs. data size n; axis values not recoverable.]

40 Active Topic in Research. Large-scale spectral analysis (Fergus et al., 09): approximate solutions as linear combinations of a small number of eigenfunctions of the graph Laplacian; elegant solutions with linear complexity, but only applicable to ideal data distributions (separable uniform or Gaussian). Matrix approximation via Nyström (Zhang et al., 09): W is replaced by a low-rank approximation [formula and complexity expression lost in transcription], but the approximation may not be positive semidefinite, which makes the problem non-convex.

41 Idea: Build a Low-rank Graph via Anchors (Liu, He, Chang, ICML10). Use anchor points to abstract the graph structure. Compute data-to-anchor similarity: sparse local embedding. Data-to-data similarity W = inner product in the embedded space. [Figure: data points x1, x4, x8; anchor points u1 to u6; data-to-anchor weights Z11, Z12, Z16; W18 > 0, W14 = 0.]

42 Probabilistic Intuition. The affinity between samples i and j, Wij, is the probability of a two-step Markov random walk from i to j through the anchors: W = Z Λ^-1 Z^T, where Λ = diag(1^T Z) and m << n. AnchorGraph: sparse, positive semi-definite.
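A small sketch of the construction, assuming the low-rank form W = Z Λ^-1 Z^T with Λ = diag(1^T Z) from the AnchorGraph paper. Here Z is a kernel-defined data-to-anchor matrix (each point keeps its s nearest anchors and the weights are normalized per row), which corresponds to the simpler "naive Z" variant rather than the optimized one. Random data points stand in for K-means anchors, and the names, `s`, and `sigma` are illustrative assumptions.

```python
import numpy as np

def anchor_embedding(X, anchors, s=2, sigma=1.0):
    """Data-to-anchor matrix Z: s nonzeros per row (nearest anchors), rows sum to 1."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(n)[:, None]
    d2_near = d2[rows, nearest]
    # Subtracting the row minimum leaves the normalized weights unchanged
    # and avoids underflow for points that are far from every anchor.
    k = np.exp(-(d2_near - d2_near.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    Z = np.zeros((n, anchors.shape[0]))
    Z[rows, nearest] = k / k.sum(axis=1, keepdims=True)
    return Z

def anchor_graph(Z):
    """Dense W = Z diag(1^T Z)^-1 Z^T; only sensible for small n."""
    lam = Z.sum(axis=0)          # assumes every anchor receives some weight
    return (Z / lam) @ Z.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
anchors = X[rng.choice(200, size=10, replace=False)]   # stand-in for K-means centers
Z = anchor_embedding(X, anchors)
W = anchor_graph(Z)
```

Each row of W sums to one, which matches the random-walk reading on the slide, and W is a Gram matrix of the rows of Z Λ^-1/2, hence positive semi-definite. For large n the dense W would never be formed; only Z and m-by-m matrices are needed.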

43 AnchorGraph Regression. Apply the same sparse embedding principle to labels. The whole graph regularization process becomes low-rank (small matrix inversion). Predicted function over graph = embedding matrix × inferred labels on anchors.
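The "small matrix inversion" can be illustrated as follows, assuming the regularized least-squares form A* = (Z_l^T Z_l + gamma Z^T L Z)^-1 Z_l^T Y_l with L = I - Z Λ^-1 Z^T, and prediction F = Z A*. The function name, the value of `gamma`, and the hand-made Z with six points and three anchors are hypothetical.

```python
import numpy as np

def anchor_graph_regularization(Z, labeled_idx, Y_l, gamma=0.01):
    """Solve an m x m system for the anchor labels A, then predict F = Z A."""
    lam = Z.sum(axis=0)
    ZtZ = Z.T @ Z
    # Reduced Laplacian Z^T (I - Z diag(lam)^-1 Z^T) Z, built from m x m pieces only.
    L_reduced = ZtZ - ZtZ @ np.diag(1.0 / lam) @ ZtZ
    Z_l = Z[labeled_idx]
    A = np.linalg.solve(Z_l.T @ Z_l + gamma * L_reduced, Z_l.T @ Y_l)
    return A, Z @ A

# Six points embedded over three anchors; every row sums to 1 (hypothetical values).
Z = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.2, 0.8],
    [0.0, 0.0, 1.0],
])
labeled_idx = np.array([0, 5])
Y_l = np.array([[1.0, 0.0],
                [0.0, 1.0]])
A, F = anchor_graph_regularization(Z, labeled_idx, Y_l)
pred = F.argmax(axis=1)
```

The linear system has size m by m (here 3 by 3) no matter how many samples n there are, which is the point of the low-rank formulation.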

44 Intuition: Anchor Graph SSL. Use low-rank AGR to infer optimal labels on anchors and samples. Predict optimal labels in the anchor space (~100 labels), then propagate to the original sample space (~million labels). [Diagram: initial labels in; label mapping; labels out.]

45 Performance: Small Data Set. USPS-Train: 7,291 images of digits, 10 classes, 10 samples per class. AGR^0: K-means anchors and naïve Z. AGR: K-means anchors and optimized Z. [Table: error rate (%) and time (seconds) for 1NN, LGC with NN graph, GFHF with NN graph, AGR^0, and AGR; numeric values lost in transcription.] Speedup [factor lost in transcription]; AGR accuracy comparable to the analytical optimum.

46 Large Data Set Evaluation. 630,000 MNIST images over 10 classes, only 100 labeled images. Conventional analytical solutions are infeasible. Among scalable solutions, AGR reduces error rates by 30%-50%. [Table: error rate (%) and training time (seconds) for 1NN, Eigenfunction (09), PVM (09), AGR^0, and AGR; numeric values lost in transcription.]

47 Extension to Web Scale. The techniques described above are not scalable to web-scale or dynamic data sets: they cannot handle cases where n is in the billions, and for dynamic data, updating the graph is expensive. Preferred: learn inductive models to handle novel dynamic data.

48 Data Subsampling & Learning an Inductive Model. [Diagram: web-scale database; subsampling; one million data points; anchor graph construction (anchor points); anchor graph regularization (seed labels); anchors' labels a. For a novel data point x: compute the data-to-anchor map z(x) and predict x's label as f(x) = z(x)^T a.]
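The inductive rule f(x) = z(x)^T a needs only the anchors and their learned labels at test time. A minimal sketch, assuming a kernel-defined data-to-anchor map over the s nearest anchors; the anchor positions, the anchor label matrix A, and the function names are hypothetical.

```python
import numpy as np

def data_to_anchor(x, anchors, s=2, sigma=1.0):
    """z(x): normalized kernel weights over the s nearest anchors, zero elsewhere."""
    d2 = ((anchors - x) ** 2).sum(axis=1)
    nearest = np.argsort(d2)[:s]
    k = np.exp(-(d2[nearest] - d2[nearest].min()) / (2.0 * sigma ** 2))
    z = np.zeros(anchors.shape[0])
    z[nearest] = k / k.sum()
    return z

def predict(x, anchors, A, s=2, sigma=1.0):
    """Inductive prediction f(x) = z(x)^T A; returns the class index and the scores."""
    scores = data_to_anchor(x, anchors, s=s, sigma=sigma) @ A
    return int(np.argmax(scores)), scores

# Four anchors with already-learned soft labels (hypothetical values).
anchors = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

label_a, scores_a = predict(np.array([0.2, 0.1]), anchors, A)
label_b, scores_b = predict(np.array([5.5, 5.2]), anchors, A)
```

The cost of labeling a new point depends on the number of anchors m, not on the size of the database, so no graph update is needed when new data arrives.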

49 AGR over 80M Tiny Images + CIFAR-10. [Figure: training images and test images for the classes airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck, plus background.]

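50 80 Million Tiny Images. 1 million samples (1% labels from CIFAR-10); learn AGR and use it as an inductive model for novel test samples. [Table: accuracy (%), training time (s), and test time (s) for 1NN, linear SVM, Eigenfunction, PVM with 1K and 2K anchors, and AGR with 1K and 2K anchors. Most numeric values were lost in transcription; the surviving fragments are a first accuracy entry of 51.66, a last accuracy deviation of ± 0.28, and a first test-time mantissa of 6.29.]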

51 Additional Issues. Multi-edge graph: multiple relation edges between nodes. Multi-feature graph: build graphs in multiple feature spaces; joint optimization. Label tuning vs. active learning.

52 Image-Based Multi-Edge Graph (Liu et al., ACM Multimedia 2010). Two images with the same tag (example tag sets: "dog, flower" and "dog, bird"): one edge connects the two regions sharing the tag, but not all regions. How to propagate labels over multiple edges?

53 Extension to Multi-Feature Graphs. [Diagram: feature 1 through feature K; graph 1 through graph K; user input; label propagation; ranking list.] How to handle noisy labels in multiple graphs?

54 Multi-graph SSL vs. Single-graph. Improves performance by 20%-80% on the Caltech 101 data set.

55 References and Tools
1. X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. ICML, 2003.
2. D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf. Learning with local and global consistency. NIPS, 2004.
3. W. Liu, J. He, and S.-F. Chang. Large graph construction for scalable semi-supervised learning. ICML, 2010. Software: wliu/anchor Graph.zip.
4. W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. ICML.
5. J. Wang, T. Jebara, and S.-F. Chang. Graph transduction via alternating minimization. ICML, 2008.
6. J. Wang, Y.-G. Jiang, and S.-F. Chang. Label diagnosis through self tuning for web image search. CVPR, 2009.
7. W. Liu, J. Wang, and S.-F. Chang. Robust and scalable graph-based semi-supervised learning. In review, IEEE Proceedings.
8. J. Wang, E. Pohlmeyer, B. Hanna, Y.-G. Jiang, P. Sajda, and S.-F. Chang. Brain state decoding for rapid image retrieval. ACM Multimedia Conference, 2009.
9. J. Wang, S. Kumar, and S.-F. Chang. Semi-supervised hashing for scalable image retrieval. CVPR.


More information

LINK GRAPH ANALYSIS FOR ADULT IMAGES CLASSIFICATION

LINK GRAPH ANALYSIS FOR ADULT IMAGES CLASSIFICATION LINK GRAPH ANALYSIS FOR ADULT IMAGES CLASSIFICATION Evgeny Kharitonov *, ***, Anton Slesarev *, ***, Ilya Muchnik **, ***, Fedor Romanenko ***, Dmitry Belyaev ***, Dmitry Kotlyarov *** * Moscow Institute

More information

Bipartite Edge Prediction via Transductive Learning over Product Graphs

Bipartite Edge Prediction via Transductive Learning over Product Graphs Bipartite Edge Prediction via Transductive Learning over Product Graphs Hanxiao Liu, Yiming Yang School of Computer Science, Carnegie Mellon University July 8, 2015 ICML 2015 Bipartite Edge Prediction

More information

Semi-Supervised Learning with Trees

Semi-Supervised Learning with Trees Semi-Supervised Learning with Trees Charles Kemp, Thomas L. Griffiths, Sean Stromsten & Joshua B. Tenenbaum Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 0139 {ckemp,gruffydd,sean s,jbt}@mit.edu

More information

Efficient Algorithms may not be those we think

Efficient Algorithms may not be those we think Efficient Algorithms may not be those we think Yann LeCun, Computational and Biological Learning Lab The Courant Institute of Mathematical Sciences New York University http://yann.lecun.com http://www.cs.nyu.edu/~yann

More information

Image Annotation by k NN-Sparse Graph-based Label Propagation over Noisily-Tagged Web Images

Image Annotation by k NN-Sparse Graph-based Label Propagation over Noisily-Tagged Web Images Image Annotation by k NN-Sparse Graph-based Label Propagation over Noisily-Tagged Web Images JINHUI TANG, RICHANG HONG, SHUICHENG YAN, TAT-SENG CHUA National University of Singapore GUO-JUN QI University

More information

Graph based machine learning with applications to media analytics

Graph based machine learning with applications to media analytics Graph based machine learning with applications to media analytics Lei Ding, PhD 9-1-2011 with collaborators at Outline Graph based machine learning Basic structures Algorithms Examples Applications in

More information

INTRO TO SEMI-SUPERVISED LEARNING (SSL)

INTRO TO SEMI-SUPERVISED LEARNING (SSL) SSL (on graphs) 1 INTRO TO SEMI-SUPERVISED LEARNING (SSL) Semi-supervised learning Given: A pool of labeled examples L A (usually larger) pool of unlabeled examples U Option 1 for using L and U : Ignore

More information

Large Scale Manifold Transduction

Large Scale Manifold Transduction Large Scale Manifold Transduction Michael Karlen, Jason Weston, Ayse Erkan & Ronan Collobert NEC Labs America, Princeton, USA Ećole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland New York University,

More information

Day 3 Lecture 1. Unsupervised Learning

Day 3 Lecture 1. Unsupervised Learning Day 3 Lecture 1 Unsupervised Learning Semi-supervised and transfer learning Myth: you can t do deep learning unless you have a million labelled examples for your problem. Reality You can learn useful representations

More information

An efficient face recognition algorithm based on multi-kernel regularization learning

An efficient face recognition algorithm based on multi-kernel regularization learning Acta Technica 61, No. 4A/2016, 75 84 c 2017 Institute of Thermomechanics CAS, v.v.i. An efficient face recognition algorithm based on multi-kernel regularization learning Bi Rongrong 1 Abstract. A novel

More information

Semi-supervised Learning by Sparse Representation

Semi-supervised Learning by Sparse Representation Semi-supervised Learning by Sparse Representation Shuicheng Yan Huan Wang Abstract In this paper, we present a novel semi-supervised learning framework based on l 1 graph. The l 1 graph is motivated by

More information

Transductive Classification Methods for Mixed Graphs

Transductive Classification Methods for Mixed Graphs Transductive Classification Methods for Mixed Graphs S Sundararajan Yahoo! Labs Bangalore, India ssrajan@yahoo-inc.com S Sathiya Keerthi Yahoo! Labs Santa Clara, CA selvarak@yahoo-inc.com ABSTRACT In this

More information

Semi-supervised learning and active learning

Semi-supervised learning and active learning Semi-supervised learning and active learning Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Combining classifiers Ensemble learning: a machine learning paradigm where multiple learners

More information

Learning Better Data Representation using Inference-Driven Metric Learning

Learning Better Data Representation using Inference-Driven Metric Learning Learning Better Data Representation using Inference-Driven Metric Learning Paramveer S. Dhillon CIS Deptt., Univ. of Penn. Philadelphia, PA, U.S.A dhillon@cis.upenn.edu Partha Pratim Talukdar Search Labs,

More information

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH Sandhya V. Kawale Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features

Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features 1 Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features Liansheng Zhuang, Shenghua Gao, Jinhui Tang, Jingjing Wang, Zhouchen Lin, Senior Member, and Yi Ma, IEEE Fellow, arxiv:1409.0964v1

More information

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Lu Jiang 1, Deyu Meng 2, Teruko Mitamura 1, Alexander G. Hauptmann 1 1 School of Computer Science, Carnegie Mellon University

More information

Package SSL. May 14, 2016

Package SSL. May 14, 2016 Type Package Title Semi-Supervised Learning Version 0.1 Date 2016-05-01 Author Package SSL May 14, 2016 Maintainer Semi-supervised learning has attracted the attention of machine

More information

The K-modes and Laplacian K-modes algorithms for clustering

The K-modes and Laplacian K-modes algorithms for clustering The K-modes and Laplacian K-modes algorithms for clustering Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced http://faculty.ucmerced.edu/mcarreira-perpinan

More information

Search Engines. Information Retrieval in Practice

Search Engines. Information Retrieval in Practice Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Classification and Clustering Classification and clustering are classical pattern recognition / machine learning problems

More information

Using PageRank in Feature Selection

Using PageRank in Feature Selection Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy fienco,meo,bottag@di.unito.it Abstract. Feature selection is an important

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14

More information

Topics in Graph Construction for Semi-Supervised Learning

Topics in Graph Construction for Semi-Supervised Learning University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science 8-27-2009 Topics in Graph Construction for Semi-Supervised Learning Partha Pratim Talukdar

More information

Columbia University High-Level Feature Detection: Parts-based Concept Detectors

Columbia University High-Level Feature Detection: Parts-based Concept Detectors TRECVID 2005 Workshop Columbia University High-Level Feature Detection: Parts-based Concept Detectors Dong-Qing Zhang, Shih-Fu Chang, Winston Hsu, Lexin Xie, Eric Zavesky Digital Video and Multimedia Lab

More information

CS570: Introduction to Data Mining

CS570: Introduction to Data Mining CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.

More information

Face Recognition A Deep Learning Approach

Face Recognition A Deep Learning Approach Face Recognition A Deep Learning Approach Lihi Shiloh Tal Perl Deep Learning Seminar 2 Outline What about Cat recognition? Classical face recognition Modern face recognition DeepFace FaceNet Comparison

More information

Graph-based Semi- Supervised Learning as Optimization

Graph-based Semi- Supervised Learning as Optimization Graph-based Semi- Supervised Learning as Optimization Partha Pratim Talukdar CMU Machine Learning with Large Datasets (10-605) April 3, 2012 Graph-based Semi-Supervised Learning 0.2 0.1 0.2 0.3 0.3 0.2

More information

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH Sai Tejaswi Dasari #1 and G K Kishore Babu *2 # Student,Cse, CIET, Lam,Guntur, India * Assistant Professort,Cse, CIET, Lam,Guntur, India Abstract-

More information

A REVIEW ON SEARCH BASED FACE ANNOTATION USING WEAKLY LABELED FACIAL IMAGES

A REVIEW ON SEARCH BASED FACE ANNOTATION USING WEAKLY LABELED FACIAL IMAGES A REVIEW ON SEARCH BASED FACE ANNOTATION USING WEAKLY LABELED FACIAL IMAGES Dhokchaule Sagar Patare Swati Makahre Priyanka Prof.Borude Krishna ABSTRACT This paper investigates framework of face annotation

More information

Geodesic Flow Kernel for Unsupervised Domain Adaptation

Geodesic Flow Kernel for Unsupervised Domain Adaptation Geodesic Flow Kernel for Unsupervised Domain Adaptation Boqing Gong University of Southern California Joint work with Yuan Shi, Fei Sha, and Kristen Grauman 1 Motivation TRAIN TEST Mismatch between different

More information

Clustering will not be satisfactory if:

Clustering will not be satisfactory if: Clustering will not be satisfactory if: -- in the input space the clusters are not linearly separable; -- the distance measure is not adequate; -- the assumptions limit the shape or the number of the clusters.

More information

Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling

Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling Moritz Baecher May 15, 29 1 Introduction Edge-preserving smoothing and super-resolution are classic and important

More information

Ranking on Data Manifolds

Ranking on Data Manifolds Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname

More information

Semi supervised clustering for Text Clustering

Semi supervised clustering for Text Clustering Semi supervised clustering for Text Clustering N.Saranya 1 Assistant Professor, Department of Computer Science and Engineering, Sri Eshwar College of Engineering, Coimbatore 1 ABSTRACT: Based on clustering

More information

On Information-Maximization Clustering: Tuning Parameter Selection and Analytic Solution

On Information-Maximization Clustering: Tuning Parameter Selection and Analytic Solution ICML2011 Jun. 28-Jul. 2, 2011 On Information-Maximization Clustering: Tuning Parameter Selection and Analytic Solution Masashi Sugiyama, Makoto Yamada, Manabu Kimura, and Hirotaka Hachiya Department of

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information