Hashing with Graphs. Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011
|
|
- Erick Robinson
- 6 years ago
- Views:
Transcription
1 Hashing with Graphs Wei Liu (Columbia Columbia), Jun Wang (IBM IBM), Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011
2 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions 2
3 Fast NN Search In many ML/CV/IR problems, one needsa reliable content based similarity search engine. Thesimplest method is nearest neighbor (NN) search using some similarity measure such as L p, cosine, kernel. Theexhaustivelinear exhaustive search into n data items takes time. KD tree is faster but seriously cursed by dimensionality (c<=d d, d<20 works). 3
4 Hashing Low memoryusage and high search efficiency. Locality Sensitive Hashing (LSH) Indyk et al. (since 1998) Semantic Hashing (RBMs) Salakhutdinov & Hinton (2006,2007) 2007) Spectral Hashing (SH) Weiss et al. (2009) Binary Reconstruction Embedding (BRE) Kulis & Darrell ll(2010) Semi Supervised Hashing (SSH) Wang et al. (2010) randomized hash function long bits/multi tables learned hash function short bits/single table 4
5 Hashing Using Compact Codes Learning based methods try to learn compact binary codes whose Hamming distance approaches or preserves the given similarity, e.g., L 2, cosine, kernel (local measure). Achieve constant time search (O(1)) by flipping several bits and looking up the hash table. learning hash function buckets NN list 5
6 Our Mission Can load millions of data, eg e.g., images, into memory. Very compact codes! Beatthe the existing unsupervised hashing methods (LSH, PCAH, USPLH, SH, KLSH, SIKH). Exploit global measure on manifolds to learn compact codes. Use graphs! Graphs tend to capture the semantic neighborhoods. Beat L 2 linear scan!
7 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions 7
8 Graph Hashing (GH) A standard graph based hashing framework [Weiss et al. 09] drop it by spectral relaxation x2 balanced bits uncorrelated bits x1 A12 A: the affinity matrix of the data graph. L: the graph Laplaciandiag(A1) A. A x 1 x 2 x x n data points 8
9 Steps Training: 1) Building a sparse matrix A of an exact neighborhood graph (e.g., k NN graph) onn n d dim data points needs. 2) Computing & binarizing r eigenvectors of the sparse graph Laplacian L needs. Test (hashing): 1) Generalizing r eigenvectors to any unseen point and binarizing needs (interpolations over n entries). Training 1) is intractable for offline training; Test 1) is infeasible for online hashing! 9
10 Spectral Hashing [Weiss et al. 09] SHdirectlyobtainsthe the analyticaleigenfunctions cos(akx+b) of 1D graph Laplacians based on the strong assumptions: i) data dimensions are independent to each other; ii) at each dimension data is uniformly distributed. The 1Dgraph Laplacian is derived from a complete graph. Manifold structure of raw data is almost ignored. Pseudo eigenfunctions & pseudo linear hash functions. 10
11 Toy Data 12k 2D points: blue color denotes the positive eigenvector entries and red color denotes the negative entries. GH smooth yet balanced our approach SH st eigenvector nd eigenvector
12 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions 12
13 Idea In Training 1), we utilize our proposed large graph Anchor Graph whose space/time complexities are both linear. W. Liu, J. He, and S. F. Chang Large Graph Construction for Scalable Semi Supervised Learning ICML 2010 (code released) 13
14 Anchor Graph [Liu et al. 10] Nonnegative, sparse, and low rank affinity matrices. Approximate the exact neighborhood graph when the number of anchors is sufficiently large. 100 anchor points 10 NN graph Anchor Graph 14
15 Anchors Introduce a few anchor points (K means clustering centers) and do the data to anchor similarity computation (nxm). The data to anchor similarity matrix Obtain the data to data similarity matrix  using Z. Â18>0 x8 x1 Z16 anchor points Z12 Z11 u1 u6 Inner product? u2 u3 Â14=0 u5 u4 data points x4 15
16 Low Rank Affinity Matrix Kernel based truncated Z design ( ): Note Z1=1 such that Z acts as a mapping anchors data The Anchor Graph affinity matrix Rank<= m, only computing and saving Z in memory; a doubly stochastic matrix, so the graph Laplacian is I Â and has the same eigenvectors as those of Â. 16
17 Anchor Graph Hashing (AGH) Solvethe relaxed GH problem using the Anchor Graph Due to low rank of Â, its r eigenvectors Y = ZW (excluding the trivial 1) can be easily solved via a small eigenvalue decomposition on the mxm matrix. looks like r eigenvectors on m anchors. The r dim Hamming Embeddings are sgn(zw). 17
18 Eigenvector to Eigenfunction Supposeoneeigenvector one eigenvector y = Zw w 18
19 Theorem Given m anchor points U={u j } and any sample x, define a feature map as follows where =1 iff anchor u j is one of s nearest anchors of x in U according to the distance function D( ). ( ) Then the NystrÖm eigenfunction extended from the Anchor Graph Laplacian eigenvector y k = Zw k is interpolation i over m entries 19
20 Steps Training: 1) Building an Anchor Graph (Z) on nd dim data points needs (T isthe iterationnumber number ofk means) K means). 2) Computing & binarizing r eigenvectors of the Anchor Graph Laplacian needs. 3) Build the hash table O(n). Test (hashing): 1) Generalizing r eigenvectors to any unseen point via NystrÖm extension and binarizing needs. 2) Inverse lookup in the hash table O(1). Linear training time & constant hashing time! 20
21 Hierarchical Hashing The higher eigenvec/eigenfunc corresponding to the higher graph Laplacian eigenvalue is of low quality for partitioning. [Shi et al. 00] We propose two layer hashing to revisit the lower (smoother) eigenfuncs to generate multiple hash bits. 1 Eigenvalues of Anchor Graph Laplacian reuse
22 Hierarchical Hashing balanced partitioning 22
23 1 AGH vs. 2 AGH One Layer AGH = Binarizing r eigenfuncs into r hash bits hash functions r bit code Two Layer AGH = Hierarchically thresholding r/2 eigenfuncs into r hash bits r bit code 23
24 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions 24
25 MNIST 70kdigit imagesfrom 10classes classes, 1000queries uniformly sampled from 10 classes. Compare 8 unsupervised hashing methods, L 2 linear scan, and Spectral Embedding (SE) L 2 linear scan (real valued version of 1 AGH). Measure Hamming radius 2 precision (truly hashing) and Hamming ranking accuracy (brute force search) in terms of true neighbors which take the same class label. 25
26 MNIST Pre ecision with hin Hammin ng radius (a) Precision vs. # bits (b) Precision vs. # anchors much higher LSH AGH>SE L 2 > L 2 PCAH 0.7 drop # bits USPLH SH KLSH SIKH 1-AGH 2AGH 2-AGH top-5000 neighbors Precision of l 2 Scan SE l 2 Scan (48 dims) 1-AGH (48 bits) 2-AGH (48 bits) # anchors (a) Precision within Hamming radius 2 using hash lookup (fix m=300 to run AGH); (b) Hamming ranking precision of top 5k ranked neighbors. 26
27 MNIST Precision (c) Precision curves (1-AGH vs. 2-AGH) l 2 Scan 1-AGH (24 bits) 1-AGH (48 bits) 2-AGH (48 bits) Recall AGH (r) > 1 AGH (r/2) # samples (d) Recall curves (1-AGH vs. 2-AGH) Recall 2 AGH (r) > 1 AGH (r) l 2 Scan 1-AGH (24 bits) AGH (48 bits) 2-AGH (48 bits) # samples x 10 4 (c) Hamming ranking precision curves (fix m=300); (d) Hamming ranking recall curves (fix m=300). 27
28 NUS WIDE within Ham mming radiu us 2 Precision About 270k web images from 81 semantic tags, 2100 query images sampled from 21 frequent tags. True neighbors are those sharing at least one semantic tag. (a) Precision vs. # bits LSH PCAH USPLH SH 0.48 KLSH SIKH AGH 2-AGH 0.47 n of top neighbo ors (b) Precision vs. # anchors much higher 2 AGH>SE L 2 > L 2 drop Precisio l 2 Scan #bit bits # anchors SE l 2 Scan (48 dims) 1-AGH (48 bits) 2-AGH (48 bits)
29 NUS WIDE (a)(c)(d) fix m=300. (b) top 5k hammingranking precision. Pre cision (c) Precision curves (1-AGH vs. 2-AGH) l 2 Scan 1-AGH (24 bits) 1-AGH (48 bits) 2-AGH (48 bits) # samples Re ecall AGH (r) > 1 AGH (r/2) (d) Recall curves (1-AGH vs. 2-AGH) Recall 2 AGH (r) > 1 AGH (r) l 2 Scan AGH (24 bits) 1-AGH (48 bits) 2-AGH (48 bits) # samples x
30 Hamming Ranking Method MAP on MNIST MP/5k on NUS WIDE r=24 r=48 r=24 r=48 supervised L 2 Scan SE L 2 Scan BRE (NIPS 10) SH (NIPS 09) KLSH (ICCV 09) linear SIKH (NIPS 10) USPLH (ICML 10) AGH AGH
31 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions 31
32 Conclusions Underthe unsupervised scenario, almost all hashing methods are inferior to L 2 linear scan in search accuracy. Our AGH method outperforms L 2 scan in terms of searching semantic neighbors. We have applied Anchor Graphs to achieve scalability in SSL, clustering, and web image search. AGH will have wide applications, i even only the compact code generation scheme. 32
Rongrong Ji (Columbia), Yu Gang Jiang (Fudan), June, 2012
Supervised Hashing with Kernels Wei Liu (Columbia Columbia), Jun Wang (IBM IBM), Rongrong Ji (Columbia), Yu Gang Jiang (Fudan), and Shih Fu Chang (Columbia Columbia) June, 2012 Outline Motivations Problem
More informationApproximate Nearest Neighbor Search. Deng Cai Zhejiang University
Approximate Nearest Neighbor Search Deng Cai Zhejiang University The Era of Big Data How to Find Things Quickly? Web 1.0 Text Search Sparse feature Inverted Index How to Find Things Quickly? Web 2.0, 3.0
More informationover Multi Label Images
IBM Research Compact Hashing for Mixed Image Keyword Query over Multi Label Images Xianglong Liu 1, Yadong Mu 2, Bo Lang 1 and Shih Fu Chang 2 1 Beihang University, Beijing, China 2 Columbia University,
More informationSupervised Hashing for Image Retrieval via Image Representation Learning
Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia, Yan Pan, Cong Liu (Sun Yat-Sen University) Hanjiang Lai, Shuicheng Yan (National University of Singapore) Finding Similar
More informationAdaptive Binary Quantization for Fast Nearest Neighbor Search
IBM Research Adaptive Binary Quantization for Fast Nearest Neighbor Search Zhujin Li 1, Xianglong Liu 1*, Junjie Wu 1, and Hao Su 2 1 Beihang University, Beijing, China 2 Stanford University, Stanford,
More informationGraph-based Techniques for Searching Large-Scale Noisy Multimedia Data
Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Shih-Fu Chang Department of Electrical Engineering Department of Computer Science Columbia University Joint work with Jun Wang (IBM),
More informationAdaptive Hash Retrieval with Kernel Based Similarity
Adaptive Hash Retrieval with Kernel Based Similarity Xiao Bai a, Cheng Yan a, Haichuan Yang a, Lu Bai b, Jun Zhou c, Edwin Robert Hancock d a School of Computer Science and Engineering, Beihang University,
More informationLarge-Scale Face Manifold Learning
Large-Scale Face Manifold Learning Sanjiv Kumar Google Research New York, NY * Joint work with A. Talwalkar, H. Rowley and M. Mohri 1 Face Manifold Learning 50 x 50 pixel faces R 2500 50 x 50 pixel random
More informationIsometric Mapping Hashing
Isometric Mapping Hashing Yanzhen Liu, Xiao Bai, Haichuan Yang, Zhou Jun, and Zhihong Zhang Springer-Verlag, Computer Science Editorial, Tiergartenstr. 7, 692 Heidelberg, Germany {alfred.hofmann,ursula.barth,ingrid.haas,frank.holzwarth,
More informationHashing with Binary Autoencoders
Hashing with Binary Autoencoders Ramin Raziperchikolaei Electrical Engineering and Computer Science University of California, Merced http://eecs.ucmerced.edu Joint work with Miguel Á. Carreira-Perpiñán
More informationImage Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.
Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 18 Image Hashing Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph:
More informationSupervised Hashing for Image Retrieval via Image Representation Learning
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia 1, Yan Pan 1, Hanjiang Lai 1,2, Cong Liu
More informationLarge-scale visual recognition Efficient matching
Large-scale visual recognition Efficient matching Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline!! Preliminary!! Locality Sensitive Hashing: the two modes!! Hashing!! Embedding!!
More informationlearning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function:
1 Query-adaptive Image Retrieval by Deep Weighted Hashing Jian Zhang and Yuxin Peng arxiv:1612.2541v2 [cs.cv] 9 May 217 Abstract Hashing methods have attracted much attention for large scale image retrieval.
More informationCLSH: Cluster-based Locality-Sensitive Hashing
CLSH: Cluster-based Locality-Sensitive Hashing Xiangyang Xu Tongwei Ren Gangshan Wu Multimedia Computing Group, State Key Laboratory for Novel Software Technology, Nanjing University xiangyang.xu@smail.nju.edu.cn
More informationSupervised Hashing with Kernels
Supervised Hashing with Kernels Wei Liu Jun Wang Rongrong Ji Yu-Gang Jiang Shih-Fu Chang Electrical Engineering Department, Columbia University, New York, NY, USA IBM T. J. Watson Research Center, Yorktown
More informationLearning to Hash on Structured Data
Learning to Hash on Structured Data Qifan Wang, Luo Si and Bin Shen Computer Science Department, Purdue University West Lafayette, IN 47907, US wang868@purdue.edu, lsi@purdue.edu, bshen@purdue.edu Abstract
More informationProgressive Generative Hashing for Image Retrieval
Progressive Generative Hashing for Image Retrieval Yuqing Ma, Yue He, Fan Ding, Sheng Hu, Jun Li, Xianglong Liu 2018.7.16 01 BACKGROUND the NNS problem in big data 02 RELATED WORK Generative adversarial
More informationLearning independent, diverse binary hash functions: pruning and locality
Learning independent, diverse binary hash functions: pruning and locality Ramin Raziperchikolaei and Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced
More informationEvaluation of Hashing Methods Performance on Binary Feature Descriptors
Evaluation of Hashing Methods Performance on Binary Feature Descriptors Jacek Komorowski and Tomasz Trzcinski 2 Warsaw University of Technology, Warsaw, Poland jacek.komorowski@gmail.com 2 Warsaw University
More informationSpectral Clustering X I AO ZE N G + E L HA M TA BA S SI CS E CL A S S P R ESENTATION MA RCH 1 6,
Spectral Clustering XIAO ZENG + ELHAM TABASSI CSE 902 CLASS PRESENTATION MARCH 16, 2017 1 Presentation based on 1. Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4
More informationAsymmetric Discrete Graph Hashing
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) Asymmetric Discrete Graph Hashing Xiaoshuang Shi, Fuyong Xing, Kaidi Xu, Manish Sapkota, Lin Yang University of Florida,
More informationLarge Scale Mobile Visual Search
Large Scale Mobile Visual Search Ricoh, HotPaper (by Mac Funamizu) Shih-Fu Chang June 2012 The Explosive Growth of Visual Data broadcast Social portals video blogs 70,000 TB/year, 100 million hours 60
More informationVariable Bit Quantisation for LSH
Sean Moran School of Informatics The University of Edinburgh EH8 9AB, Edinburgh, UK sean.moran@ed.ac.uk Variable Bit Quantisation for LSH Victor Lavrenko School of Informatics The University of Edinburgh
More informationarxiv: v2 [cs.cv] 27 Nov 2017
Deep Supervised Discrete Hashing arxiv:1705.10999v2 [cs.cv] 27 Nov 2017 Qi Li Zhenan Sun Ran He Tieniu Tan Center for Research on Intelligent Perception and Computing National Laboratory of Pattern Recognition
More informationSmart Hashing Update for Fast Response
Smart Hashing Update for Fast Response Qiang Yang, Longkai Huang, Wei-Shi Zheng*, and Yingbiao Ling School of Information Science and Technology, Sun Yat-sen University, China Guangdong Province Key Laboratory
More informationLearning Affine Robust Binary Codes Based on Locality Preserving Hash
Learning Affine Robust Binary Codes Based on Locality Preserving Hash Wei Zhang 1,2, Ke Gao 1, Dongming Zhang 1, and Jintao Li 1 1 Advanced Computing Research Laboratory, Beijing Key Laboratory of Mobile
More informationLearning Low-rank Transformations: Algorithms and Applications. Qiang Qiu Guillermo Sapiro
Learning Low-rank Transformations: Algorithms and Applications Qiang Qiu Guillermo Sapiro Motivation Outline Low-rank transform - algorithms and theories Applications Subspace clustering Classification
More informationRECENT years have witnessed the rapid growth of image. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval
SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval Jian Zhang, Yuxin Peng, and Junchao Zhang arxiv:607.08477v [cs.cv] 28 Jul 206 Abstract The hashing methods have been widely used for efficient
More informationSimilarity-Preserving Binary Hashing for Image Retrieval in large databases
Universidad Politécnica de Valencia Master s Final Project Similarity-Preserving Binary Hashing for Image Retrieval in large databases Author: Guillermo García Franco Supervisor: Dr. Roberto Paredes Palacios
More informationon learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015
on learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015 Vector visual representation Fixed-size image representation High-dim (100 100,000) Generic, unsupervised: BoW,
More informationComplementary Hashing for Approximate Nearest Neighbor Search
Complementary Hashing for Approximate Nearest Neighbor Search Hao Xu Jingdong Wang Zhu Li Gang Zeng Shipeng Li Nenghai Yu MOE-MS KeyLab of MCC, University of Science and Technology of China, P. R. China
More informationMetric Learning Applied for Automatic Large Image Classification
September, 2014 UPC Metric Learning Applied for Automatic Large Image Classification Supervisors SAHILU WENDESON / IT4BI TOON CALDERS (PhD)/ULB SALIM JOUILI (PhD)/EuraNova Image Database Classification
More informationIterative Quantization: A Procrustean Approach to Learning Binary Codes
Iterative Quantization: A Procrustean Approach to Learning Binary Codes Yunchao Gong and Svetlana Lazebnik Department of Computer Science, UNC Chapel Hill, NC, 27599. {yunchao,lazebnik}@cs.unc.edu Abstract
More informationVisual Representations for Machine Learning
Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering
More informationFast Indexing and Search. Lida Huang, Ph.D. Senior Member of Consulting Staff Magma Design Automation
Fast Indexing and Search Lida Huang, Ph.D. Senior Member of Consulting Staff Magma Design Automation Motivation Object categorization? http://www.cs.utexas.edu/~grauman/slides/jain_et_al_cvpr2008.ppt Motivation
More informationAngular Quantization-based Binary Codes for Fast Similarity Search
Angular Quantization-based Binary Codes for Fast Similarity Search Yunchao Gong, Sanjiv Kumar, Vishal Verma, Svetlana Lazebnik Google Research, New York, NY, USA Computer Science Department, University
More informationScalable Graph Hashing with Feature Transformation
Proceedings of the Twenty-ourth International Joint Conference on Artificial Intelligence (IJCAI 2015) Scalable Graph Hashing with eature Transformation Qing-Yuan Jiang and Wu-Jun Li National Key Laboratory
More informationInductive Hashing on Manifolds
Inductive Hashing on Manifolds Fumin Shen, Chunhua Shen, Qinfeng Shi, Anton van den Hengel, Zhenmin Tang The University of Adelaide, Australia Nanjing University of Science and Technology, China Abstract
More informationSupervised Hashing with Latent Factor Models
Supervised Hashing with Latent Factor Models Peichao Zhang Shanghai Key Laboratory of Scalable Computing and Systems Department of Computer Science and Engineering Shanghai Jiao Tong University, China
More informationSequential Projection Learning for Hashing with Compact Codes
Jun Wang jwang@ee.columbia.edu Department of Electrical Engineering, Columbia University, New Yor, NY 27, USA Sanjiv Kumar Google Research, New Yor, NY, USA sanjiv@google.com Shih-Fu Chang sfchang@ee.columbia.com
More informationSearching in one billion vectors: re-rank with source coding
Searching in one billion vectors: re-rank with source coding Hervé Jégou INRIA / IRISA Romain Tavenard Univ. Rennes / IRISA Laurent Amsaleg CNRS / IRISA Matthijs Douze INRIA / LJK ICASSP May 2011 LARGE
More informationarxiv: v1 [cs.cv] 19 Oct 2017 Abstract
Improved Search in Hamming Space using Deep Multi-Index Hashing Hanjiang Lai and Yan Pan School of Data and Computer Science, Sun Yan-Sen University, China arxiv:70.06993v [cs.cv] 9 Oct 207 Abstract Similarity-preserving
More informationAarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg
Spectral Clustering Aarti Singh Machine Learning 10-701/15-781 Apr 7, 2010 Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg 1 Data Clustering Graph Clustering Goal: Given data points X1,, Xn and similarities
More informationSpectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014
Spectral Clustering Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014 What are we going to talk about? Introduction Clustering and
More informationDeep Supervised Hashing with Triplet Labels
Deep Supervised Hashing with Triplet Labels Xiaofang Wang, Yi Shi, Kris M. Kitani Carnegie Mellon University, Pittsburgh, PA 15213 USA Abstract. Hashing is one of the most popular and powerful approximate
More informationLearning to Hash with Binary Reconstructive Embeddings
Learning to Hash with Binary Reconstructive Embeddings Brian Kulis Trevor Darrell Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2009-0
More informationGeometric data structures:
Geometric data structures: Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade Sham Kakade 2017 1 Announcements: HW3 posted Today: Review: LSH for Euclidean distance Other
More informationLocality-Sensitive Codes from Shift-Invariant Kernels Maxim Raginsky (Duke) and Svetlana Lazebnik (UNC)
Locality-Sensitive Codes from Shift-Invariant Kernels Maxim Raginsky (Duke) and Svetlana Lazebnik (UNC) Goal We want to design a binary encoding of data such that similar data points (similarity measures
More informationBinary Embedding with Additive Homogeneous Kernels
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-7) Binary Embedding with Additive Homogeneous Kernels Saehoon Kim, Seungjin Choi Department of Computer Science and Engineering
More informationLearning to Hash with Binary Reconstructive Embeddings
Learning to Hash with Binary Reconstructive Embeddings Brian Kulis and Trevor Darrell UC Berkeley EECS and ICSI Berkeley, CA {kulis,trevor}@eecs.berkeley.edu Abstract Fast retrieval methods are increasingly
More informationMachine Learning. Nonparametric methods for Classification. Eric Xing , Fall Lecture 2, September 12, 2016
Machine Learning 10-701, Fall 2016 Nonparametric methods for Classification Eric Xing Lecture 2, September 12, 2016 Reading: 1 Classification Representing data: Hypothesis (classifier) 2 Clustering 3 Supervised
More informationLocally Linear Landmarks for large-scale manifold learning
Locally Linear Landmarks for large-scale manifold learning Max Vladymyrov and Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced http://eecs.ucmerced.edu
More informationComplementary Projection Hashing
23 IEEE International Conference on Computer Vision Complementary Projection Hashing Zhongg Jin, Yao Hu, Yue Lin, Debing Zhang, Shiding Lin 2, Deng Cai, Xuelong Li 3 State Key Lab of CAD&CG, College of
More informationThorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA
Retrospective ICML99 Transductive Inference for Text Classification using Support Vector Machines Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA Outline The paper in
More informationTask Description: Finding Similar Documents. Document Retrieval. Case Study 2: Document Retrieval
Case Study 2: Document Retrieval Task Description: Finding Similar Documents Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 11, 2017 Sham Kakade 2017 1 Document
More informationA Taxonomy of Semi-Supervised Learning Algorithms
A Taxonomy of Semi-Supervised Learning Algorithms Olivier Chapelle Max Planck Institute for Biological Cybernetics December 2005 Outline 1 Introduction 2 Generative models 3 Low density separation 4 Graph
More informationLarge scale object/scene recognition
Large scale object/scene recognition Image dataset: > 1 million images query Image search system ranked image list Each image described by approximately 2000 descriptors 2 10 9 descriptors to index! Database
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu [Kumar et al. 99] 2/13/2013 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationThe Normalized Distance Preserving Binary Codes and Distance Table *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 32, XXXX-XXXX (2016) The Normalized Distance Preserving Binary Codes and Distance Table * HONGWEI ZHAO 1,2, ZHEN WANG 1, PINGPING LIU 1,2 AND BIN WU 1 1.
More informationClustering. So far in the course. Clustering. Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. dist(x, y) = x y 2 2
So far in the course Clustering Subhransu Maji : Machine Learning 2 April 2015 7 April 2015 Supervised learning: learning with a teacher You had training data which was (feature, label) pairs and the goal
More informationLost in Binarization: Query-Adaptive Ranking for Similar Image Search with Compact Codes
Lost in Binarization: Query-Adaptive Ranking for Similar Image Search with Compact Codes Yu-Gang Jiang, Jun Wang, Shih-Fu Chang Columbia University, New York, NY 10027, USA IBM T.J. Watson Research Center,
More informationLarge Scale Nearest Neighbor Search Theories, Algorithms, and Applications. Junfeng He
Large Scale Nearest Neighbor Search Theories, Algorithms, and Applications Junfeng He Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School
More informationarxiv: v5 [cs.cv] 15 May 2018
A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search The State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, China Alibaba-Zhejiang University Joint Institute
More informationInstances on a Budget
Retrieving Similar or Informative Instances on a Budget Kristen Grauman Dept. of Computer Science University of Texas at Austin Work with Sudheendra Vijayanarasimham Work with Sudheendra Vijayanarasimham,
More informationAssigning affinity-preserving binary hash codes to images
Assigning affinity-preserving binary hash codes to images Jason Filippou jasonfil@cs.umd.edu Varun Manjunatha varunm@umiacs.umd.edu June 10, 2014 Abstract We tackle the problem of generating binary hashcodes
More informationMTTTS17 Dimensionality Reduction and Visualization. Spring 2018 Jaakko Peltonen. Lecture 11: Neighbor Embedding Methods continued
MTTTS17 Dimensionality Reduction and Visualization Spring 2018 Jaakko Peltonen Lecture 11: Neighbor Embedding Methods continued This Lecture Neighbor embedding by generative modeling Some supervised neighbor
More information6.819 / 6.869: Advances in Computer Vision
6.819 / 6.869: Advances in Computer Vision Image Retrieval: Retrieval: Information, images, objects, large-scale Website: http://6.869.csail.mit.edu/fa15/ Instructor: Yusuf Aytar Lecture TR 9:30AM 11:00AM
More informationSearch Engines. Information Retrieval in Practice
Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Classification and Clustering Classification and clustering are classical pattern recognition / machine learning problems
More informationLocality- Sensitive Hashing Random Projections for NN Search
Case Study 2: Document Retrieval Locality- Sensitive Hashing Random Projections for NN Search Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 18, 2017 Sham Kakade
More informationFast Supervised Hashing with Decision Trees for High-Dimensional Data
Fast Supervised Hashing with Decision Trees for High-Dimensional Data Guosheng Lin, Chunhua Shen, Qinfeng Shi, Anton van den Hengel, David Suter The University of Adelaide, SA 5005, Australia Abstract
More informationSemi-Supervised Hashing for Scalable Image Retrieval
Semi-Supervised Hashing for Scalable Image Retrieval Jun Wang Columbia University New Yor, NY, 10027 jwang@ee.columbia.edu Sanjiv Kumar Google Research New Yor, NY, 10011 sanjiv@google.com Shih-Fu Chang
More informationClustering. Subhransu Maji. CMPSCI 689: Machine Learning. 2 April April 2015
Clustering Subhransu Maji CMPSCI 689: Machine Learning 2 April 2015 7 April 2015 So far in the course Supervised learning: learning with a teacher You had training data which was (feature, label) pairs
More informationCompact Hash Code Learning with Binary Deep Neural Network
Compact Hash Code Learning with Binary Deep Neural Network Thanh-Toan Do, Dang-Khoa Le Tan, Tuan Hoang, Ngai-Man Cheung arxiv:7.0956v [cs.cv] 6 Feb 08 Abstract In this work, we firstly propose deep network
More informationCS 340 Lec. 4: K-Nearest Neighbors
CS 340 Lec. 4: K-Nearest Neighbors AD January 2011 AD () CS 340 Lec. 4: K-Nearest Neighbors January 2011 1 / 23 K-Nearest Neighbors Introduction Choice of Metric Overfitting and Underfitting Selection
More informationScalable Similarity Search with Optimized Kernel Hashing
Scalable Similarity Search with Optimized Kernel Hashing Junfeng He Columbia University New York, NY, 7 jh7@columbia.edu Wei Liu Columbia University New York, NY, 7 wliu@ee.columbia.edu Shih-Fu Chang Columbia
More informationNetwork embedding. Cheng Zheng
Network embedding Cheng Zheng Outline Problem definition Factorization based algorithms --- Laplacian Eigenmaps(NIPS, 2001) Random walk based algorithms ---DeepWalk(KDD, 2014), node2vec(kdd, 2016) Deep
More informationLearning to Hash for Indexing Big DataVA Survey
INVITED PAPER Learning to Hash for Indexing Big DataVA Survey This paper provides readers with a systematic understanding of insights, pros, and cons of the emerging indexing and search methods for Big
More informationImage Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 16
Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 16 Subspace/Transform Optimization Zhu Li Dept of CSEE, UMKC Office: FH560E, Email:
More informationSemi-supervised Data Representation via Affinity Graph Learning
1 Semi-supervised Data Representation via Affinity Graph Learning Weiya Ren 1 1 College of Information System and Management, National University of Defense Technology, Changsha, Hunan, P.R China, 410073
More informationSurfaces, meshes, and topology
Surfaces from Point Samples Surfaces, meshes, and topology A surface is a 2-manifold embedded in 3- dimensional Euclidean space Such surfaces are often approximated by triangle meshes 2 1 Triangle mesh
More informationGlobally and Locally Consistent Unsupervised Projection
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Globally and Locally Consistent Unsupervised Projection Hua Wang, Feiping Nie, Heng Huang Department of Electrical Engineering
More informationA Sparse Embedding and Least Variance Encoding Approach to Hashing
A Sparse Embedding and Least Variance Encoding Approach to Hashing Xiaofeng Zhu, Lei Zhang, Member, IEEE, Zi Huang Abstract Hashing is becoming increasingly important in large-scale image retrieval for
More informationDimension Reduction CS534
Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of
More informationDigital Geometry Processing
Digital Geometry Processing Spring 2011 physical model acquired point cloud reconstructed model 2 Digital Michelangelo Project Range Scanning Systems Passive: Stereo Matching Find and match features in
More informationTime and Space Efficient Spectral Clustering via Column Sampling
Time and Space Efficient Spectral Clustering via Column Sampling Mu Li Xiao-Chen Lian James T. Kwok Bao-Liang Lu, 3 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai
More informationSPECTRAL SPARSIFICATION IN SPECTRAL CLUSTERING
SPECTRAL SPARSIFICATION IN SPECTRAL CLUSTERING Alireza Chakeri, Hamidreza Farhidzadeh, Lawrence O. Hall Department of Computer Science and Engineering College of Engineering University of South Florida
More informationDeep Semantic-Preserving and Ranking-Based Hashing for Image Retrieval
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) Deep Semantic-Preserving and Ranking-Based Hashing for Image Retrieval Ting Yao, Fuchen Long, Tao Mei,
More informationOptimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink
Optimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink Rajesh Bordawekar IBM T. J. Watson Research Center bordaw@us.ibm.com Pidad D Souza IBM Systems pidsouza@in.ibm.com 1 Outline
More informationNEarest neighbor search plays an important role in
1 EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on knn Graph Cong Fu, Deng Cai arxiv:1609.07228v3 [cs.cv] 3 Dec 2016 Abstract Approximate nearest neighbor (ANN) search
More informationChapter 7. Learning Hypersurfaces and Quantisation Thresholds
Chapter 7 Learning Hypersurfaces and Quantisation Thresholds The research presented in this Chapter has been previously published in Moran (2016). 7.1 Introduction In Chapter 1 I motivated this thesis
More informationNearest Neighbor with KD Trees
Case Study 2: Document Retrieval Finding Similar Documents Using Nearest Neighbors Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox January 22 nd, 2013 1 Nearest
More information442 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 15, NO. 2, FEBRUARY 2013
442 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 15, NO. 2, FEBRUARY 2013 Query-Adaptive Image Search With Hash Codes Yu-Gang Jiang, Jun Wang, Member, IEEE, Xiangyang Xue, Member, IEEE, and Shih-Fu Chang, Fellow,
More informationIntroduction to Machine Learning
Introduction to Machine Learning Clustering Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1 / 19 Outline
More informationAlgorithms for Nearest Neighbors
Algorithms for Nearest Neighbors State-of-the-Art Yury Lifshits Steklov Institute of Mathematics at St.Petersburg Yandex Tech Seminar, April 2007 1 / 28 Outline 1 Problem Statement Applications Data Models
More informationBilinear discriminant analysis hashing: A supervised hashing approach for high-dimensional data
Bilinear discriminant analysis hashing: A supervised hashing approach for high-dimensional data Author Liu, Yanzhen, Bai, Xiao, Yan, Cheng, Zhou, Jun Published 2017 Journal Title Lecture Notes in Computer
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationThe Constrained Laplacian Rank Algorithm for Graph-Based Clustering
The Constrained Laplacian Rank Algorithm for Graph-Based Clustering Feiping Nie, Xiaoqian Wang, Michael I. Jordan, Heng Huang Department of Computer Science and Engineering, University of Texas, Arlington
More informationBit-Scalable Deep Hashing with Regularized Similarity Learning for Image Retrieval and Person Re-identification
IEEE TRANSACTIONS ON IMAGE PROCESSING 1 Bit-Scalable Deep Hashing with Regularized Similarity Learning for Image Retrieval and Person Re-identification Ruimao Zhang, Liang Lin, Rui Zhang, Wangmeng Zuo,
More information