Searching in one billion vectors: re-rank with source coding
1 Searching in one billion vectors: re-rank with source coding
- Hervé Jégou, INRIA / IRISA
- Romain Tavenard, Univ. Rennes / IRISA
- Laurent Amsaleg, CNRS / IRISA
- Matthijs Douze, INRIA / LJK
- ICASSP, May 2011
2 Large media databases, this means:
- Image search: 1 million images, 2 billion local SIFT descriptors (d=128)
- Video search: hundreds or thousands of hours of video
  - E.g., TRECVID evaluation tasks
  - Billions of audio and video descriptors
- Music retrieval: Columbia Million-song database
  - About 1 billion chroma descriptors (d=12, but usually compounded into 60-dimensional vectors to improve time consistency [Serra et al.])
3 Concrete example: our TRECVID 2010 participation in copy detection
- Number of indexed vectors (database: 400h)
  - Video d= billion image descriptors (SIFT)
  - Audio d= million descriptors (time/frequency energies)
- Similar videos are retrieved based on these local descriptors
- For each descriptor, we look at its (Euclidean) nearest neighbors
- Exhaustive linear search is intractable
  - 1000 descriptors for 1 query frame: trillions of high-dimensional vector comparisons
  - = in the order of elementary operations
- Need for powerful approximate neighbor search: CPU and memory efficient
4 Approximate nearest neighbor (ANN) search
- Three (contradictory) performance criteria
  - search quality (retrieved vectors are actual nearest neighbors)
  - speed
  - memory usage
- Most algorithms mainly optimize the first two criteria: LSH, FLANN, etc.
- Locality Sensitive Hashing
  - good theoretical properties
  - memory consuming: 10 hash tables means at least 40 bytes per vector
  - no data adaptation
- FLANN [Muja & Lowe 09]
  - excellent accuracy/speed trade-off
  - needs a lot of memory: more than 250MB for only 1 million vectors
5 Typical ANN schemes: 2 stages
- For the second (re-ranking) stage, we need the raw descriptors, i.e.,
  - either a huge amount of memory: 128GB for 1 billion SIFTs
  - or disk accesses, which severely impact efficiency
6 Typical ANN schemes: 2 stages (NOT ON A LARGE SCALE)
- For the second (re-ranking) stage, we need the raw descriptors, i.e.,
  - either a huge amount of memory: 128GB for 1 billion SIFTs
  - or disk accesses, which severely impact efficiency
- In this paper: an alternative re-ranking method
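The 128GB figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes each SIFT component is stored as a single byte (an assumption, but the one that matches the slide's number); the function names are illustrative only.

```python
def raw_descriptor_bytes(num_vectors: int, dim: int, bytes_per_component: int = 1) -> int:
    """Memory needed to keep every raw descriptor in RAM."""
    return num_vectors * dim * bytes_per_component


def gigabytes(num_bytes: int) -> float:
    """Convert a byte count to decimal gigabytes (10**9 bytes)."""
    return num_bytes / 1e9


# One billion SIFT descriptors with d = 128, assumed to use one byte per component.
sift_bytes = raw_descriptor_bytes(10**9, 128)
print(gigabytes(sift_bytes))  # 128.0
```

With 4-byte floats instead of single bytes, the same call with `bytes_per_component=4` gives four times as much, which only strengthens the slide's point.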
7 Searching with quantization [J. et al., TPAMI 11]
- Main idea: ANN based on a compressed representation of the database vectors
  - Each database vector y is represented by $q_c(y)$, where $q_c(\cdot)$ is a product quantizer
- Search = distance approximation problem
  - The distance between a query vector x and a database vector y is estimated by $d_c(x,y)^2 = \|x - q_c(y)\|^2$
- Distance is estimated in the compressed domain
  - typically 8 table look-ups and additions per distance estimation (for SIFT)
  - almost as fast as Hamming embedding methods (e.g., Weiss et al.)
  - proved average upper bound on the distance approximation error
8 Re-ranking: refine the descriptor's representation
- A good property of compression-based indexing: explicit (approximate) reconstruction of each database vector
- Instead of using raw descriptors for the re-ranking stage, refine this reconstruction by encoding the residual vector
  - The residual error vector is $r(y) = y - q_c(y)$
  - This error vector is quantized using another (fine) quantizer $q_r$
  - y is now approximated by $\hat{y} = q_c(y) + q_r(y - q_c(y))$, which improves the initial estimate
- Trade-off precision/memory: controlled by a parameter m' = number of bytes used to represent an index for quantizer $q_r$
9 Re-ranking: refine the descriptor's representation
ALGORITHM
- Retrieve a list of k' potential nearest neighbors using the compression-based search method (k' > k): $d_c(x,y)^2 = \|x - q_c(y)\|^2$
- Perform the explicit reconstruction of the improved estimates $\hat{y} = q_c(y) + q_r(y - q_c(y))$
- Find the k nearest neighbors based on the improved distance estimates $d_r(x,y)^2 = \|x - \hat{y}\|^2$
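The two-stage algorithm can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: both quantizers are tiny hand-written codebooks rather than the product quantizers of the slides, stage 1 scans all codes exhaustively, and every name (`encode`, `rerank_search`, the toy data) is hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(codebook: List[Vector], v: Vector) -> int:
    """Index of the centroid closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def encode(y: Vector, coarse: List[Vector], fine: List[Vector]) -> Tuple[int, int]:
    """Store y as (index of q_c(y), index of q_r(y - q_c(y)))."""
    c = nearest(coarse, y)
    residual = [yi - ci for yi, ci in zip(y, coarse[c])]
    return c, nearest(fine, residual)


def rerank_search(x, codes, coarse, fine, k: int, k_prime: int) -> List[int]:
    """Shortlist k' candidates on d_c, then keep the k best on d_r."""
    # Stage 1: rank by d_c(x, y)^2 = ||x - q_c(y)||^2 and keep k' candidates.
    shortlist = sorted(range(len(codes)),
                       key=lambda i: sq_dist(x, coarse[codes[i][0]]))[:k_prime]

    # Stage 2: rebuild y_hat = q_c(y) + q_r(residual) and rank by ||x - y_hat||^2.
    def refined(i: int) -> float:
        c, r = codes[i]
        y_hat = [ci + ri for ci, ri in zip(coarse[c], fine[r])]
        return sq_dist(x, y_hat)

    return sorted(shortlist, key=refined)[:k]


# Toy 2-D data: two coarse centroids and four residual centroids.
coarse = [(0.0, 0.0), (10.0, 10.0)]
fine = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
database = [(1.0, 1.0), (-1.0, -1.0), (9.0, 11.0), (11.0, 9.0)]
codes = [encode(y, coarse, fine) for y in database]

# Vectors 0 and 1 share a coarse centroid, so stage 1 cannot separate them.
result = rerank_search((-0.8, -1.1), codes, coarse, fine, k=1, k_prime=2)
print(result)  # [1]
```

In the toy run, stage 1 leaves vectors 0 and 1 tied; only the refined reconstruction identifies vector 1 as the closer one, and no raw descriptor is read at query time.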
10 Search results: searching in 1 billion vectors
- Parameters
  - indexing structure: m = 8 bytes per vector
  - m' = number of bytes used for the re-ranking stage
11 Efficiency and balancing the bytes
- The re-ranking stage has a limited cost (almost negligible with typical parameters)
- Timings: between 1ms and 200ms to search in 1 billion vectors, depending on the desired precision
- For a fixed memory usage, it is more efficient to use fewer bytes for the first stage and more for the re-ranking stage
- Moreover, this yields similar or even better search quality
12 BIGANN: a billion-sized dataset to evaluate approximate search
- Many ANN algorithms are evaluated on toy datasets
  - Discrepancy of scale between ANN evaluation and real applications
  - Key practical problems are ignored: memory usage is not taken into account, over-fitting problems due to many parameters w.r.t. dataset size
- BIGANN: 128-dimensional SIFT descriptors (David Lowe's implementation)
- Three subsets
  - 1 billion vectors (the database that we want to search)
  - 10,000 queries
  - 100 million independent vectors to learn the algorithm's parameters
- Ground truth is provided for each query
  - Based on exact Euclidean distance comparisons (1 billion x 10,000 in total)
  - Lists of the exact 1000 nearest neighbors and corresponding distances
13 Conclusion
- A source coding based re-ranking approach
  - does not use raw descriptors
  - improved efficiency/quality for fixed memory usage
- Searching in 1 billion vectors with high precision is possible on a commodity server
- BIGANN: a large dataset for ANN evaluation
- Online: toy Matlab package of our compression-based method:
14 Measurement based techniques
- Several methods address this memory problem by designing/learning an embedding function mapping the Euclidean space into the compact Hamming space
- Goal: neighborhood in the Hamming space reflects Euclidean neighborhood
- Related works: Charikar's LSH. More recently:
  - "Small codes and large databases for recognition", Torralba et al.
  - "Spectral Hashing", Weiss et al. 09
- But this is exhaustive search. Hybrid approach (non-exhaustive + binary codes):
  - "Hamming Embedding and Weak geometric consistency for large scale image search", J. et al. 08
15 Searching with quantization [J. et al., TPAMI 11]
- Vector split into m subvectors: $y \rightarrow [y_1 | \dots | y_m]$
- Subvectors are quantized separately by quantizers, $q(y) = [q_1(y_1) | \dots | q_m(y_m)]$, where each $q_j$ is learned by k-means with a limited number of centroids
- Example: y = 128-dim vector split into 8 subvectors of dimension 16
  - 16 components per subvector: y_1, ..., y_8
  - 256 centroids per subquantizer: q_1, ..., q_8
  - quantized subvectors: q_1(y_1), ..., q_8(y_8)
  - 8 bits per subquantizer index, giving a 64-bit quantization index
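The encoding step of the example can be sketched as follows. The slide learns each sub-codebook with k-means; to keep the sketch short, the codebooks here are random placeholders (an assumption made purely for illustration), and the helper names `split`, `pq_encode` and `pq_decode` are hypothetical.

```python
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def split(y: Sequence[float], m: int) -> List[Sequence[float]]:
    """Cut y into m contiguous subvectors of equal length."""
    if len(y) % m:
        raise ValueError("dimension must be a multiple of m")
    step = len(y) // m
    return [y[j * step:(j + 1) * step] for j in range(m)]


def pq_encode(y: Sequence[float], codebooks) -> bytes:
    """One byte per subvector: the index of its nearest centroid."""
    subvectors = split(y, len(codebooks))
    return bytes(
        min(range(len(cb)), key=lambda i: sq_dist(cb[i], sub))
        for cb, sub in zip(codebooks, subvectors)
    )


def pq_decode(code: bytes, codebooks) -> List[float]:
    """Approximate reconstruction: concatenate the selected centroids."""
    out: List[float] = []
    for index, cb in zip(code, codebooks):
        out.extend(cb[index])
    return out


# 8 subquantizers, 256 centroids each, 16 components per centroid.
# Random centroids stand in for k-means training in this sketch.
rng = random.Random(0)
codebooks = [[[rng.random() for _ in range(16)] for _ in range(256)]
             for _ in range(8)]

y = [rng.random() for _ in range(128)]
code = pq_encode(y, codebooks)
print(len(code) * 8)  # 64 (bits)
```

Because each sub-codebook has 256 centroids, every index fits in one byte, so the 128-dimensional vector is stored as an 8-byte code.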
16 Product quantizer: asymmetric distance computation (ADC)
- Compute the square distance approximation in the compressed domain: $d(x,y)^2 \approx \|x - q(y)\|^2 = \sum_{j=1}^{m} \|x_j - q_j(y_j)\|^2$
- To compute the distance between a query and many codes
  - compute $\|x_j - c\|^2$ for each query subvector $x_j$ and all possible centroids c; these values are stored in look-up tables
  - for each database code: sum the elementary square distances
- Each 8x8=64-bit code requires only m=8 additions per distance!
- IVFADC: combination with an inverted file to avoid exhaustive search
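A minimal sketch of the look-up-table idea, using a toy product quantizer with m = 2 subquantizers of two centroids each (the codebooks, the query and the function names are made up for illustration):

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def build_tables(x: Sequence[float], codebooks) -> List[List[float]]:
    """tables[j][i] = squared distance from query subvector j to centroid i."""
    step = len(x) // len(codebooks)
    return [
        [sq_dist(x[j * step:(j + 1) * step], centroid) for centroid in cb]
        for j, cb in enumerate(codebooks)
    ]


def adc(tables: List[List[float]], code: Sequence[int]) -> float:
    """Asymmetric distance: one table look-up and one addition per subvector."""
    return sum(table[index] for table, index in zip(tables, code))


# Toy product quantizer: 4-dim vectors, 2 subvectors, 2 centroids each.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (2.0, 2.0)],
]
x = [1.0, 0.5, 0.0, 3.0]

tables = build_tables(x, codebooks)  # computed once per query
print(adc(tables, (1, 1)))  # 5.25
print(adc(tables, (0, 0)))  # 10.25
```

The tables depend only on the query, so their cost is paid once; each database code then costs m look-ups and additions, independent of the vector dimension.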
17 Biased estimator
- These estimators are biased
- The bias can be removed using quantization error terms, but this does not improve the NN search quality
18 Example
- Right: 2D anisotropic Gaussian
  - the greedy approach selects the dimension that best reduces the distance error per allocated bit
  - non-integer number of bits per dimension
- SDC: the number of possible distances is finite, but much higher than for Hamming Embedding: up to 2^D instead of D+1
- Left: 5 bits, clearly more than 6 distances!
19 Performance evaluation (with varying R)
- Comparison with other memory-efficient approximate neighbor search techniques
  - Hamming Embedding [Jegou 08]
  - Spectral Hashing [Weiss 09]
- Performance measured by searching 1M vectors (with varying R)
  - Searching in 1M SIFT descriptors
  - Searching in 1M GIST descriptors
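The slide does not spell out the measure behind "varying R". A common choice in this line of work is recall@R: the fraction of queries whose true nearest neighbor appears among the first R returned results. The sketch below assumes that definition; the function name and the toy result lists are hypothetical.

```python
from typing import List, Sequence


def recall_at_r(results: Sequence[Sequence[int]], true_nn: Sequence[int], r: int) -> float:
    """Fraction of queries whose true nearest neighbor is ranked in the top r."""
    hits = sum(1 for ranked, nn in zip(results, true_nn) if nn in ranked[:r])
    return hits / len(true_nn)


# Ranked result lists for three queries, and the true nearest neighbor of each.
results: List[List[int]] = [[5, 2, 9], [1, 7, 3], [8, 6, 0]]
true_nn = [2, 1, 4]

for r in (1, 2, 3):
    print(r, recall_at_r(results, true_nn, r))
```

Plotting this value for increasing R gives one curve per method, which is how memory-efficient techniques are usually compared at a fixed code size.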
20 Combination with an inverted file
21 Comparison with FLANN [Muja & Lowe 09]
- Tested on 1 million SIFTs
- 1.5 to 2 times faster than FLANN for the same accuracy
- Memory usage for 1M vectors (according to the top command)
  - FLANN: > 250MB
  - Ours: < 25MB
More informationDetection of Cut-And-Paste in Document Images
Detection of Cut-And-Paste in Document Images Ankit Gandhi and C. V. Jawahar Center for Visual Information Technology, IIIT-Hyderabad, India Email: ankit.gandhiug08@students.iiit.ac.in, jawahar@iiit.ac.in
More informationEfficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488)
Efficiency Efficiency: Indexing (COSC 488) Nazli Goharian nazli@cs.georgetown.edu Difficult to analyze sequential IR algorithms: data and query dependency (query selectivity). O(q(cf max )) -- high estimate-
More informationHarnessing Encrypted Data in Cloud for Secure and Efficient Image Sharing from Mobile Devices
34th IEEE INFOCOM, 26 April 1 May, 2015, Hong Kong Harnessing Encrypted Data in Cloud for Secure and Efficient Image Sharing from Mobile Devices Helei Cui, Xingliang Yuan, and Cong Wang Department of Computer
More informationBloom Filters and Compact Hash Codes for Efficient and Distributed Image Retrieval
2016 IEEE International Symposium on Multimedia Bloom Filters and Compact Hash Codes for Efficient and Distributed Image Retrieval Andrea Salvi, Simone Ercoli, Marco Bertini and Alberto Del Bimbo MICC
More informationCOSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor
COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality
More informationWeb- Scale Mul,media: Op,mizing LSH. Malcolm Slaney Yury Li<shits Junfeng He Y! Research
Web- Scale Mul,media: Op,mizing LSH Malcolm Slaney Yury Li
More informationBinary Embedding with Additive Homogeneous Kernels
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-7) Binary Embedding with Additive Homogeneous Kernels Saehoon Kim, Seungjin Choi Department of Computer Science and Engineering
More informationCaching and Buffering in HDF5
Caching and Buffering in HDF5 September 9, 2008 SPEEDUP Workshop - HDF5 Tutorial 1 Software stack Life cycle: What happens to data when it is transferred from application buffer to HDF5 file and from HDF5
More informationSVM-KNN : Discriminative Nearest Neighbor Classification for Visual Category Recognition
SVM-KNN : Discriminative Nearest Neighbor Classification for Visual Category Recognition Hao Zhang, Alexander Berg, Michael Maire Jitendra Malik EECS, UC Berkeley Presented by Adam Bickett Objective Visual
More informationMining Large-Scale Music Data Sets
Mining Large-Scale Music Data Sets Dan Ellis & Thierry Bertin-Mahieux Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA {dpwe,thierry}@ee.columbia.edu
More information