
Vocabulary tree

Recognition can scale to very large databases using the Vocabulary Tree indexing approach [Nistér and Stewénius, CVPR 2006]. The Vocabulary Tree performs instance-level object recognition; it does not perform recognition at the category level. The method follows three steps: 1. organize the local descriptors of the database images in a tree using hierarchical k-means clustering, storing inverted files with scores at each node (offline); 2. generate a score for a given query image based on Term Frequency - Inverse Document Frequency (TF-IDF); 3. find the images in the database that best match that score. The vocabulary tree supports very efficient retrieval, since it only needs the distance between a query feature and each node.

Building the Vocabulary Tree

The vocabulary tree is a hierarchical set of cluster centers and their corresponding Voronoi regions. For each image in the database, extract MSER regions and compute a set of feature point descriptors (e.g. 128-dimensional SIFT). Build the vocabulary tree using hierarchical k-means clustering: run k-means recursively on each of the resulting quantization cells, up to a maximum number of levels L (L = 6 suggested). Inner nodes are the centroids; leaves are the visual words. k defines the branch factor of the tree, which indicates how fast the tree branches (k = 10 suggested).
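A minimal sketch of this offline construction, assuming descriptors arrive as rows of a NumPy array and using scikit-learn's KMeans; the Node class and all parameter names are illustrative, not taken from the original paper:

```python
import numpy as np
from sklearn.cluster import KMeans

class Node:
    """One vocabulary-tree node: a centroid plus k children (or a leaf)."""
    def __init__(self, center=None):
        self.center = center        # cluster centroid, e.g. a 128-D SIFT mean
        self.children = []          # empty list => leaf, i.e. a visual word
        self.inverted_file = {}     # image id -> term frequency (filled offline)

def build_tree(descriptors, k=10, levels=6, center=None):
    """Recursively run k-means on each quantization cell, up to `levels` deep."""
    node = Node(center)
    if levels == 0 or len(descriptors) < k:
        return node                 # stop: this node is a visual word
    km = KMeans(n_clusters=k, n_init=4).fit(descriptors)
    for j in range(k):
        cell = descriptors[km.labels_ == j]
        node.children.append(
            build_tree(cell, k, levels - 1, km.cluster_centers_[j]))
    return node
```

With the suggested k = 10 and L = 6 this yields up to a million leaves, i.e. a million visual words.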

A large number of elliptical regions are extracted from the image and warped to canonical positions. A descriptor vector is computed for each region. The descriptor vector is then hierarchically quantized by the vocabulary tree. With each node in the vocabulary tree there is an associated inverted file with references to the images containing an instance of that node.

[Figure: hierarchical k-means clustering with branch factor k = 3, shown at tree levels L = 1 through L = 4.]

[Figure: hierarchical k-means clustering performed with k = 3. Slide from D. Nistér.]

Once the vocabulary tree has been built and is online, new images can be inserted into the database. Slide from D. Nistér.

Adding images to the tree

Adding an image to the database requires the following steps: the image feature descriptors are computed, then each descriptor vector is dropped down from the root of the tree and quantized into a path down the tree. Slide from D. Nistér.

Querying with Vocabulary Tree

In the online phase, each descriptor vector is propagated down the tree by comparing, at each level, the descriptor vector to the k candidate cluster centers (represented by the k children in the tree) and choosing the closest one. k dot products are performed at each level, for a total of kL dot products, which is very efficient if k is not too large. The path down the tree can be encoded by a single integer and is then available for use in scoring. The relevance of a database image to the query image is based on how similar the paths down the vocabulary tree are for the descriptors from the database image and the query image. The scheme assigns weights to the tree nodes and defines relevance scores associated with images. [Figure: paths down the tree for one image with 400 features.]
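The per-descriptor quantization step might look like the following sketch (it reuses the Node tree from the construction sketch above; plain L2 distances stand in for the dot products described in the slide):

```python
import numpy as np

def quantize(root, desc):
    """Drop one descriptor from the root to a leaf; return the path taken."""
    path, node = [], root
    while node.children:
        # k comparisons per level, kL in total down the whole tree
        dists = [np.linalg.norm(desc - child.center) for child in node.children]
        best = int(np.argmin(dists))
        path.append(best)
        node = node.children[best]
    return path   # with small k and L this path packs into a single integer
```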

Scoring

At each node i a weight w_i is assigned, defined according to one of several schemes: a constant weighting scheme, w_i = k; or an entropy weighting scheme (inverse document frequency), w_i = log(N / N_i), where N is the number of database images and N_i is the number of images with at least one descriptor vector path through node i. It is possible to use stop lists, where w_i is set to zero for the most frequent and/or infrequent symbols. [Figure: node score examples; with N = 4 and N_i = 2, w = log(2); with N = 4 and N_i = 1, w = log(4). Image from D. Nistér.]
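A small illustration of the entropy weighting, assuming the per-node counts N_i have already been gathered offline (names are hypothetical):

```python
import math

def entropy_weights(N, counts):
    """counts: node id -> N_i. Returns w_i = log(N / N_i) per node;
    rare nodes (small N_i) get large weights, ubiquitous ones near zero."""
    return {i: math.log(N / n_i) for i, n_i in counts.items() if n_i > 0}
```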

Query and database vectors q and d are defined according to the assigned weights as q_i = m_i w_i and d_i = n_i w_i, where m_i is the number of descriptor vectors of the query with a path through node i, n_i is the number of descriptor vectors of the database image with a path through node i, and w_i is the weight of node i. [Figure: example database vectors d_i = n_i w_i, giving scores S = 2 log(2) and S = 2 log(4).] Each database image is given a relevance score based on the L1-normalized difference between the query and database vectors. Scores for the images in the database are accumulated; the winner is the database image that shares the most information with the query image.
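Because q and d are L1-normalized, every node seen by only one of the two vectors contributes a fixed amount, so the distance can be accumulated over the shared nodes alone; a hedged sketch, with dictionaries standing in for sparse vectors:

```python
def l1_score(q, d):
    """||q - d||_1 for L1-normalized sparse vectors (node id -> weight).
    Starts from 2 (the value when no nodes are shared) and corrects the
    contribution of each node that appears in both vectors. Smaller is better."""
    s = 2.0
    for i, qi in q.items():
        di = d.get(i)
        if di is not None:
            s += abs(qi - di) - qi - di
    return s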

Inverted file index

To implement scoring efficiently, an inverted file index is associated with each node of the vocabulary tree (the inverted file of an inner node is the concatenation of its children's inverted files). The inverted file at each node stores the id numbers of the images in which that particular node occurs, together with the term frequency for each image. Indexes back to the new image are then added to the relevant inverted files. Here n_id is the number of times visual word i appears in document d, and n_d is the number of visual words in document d, so the term frequency is t_id = n_id / n_d. [Figure: inverted file entries, e.g. (D1, t_11 = 1/n_1, Img1), (D1, t_21 = 1/n_2, Img2), (D2, t_11 = 2/n_1, Img1). Image from D. Nistér.]
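A hedged sketch of the inverted-file update when a new image is added, reusing the Node attributes from the construction sketch above; the term-frequency names n_id and n_d follow the figure:

```python
import numpy as np

def add_image(root, image_id, descriptors):
    """Quantize every descriptor and bump the term frequency n_id / n_d
    in the inverted file of each node along its path."""
    n_d = len(descriptors)              # number of visual words in this image
    for desc in descriptors:
        node = root
        while node.children:
            node = min(node.children,
                       key=lambda c: float(np.linalg.norm(desc - c.center)))
            tf = node.inverted_file.get(image_id, 0.0)
            node.inverted_file[image_id] = tf + 1.0 / n_d
    # updating along every path keeps each inner node's file equal to the
    # concatenation of its children's files, as the text above requires
```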

Performance considerations

The performance of the vocabulary tree depends largely on its structure. The most important factors in making the method effective are: a large vocabulary tree (16M words, against the 10K of Video Google); and using informative features rather than uniform sampling (compute the information gain of features and select the most informative ones to build the tree, i.e. features found in all images of a location and in no image of another location).

Performance figures on 6,376 images: performance increases significantly with the number of leaf nodes; it increases with the branch factor k; and it increases as the amount of training data grows. From Tommasi.