BoW model. Textual data: Bag of Words model


With text, categorization is the task of assigning a document to one or more categories based on its content. It is appropriate for:
- Detecting and indexing similar texts/documents in large corpora
- Clustering documents by topic
- Extracting mid/high-level concepts from documents (is it something about medicine/biology? is it a document about business?)

For the purpose of categorization, a text document can be represented as an unordered collection of words, a Bag of Words (BoW), that is, by a histogram representation based on a word vocabulary. Grammar and word order are not taken into account in this model.

The Bag of Words model, combined with advanced classification techniques, makes it possible to map word distributions to document categories. It represents the state of the art of document classification.
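As a minimal sketch of the histogram representation described above, the following builds a Bag of Words vector for one document. The vocabulary, the example sentence and the naive tokenization are assumptions made for illustration only; they are not part of the original slides.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Return the count of each vocabulary term in `text`, in vocabulary order."""
    # Naive tokenization (an assumption): lowercase, split on whitespace,
    # strip surrounding punctuation. No stemming or stop-word removal here.
    tokens = [t.strip(".,;:!?\"'()") for t in text.lower().split()]
    counts = Counter(tokens)
    # Word order is discarded: only per-term frequencies survive.
    return [counts[term] for term in vocabulary]

vocabulary = ["cell", "protein", "market", "price"]  # hypothetical vocabulary
doc = "The protein binds to the cell; the cell then divides."
histogram = bag_of_words(doc, vocabulary)
print(histogram)
```

A classifier would then operate on such fixed-length vectors rather than on the raw text.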

Selecting a good vocabulary is related to the problem of feature selection. It is required to:
- pick only those terms that are really discriminant
- remove stop words (the most frequent words, such as "the", "of", "an")
- apply stemming (reducing inflected or derived words to their stem, base or root form)

Given a natural language textual corpus, the word frequency distribution follows the well-known Zipf's law.

Zipf's law states that, given a corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table (i.e. the most frequent word occurs twice as often as the second most frequent, three times as often as the third, and so on). An ideal Zipf distribution is a straight line when both axes are in log scale.

[Figure: Zipf distribution with the regions of the most useful words and of the stop words marked]

Visual words

Following the same reasoning, it has been hypothesized that a similar model can be adopted for images. If an image can be treated as a document, "visual words" can be identified from the local features extracted from the image. The procedure for determining visual words starts as follows:

Extract local features from a number of images, e.g. SIFT descriptors: each keypoint is a point in a 128-dimensional feature space.

[Figure: Image 1 (D. Nister)]

[Figures: Image 2 and Image 3 (D. Nister)]

[Figures: Image 4 and the SIFT descriptor space (D. Nister)]

The feature descriptors that have been collected are clustered to perform a quantization of the feature space. Each cluster's center is used as a visual word. Local descriptors are assigned to the nearest word using an appropriate distance.

The quantized feature space provides a visual vocabulary and suggests a vector representation of images that indicates the frequency of visual words. This representation can be used in conjunction with vector-based kernels or similarity measures for matching or categorization of image content.

[Figure: descriptor space (D. Nister)]
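The assignment of descriptors to the nearest visual word and the resulting frequency vector can be sketched as follows. This is only an illustration: the squared Euclidean distance is one possible choice of "appropriate distance", and the 2-dimensional codebook and descriptors are made-up toy values (real SIFT descriptors have 128 dimensions).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to `descriptor`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))

def visual_word_histogram(descriptors, codebook):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    return hist

# Toy data (assumed for illustration): 3 visual words, 4 local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, 0.2), (0.9, 1.2), (3.8, 0.1), (1.1, 0.8)]
hist = visual_word_histogram(descriptors, codebook)
print(hist)
```

The histogram has one bin per visual word, so images with different numbers of descriptors still map to vectors of the same length.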

Visual words example

[Figure: each group of patches belongs to the same visual word (Sivic & Zisserman)]

Visual codebook formation

Codebook formation and feature assignment in images are substantially different from what happens in text, because visual words have to be defined in advance using a clustering algorithm. The essential tasks that influence performance are:
- choice of local features
- sampling strategies
- quantization method
- number of visual words

Choice of local features

Local features are represented by local descriptors. The common framework is:
- divide the local region into spatial cells
- calculate the orientation of the image gradient at each pixel
- pool the quantized orientations over each cell: the descriptor contains an orientation histogram for each cell, with votes weighted by gradient magnitude

This is the basis of the popular SIFT, SURF and HOG methods.

[Figure: standard SIFT descriptor]

Sampling strategies

Several strategies of local feature detection are possible:
- Random sampling: calculate local features at random points in the image
- Sparse sampling:
  - with no spatial relationships between features: local patches are detected by interest point detectors that are able to select salient regions (such as edges, corners, blobs)
  - with spatial relations between features: repeatedly subdividing an image into a Spatial Pyramid and computing histograms of image features over the resulting subregions
- Dense sampling: the image is segmented into subregions by horizontal and vertical lines according to a regular grid, and features are extracted in each subregion

Sparse sampling: local patches are detected at the most salient regions. It uses more information about the image itself than random or grid sampling.

[Figure: local patches detected using affine covariant features]

- Advantage: it is able to detect salient regions, which correspond to the more attractive and informative regions. It has been used for specific object recognition and categorization (better for describing background/foreground).
- Disadvantage: depending on the interest points and on the type/resolution of the image, sometimes only a few regions are detected.

Dense sampling: partitioning the image into equal-sized rectangular regions, computing the visual word feature from each region and concatenating the features of these regions into a single feature vector. Spatial information substantially improves the classification performance. Typically, grids are evenly spaced at 1x1, 2x2, 3x3, 4x4.

- Advantage: it is able to describe the global content of an image. Despite its simplicity, it provides good results for textures and natural scenes, because it describes more regions than interest point techniques.
- Disadvantage: it uses little information about the image itself.

The choice of the sampling strategy depends on the type of images and the application goals:
- For image classification: largely random sampling sparse SIFT [Nowak 06] (sometimes DoG cannot sample densely enough to produce leading-edge classification results)
- For indoor scene classification: dense SIFT > sparse SIFT [Jurie 05]. Sampling only where the feature detector fires can be a poor representation
- For natural scene categorization: dense, random > sparse SIFT [Li 05]. Color is important for classifying natural scenes: HSV SIFT is better than Gray SIFT [Bosch 08]
- For object detection and concept detection: dense patches (10K) sparse feature points [Jiang 07]

Quantization of the feature space

The most common quantization approach is k-means clustering, mainly because of its simplicity and convergence speed.

K-means is an algorithm to cluster n objects, based on their feature vector representation, into k < n partitions. The objective it tries to achieve is to minimize the global intra-cluster variance, or squared error function:

$$V = \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$

where k is the number of clusters, S_i (i = 1, ..., k) are the cluster partitions, and μ_i is the centroid (or mean point) of all the points x_j ∈ S_i.

The most common form of k-means is Lloyd's algorithm. It is a heuristic for solving the k-means problem that is popular because it usually converges very rapidly in practice.

k-means clustering: Lloyd's algorithm

Lloyd's algorithm is an iterative solution for the k-means problem:
1. partition the n input points into k initial sets, either at random or using some heuristic
2. calculate the centroid μ_i of each set S_i (with i = 1, ..., k)
3. construct a new partition by associating each point with the closest centroid
4. recalculate the centroids for the new clusters, and repeat the process by alternate application of steps 3 and 4 until convergence, which happens when the points no longer switch cluster or the centroids no longer change

[Figure: (1, 2) initial random centroids; (3) new partition obtained by associating points with the nearest centroid; (4) centroids are moved to the center of their clusters; convergence]

K-means clustering for codebook formation is not the optimal solution. Its main disadvantages are:
- the number of visual words has to be known in advance
- the clustering is not very robust with respect to outliers
- cluster centers are attracted by the densest regions of the sample distribution, thus providing a more imprecise quantization for the vectors lying in these regions. This is due to the assumption of a uniform distribution of the features in the descriptor space.

[Figure: k-means (Voronoi tessellation). Detail of a dense region that has been split into 4 clusters. Voronoi cells do not uniformly cover the feature space.]

Voronoi diagrams are a special decomposition of a metric space that is determined by distances to a set of objects in the space. Given a set of points that are Voronoi sites (e.g. the centers of clusters), each site c is associated with a Voronoi cell that contains all points closer to c than to any other site. The segments of the Voronoi diagram are all the points that are equidistant to the two nearest sites.
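The four steps of Lloyd's algorithm listed earlier on this slide can be sketched in a few lines. This is an illustrative version, not a reference implementation: the initialization (sampling k input points as initial centroids), the handling of empty clusters, the function names and the toy data are all assumptions.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Mean point of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def lloyd_kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Steps 1-2 (assumed heuristic): use k distinct input points as initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: associate each point with the closest centroid.
        new_labels = [min(range(k), key=lambda i: squared_distance(p, centroids[i]))
                      for p in points]
        if new_labels == labels:  # no point switched cluster: converged
            break
        labels = new_labels
        # Step 4: recompute centroids; an empty cluster keeps its previous centroid.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = centroid(members)
    return centroids, labels

def objective(points, centroids, labels):
    """Sum of squared distances of each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))

# Toy data (assumed): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = lloyd_kmeans(points, k=2)
print(labels, objective(points, centroids, labels))
```

Note that k must be supplied up front, which is exactly the first disadvantage listed above for codebook formation.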

Radius-based clustering

Radius-based clustering is an effective alternative to k-means clustering. Given n vectors, the algorithm starts with a uniform random subsampling s of the original dataset (green dots in the figure).
- Given a radius R, for each x_i in s (grey circles in the figure), a mean shift procedure is initialized to locate the modes of the distribution of samples (i.e. to find the densest regions of the distribution)
- A new cluster center is then allocated at the mode corresponding to the maximal density region
- All vectors of the original set within a distance < R from the center are labeled as members of this cluster and eliminated for the following iterations (this prevents the algorithm from repeatedly assigning centers to the same high-density region)
- The procedure can be stopped when a sufficient number of clusters (words) has been identified

[Figure: subsampled points and windows of radius R]

Mean shift algorithm

Mean shift is a procedure for locating the maxima of a density function given discrete data sampled from that function. It is an iterative method: it starts with an initial estimate x and a Gaussian kernel

$$K(x_i - x) = e^{-\lVert x_i - x \rVert^2}$$

that weights the nearby points based on their distance to the current estimate. At each iteration it uses K(x_i - x) to re-estimate the weighted mean of the sample density in the window determined by K:

$$m(x) = \frac{\sum_{x_i} K(x_i - x)\, x_i}{\sum_{x_i} K(x_i - x)}$$

The mean shift algorithm sets x = m(x) and repeats the estimation until m(x) converges. The vector m(x) - x always points in the direction of maximum increase of density.
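A minimal sketch of the mean shift iteration, using the Gaussian kernel and the weighted mean m(x) given above. It is a simplification: every sample contributes through the kernel instead of being restricted to a window of radius R, and the tolerance, iteration cap and toy samples are assumptions chosen for illustration.

```python
import math

def gaussian_weight(xi, x):
    """K(x_i - x) = exp(-||x_i - x||^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(xi, x)))

def mean_shift_step(x, samples):
    """One re-estimation: the kernel-weighted mean m(x) of the samples."""
    weights = [gaussian_weight(xi, x) for xi in samples]
    total = sum(weights)
    return tuple(sum(w * xi[d] for w, xi in zip(weights, samples)) / total
                 for d in range(len(x)))

def mean_shift(x, samples, tol=1e-6, max_iter=500):
    """Iterate x <- m(x) until the shift m(x) - x becomes negligible."""
    for _ in range(max_iter):
        m = mean_shift_step(x, samples)
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(m, x)))
        x = m
        if shift < tol:
            break
    return x

# Toy samples (assumed): a dense group near the origin plus one far outlier.
samples = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (5.0, 5.0)]
mode = mean_shift((1.0, 1.0), samples)
print(mode)
```

Starting from (1, 1), the estimate moves into the dense group near the origin, while the distant outlier receives a negligible kernel weight.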

Radius-based clustering: mode of operation

[Figure sequence: successive mean shift iterations, each showing the region of interest, its center of mass and the mean shift vector. Final goal: find the densest region.]

[Figure: region of interest and center of mass at final convergence to the densest region]

In this way, cluster centers are allocated more uniformly. This effect can be visualized with a Voronoi tessellation of the feature space, compared with the one produced by k-means.

[Figure: k-means clustering (Voronoi tessellation) and radius-based clustering (Voronoi tessellation). The dense region circled in red is split into 4 clusters by k-means, and is correctly coded by radius-based clustering.]

Bag of Visual Words model

If visual words are extracted, an image can be represented as an unordered collection of visual words, i.e. a Bag of Visual Words. The set of real-valued feature vectors coming from one image is transformed into a single flat histogram that counts the frequency of occurrence of a number of predefined (quantized) feature prototypes.

[Figure: images of object categories (face, bike, violin), the vocabulary of codewords, and the codeword histogram of each object, i.e. its bag of visual words]

Feature assignment

Given the codebook generated in the training stage, each region extracted from the test image has to be assigned to the corresponding visual word. Usually region descriptors are hard-assigned to the nearest word (in terms of Euclidean distance).

[Figure: feature detection, feature representation, feature (hard) assignment, and the BoW model as a histogram of visual words. Courtesy A. Zisserman]

The drawback of hard assignment is that it takes into account only the closest codeword, and does not consider:
- codeword uncertainty: the problem of selecting the correct codeword when two or more candidates are relevant
- codeword plausibility: the problem of selecting the correct codeword when all codewords are too far away to be representative

[Figure: feature assignment. The small blue dots are image features and the labeled red circles are codewords. The yellow triangle represents an image feature that is correctly assigned to codeword b, the green square is an example of codeword uncertainty, and the light blue diamond is an example of codeword plausibility.]

Soft assignment is able to consider the information of two (or more) relevant candidates. In this way, the word frequency histogram is calculated by smoothing the hard assignment of features to the codeword vocabulary.

[Figure: hard assignment compared with soft assignment (Gaussian kernel)]
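One way to sketch soft assignment with a Gaussian kernel is shown below. The kernel width `sigma`, the per-descriptor normalization and the toy data are assumptions made for this illustration; the slides do not specify them.

```python
import math

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def soft_assignment_histogram(descriptors, codebook, sigma=1.0):
    """Spread each descriptor over all codewords with Gaussian weights."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        weights = [math.exp(-squared_distance(d, w) / (2.0 * sigma ** 2))
                   for w in codebook]
        total = sum(weights)
        if total == 0.0:
            # Descriptor is numerically too far from every codeword: skip it.
            continue
        # Normalize so that each descriptor contributes a total mass of 1.
        for i, w in enumerate(weights):
            hist[i] += w / total
    return hist

# Toy data (assumed): the first descriptor lies exactly between the two codewords.
codebook = [(0.0, 0.0), (2.0, 0.0)]
descriptors = [(1.0, 0.0), (0.0, 0.0)]
hist = soft_assignment_histogram(descriptors, codebook)
print(hist)
```

The ambiguous descriptor is split evenly between both codewords instead of being forced onto one of them, which addresses codeword uncertainty. Because of the normalization, a descriptor far from every codeword still contributes a full unit of mass; dropping the normalization would let such descriptors contribute less, which is closer to the plausibility concern.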

Bag of Visual Words advantages

The main advantages of the Bag of Visual Words approach are:
- Invariance to scale and orientation (local patches can be detected with the Harris corner detector, the SIFT detector, the SURF detector, affine covariant patches, MSER, ...)
- Offline computation, which enables (near) real-time applications (e.g. retrieval)
- The possibility of adopting existing algorithms from the text domain, e.g. indexing, classification, mining

Bag of Visual Words weaknesses

The Bag of Visual Words representation faces several challenges in generating an appropriately descriptive representation:
- Bin boundary issues create matching problems for flat histograms (Rubner et al., 2000).
- It is not clear whether a universal feature quantization is more or less effective than dataset-dependent vocabularies.
- Generating the vocabulary from large amounts of data is generally computationally costly.

The main open issues with the Bag of Visual Words paradigm are:
- Size of vocabulary
- Efficiency of generating visual words
- Feature selection and reduction
- Accounting for spatial information

Size of vocabulary

The number of visual words needed depends on the type of images referred to. Typically:
- in [Lazebnik 06] for natural scene categorization
- 1000 in [Zhang 05] for texture classification and object categorization
- 6,000 to 10,000 in [Sivic 03] for object retrieval (matching)
- ,000 in [Jiang 07] for retrieval and classification

General observations:
- More is better, but performance saturates at a certain size in object retrieval [Philbin 07]
- Performance saturates or even degrades as the number of visual words increases for image categorization [Yang 07]

Efficiency of generating visual words

Efficiency is obtained by making k-means clustering more efficient. This can be achieved by:
- Automatic parallelization and distribution. Solutions are: k-means over Hadoop; Apache Lucene Mahout, which builds scalable Apache-licensed machine learning libraries based on an open source Map/Reduce framework [Dean 04]
- Hierarchical k-means clustering [Nistér 06]: provides efficient codebook generation for large visual word vocabularies (~1M). At the first level of the tree, all data points are clustered into a small number (K = 10) of cluster centers; at the next level, k-means is applied within each of the partitions independently.

Feature selection and reduction

Feature selection is mandatory for the efficiency and effectiveness of classification.
- Among the supervised methods, the Mutual Information method performs best at making classification effective. It measures both the dependence between visual words and the dependence between visual words and the class label [Yang 07]
- Among the semi-supervised methods, TF-IDF-like methods augmented with ranking scores can be used. They favor words that have high TF (term frequency) and low document frequency (high Inverse Document Frequency) and that rank highly [Yang 08]

Accounting for spatial information: Spatial Pyramid

A spatial pyramid is a collection of orderless feature histograms computed over cells defined by a multi-level recursive image decomposition:
- At level 0, the decomposition consists of just a single cell, and the representation is equivalent to a standard bag of features.
- At level 1, the image is subdivided into four quadrants, yielding four feature histograms.

[Figure: level 0 (Lazebnik, Schmid & Ponce)]

The pyramid is built by repeatedly subdividing an image and computing histograms of image features over the resulting subregions.

[Figure: levels 0 and 1 (Lazebnik, Schmid & Ponce)]

Notable performance in image categorization is reported for L=3 and 400 visual words [Lazebnik 06]. Normalization is necessary to account for images with different numbers of local features.

[Figure: levels 0, 1 and 2 (Lazebnik, Schmid & Ponce)]
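The recursive decomposition can be sketched as below. This is an unweighted illustration only: it concatenates the per-cell histograms of every level and normalizes by the number of features, whereas the level weighting used when matching pyramids is omitted. The input format (normalized coordinates plus a visual word index) and all names are assumptions.

```python
def spatial_pyramid(features, num_words, levels):
    """Concatenate per-cell visual word histograms for levels 0..levels.

    features: list of (x, y, word) with x, y in [0, 1) and word in range(num_words).
    """
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid is cells x cells at this level
        hists = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for h in hists:
            pyramid.extend(h)
    # Normalize by the number of local features so that images with
    # different feature counts remain comparable.
    total = len(features)
    return [v / total for v in pyramid] if total else pyramid

# Toy image (assumed): one feature per quadrant, two visual words.
features = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.1, 0.9, 1), (0.9, 0.9, 0)]
vec = spatial_pyramid(features, num_words=2, levels=1)
print(vec)
```

At level 0 the two words are indistinguishable in frequency, while the level 1 cells record where each word occurs, which is the spatial information a flat bag of features discards.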


More information

Selection of Scale-Invariant Parts for Object Class Recognition

Selection of Scale-Invariant Parts for Object Class Recognition Selection of Scale-Invariant Parts for Object Class Recognition Gy. Dorkó and C. Schmid INRIA Rhône-Alpes, GRAVIR-CNRS 655, av. de l Europe, 3833 Montbonnot, France fdorko,schmidg@inrialpes.fr Abstract

More information

Local Features and Bag of Words Models

Local Features and Bag of Words Models 10/14/11 Local Features and Bag of Words Models Computer Vision CS 143, Brown James Hays Slides from Svetlana Lazebnik, Derek Hoiem, Antonio Torralba, David Lowe, Fei Fei Li and others Computer Engineering

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Section 10 - Detectors part II Descriptors Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering

More information

Introduction to object recognition. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others

Introduction to object recognition. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others Introduction to object recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others Overview Basic recognition tasks A statistical learning approach Traditional or shallow recognition

More information

Indexing local features and instance recognition May 14 th, 2015

Indexing local features and instance recognition May 14 th, 2015 Indexing local features and instance recognition May 14 th, 2015 Yong Jae Lee UC Davis Announcements PS2 due Saturday 11:59 am 2 We can approximate the Laplacian with a difference of Gaussians; more efficient

More information

Learning Representations for Visual Object Class Recognition

Learning Representations for Visual Object Class Recognition Learning Representations for Visual Object Class Recognition Marcin Marszałek Cordelia Schmid Hedi Harzallah Joost van de Weijer LEAR, INRIA Grenoble, Rhône-Alpes, France October 15th, 2007 Bag-of-Features

More information

Local features and image matching. Prof. Xin Yang HUST

Local features and image matching. Prof. Xin Yang HUST Local features and image matching Prof. Xin Yang HUST Last time RANSAC for robust geometric transformation estimation Translation, Affine, Homography Image warping Given a 2D transformation T and a source

More information

Image Segmentation. Shengnan Wang

Image Segmentation. Shengnan Wang Image Segmentation Shengnan Wang shengnan@cs.wisc.edu Contents I. Introduction to Segmentation II. Mean Shift Theory 1. What is Mean Shift? 2. Density Estimation Methods 3. Deriving the Mean Shift 4. Mean

More information

An Latent Feature Model for

An Latent Feature Model for An Addi@ve Latent Feature Model for Mario Fritz UC Berkeley Michael Black Brown University Gary Bradski Willow Garage Sergey Karayev UC Berkeley Trevor Darrell UC Berkeley Mo@va@on Transparent objects

More information

(More) Algorithms for Cameras: Edge Detec8on Modeling Cameras/Objects. Connelly Barnes

(More) Algorithms for Cameras: Edge Detec8on Modeling Cameras/Objects. Connelly Barnes (More) Algorithms for Cameras: Edge Detec8on Modeling Cameras/Objects Connelly Barnes Acknowledgment: Many slides from James Hays, also Derek Hoiem Grauman&Leibe 2008 Outline Edge Detec)on: Canny, etc.

More information

Artistic ideation based on computer vision methods

Artistic ideation based on computer vision methods Journal of Theoretical and Applied Computer Science Vol. 6, No. 2, 2012, pp. 72 78 ISSN 2299-2634 http://www.jtacs.org Artistic ideation based on computer vision methods Ferran Reverter, Pilar Rosado,

More information

Automatic Ranking of Images on the Web

Automatic Ranking of Images on the Web Automatic Ranking of Images on the Web HangHang Zhang Electrical Engineering Department Stanford University hhzhang@stanford.edu Zixuan Wang Electrical Engineering Department Stanford University zxwang@stanford.edu

More information

Today. Main questions 10/30/2008. Bag of words models. Last time: Local invariant features. Harris corner detector: rotation invariant detection

Today. Main questions 10/30/2008. Bag of words models. Last time: Local invariant features. Harris corner detector: rotation invariant detection Today Indexing with local features, Bag of words models Matching local features Indexing features Bag of words model Thursday, Oct 30 Kristen Grauman UT-Austin Main questions Where will the interest points

More information

Lecture 13: Tracking mo3on features op3cal flow

Lecture 13: Tracking mo3on features op3cal flow Lecture 13: Tracking mo3on features op3cal flow Professor Fei- Fei Li Stanford Vision Lab Lecture 13-1! What we will learn today? Introduc3on Op3cal flow Feature tracking Applica3ons (Problem Set 3 (Q1))

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! h0p://www.cs.toronto.edu/~rsalakhu/ Lecture 3 Parametric Distribu>ons We want model the probability

More information

Descriptors for CV. Introduc)on:

Descriptors for CV. Introduc)on: Descriptors for CV Content 2014 1.Introduction 2.Histograms 3.HOG 4.LBP 5.Haar Wavelets 6.Video based descriptor 7.How to compare descriptors 8.BoW paradigm 1 2 1 2 Color RGB histogram Introduc)on: Image

More information

Obtaining Feature Correspondences

Obtaining Feature Correspondences Obtaining Feature Correspondences Neill Campbell May 9, 2008 A state-of-the-art system for finding objects in images has recently been developed by David Lowe. The algorithm is termed the Scale-Invariant

More information

Recognition of Degraded Handwritten Characters Using Local Features. Markus Diem and Robert Sablatnig

Recognition of Degraded Handwritten Characters Using Local Features. Markus Diem and Robert Sablatnig Recognition of Degraded Handwritten Characters Using Local Features Markus Diem and Robert Sablatnig Glagotica the oldest slavonic alphabet Saint Catherine's Monastery, Mount Sinai Challenges in interpretation

More information

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013 Feature Descriptors CS 510 Lecture #21 April 29 th, 2013 Programming Assignment #4 Due two weeks from today Any questions? How is it going? Where are we? We have two umbrella schemes for object recognition

More information

TEXTURE CLASSIFICATION METHODS: A REVIEW

TEXTURE CLASSIFICATION METHODS: A REVIEW TEXTURE CLASSIFICATION METHODS: A REVIEW Ms. Sonal B. Bhandare Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

A Survey on Image Classification using Data Mining Techniques Vyoma Patel 1 G. J. Sahani 2

A Survey on Image Classification using Data Mining Techniques Vyoma Patel 1 G. J. Sahani 2 IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 10, 2014 ISSN (online): 2321-0613 A Survey on Image Classification using Data Mining Techniques Vyoma Patel 1 G. J. Sahani

More information

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION Maral Mesmakhosroshahi, Joohee Kim Department of Electrical and Computer Engineering Illinois Institute

More information

Local features: detection and description May 12 th, 2015

Local features: detection and description May 12 th, 2015 Local features: detection and description May 12 th, 2015 Yong Jae Lee UC Davis Announcements PS1 grades up on SmartSite PS1 stats: Mean: 83.26 Standard Dev: 28.51 PS2 deadline extended to Saturday, 11:59

More information

Indexing local features and instance recognition May 16 th, 2017

Indexing local features and instance recognition May 16 th, 2017 Indexing local features and instance recognition May 16 th, 2017 Yong Jae Lee UC Davis Announcements PS2 due next Monday 11:59 am 2 Recap: Features and filters Transforming and describing images; textures,

More information

Machine Learning Crash Course: Part I

Machine Learning Crash Course: Part I Machine Learning Crash Course: Part I Ariel Kleiner August 21, 2012 Machine learning exists at the intersec

More information

Image Retrieval (Matching at Large Scale)

Image Retrieval (Matching at Large Scale) Image Retrieval (Matching at Large Scale) Image Retrieval (matching at large scale) At a large scale the problem of matching between similar images translates into the problem of retrieving similar images

More information

CS 6140: Machine Learning Spring 2017

CS 6140: Machine Learning Spring 2017 CS 6140: Machine Learning Spring 2017 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis@cs Grades

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 09 130219 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Feature Descriptors Feature Matching Feature

More information

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing CS 4495 Computer Vision Features 2 SIFT descriptor Aaron Bobick School of Interactive Computing Administrivia PS 3: Out due Oct 6 th. Features recap: Goal is to find corresponding locations in two images.

More information

Voronoi Region. K-means method for Signal Compression: Vector Quantization. Compression Formula 11/20/2013

Voronoi Region. K-means method for Signal Compression: Vector Quantization. Compression Formula 11/20/2013 Voronoi Region K-means method for Signal Compression: Vector Quantization Blocks of signals: A sequence of audio. A block of image pixels. Formally: vector example: (0.2, 0.3, 0.5, 0.1) A vector quantizer

More information

2D Image Processing Feature Descriptors

2D Image Processing Feature Descriptors 2D Image Processing Feature Descriptors Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Overview

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

Improved Spatial Pyramid Matching for Image Classification

Improved Spatial Pyramid Matching for Image Classification Improved Spatial Pyramid Matching for Image Classification Mohammad Shahiduzzaman, Dengsheng Zhang, and Guojun Lu Gippsland School of IT, Monash University, Australia {Shahid.Zaman,Dengsheng.Zhang,Guojun.Lu}@monash.edu

More information

Large Scale Image Retrieval

Large Scale Image Retrieval Large Scale Image Retrieval Ondřej Chum and Jiří Matas Center for Machine Perception Czech Technical University in Prague Features Affine invariant features Efficient descriptors Corresponding regions

More information

Mul$-objec$ve Visual Odometry Hsiang-Jen (Johnny) Chien and Reinhard Kle=e

Mul$-objec$ve Visual Odometry Hsiang-Jen (Johnny) Chien and Reinhard Kle=e Mul$-objec$ve Visual Odometry Hsiang-Jen (Johnny) Chien and Reinhard Kle=e Centre for Robo+cs & Vision Dept. of Electronic and Electric Engineering School of Engineering, Computer, and Mathema+cal Sciences

More information

Local Descriptors. CS 510 Lecture #21 April 6 rd 2015

Local Descriptors. CS 510 Lecture #21 April 6 rd 2015 Local Descriptors CS 510 Lecture #21 April 6 rd 2015 A Bit of Context, Transition David G. Lowe, "Three- dimensional object recogni5on from single two- dimensional images," Ar#ficial Intelligence, 31, 3

More information

Ensemble of Bayesian Filters for Loop Closure Detection

Ensemble of Bayesian Filters for Loop Closure Detection Ensemble of Bayesian Filters for Loop Closure Detection Mohammad Omar Salameh, Azizi Abdullah, Shahnorbanun Sahran Pattern Recognition Research Group Center for Artificial Intelligence Faculty of Information

More information

The SIFT (Scale Invariant Feature

The SIFT (Scale Invariant Feature The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical

More information

Category vs. instance recognition

Category vs. instance recognition Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval Clustering Chris Manning, Pandu Nayak, and Prabhakar Raghavan Today s Topic: Clustering Document clustering Mo*va*ons Document representa*ons Success criteria Clustering

More information

Introduc)on to Probabilis)c Latent Seman)c Analysis. NYP Predic)ve Analy)cs Meetup June 10, 2010

Introduc)on to Probabilis)c Latent Seman)c Analysis. NYP Predic)ve Analy)cs Meetup June 10, 2010 Introduc)on to Probabilis)c Latent Seman)c Analysis NYP Predic)ve Analy)cs Meetup June 10, 2010 PLSA A type of latent variable model with observed count data and nominal latent variable(s). Despite the

More information

Image Processing. Image Features

Image Processing. Image Features Image Processing Image Features Preliminaries 2 What are Image Features? Anything. What they are used for? Some statements about image fragments (patches) recognition Search for similar patches matching

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval CS276: Informa*on Retrieval and Web Search Pandu Nayak and Prabhakar Raghavan Lecture 12: Clustering Today s Topic: Clustering Document clustering Mo*va*ons Document

More information

Local features: detection and description. Local invariant features

Local features: detection and description. Local invariant features Local features: detection and description Local invariant features Detection of interest points Harris corner detection Scale invariant blob detection: LoG Description of local patches SIFT : Histograms

More information

Exploring Bag of Words Architectures in the Facial Expression Domain

Exploring Bag of Words Architectures in the Facial Expression Domain Exploring Bag of Words Architectures in the Facial Expression Domain Karan Sikka, Tingfan Wu, Josh Susskind, and Marian Bartlett Machine Perception Laboratory, University of California San Diego {ksikka,ting,josh,marni}@mplab.ucsd.edu

More information

Three things everyone should know to improve object retrieval. Relja Arandjelović and Andrew Zisserman (CVPR 2012)

Three things everyone should know to improve object retrieval. Relja Arandjelović and Andrew Zisserman (CVPR 2012) Three things everyone should know to improve object retrieval Relja Arandjelović and Andrew Zisserman (CVPR 2012) University of Oxford 2 nd April 2012 Large scale object retrieval Find all instances of

More information

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining Frans Coenen (http://cgi.csc.liv.ac.uk/~frans/) 10th Interna+onal Conference on Natural

More information